Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 6 |
| Since 2007 (last 20 years) | 14 |
Descriptor
| Computer Assisted Testing | 18 |
| Rating Scales | 18 |
| Test Reliability | 11 |
| Language Tests | 7 |
| Second Language Learning | 7 |
| Evaluators | 6 |
| Foreign Countries | 6 |
| Scores | 6 |
| Test Validity | 6 |
| Comparative Analysis | 5 |
| Language Proficiency | 5 |
Author
| Bobek, Becky L. | 1 |
| Cillessen, Antonius H. N. | 1 |
| Coniam, David | 1 |
| Cowles, Michael | 1 |
| Davis, Caroline | 1 |
| Davis, Lawrence Edward | 1 |
| Doewes, Afrizal | 1 |
| Endedijk, Hinke M. | 1 |
| Galaczi, Evelina | 1 |
| Garb, Howard N. | 1 |
| Gore, Paul A. | 1 |
Publication Type
| Journal Articles | 14 |
| Reports - Research | 10 |
| Reports - Evaluative | 5 |
| Information Analyses | 2 |
| Reports - Descriptive | 2 |
| Speeches/Meeting Papers | 2 |
| Dissertations/Theses -… | 1 |
| Opinion Papers | 1 |
Education Level
| Elementary Education | 3 |
| Higher Education | 3 |
| Postsecondary Education | 3 |
| Secondary Education | 2 |
| Early Childhood Education | 1 |
| Elementary Secondary Education | 1 |
| Preschool Education | 1 |
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In the existing research work on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
Heng Lu – PASAA: Journal of Language Teaching and Learning in Thailand, 2023
This test review covers the Duolingo English Test (DET), an alternative online English proficiency test that is machine-driven. The review covers essential information about the DET such as test purpose, usage, score mapping to the CEFR scale, price, and publisher. The test's usefulness is then discussed with a focus on reliability,…
Descriptors: Computer Software, Computer Assisted Instruction, Second Language Learning, Second Language Instruction
Xu, Jing; Jones, Edmund; Laxton, Victoria; Galaczi, Evelina – Assessment in Education: Principles, Policy & Practice, 2021
Recent advances in machine learning have made automated scoring of learner speech widespread, and yet validation research that provides support for applying automated scoring technology to assessment is still in its infancy. Both the educational measurement and language assessment communities have called for greater transparency in describing…
Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Computer Software
Isbell, Dan; Winke, Paula – Language Testing, 2019
The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…
Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning
Li, Shuai; Taguchi, Naoko; Xiao, Feng – Language Assessment Quarterly, 2019
Adopting Linacre's guidelines for evaluating rating scale effectiveness, we examined whether and how a six-point rating scale functioned differently across raters, speech acts, and second language (L2) proficiency levels. We developed a 12-item Computerized Oral Discourse Completion Task (CODCT) for assessing the production of requests, refusals,…
Descriptors: Speech Acts, Rating Scales, Guidelines, Evaluators
Min, Shangchao; He, Lianzhen; Zhang, Jie – Language Teaching, 2020
This article reviews a selected sample of 70 empirical studies in journal articles and doctoral dissertations on language assessment in China between 2011 and 2018. Following a brief introduction to the history and current state of language assessment in China, the article presents a critical review of language assessment research on six themes…
Descriptors: Language Tests, Test Reliability, Test Validity, Journal Articles
Endedijk, Hinke M.; Cillessen, Antonius H. N. – International Journal of Behavioral Development, 2015
In preschool classes, sociometric peer ratings are used to measure children's peer relationships. The current study examined a computerized version of preschool sociometric ratings. The psychometric properties of computerized sociometric ratings and traditional peer ratings for preschoolers were compared. The distributions, inter-item…
Descriptors: Sociometric Techniques, Preschool Education, Preschool Children, Peer Relationship
Yarnell, Jordy B.; Pfeiffer, Steven I. – Journal of Psychoeducational Assessment, 2015
The present study examined the psychometric equivalence of administering a computer-based version of the Gifted Rating Scale (GRS) compared with the traditional paper-and-pencil GRS-School Form (GRS-S). The GRS-S is a teacher-completed rating scale used in gifted assessment. The GRS-Electronic Form provides an alternative method of administering…
Descriptors: Gifted, Psychometrics, Rating Scales, Computer Assisted Testing
Greathouse, Dan; Shaughnessy, Michael F. – Journal of Psychoeducational Assessment, 2016
Whenever a major intelligence or achievement test is revised, there is always renewed interest in the underlying structure of the test as well as a renewed interest in the scoring, administration, and interpretation changes. In this interview, Amy Gabel discusses the most recent revision of the "Wechsler Intelligence Scale for Children-Fifth…
Descriptors: Children, Intelligence Tests, Test Use, Test Validity
Keiser, Ashley; Reddy, Linda – Journal of Applied School Psychology, 2013
The Pediatric Attention Disorders Diagnostic Screener is a multidimensional, computerized screening tool designed to assess attention and global aspects of executive functioning in children at risk for attention disorders. The screener consists of a semi-structured diagnostic interview, brief parent and teacher rating scales, 3 computer-based…
Descriptors: Screening Tests, Computer Assisted Testing, Children, At Risk Persons
Davis, Lawrence Edward – ProQuest LLC, 2012
Speaking performance tests typically employ raters to produce scores; accordingly, variability in raters' scoring decisions has important consequences for test reliability and validity. One such source of variability is the rater's level of expertise in scoring. Therefore, it is important to understand how raters' performance is influenced by…
Descriptors: Evaluators, Expertise, Scores, Second Language Learning
Jamieson, Joan; Poonpon, Kornwipa – ETS Research Report Series, 2013
Research and development of a new type of scoring rubric for the integrated speaking tasks of "TOEFL iBT"® are described. These "analytic rating guides" could be helpful if tasks modeled after those in TOEFL iBT were used for formative assessment, a purpose which is different from TOEFL iBT's primary use for admission…
Descriptors: Oral Language, Language Proficiency, Scaling, Scores
Coniam, David – Journal of Educational Technology Systems, 2011
This article details an investigation into the onscreen marking (OSM) of Liberal Studies (LS) in Hong Kong--where paper-based marking (PBM) of public examinations is being phased out and wholly superseded by OSM. The study involved 14 markers who had previously rated Liberal Studies scripts on screen in the 2009 Hong Kong Advanced Level…
Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Comparative Analysis
Bobek, Becky L.; Gore, Paul A. – American College Testing (ACT), Inc., 2004
This research report describes changes made to the Inventory of Work-Relevant Values when it was revised for online use as a part of the Internet version of DISCOVER. Users will see the following differences between the online and CD-ROM versions of the inventory: 22 items rather than 61, simplified presentation, and the contribution of all items…
Descriptors: Interrater Reliability, Field Tests, Internet, Test Construction
Garb, Howard N. – Psychological Assessment, 2007
To evaluate the value of computer-administered interviews and rating scales, the following topics are reviewed in the present article: (a) strengths and weaknesses of structured and unstructured assessment instruments, (b) advantages and disadvantages of computer administration, and (c) the validity and utility of computer-administered interviews…
Descriptors: Computer Assisted Testing, Rating Scales, Interviews, Evaluation Methods
