Publication Date
In 2025 | 1 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 6 |
Since 2016 (last 10 years) | 9 |
Since 2006 (last 20 years) | 18 |
Descriptor
Comparative Analysis | 36 |
Test Format | 36 |
Scoring | 27 |
Computer Assisted Testing | 15 |
Test Items | 14 |
Test Construction | 11 |
Multiple Choice Tests | 8 |
Testing | 8 |
Scoring Rubrics | 7 |
Foreign Countries | 6 |
Higher Education | 6 |
Education Level
Higher Education | 5 |
Postsecondary Education | 4 |
Elementary Secondary Education | 2 |
Secondary Education | 2 |
Elementary Education | 1 |
Grade 8 | 1 |
High Schools | 1 |
Audience
Practitioners | 3 |
Teachers | 2 |
Location
Arizona | 1 |
Europe | 1 |
France | 1 |
Hungary | 1 |
Malawi | 1 |
Maryland | 1 |
United Arab Emirates | 1 |
United Kingdom | 1 |
United States | 1 |
Assessments and Surveys
ACT Assessment | 1 |
Graduate Record Examinations | 1 |
International English… | 1 |
National Assessment of… | 1 |
Program for International… | 1 |
SAT (College Admission Test) | 1 |
Baldwin, Peter; Clauser, Brian E. – Journal of Educational Measurement, 2022
While score comparability across test forms typically relies on common (or randomly equivalent) examinees or items, innovations in item formats, test delivery, and efforts to extend the range of score interpretation may require a special data collection before examinees or items can be used in this way--or may be incompatible with common examinee…
Descriptors: Scoring, Testing, Test Items, Test Format
Harrison, Scott; Kroehne, Ulf; Goldhammer, Frank; Lüdtke, Oliver; Robitzsch, Alexander – Large-scale Assessments in Education, 2023
Background: Mode effects, the variations in item and scale properties attributed to the mode of test administration (paper vs. computer), have stimulated research around test equivalence and trend estimation in PISA. The PISA assessment framework provides the backbone to the interpretation of the results of the PISA test scores. However, an…
Descriptors: Scoring, Test Items, Difficulty Level, Foreign Countries
Dongmei Li; Shalini Kapoor; Ann Arthur; Chi-Yu Huang; YoungWoo Cho; Chen Qiu; Hongling Wang – ACT Education Corp., 2025
Starting in April 2025, ACT will introduce enhanced forms of the ACT® test for national online testing, with a full rollout to all paper and online test takers in national, state and district, and international test administrations by Spring 2026. ACT introduced major updates by changing the test lengths and testing times, providing more time per…
Descriptors: College Entrance Examinations, Testing, Change, Scoring
Lahner, Felicitas-Maria; Lörwald, Andrea Carolin; Bauer, Daniel; Nouns, Zineb Miriam; Krebs, René; Guttormsen, Sissel; Fischer, Martin R.; Huwendiek, Sören – Advances in Health Sciences Education, 2018
Multiple true-false (MTF) items are a widely used supplement to the commonly used single-best answer (Type A) multiple choice format. However, an optimal scoring algorithm for MTF items has not yet been established, as existing studies yielded conflicting results. Therefore, this study analyzes two questions: What is the optimal scoring algorithm…
Descriptors: Scoring Formulas, Scoring Rubrics, Objective Tests, Multiple Choice Tests
Herrmann-Abell, Cari F.; Hardcastle, Joseph; DeBoer, George E. – Grantee Submission, 2019
The "Next Generation Science Standards" calls for new assessments that measure students' integrated three-dimensional science learning. The National Research Council has suggested that these assessments utilize a combination of item formats including constructed-response and multiple-choice. In this study, students were randomly assigned…
Descriptors: Science Tests, Multiple Choice Tests, Test Format, Test Items
Al Habbash, Maha; Alsheikh, Negmeldin; Liu, Xu; Al Mohammedi, Najah; Al Othali, Safa; Ismail, Sadiq Abdulwahed – International Journal of Instruction, 2021
This convergent mixed method study aimed at exploring the English context of the widely used Emirates Standardized Test (EmSAT) by juxtaposing it to its sequel, the International English Language Testing System (IELTS). For this purpose, the study used the Common European Framework of Reference (CEFR) international standards which is used as a…
Descriptors: Language Tests, English (Second Language), Second Language Learning, Guidelines
Vandeweerd, Nathan; Housen, Alex; Paquot, Magali – Language Testing, 2023
This study investigates whether re-thinking the separation of lexis and grammar in language testing could lead to more valid inferences about proficiency across modes. As argued by Römer, typical scoring rubrics ignore important information about proficiency encoded at the lexis-grammar interface, in particular how the co-selection of lexical and…
Descriptors: French, Language Tests, Grammar, Second Language Learning
National Academies Press, 2022
The National Assessment of Educational Progress (NAEP) -- often called "The Nation's Report Card" -- is the largest nationally representative and continuing assessment of what students in public and private schools in the United States know and can do in various subjects and has provided policy makers and the public with invaluable…
Descriptors: Costs, Futures (of Society), National Competency Tests, Educational Trends
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
Yarnell, Jordy B.; Pfeiffer, Steven I. – Journal of Psychoeducational Assessment, 2015
The present study examined the psychometric equivalence of administering a computer-based version of the Gifted Rating Scale (GRS) compared with the traditional paper-and-pencil GRS-School Form (GRS-S). The GRS-S is a teacher-completed rating scale used in gifted assessment. The GRS-Electronic Form provides an alternative method of administering…
Descriptors: Gifted, Psychometrics, Rating Scales, Computer Assisted Testing
Wang, Xinrui – ProQuest LLC, 2013
The computer-adaptive multistage testing (ca-MST) has been developed as an alternative to computerized adaptive testing (CAT), and been increasingly adopted in large-scale assessments. Current research and practice only focus on ca-MST panels for credentialing purposes. The ca-MST test mode, therefore, is designed to gauge a single scale. The…
Descriptors: Computer Assisted Testing, Adaptive Testing, Diagnostic Tests, Comparative Analysis
Ventouras, Errikos; Triantis, Dimos; Tsiakas, Panagiotis; Stergiopoulos, Charalampos – Computers & Education, 2011
The aim of the present research was to compare the use of multiple-choice questions (MCQs) as an examination method against the oral examination (OE) method. MCQs are widely used and their importance seems likely to grow, due to their inherent suitability for electronic assessment. However, MCQs are influenced by the tendency of examinees to guess…
Descriptors: Grades (Scholastic), Scoring, Multiple Choice Tests, Test Format
Ali, Usama S.; Chang, Hua-Hua – ETS Research Report Series, 2014
Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may also offer similar advantages, and verification of such a hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…
Descriptors: Adaptive Testing, Simulation, Pretests Posttests, Test Items
Pellicer-Sanchez, Ana; Schmitt, Norbert – Language Testing, 2012
Despite a number of research studies investigating the Yes-No vocabulary test format, one main question remains unanswered: What is the best scoring procedure to adjust for testee overestimation of vocabulary knowledge? Different scoring methodologies have been proposed based on the inclusion and selection of nonwords in the test. However, there…
Descriptors: Language Tests, Scoring, Reaction Time, Vocabulary Development
DeCarlo, Lawrence T. – ETS Research Report Series, 2008
Rater behavior in essay grading can be viewed as a signal-detection task, in that raters attempt to discriminate between latent classes of essays, with the latent classes being defined by a scoring rubric. The present report examines basic aspects of an approach to constructed-response (CR) scoring via a latent-class signal-detection model. The…
Descriptors: Scoring, Responses, Test Format, Bias