ERIC - Search Results

Showing 1 to 15 of 17 results

Comparison of Disengagement Levels and the Impact of Disengagement on Item Parameters between PISA 2015 and PISA 2018 in the United States

Peer reviewed

Kuang, Huan; Sahin, Fusun – Large-scale Assessments in Education, 2023

Background: Examinees may not make enough effort when responding to test items if the assessment has no consequence for them. These disengaged responses can be problematic in low-stakes, large-scale assessments because they can bias item parameter estimates. However, the amount of bias, and whether this bias is similar across administrations, is…

Descriptors: Test Items, Comparative Analysis, Mathematics Tests, Reaction Time

Mitigating Gender and L1 Biases in Automated English Speaking Assessment

Alexander James Kwako – ProQuest LLC, 2023

Automated assessment using Natural Language Processing (NLP) has the potential to make English speaking assessments more reliable, authentic, and accessible. Yet without careful examination, NLP may exacerbate social prejudices based on gender or native language (L1). Current NLP-based assessments are prone to such biases, yet research and…

Descriptors: Gender Bias, Natural Language Processing, Native Language, Computational Linguistics

The Comparability of Scores from Different Digital Devices: A Literature Review and Synthesis with Recommendations for Practice

Peer reviewed

Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018

Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…

Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education

Mode Comparability Study Based on Spring 2015 Operational Test Data

Liu, Junhui; Brown, Terran; Chen, Jianshen; Ali, Usama; Hou, Likun; Costanzo, Kate – Partnership for Assessment of Readiness for College and Careers, 2016

The Partnership for Assessment of Readiness for College and Careers (PARCC) is a state-led consortium working to develop next-generation assessments that more accurately, compared to previous assessments, measure student progress toward college and career readiness. The PARCC assessments include both English Language Arts/Literacy (ELA/L) and…

Descriptors: Testing, Achievement Tests, Test Items, Test Bias

An Item-Driven Adaptive Design for Calibrating Pretest Items. Research Report. ETS RR-14-38

Peer reviewed

Ali, Usama S.; Chang, Hua-Hua – ETS Research Report Series, 2014

Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may also offer similar advantages, and verification of such a hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…

Descriptors: Adaptive Testing, Simulation, Pretests Posttests, Test Items

Comparing the Score Distribution of a Trial Computer-Based Examination Cohort with That of the Standard Paper-Based Examination Cohort

Zoanetti, Nathan; Les, Magdalena; Leigh-Lancaster, David – Mathematics Education Research Group of Australasia, 2014

From 2011 to 2013 the VCAA conducted a trial aligning the use of computers in curriculum, pedagogy and assessment, culminating in a group of 62 volunteer students sitting their end of Year 12 technology-active Mathematical Methods (CAS) Examination 2 as a computer-based examination. This paper reports on statistical modelling undertaken to compare the…

Descriptors: Computer Assisted Testing, Comparative Analysis, Mathematical Concepts, Mathematics Tests

Correcting for Person Misfit in Aggregated Score Reporting

Peer reviewed

Brown, Richard S.; Villarreal, Julio C. – International Journal of Testing, 2007

There has been considerable research regarding the extent to which psychometrically sound assessments sometimes yield individual score estimates that are inconsistent with the response patterns of the individual. It has been suggested that individual response patterns may differ from expectations for a number of reasons, including subject motivation,…

Descriptors: Psychometrics, Test Bias, Testing, Simulation

The Goal of Equity within and between Computerized Adaptive Tests and Paper and Pencil Forms.

Thomasson, Gary L. – 1997

Score comparability is important to those who take tests and those who use them. One important concept related to test score comparability is that of "equity," which is defined as existing when examinees are indifferent as to which of two alternate forms of a test they would prefer to take. By their nature, computerized adaptive tests…

Descriptors: Ability, Adaptive Testing, Comparative Analysis, Computer Assisted Testing

Comparability of TOEFL CBT Writing Prompts for Different Native Language Groups

Peer reviewed

Lee, Yong-Won; Breland, Hunter; Muraki, Eiji – International Journal of Testing, 2005

This study has investigated the comparability of computer-based testing writing prompts in the Test of English as a Foreign Language™ (TOEFL) for examinees of different native language backgrounds. A total of 81 writing prompts introduced from July 1998 through August 2000 were examined using a 3-step logistic regression procedure for ordinal…

Descriptors: Language Aptitude, Effect Size, Test Bias, English (Second Language)

Revising Answers to Items in Computerized Adaptive Tests: A Comparison of Three Models.

Stocking, Martha L. – 1996

The interest in the application of large-scale computerized adaptive testing has served to focus attention on issues that arise when theoretical advances are made operational. Some of these issues stem less from changes in testing conditions and more from changes in testing paradigms. One such issue is that of the order in which questions are…

Descriptors: Adaptive Testing, Cognitive Processes, Comparative Analysis, Computer Assisted Testing

The Effects of Variable Entry for a Bayesian Adaptive Test.

Peer reviewed

Hankins, Janette A. – Educational and Psychological Measurement, 1990

The effects of a fixed and variable entry procedure on bias and information of a Bayesian adaptive test were compared. Neither procedure produced biased ability estimates on the average. Bias at the distribution extremes, efficiency curves, item subsets generated for administration, and items required to reach termination are discussed. (TJH)

Descriptors: Adaptive Testing, Aptitude Tests, Bayesian Statistics, Comparative Analysis

Assembling a Computerized Adaptive Testing Item Pool as a Set of Linear Tests

Peer reviewed

van der Linden, Wim J.; Ariel, Adelaide; Veldkamp, Bernard P. – Journal of Educational and Behavioral Statistics, 2006

Test-item writing efforts typically result in item pools with an undesirable correlational structure between the content attributes of the items and their statistical information. If such pools are used in computerized adaptive testing (CAT), the algorithm may be forced to select items with less than optimal information, that violate the content…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Item Banks

Comparing Methods of Assessing Differential Item Functioning in a Computerized Adaptive Testing Environment

Peer reviewed

Lei, Pui-Wa; Chen, Shu-Ying; Yu, Lan – Journal of Educational Measurement, 2006

Mantel-Haenszel and SIBTEST, which have known difficulty in detecting non-unidirectional differential item functioning (DIF), have been adapted with some success for computerized adaptive testing (CAT). This study adapts logistic regression (LR) and the item-response-theory-likelihood-ratio test (IRT-LRT), capable of detecting both unidirectional…

Descriptors: Evaluation Methods, Test Bias, Computer Assisted Testing, Multiple Regression Analysis

Assessing the Reliability of Computer Adaptive Testing Branching Algorithms Using HyperCAT.

Shermis, Mark D.; And Others – 1992

The reliability of four branching algorithms commonly used in computer adaptive testing (CAT) was examined. These algorithms were: (1) maximum likelihood (MLE); (2) Bayesian; (3) modal Bayesian; and (4) crossover. Sixty-eight undergraduate college students were randomly assigned to one of the four conditions using the HyperCard-based CAT program,…

Descriptors: Adaptive Testing, Algorithms, Bayesian Statistics, Comparative Analysis

Comparing Dual-Language Versions of an International Computerized-Adaptive Certification Exam.

Sireci, Stephen G.; Foster, David F.; Robin, Frederic; Olsen, James – 1997

Evaluating the comparability of a test administered in different languages is a difficult, if not impossible, task. Comparisons are problematic because observed differences in test performance between groups who take different language versions of a test could be due to a difference in difficulty between the tests, to cultural differences in test…

Descriptors: Adaptive Testing, Adults, Certification, Comparative Analysis
