Publication Date
In 2025: 0
Since 2024: 2
Since 2021 (last 5 years): 16
Since 2016 (last 10 years): 40
Since 2006 (last 20 years): 86
Descriptor
Scores: 117
Test Bias: 117
Test Items: 117
Item Response Theory: 31
Comparative Analysis: 28
Test Reliability: 27
Statistical Analysis: 25
Test Validity: 25
Item Analysis: 22
Difficulty Level: 21
Foreign Countries: 21
Author
Dorans, Neil J.: 3
Liu, Ou Lydia: 3
Ackerman, Terry: 2
Camilli, Gregory: 2
Gorney, Kylie: 2
Lee, Won-Chan: 2
Lee, Yi-Hsuan: 2
Magis, David: 2
Prowker, Adam: 2
Robin, Frederic: 2
Sinharay, Sandip: 2
Education Level
Secondary Education: 20
Higher Education: 19
Postsecondary Education: 13
Elementary Education: 12
High Schools: 12
Grade 8: 8
Middle Schools: 8
Elementary Secondary Education: 7
Grade 4: 6
Grade 7: 6
Junior High Schools: 6
Audience
Researchers: 3
Community: 1
Parents: 1
Location
Canada: 4
Iran: 4
California: 2
Florida: 2
Indonesia: 2
New Mexico: 2
Pennsylvania: 2
Spain: 2
Turkey: 2
Alabama: 1
Belgium: 1
What Works Clearinghouse Rating
Meets WWC Standards without Reservations: 1
Meets WWC Standards with or without Reservations: 1
Andrew D. Ho – Journal of Educational and Behavioral Statistics, 2024
I review opportunities and threats that widely accessible Artificial Intelligence (AI)-powered services present for educational statistics and measurement. Algorithmic and computational advances continue to improve approaches to item generation, scale maintenance, test security, test scoring, and score reporting. Predictable misuses of AI for…
Descriptors: Artificial Intelligence, Measurement, Educational Assessment, Technology Uses in Education
Paula Elosua – Language Assessment Quarterly, 2024
In sociolinguistic contexts where standardized languages coexist with regional dialects, the study of differential item functioning is a valuable tool for examining certain linguistic uses or varieties as threats to score validity. From an ecological perspective, this paper describes three stages in the study of differential item functioning…
Descriptors: Reading Tests, Reading Comprehension, Scores, Test Validity
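Elosua's three-stage procedure is not spelled out in the truncated abstract above; for orientation, here is a minimal sketch of the Mantel-Haenszel screen that DIF studies of this kind conventionally start from (the toy data and the 1.5-delta flag are illustrative assumptions, not details from the paper):

```python
import numpy as np

def mantel_haenszel_delta(item, group, total):
    """item: 0/1 scores; group: 0 = reference, 1 = focal; total: matching score."""
    num = den = 0.0
    for s in np.unique(total):
        m = total == s
        a = np.sum((group[m] == 0) & (item[m] == 1))  # reference, correct
        b = np.sum((group[m] == 0) & (item[m] == 0))  # reference, incorrect
        c = np.sum((group[m] == 1) & (item[m] == 1))  # focal, correct
        d = np.sum((group[m] == 1) & (item[m] == 0))  # focal, incorrect
        n = a + b + c + d
        if n > 0:
            num += a * d / n
            den += b * c / n
    return -2.35 * np.log(num / den)  # ETS delta scale; |delta| >= 1.5 suggests moderate DIF

# Toy data in which responses depend on total score only, so delta should sit near 0.
rng = np.random.default_rng(0)
total = rng.integers(0, 21, 2000)
group = rng.integers(0, 2, 2000)
item = (rng.random(2000) < 1 / (1 + np.exp(-(total - 10) / 4))).astype(int)
print(mantel_haenszel_delta(item, group, total))
```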
Gorney, Kylie; Wollack, James A.; Sinharay, Sandip; Eckerly, Carol – Journal of Educational and Behavioral Statistics, 2023
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item…
Descriptors: Scores, Test Validity, Test Items, Prior Learning
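The truncated abstract stops before the authors' statistic; the sketch below only conveys the core intuition of contrasting performance on suspected-compromised items against secure items (the function name, matrix layout, and z cutoff are assumptions for illustration, not the paper's procedure):

```python
import numpy as np

def preknowledge_screen(resp, compromised, z_crit=3.0):
    """resp: examinees-by-items 0/1 matrix; compromised: boolean mask over items."""
    gain = resp[:, compromised].mean(axis=1) - resp[:, ~compromised].mean(axis=1)
    z = (gain - gain.mean()) / gain.std(ddof=1)  # standardize the gain across examinees
    return np.flatnonzero(z > z_crit)            # indices of flagged examinees
```

Examinees with preknowledge should show an unusually large gain on the compromised subset, which is the signal any such detector exploits.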
Gorney, Kylie – ProQuest LLC, 2023
Aberrant behavior refers to any type of unusual behavior that would not be expected under normal circumstances. In educational and psychological testing, such behaviors have the potential to severely bias the aberrant examinee's test score while also jeopardizing the test scores of countless others. It is therefore crucial that aberrant examinees…
Descriptors: Behavior Problems, Educational Testing, Psychological Testing, Test Bias
Gu, Zhengguo; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2021
Clinical, medical, and health psychologists use difference scores obtained from pretest-posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorders, or addiction. Reliability of difference scores is important for interpreting observed…
Descriptors: Test Reliability, Scores, Pretests Posttests, Computation
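The abstract breaks off mid-sentence, but the classical result it builds on is standard: with pretest score X, posttest score Y, and difference D = Y - X, the reliability of D is

```latex
\rho_{DD'} =
  \frac{\sigma_X^{2}\,\rho_{XX'} + \sigma_Y^{2}\,\rho_{YY'} - 2\,\sigma_X \sigma_Y \rho_{XY}}
       {\sigma_X^{2} + \sigma_Y^{2} - 2\,\sigma_X \sigma_Y \rho_{XY}}
```

The cross-term is subtracted from numerator and denominator alike, but the numerator starts smaller because the reliabilities are below 1, so a high pretest-posttest correlation pushes reliability down; this is the familiar reason difference scores can be unreliable even when both component tests are.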
Brennan, Robert L.; Kim, Stella Y.; Lee, Won-Chan – Educational and Psychological Measurement, 2022
This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and…
Descriptors: Multivariate Analysis, Generalizability Theory, Multiple Choice Tests, Test Construction
Joo, Sean; Ali, Usama; Robin, Frederic; Shin, Hyo Jeong – Large-scale Assessments in Education, 2022
We investigated the potential impact of differential item functioning (DIF) on group-level mean and standard deviation estimates using empirical and simulated data in the context of large-scale assessment. For the empirical investigation, PISA 2018 cognitive domains (Reading, Mathematics, and Science) data were analyzed using Jackknife sampling to…
Descriptors: Test Items, Item Response Theory, Scores, Student Evaluation
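For readers unfamiliar with the resampling step, the leave-one-out idea behind jackknife standard errors is easy to state (PISA actually uses a replicate-weight design that is more involved; this toy version is only for orientation):

```python
import numpy as np

def jackknife_se(x):
    """Leave-one-out jackknife standard error of the sample mean."""
    n = len(x)
    loo = np.array([np.delete(x, i).mean() for i in range(n)])
    return np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))

scores = np.random.default_rng(1).normal(500, 100, size=200)
print(jackknife_se(scores))  # matches scores.std(ddof=1) / sqrt(200) for the mean
```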
Rodriguez, Rebekah M.; Silvia, Paul J.; Kaufman, James C.; Reiter-Palmon, Roni; Puryear, Jeb S. – Creativity Research Journal, 2023
The original 90-item Creative Behavior Inventory (CBI) was a landmark self-report scale in creativity research, and the 28-item brief form developed nearly 20 years ago continues to be a popular measure of everyday creativity. Relatively little is known, however, about the psychometric properties of this widely used scale. In the current research,…
Descriptors: Creativity Tests, Creativity, Creative Thinking, Psychometrics
Tim Jacobbe; Bob delMas; Brad Hartlaub; Jeff Haberstroh; Catherine Case; Steven Foti; Douglas Whitaker – Numeracy, 2023
The development of assessments as part of the funded LOCUS project is described. The assessments measure students' conceptual understanding of statistics as outlined in the GAISE PreK-12 Framework. Results are reported from a large-scale administration to 3,430 students in grades 6 through 12 in the United States. Items were designed to assess…
Descriptors: Statistics Education, Common Core State Standards, Student Evaluation, Elementary School Students
Ip, Edward H.; Strachan, Tyler; Fu, Yanyan; Lay, Alexandra; Willse, John T.; Chen, Shyh-Huei; Rutkowski, Leslie; Ackerman, Terry – Journal of Educational Measurement, 2019
Test items must often be broad in scope to be ecologically valid. It is therefore almost inevitable that secondary dimensions are introduced into a test during test development. A cognitive test may require one or more abilities besides the primary ability to correctly respond to an item, in which case a unidimensional test score overestimates the…
Descriptors: Test Items, Test Bias, Test Construction, Scores
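To see why a unidimensional score overestimates the primary ability, consider a compensatory two-dimensional logistic item (an assumed illustration, not the authors' model): extra standing on the secondary dimension raises the success probability at a fixed primary ability, and a unidimensional scoring model credits that increase to the primary trait.

```python
import numpy as np

def p_correct(theta1, theta2, a1=1.2, a2=0.5, d=0.0):
    """Compensatory two-dimensional logistic item response function."""
    return 1 / (1 + np.exp(-(a1 * theta1 + a2 * theta2 + d)))

print(p_correct(0.0, 0.0))  # 0.50 at the mean of both dimensions
print(p_correct(0.0, 1.5))  # ~0.68 with the same primary ability, high secondary ability
```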
Kuang, Huan; Sahin, Fusun – Large-scale Assessments in Education, 2023
Background: Examinees may not make enough effort when responding to test items if the assessment has no consequence for them. These disengaged responses can be problematic in low-stakes, large-scale assessments because they can bias item parameter estimates. However, the amount of bias, and whether this bias is similar across administrations, is…
Descriptors: Test Items, Comparative Analysis, Mathematics Tests, Reaction Time
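A typical first step in this literature (assumed here; the truncated abstract does not give the authors' criterion) is to flag rapid-guessing responses with a response-time threshold and compare item statistics with and without them:

```python
import numpy as np

rng = np.random.default_rng(2)
rt = rng.lognormal(mean=3.0, sigma=0.5, size=(1000, 30))    # seconds per item
resp = rng.integers(0, 2, size=(1000, 30)).astype(float)    # 0/1 item scores

rapid = rt < 0.10 * np.median(rt, axis=0)   # flag times under 10% of each item's median
engaged = np.where(rapid, np.nan, resp)     # drop flagged responses
print(np.nanmean(engaged, axis=0) - resp.mean(axis=0))  # per-item shift in p-values
```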
Alexander James Kwako – ProQuest LLC, 2023
Automated assessment using Natural Language Processing (NLP) has the potential to make English speaking assessments more reliable, authentic, and accessible. Yet without careful examination, NLP may exacerbate social prejudices based on gender or native language (L1). Current NLP-based assessments are prone to such biases, yet research and…
Descriptors: Gender Bias, Natural Language Processing, Native Language, Computational Linguistics
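One elementary screen in this spirit (purely illustrative, not the dissertation's analysis) is the standardized mean difference of automated scores between two examinee groups:

```python
import numpy as np

def smd(scores_a, scores_b):
    """Standardized mean difference of automated scores between two groups."""
    pooled_sd = np.sqrt((scores_a.var(ddof=1) + scores_b.var(ddof=1)) / 2)
    return (scores_a.mean() - scores_b.mean()) / pooled_sd
```

A nonzero SMD alone does not establish bias, since the groups may differ in true proficiency; DIF-style analyses condition on ability for exactly that reason.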
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
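For context, the Rasch model assigns each item a single difficulty parameter, which is what makes calibration with small samples plausible; a toy drift check between two calibrations might look as follows (the 0.3-logit cutoff is a common rule of thumb, not the study's criterion):

```python
import numpy as np

def rasch_p(theta, b):
    """Rasch probability of a correct response at ability theta, item difficulty b."""
    return 1 / (1 + np.exp(-(theta - b)))

b_first = np.array([-1.2, -0.3, 0.4, 1.1])   # first calibration
b_second = np.array([-1.1, -0.3, 0.9, 1.0])  # recalibration on a new sample
print(np.abs(b_second - b_first) > 0.3)      # [False False True False]: one candidate drifter
```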
Ames, Allison J. – Educational and Psychological Measurement, 2022
Individual response style behaviors, unrelated to the latent trait of interest, may influence responses to ordinal survey items. Response style can introduce bias in the total score with respect to the trait of interest, threatening valid interpretation of scores. Despite claims of response style stability across scales, there has been little…
Descriptors: Response Style (Tests), Individual Differences, Scores, Test Items
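One crude illustration of a response style is extreme responding, indexed by the proportion of endpoint categories a respondent selects (the article's latent-variable treatment is more sophisticated than this raw count, and the scale range here is an assumption):

```python
import numpy as np

def extreme_rs(X, low=1, high=5):
    """Proportion of endpoint responses per respondent on a 1-5 ordinal scale."""
    return np.mean((X == low) | (X == high), axis=1)

X = np.random.default_rng(3).integers(1, 6, size=(4, 10))  # 4 respondents, 10 items
print(extreme_rs(X))
```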
Lambert, Matthew C.; Martin, Jodie; Epstein, Michael H.; Cullinan, Douglas; Katsiyannis, Antonis – Psychology in the Schools, 2021
The present study investigated the psychometric properties of the "Scales for Assessing Emotional Disturbance, Third Edition: Rating Scale" (SAED-3 RS), which is designed for use in identifying students with emotional disturbance for special education services. The purposes of this study were to evaluate (a) the measurement invariance…
Descriptors: Disability Identification, Rating Scales, Test Items, Item Response Theory