Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 14 |
Since 2006 (last 20 years) | 33 |
Descriptor
Error of Measurement | 51 |
Item Response Theory | 51 |
Test Construction | 13 |
Test Items | 13 |
Test Reliability | 13 |
Mathematics Tests | 9 |
Computation | 8 |
Data Collection | 8 |
Scaling | 8 |
Scores | 8 |
Simulation | 8 |
Author
Kolen, Michael J. | 4 |
Ogasawara, Haruhiko | 3 |
Dever, Jill A. | 2 |
Fritch, Laura Burns | 2 |
Herget, Deborah R. | 2 |
Ingels, Steven J. | 2 |
Kitmitto, Sami | 2 |
Leinwand, Steve | 2 |
Ottem, Randolph | 2 |
Pratt, Daniel J. | 2 |
Raykov, Tenko | 2 |
Publication Type
Reports - Descriptive | 51 |
Journal Articles | 37 |
Numerical/Quantitative Data | 8 |
Reports - Evaluative | 2 |
Speeches/Meeting Papers | 2 |
Information Analyses | 1 |
Tests/Questionnaires | 1 |
Education Level
Secondary Education | 8 |
Early Childhood Education | 6 |
Elementary Education | 6 |
Elementary Secondary Education | 6 |
Grade 3 | 6 |
Grade 4 | 6 |
Primary Education | 6 |
Grade 5 | 5 |
Grade 6 | 5 |
Grade 7 | 5 |
Grade 8 | 5 |
Location
New York | 5 |
New Mexico | 2 |
Taiwan | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
National Assessment of Educational Progress | 3 |
Iowa Tests of Basic Skills | 1 |
Iowa Tests of Educational… | 1 |
Program for International… | 1 |
Trends in International Mathematics and Science Study | 1 |
Wechsler Preschool and… | 1 |
Work Keys (ACT) | 1 |
Casabianca, Jodi M. – Educational Measurement: Issues and Practice, 2021
Module Overview: In this digital ITEMS module, Dr. Jodi M. Casabianca provides a primer on the "hierarchical rater model" (HRM) framework and the recent expansions to the model for analyzing raters and ratings of constructed responses. In the first part of the module, she establishes an understanding of the nature of constructed…
Descriptors: Hierarchical Linear Modeling, Rating Scales, Error of Measurement, Item Response Theory
Nicewander, W. Alan – Educational and Psychological Measurement, 2019
This inquiry is focused on three indicators of the precision of measurement--conditional on fixed values of θ, the latent variable of item response theory (IRT). The indicators that are compared are (1) the traditional conditional standard errors, s(e_X | θ) = CSEM; (2) the IRT-based conditional standard errors, s_irt(e_X | θ) = C_…
Descriptors: Measurement, Accuracy, Scores, Error of Measurement
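As a companion to the Nicewander (2019) entry above: under local independence, the IRT-based conditional error variance of a number-correct score at a fixed θ is the sum of the item response variances, so the conditional SEM is its square root. The sketch below assumes a two-parameter logistic (2PL) model with made-up item parameters; it illustrates the quantity, not the article's own computations.

```python
import numpy as np

def p_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def csem_number_correct(theta, a, b):
    """IRT-based conditional SEM of the number-correct score at a fixed theta.

    Under local independence, Var(X | theta) = sum_i P_i(theta) * (1 - P_i(theta)),
    and the conditional SEM is the square root of that sum.
    """
    p = p_2pl(theta, a, b)
    return np.sqrt(np.sum(p * (1.0 - p)))

# Hypothetical item parameters for a short 10-item test (illustrative only).
a = np.array([1.2, 0.8, 1.5, 1.0, 0.9, 1.1, 1.3, 0.7, 1.4, 1.0])
b = np.array([-1.5, -0.8, -0.2, 0.0, 0.3, 0.5, 0.9, 1.2, 1.6, 2.0])

for theta in (-2.0, 0.0, 2.0):
    print(f"theta={theta:+.1f}  CSEM={csem_number_correct(theta, a, b):.3f}")
```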
Sheng, Yanyan – Measurement: Interdisciplinary Research and Perspectives, 2019
The classical approach to test theory has been the foundation of educational and psychological measurement for over 90 years. This approach is concerned with measurement error, and hence with test reliability, which in part relies on individual test items. The CTT package, developed in light of this, provides functions for test- and item-level analyses of…
Descriptors: Item Response Theory, Test Reliability, Item Analysis, Error of Measurement
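The CTT package described above is an R package; purely to make the classical quantities concrete, here is a minimal Python sketch (not the package's API, and with made-up data) that computes item difficulty, corrected item-total correlations, and Cronbach's alpha from a scored 0/1 response matrix.

```python
import numpy as np

def item_analysis(scores):
    """Classical test theory statistics for a persons-by-items matrix of 0/1 scores."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    total = scores.sum(axis=1)

    difficulty = scores.mean(axis=0)  # proportion correct per item
    # Corrected item-total correlation: each item against the total of the other items.
    item_total_r = np.array([
        np.corrcoef(scores[:, j], total - scores[:, j])[0, 1] for j in range(n_items)
    ])
    # Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the total score).
    alpha = n_items / (n_items - 1) * (
        1 - scores.var(axis=0, ddof=1).sum() / total.var(ddof=1)
    )
    return difficulty, item_total_r, alpha

# Made-up scored responses: 200 examinees, 5 items of increasing difficulty.
rng = np.random.default_rng(0)
ability = rng.normal(size=(200, 1))
b = np.linspace(-1.0, 1.0, 5)                    # made-up item difficulties
p = 1.0 / (1.0 + np.exp(-(ability - b)))         # Rasch-type response probabilities
demo = (rng.random((200, 5)) < p).astype(int)

difficulty, item_total_r, alpha = item_analysis(demo)
print("difficulty:", np.round(difficulty, 2))
print("item-total r:", np.round(item_total_r, 2))
print("alpha:", round(alpha, 3))
```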
Luecht, Richard; Ackerman, Terry A. – Educational Measurement: Issues and Practice, 2018
Simulation studies are extremely common in the item response theory (IRT) research literature. This article presents a didactic discussion of "truth" and "error" in IRT-based simulation studies. We ultimately recommend that future research focus less on the simple recovery of parameters from a convenient generating IRT model,…
Descriptors: Item Response Theory, Simulation, Ethics, Error of Measurement
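In the vocabulary of such simulation studies, the "truth" is the set of generating parameters and "error" is whatever the estimation step fails to recover. The sketch below shows only the data-generation half under an assumed 2PL model with made-up parameters; a recovery study would then calibrate the simulated responses with some estimation routine and summarize bias and RMSE against the generating values. It is a generic illustration, not the authors' design.

```python
import numpy as np

rng = np.random.default_rng(42)

# "Truth": generating parameters for a 1,000-examinee, 20-item 2PL simulation.
n_persons, n_items = 1000, 20
theta = rng.normal(0.0, 1.0, size=n_persons)          # generating abilities
a = rng.lognormal(mean=0.0, sigma=0.3, size=n_items)  # generating discriminations
b = rng.normal(0.0, 1.0, size=n_items)                # generating difficulties

# Simulate item responses from the generating model.
p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
responses = (rng.random((n_persons, n_items)) < p).astype(int)

# A recovery study would now estimate (a_hat, b_hat) from `responses` and report,
# e.g., bias = (b_hat - b).mean() and rmse = np.sqrt(((b_hat - b) ** 2).mean()).
print(responses.shape, responses.mean().round(3))
```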
Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019
The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) is constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…
Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation
Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020
Although the root mean square deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity for detecting misfit may depend strongly on the proficiency distribution of the considered…
Descriptors: Test Items, Goodness of Fit, Probability, Accuracy
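For orientation, the RMSD item-fit statistic in operational large-scale assessment work is usually defined as the root of the squared gap between the observed and model-implied item characteristic curves, integrated against the group-specific proficiency density; that weighting is precisely why its sensitivity can depend on the proficiency distribution. The notation below is conventional, not taken from the paper.

```latex
% RMSD for item i in group g, with f_g the group's proficiency density
% (conventional form, assumed here for illustration):
\mathrm{RMSD}_{ig} = \sqrt{\int \bigl(P^{\mathrm{obs}}_{ig}(\theta) - P^{\mathrm{model}}_{i}(\theta)\bigr)^{2} f_{g}(\theta)\, d\theta}
```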
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2018
This article outlines a procedure for examining the degree to which a common factor may be dominating additional factors in a multicomponent measuring instrument consisting of binary items. The procedure rests on an application of the latent variable modeling methodology and accounts for the discrete nature of the manifest indicators. The method…
Descriptors: Measurement Techniques, Factor Analysis, Item Response Theory, Likert Scales
Willse, John T. – Measurement and Evaluation in Counseling and Development, 2017
This article provides a brief introduction to the Rasch model. Motivation for using Rasch analyses is provided. Important Rasch model concepts and key aspects of result interpretation are introduced, with major points reinforced using a simulation demonstration. Concrete guidelines are provided regarding sample size and the evaluation of items.
Descriptors: Item Response Theory, Test Results, Test Interpretation, Simulation
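For readers new to the model, the Rasch item response function that such a primer builds on involves a single difficulty parameter per item; the notation below is the conventional one rather than the article's.

```latex
% Rasch (one-parameter logistic) item response function for person j and item i:
P(X_{ij} = 1 \mid \theta_j, b_i) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)}
```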
Bichi, Ado Abdu; Talib, Rohaya – International Journal of Evaluation and Research in Education, 2018
Testing in an educational system performs a number of functions; the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that testing is an important element of education. To effectively utilize tests in educational policy and quality assurance, their validity and…
Descriptors: Item Response Theory, Test Items, Test Construction, Decision Making
Martin, Michael O.; Mullis, Ina V. S. – Journal of Educational and Behavioral Statistics, 2019
International large-scale assessments of student achievement such as the International Association for the Evaluation of Educational Achievement's Trends in International Mathematics and Science Study (TIMSS) and Progress in International Reading Literacy Study, and the Organization for Economic Cooperation and Development's Program for International…
Descriptors: Achievement Tests, International Assessment, Mathematics Tests, Science Achievement
Chiu, Ting-Wei; Camilli, Gregory – Applied Psychological Measurement, 2013
Guessing behavior is an issue discussed widely with regard to multiple choice tests. Its primary effect is on number-correct scores for examinees at lower levels of proficiency. This is a systematic error or bias, which increases observed test scores. Guessing also can inflate random error variance. Correction or adjustment for guessing formulas…
Descriptors: Item Response Theory, Guessing (Tests), Multiple Choice Tests, Error of Measurement
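Two standard devices in this literature are the classical correction-for-guessing (formula) score and the pseudo-guessing parameter of the three-parameter logistic model; both are sketched below in conventional notation and are not taken verbatim from the article.

```latex
% Classical correction for guessing, with R = number right, W = number wrong,
% and k = number of options per item:
X_{c} = R - \frac{W}{k - 1}

% Three-parameter logistic (3PL) model, where the lower asymptote c_i absorbs guessing:
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp\bigl(-a_i(\theta - b_i)\bigr)}
```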
Thissen, David – Journal of Educational and Behavioral Statistics, 2016
David Thissen, a professor in the Department of Psychology and Neuroscience, Quantitative Program at the University of North Carolina, has consulted and served on technical advisory committees for assessment programs that use item response theory (IRT) over the past couple of decades. He has come to the conclusion that there are usually two purposes…
Descriptors: Item Response Theory, Test Construction, Testing Problems, Student Evaluation
New York State Education Department, 2018
This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 English Language Arts (ELA) and Mathematics 2018 Operational Tests. This report includes information about test content and test development, item (i.e., individual…
Descriptors: English, Language Arts, Language Tests, Mathematics Tests
Culpepper, Steven Andrew – Applied Psychological Measurement, 2013
A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…
Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement
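The classical starting point for such IRT-CTT comparisons is the definition of reliability as the share of observed-score variance that is true-score variance; the sketch below uses conventional notation rather than the article's own equations.

```latex
% Classical test theory reliability, with X = T + E and T, E uncorrelated:
\rho_{XX'} = \frac{\sigma^{2}_{T}}{\sigma^{2}_{X}} = 1 - \frac{\sigma^{2}_{E}}{\sigma^{2}_{X}}
```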
New York State Education Department, 2017
This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 English Language Arts (ELA) and Mathematics 2017 Operational Tests. This report includes information about test content and test development, item (i.e., individual…
Descriptors: English, Language Arts, Language Tests, Mathematics Tests