Showing 1 to 15 of 22 results
Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019
The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) are constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…
Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation
Peer reviewed
Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020
Although the root mean square deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered…
Descriptors: Test Items, Goodness of Fit, Probability, Accuracy
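The distribution-dependence at issue can be illustrated with a minimal sketch (the function names, quadrature grid, and data below are assumptions, not the paper's procedure): the RMSD weights squared deviations between observed and model-implied item characteristic curves by the proficiency density, so identical misfit yields different values under different proficiency distributions.

```python
import numpy as np

def rmsd_item_fit(p_observed, p_model, density):
    """Density-weighted root mean square deviation between observed and
    model-implied item characteristic curves on a common theta grid."""
    w = density / density.sum()                         # quadrature weights
    return np.sqrt(np.sum(w * (p_observed - p_model) ** 2))

theta = np.linspace(-4, 4, 81)
p_model = 1 / (1 + np.exp(-(theta - 0.5)))              # model-implied ICC
p_obs = np.clip(p_model + 0.05 * (theta > 1.5), 0, 1)   # misfit only at high theta
for mean in (0.0, 1.5):                                  # two proficiency distributions
    dens = np.exp(-0.5 * (theta - mean) ** 2)
    print(mean, round(float(rmsd_item_fit(p_obs, p_model, dens)), 4))
```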
Peer reviewed
Willse, John T. – Measurement and Evaluation in Counseling and Development, 2017
This article provides a brief introduction to the Rasch model. Motivation for using Rasch analyses is provided. Important Rasch model concepts and key aspects of result interpretation are introduced, with major points reinforced using a simulation demonstration. Concrete guidelines are provided regarding sample size and the evaluation of items.
Descriptors: Item Response Theory, Test Results, Test Interpretation, Simulation
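As a companion to the simulation demonstration described above, a minimal Rasch-model sketch (the person and item counts are arbitrary, not the article's design):

```python
import numpy as np

rng = np.random.default_rng(0)

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1 / (1 + np.exp(-(theta - b)))

theta = rng.normal(0, 1, size=(500, 1))     # 500 simulated persons
b = np.linspace(-2, 2, 10)                  # 10 items of increasing difficulty
responses = (rng.random((500, 10)) < rasch_prob(theta, b)).astype(int)
print(responses.mean(axis=0))               # item p-values fall as difficulty rises
```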
Peer reviewed
PDF full text available on ERIC
Bichi, Ado Abdu; Talib, Rohaya – International Journal of Evaluation and Research in Education, 2018
Testing in an educational system performs a number of functions, and the results from a test can be used to make a range of decisions in education. It is therefore well accepted in the education literature that testing is an important element of education. To effectively utilize tests in educational policies and quality assurance, their validity and…
Descriptors: Item Response Theory, Test Items, Test Construction, Decision Making
Peer reviewed
Thissen, David – Journal of Educational and Behavioral Statistics, 2016
David Thissen, a professor in the Department of Psychology and Neuroscience, Quantitative Program at the University of North Carolina, has consulted and served on technical advisory committees for assessment programs that use item response theory (IRT) over the past couple of decades. He has come to the conclusion that there are usually two purposes…
Descriptors: Item Response Theory, Test Construction, Testing Problems, Student Evaluation
Peer reviewed
Schafer, William D.; Coverdale, Bradley J.; Luxenberg, Harlan; Jin, Ying – Practical Assessment, Research & Evaluation, 2011
There are relatively few examples of quantitative approaches to quality control in educational assessment and accountability contexts. Among the several techniques that are used in other fields, Shewhart charts have been found in a few instances to be applicable in educational settings. This paper describes Shewhart charts and gives examples of how…
Descriptors: Charts, Quality Control, Educational Assessment, Statistical Analysis
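A minimal sketch of the Shewhart-chart idea, assuming a per-administration summary statistic is being monitored (the data and the plain three-sigma rule below are illustrative, not the paper's examples):

```python
import numpy as np

def shewhart_limits(history, k=3.0):
    """Center line and k-sigma control limits from a history of the monitored
    statistic; operational charts often estimate sigma from within-subgroup
    variation rather than the raw series."""
    center = np.mean(history)
    sigma = np.std(history, ddof=1)
    return center, center - k * sigma, center + k * sigma

mean_scores = [250.1, 249.8, 250.4, 250.0, 249.7]   # past administrations (made up)
center, lcl, ucl = shewhart_limits(mean_scores)
new_value = 252.9
print("signal:", not (lcl <= new_value <= ucl))      # flag an out-of-control point
```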
Peer reviewed
Finch, Holmes – Applied Psychological Measurement, 2011
Estimation of multidimensional item response theory (MIRT) model parameters can be carried out using the normal ogive with unweighted least squares estimation with the normal-ogive harmonic analysis robust method (NOHARM) software. Previous simulation research has demonstrated that this approach does yield accurate and efficient estimates of item…
Descriptors: Item Response Theory, Computation, Test Items, Simulation
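For reference, the compensatory normal-ogive MIRT item response function targeted by NOHARM's least squares approach can be written as Phi(a'theta + d); a two-dimensional sketch with invented parameter values:

```python
from math import erf, sqrt

def normal_ogive_mirt(theta, a, d):
    """Compensatory multidimensional normal-ogive IRF: P(correct) = Phi(a.theta + d)."""
    z = sum(ai * ti for ai, ti in zip(a, theta)) + d
    return 0.5 * (1 + erf(z / sqrt(2)))   # standard normal CDF

print(normal_ogive_mirt(theta=[0.5, -0.2], a=[1.0, 0.7], d=-0.3))
```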
Peer reviewed
van der Linden, Wim J. – Measurement: Interdisciplinary Research and Perspectives, 2010
The traditional way of equating the scores on a new test form X to those on an old form Y is equipercentile equating for a population of examinees. Because the population is likely to change between the two administrations, a popular approach is to equate for a "synthetic population." The authors of the articles in this issue of the…
Descriptors: Test Format, Equated Scores, Population Distribution, Population Trends
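A bare-bones sketch of equipercentile equating under an assumed (synthetic) population, omitting the smoothing and continuization steps used operationally (all scores below are invented):

```python
import numpy as np

def equipercentile_equate(x_scores, y_scores, x):
    """Map a form-X raw score to the form-Y score with the same percentile rank."""
    p = np.mean(np.asarray(x_scores) <= x)     # percentile rank of x on form X
    return np.quantile(y_scores, p)            # form-Y score at that percentile

rng = np.random.default_rng(1)
form_x = rng.binomial(40, 0.60, size=2000)     # new form X, hypothetical raw scores
form_y = rng.binomial(40, 0.55, size=2000)     # old form Y, slightly harder
print(equipercentile_equate(form_x, form_y, 25))
```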
Peer reviewed
DeMars, Christine E. – Educational and Psychological Measurement, 2008
The graded response (GR) and generalized partial credit (GPC) models do not imply that examinees ordered by raw observed score will necessarily be ordered on the expected value of the latent trait (OEL). Factors were manipulated to assess whether increased violations of OEL also produced increased Type I error rates in differential item…
Descriptors: Test Items, Raw Scores, Test Theory, Error of Measurement
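For concreteness, category probabilities under the generalized partial credit (GPC) model can be computed as below (a one-item sketch with invented parameters; the graded response model has a different functional form):

```python
import numpy as np

def gpc_probs(theta, a, b_steps):
    """Category probabilities for one GPC item:
    P(X = k) is proportional to exp(sum over v <= k of a * (theta - b_v)),
    with the k = 0 category carrying an empty sum (logit 0)."""
    terms = np.concatenate(([0.0], a * (theta - np.asarray(b_steps))))
    logits = np.cumsum(terms)
    expl = np.exp(logits - logits.max())       # stabilize before normalizing
    return expl / expl.sum()

print(gpc_probs(theta=0.5, a=1.2, b_steps=[-1.0, 0.0, 1.5]))  # four categories
```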
Peer reviewed
Solano-Flores, Guillermo; Backhoff, Eduardo; Contreras-Nino, Luis Angel – International Journal of Testing, 2009
In this article, we present a theory of test translation whose intent is to provide the conceptual foundation for effective, systematic work in the process of test translation and test translation review. According to the theory, translation error is multidimensional; it is not simply the consequence of defective translation but an inevitable fact…
Descriptors: Test Items, Investigations, Semantics, Translation
Peer reviewed
Raymond, Mark R.; Neustel, Sandra; Anderson, Dan – Educational Measurement: Issues and Practice, 2009
Examinees who take high-stakes assessments are usually given an opportunity to repeat the test if they are unsuccessful on their initial attempt. To prevent examinees from obtaining unfair score increases by memorizing the content of specific test items, testing agencies usually assign a different test form to repeat examinees. The use of multiple…
Descriptors: Test Results, Test Items, Testing, Aptitude Tests
Briggs, Derek C. – Partnership for Assessment of Readiness for College and Careers, 2011
There is often confusion about distinctions between growth models and value-added models. The first half of this paper attempts to dispel some of these confusions by clarifying terminology and illustrating by example how the results from a large-scale assessment can and will be used to make inferences about student growth and the value-added…
Descriptors: Value Added Models, Language Usage, Measurement, Inferences
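The distinction the paper addresses can be sketched numerically: simple growth is the change in score, while a value-added summary adjusts current scores for prior achievement before aggregating by unit (all data, group labels, and the single-predictor adjustment below are invented for illustration, not the paper's example):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
prior = rng.normal(500, 50, n)                     # prior-year scale scores
school = rng.integers(0, 3, n)                     # three hypothetical schools
true_effect = np.array([-5.0, 0.0, 5.0])[school]   # simulated school contribution
current = 0.8 * prior + 120 + true_effect + rng.normal(0, 20, n)

growth = current - prior                           # growth: raw score change
beta = np.polyfit(prior, current, 1)               # value-added: mean residual from
residual = current - np.polyval(beta, prior)       # a regression on the prior score
for s in range(3):
    print(s, round(growth[school == s].mean(), 1), round(residual[school == s].mean(), 1))
```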
Peer reviewed
Sykes, Robert C.; Ito, Kyoko; Wang, Zhen – Educational Measurement: Issues and Practice, 2008
Student responses to a large number of constructed response items in three Math and three Reading tests were scored on two occasions using three ways of assigning raters: single reader scoring, a different reader for each response (item-specific), and three readers each scoring a rater item block (RIB) containing approximately one-third of a…
Descriptors: Test Items, Mathematics Tests, Reading Tests, Scoring
Peer reviewed
Lee, Won-Chan – Applied Psychological Measurement, 2007
This article introduces a multinomial error model, which models an examinee's test scores obtained over repeated measurements of an assessment that consists of polytomously scored items. A compound multinomial error model is also introduced for situations in which items are stratified according to content categories and/or prespecified numbers of…
Descriptors: Simulation, Error of Measurement, Scoring, Test Items
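The core idea of a multinomial error model can be sketched by treating an examinee's replicated scores as draws of item category counts from fixed category proportions (the proportions, scoring weights, and item count below are invented):

```python
import numpy as np

rng = np.random.default_rng(3)
n_items = 20
pi = np.array([0.2, 0.3, 0.5])        # examinee's proportions for item scores 0/1/2
weights = np.array([0, 1, 2])
counts = rng.multinomial(n_items, pi, size=1000)   # replicated measurements
scores = counts @ weights                          # total score per replication
print(scores.mean(), scores.var())    # expected score and error variance over replications
```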
PDF pending restoration
Dimitrov, Dimiter M. – 2002
Exact formulas for classical error variance are provided for Rasch measurement with logistic distributions. An approximation formula with the normal ability distribution is also provided. With the proposed formulas, the additive contribution of individual items to the population error variance can be determined without knowledge of the other test…
Descriptors: Ability, Error of Measurement, Item Response Theory, Test Items
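The additivity claim can be illustrated numerically: under the Rasch model, each item's contribution to the population error variance of the number-correct score is E over theta of P_i(theta)(1 - P_i(theta)), which depends only on that item's difficulty and the ability distribution. The quadrature sketch below is an assumption-laden illustration; the paper itself gives exact and approximate closed-form expressions.

```python
import numpy as np

def item_error_variance(b, theta_mean=0.0, theta_sd=1.0, n_grid=201):
    """Single Rasch item's contribution to population error variance:
    E_theta[P(theta) * (1 - P(theta))] under a normal ability distribution,
    approximated on a discrete theta grid."""
    theta = np.linspace(theta_mean - 5 * theta_sd, theta_mean + 5 * theta_sd, n_grid)
    dens = np.exp(-0.5 * ((theta - theta_mean) / theta_sd) ** 2)
    dens /= dens.sum()                              # normalized quadrature weights
    p = 1 / (1 + np.exp(-(theta - b)))
    return float(np.sum(p * (1 - p) * dens))

print(round(item_error_variance(b=0.0), 4))   # item matched to the population mean
print(round(item_error_variance(b=2.5), 4))   # a much harder item contributes less
```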