ERIC - Search Results

Publication Date

In 2025	1
Since 2024	4
Since 2021 (last 5 years)	10
Since 2016 (last 10 years)	26
Since 2006 (last 20 years)	40

Descriptor

Error of Measurement	46
Item Response Theory	46
Test Reliability	46
Test Validity	19
Test Items	16
Test Construction	14
Scoring	12
Scores	11
Foreign Countries	9
Test Bias	9
Data Collection	8
Grade 3	8
Mathematics Tests	8
Simulation	8
English	7
Grade 4	7
Achievement Tests	6
Equated Scores	6
Grade 8	6
Psychometrics	6
Sample Size	6
Scaling	6
Test Length	6
Test Theory	6
Testing Programs	6
More ▼

Publication Type

Journal Articles	28
Reports - Research	23
Reports - Descriptive	13
Numerical/Quantitative Data	8
Reports - Evaluative	5
Dissertations/Theses -…	3
Guides - Non-Classroom	2
Speeches/Meeting Papers	2
Collected Works - General	1

Education Level

Secondary Education	10
Elementary Education	8
Junior High Schools	7
Middle Schools	7
Early Childhood Education	6
Grade 3	6
Grade 4	6
Grade 8	6
Intermediate Grades	6
Primary Education	6
Grade 5	5
Grade 6	5
Grade 7	5
Elementary Secondary Education	4
High Schools	2
Kindergarten	2
Grade 9	1
Higher Education	1
Postsecondary Education	1
More ▼

Audience

Location

New York	5
Indonesia	3
New Mexico	2
Canada	1
Netherlands	1
Norway	1
South Africa	1
South Korea	1
Spain	1
Taiwan	1
United Kingdom (England)	1
More ▼

Laws, Policies, & Programs

No Child Left Behind Act 2001

Assessments and Surveys

Early Childhood Longitudinal…	2
Armed Forces Qualification…	1
Iowa Tests of Basic Skills	1
Iowa Tests of Educational…	1
National Assessment of…	1
Program for International…	1
Student Teacher Relationship…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 46 results Save | Export

Detecting Differential Item Functioning among Multiple Groups Using IRT Residual DIF Framework

Peer reviewed

Direct link

Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024

This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…

Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction

Exploring the Influence of Response Styles on Continuous Scale Assessments: Insights from a Novel Modeling Approach

Peer reviewed

Direct link

Hung-Yu Huang – Educational and Psychological Measurement, 2025

The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…

Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability

Linking Errors Introduced by Rapid Guessing Responses When Employing Multigroup Concurrent IRT Scaling

Direct link

Jiayi Deng – ProQuest LLC, 2024

Test score comparability in international large-scale assessments (LSA) is of utmost importance in measuring the effectiveness of education systems and understanding the impact of education on economic growth. To effectively compare test scores on an international scale, score linking is widely used to convert raw scores from different linguistic…

Descriptors: Item Response Theory, Scoring Rubrics, Scoring, Error of Measurement

Practical Considerations in Choosing an Anchor Test Form for Equating under the Random Groups Design

Peer reviewed

Direct link

Cui, Zhongmin; He, Yong – Measurement: Interdisciplinary Research and Perspectives, 2023

Careful considerations are necessary when there is a need to choose an anchor test form from a list of old test forms for equating under the random groups design. The choice of the anchor form potentially affects the accuracy of equated scores on new test forms. Few guidelines, however, can be found in the literature on choosing the anchor form.…

Descriptors: Test Format, Equated Scores, Best Practices, Test Construction

Separation of Traits and Extreme Response Style in IRTree Models: The Role of Mimicry Effects for the Meaningful Interpretation of Estimates

Peer reviewed

Direct link

Viola Merhof; Caroline M. Böhm; Thorsten Meiser – Educational and Psychological Measurement, 2024

Item response tree (IRTree) models are a flexible framework to control self-reported trait measurements for response styles. To this end, IRTree models decompose the responses to rating items into sub-decisions, which are assumed to be made on the basis of either the trait being measured or a response style, whereby the effects of such person…

Descriptors: Item Response Theory, Test Interpretation, Test Reliability, Test Validity

The Effect of Multiple-Choice Test Items' Difficulty Degree on the Reliability Coefficient and the Standard Error of Measurement Depending on the Item Response Theory (IRT)

Peer reviewed
PDF on ERIC

Download full text

Al-zboon, Habis Saad; Alrekebat, Amjad Farhan – International Journal of Higher Education, 2021

This study aims at identifying the effect of multiple-choice test items' difficulty degree on the reliability coefficient and the standard error of measurement depending on the item response theory IRT. To achieve the objectives of the study, (WinGen3) software was used to generate the IRT parameters (difficulty, discrimination, guessing) for four…

Descriptors: Multiple Choice Tests, Test Items, Difficulty Level, Error of Measurement

Conditional Standard Error of Measurement: Classical Test Theory, Generalizability Theory and Many-Facet Rasch Measurement with Applications to Writing Assessment

Peer reviewed
PDF on ERIC

Download full text

Huebner, Alan; Skar, Gustaf B. – Practical Assessment, Research & Evaluation, 2021

Writing assessments often consist of students responding to multiple prompts, which are judged by more than one rater. To establish the reliability of these assessments, there exist different methods to disentangle variation due to prompts and raters, including classical test theory, Many Facet Rasch Measurement (MFRM), and Generalizability Theory…

Descriptors: Error of Measurement, Test Theory, Generalizability Theory, Item Response Theory

New Curriculum Efficacy Study and Lord's Paradox

Peer reviewed

Direct link

Perman Gochyyev; Mark Wilson – Society for Research on Educational Effectiveness, 2021

Lord's paradox arises from the conflicting inferences obtained from two alternative approaches that are typically used in evaluating the treatment effect using a pre-post test design. The two main approaches for analyzing such data are: (1) to regress the change from pretest to posttest on the treatment indicator (change score approach; CS); (2)…

Descriptors: Students, Mathematics Curriculum, Mathematics Instruction, Curriculum Development

The Invariance Paradox: Using Optimal Test Design to Minimize Bias

Peer reviewed

Direct link

Jones, Andrew T.; Kopp, Jason P.; Ong, Thai Q. – Educational Measurement: Issues and Practice, 2020

Studies investigating invariance have often been limited to measurement or prediction invariance. Selection invariance, wherein the use of test scores for classification results in equivalent classification accuracy between groups, has received comparatively little attention in the psychometric literature. Previous research suggests that some form…

Descriptors: Test Construction, Test Bias, Classification, Accuracy

Conditional Precision of Measurement for Test Scores: Are Conditional Standard Errors Sufficient?

Peer reviewed

Direct link

Nicewander, W. Alan – Educational and Psychological Measurement, 2019

This inquiry is focused on three indicators of the precision of measurement--conditional on fixed values of ?, the latent variable of item response theory (IRT). The indicators that are compared are (1) The traditional, conditional standard errors, s(eX|?) = CSEM; (2) the IRT-based conditional standard errors, s[subscript irt](eX|?)=C[subscript…

Descriptors: Measurement, Accuracy, Scores, Error of Measurement

CTT Package in R

Peer reviewed

Direct link

Sheng, Yanyan – Measurement: Interdisciplinary Research and Perspectives, 2019

Classical approach to test theory has been the foundation for educational and psychological measurement for over 90 years. This approach concerns with measurement error and hence test reliability, which in part relies on individual test items. The CTT package, developed in light of this, provides functions for test- and item-level analyses of…

Descriptors: Item Response Theory, Test Reliability, Item Analysis, Error of Measurement

From OLS to Multilevel Multidimensional Mixture IRT: A Model Refinement Approach to Investigating Patterns of Relationships in PISA 2012 Data

Direct link

Gulsah Gurkan – ProQuest LLC, 2021

Secondary analyses of international large-scale assessments (ILSA) commonly characterize relationships between variables of interest using correlations. However, the accuracy of correlation estimates is impaired by artefacts such as measurement error and clustering. Despite advancements in methodology, conventional correlation estimates or…

Descriptors: Secondary School Students, Achievement Tests, International Assessment, Foreign Countries

Student Perceptions of Teaching Quality in Five Countries: A Partial Credit Model Approach to Assess Measurement Invariance

Peer reviewed

Direct link

van der Lans, Rikkert M.; Maulana, Ridwan; Helms-Lorenz, Michelle; Fernández-García, Carmen-María; Chun, Seyeoung; de Jager, Thelma; Irnidayanti, Yulia; Inda-Caro, Mercedes; Lee, Okhwa; Coetzee, Thys; Fadhilah, Nurul; Jeon, Meae; Moorer, Peter – SAGE Open, 2021

This study examines measurement invariance of student perceptions of teaching quality collected in five countries: Indonesia (n students = 6,331), the Netherlands (n students = 6,738), South Africa (n students = 3,422), South Korea (n students = 6,997) and Spain (n students = 4,676). The administered questionnaire was the My Teacher Questionnaire…

Descriptors: Foreign Countries, Student Attitudes, Student Evaluation of Teacher Performance, Teacher Effectiveness

Accuracy of a Classical Test Theory-Based Procedure for Estimating the Reliability of a Multistage Test. Research Report. ETS RR-17-02

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017

The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…

Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing

Developing IRT-Based Physics Critical Thinking Skill Test: A CAT to Answer 21st Century Challenge

Peer reviewed
PDF on ERIC

Download full text

Istiyono, Edi; Dwandaru, Wipsar Sunu Brams; Lede, Yulita Adelfin; Rahayu, Farida; Nadapdap, Amipa – International Journal of Instruction, 2019

The objective of this study was to develop Physics critical thinking skill test using computerized adaptive test (CAT) based on item response theory (IRT). This research was a development research using 4-D (define, design, develop, and disseminate). The content validity of the items was proven using Aiken's V. The test trial involved 252 students…

Descriptors: Critical Thinking, Thinking Skills, Cognitive Tests, Physics

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4

New York State Education…	5
Educational and Psychological…	4
Applied Measurement in…	3
ProQuest LLC	3
ETS Research Report Series	2
Educational Measurement:…	2
Grantee Submission	2
Measurement:…	2
National Center for Education…	2
New Mexico Public Education…	2
Applied Psychological…	1
EURASIA Journal of…	1
Educational Sciences: Theory…	1
IEEE Transactions on Education	1
International Journal of…	1
International Journal of…	1
International Journal of…	1
International Journal of…	1
Journal of Education and…	1
Journal of Educational…	1
Practical Assessment,…	1
Psychometrika	1
Research Papers in Education	1
SAGE Open	1
Society for Research on…	1
More ▼

Blaker, Lisa	2
Kolen, Michael J.	2
Lê, Thanh	2
Najarian, Michelle	2
Nord, Christine	2
Paek, Insu	2
Schoen, Robert C.	2
Tourangeau, Karen	2
Vaden-Kiernan, Nancy	2
Wallner-Allen, Kathleen	2
Yang, Xiaotong	2
Yao, Lihua	2
Adams, Raymond J.	1
Al-zboon, Habis Saad	1
Alrekebat, Amjad Farhan	1
Bichi, Ado Abdu	1
Bramley, Tom	1
Brennan, Robert L.	1
Bristow, M.	1
Caroline M. Böhm	1
Chen, Hsueh-Chu	1
Cheung, K. C.	1
Chun, Seyeoung	1
Coetzee, Thys	1
Colton, Dean A.	1
More ▼