ERIC - Search Results

Showing 1 to 15 of 16 results

Classical Item Analysis from a Signal Detection Perspective

Peer reviewed

DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023

A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…

Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness

A Closed-Form Alternative for Estimating [omega] Reliability under Unidimensionality

Peer reviewed

Hancock, Gregory R.; An, Ji – Measurement: Interdisciplinary Research and Perspectives, 2020

As an alternative to Cronbach's [alpha] for estimating scale reliability, McDonald's [omega] has attracted increased attention within the methodological community for its less stringent measurement assumptions. Notwithstanding, [omega] is still seldom used by practitioners, likely due to its unavailability in popular software packages (e.g., SPSS)…

Descriptors: Evaluation, Alternative Assessment, Reliability, Test Reliability

Estimating Treatment Effects with the Explanatory Item Response Model. EdWorkingPaper No. 22-677

Joshua B. Gilbert – Annenberg Institute for School Reform at Brown University, 2022

This simulation study examines the characteristics of the Explanatory Item Response Model (EIRM) when estimating treatment effects when compared to classical test theory (CTT) sum and mean scores and item response theory (IRT)-based theta scores. Results show that the EIRM and IRT theta scores provide generally equivalent bias and false positive…

Descriptors: Item Response Theory, Models, Test Theory, Computation

Digital ITEMS Module 1: Reliability in Classical Test Theory

Peer reviewed

Lewis, Charlie; Chajewski, Michael; Rupp, André A. – Educational Measurement: Issues and Practice, 2018

In this ITEMS module, we provide a two-part introduction to the topic of reliability from the perspective of "classical test theory" (CTT). In the first part, which is directed primarily at beginning learners, we review and build on the content presented in the original didactic ITEMS article by Traub and Rowley (1991). Specifically, we…

Descriptors: Test Reliability, Test Theory, Computation, Data Collection

On True Score Evaluation Using Item Response Theory Modeling

Peer reviewed

Raykov, Tenko; Dimitrov, Dimiter M.; Marcoulides, George A.; Harrison, Michael – Educational and Psychological Measurement, 2019

Building on prior research on the relationships between key concepts in item response theory and classical test theory, this note contributes to highlighting their important and useful links. A readily and widely applicable latent variable modeling procedure is discussed that can be used for point and interval estimation of the individual person…

Descriptors: True Scores, Item Response Theory, Test Items, Test Theory

Modifying Spearman's Attenuation Equation to Yield Partial Corrections for Measurement Error--With Application to Sample Size Calculations

Peer reviewed

Nicewander, W. Alan – Educational and Psychological Measurement, 2018

Spearman's correction for attenuation (measurement error) corrects a correlation coefficient for measurement errors in either-or-both of two variables, and follows from the assumptions of classical test theory. Spearman's equation removes all measurement error from a correlation coefficient which translates into "increasing the reliability of…

Descriptors: Error of Measurement, Correlation, Sample Size, Computation
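
As a brief illustration of the classical (full) Spearman correction that the abstract above refers to, the sketch below divides an observed correlation by the square root of the product of the two reliabilities. It shows only the standard textbook formula, not the partial correction proposed in the article; the function name `disattenuate` and the numeric values are hypothetical and chosen for illustration.

```python
import math


def disattenuate(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Classical Spearman correction for attenuation.

    r_xy: observed correlation between the two measures
    r_xx, r_yy: reliabilities of the two measures (each in (0, 1])
    Returns the estimated correlation between the true scores.
    """
    if not (0.0 < r_xx <= 1.0 and 0.0 < r_yy <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values: observed r = 0.42, reliabilities 0.70 and 0.80.
corrected = disattenuate(0.42, 0.70, 0.80)
print(round(corrected, 3))
```

With perfectly reliable measures (both reliabilities equal to 1) the function returns the observed correlation unchanged, which is a quick sanity check on the formula.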

Effects of Various Simulation Conditions on Latent-Trait Estimates: A Simulation Study

Peer reviewed

Kogar, Hakan – International Journal of Assessment Tools in Education, 2018

The aim of this simulation study was to determine the relationship between true latent scores and estimated latent scores by including various control variables and different statistical models. The study also aimed to compare the statistical models and determine the effects of different distribution types, response formats and sample sizes on latent…

Descriptors: Simulation, Context Effect, Computation, Statistical Analysis

"TechCheck": Development and Validation of an Unplugged Assessment of Computational Thinking in Early Childhood Education

Peer reviewed

Relkin, Emily; de Ruiter, Laura; Bers, Marina Umaschi – Journal of Science Education and Technology, 2020

There is a need for developmentally appropriate Computational Thinking (CT) assessments that can be implemented in early childhood classrooms. We developed a new instrument called "TechCheck" for assessing CT skills in young children that does not require prior knowledge of computer programming. "TechCheck" is based on…

Descriptors: Developmentally Appropriate Practices, Computation, Thinking Skills, Early Childhood Education

A Strategy for Replacing Sum Scoring

Peer reviewed

Ramsay, James O.; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2017

This article promotes the use of modern test theory in testing situations where sum scores for binary responses are now used. It directly compares the efficiencies and biases of classical and modern test analyses and finds an improvement in the root mean squared error of ability estimates of about 5% for two designed multiple-choice tests and…

Descriptors: Scoring, Test Theory, Computation, Maximum Likelihood Statistics

Quantifying Local, Response Dependence between Two Polytomous Items Using the Rasch Model

Peer reviewed

Andrich, David; Humphry, Stephen M.; Marais, Ida – Applied Psychological Measurement, 2012

Models of modern test theory imply statistical independence among responses, generally referred to as "local independence." One violation of local independence occurs when the response to one item governs the response to a subsequent item. Expanding on a formulation of this kind of violation as a process in the dichotomous Rasch model,…

Descriptors: Test Theory, Models, Item Response Theory, Evidence

A Comparison of Teacher Effectiveness Measures Calculated Using Three Multilevel Models for Raters Effects

Peer reviewed

Murphy, Daniel L.; Beretvas, S. Natasha – Applied Measurement in Education, 2015

This study examines the use of cross-classified random effects models (CCrem) and cross-classified multiple membership random effects models (CCMMrem) to model rater bias and estimate teacher effectiveness. Effect estimates are compared using CTT versus item response theory (IRT) scaling methods and three models (i.e., conventional multilevel…

Descriptors: Teacher Effectiveness, Comparative Analysis, Hierarchical Linear Modeling, Test Theory

Using IRT Trait Estimates versus Summated Scores in Predicting Outcomes

Peer reviewed

Xu, Ting; Stone, Clement A. – Educational and Psychological Measurement, 2012

It has been argued that item response theory trait estimates should be used in analyses rather than number right (NR) or summated scale (SS) scores. Thissen and Orlando postulated that IRT scaling tends to produce trait estimates that are linearly related to the underlying trait being measured. Therefore, IRT trait estimates can be more useful…

Descriptors: Educational Research, Monte Carlo Methods, Measures (Individuals), Item Response Theory

Coefficient Alpha and Reliability of Scale Scores

Peer reviewed

Almehrizi, Rashid S. – Applied Psychological Measurement, 2013

The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha ([alpha]; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…

Descriptors: Raw Scores, Scaling, Reliability, Computation

Quantifying Response Dependence between Two Dichotomous Items Using the Rasch Model

Peer reviewed

Andrich, David; Kreiner, Svend – Applied Psychological Measurement, 2010

Descriptors: Test Theory, Item Response Theory, Test Items, Correlation

When Can Subscores Have Value? Research Report. ETS RR-05-08

Peer reviewed

Haberman, Shelby J. – ETS Research Report Series, 2005

In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean-squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…

Descriptors: Scores, Test Items, Error of Measurement, Computation
