Showing 1 to 15 of 134 results
Peer reviewed
Walker, Cindy M.; Göçer Sahin, Sakine – Educational and Psychological Measurement, 2020
The purpose of this study was to investigate a new way of evaluating interrater reliability that can allow one to determine if two raters differ with respect to their rating on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and compared…
Descriptors: Test Bias, Interrater Reliability, Responses, Correlation
Peer reviewed
Zumbo, Bruno D.; Kroc, Edward – Educational and Psychological Measurement, 2019
Chalmers recently published a critique of the use of ordinal alpha, proposed in Zumbo et al., as a measure of test reliability in certain research settings. In this response, we take up the task of refuting Chalmers' critique. We identify three broad misconceptions that characterize Chalmers' criticisms: (1) confusing assumptions with…
Descriptors: Test Reliability, Statistical Analysis, Misconceptions, Mathematical Models
Joshua B. Gilbert – Annenberg Institute for School Reform at Brown University, 2022
This simulation study examines the characteristics of the Explanatory Item Response Model (EIRM) when estimating treatment effects when compared to classical test theory (CTT) sum and mean scores and item response theory (IRT)-based theta scores. Results show that the EIRM and IRT theta scores provide generally equivalent bias and false positive…
Descriptors: Item Response Theory, Models, Test Theory, Computation
Peer reviewed
Raykov, Tenko; Dimitrov, Dimiter M.; Marcoulides, George A.; Harrison, Michael – Educational and Psychological Measurement, 2019
Building on prior research on the relationships between key concepts in item response theory and classical test theory, this note contributes to highlighting their important and useful links. A readily and widely applicable latent variable modeling procedure is discussed that can be used for point and interval estimation of the individual person…
Descriptors: True Scores, Item Response Theory, Test Items, Test Theory
Peer reviewed
Nicewander, W. Alan – Educational and Psychological Measurement, 2018
Spearman's correction for attenuation (measurement error) corrects a correlation coefficient for measurement errors in either or both of two variables, and follows from the assumptions of classical test theory. Spearman's equation removes all measurement error from a correlation coefficient, which translates into "increasing the reliability of…
Descriptors: Error of Measurement, Correlation, Sample Size, Computation
Peer reviewed
von Davier, Matthias – Quality Assurance in Education: An International Perspective, 2018
Purpose: Surveys that include skill measures may suffer from additional sources of error compared to those containing questionnaires alone. Examples are distractions such as noise or interruptions of testing sessions, as well as fatigue or lack of motivation to succeed. This paper aims to provide a review of statistical tools based on latent…
Descriptors: Statistical Analysis, Surveys, International Assessment, Error Patterns
Peer reviewed
Selvi, Hüseyin; Özdemir Alici, Devrim – International Journal of Assessment Tools in Education, 2018
This study aimed to investigate the impact of different missing data handling methods on the detection of Differential Item Functioning (the Mantel-Haenszel and Standardization methods based on Classical Test Theory, and the Likelihood Ratio Test method based on Item Response Theory). In this regard, on the data acquired from 1046…
Descriptors: Test Bias, Test Theory, Item Response Theory, Multiple Choice Tests
Peer reviewed
Andrich, David – Educational Measurement: Issues and Practice, 2016
Since Cronbach's (1951) elaboration of coefficient alpha following its introduction by Guttman (1945), this coefficient has become ubiquitous in characterizing assessment instruments in education, psychology, and other social sciences. Also ubiquitous are caveats on the calculation and interpretation of this coefficient. This article summarizes a recent contribution…
Descriptors: Computation, Correlation, Test Theory, Measures (Individuals)
Peer reviewed
Goldstein, Harvey – Assessment in Education: Principles, Policy & Practice, 2017
The author's commentary focuses more on the original article's quantitative discussion of educational assessment than on its idea of assessment for learning, which did not raise any substantial issues. He starts by offering some general comments on the paper. He feels the authors made a number of assumptions about quantitative…
Descriptors: Educational Assessment, Statistical Analysis, International Assessment, Learning Theories
Peer reviewed
Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017
The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…
Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing
Peer reviewed
Kogar, Hakan – International Journal of Assessment Tools in Education, 2018
The aim of this simulation study was to determine the relationship between true latent scores and estimated latent scores by including various control variables and different statistical models. The study also aimed to compare the statistical models and determine the effects of different distribution types, response formats, and sample sizes on latent…
Descriptors: Simulation, Context Effect, Computation, Statistical Analysis
Peer reviewed
Ayva Yörü, Fatma Gökçen; Atar, Hakan Yavuz – Journal of Pedagogical Research, 2019
The aim of this study is to examine whether the items in the mathematics subtest of the Centralized High School Entrance Placement Test [HSEPT] administered in 2012 by the Ministry of National Education in Turkey show DIF according to gender and type of school. For this purpose, SIBTEST, Breslow-Day, Lord's [chi-squared] and Raju's area…
Descriptors: Test Bias, Mathematics Tests, Test Items, Gender Differences
Peer reviewed
West, Stephen G.; Grimm, Kevin J. – Measurement: Interdisciplinary Research and Perspectives, 2014
These authors agree with Bainter and Bollen that causal effects represent a useful measurement structure in some applications. The science of the measurement problem should determine the model; the measurement model should not determine the science. They also applaud Bainter and Bollen's important reminder that the full…
Descriptors: Causal Models, Measurement, Test Theory, Statistical Analysis
Peer reviewed
Longabach, Tanya; Peyton, Vicki – Language Testing, 2018
K-12 English language proficiency tests that assess multiple content domains (e.g., listening, speaking, reading, writing) often have subsections based on these content domains; scores assigned to these subsections are commonly known as subscores. Testing programs face increasing customer demands for the reporting of subscores in addition to the…
Descriptors: Comparative Analysis, Test Reliability, Second Language Learning, Language Proficiency
Peer reviewed
Traxler, Adrienne; Henderson, Rachel; Stewart, John; Stewart, Gay; Papak, Alexis; Lindell, Rebecca – Physical Review Physics Education Research, 2018
Research on the test structure of the Force Concept Inventory (FCI) has largely ignored gender, and research on FCI gender effects (often reported as "gender gaps") has seldom interrogated the structure of the test. These rarely crossed streams of research leave open the possibility that the FCI may not be structurally valid across…
Descriptors: Physics, Science Instruction, Sex Fairness, Gender Differences