ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	2
Since 2006 (last 20 years)	6

Descriptor

Evaluation	6
Psychometrics	6
Test Construction	3
Test Items	3
Scores	2
Test Results	2
Automation	1
Best Practices	1
Common Core State Standards	1
Comparative Analysis	1
Computer Assisted Testing	1
Concept Mapping	1
Criticism	1
Disclosure	1
English (Second Language)	1
Equated Scores	1
Error of Measurement	1
Focus Groups	1
Group Behavior	1
Health Sciences	1
Information Dissemination	1
Language Proficiency	1
Mathematics	1
Measurement	1
Medical Education	1

Source

Educational Measurement:…

Author

Choe, Edison M.	1
Choi, Jaehwa	1
Dorans, Neil J.	1
Fu, Yanyan	1
Gierl, Mark J.	1
Hambleton, Ronald K.	1
Kolen, Michael J.	1
Lai, Hollis	1
Lee, Won-Chan	1
Liang, Longjuan	1
Lim, Hwanggyu	1
Schulz, E. Matthew	1
Sinharay, Sandip	1
Zenisky, April L.	1

Publication Type

Journal Articles	6
Reports - Descriptive	3
Reports - Research	2
Opinion Papers	1

Location

United States

Showing all 6 results

An Evaluation of Automatic Item Generation: A Case Study of Weak Theory Approach

Peer reviewed

Fu, Yanyan; Choe, Edison M.; Lim, Hwanggyu; Choi, Jaehwa – Educational Measurement: Issues and Practice, 2022

This case study applied the "weak theory" of Automatic Item Generation (AIG) to generate isomorphic item instances (i.e., unique but psychometrically equivalent items) for a large-scale assessment. Three representative instances were selected from each item template (i.e., model) and pilot-tested. In addition, a new analytical framework,…

Descriptors: Test Items, Measurement, Psychometrics, Test Construction

A Process for Reviewing and Evaluating Generated Test Items

Peer reviewed

Gierl, Mark J.; Lai, Hollis – Educational Measurement: Issues and Practice, 2016

Testing organizations need large numbers of high-quality items due to the proliferation of alternative test administration methods and modern test designs. But the current demand for items far exceeds the supply. Test items, as they are currently written, evoke a process that is both time-consuming and expensive because each item is written,…

Descriptors: Test Items, Test Construction, Psychometrics, Models

Developing Test Score Reports that Work: The Process and Best Practices for Effective Communication

Peer reviewed

Zenisky, April L.; Hambleton, Ronald K. – Educational Measurement: Issues and Practice, 2012

Test scores matter these days. Test-takers want to understand how they performed, and test score reports, particularly those for individual examinees, are the vehicles by which most people get the bulk of this information. Historically, score reports have not always met the examinees' information or usability needs, but this is clearly changing…

Descriptors: Scores, Psychometrics, Test Results, Usability

First Language of Test Takers and Fairness Assessment Procedures

Peer reviewed

Sinharay, Sandip; Dorans, Neil J.; Liang, Longjuan – Educational Measurement: Issues and Practice, 2011

Over the past few decades, those who take tests in the United States have exhibited increasing diversity with respect to native language. Standard psychometric procedures for ensuring item and test fairness that have existed for some time were developed when test-taking groups were predominantly native English speakers. A better understanding of…

Descriptors: Test Bias, Testing Programs, Psychometrics, Language Proficiency

Psychometric Properties of Raw and Scale Scores on Mixed-Format Tests

Peer reviewed

Kolen, Michael J.; Lee, Won-Chan – Educational Measurement: Issues and Practice, 2011

This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…

Descriptors: Test Use, Test Format, Error of Measurement, Raw Scores

Commentary: A Response to Reckase's Conceptual Framework and Examples for Evaluating Standard Setting Methods

Peer reviewed

Schulz, E. Matthew – Educational Measurement: Issues and Practice, 2006

A look at real data shows that Reckase's psychometric theory for standard setting is not applicable to bookmark and that his simulations cannot explain actual differences between methods. It is suggested that exclusively test-centered, criterion-referenced approaches are too idealized and that a psychophysics paradigm and a theory of group…

Descriptors: Psychometrics, Group Behavior, Standard Setting, Simulation