ERIC - Search Results

Showing all 9 results

Does Maximizing Information at the Cut Score Always Maximize Classification Accuracy and Consistency?

Peer reviewed

Wyse, Adam E.; Babcock, Ben – Journal of Educational Measurement, 2016

A common suggestion made in the psychometric literature for fixed-length classification tests is that one should design tests so that they have maximum information at the cut score. Designing tests in this way is believed to maximize the classification accuracy and consistency of the assessment. This article uses simulated examples to illustrate…

Descriptors: Cutting Scores, Psychometrics, Test Construction, Classification

Enhancing the Interpretability of the Overall Results of an International Test of English-Language Proficiency

Peer reviewed

Papageorgiou, Spiros; Morgan, Rick; Becker, Valerie – International Journal of Testing, 2015

The purpose of this study was to enhance the meaning of the scores of an English-language test by developing performance levels and descriptors for reporting overall test performance. The levels and descriptors were intended to accompany the total scale scores of TOEFL Junior® Standard, an international test of English as a second/foreign…

Descriptors: Language Proficiency, Language Tests, English (Second Language), Second Language Learning

The Potential Impact of Not Being Able to Create Parallel Tests on Expected Classification Accuracy

Peer reviewed

Wyse, Adam E. – Applied Psychological Measurement, 2011

In many practical testing situations, alternate test forms from the same testing program are not strictly parallel to each other and instead the test forms exhibit small psychometric differences. This article investigates the potential practical impact that these small psychometric differences can have on expected classification accuracy. Ten…

Descriptors: Test Format, Test Construction, Testing Programs, Psychometrics

Polytomous Adaptive Classification Testing: Effects of Item Pool Size, Test Termination Criterion, and Number of Cutscores

Peer reviewed

Gnambs, Timo; Batinic, Bernad – Educational and Psychological Measurement, 2011

Computer-adaptive classification tests focus on classifying respondents in different proficiency groups (e.g., for pass/fail decisions). To date, adaptive classification testing has been dominated by research on dichotomous response formats and classifications in two groups. This article extends this line of research to polytomous classification…

Descriptors: Test Length, Computer Assisted Testing, Classification, Test Items

Evidence-Centered Assessment Design as a Foundation for Achievement-Level Descriptor Development and for Standard Setting

Peer reviewed

Plake, Barbara S.; Huff, Kristen; Reshetar, Rosemary – Applied Measurement in Education, 2010

In many large-scale assessment programs, achievement level descriptors (ALDs) provide a critical role in communicating what scores on the assessment mean and in interpreting what examinees know and are able to do based on their test performance. Based on their test performance, examinees are often classified into performance categories. The…

Descriptors: Evidence, Test Construction, Measurement, Standard Setting

Achievement Testing in the No Child Left Behind Era: The Arkansas Benchmark

Peer reviewed

Hall, John D.; Howerton, D. Lynn; Jones, Craig H. – Research in the Schools, 2008

The No Child Left Behind Act and the accountability movement in public education caused many states to develop criterion-referenced academic achievement tests. Scores from these tests are often used to make high stakes decisions. Even so, these tests typically do not receive independent psychometric scrutiny. We evaluated the 2005 Arkansas…

Descriptors: Criterion Referenced Tests, Achievement Tests, High Stakes Tests, Public Education

Pass-Fail Reliability for Tests with Cut Scores: A Simplified Method

Breyer, F. Jay; Lewis, Charles – 1994

A single-administration classification reliability index is described that estimates the probability of consistently classifying examinees to mastery or nonmastery states as if those examinees had been tested with two alternate forms. The procedure is applicable to any test used for classification purposes, subdividing that test into two…

Descriptors: Classification, Cutting Scores, Objective Tests, Pass Fail Grading

Differential Item Functioning for a Test with a Cutoff Score: Use of Limited Closed-Interval Measures

Peer reviewed

Oshima, T. C.; And Others – Applied Measurement in Education, 1994

A procedure to detect differential item functioning (DIF) is introduced that is suitable for tests with a cutoff score. DIF is assessed on a limited closed interval of thetas in which a cutoff score falls. How this approach affects the identification of DIF items is demonstrated with real data sets. (SLD)

Descriptors: Ability, Classification, Cutting Scores, Identification

An Adaptive Testing Simulation for a Certifying Examination

Reshetar, Rosemary A.; And Others – 1992

This study examined performance of a simulated computerized adaptive test that was designed to help direct the development of a medical recertification examination. The item pool consisted of 229 single-best-answer items from a random sample of 3,000 examinees, calibrated using the two-parameter logistic model. Examinees' responses were known. For…

Descriptors: Adaptive Testing, Classification, Computer Assisted Testing, Computer Simulation