Showing all 13 results
Salmani Nodoushan, Mohammad Ali – Online Submission, 2021
This paper follows a line of logical argumentation to claim that what Samuel Messick conceptualized about construct validation has probably been misunderstood by some educational policy makers, practicing educators, and classroom teachers. It argues that, while Messick's unified theory of test validation aimed at (a) warning educational…
Descriptors: Construct Validity, Test Theory, Test Use, Affordances
Peer reviewed
Kettler, Ryan J. – Review of Research in Education, 2015
This chapter introduces theory that undergirds the role of testing adaptations in assessment, provides examples of item modifications and testing accommodations, reviews research relevant to each, and introduces a new paradigm that incorporates opportunity to learn (OTL), academic enablers, testing adaptations, and inferences that can be made from…
Descriptors: Meta Analysis, Literature Reviews, Testing, Testing Accommodations
Peer reviewed
van der Linden, Wim J.; Breithaupt, Krista; Chuah, Siang Chee; Zhang, Yanwei – Journal of Educational Measurement, 2007
A potential undesirable effect of multistage testing is differential speededness, which happens if some of the test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed…
Descriptors: Adaptive Testing, Evaluation Methods, Test Items, Reaction Time
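The response-time model referenced in this entry is, in van der Linden's related work, a lognormal model: the log of an examinee's response time on an item is normally distributed with mean equal to the item's time intensity minus the examinee's speed. A minimal sketch of that idea (the function names and the fixed-precision simplification are illustrative, not the authors' code):

```python
import math

def expected_log_time(beta_item, tau_person):
    """Expected log response time under a lognormal response-time model:
    item time intensity (beta) minus examinee speed (tau)."""
    return beta_item - tau_person

def prob_time_exceeds(t, beta_item, tau_person, alpha=1.0):
    """P(T > t) when ln T ~ Normal(beta - tau, 1/alpha^2).

    alpha is the item's discrimination for time; alpha=1 is a
    simplifying assumption for this sketch.
    """
    mu = beta_item - tau_person
    z = alpha * (math.log(t) - mu)
    # Survival function of the standard normal via erfc
    return 0.5 * math.erfc(z / math.sqrt(2))
```

Differential speededness then shows up as subtests whose summed time intensities differ: for a fixed time limit, an examinee routed to higher-intensity items has a larger probability of running out of time.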
Yen, Wendy M. – 1982
Test scores that are not perfectly reliable cannot be strictly equated unless they are strictly parallel. This fact implies that tau equivalence can be lost if an equipercentile equating is applied to observed scores that are not strictly parallel. Thirty-six data sets are simulated to study the equating of tests with different difficulties…
Descriptors: Difficulty Level, Equated Scores, Latent Trait Theory, Methods
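Equipercentile equating, the method examined in this simulation study, maps a score on one form to the score on the other form that has the same percentile rank. A rough sketch over empirical score distributions (the interpolation choice is an assumption; operational implementations typically smooth the distributions first):

```python
import numpy as np

def equipercentile_equate(scores_x, scores_y, x):
    """Map score x on form X to the form-Y score with the same
    percentile rank, using the empirical distributions."""
    scores_x = np.sort(np.asarray(scores_x))
    scores_y = np.sort(np.asarray(scores_y))
    # Percentile rank of x on form X: proportion scoring at or below x
    p = np.searchsorted(scores_x, x, side="right") / len(scores_x)
    # Form-Y score at that same percentile (linear interpolation)
    return float(np.quantile(scores_y, p))
```

With identical score distributions on the two forms, the equated score is (up to interpolation) the original score; differences in form difficulty shift the mapping up or down.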
Norris, Stephen P. – 1988
The problems of validity and fairness involved in multiple-choice critical thinking tests can be lessened by using verbal reports of examinees' thinking during the process of developing such tests in order to retain only those items which rely on critical thinking skills to obtain the correct answer. Multiple-choice testing can lead to unfair…
Descriptors: Critical Thinking, High School Students, High Schools, Multiple Choice Tests
Wilcox, Rand R. – 1978
Two fundamental problems in mental test theory are to estimate true score and to estimate the amount of error when testing an examinee. In this report, three probability models which characterize a single test item in terms of a population of examinees are described. How these models may be modified to characterize a single examinee in terms of an…
Descriptors: Achievement Tests, Comparative Analysis, Error of Measurement, Mathematical Models
van der Linden, Wim J. – Evaluation in Education: International Progress, 1982
For mastery testing, a linear relationship between the optimal passing score and test length is derived under a new optimization criterion. The usual indifference zone approach, a binomial error model, decision errors, and corrections for guessing are discussed. Related results in sequential testing and the latent class approach are included. (CM)
Descriptors: Cutting Scores, Educational Testing, Mastery Tests, Mathematical Models
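The binomial error model mentioned in this entry treats the observed score on an n-item test as binomially distributed given the examinee's true proportion-correct, which makes the two decision errors at a given passing score straightforward to compute. A hedged sketch (the test length, cut score, and true-proportion values are illustrative):

```python
from math import comb

def binom_pmf(n, k, p):
    """Binomial probability of exactly k correct out of n items."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def passing_prob(n, cut, pi):
    """Probability of reaching the cut score on an n-item test
    when the examinee's true proportion-correct is pi."""
    return sum(binom_pmf(n, k, pi) for k in range(cut, n + 1))

# Decision errors for a 10-item test with cut score 7:
false_negative = 1 - passing_prob(10, 7, 0.85)  # a true master fails
false_positive = passing_prob(10, 7, 0.60)      # a true nonmaster passes
```

Raising the cut score trades false positives for false negatives; the optimization criterion in the entry above concerns how the optimal cut moves as test length grows.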
Zin, Than Than; Williams, John – 1991
Brief explanations are presented of some of the different methods used to score multiple-choice tests, and some studies of partial information, guessing strategies, and test-taking behaviors are reviewed. Studies are grouped in three categories of effort to improve scoring: (1) those that require extra effort from the examinee to answer…
Descriptors: Educational Research, Estimation (Mathematics), Guessing (Tests), Literature Reviews
Peer reviewed
Linn, Robert L. – Educational Measurement: Issues and Practice, 1982
Confusion in the terminology of criterion-referenced measurement (in test specification and development, in standard setting, and in the attendant role of cut-off scores) is shown to need practical clarification through psychometric research on test applications and consequences. (CM)
Descriptors: Academic Standards, Criterion Referenced Tests, Cutting Scores, Measurement Objectives
Peer reviewed
Wiliam, Dylan – Review of Research in Education, 2010
The idea that validity should be considered a property of inferences, rather than of assessments, has developed slowly over the past century. In early writings about the validity of educational assessments, validity was defined as a property of an assessment. The most common definition was that an assessment was valid to the extent that it…
Descriptors: Educational Assessment, Validity, Inferences, Construct Validity
Peer reviewed
Glaser, Robert – Educational Measurement: Issues and Practice, 1994
Some unfinished issues relating to achievement test theory that seemed implicit in the basic idea of criterion-referenced testing are reviewed, recognizing their importance in current studies of authentic assessment and performance-based tests. The future of performance-based evaluation is explored. (SLD)
Descriptors: Academic Achievement, Achievement Tests, Criterion Referenced Tests, Educational History
Peer reviewed
Altepeter, Tom – School Psychology Review, 1983
A critical review of the Expressive One-Word Picture Vocabulary Test (Gardner) is offered. The reviewer feels that the instrument cannot be recommended in its present form. Further research concerning the manual and theoretical issues (particularly test-retest stability) is strongly recommended. (Author/PN)
Descriptors: Error of Measurement, Intelligence Tests, Item Analysis, Pictorial Stimuli
Theunissen, Phiel J. J. M. – 1983
Any systematic approach to the assessment of students' ability implies the use of a model. The more explicit the model is, the more its users know about what they are doing and what the consequences are. The Rasch model is a strong model where measurement is a bonus of the model itself. It is based on four ideas: (1) separation of observable…
Descriptors: Ability Grouping, Difficulty Level, Evaluation Criteria, Item Sampling
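Under the Rasch model described in this entry, the probability of a correct response depends only on the difference between the person's ability and the item's difficulty, through a logistic function. A minimal sketch (variable names are illustrative):

```python
import math

def rasch_probability(theta, b):
    """Probability that a person with ability theta answers an item
    of difficulty b correctly under the Rasch model:
    P = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A person whose ability equals the item's difficulty has a 50% chance.
p_even = rasch_probability(theta=1.0, b=1.0)  # 0.5
```

The "separation" the abstract begins to list is what makes this model strong: ability and difficulty enter only through their difference, so person and item parameters can be estimated separately.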