ERIC - Search Results

Showing all 10 results

Commingled Samples: A Neglected Source of Bias in Reliability Analysis

Peer reviewed

Waller, Niels G. – Applied Psychological Measurement, 2008

Reliability is a property of test scores from individuals who have been sampled from a well-defined population. Reliability indices, such as coefficient alpha and related formulas for internal consistency reliability (KR-20, Hoyt's reliability), yield lower bound reliability estimates when (a) subjects have been sampled from a single population and when…

Descriptors: Test Items, Reliability, Scores, Psychometrics

On the Efficiency of IRT Models When Applied to Different Sampling Designs. Project Psychometric Aspects of Item Banking No. 45.

Berger, Martijn P. F. – 1989

The problem of obtaining designs that result in the most precise parameter estimates is encountered in at least two situations where item response theory (IRT) models are used. In so-called two-stage testing procedures, certain designs that match difficulty levels of the test items with the ability of the examinees may be located. Such designs…

Descriptors: Difficulty Level, Efficiency, Equations (Mathematics), Heuristics

Violating Conventional Wisdom in Multiple Choice Test Construction

Peer reviewed

Taylor, Annette Kujawski – College Student Journal, 2005

This research examined 2 elements of multiple-choice test construction, balancing the key and optimal number of options. In Experiment 1 the 3 conditions included a balanced key, overrepresentation of a and b responses, and overrepresentation of c and d responses. The results showed that error-patterns were independent of the key, reflecting…

Descriptors: Comparative Analysis, Test Items, Multiple Choice Tests, Test Construction

Conceptualization of Issues in Construct and Content Validity. Studies in Measurement and Methodology, Work Unit No. 1: Conceptual and Design Problems in Competency-Based Measurements.

Linn, Robert – 1978

A series of studies on conceptual and design problems in competency-based measurements are explained. The concept of validity within the context of criterion-referenced measurement is reviewed. The authors believe validation should be viewed as a process rather than an end product. It is the process of marshalling evidence to support…

Descriptors: Criterion Referenced Tests, Item Analysis, Item Sampling, Test Bias

Ordinal Test Fidelity Estimated by an Item Sampling Model.

Peer reviewed

Cliff, Norman; Donoghue, John R. – Psychometrika, 1992

A test theory using only ordinal assumptions is presented, based on the idea that the test items are a sample from a universe of items. The sum across items of the ordinal relations for a pair of persons on the universe items is analogous to a true score. (SLD)

Descriptors: Equations (Mathematics), Estimation (Mathematics), Item Response Theory, Item Sampling

Items in Context: Assessing the Dimensionality of Raven's Advanced Progressive Matrices

Peer reviewed

Vigneau, Francois; Bors, Douglas A. – Educational and Psychological Measurement, 2005

The problem of dimensionality with respect to Raven's Advanced Progressive Matrices (APM) specifically and, more generally, "g" or fluid intelligence, has been a long-standing issue. The present article reports two studies examining the dimensionality of both the original Set II of the APM (n = 506) and a short form (n = 644), using principal…

Descriptors: Context Effect, Item Response Theory, Intelligence Tests, Test Items

Introduction to Rasch Measurement: Some Implications for Languages.

Theunissen, Phiel J. J. M. – 1983

Any systematic approach to the assessment of students' ability implies the use of a model. The more explicit the model is, the more its users know about what they are doing and what the consequences are. The Rasch model is a strong model where measurement is a bonus of the model itself. It is based on four ideas: (1) separation of observable…

Descriptors: Ability Grouping, Difficulty Level, Evaluation Criteria, Item Sampling

Preparing Tests of Reading Comprehension.

Mason, Victor W. – 1986

Reading skills are crucial to students learning and using English as a second language for academic purposes. Teachers can construct valid reading tests if they approach the task with care and focus on the test's ability to measure construct rather than face validity. In reading tests, the crucial elements of test design affecting validity are (1)…

Descriptors: Communicative Competence (Languages), English for Academic Purposes, English (Second Language), Higher Education

Some Issues in the Testing of Vocabulary Knowledge.

Read, John; Nation, Paul – 1986

A review of the literature on a variety of issues related to testing vocabulary knowledge in a second language addresses these topics: problems in estimating vocabulary size, including the related questions of what constitutes a word, how a sample should be selected, and what are the criteria for knowing a word; sampling the basic and specialized…

Descriptors: Achievement Tests, Check Lists, Classification, Comparative Analysis

Testing Foreign Language Listening Comprehension.

de Jong, John H. A. L. – 1982

The development and validation of a test of listening comprehension for English as a second language at the Dutch National Institute for Educational Measurement (Cito) is described. The test uses two distinct item formats: true-false items and modified cloze items with two options. Both item formats were found to measure foreign language listening…

Descriptors: Cloze Procedure, English (Second Language), Evaluation Criteria, Foreign Countries
