ERIC - Search Results

Showing 1 to 15 of 22 results

Modeling Directional Testlet Effects on Multiple Open-Ended Questions

Peer reviewed

Jin, Kuan-Yu; Siu, Wai-Lok – Journal of Educational Measurement, 2025

Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies caused between items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…

Descriptors: Models, Test Items, Educational Assessment, Scores

Comparing and Combining IRTree Models and Anchoring Vignettes in Addressing Response Styles

Peer reviewed

Xue, Mingfeng; Chen, Ping – Journal of Educational Measurement, 2025

Response styles pose great threats to psychological measurements. This research compares IRTree models and anchoring vignettes in addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and total-score level (ratios of extreme and middle responses to vignettes). Four models…

Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes

Item Response Theory Models for Performance Decline during Testing

Peer reviewed

Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2014

Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…

Descriptors: Student Evaluation, Item Response Theory, Models, Simulation

Relating Unidimensional IRT Parameters to a Multidimensional Response Space: A Review of Two Alternative Projection IRT Models for Scoring Subscales

Peer reviewed

Kahraman, Nilufer; Thompson, Tony – Journal of Educational Measurement, 2011

A practical concern for many existing tests is that subscore test lengths are too short to provide reliable and meaningful measurement. A possible method of improving the subscale reliability and validity would be to make use of collateral information provided by items from other subscales of the same test. To this end, the purpose of this article…

Descriptors: Test Length, Test Items, Alignment (Education), Models

Stability of Individual Differences in Multiwave Panel Studies: Comparison of Simplex Models and One-Factor Models

Peer reviewed

Marsh, Herbert W. – Journal of Educational Measurement, 1993

Structural equation models of the same construct measured on different occasions are evaluated in two studies: one using evaluations of 157 college instructors over 8 years, and one using data for over 2,200 high school students over 4 years from the Youth in Transition Study. Results challenge overreliance on simplex models. (SLD)

Descriptors: College Faculty, Comparative Analysis, High School Students, High Schools

An Index of Dependability for Mastery Tests

Peer reviewed

Brennan, Robert L.; Kane, Michael T. – Journal of Educational Measurement, 1977

An index for the dependability of mastery tests is described. Assumptions necessary for the index and the mathematical development of the index are provided. (Author/JKS)

Descriptors: Criterion Referenced Tests, Mastery Tests, Mathematical Models, Test Reliability

Test Length and the Standard Error of Measurement

Peer reviewed

Gardner, P. L. – Journal of Educational Measurement, 1970

Descriptors: Error of Measurement, Mathematical Models, Statistical Analysis, Test Reliability

On Consistency of Decisions in Criterion-Referenced Testing

Peer reviewed

Huynh, Huynh – Journal of Educational Measurement, 1976

Within the beta-binomial Bayesian framework, procedures are described for the evaluation of the kappa index of reliability on the basis of one administration of a domain-referenced test. Major factors affecting this index include cutoff score, test score variability and test length. Empirical data which substantiate some theoretical trends deduced…

Descriptors: Criterion Referenced Tests, Decision Making, Mathematical Models, Probability

Estimating Reliability from a Single Administration of a Mastery Test

Peer reviewed

Subkoviak, Michael J. – Journal of Educational Measurement, 1976

A number of different reliability coefficients have recently been proposed for tests used to differentiate between groups such as masters and nonmasters. One promising index is the proportion of students in a class that are consistently assigned to the same mastery group across two testings. The present paper proposes a single test administration…

Descriptors: Criterion Referenced Tests, Mastery Tests, Mathematical Models, Probability

Latent Trait Models and Their Use in the Analysis of Educational Test Data

Peer reviewed

Hambleton, Ronald K.; Cook, Linda L. – Journal of Educational Measurement, 1977

This article presents a non-mathematical introduction to latent trait test models and some of their features. Latent trait models are compared to classical test models. Two promising applications of latent trait models and available computer programs are discussed. (Author/JKS)

Descriptors: Computer Programs, Latent Trait Theory, Measurement, Models

Solving Measurement Problems with the Rasch Model

Peer reviewed

Wright, Benjamin D. – Journal of Educational Measurement, 1977

This article explains the Rasch model for sample-free item analysis and test-free person measurement. It shows how to estimate model parameters from data and how to evaluate the statistical fit of these estimates to the data. Attention is paid to practical considerations. (Author/JKS)

Descriptors: Computer Programs, Latent Trait Theory, Measurement, Models

A Note on Huynh's Normal Approximation Procedure for Estimating Criterion-Referenced Reliability

Peer reviewed

Peng, Chao-Ying J.; Subkoviak, Michael J. – Journal of Educational Measurement, 1980

Huynh (1976) suggested a method of approximating the reliability coefficient of a mastery test. The present study examines the accuracy of Huynh's approximation and also describes a computationally simpler approximation which appears to be generally more accurate than the former. (Author/RL)

Descriptors: Error of Measurement, Mastery Tests, Mathematical Models, Statistical Analysis

Practical Applications of Item Characteristic Curve Theory

Peer reviewed

Lord, Frederic M. – Journal of Educational Measurement, 1977

A variety of practical applications of item characteristic curve test theory are discussed. Among these applications are tailored testing, two-stage testing, determining whether two tests measure the same latent trait, and measuring item bias toward minority or other groups. (Author/JKS)

Descriptors: Computer Programs, Latent Trait Theory, Mastery Tests, Measurement

Models, Meanings and Misunderstandings: Some Issues in Applying Rasch's Theory

Peer reviewed

Whitely, Susan E. – Journal of Educational Measurement, 1977

A debate concerning specific issues and the general usefulness of the Rasch latent trait test model is continued. Methods of estimation, necessary sample size, and the applicability of the model are discussed. (JKS)

Descriptors: Error of Measurement, Item Analysis, Mathematical Models, Measurement

Efficiency of Linear Equating as a Function of the Length of the Anchor Test

Peer reviewed

Budescu, David – Journal of Educational Measurement, 1985

An important determinant of equating process efficiency is the correlation between the anchor test and components of each form. Use of some monotonic function of this correlation as a measure of equating efficiency is suggested. A model relating anchor test length and test reliability to this measure of efficiency is presented. (Author/DWH)

Descriptors: Correlation, Equated Scores, Mathematical Models, Standardized Tests
