Showing 1 to 15 of 21 results
Peer reviewed
Augustin Mutak; Robert Krause; Esther Ulitzsch; Sören Much; Jochen Ranger; Steffi Pohl – Journal of Educational Measurement, 2024
Understanding the intraindividual relation between an individual's speed and ability in testing scenarios is essential to assure a fair assessment. Different approaches exist for estimating this relationship, each relying either on specific study designs or on specific assumptions. This paper aims to add to the toolbox of approaches for estimating…
Descriptors: Testing, Academic Ability, Time on Task, Correlation
Peer reviewed
Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024
This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…
Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction
Peer reviewed
Dwyer, Andrew C. – Journal of Educational Measurement, 2016
This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard setting ratings to account for…
Descriptors: Cutting Scores, Equivalency Tests, Test Format, Academic Standards
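The "rescaling the standard" option in the Dwyer abstract carries an old form's cut score onto a new form by equating. As a hedged sketch only (the snippet does not show the article's exact procedure), the Python below applies a mean-sigma linear equating transformation estimated from common-item scores; the function name and the toy data are invented for illustration.

# Hypothetical sketch: rescale a cut score onto a new form using a
# mean-sigma linear equating transformation from the common items.
import statistics

def mean_sigma_transform(common_old, common_new):
    """Return f(x) mapping old-form scores onto the new form's scale."""
    slope = statistics.stdev(common_new) / statistics.stdev(common_old)
    intercept = statistics.mean(common_new) - slope * statistics.mean(common_old)
    return lambda x: slope * x + intercept

old_anchor = [4, 5, 6, 7, 8, 9]       # common-item scores, old form (toy data)
new_anchor = [5, 5, 7, 8, 9, 10]      # common-item scores, new form (toy data)
rescale = mean_sigma_transform(old_anchor, new_anchor)
print(rescale(6.5))                   # old cut score expressed on the new scale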
Peer reviewed
Gardner, P. L. – Journal of Educational Measurement, 1970
Descriptors: Error of Measurement, Mathematical Models, Statistical Analysis, Test Reliability
Peer reviewed
Subkoviak, Michael J.; Levin, Joel R. – Journal of Educational Measurement, 1977
Measurement error in dependent variables reduces the power of statistical tests to detect mean differences of specified magnitude. Procedures for determining power and sample size that consider the reliability of the dependent variable are discussed and illustrated. Methods for estimating reliability coefficients used in these procedures are…
Descriptors: Error of Measurement, Hypothesis Testing, Power (Statistics), Sampling
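The mechanism behind the Subkoviak and Levin abstract is the classical attenuation result: unreliability in the dependent variable shrinks a standardized mean difference by the square root of the reliability, so detecting a given true-score effect requires roughly 1/r_yy times the sample. A minimal sketch, assuming a two-group normal approximation (the article's own procedures are not shown in this snippet):

# Sketch: sample size per group for a two-sample comparison when the
# observed effect is the true effect attenuated by sqrt(reliability).
from scipy.stats import norm

def n_per_group(d_true, reliability, alpha=0.05, power=0.80):
    d_obs = d_true * reliability ** 0.5        # attenuated observed effect size
    z_a = norm.ppf(1 - alpha / 2)              # two-sided critical value
    z_b = norm.ppf(power)
    return 2 * ((z_a + z_b) / d_obs) ** 2      # normal-approximation n per group

# d_true = 0.5 needs ~63 per group at r_yy = 1.0, but ~90 at r_yy = 0.70
print(round(n_per_group(0.5, 1.0)), round(n_per_group(0.5, 0.7)))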
Peer reviewed
Peng, Chao-Ying J.; Subkoviak, Michael J. – Journal of Educational Measurement, 1980
Huynh (1976) suggested a method of approximating the reliability coefficient of a mastery test. The present study examines the accuracy of Huynh's approximation and also describes a computationally simpler approximation which appears to be generally more accurate than the former. (Author/RL)
Descriptors: Error of Measurement, Mastery Tests, Mathematical Models, Statistical Analysis
Peer reviewed
Kupermintz, Haggai – Journal of Educational Measurement, 2004
A decision-theoretic approach to the question of reliability in categorically scored examinations is explored. The concepts of true scores and errors are discussed as they deviate from conventional psychometric definitions and measurement error in categorical scores is cast in terms of misclassifications. A reliability measure based on…
Descriptors: Test Reliability, Error of Measurement, Psychometrics, Test Theory
Peer reviewed
Whitely, Susan E.; Dawis, Rene V. – Journal of Educational Measurement, 1974
Descriptors: Error of Measurement, Item Analysis, Matrices, Measurement Techniques
Peer reviewed
Kane, Michael T. – Journal of Educational Measurement, 1986
These analyses suggest that if a criterion-referenced test had a reliability (defined in terms of internal consistency) below 0.5, a simple a priori procedure would provide better estimates of students' universe scores than would individual observed scores. (Author/LMO)
Descriptors: Criterion Referenced Tests, Educational Research, Error of Measurement, Generalizability Theory
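Kane's 0.5 threshold drops out of a mean-squared-error comparison. Writing \rho for the reliability, and reading the "simple a priori procedure" as estimating every examinee by the group mean (a plausible interpretation; the snippet does not spell it out):

\mathrm{MSE}(X_p) = \sigma_E^2 = (1-\rho)\,\sigma_X^2,
\qquad
\mathrm{MSE}(\bar{X}) = \sigma_\tau^2 = \rho\,\sigma_X^2,

so the group mean beats the individual observed score exactly when \rho < 0.5. The Kelley-type regressed estimate \hat{\tau}_p = \rho X_p + (1-\rho)\bar{X} has smaller MSE than either.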
Peer reviewed
Winne, Philip H.; Belfry, M. Joan – Journal of Educational Measurement, 1982
This review of issues about correcting for attenuation concludes that the basic difficulty lies in being able to identify and equate sources of variance in estimates of validity and reliability. Recommendations are proposed for cautious use of correction for attenuation. (Author/CM)
Descriptors: Correlation, Error of Measurement, Research Methodology, Statistical Analysis
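The correction under review is Spearman's classical formula, which estimates the correlation between true scores from the observed correlation and the two reliabilities:

r_{T_X T_Y} = \frac{r_{xy}}{\sqrt{r_{xx}\,r_{yy}}}.

The difficulty the review names is that the correction only behaves if r_xx and r_yy reflect the same error sources that attenuate r_xy; reliability estimates that fold in the wrong variance components make the corrected value over- or undershoot.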
Peer reviewed
Whitely, Susan E. – Journal of Educational Measurement, 1977
A debate concerning specific issues and the general usefulness of the Rasch latent trait test model is continued. Methods of estimation, necessary sample size, and the applicability of the model are discussed. (JKS)
Descriptors: Error of Measurement, Item Analysis, Mathematical Models, Measurement
Peer reviewed
Huynh, Huynh; Saunders, Joseph C. – Journal of Educational Measurement, 1980
Single administration (beta-binomial) estimates for the raw agreement index p and the corrected-for-chance kappa index in mastery testing are compared with those based on two test administrations in terms of estimation bias and sampling variability. Bias is about 2.5 percent for p and 10 percent for kappa. (Author/RL)
Descriptors: Comparative Analysis, Error of Measurement, Mastery Tests, Mathematical Models
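For reference, the corrected-for-chance index in the Huynh and Saunders abstract is Cohen's kappa applied to mastery/nonmastery classifications across two administrations:

\kappa = \frac{p - p_c}{1 - p_c},

where p is the proportion of examinees classified consistently and p_c the agreement expected by chance. The single-administration estimates in the study obtain p and \kappa from a beta-binomial model fitted to one test form rather than from an actual retest.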
Peer reviewed
Wright, Benjamin D. – Journal of Educational Measurement, 1977
Statements made in a previous article of this journal concerning the Rasch latent trait test model are questioned. Methods of estimation, necessary sample sizes, several formulas, and the general usefulness of the Rasch model are discussed. (JKS)
Descriptors: Computers, Error of Measurement, Item Analysis, Mathematical Models
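For context on the model at issue in the Whitely-Wright exchange, the Rasch model gives the probability that person p answers item i correctly as

P(X_{pi} = 1 \mid \theta_p, b_i) = \frac{\exp(\theta_p - b_i)}{1 + \exp(\theta_p - b_i)},

with a single ability parameter \theta_p and item difficulty b_i. The debate turns on how \theta_p and b_i should be estimated and how large a sample those methods need.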
Peer reviewed
Norcini, John J. – Journal of Educational Measurement, 1987
Answer keys for physician and teacher licensing examinations were studied. The impact of variability on total errors of measurement was examined for answer keys constructed using the aggregate method. Results indicated that, in some cases, scorers contributed to a sizable reduction in measurement error. (Author/GDC)
Descriptors: Adults, Answer Keys, Error of Measurement, Evaluators
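The "aggregate method" in the Norcini abstract builds the operational key from several experts' keys. As a hypothetical sketch only (the snippet does not define the aggregation rule), one common reading gives an examinee credit on each item equal to the share of expert keys endorsing the chosen option; the function and data below are invented for illustration.

# Hypothetical aggregate scoring: per-item credit equals the fraction of
# expert keys that endorse the examinee's chosen option (toy data).
expert_keys = ["ABCDA", "ABCDB", "ABDDA"]   # one key string per expert

def aggregate_score(response, keys):
    """Total score = sum over items of the share of keys matching the response."""
    return sum(
        sum(key[i] == choice for key in keys) / len(keys)
        for i, choice in enumerate(response)
    )

print(aggregate_score("ABCDA", expert_keys))   # -> 4.33 (out of 5 items)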
Peer reviewed
Shavelson, Richard J.; And Others – Journal of Educational Measurement, 1972
In this comment a recent attempt by Samuel A. Livingston to develop a theory of reliability for criterion-referenced measures is critiqued. For Livingston's rejoinder see TM 500 560. (Authors/MB)
Descriptors: Criterion Referenced Tests, Error of Measurement, Measurement Techniques, Response Style (Tests)