ERIC - Search Results

Showing all 8 results

Lower Bounds for the Reliability of the Total Score on a Test Composed of Nonhomogeneous Items: I. Algebraic Lower Bounds

Peer reviewed

Jackson, Paul H.; Agunwamba, Christian C. – Psychometrika, 1977

Finding and interpreting lower bounds for the reliability coefficients of tests with nonhomogeneous items has been a problem for psychometricians. This paper presents a mathematical formula for finding the greatest lower bound for such a coefficient. (Author/JKS)

Descriptors: Comparative Analysis, Mathematical Models, Measurement, Test Interpretation

Correcting for Unreliability in Nonlinear Models of Attitude Change

Peer reviewed

Hunter, John E.; Cohen, Stanley H. – Psychometrika, 1974

Descriptors: Attitude Change, Attitudes, Comparative Analysis, Models

Construct Validity of "e-rater"® in Scoring TOEFL® Essays. Research Report. ETS RR-07-21

Peer reviewed

Attali, Yigal – ETS Research Report Series, 2007

This study examined the construct validity of the "e-rater"® automated essay scoring engine as an alternative to human scoring in the context of TOEFL® essay writing. Analyses were based on a sample of students who repeated the TOEFL within a short time period. Two "e-rater" scores were investigated in this study, the first…

Descriptors: Construct Validity, Computer Assisted Testing, Scoring, English (Second Language)

The Internal and External Optimality of Decisions Based on Tests.

Peer reviewed

Mellenbergh, Gideon J.; van der Linden, Wim J. – Applied Psychological Measurement, 1979

For six tests, coefficient delta as an index for internal optimality is computed. Internal optimality is defined as the magnitude of risk of the decision procedure with respect to the true score. Results are compared with an alternative index (coefficient kappa) for assessing the consistency of decisions. (Author/JKS)

Descriptors: Classification, Comparative Analysis, Decision Making, Error of Measurement

An Empirical Investigation of Lu's Method of Reliability Estimation.

Peer reviewed

Huck, Schuyler W.; And Others – Educational and Psychological Measurement, 1981

Believing that examinee-by-item interaction should be conceptualized as true score variability rather than as a result of errors of measurement, Lu proposed a modification of Hoyt's analysis of variance reliability procedure. Via a computer simulation study, it is shown that Lu's approach does not separate interaction from error. (Author/RL)

Descriptors: Analysis of Variance, Comparative Analysis, Computer Programs, Difficulty Level
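
For orientation, Hoyt's analysis of variance reliability procedure mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the standard textbook form (reliability = 1 - MS_residual / MS_persons for a persons-by-items score matrix), not Lu's modification and not the simulation used in the study. The function name `hoyt_reliability` and the sample data are hypothetical.

```python
def hoyt_reliability(scores):
    """Hoyt's ANOVA reliability for a persons-by-items score matrix.

    `scores` is a list of rows, one row per examinee, one column per item.
    Illustrative sketch only; the name and the data layout are assumptions.
    """
    n_people = len(scores)
    n_items = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_people * n_items)

    person_means = [sum(row) / n_items for row in scores]
    item_means = [sum(row[j] for row in scores) / n_people
                  for j in range(n_items)]

    # Two-way ANOVA sums of squares with one observation per cell.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_persons = n_items * sum((m - grand) ** 2 for m in person_means)
    ss_items = n_people * sum((m - grand) ** 2 for m in item_means)
    ss_resid = ss_total - ss_persons - ss_items

    ms_persons = ss_persons / (n_people - 1)
    ms_resid = ss_resid / ((n_people - 1) * (n_items - 1))
    return 1 - ms_resid / ms_persons


# Hypothetical data: 4 examinees, 3 dichotomous items.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(hoyt_reliability(data))
```

With one observation per examinee-item cell, the residual term contains both the examinee-by-item interaction and measurement error, which is the confounding the abstract refers to.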

An Optimizing Weight For Wrong Scores.

Donlon, Thomas F. – 1975

This study empirically determined the optimizing weight to be applied to the Wrongs Total Score in scoring rubrics of the general form S = R - kW, where S is the Score, R the Rights Total, k the weight, and W the Wrongs Total, if reliability is to be maximized. As is well known, the traditional formula score rests on a theoretical framework which is…

Descriptors: Achievement Tests, Comparative Analysis, Guessing (Tests), Multiple Choice Tests
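
As a small illustration of the scoring form S = R - kW described in the abstract above, the sketch below computes a score for a given weight k. The helper `classical_k` shows the conventional correction-for-guessing weight 1/(A - 1) for A-option items, which is general background and an assumption here, not a result from the study. The reliability-maximizing weight found in the study is not reproduced. All function names are hypothetical.

```python
def formula_score(rights: int, wrongs: int, k: float) -> float:
    """Return S = R - k*W for a Rights Total R, Wrongs Total W, and weight k."""
    return rights - k * wrongs


def classical_k(n_options: int) -> float:
    """Conventional correction-for-guessing weight for items with
    `n_options` answer choices (background assumption, not from the study)."""
    return 1.0 / (n_options - 1)


# Hypothetical examinee: 40 right, 10 wrong, five-option items.
k = classical_k(5)
print(k, formula_score(40, 10, k))
```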

The KR-20 Reliability Coefficient as a Special Case of a More General Formula.

Smith, Donald M. – 1976

The Kuder-Richardson Formula 20 (KR-20) is shown to be a special case, where each examinee is given sufficient time to answer each item, of a more general formula where each examinee may not be allowed the necessary time. The formula is extended to allow two scores, knowledge and speed, to be extracted from each examinee's test score. Using a sample of 82…

Descriptors: Career Development, Comparative Analysis, Grade Point Average, Predictive Measurement
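
For reference, the standard KR-20 coefficient named in the entry above can be sketched as below. This is the ordinary untimed KR-20 for dichotomous items, KR-20 = (n / (n - 1)) * (1 - sum(p_i * q_i) / var(total)), not the more general formula the paper derives, which the abstract does not give. The function name `kr20` and the sample data are hypothetical, and population variance is assumed for the total scores.

```python
from statistics import pvariance


def kr20(responses):
    """Standard KR-20 for a persons-by-items matrix of 0/1 responses.

    Uses the population variance of total scores so that it is consistent
    with the p*q item variances. Illustrative sketch only.
    """
    n_people = len(responses)
    n_items = len(responses[0])

    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)

    # Sum of item variances p*q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 4 examinees, 3 items.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(data))
```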

Analysis of Covariance: Is It the Appropriate Model to Study Change?

Marston, Paul T.; Borich, Gary D. – 1977

The four main approaches to measuring treatment effects in schools (raw gain, residual gain, covariance, and true scores) were compared. A simulation study showed that true score analysis produced a large number of Type I errors. When corrected for this error, this method showed the least power of the four. This outcome was clearly the result of the…

Descriptors: Achievement Gains, Analysis of Covariance, Comparative Analysis, Error of Measurement