Livingston, Samuel A. – 1984
Much previously published material for estimating the reliability of classification has been based on the assumption that a test consists of a known number of equally weighted items. The test score is the number of those items answered correctly. These methods cannot be used with classifications based on weighted composite scores, especially if…
Descriptors: Equated Scores, Essay Tests, Estimation (Mathematics), Mathematical Models

Hanna, Gerald S.; And Others – Journal of School Psychology, 1981
Discusses four ubiquitous major sources of measurement error for individual intelligence scales. Argues that where these sources cannot be directly investigated, they should be estimated rather than ignored. Estimates the typical magnitude of error arising from each source: content sampling, time sampling, scoring, and administration. (Author)
Descriptors: Error of Measurement, Intelligence Tests, Measurement Techniques, Sampling
Attali, Yigal – ETS Research Report Series, 2007
This study examined the construct validity of the "e-rater"® automated essay scoring engine as an alternative to human scoring in the context of TOEFL® essay writing. Analyses were based on a sample of students who repeated the TOEFL within a short time period. Two "e-rater" scores were investigated in this study, the first…
Descriptors: Construct Validity, Computer Assisted Testing, Scoring, English (Second Language)
Gleser, Leon Jay – 1971
An attempt is made to indicate why the concept of "true score" naturally leads to the belief that test validity must increase with an increase in test and/or average item reliability, and why this is correct for the classical single-factor model first introduced by Spearman. The statistical model used by Loevinger is introduced to…
Descriptors: Factor Analysis, Item Analysis, Mathematical Models, Measurement Techniques

Feldt, Leonard S.; Spray, Judith A. – Research Quarterly for Exercise and Sport, 1983
The reliabilities of two types of measurement plans were compared across six hypothetical distributions of true scores or abilities. The measurement plans were: (1) fixed-length, where the number of trials for all examinees is set in advance; and (2) trials-to-criterion, where examinees must keep trying until they complete a given number of trials…
Descriptors: Criterion Referenced Tests, Evaluation Methods, Higher Education, Measurement Techniques
Smith, Donald M. – 1976
The Kuder-Richardson Formula 20 is shown to be a special case, in which each examinee is given sufficient time to answer each item, of a more general formula that applies when examinees may not be allowed the necessary time. The formula is extended to allow two scores, knowledge and speed, to be extracted from each examinee's test score. Using a sample of 82…
Descriptors: Career Development, Comparative Analysis, Grade Point Average, Predictive Measurement
Brennan, Robert L. – 1974
The first four chapters of this report primarily provide an extensive, critical review of the literature with regard to selected aspects of the criterion-referenced and mastery testing fields. Major topics treated include: (a) definitions, distinctions, and background, (b) the relevance of classical test theory, (c) validity and procedures for…
Descriptors: Computer Programs, Confidence Testing, Criterion Referenced Tests, Error of Measurement
Perry, Dallis – 1971
Principles of test administration, test validity, and accuracy of measurement underlying interpretation of standardized test scores in educational administration, instruction, and guidance are presented. Types of norm-referenced score transformations, including percentiles, standard scores, and grade equivalents, and of criterion referenced…
Descriptors: Criterion Referenced Tests, Error of Measurement, Evaluation, Expectancy Tables
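
As background for the Smith (1976) entry above, the standard Kuder-Richardson Formula 20 for dichotomously scored items can be sketched as follows. This is the textbook untimed formula, not the more general time-limited version the abstract describes. The function name `kr20` and the sample data are illustrative assumptions, and population variance is assumed for the total scores so that it matches the p(1 - p) item variances.

```python
from statistics import pvariance


def kr20(responses):
    """Textbook KR-20 for a matrix of 0/1 item scores.

    responses: list of examinee rows, each a list of 0/1 item scores.
    Assumes at least two items and non-zero total-score variance.
    """
    n_items = len(responses[0])
    n_examinees = len(responses)
    # Proportion of examinees answering each item correctly (p); q = 1 - p.
    p = [sum(row[i] for row in responses) / n_examinees for i in range(n_items)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Population variance of total scores (assumed convention, see lead-in).
    total_var = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)


# Made-up data: 5 examinees by 4 items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 3))  # 0.8
```

With these made-up scores the item variances sum to 0.8 and the total-score variance is 2, giving (4/3)(1 - 0.4) = 0.8.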