Lane, Suzanne – Journal of Educational Measurement, 2019
Rater-mediated assessments require the evaluation of the accuracy and consistency of the inferences made by the raters to ensure the validity of score interpretations and uses. Modeling rater response processes allows for a better understanding of how raters map their representations of the examinee performance to their representation of the…
Descriptors: Responses, Accuracy, Validity, Interrater Reliability

Debeer, Dries; Ali, Usama S.; van Rijn, Peter W. – Journal of Educational Measurement, 2017
Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…
Descriptors: Test Format, Test Construction, Statistical Analysis, Comparative Analysis

Lee, Woo-yeol; Cho, Sun-Joo – Journal of Educational Measurement, 2017
Cross-level invariance in a multilevel item response model can be investigated by testing whether the within-level item discriminations are equal to the between-level item discriminations. Testing the cross-level invariance assumption is important to understand constructs in multilevel data. However, in most multilevel item response model…
Descriptors: Test Items, Item Response Theory, Item Analysis, Simulation

Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel

Shani, Esther; Petrosko, Joseph M. – Journal of Educational Measurement, 1976
Data from the Center for the Study of Evaluation's Secondary School Test Evaluations were analyzed to assess the current adequacy of formal procedures for evaluating standardized tests and to propose a future direction for them. (Author/RC)
Descriptors: Evaluation, Evaluation Criteria, Secondary Education, Standardized Tests

Sawyer, Richard L.; And Others – Journal of Educational Measurement, 1976
This article examines some of the values that might be considered in a selection situation within the context of a decision theoretic model also described here. Several alternate expressions of fair selection are suggested in the form of utility statements in which these values can be understood and compared. (Author/DEP)
Descriptors: Bias, Decision Making, Evaluation Criteria, Models

Hoepfner, Ralph; Doherty, William J. – Journal of Educational Measurement, 1973
The profiles of seven major publishers of elementary-level tests were prepared from systematic ratings of the qualities of their tests. Meaningful rating differences among the publishers' priorities were found and three types of publishers could be identified and described. (Authors)
Descriptors: Evaluation Criteria, Publishing Industry, Tables (Data), Test Reviews

Petersen, Nancy S.; Novick, Melvin R. – Journal of Educational Measurement, 1976
Compares and evaluates models for bias in selection. Strategies are compared and evaluated as to their advantages and disadvantages in the areas of business and education. The authors judge some suggested formats for establishing culture-fair selection to be inadequate for the task and to require more complex analysis. (Author/DEP)
Descriptors: Bias, Culture Fair Tests, Decision Making, Evaluation Criteria

Linn, Robert L. – Journal of Educational Measurement, 1976
Discusses several models of fair selection procedures, including the Petersen-Novick model (TM 502 259). (DEP)
Descriptors: Bias, Decision Making, Evaluation Criteria, Models

Popham, W. James – Journal of Educational Measurement, 1978
A defense of the use of standards with criterion-referenced testing is made in response to Glass's article (TM 504 031). (JKS)
Descriptors: Academic Standards, Criterion Referenced Tests, Evaluation Criteria, Mastery Tests

Haladyna, Tom; Roid, Gale – Journal of Educational Measurement, 1981
The rationale for use of instructional sensitivity in the empirical review of test items is examined, and the results of a study that distinguishes instructional sensitivity from other item concepts are presented. Research is reviewed which indicates the existence of instructional sensitivity as a unique criterion-referenced test item concept. (RL)
Descriptors: Criterion Referenced Tests, Difficulty Level, Evaluation Criteria, Pretests Posttests

Block, James H. – Journal of Educational Measurement, 1978
The use of setting standards for criterion-referenced tests is defended in a response to two papers by Gene Glass and Nancy Burton. (JKS)
Descriptors: Academic Standards, Criterion Referenced Tests, Cutting Scores, Decision Making

Levin, Henry M. – Journal of Educational Measurement, 1978
An historical and philosophical perspective on the setting of standards for criterion-referenced testing is presented. The author suggests that performance standards are more symbolic than substantive. (JKS)
Descriptors: Academic Standards, Criterion Referenced Tests, Decision Making, Evaluation Criteria

Zoref, Leslie; Williams, Paul – Journal of Educational Measurement, 1980
Criteria were developed to assess sexual and racial item content bias for every item from six IQ tests. Each reference was judged as either stereotyped or not stereotyped. This analysis revealed an overwhelming sexual and racial imbalance in item content. (Author/RL)
Descriptors: Content Analysis, Ethnic Stereotypes, Evaluation Criteria, Intelligence Tests

Hambleton, Ronald K. – Journal of Educational Measurement, 1978
The use of cut-off scores with criterion referenced tests is defended in this response to two papers by Gene Glass and Nancy Burton. Suggestions for setting cut-off scores are made. (JKS)
Descriptors: Academic Standards, Criterion Referenced Tests, Cutting Scores, Decision Making