Lane, Suzanne – Journal of Educational Measurement, 2019
Rater-mediated assessments require the evaluation of the accuracy and consistency of the inferences made by the raters to ensure the validity of score interpretations and uses. Modeling rater response processes allows for a better understanding of how raters map their representations of the examinee performance to their representation of the…
Descriptors: Responses, Accuracy, Validity, Interrater Reliability

Debeer, Dries; Ali, Usama S.; van Rijn, Peter W. – Journal of Educational Measurement, 2017
Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…
Descriptors: Test Format, Test Construction, Statistical Analysis, Comparative Analysis

Lee, Woo-yeol; Cho, Sun-Joo – Journal of Educational Measurement, 2017
Cross-level invariance in a multilevel item response model can be investigated by testing whether the within-level item discriminations are equal to the between-level item discriminations. Testing the cross-level invariance assumption is important to understand constructs in multilevel data. However, in most multilevel item response model…
Descriptors: Test Items, Item Response Theory, Item Analysis, Simulation

Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel

Shani, Esther; Petrosko, Joseph M. – Journal of Educational Measurement, 1976
Data from the Center for the Study of Evaluation's Secondary School Test Evaluations were analyzed to assess the current adequacy of formal procedures for evaluating standardized tests and to propose a future direction for them. (Author/RC)
Descriptors: Evaluation, Evaluation Criteria, Secondary Education, Standardized Tests

Sawyer, Richard L.; And Others – Journal of Educational Measurement, 1976
This article examines some of the values that might be considered in a selection situation within the context of a decision theoretic model also described here. Several alternate expressions of fair selection are suggested in the form of utility statements in which these values can be understood and compared. (Author/DEP)
Descriptors: Bias, Decision Making, Evaluation Criteria, Models

Hoepfner, Ralph; Doherty, William J. – Journal of Educational Measurement, 1973
The profiles of seven major publishers of elementary-level tests were prepared from systematic ratings of the qualities of their tests. Meaningful rating differences among the publishers' priorities were found and three types of publishers could be identified and described. (Authors)
Descriptors: Evaluation Criteria, Publishing Industry, Tables (Data), Test Reviews

Petersen, Nancy S.; Novick, Melvin R. – Journal of Educational Measurement, 1976
Compares and evaluates models for bias in selection. Strategies are compared and evaluated as to their advantages and disadvantages in the areas of business and education. The authors judge some suggested formats for establishing culture-fair selection to be inadequate for the task and to require more complex analysis. (Author/DEP)
Descriptors: Bias, Culture Fair Tests, Decision Making, Evaluation Criteria

Linn, Robert L. – Journal of Educational Measurement, 1976
Discusses several models of fair selection procedures, including the Petersen-Novick model (TM 502 259). (DEP)
Descriptors: Bias, Decision Making, Evaluation Criteria, Models

Popham, W. James – Journal of Educational Measurement, 1978
A defense of the use of standards with criterion-referenced testing is made in response to Glass's article (TM 504 031). (JKS)
Descriptors: Academic Standards, Criterion Referenced Tests, Evaluation Criteria, Mastery Tests

Haladyna, Tom; Roid, Gale – Journal of Educational Measurement, 1981
The rationale for use of instructional sensitivity in the empirical review of test items is examined, and the results of a study that distinguishes instructional sensitivity from other item concepts are presented. Research is reviewed which indicates the existence of instructional sensitivity as a unique criterion-referenced test item concept. (RL)
Descriptors: Criterion Referenced Tests, Difficulty Level, Evaluation Criteria, Pretests Posttests

Block, James H. – Journal of Educational Measurement, 1978
The use of setting standards for criterion-referenced tests is defended in a response to two papers by Gene Glass and Nancy Burton. (JKS)
Descriptors: Academic Standards, Criterion Referenced Tests, Cutting Scores, Decision Making

Levin, Henry M. – Journal of Educational Measurement, 1978
An historical and philosophical perspective on the setting of standards for criterion-referenced testing is presented. The author suggests that performance standards are more symbolic than substantive. (JKS)
Descriptors: Academic Standards, Criterion Referenced Tests, Decision Making, Evaluation Criteria

Zoref, Leslie; Williams, Paul – Journal of Educational Measurement, 1980
Criteria were developed to assess sexual and racial item content bias for every item from six IQ tests. Each reference was judged as either stereotyped or not stereotyped. This analysis revealed an overwhelming sexual and racial imbalance in item content. (Author/RL)
Descriptors: Content Analysis, Ethnic Stereotypes, Evaluation Criteria, Intelligence Tests

Hambleton, Ronald K. – Journal of Educational Measurement, 1978
The use of cut-off scores with criterion referenced tests is defended in this response to two papers by Gene Glass and Nancy Burton. Suggestions for setting cut-off scores are made. (JKS)
Descriptors: Academic Standards, Criterion Referenced Tests, Cutting Scores, Decision Making