Showing all 11 results
Peer reviewed
Papenberg, Martin; Musch, Jochen – Applied Measurement in Education, 2017
In multiple-choice tests, the quality of distractors may be more important than their number. We therefore examined the joint influence of distractor quality and quantity on test functioning by providing a sample of 5,793 participants with five parallel test sets consisting of items that differed in the number and quality of distractors.…
Descriptors: Multiple Choice Tests, Test Items, Test Validity, Test Reliability
Peer reviewed
Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018
In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…
Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing
Peer reviewed
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
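The aggregation step in comparative judgment can be sketched with a Bradley-Terry model fit by minorization-maximization; this is a generic illustration, not the authors' procedure, and every judgment below is invented:

```python
# Comparative-judgment sketch: aggregate pairwise "which essay is better?"
# decisions into per-essay quality scores via a Bradley-Terry model.
# All judgments are invented for illustration.
from collections import defaultdict

# (winner, loser) pairs from hypothetical judges
judgments = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("B", "A"), ("C", "B"), ("A", "C"),
]

essays = sorted({e for pair in judgments for e in pair})
wins = defaultdict(int)
matches = defaultdict(int)  # comparison counts per ordered pair
for winner, loser in judgments:
    wins[winner] += 1
    matches[(winner, loser)] += 1
    matches[(loser, winner)] += 1

# Minorization-maximization iteration for Bradley-Terry strengths
strength = {e: 1.0 for e in essays}
for _ in range(200):
    new = {}
    for i in essays:
        denom = sum(
            matches[(i, j)] / (strength[i] + strength[j])
            for j in essays
            if j != i
        )
        new[i] = wins[i] / denom
    total = sum(new.values())
    strength = {e: s / total for e, s in new.items()}

ranking = sorted(essays, key=strength.get, reverse=True)
print(ranking)  # A has the best win record, C the worst
```

Because scores come only from relative decisions, no judge ever applies a rubric score point directly — which is the source of the bias reduction the abstract mentions.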
Peer reviewed
Clauser, Jerome C.; Clauser, Brian E.; Hambleton, Ronald K. – Applied Measurement in Education, 2014
The purpose of the present study was to extend past work with the Angoff method for setting standards by examining judgments at the judge level rather than the panel level. The focus was on investigating the relationship between observed Angoff standard setting judgments and empirical conditional probabilities. This relationship has been used as a…
Descriptors: Standard Setting (Scoring), Validity, Reliability, Correlation
Peer reviewed
Sawyer, Richard – Applied Measurement in Education, 2013
Correlational evidence suggests that high school GPA is better than admission test scores in predicting first-year college GPA, although test scores have incremental predictive validity. The usefulness of a selection variable in making admission decisions depends in part on its predictive validity, but also on institutions' selectivity and…
Descriptors: High Schools, Grade Point Average, College Entrance Examinations, College Admission
Peer reviewed
Eklöf, Hanna; Pavešic, Barbara Japelj; Grønmo, Liv Sissel – Applied Measurement in Education, 2014
The purpose of the study was to measure students' reported test-taking effort and the relationship between reported effort and performance on the Trends in International Mathematics and Science Study (TIMSS) Advanced mathematics test. This was done in three countries participating in TIMSS Advanced 2008 (Sweden, Norway, and Slovenia), and the…
Descriptors: Mathematics Tests, Cross Cultural Studies, Foreign Countries, Correlation
Peer reviewed
Stone, Clement A.; Ye, Feifei; Zhu, Xiaowen; Lane, Suzanne – Applied Measurement in Education, 2010
Although reliability of subscale scores may be suspect, subscale scores are the most common type of diagnostic information included in student score reports. This research compared methods for augmenting the reliability of subscale scores for an 8th-grade mathematics assessment. Yen's Objective Performance Index, Wainer et al.'s augmented scores,…
Descriptors: Item Response Theory, Case Studies, Reliability, Scores
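The simplest form of subscale-score augmentation is Kelley's regressed estimate, which shrinks an unreliable observed score toward the group mean; the article compares more elaborate methods (Yen's OPI, Wainer et al.'s augmented scores), but this hypothetical sketch shows the basic idea:

```python
# Kelley's regressed true-score estimate: shrink an observed subscale
# score toward the group mean in proportion to its unreliability.
# Scores, reliability, and mean below are invented for illustration.
def kelley_estimate(score, reliability, group_mean):
    """Estimated true score: rel * observed + (1 - rel) * mean."""
    return reliability * score + (1 - reliability) * group_mean

print(kelley_estimate(30, 0.6, 50))  # low score shrinks up toward 50
print(kelley_estimate(70, 0.6, 50))  # high score shrinks down toward 50
```

Augmentation methods such as Wainer et al.'s extend this by also borrowing information from the examinee's other subscale scores.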
Peer reviewed
Bridgeman, Brent; Burton, Nancy; Cline, Frederick – Applied Measurement in Education, 2009
Descriptions of validity results based solely on correlation coefficients or percent of the variance accounted for are not merely difficult to interpret, they are likely to be misinterpreted. Predictors that apparently account for a small percent of the variance may actually be highly important from a practical perspective. This study combined two…
Descriptors: Predictive Validity, College Entrance Examinations, Graduate Study, Grade Point Average
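The practical-importance argument can be made concrete with a simulation: a predictor correlating r = 0.30 with the criterion "accounts for" only 9% of the variance, yet top- and bottom-quartile applicants differ sharply in success rates. This is a generic illustration, not the study's analysis:

```python
# Simulate (x, y) with corr(x, y) = 0.30 and compare "success" rates
# (criterion above average) across predictor quartiles. Invented data.
import random

random.seed(0)
r = 0.30
n = 100_000
pairs = []
for _ in range(n):
    x = random.gauss(0, 1)
    # construct y so that corr(x, y) = r
    y = r * x + (1 - r ** 2) ** 0.5 * random.gauss(0, 1)
    pairs.append((x, y))

pairs.sort(key=lambda p: p[0])
bottom, top = pairs[: n // 4], pairs[-(n // 4):]

def success_rate(group):
    """Fraction of the group whose criterion score is above average."""
    return sum(y > 0 for _, y in group) / len(group)

print(f"bottom quartile: {success_rate(bottom):.2f}")
print(f"top quartile:    {success_rate(top):.2f}")
```

The success-rate gap between quartiles is far easier to interpret than "9% of variance," which is the abstract's point.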
Peer reviewed
Osborn Popp, Sharon E.; Ryan, Joseph M.; Thompson, Marilyn S. – Applied Measurement in Education, 2009
Scoring rubrics are routinely used to evaluate the quality of writing samples produced for writing performance assessments, with anchor papers chosen to represent score points defined in the rubric. Although the careful selection of anchor papers is associated with best practices for scoring, little research has been conducted on the role of…
Descriptors: Writing Evaluation, Scoring Rubrics, Selection, Scoring
Peer reviewed
Kane, Michael T.; Mroch, Andrew A. – Applied Measurement in Education, 2010
In evaluating the relationship between two measures across different groups (i.e., in evaluating "differential validity") it is necessary to examine differences in correlation coefficients and in regression lines. Ordinary least squares (OLS) regression is the standard method for fitting lines to data, but its criterion for optimal fit…
Descriptors: Least Squares Statistics, Regression (Statistics), Differences, Validity
Peer reviewed
Feldt, Leonard S. – Applied Measurement in Education, 1997
It has often been asserted that the reliability of a measure places an upper limit on its validity. This article demonstrates in theory that validity can rise when reliability declines, even when validity evidence is a correlation with an acceptable criterion. Whether empirical examples can actually be found is an open question. (SLD)
Descriptors: Correlation, Criteria, Reliability, Test Construction
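The "upper limit" claim Feldt examines comes from the classical Spearman attenuation relation, under which observed validity cannot exceed the square root of the predictor's reliability; this sketch shows only that standard relation (with invented numbers), not Feldt's counterargument:

```python
# Classical-test-theory sketch: Spearman attenuation gives
# r_xy = r_true * sqrt(r_xx * r_yy), so r_xy <= sqrt(r_xx).
# Reliability and validity values are invented for illustration.
from math import sqrt

def observed_validity(true_validity, rel_x, rel_y):
    """Spearman attenuation: r_xy = r_true * sqrt(r_xx * r_yy)."""
    return true_validity * sqrt(rel_x * rel_y)

rel_x, rel_y = 0.81, 0.90
r_true = 0.70
r_xy = observed_validity(r_true, rel_x, rel_y)
print(round(r_xy, 3), round(sqrt(rel_x), 3))
assert r_xy <= sqrt(rel_x)  # the usual "reliability ceiling"
```

Feldt's point is that the assumptions behind this ceiling need not hold, so validity can in theory rise even as reliability declines.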