ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	13
Since 2016 (last 10 years)	35
Since 2006 (last 20 years)	64

Descriptor

Scores	107
Test Interpretation	27
Test Items	25
Achievement Tests	22
Testing Problems	19
Psychometrics	18
Educational Assessment	17
Test Use	17
Test Validity	17
Elementary Secondary Education	16
Test Results	16
Standardized Tests	14
Test Construction	14
Academic Achievement	11
Item Response Theory	11
Elementary Education	10
Norm Referenced Tests	10
Standards	10
Tests	10
Validity	10
College Entrance Examinations	9
Comparative Analysis	9
Evaluation Methods	9
Measurement	9
Testing	9
More ▼

Source

Educational Measurement:…

107

Publication Type

Journal Articles	107
Reports - Research	37
Reports - Evaluative	35
Reports - Descriptive	21
Opinion Papers	13
Information Analyses	5
Tests/Questionnaires	5
Guides - Non-Classroom	3
Speeches/Meeting Papers	3

Education Level

Higher Education	10
Postsecondary Education	8
Secondary Education	8
High Schools	6
Elementary Education	5
Elementary Secondary Education	5
Grade 3	3
Grade 4	3
Early Childhood Education	2
Grade 5	2
Intermediate Grades	2
Middle Schools	2
Grade 10	1
Grade 6	1
Grade 7	1
Grade 8	1
Junior High Schools	1
Primary Education	1
Two Year Colleges	1
More ▼

Audience

Teachers	2
Counselors	1

Location

Arizona	1
Canada	1
Idaho	1
Kansas	1
South Carolina	1
United States	1

Laws, Policies, & Programs

No Child Left Behind Act 2001	4
Every Student Succeeds Act…	1

Assessments and Surveys

SAT (College Admission Test)	5
ACT Assessment	3
National Assessment of…	2
Program for International…	2
California Achievement Tests	1
Graduate Record Examinations	1
Test of English as a Foreign…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 107 results Save | Export

Defining Test-Score Interpretation, Use, and Claims: Delphi Study for the Validity Argument

Peer reviewed

Direct link

Folger, Timothy D.; Bostic, Jonathan; Krupa, Erin E. – Educational Measurement: Issues and Practice, 2023

Validity is a fundamental consideration of test development and test evaluation. The purpose of this study is to define and reify three key aspects of validity and validation, namely test-score interpretation, test-score use, and the claims supporting interpretation and use. This study employed a Delphi methodology to explore how experts in…

Descriptors: Test Interpretation, Scores, Test Use, Test Validity

Supporting the Interpretive Validity of Student-Level Claims in Science Assessment with Tiered Claim Structures

Peer reviewed

Direct link

Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022

We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…

Descriptors: Science Tests, Test Validity, Test Items, Test Construction

Mode Effects in College Admissions Testing and Differential Speededness as a Possible Explanation

Peer reviewed

Direct link

Steedle, Jeffrey T.; Cho, Young Woo; Wang, Shichao; Arthur, Ann M.; Li, Dongmei – Educational Measurement: Issues and Practice, 2022

As testing programs transition from paper to online testing, they must study mode comparability to support the exchangeability of scores from different testing modes. To that end, a series of three mode comparability studies was conducted during the 2019-2020 academic year with examinees randomly assigned to take the ACT college admissions exam on…

Descriptors: College Entrance Examinations, Computer Assisted Testing, Scores, Test Format

Disrupted Data: Using Longitudinal Assessment Systems to Monitor Test Score Quality

Peer reviewed

Direct link

An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022

Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…

Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies

Reporting Pass-Fail Decisions to Examinees with Incomplete Data: A Commentary on Feinberg (2021)

Peer reviewed

Direct link

Sinharay, Sandip – Educational Measurement: Issues and Practice, 2022

Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores, and hence to incomplete data, on credentialing tests such as the United States Medical Licensing examination. Feinberg compared four approaches for reporting pass-fail decisions to the examinees with incomplete data on credentialing…

Descriptors: Testing Problems, High Stakes Tests, Credentials, Test Items

Digital Module 32: Understanding and Mitigating the Impact of Low Effort on Common Uses of Test and Survey Scores

Peer reviewed

Direct link

Soland, James – Educational Measurement: Issues and Practice, 2023

Most individuals who take, interpret, design, or score tests are aware that examinees do not always provide full effort when responding to items. However, many such individuals are not aware of how pervasive the issue is, what its consequences are, and how to address it. In this digital ITEMS module, Dr. James Soland will help fill these gaps in…

Descriptors: Student Behavior, Tests, Scores, Incidence

What Are the Conditions Associated with Subscore Added Value Noninvariance? Implications for Improving Subscore Interpretation Fairness

Peer reviewed

Direct link

Rios, Joseph A.; Miranda, Alejandra A. – Educational Measurement: Issues and Practice, 2021

Subscore added value analyses assume invariance across test taking populations; however, this assumption may be untenable in practice as differential subdomain relationships may be present among subgroups. The purpose of this simulation study was to understand the conditions associated with subscore added value noninvariance when manipulating: (1)…

Descriptors: Scores, Test Length, Ability, Correlation

Affordances of Item Formats and Their Effects on Test-Taker Cognition under Uncertainty

Peer reviewed

Direct link

Moon, Jung Aa; Keehner, Madeleine; Katz, Irvin R. – Educational Measurement: Issues and Practice, 2019

The current study investigated how item formats and their inherent affordances influence test-takers' cognition under uncertainty. Adult participants solved content-equivalent math items in multiple-selection multiple-choice and four alternative grid formats. The results indicated that participants' affirmative response tendency (i.e., judge the…

Descriptors: Affordances, Test Items, Test Format, Test Wiseness

Reconceptualization of Coefficient Alpha Reliability for Test Summed and Scaled Scores

Peer reviewed

Direct link

Almehrizi, Rashid S. – Educational Measurement: Issues and Practice, 2022

Coefficient alpha reliability persists as the most common reliability coefficient reported in research. The assumptions for its use are, however, not well-understood. The current paper challenges the commonly used expressions of coefficient alpha and argues that while these expressions are correct when estimating reliability for summed scores,…

Descriptors: Reliability, Scores, Scaling, Statistical Analysis

Digital Module 13: Monte Carlo Simulation Studies in Item Response Theory

Peer reviewed

Direct link

Leventhal, Brian; Ames, Allison – Educational Measurement: Issues and Practice, 2020

In this digital ITEMS module, Dr. Brian Leventhal and Dr. Allison Ames provide an overview of "Monte Carlo simulation studies" (MCSS) in "item response theory" (IRT). MCSS are utilized for a variety of reasons, one of the most compelling being that they can be used when analytic solutions are impractical or nonexistent because…

Descriptors: Item Response Theory, Monte Carlo Methods, Simulation, Test Items

The Relationship between Item Developer Alignment of Items to Range Achievement-Level Descriptors and Item Difficulty: Implications for Validating Intended Score Interpretations

Peer reviewed

Direct link

Schneider, M. Christina; Agrimson, Jared; Veazey, Mary – Educational Measurement: Issues and Practice, 2022

This paper presents results of a score interpretation study for a computer adaptive mathematics assessment. The study purpose was to test the efficacy of item developers' alignment of items to Range Achievement-Level Descriptors (RALDs; Egan et al.) against the empirical achievement-level alignment of items to investigate the use of RALDs as the…

Descriptors: Computer Assisted Testing, Mathematics Tests, Scores, Grade 3

Evaluating Item Fit Statistic Thresholds in PISA: Analysis of Cross-Country Comparability of Cognitive Items

Peer reviewed

Direct link

Joo, Seang-Hwane; Khorramdel, Lale; Yamamoto, Kentaro; Shin, Hyo Jeong; Robin, Frederic – Educational Measurement: Issues and Practice, 2021

In Programme for International Student Assessment (PISA), item response theory (IRT) scaling is used to examine the psychometric properties of items and scales and to provide comparable test scores across participating countries and over time. To balance the comparability of IRT item parameter estimations across countries with the best possible…

Descriptors: Foreign Countries, International Assessment, Achievement Tests, Secondary School Students

State Assessment Score Reporting Practices for English Learner Parents

Peer reviewed

Direct link

Rios, Joseph A.; Ihlenfeldt, Samuel D. – Educational Measurement: Issues and Practice, 2021

This study sought to investigate how states communicate results for academic achievement and English language proficiency (ELP) assessments to parents who are English learners (EL). This objective was addressed by evaluating: (a) whether score reports and interpretive guides for state academic achievement and ELP assessments in each state were…

Descriptors: Parents, English Language Learners, Communication (Thought Transfer), Scores

Impact of Both Local Item Dependencies and Cut-Point Locations on Examinee Classifications

Peer reviewed

Direct link

Rubright, Jonathan D. – Educational Measurement: Issues and Practice, 2018

Performance assessments, scenario-based tasks, and other groups of items carry a risk of violating the local item independence assumption made by unidimensional item response theory (IRT) models. Previous studies have identified negative impacts of ignoring such violations, most notably inflated reliability estimates. Still, the influence of this…

Descriptors: Performance Based Assessment, Item Response Theory, Models, Test Reliability

How Well Does the Sum Score Summarize the Test? Summability as a Measure of Internal Consistency

Peer reviewed

Direct link

Goeman, J. J.; De Jong, N. H. – Educational Measurement: Issues and Practice, 2018

Many researchers use Cronbach's alpha to demonstrate internal consistency, even though it has been shown numerous times that Cronbach's alpha is not suitable for this. Because the intention of questionnaire and test constructers is to summarize the test by its overall sum score, we advocate summability, which we define as the proportion of total…

Descriptors: Tests, Scores, Questionnaires, Measurement

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8

Hills, John R.	5
Sinharay, Sandip	5
Frisbie, David A.	3
Sireci, Stephen G.	3
Brennan, Robert L.	2
Cannell, John Jacob	2
D'Agostino, Jerome V.	2
Feinberg, Richard A.	2
Ho, Andrew D.	2
Kolen, Michael J.	2
Krupa, Erin E.	2
Mehrens, William A.	2
Puhan, Gautam	2
Rios, Joseph A.	2
Roberts, Mary Roduta	2
Wainer, Howard	2
Agrimson, Jared	1
Almehrizi, Rashid S.	1
Ames, Allison	1
An, Chen	1
An, Lily Shiao	1
Anderson, Gretchen	1
Armstrong, Anne-Marie	1
Arthur, Ann M.	1
More ▼