ERIC - Search Results

Showing all 15 results

Combining Process Information and Item Response Modeling to Estimate Problem-Solving Ability

Peer reviewed

Xiao, Yue; Veldkamp, Bernard; Liu, Hongyun – Educational Measurement: Issues and Practice, 2022

The action sequences of respondents in problem-solving tasks reflect rich and detailed information about their performance, including differences in problem-solving ability, even if item scores are equal. It is therefore not sufficient to infer individual problem-solving skills based solely on item scores. This study is a preliminary attempt to…

Descriptors: Problem Solving, Item Response Theory, Scores, Item Analysis

What Are the Conditions Associated with Subscore Added Value Noninvariance? Implications for Improving Subscore Interpretation Fairness

Peer reviewed

Rios, Joseph A.; Miranda, Alejandra A. – Educational Measurement: Issues and Practice, 2021

Subscore added value analyses assume invariance across test taking populations; however, this assumption may be untenable in practice as differential subdomain relationships may be present among subgroups. The purpose of this simulation study was to understand the conditions associated with subscore added value noninvariance when manipulating: (1)…

Descriptors: Scores, Test Length, Ability, Correlation

Disrupted Data: Using Longitudinal Assessment Systems to Monitor Test Score Quality

Peer reviewed

An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022

Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…

Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies

A Review of Recent Research on Individual-Level Score Reports

Peer reviewed

Gotch, Chad M.; Roduta Roberts, Mary – Educational Measurement: Issues and Practice, 2018

As the primary interface between test developers and multiple educational stakeholders, score reports are a critical component to the success (or failure) of any assessment program. The purpose of this review is to document recent research on individual-level score reporting to advance the research and practice of score reporting. We conducted a…

Descriptors: Scores, Models, Correlation, Stakeholders

On Natural Variation in Grades in Higher Education, and Its Implications for Assessing Effectiveness of Educational Innovations

Peer reviewed

Boevé, Anja J.; Meijer, Rob R.; Beldhuis, Hans J. A.; Bosker, Roel J.; Albers, Casper J. – Educational Measurement: Issues and Practice, 2019

To investigate the effect of innovations in the teaching-learning environment, researchers often compare study results from different cohorts across years. However, variance in scores can be attributed to both random fluctuation and systematic changes due to the innovation, complicating cohort comparisons. In the present study, we illustrate how…

Descriptors: Grades (Scholastic), Foreign Countries, Teaching Methods, Educational Innovation

On the Choice of Anchor Tests in Equating

Peer reviewed

Sinharay, Sandip – Educational Measurement: Issues and Practice, 2018

The choice of anchor tests is crucial in applications of the nonequivalent groups with anchor test design of equating. Sinharay and Holland (2006, 2007) suggested "miditests," which are anchor tests that are content-representative and have the same mean item difficulty as the total test but have a smaller spread of item difficulties.…

Descriptors: Test Content, Difficulty Level, Test Items, Test Construction

How Should Colleges Treat Multiple Admissions Test Scores?

Peer reviewed

Mattern, Krista; Radunzel, Justine; Bertling, Maria; Ho, Andrew D. – Educational Measurement: Issues and Practice, 2018

The percentage of students retaking college admissions tests is rising. Researchers and college admissions offices currently use a variety of methods for summarizing these multiple scores. Testing organizations such as ACT and the College Board, interested in validity evidence like correlations with first-year grade point average (FYGPA), often…

Descriptors: College Admission, Scores, Correlation, College Entrance Examinations

Studying the Relationships between the Number of APs, AP Performance, and College Outcomes

Peer reviewed

Beard, Jonathan J.; Hsu, Julian; Ewing, Maureen; Godfrey, Kelly E. – Educational Measurement: Issues and Practice, 2019

High school students enroll in Advanced Placement (AP) courses and take AP exams for a variety of reasons. However, a lack of information about the extent to which there are incremental benefits associated with taking multiple AP exams has fostered a perception that students must take many APs to be prepared for college. Conversely, many American…

Descriptors: Correlation, Advanced Placement, Tests, College Preparation

Quantifying Error and Uncertainty Reductions in Scaling Functions: An ITEMS Module

Peer reviewed

Moses, Tim – Educational Measurement: Issues and Practice, 2014

This module describes and extends X-to-Y regression measures that have been proposed for use in the assessment of X-to-Y scaling and equating results. Measures are developed that are similar to those based on prediction error in regression analyses but that are directly suited to interests in scaling and equating evaluations. The regression and…

Descriptors: Scaling, Regression (Statistics), Equated Scores, Comparative Analysis

Covariate Measurement Error Correction for Student Growth Percentiles Using the SIMEX Method

Peer reviewed

Shang, Yi; VanIwaarden, Adam; Betebenner, Damian W. – Educational Measurement: Issues and Practice, 2015

In this study, we examined the impact of covariate measurement error (ME) on the estimation of quantile regression and student growth percentiles (SGPs), and find that SGPs tend to be overestimated among students with higher prior achievement and underestimated among those with lower prior achievement, a problem we describe as ME endogeneity in…

Descriptors: Error of Measurement, Regression (Statistics), Achievement Gains, Students

Examining the Reliability of Student Growth Percentiles Using Multidimensional IRT

Peer reviewed

Monroe, Scott; Cai, Li – Educational Measurement: Issues and Practice, 2015

Student growth percentiles (SGPs, Betebenner, 2009) are used to locate a student's current score in a conditional distribution based on the student's past scores. Currently, following Betebenner (2009), quantile regression (QR) is most often used operationally to estimate the SGPs. Alternatively, multidimensional item response theory (MIRT) may…

Descriptors: Item Response Theory, Reliability, Growth Models, Computation

Application of Latent Trait Models to Identifying Substantively Interesting Raters

Peer reviewed

Wolfe, Edward W.; McVay, Aaron – Educational Measurement: Issues and Practice, 2012

Historically, research focusing on rater characteristics and rating contexts that enable the assignment of accurate ratings and research focusing on statistical indicators of accurate ratings has been conducted by separate communities of researchers. This study demonstrates how existing latent trait modeling procedures can identify groups of…

Descriptors: Researchers, Research, Correlation, Test Bias

The Effect of Ignoring Classroom-Level Variance in Estimating the Generalizability of School Mean Scores

Peer reviewed

Wei, Xin; Haertel, Edward – Educational Measurement: Issues and Practice, 2011

Contemporary educational accountability systems, including state-level systems prescribed under No Child Left Behind as well as those envisioned under the "Race to the Top" comprehensive assessment competition, rely on school-level summaries of student test scores. The precision of these score summaries is almost always evaluated using models that…

Descriptors: Scores, Reliability, Computation, Generalizability Theory

Beyond Accountability and Average Mathematics Scores: Relating State Education Policy Attributes to Cognitive Achievement Domains

Peer reviewed

Desimone, Laura M.; Smith, Thomas M.; Hayes, Susan A.; Frisvold, David – Educational Measurement: Issues and Practice, 2005

We found moderate correlations among four policy attributes (consistency, specificity, authority, and power), which suggest that in many states, at least in design, standards-based reform is working as advocates imagined--aligned content standards and assessments established, backed up by detailed guidelines and frameworks, incentivized by rewards…

Descriptors: Educational Change, Accountability, Educational Policy, Cognitive Development

Building Validity Evidence for Scores on a State-Wide Alternate Assessment: A Contrasting Groups, Multimethod Approach

Peer reviewed

Elliott, Stephen N.; Compton, Elizabeth; Roach, Andrew T. – Educational Measurement: Issues and Practice, 2007

The relationships between ratings on the Idaho Alternate Assessment (IAA) for 116 students with significant disabilities and corresponding ratings for the same students on two norm-referenced teacher rating scales were examined to gain evidence about the validity of resulting IAA scores. To contextualize these findings, another group of 54…

Descriptors: Inferences, Disabilities, Rating Scales, Eligibility
