Showing all 15 results
Peer reviewed
Xiao, Yue; Veldkamp, Bernard; Liu, Hongyun – Educational Measurement: Issues and Practice, 2022
The action sequences of respondents in problem-solving tasks reflect rich and detailed information about their performance, including differences in problem-solving ability, even if item scores are equal. It is therefore not sufficient to infer individual problem-solving skills based solely on item scores. This study is a preliminary attempt to…
Descriptors: Problem Solving, Item Response Theory, Scores, Item Analysis
Peer reviewed
Rios, Joseph A.; Miranda, Alejandra A. – Educational Measurement: Issues and Practice, 2021
Subscore added value analyses assume invariance across test taking populations; however, this assumption may be untenable in practice as differential subdomain relationships may be present among subgroups. The purpose of this simulation study was to understand the conditions associated with subscore added value noninvariance when manipulating: (1)…
Descriptors: Scores, Test Length, Ability, Correlation
Peer reviewed
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Peer reviewed
Gotch, Chad M.; Roduta Roberts, Mary – Educational Measurement: Issues and Practice, 2018
As the primary interface between test developers and multiple educational stakeholders, score reports are a critical component to the success (or failure) of any assessment program. The purpose of this review is to document recent research on individual-level score reporting to advance the research and practice of score reporting. We conducted a…
Descriptors: Scores, Models, Correlation, Stakeholders
Peer reviewed
Boevé, Anja J.; Meijer, Rob R.; Beldhuis, Hans J. A.; Bosker, Roel J.; Albers, Casper J. – Educational Measurement: Issues and Practice, 2019
To investigate the effect of innovations in the teaching-learning environment, researchers often compare study results from different cohorts across years. However, variance in scores can be attributed to both random fluctuation and systematic changes due to the innovation, complicating cohort comparisons. In the present study, we illustrate how…
Descriptors: Grades (Scholastic), Foreign Countries, Teaching Methods, Educational Innovation
Peer reviewed
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2018
The choice of anchor tests is crucial in applications of the nonequivalent groups with anchor test design of equating. Sinharay and Holland (2006, 2007) suggested "miditests," which are anchor tests that are content-representative and have the same mean item difficulty as the total test but have a smaller spread of item difficulties.…
Descriptors: Test Content, Difficulty Level, Test Items, Test Construction
Peer reviewed
Mattern, Krista; Radunzel, Justine; Bertling, Maria; Ho, Andrew D. – Educational Measurement: Issues and Practice, 2018
The percentage of students retaking college admissions tests is rising. Researchers and college admissions offices currently use a variety of methods for summarizing these multiple scores. Testing organizations such as ACT and the College Board, interested in validity evidence like correlations with first-year grade point average (FYGPA), often…
Descriptors: College Admission, Scores, Correlation, College Entrance Examinations
Peer reviewed
Beard, Jonathan J.; Hsu, Julian; Ewing, Maureen; Godfrey, Kelly E. – Educational Measurement: Issues and Practice, 2019
High school students enroll in Advanced Placement (AP) courses and take AP exams for a variety of reasons. However, a lack of information about the extent to which there are incremental benefits associated with taking multiple AP exams has fostered a perception that students must take many APs to be prepared for college. Conversely, many American…
Descriptors: Correlation, Advanced Placement, Tests, College Preparation
Peer reviewed
Moses, Tim – Educational Measurement: Issues and Practice, 2014
This module describes and extends X-to-Y regression measures that have been proposed for use in the assessment of X-to-Y scaling and equating results. Measures are developed that are similar to those based on prediction error in regression analyses but that are directly suited to interests in scaling and equating evaluations. The regression and…
Descriptors: Scaling, Regression (Statistics), Equated Scores, Comparative Analysis
Peer reviewed
Shang, Yi; VanIwaarden, Adam; Betebenner, Damian W. – Educational Measurement: Issues and Practice, 2015
In this study, we examined the impact of covariate measurement error (ME) on the estimation of quantile regression and student growth percentiles (SGPs), and find that SGPs tend to be overestimated among students with higher prior achievement and underestimated among those with lower prior achievement, a problem we describe as ME endogeneity in…
Descriptors: Error of Measurement, Regression (Statistics), Achievement Gains, Students
Peer reviewed
Monroe, Scott; Cai, Li – Educational Measurement: Issues and Practice, 2015
Student growth percentiles (SGPs, Betebenner, 2009) are used to locate a student's current score in a conditional distribution based on the student's past scores. Currently, following Betebenner (2009), quantile regression (QR) is most often used operationally to estimate the SGPs. Alternatively, multidimensional item response theory (MIRT) may…
Descriptors: Item Response Theory, Reliability, Growth Models, Computation
Peer reviewed
Wolfe, Edward W.; McVay, Aaron – Educational Measurement: Issues and Practice, 2012
Historically, research focusing on rater characteristics and rating contexts that enable the assignment of accurate ratings and research focusing on statistical indicators of accurate ratings has been conducted by separate communities of researchers. This study demonstrates how existing latent trait modeling procedures can identify groups of…
Descriptors: Researchers, Research, Correlation, Test Bias
Peer reviewed
Wei, Xin; Haertel, Edward – Educational Measurement: Issues and Practice, 2011
Contemporary educational accountability systems, including state-level systems prescribed under No Child Left Behind as well as those envisioned under the "Race to the Top" comprehensive assessment competition, rely on school-level summaries of student test scores. The precision of these score summaries is almost always evaluated using models that…
Descriptors: Scores, Reliability, Computation, Generalizability Theory
Peer reviewed
Desimone, Laura M.; Smith, Thomas M.; Hayes, Susan A.; Frisvold, David – Educational Measurement: Issues and Practice, 2005
We found moderate correlations among four policy attributes (consistency, specificity, authority, and power), which suggest that in many states, at least in design, standards-based reform is working as advocates imagined--aligned content standards and assessments established, backed up by detailed guidelines and frameworks, incentivized by rewards…
Descriptors: Educational Change, Accountability, Educational Policy, Cognitive Development
Peer reviewed
Elliott, Stephen N.; Compton, Elizabeth; Roach, Andrew T. – Educational Measurement: Issues and Practice, 2007
The relationships between ratings on the Idaho Alternate Assessment (IAA) for 116 students with significant disabilities and corresponding ratings for the same students on two norm-referenced teacher rating scales were examined to gain evidence about the validity of resulting IAA scores. To contextualize these findings, another group of 54…
Descriptors: Inferences, Disabilities, Rating Scales, Eligibility