Showing all 14 results
Peer reviewed
Qu, Yanxuan; Sinharay, Sandip – ETS Research Report Series, 2023
Though a substantial amount of research exists on imputing missing scores in educational assessments, there is little research on cases where responses or scores to an item are missing for all test takers. In this paper, we tackled the problem of imputing missing scores for tests for which the responses to an item are missing for all test takers.…
Descriptors: Scores, Test Items, Accuracy, Psychometrics
Peer reviewed
Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019
The Mantel-Haenszel delta difference (MH D-DIF) and the standardized proportion difference (STD P-DIF) are two observed-score methods that have been used to assess differential item functioning (DIF) at Educational Testing Service since the early 1990s. Latent-variable approaches to assessing measurement invariance at the item level have been…
Descriptors: Test Bias, Educational Testing, Statistical Analysis, Item Response Theory
Peer reviewed
Tannenbaum, Richard J.; Kane, Michael T. – ETS Research Report Series, 2019
Testing programs are often classified as high or low stakes to indicate how stringently they need to be evaluated. However, in practice, this classification falls short. A high-stakes label is taken to imply that all indicators of measurement quality must meet high standards; whereas a low-stakes label is taken to imply the opposite. This approach…
Descriptors: High Stakes Tests, Testing Programs, Measurement, Evaluation Criteria
Peer reviewed
Reckase, Mark D. – ETS Research Report Series, 2017
A common interpretation of achievement test results is that they provide measures of achievement that are much like other measures we commonly use for height, weight, or the cost of goods. In a limited sense, such interpretations are correct, but some nuances of these interpretations have important implications for the use of achievement test…
Descriptors: Models, Achievement Tests, Test Results, Test Construction
Peer reviewed
Russell, Javarro; Markle, Ross – ETS Research Report Series, 2017
From 2006 to 2008, Educational Testing Service (ETS) produced a series of reports titled "A Culture of Evidence," designed to capture a changing climate in higher education assessment. A decade later, colleges and universities already face new and different challenges resulting from societal, technological, and scientific influences.…
Descriptors: Evidence Based Practice, Evidence, Educational Testing, Educational Improvement
Peer reviewed
Bennett, Randy E. – ETS Research Report Series, 2016
Media reports have recently given significant attention to the opt-out movement, an organized effort to refuse to take standardized tests. Although the narrative often told in early press accounts was of a viral grass-roots effort led by parents who object to state-mandated testing, the reality has turned out to be more complicated. Through a…
Descriptors: Educational Testing, Standardized Tests, Resistance (Psychology), Activism
Peer reviewed
Zwick, Rebecca; Ye, Lei; Isham, Steven – ETS Research Report Series, 2013
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. Although it is often assumed that refinement of the matching criterion always provides more accurate DIF results, the actual situation proves to be more complex. To explore the effectiveness of refinement, we…
Descriptors: Test Bias, Statistical Analysis, Simulation, Educational Testing
Peer reviewed
van Rijn, Peter W.; Rijmen, Frank – ETS Research Report Series, 2012
Hooker and colleagues addressed a paradoxical situation that can arise in the application of multidimensional item response theory (MIRT) models to educational test data. We demonstrate that this MIRT paradox is an instance of the explaining-away phenomenon in Bayesian networks, and we attempt to enhance the understanding of MIRT models by placing…
Descriptors: Item Response Theory, Educational Testing, Bayesian Statistics, Statistical Analysis
Peer reviewed
Markle, Ross; Brenneman, Meghan; Jackson, Teresa; Burrus, Jeremy; Robbins, Steven – ETS Research Report Series, 2013
The public, education, and workforce sectors all have expressed interest regarding the key knowledge, skills, and abilities that enable individuals to be productive members of society. Although past efforts have attempted to create frameworks of student learning outcomes, the results have varied due to different perspectives and goals. Thus, the…
Descriptors: Higher Education, Outcomes of Education, Educational Testing, Creativity
Peer reviewed
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
Peer reviewed
Zaromb, Franklin; Adler, Rachel M.; Bruce, Kelly; Attali, Yigal; Rock, JoAnn – ETS Research Report Series, 2014
This study investigates the benefits of no-stakes educational testing during students' summer vacation as a strategy to mitigate summer learning loss. Fifty-one students in Grades 3-8 from the Every Child Valued (ECV) and Lawrence Community Center (LCC) summer programs in Lawrenceville, NJ, took short, online assessments throughout the summer,…
Descriptors: Educational Testing, Summer Programs, Grade 3, Grade 4
Peer reviewed
Rock, Donald A. – ETS Research Report Series, 2012
This paper provides a history of ETS's role in developing assessment instruments and psychometric procedures for measuring change in large-scale national assessments funded by the Longitudinal Studies branch of the National Center for Education Statistics. It documents the innovations developed during more than 30 years of working with…
Descriptors: Models, Educational Change, Longitudinal Studies, Educational Development
Peer reviewed
Xu, Xueli – ETS Research Report Series, 2007
Monotonicity properties of a general diagnostic model (GDM) are considered in this paper. Simple data summaries are identified to inform about the ordered categories of latent traits. The findings are very much in accordance with the statements made about the GPCM (Hemker, Sijtsma, Molenaar, & Junker, 1996, 1997). On the one hand, by fitting a…
Descriptors: Models, Statistical Analysis, Educational Testing, Item Response Theory
Peer reviewed
Haberman, Shelby J. – ETS Research Report Series, 2008
In educational testing, subscores may be provided based on a portion of the items from a larger test. One consideration in evaluation of such subscores is their ability to predict a criterion score. Two limitations on prediction exist. The first, which is well known, is that the coefficient of determination for linear prediction of the criterion…
Descriptors: Scores, Validity, Educational Testing, Correlation