ERIC - Search Results

Showing all 14 results

Methods for Imputing Scores When All Responses Are Missing for One or More Polytomous Items: Accuracy and Impact on Psychometric Property. Research Report. ETS RR-23-07

Peer reviewed

Qu, Yanxuan; Sinharay, Sandip – ETS Research Report Series, 2023

Though a substantial amount of research exists on imputing missing scores in educational assessments, there is little research on cases where responses or scores to an item are missing for all test takers. In this paper, we tackled the problem of imputing missing scores for tests for which the responses to an item are missing for all test takers.…

Descriptors: Scores, Test Items, Accuracy, Psychometrics

A Note on Using Weighted Sum Scores in the P-DIF Statistic. Research Report. ETS RR-19-32

Peer reviewed

Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019

The Mantel-Haenszel delta difference (MH D-DIF) and the standardized proportion difference (STD P-DIF) are two observed-score methods that have been used to assess differential item functioning (DIF) at Educational Testing Service since the early 1990s. Latent-variable approaches to assessing measurement invariance at the item level have been…

Descriptors: Test Bias, Educational Testing, Statistical Analysis, Item Response Theory

Stakes in Testing: Not a Simple Dichotomy but a Profile of Consequences That Guides Needed Evidence of Measurement Quality. Research Report. ETS RR-19-19

Peer reviewed

Tannenbaum, Richard J.; Kane, Michael T. – ETS Research Report Series, 2019

Testing programs are often classified as high or low stakes to indicate how stringently they need to be evaluated. However, in practice, this classification falls short. A high-stakes label is taken to imply that all indicators of measurement quality must meet high standards; whereas a low-stakes label is taken to imply the opposite. This approach…

Descriptors: High Stakes Tests, Testing Programs, Measurement, Evaluation Criteria

A Tale of Two Models: Sources of Confusion in Achievement Testing. Research Report. ETS RR-17-44

Peer reviewed

Reckase, Mark D. – ETS Research Report Series, 2017

A common interpretation of achievement test results is that they provide measures of achievement that are much like other measures we commonly use for height, weight, or the cost of goods. In a limited sense, such interpretations are correct, but some nuances of these interpretations have important implications for the use of achievement test…

Descriptors: Models, Achievement Tests, Test Results, Test Construction

Continuing a Culture of Evidence: Assessment for Improvement. Research Report. ETS RR-17-08

Peer reviewed

Russell, Javarro; Markle, Ross – ETS Research Report Series, 2017

From 2006 to 2008, Educational Testing Service (ETS) produced a series of reports titled "A Culture of Evidence," designed to capture a changing climate in higher education assessment. A decade later, colleges and universities already face new and different challenges resulting from societal, technological, and scientific influences.…

Descriptors: Evidence Based Practice, Evidence, Educational Testing, Educational Improvement

Opt Out: An Examination of Issues. Research Report. ETS RR-16-13

Peer reviewed

Bennett, Randy E. – ETS Research Report Series, 2016

Media reports have recently given significant attention to the opt-out movement, an organized effort to refuse to take standardized tests. Although the narrative often told in early press accounts was of a viral grass-roots effort led by parents who object to state-mandated testing, the reality has turned out to be more complicated. Through a…

Descriptors: Educational Testing, Standardized Tests, Resistance (Psychology), Activism

An Investigation of the Efficacy of Criterion Refinement Procedures in Mantel-Haenszel DIF Analysis. Research Report. ETS RR-13-16

Peer reviewed

Zwick, Rebecca; Ye, Lei; Isham, Steven – ETS Research Report Series, 2013

Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. Although it is often assumed that refinement of the matching criterion always provides more accurate DIF results, the actual situation proves to be more complex. To explore the effectiveness of refinement, we…

Descriptors: Test Bias, Statistical Analysis, Simulation, Educational Testing

A Note on Explaining Away and Paradoxical Results in Multidimensional Item Response Theory. Research Report. ETS RR-12-13

Peer reviewed

van Rijn, Peter W.; Rijmen, Frank – ETS Research Report Series, 2012

Hooker and colleagues addressed a paradoxical situation that can arise in the application of multidimensional item response theory (MIRT) models to educational test data. We demonstrate that this MIRT paradox is an instance of the explaining-away phenomenon in Bayesian networks, and we attempt to enhance the understanding of MIRT models by placing…

Descriptors: Item Response Theory, Educational Testing, Bayesian Statistics, Statistical Analysis

Synthesizing Frameworks of Higher Education Student Learning Outcomes. Research Report. ETS RR-13-22

Peer reviewed

Markle, Ross; Brenneman, Meghan; Jackson, Teresa; Burrus, Jeremy; Robbins, Steven – ETS Research Report Series, 2013

The public, education, and workforce sectors all have expressed interest regarding the key knowledge, skills, and abilities that enable individuals to be productive members of society. Although past efforts have attempted to create frameworks of student learning outcomes, the results have varied due to different perspectives and goals. Thus, the…

Descriptors: Higher Education, Outcomes of Education, Educational Testing, Creativity

A Review of ETS Differential Item Functioning Assessment Procedures: Flagging Rules, Minimum Sample Size Requirements, and Criterion Refinement. Research Report. ETS RR-12-08

Peer reviewed

Zwick, Rebecca – ETS Research Report Series, 2012

Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…

Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods

Using No-Stakes Educational Testing to Mitigate Summer Learning Loss: A Pilot Study. Research Report. ETS RR-14-21

Peer reviewed

Zaromb, Franklin; Adler, Rachel M.; Bruce, Kelly; Attali, Yigal; Rock, JoAnn – ETS Research Report Series, 2014

This study investigates the benefits of no-stakes educational testing during students' summer vacation as a strategy to mitigate summer learning loss. Fifty-one students in Grades 3-8 from the Every Child Valued (ECV) and Lawrence Community Center (LCC) summer programs in Lawrenceville, NJ, took short, online assessments throughout the summer,…

Descriptors: Educational Testing, Summer Programs, Grade 3, Grade 4

Modeling Change in Large-Scale Longitudinal Studies of Educational Growth: Four Decades of Contributions to the Assessment of Educational Growth. Research Report. ETS RR-12-04. ETS R&D Scientific and Policy Contributions Series. ETS SPC-12-01

Peer reviewed

Rock, Donald A. – ETS Research Report Series, 2012

This paper provides a history of ETS's role in developing assessment instruments and psychometric procedures for measuring change in large-scale national assessments funded by the Longitudinal Studies branch of the National Center for Education Statistics. It documents the innovations developed during more than 30 years of working with…

Descriptors: Models, Educational Change, Longitudinal Studies, Educational Development

Monotone Properties of a General Diagnostic Model. Research Report. ETS RR-07-25

Peer reviewed

Xu, Xueli – ETS Research Report Series, 2007

Monotonicity properties of a general diagnostic model (GDM) are considered in this paper. Simple data summaries are identified to inform about the ordered categories of latent traits. The findings are very much in accordance with the statements made about the GPCM (Hemker, Sijtsma, Molenaar, & Junker, 1996, 1997). On the one hand, by fitting a…

Descriptors: Models, Statistical Analysis, Educational Testing, Item Response Theory

Subscores and Validity. Research Report. ETS RR-08-64

Peer reviewed

Haberman, Shelby J. – ETS Research Report Series, 2008

In educational testing, subscores may be provided based on a portion of the items from a larger test. One consideration in evaluation of such subscores is their ability to predict a criterion score. Two limitations on prediction exist. The first, which is well known, is that the coefficient of determination for linear prediction of the criterion…

Descriptors: Scores, Validity, Educational Testing, Correlation
