ERIC - Search Results

Showing 1 to 15 of 58 results

Estimating Reliability for Tests with One Constructed-Response Item in a Section. Research Report. ETS RR-24-07

Peer reviewed

Qu, Yanxuan; Sinharay, Sandip – ETS Research Report Series, 2024

The goal of this paper is to find better ways to estimate the internal consistency reliability of scores on tests with a specific type of design that are often encountered in practice: tests with constructed-response items clustered into sections that are not parallel or tau-equivalent, and one of the sections has only one item. To estimate the…

Descriptors: Test Reliability, Essay Tests, Construct Validity, Error of Measurement

A Hybrid Model for Orthogonal Regression. Research Report. ETS RR-23-04

Peer reviewed

Kane, Michael – ETS Research Report Series, 2023

Linear functional relationships are intended to be symmetric and therefore cannot generally be accurately estimated using ordinary least squares regression equations. Orthogonal regression (OR) models allow for errors in both "Y" and "X" and therefore can provide symmetric estimates of these relationships. The most…

Descriptors: Factor Analysis, Regression (Statistics), Mathematical Models, Relationship

Symmetric Least Squares Estimates of Functional Relationships. Research Report. ETS RR-21-21

Peer reviewed

Kane, Michael T. – ETS Research Report Series, 2021

Ordinary least squares (OLS) regression provides optimal linear predictions of a dependent variable, y, given an independent variable, x, but OLS regressions are not symmetric or reversible. In order to get optimal linear predictions of x given y, a separate OLS regression in that direction would be needed. This report provides a least squares…

Descriptors: Least Squares Statistics, Regression (Statistics), Prediction, Geometric Concepts

Model Adequacy Checking for Applying Harmonic Regression to Assessment Quality Control. Research Report. ETS RR-21-13

Peer reviewed

Qian, Jiahe; Li, Shuhong – ETS Research Report Series, 2021

In recent years, harmonic regression models have been applied to implement quality control for educational assessment data consisting of multiple administrations and displaying seasonality. As with other types of regression models, it is imperative that model adequacy checking and model fit be appropriately conducted. However, there has been no…

Descriptors: Models, Regression (Statistics), Language Tests, Quality Control

Measures of Agreement versus Measures of Prediction Accuracy. Research Report. ETS RR-19-20

Peer reviewed

Haberman, Shelby J. – ETS Research Report Series, 2019

Measures of agreement are compared to measures of prediction accuracy within a general context. Differences in appropriate use are emphasized, and approaches are examined for both numerical and nominal variables. General estimation methods are developed, and their large-sample properties are compared.

Descriptors: Measurement Techniques, Classification, Prediction, Accuracy

Orthogonal Regression, the Cleary Criterion, and Lord's Paradox: Asking the Right Questions. Research Report. ETS RR-20-14

Peer reviewed

Kane, Michael T.; Mroch, Andrew A. – ETS Research Report Series, 2020

Ordinary least squares (OLS) regression and orthogonal regression (OR) address different questions and make different assumptions about errors. The OLS regression of Y on X yields predictions of a dependent variable (Y) contingent on an independent variable (X) and minimizes the sum of squared errors of prediction. It assumes that the independent…

Descriptors: Regression (Statistics), Least Squares Statistics, Test Bias, Error of Measurement

Variance Estimation with Complex Data and Finite Population Correction--A Paradigm for Comparing Jackknife and Formula-Based Methods for Variance Estimation. Research Report. ETS RR-20-11

Peer reviewed

Qian, Jiahe – ETS Research Report Series, 2020

The finite population correction (FPC) factor is often used to adjust variance estimators for survey data sampled from a finite population without replacement. As a replicated resampling approach, the jackknife approach is usually implemented without the FPC factor incorporated in its variance estimates. A paradigm is proposed to compare the…

Descriptors: Computation, Sampling, Data, Statistical Analysis

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Error Variance in Common Population Linking Bridge Studies. Research Report. ETS RR-19-42

Peer reviewed

Jewsbury, Paul A. – ETS Research Report Series, 2019

When an assessment undergoes changes to the administration or instrument, bridge studies are typically used to try to ensure comparability of scores before and after the change. Among the most common and powerful is the common population linking design, with the use of a linear transformation to link scores to the metric of the original…

Descriptors: Evaluation Research, Scores, Error Patterns, Error of Measurement

Different Methods of Adjusting for Form Difficulty under the Rasch Model: Impact on Consistency of Assessment Results. Research Report. ETS RR-19-08

Peer reviewed

Manna, Venessa F.; Gu, Lixiong – ETS Research Report Series, 2019

When using the Rasch model, equating with a nonequivalent groups anchor test design is commonly achieved by adjustment of new form item difficulty using an additive equating constant. Using simulated 5-year data, this report compares 4 approaches to calculating the equating constants and the subsequent impact on equating results. The 4 approaches…

Descriptors: Item Response Theory, Test Items, Test Construction, Sample Size

Grouping Effects on Jackknifed Variance Estimation for Item Response Theory Scaling and Equating with Cluster-Based Assessment Data. Research Report. ETS RR-18-16

Peer reviewed

Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2018

Educational assessment data are often collected from a set of test centers across various geographic regions, and therefore the data samples contain clusters. Such cluster-based data may result in clustering effects in variance estimation. However, in many grouped jackknife variance estimation applications, jackknife groups are often formed by a…

Descriptors: Item Response Theory, Scaling, Equated Scores, Cluster Grouping

Measurement Error and Bias in Value-Added Models. Research Report. ETS RR-17-25

Peer reviewed

Kane, Michael T. – ETS Research Report Series, 2017

By aggregating residual gain scores (the differences between each student's current score and a predicted score based on prior performance) for a school or a teacher, value-added models (VAMs) can be used to generate estimates of school or teacher effects. It is known that random errors in the prior scores will introduce bias into predictions of…

Descriptors: Error of Measurement, Value Added Models, Scores, Teacher Effectiveness

A Modified "a"-Stratified Method for Computerized Adaptive Testing. Research Report. ETS RR-19-10

Peer reviewed

Gu, Lixiong; Ling, Guangming; Qu, Yanxuan – ETS Research Report Series, 2019

Research has found that the "a"-stratified item selection strategy (STR) for computerized adaptive tests (CATs) may lead to insufficient use of high "a" items at later stages of the tests and thus to reduced measurement precision. A refined approach, unequal item selection across strata (USTR), effectively improves test precision over the…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Use, Test Items

A Generalizability Theory Study to Examine Sources of Score Variance in Third-Party Evaluations Used in Decision-Making for Graduate School Admissions. ETS GRE® Board Research Report. ETS GRE®-18-03. ETS RR-18-37

Peer reviewed

McCaffrey, Daniel F.; Oliveri, Maria Elena; Holtzman, Steven – ETS Research Report Series, 2018

Scores from noncognitive measures are increasingly valued for their utility in helping to inform postsecondary admissions decisions. However, their use has presented challenges because of faking, response biases, or subjectivity, which standardized third-party evaluations (TPEs) can help minimize. Analysts and researchers using TPEs, however, need…

Descriptors: Generalizability Theory, Scores, College Admission, Admission Criteria

Benchmark Keystroke Biometrics Accuracy from High-Stakes Writing Tasks. Research Report. ETS RR-21-15

Peer reviewed

Choi, Ikkyu; Hao, Jiangang; Deane, Paul; Zhang, Mo – ETS Research Report Series, 2021

"Biometrics" are physical or behavioral human characteristics that can be used to identify a person. It is widely known that keystroke or typing dynamics for short, fixed texts (e.g., passwords) could serve as a behavioral biometric. In this study, we investigate whether keystroke data from essay responses can lead to a reliable…

Descriptors: Accuracy, High Stakes Tests, Writing Tests, Benchmarking
