Publication Date
  In 2025: 0
  Since 2024: 0
  Since 2021 (last 5 years): 1
  Since 2016 (last 10 years): 2
  Since 2006 (last 20 years): 11
Descriptor
  Regression (Statistics): 12
  Scores: 7
  Correlation: 4
  Item Response Theory: 4
  Computation: 3
  Prediction: 3
  Statistical Analysis: 3
  Computer Software: 2
  Educational Testing: 2
  Equations (Mathematics): 2
  Error of Measurement: 2
Source
  ETS Research Report Series: 6
  Journal of Educational and…: 3
  Educational Testing Service: 1
  Journal of Educational…: 1
  Large-scale Assessments in…: 1
Author
  Haberman, Shelby J.: 12
  Sinharay, Sandip: 4
  Lee, Yi-Hsuan: 2
  Johnson, Matthew S.: 1
  Qian, Jiahe: 1
  van Rijn, Peter W.: 1
Publication Type
  Journal Articles: 11
  Reports - Research: 9
  Reports - Evaluative: 2
  Reports - Descriptive: 1
Education Level
  Elementary Secondary Education: 1
  High Schools: 1
  Higher Education: 1
  Postsecondary Education: 1
  Secondary Education: 1
Assessments and Surveys
  National Assessment of…: 1
  SAT (College Admission Test): 1
  Trends in International…: 1
Lee, Yi-Hsuan; Haberman, Shelby J. – Journal of Educational Measurement, 2021
For assessments that use different forms in different administrations, equating methods are applied to ensure comparability of scores over time. Ideally, a score scale is well maintained throughout the life of a testing program. In reality, instability of a score scale can result from a variety of causes, some of which are expected while others may be…
Descriptors: Scores, Regression (Statistics), Demography, Data
van Rijn, Peter W.; Sinharay, Sandip; Haberman, Shelby J.; Johnson, Matthew S. – Large-scale Assessments in Education, 2016
Latent regression models are used for score-reporting purposes in large-scale educational survey assessments such as the National Assessment of Educational Progress (NAEP) and Trends in International Mathematics and Science Study (TIMSS). One component of these models is based on item response theory. While there exists some research on assessment…
Descriptors: Goodness of Fit, Item Response Theory, Regression (Statistics), National Competency Tests
Haberman, Shelby J. – ETS Research Report Series, 2013
A general program for item-response analysis is described that uses the stabilized Newton-Raphson algorithm. This program is written to be compliant with Fortran 2003 standards and is sufficiently general to handle independent variables, multidimensional ability parameters, and matrix sampling. The ability variables may be either polytomous or…
Descriptors: Predictor Variables, Mathematics, Item Response Theory, Probability
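The stabilized Newton-Raphson idea behind the program can be sketched for the simplest case: maximum-likelihood estimation of a single 2PL ability. This is a hypothetical Python illustration, not the report's Fortran implementation; the item parameters and the step-halving stabilization rule are assumptions made for the example.

```python
import math

def estimate_ability(responses, a, b, theta=0.0, tol=1e-8, max_iter=50):
    """Newton-Raphson MLE of a 2PL ability parameter, with step halving as
    a simple form of stabilization. `a` are item slopes, `b` difficulties,
    `responses` are 0/1 scored answers."""
    def loglik(t):
        ll = 0.0
        for u, ai, bi in zip(responses, a, b):
            p = 1.0 / (1.0 + math.exp(-ai * (t - bi)))
            ll += u * math.log(p) + (1 - u) * math.log(1.0 - p)
        return ll

    for _ in range(max_iter):
        grad, hess = 0.0, 0.0
        for u, ai, bi in zip(responses, a, b):
            p = 1.0 / (1.0 + math.exp(-ai * (theta - bi)))
            grad += ai * (u - p)          # score function
            hess -= ai * ai * p * (1.0 - p)  # observed information (negated)
        step = -grad / hess
        # Stabilization: halve the step until the log-likelihood improves.
        while loglik(theta + step) < loglik(theta) and abs(step) > tol:
            step *= 0.5
        theta += step
        if abs(step) < tol:
            break
    return theta

# Hypothetical 4-item response pattern and item parameters:
theta_hat = estimate_ability([1, 1, 0, 1], a=[1.0, 1.2, 0.8, 1.5],
                             b=[-0.5, 0.0, 0.5, 1.0])
```

For a mixed response pattern like this one the likelihood is concave in ability, so the Newton step alone usually suffices; the halving rule matters when a full step would overshoot.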
Haberman, Shelby J.; Sinharay, Sandip; Lee, Yi-Hsuan – Educational Testing Service, 2011
Providing information to test takers and test score users about the abilities of test takers at different score levels has been a persistent problem in educational and psychological measurement (Carroll, 1993). Scale anchoring (Beaton & Allen, 1992), a technique that describes what students at different points on a score scale know and can do,…
Descriptors: Statistical Analysis, Scores, Regression (Statistics), Item Response Theory
Haberman, Shelby J.; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2010
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applies a cumulative logit model in place of the linear regression model for automated essay scoring. The performances of the linear regression model and the cumulative logit model were compared on a…
Descriptors: Scoring, Regression (Statistics), Essays, Computer Software
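The cumulative logit model mentioned above can be sketched in a few lines of Python. The feature value, slope, and thresholds below are hypothetical, chosen only to show how discrete score probabilities arise as differences of adjacent cumulative probabilities — and, unlike a linear regression prediction, always stay inside the score range.

```python
import math

def cumulative_logit_probs(x, beta, thresholds):
    """Category probabilities under a cumulative logit (proportional odds)
    model: P(score <= k | x) = sigmoid(theta_k - beta * x), so each score's
    probability is a difference of adjacent cumulative probabilities."""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))
    cum = [sigmoid(t - beta * x) for t in thresholds] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# One essay feature value with hypothetical slope and score thresholds:
probs = cumulative_logit_probs(x=1.2, beta=0.8, thresholds=[-1.0, 0.5, 2.0])
print([round(p, 3) for p in probs])   # one probability per score category
```

A fitted version would estimate `beta` and the thresholds by maximum likelihood; the point here is only the link between the linear predictor and an ordinal score distribution.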
Haberman, Shelby J. – ETS Research Report Series, 2009
A regression procedure is developed to link simultaneously a very large number of item response theory (IRT) parameter estimates obtained from a large number of test forms, where each form has been separately calibrated and where forms can be linked on a pairwise basis by means of common items. An application is made to forms in which a…
Descriptors: Regression (Statistics), Item Response Theory, Models, Equated Scores
Haberman, Shelby J.; Sinharay, Sandip – ETS Research Report Series, 2008
Sample-size requirements were considered for automated essay scoring in cases in which the automated essay score estimates the score provided by a human rater. Analysis considered both cases in which an essay prompt is examined in isolation and those in which a family of essay prompts is studied. In typical cases in which content analysis is not…
Descriptors: Sample Size, Scoring, Essays, Automation
Haberman, Shelby J. – ETS Research Report Series, 2008
Outliers in assessments are often treated as a nuisance for data analysis; however, they can also assist in quality assurance. Their frequency can suggest problems with form codes, scanning accuracy, ability of examinees to enter responses as they intend, or exposure of items.
Descriptors: Educational Assessment, Quality Assurance, Scores, Regression (Statistics)
Haberman, Shelby J. – Journal of Educational and Behavioral Statistics, 2008
In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…
Descriptors: Testing Programs, Regression (Statistics), Scores, Student Evaluation
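The mean-squared-error guideline can be illustrated with a small sketch. The comparison of proportional reduction in mean squared error (PRMSE) shown here is one common reading of Haberman's criterion — report a subscore only if it predicts its own true score better than the total score does — and the reliability and correlation values are invented for the example.

```python
def subscore_worth_reporting(sub_reliability, total_true_corr):
    """PRMSE comparison: the subscore's PRMSE for its true score equals the
    subscore reliability; the total score's PRMSE equals the squared
    correlation between the total score and the true subscore."""
    prmse_subscore = sub_reliability
    prmse_total = total_true_corr ** 2
    return prmse_subscore > prmse_total

# A reliable subscore that is distinct from the total is worth reporting...
print(subscore_worth_reporting(sub_reliability=0.85, total_true_corr=0.80))
# ...while an unreliable one the total already captures is not.
print(subscore_worth_reporting(sub_reliability=0.55, total_true_corr=0.80))
```

The article's alternatives (subscore alone, total score alone, and a combined estimate) correspond to comparing such PRMSE values across predictors.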
Haberman, Shelby J.; Qian, Jiahe – Journal of Educational and Behavioral Statistics, 2007
Statistical prediction problems often involve both a direct estimate of a true score and covariates of this true score. Given the criterion of mean squared error, this study determines the best linear predictor of the true score given the direct estimate and the covariates. Results yield an extension of Kelley's formula for estimation of the true…
Descriptors: Prediction, Regression (Statistics), True Scores, Correlation
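Kelley's formula itself is compact enough to state in code. The covariate extension below is only a schematic reading of the abstract: in the full result the weights depend on conditional variances, which this sketch collapses into a single reliability-like coefficient. All numeric values are hypothetical.

```python
def kelley_true_score(observed, reliability, group_mean):
    """Kelley's classical regressed estimate: shrink the observed score
    toward the group mean in proportion to the score's unreliability."""
    return reliability * observed + (1.0 - reliability) * group_mean

def kelley_with_covariates(observed, weight, covariate_prediction):
    """Schematic covariate extension: shrink toward a regression prediction
    from covariates instead of the overall group mean. The single `weight`
    stands in for the variance-based coefficients of the actual result."""
    return weight * observed + (1.0 - weight) * covariate_prediction

# With reliability 0.8, an observed 70 in a group with mean 60 is pulled to 68.
print(kelley_true_score(70.0, 0.8, 60.0))
```

Replacing the grand mean with a covariate-based prediction is what lets the best linear predictor use demographic or background information in addition to the direct score estimate.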
Haberman, Shelby J. – ETS Research Report Series, 2008
In educational testing, subscores may be provided based on a portion of the items from a larger test. One consideration in evaluation of such subscores is their ability to predict a criterion score. Two limitations on prediction exist. The first, which is well known, is that the coefficient of determination for linear prediction of the criterion…
Descriptors: Scores, Validity, Educational Testing, Correlation
Haberman, Shelby J. – ETS Research Report Series, 2005
In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean-squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…
Descriptors: Scores, Test Items, Error of Measurement, Computation