ERIC - Search Results

Showing all 14 results

Investigating Technology-Enhanced Item Formats Using Cognitive and Item Response Theory Approaches

Peer reviewed

Moon, Jung Aa; Sinharay, Sandip; Keehner, Madeleine; Katz, Irvin R. – International Journal of Testing, 2020

The current study examined the relationship between test-taker cognition and psychometric item properties in multiple-selection multiple-choice and grid items. In a study with content-equivalent mathematics items in alternative item formats, adult participants' tendency to respond to an item was affected by the presence of a grid and variations of…

Descriptors: Computer Assisted Testing, Multiple Choice Tests, Test Wiseness, Psychometrics

Equating of Augmented Subscores

Peer reviewed

Sinharay, Sandip; Haberman, Shelby J. – Journal of Educational Measurement, 2011

Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman (2008b) suggested reporting an augmented subscore that is a linear combination of a subscore and the total score. Sinharay and Haberman (2008) and Sinharay (2010) showed that augmented subscores often lead to more accurate…

Descriptors: Diagnostic Tests, Psychometrics, Testing, Equated Scores

An NCME Instructional Module on Subscores

Peer reviewed

Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Educational Measurement: Issues and Practice, 2011

The purpose of this ITEMS module is to provide an introduction to subscores. First, examples of subscores from an operational test are provided. Then, a review of methods that can be used to examine if subscores have adequate psychometric quality is provided. It is demonstrated, using results from operational and simulated data, that subscores…

Descriptors: Scores, Psychometrics, Tests, Data

Automated Trait Scores for "TOEFL"® Writing Tasks. Research Report. ETS RR-15-14

Peer reviewed

Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015

The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score. The four trait scores are word choice, grammatical…

Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)

Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

Peer reviewed

Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Multivariate Behavioral Research, 2010

Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…

Descriptors: Educational Testing, Scores, Reports, Psychometrics

Reporting of Subscores Using Multidimensional Item Response Theory

Peer reviewed

Haberman, Shelby J.; Sinharay, Sandip – Psychometrika, 2010

Recently, there has been increasing interest in reporting subscores. This paper examines reporting of subscores using multidimensional item response theory (MIRT) models (e.g., Reckase in "Appl. Psychol. Meas." 21:25-36, 1997; C.R. Rao and S. Sinharay (Eds), "Handbook of Statistics, vol. 26," pp. 607-642, North-Holland, Amsterdam, 2007; Beguin &…

Descriptors: Item Response Theory, Psychometrics, Statistical Analysis, Scores

First Language of Test Takers and Fairness Assessment Procedures

Peer reviewed

Sinharay, Sandip; Dorans, Neil J.; Liang, Longjuan – Educational Measurement: Issues and Practice, 2011

Over the past few decades, those who take tests in the United States have exhibited increasing diversity with respect to native language. Standard psychometric procedures for ensuring item and test fairness that have existed for some time were developed when test-taking groups were predominantly native English speakers. A better understanding of…

Descriptors: Test Bias, Testing Programs, Psychometrics, Language Proficiency

Investigating the Value of Section Scores for the "TOEFL iBT"® Test. "TOEFL iBT"® Research Report. TOEFL iBT-21. ETS Research Report RR-13-35

Peer reviewed

Sawaki, Yasuyo; Sinharay, Sandip – ETS Research Report Series, 2013

This study investigates the value of reporting the reading, listening, speaking, and writing section scores for the "TOEFL iBT"® test, focusing on 4 related aspects of the psychometric quality of the TOEFL iBT section scores: reliability of the section scores, dimensionality of the test, presence of distinct score profiles, and the…

Descriptors: Scores, Computer Assisted Testing, Factor Analysis, Correlation

Limits on Log Odds Ratios for Unidimensional Item Response Theory Models

Peer reviewed

Haberman, Shelby J.; Holland, Paul W.; Sinharay, Sandip – Psychometrika, 2007

Bounds are established for log odds ratios (log cross-product ratios) involving pairs of items for item response models. First, expressions for bounds on log odds ratios are provided for one-dimensional item response models in general. Then, explicit bounds are obtained for the Rasch model and the two-parameter logistic (2PL) model. Results are…

Descriptors: Goodness of Fit, Item Response Theory, Research Methodology, Measurement Techniques

How Much Can We Reliably Know about What Examinees Know?

Peer reviewed

Sinharay, Sandip; Haberman, Shelby J. – Measurement: Interdisciplinary Research and Perspectives, 2009

In this commentary, the authors discuss some of the issues regarding the use of diagnostic classification models that practitioners should keep in mind. In the authors' experience, these issues are not as well known as they should be. The authors then provide recommendations on diagnostic scoring.

Descriptors: Scoring, Reliability, Validity, Classification

Assessing Fit of Cognitive Diagnostic Models: A Case Study

Peer reviewed

Sinharay, Sandip; Almond, Russell G. – Educational and Psychological Measurement, 2007

A cognitive diagnostic model uses information from educational experts to describe the relationships between item performances and posited proficiencies. When the cognitive relationships can be described using a fully Bayesian model, Bayesian model checking procedures become available. Checking models tied to cognitive theory of the domains…

Descriptors: Epistemology, Clinical Diagnosis, Job Training, Item Response Theory

Experiences with Markov Chain Monte Carlo Convergence Assessment in Two Psychometric Examples

Peer reviewed

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2004

There is an increasing use of Markov chain Monte Carlo (MCMC) algorithms for fitting statistical models in psychometrics, especially in situations where the traditional estimation techniques are very difficult to apply. One of the disadvantages of using an MCMC algorithm is that it is not straightforward to determine the convergence of the…

Descriptors: Psychometrics, Mathematics, Inferences, Markov Processes

Analysis of Data from an Admissions Test with Item Models. Research Report. ETS RR-05-06

Peer reviewed

Sinharay, Sandip; Johnson, Matthew – ETS Research Report Series, 2005

"Item models" (LaDuca, Staples, Templeton, & Holzman, 1986) are classes from which it is possible to generate/produce items that are equivalent/isomorphic to other items from the same model (e.g., Bejar, 1996; Bejar, 2002). They have the potential to produce a large number of high-quality items at reduced cost. This paper introduces…

Descriptors: Item Analysis, Test Items, Scoring, Psychometrics

Model Diagnostics for Bayesian Networks. Research Report. ETS RR-04-17

Peer reviewed

Sinharay, Sandip – ETS Research Report Series, 2004

Assessing fit of psychometric models has always been an issue of enormous interest, but there exists no unanimously agreed upon item fit diagnostic for the models. Bayesian networks, frequently used in educational assessments (see, for example, Mislevy, Almond, Yan, & Steinberg, 2001) primarily for learning about students' knowledge and…

Descriptors: Bayesian Statistics, Networks, Models, Goodness of Fit