NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 14 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Moon, Jung Aa; Sinharay, Sandip; Keehner, Madeleine; Katz, Irvin R. – International Journal of Testing, 2020
The current study examined the relationship between test-taker cognition and psychometric item properties in multiple-selection multiple-choice and grid items. In a study with content-equivalent mathematics items in alternative item formats, adult participants' tendency to respond to an item was affected by the presence of a grid and variations of…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Test Wiseness, Psychometrics
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip; Haberman, Shelby J. – Journal of Educational Measurement, 2011
Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman (2008b) suggested reporting an augmented subscore that is a linear combination of a subscore and the total score. Sinharay and Haberman (2008) and Sinharay (2010) showed that augmented subscores often lead to more accurate…
Descriptors: Diagnostic Tests, Psychometrics, Testing, Equated Scores
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Educational Measurement: Issues and Practice, 2011
The purpose of this ITEMS module is to provide an introduction to subscores. First, examples of subscores from an operational test are provided. Then, a review of methods that can be used to examine if subscores have adequate psychometric quality is provided. It is demonstrated, using results from operational and simulated data, that subscores…
Descriptors: Scores, Psychometrics, Tests, Data
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score.The four trait scores are word choice, grammatical…
Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Multivariate Behavioral Research, 2010
Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…
Descriptors: Educational Testing, Scores, Reports, Psychometrics
Peer reviewed Peer reviewed
Direct linkDirect link
Haberman, Shelby J.; Sinharay, Sandip – Psychometrika, 2010
Recently, there has been increasing interest in reporting subscores. This paper examines reporting of subscores using multidimensional item response theory (MIRT) models (e.g., Reckase in "Appl. Psychol. Meas." 21:25-36, 1997; C.R. Rao and S. Sinharay (Eds), "Handbook of Statistics, vol. 26," pp. 607-642, North-Holland, Amsterdam, 2007; Beguin &…
Descriptors: Item Response Theory, Psychometrics, Statistical Analysis, Scores
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip; Dorans, Neil J.; Liang, Longjuan – Educational Measurement: Issues and Practice, 2011
Over the past few decades, those who take tests in the United States have exhibited increasing diversity with respect to native language. Standard psychometric procedures for ensuring item and test fairness that have existed for some time were developed when test-taking groups were predominantly native English speakers. A better understanding of…
Descriptors: Test Bias, Testing Programs, Psychometrics, Language Proficiency
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Sawaki, Yasuyo; Sinharay, Sandip – ETS Research Report Series, 2013
This study investigates the value of reporting the reading, listening, speaking, and writing section scores for the "TOEFL iBT"® test, focusing on 4 related aspects of the psychometric quality of the TOEFL iBT section scores: reliability of the section scores, dimensionality of the test, presence of distinct score profiles, and the…
Descriptors: Scores, Computer Assisted Testing, Factor Analysis, Correlation
Peer reviewed Peer reviewed
Direct linkDirect link
Haberman, Shelby J.; Holland, Paul W.; Sinharay, Sandip – Psychometrika, 2007
Bounds are established for log odds ratios (log cross-product ratios) involving pairs of items for item response models. First, expressions for bounds on log odds ratios are provided for one-dimensional item response models in general. Then, explicit bounds are obtained for the Rasch model and the two-parameter logistic (2PL) model. Results are…
Descriptors: Goodness of Fit, Item Response Theory, Research Methodology, Measurement Techniques
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip; Haberman, Shelby J. – Measurement: Interdisciplinary Research and Perspectives, 2009
In this commentary, the authors discuss some of the issues regarding the use of diagnostic classification models that practitioners should keep in mind. In the authors experience, these issues are not as well known as they should be. The authors then provide recommendations on diagnostic scoring.
Descriptors: Scoring, Reliability, Validity, Classification
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip; Almond, Russell G. – Educational and Psychological Measurement, 2007
A cognitive diagnostic model uses information from educational experts to describe the relationships between item performances and posited proficiencies. When the cognitive relationships can be described using a fully Bayesian model, Bayesian model checking procedures become available. Checking models tied to cognitive theory of the domains…
Descriptors: Epistemology, Clinical Diagnosis, Job Training, Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2004
There is an increasing use of Markov chain Monte Carlo (MCMC) algorithms for fitting statistical models in psychometrics, especially in situations where the traditional estimation techniques are very difficult to apply. One of the disadvantages of using an MCMC algorithm is that it is not straightforward to determine the convergence of the…
Descriptors: Psychometrics, Mathematics, Inferences, Markov Processes
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Sinharay, Sandip; Johnson, Matthew – ETS Research Report Series, 2005
"Item models" (LaDuca, Staples, Templeton, & Holzman, 1986) are classes from which it is possible to generate/produce items that are equivalent/isomorphic to other items from the same model (e.g., Bejar, 1996; Bejar, 2002). They have the potential to produce large number of high-quality items at reduced cost. This paper introduces…
Descriptors: Item Analysis, Test Items, Scoring, Psychometrics
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Sinharay, Sandip – ETS Research Report Series, 2004
Assessing fit of psychometric models has always been an issue of enormous interest, but there exists no unanimously agreed upon item fit diagnostic for the models. Bayesian networks, frequently used in educational assessments (see, for example, Mislevy, Almond, Yan, & Steinberg, 2001) primarily for learning about students' knowledge and…
Descriptors: Bayesian Statistics, Networks, Models, Goodness of Fit