Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 4 |
Since 2006 (last 20 years) | 14 |
Descriptor
Prediction | 14 |
Scores | 9 |
Comparative Analysis | 7 |
Essays | 4 |
Regression (Statistics) | 4 |
Scoring | 4 |
Accuracy | 3 |
Bayesian Statistics | 3 |
Correlation | 3 |
Goodness of Fit | 3 |
Licensing Examinations… | 3 |
More ▼ |
Source
Author
Sinharay, Sandip | 14 |
Haberman, Shelby | 3 |
Attali, Yigal | 2 |
Haberman, Shelby J. | 2 |
Han, Ning | 2 |
Holland, Paul W. | 2 |
Larkin, Kevin | 2 |
Puhan, Gautam | 2 |
Zhang, Mo | 2 |
von Davier, Alina A. | 2 |
Boughton, Keith | 1 |
More ▼ |
Publication Type
Journal Articles | 12 |
Reports - Research | 9 |
Reports - Evaluative | 3 |
Reports - Descriptive | 2 |
Opinion Papers | 1 |
Tests/Questionnaires | 1 |
Education Level
High Schools | 2 |
Adult Education | 1 |
Elementary Education | 1 |
Higher Education | 1 |
Postsecondary Education | 1 |
Secondary Education | 1 |
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
Graduate Record Examinations | 1 |
Test of English as a Foreign… | 1 |
What Works Clearinghouse Rating
Sinharay, Sandip; Zhang, Mo; Deane, Paul – Applied Measurement in Education, 2019
Analysis of keystroke logging data is of increasing interest, as evident from a substantial amount of recent research on the topic. Some of the research on keystroke logging data has focused on the prediction of essay scores from keystroke logging features, but linear regression is the only prediction method that has been used in this research.…
Descriptors: Scores, Prediction, Writing Processes, Data Analysis
Zhang, Mo; Sinharay, Sandip – International Journal of Testing, 2022
This article demonstrates how recent advances in technology allow fine-grained analyses of candidate-produced essays, thus providing a deeper insight on writing performance. We examined how essay features, automatically extracted using natural language processing and keystroke logging techniques, can predict various performance measures using data…
Descriptors: At Risk Persons, Writing Achievement, Educational Technology, Writing Improvement
Sinharay, Sandip – Measurement: Interdisciplinary Research and Perspectives, 2018
Producers and consumers of test scores are increasingly concerned about fraudulent behavior before and during the test. There exist several statistical or psychometric methods for detecting fraudulent behavior on tests. This paper provides a review of the Bayesian approaches among them. Four hitherto-unpublished real data examples are provided to…
Descriptors: Ethics, Cheating, Student Behavior, Bayesian Statistics
Sinharay, Sandip – Grantee Submission, 2018
Producers and consumers of test scores are increasingly concerned about fraudulent behavior before and during the test. There exist several statistical or psychometric methods for detecting fraudulent behavior on tests. This paper provides a review of the Bayesian approaches among them. Four hitherto-unpublished real data examples are provided to…
Descriptors: Ethics, Cheating, Student Behavior, Bayesian Statistics
Sinharay, Sandip; Haberman, Shelby; Boughton, Keith – Educational Measurement: Issues and Practice, 2015
Feinberg and Wainer (2014) provided a simple equation to approximate/predict a subscore's value. The purpose of this note is to point out that their equation is often inaccurate in that it does not always predict a subscore's value correctly. Therefore, the utility of their simple equation is not clear.
Descriptors: Equations (Mathematics), Scores, Prediction, Accuracy
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2015
Person-fit assessment may help the researcher to obtain additional information regarding the answering behavior of persons. Although several researchers examined person fit, there is a lack of research on person-fit assessment for mixed-format tests. In this article, the lz statistic and the ?2 statistic, both of which have been used for tests…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Bayesian Statistics
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score.The four trait scores are word choice, grammatical…
Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of the argument and issue tasks that form the Analytical Writing measure of the "GRE"® General Test. For each of these tasks, this study explored the value added of reporting 4 trait scores for each of these 2 tasks over the total e-rater score.…
Descriptors: Scores, Computer Assisted Testing, Computer Software, Grammar
Haberman, Shelby J.; Sinharay, Sandip – Educational Testing Service, 2011
Subscores are reported for several operational assessments. Haberman (2008) suggested a method based on classical test theory to determine if the true subscore is predicted better by the corresponding subscore or the total score. Researchers are often interested in learning how different subgroups perform on subtests. Stricker (1993) and…
Descriptors: True Scores, Test Theory, Prediction, Group Membership
Haberman, Shelby J.; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2010
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
Descriptors: Scoring, Regression (Statistics), Essays, Computer Software
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Will subscores provide additional information than what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Holland, Paul W.; Sinharay, Sandip; von Davier, Alina A.; Han, Ning – Journal of Educational Measurement, 2008
Two important types of observed score equating (OSE) methods for the non-equivalent groups with Anchor Test (NEAT) design are chain equating (CE) and post-stratification equating (PSE). CE and PSE reflect two distinctly different ways of using the information provided by the anchor test for computing OSE functions. Both types of methods include…
Descriptors: Equated Scores, Prediction, Comparative Analysis
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – ETS Research Report Series, 2008
Will reporting subscores provide any additional information than the total score? Is there a method that can be used to provide more trustworthy subscores than observed subscores? These 2 questions are addressed in this study. To answer the 2nd question, 2 subscore estimation methods (i.e., subscore estimated from the observed total score or…
Descriptors: Comparative Analysis, Scores, Tests, Certification
Holland, Paul W.; von Davier, Alina A.; Sinharay, Sandip; Han, Ning – ETS Research Report Series, 2006
This paper focuses on the Non-Equivalent Groups with Anchor Test (NEAT) design for test equating and on two classes of observed--score equating (OSE) methods--chain equating (CE) and poststratification equating (PSE). These two classes of methods reflect two distinctly different ways of using the information provided by the anchor test for…
Descriptors: Equated Scores, Test Items, Statistical Analysis, Comparative Analysis