Lu, Ying; Yen, Wendy M. – ETS Research Report Series, 2014
This article explores the use of longitudinal regression as a tool for identifying scoring inaccuracies. Student progression patterns, as evaluated through longitudinal regressions, typically are more stable from year to year than are scale score distributions and statistics, which require representative samples to conduct credibility checks.…
Descriptors: Quality Control, Regression (Statistics), Scoring, Accuracy
Chen, Haiwen H.; von Davier, Matthias; Yamamoto, Kentaro; Kong, Nan – ETS Research Report Series, 2015
One major issue with large-scale assessments is that the respondents might give no responses to many items, resulting in less accurate estimations of both assessed abilities and item parameters. This report studies how the types of items affect the item-level nonresponse rates and how different methods of treating item-level nonresponses have an…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
Livingston, Samuel A.; Kim, Sooyeon – ETS Research Report Series, 2010
A series of resampling studies investigated the accuracy of equating by four different methods in a random groups equating design with samples of 400, 200, 100, and 50 test takers taking each form. Six pairs of forms were constructed. Each pair was constructed by assigning items from an existing test taken by 9,000 or more test takers. The…
Descriptors: Equated Scores, Accuracy, Sample Size, Sampling
DeCarlo, Lawrence T. – ETS Research Report Series, 2008
Rater behavior in essay grading can be viewed as a signal-detection task, in that raters attempt to discriminate between latent classes of essays, with the latent classes being defined by a scoring rubric. The present report examines basic aspects of an approach to constructed-response (CR) scoring via a latent-class signal-detection model. The…
Descriptors: Scoring, Responses, Test Format, Bias