Publication Date
In 2025 | 1 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 5 |
Since 2016 (last 10 years) | 15 |
Since 2006 (last 20 years) | 31 |
Source
Educational Measurement:… | 71 |
Author
Allalouf, Avi | 2 |
Burkhardt, Amy | 2 |
Cizek, Gregory J. | 2 |
Frisbie, David A. | 2 |
Plake, Barbara S. | 2 |
Solano-Flores, Guillermo | 2 |
Yen, Wendy M. | 2 |
Anderson, Dan | 1 |
Aray, Henry | 1 |
Attali, Yigal | 1 |
Baird, Jo-Anne | 1 |
Education Level
Elementary Secondary Education | 3 |
Elementary Education | 1 |
Grade 4 | 1 |
Grade 5 | 1 |
High Schools | 1 |
Higher Education | 1 |
Secondary Education | 1 |
Audience
Teachers | 3 |
Researchers | 2 |
Practitioners | 1 |
Laws, Policies, & Programs
Education Consolidation… | 1 |
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
ACT Assessment | 2 |
SAT (College Admission Test) | 2 |
Graduate Record Examinations | 1 |
National Assessment of… | 1 |
Preliminary Scholastic… | 1 |
Teacher Performance… | 1 |
Burkhardt, Amy; Lottridge, Susan; Woolf, Sherri – Educational Measurement: Issues and Practice, 2021
For some students, standardized tests serve as a conduit to disclose sensitive issues of harm or distress that may otherwise go unreported. By detecting this writing, known as "crisis papers," testing programs have a unique opportunity to assist in mitigating the risk of harm to these students. The use of machine learning to…
Descriptors: Scoring Rubrics, Identification, At Risk Students, Standardized Tests
Xu, Yangmeng; Wind, Stefanie A. – Educational Measurement: Issues and Practice, 2025
Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…
Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods
Zesch, Torsten; Horbach, Andrea; Zehner, Fabian – Educational Measurement: Issues and Practice, 2023
In this article, we systematize the factors influencing performance and feasibility of automatic content scoring methods for short text responses. We argue that performance (i.e., how well an automatic system agrees with human judgments) mainly depends on the linguistic variance seen in the responses and that this variance is indirectly influenced…
Descriptors: Influences, Academic Achievement, Feasibility Studies, Automation
Firoozi, Tahereh; Mohammadi, Hamid; Gierl, Mark J. – Educational Measurement: Issues and Practice, 2023
Research on Automated Essay Scoring has become increasingly important because it serves as a method for evaluating students' written responses at scale. Scalable methods for scoring written responses are needed as students migrate to online learning environments, resulting in the need to evaluate large numbers of written-response assessments. The…
Descriptors: Active Learning, Automation, Scoring, Essays
Xiong, Jiawei; Li, Feiming – Educational Measurement: Issues and Practice, 2023
Multidimensional scoring evaluates each constructed-response answer from more than one rating dimension and/or trait such as lexicon, organization, and supporting ideas instead of only one holistic score, to help students distinguish between various dimensions of writing quality. In this work, we present a bilevel learning model for combining two…
Descriptors: Scoring, Models, Task Analysis, Learning Processes
Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2020
Educational tests are standardized so that all examinees are tested on the same material, under the same testing conditions, and with the same scoring protocols. This uniformity is designed to provide a level "playing field" for all examinees so that the test is "the same" for everyone. Thus, standardization is designed to…
Descriptors: Standards, Educational Assessment, Culture Fair Tests, Scoring
Lottridge, Sue; Burkhardt, Amy; Boyer, Michelle – Educational Measurement: Issues and Practice, 2020
In this digital ITEMS module, Dr. Sue Lottridge, Amy Burkhardt, and Dr. Michelle Boyer provide an overview of automated scoring. Automated scoring is the use of computer algorithms to score unconstrained open-ended test items by mimicking human scoring. The use of automated scoring is increasing in educational assessment programs because it allows…
Descriptors: Computer Assisted Testing, Scoring, Automation, Educational Assessment
Attali, Yigal – Educational Measurement: Issues and Practice, 2019
Rater training is an important part of developing and conducting large-scale constructed-response assessments. As part of this process, candidate raters have to pass a certification test to confirm that they are able to score consistently and accurately before they begin scoring operationally. Moreover, many assessment programs require raters to…
Descriptors: Evaluators, Certification, High Stakes Tests, Scoring
DiCerbo, Kristen – Educational Measurement: Issues and Practice, 2020
We have the ability to capture data from students' interactions with digital environments as they engage in learning activity. This provides the potential for a reimagining of assessment to one in which assessment becomes part of our natural education activity and can be used to support learning. These new data allow us to more closely examine the…
Descriptors: Student Diversity, Information Technology, Learning Activities, Learning Processes
Aray, Henry; Pedauga, Luis – Educational Measurement: Issues and Practice, 2019
This article presents a novel experimental methodology in which groups of students were offered the option to choose between two equivalent scoring rules to assess a multiple-choice test. The effect of choosing the scoring rule on marks is tested. Two major contributions arise from this research. First, it contributes to the literature on the…
Descriptors: Multiple Choice Tests, Scoring, Student Attitudes, Decision Making
Allalouf, Avi; Gutentag, Tony; Baumer, Michal – Educational Measurement: Issues and Practice, 2017
Quality control (QC) in testing is paramount. QC procedures for tests can be divided into two types. The first type, one that has been well researched, is QC for tests administered to large population groups on few administration dates using a small set of test forms (e.g., large-scale assessment). The second type is QC for tests, usually…
Descriptors: Quality Control, Scoring, Computer Assisted Testing, Error Patterns
Liu, Ou Lydia; Brew, Chris; Blackmore, John; Gerard, Libby; Madhok, Jacquie; Linn, Marcia C. – Educational Measurement: Issues and Practice, 2014
Content-based automated scoring has been applied in a variety of science domains. However, many prior applications involved simplified scoring rubrics without considering rubrics representing multiple levels of understanding. This study tested a concept-based scoring tool for content-based scoring, c-rater™, for four science items with rubrics…
Descriptors: Science Tests, Test Items, Scoring, Automation
Mattern, Krista; Radunzel, Justine; Bertling, Maria; Ho, Andrew D. – Educational Measurement: Issues and Practice, 2018
The percentage of students retaking college admissions tests is rising. Researchers and college admissions offices currently use a variety of methods for summarizing these multiple scores. Testing organizations such as ACT and the College Board, interested in validity evidence like correlations with first-year grade point average (FYGPA), often…
Descriptors: College Admission, Scores, Correlation, College Entrance Examinations
Wise, Steven L. – Educational Measurement: Issues and Practice, 2017
The rise of computer-based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple-choice items. In particular, very short response…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Items, Reaction Time
Davis, Laurie; Morrison, Kristin; Kong, Xiaojing; McBride, Yuanyuan – Educational Measurement: Issues and Practice, 2017
The use of tablets for large-scale testing programs has transitioned from concept to reality for many state testing programs. This study extended previous research on score comparability between tablets and computers with high school students to compare score distributions across devices for reading, math, and science and to evaluate device…
Descriptors: Computer Assisted Testing, Handheld Devices, Telecommunications, Scoring