Burkhardt, Amy; Lottridge, Susan; Woolf, Sherri – Educational Measurement: Issues and Practice, 2021
For some students, standardized tests serve as a conduit to disclose sensitive issues of harm or distress that may otherwise go unreported. By detecting this writing, known as "crisis papers," testing programs have a unique opportunity to assist in mitigating the risk of harm to these students. The use of machine learning to…
Descriptors: Scoring Rubrics, Identification, At Risk Students, Standardized Tests
Baldwin, Peter – Educational Measurement: Issues and Practice, 2021
In the Bookmark standard-setting procedure, panelists are instructed to consider what examinees know rather than what they might attain by guessing; however, because examinees sometimes do guess, the procedure includes a correction for guessing. Like other corrections for guessing, the Bookmark's correction assumes that examinees either know the…
Descriptors: Guessing (Tests), Student Evaluation, Evaluation Methods, Standard Setting (Scoring)
Leventhal, Brian C.; Grabovsky, Irina – Educational Measurement: Issues and Practice, 2020
Standard setting is arguably one of the most subjective techniques in test development and psychometrics. The decisions when scores are compared to standards, however, are arguably the most consequential outcomes of testing. Providing licensure to practice in a profession has high-stakes consequences for the public. Denying graduation or forcing…
Descriptors: Standard Setting (Scoring), Weighted Scores, Test Construction, Psychometrics
DiCerbo, Kristen – Educational Measurement: Issues and Practice, 2020
We have the ability to capture data from students' interactions with digital environments as they engage in learning activity. This provides the potential for a reimagining of assessment to one in which assessment becomes part of our natural education activity and can be used to support learning. These new data allow us to more closely examine the…
Descriptors: Student Diversity, Information Technology, Learning Activities, Learning Processes
Wise, Steven L. – Educational Measurement: Issues and Practice, 2017
The rise of computer-based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple-choice items. In particular, very short response…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Items, Reaction Time
Higgins, Derrick; Heilman, Michael – Educational Measurement: Issues and Practice, 2014
As methods for automated scoring of constructed-response items become more widely adopted in state assessments, and are used in more consequential operational configurations, it is critical that their susceptibility to gaming behavior be investigated and managed. This article provides a review of research relevant to how construct-irrelevant…
Descriptors: Automation, Scoring, Responses, Test Wiseness
Gotch, Chad M.; French, Brian F. – Educational Measurement: Issues and Practice, 2014
This work systematically reviews teacher assessment literacy measures within the context of contemporary teacher evaluation policy. In this study, the researchers collected objective tests of assessment knowledge, teacher self-reports, and rubrics to evaluate teachers' work in assessment literacy studies from 1991 to 2012. Then they evaluated…
Descriptors: Measures (Individuals), Objective Tests, Measurement Techniques, Scoring Rubrics
Suto, Irenka – Educational Measurement: Issues and Practice, 2012
Internationally, many assessment systems rely predominantly on human raters to score examinations. Arguably, this facilitates the assessment of multiple sophisticated educational constructs, strengthening assessment validity. It can introduce subjectivity into the scoring process, however, engendering threats to accuracy. The present objectives…
Descriptors: Evaluation Methods, Scoring, Qualitative Research, Protocol Analysis
Goldberg, Gail Lynn – Educational Measurement: Issues and Practice, 2012
The engagement of teachers as raters to score constructed response items on assessments of student learning is widely claimed to be a valuable vehicle for professional development. This paper examines the evidence behind those claims from several sources, including research and reports over the past two decades, information from a dozen state…
Descriptors: Academic Achievement, Performance Based Assessment, Scoring, Professional Development
Dorans, Neil J. – Educational Measurement: Issues and Practice, 2012
Views on testing--its purpose and uses and how its data are analyzed--are related to one's perspective on test takers. Test takers can be viewed as learners, examinees, or contestants. I briefly discuss the perspective of test takers as learners. I maintain that much of psychometrics views test takers as examinees. I discuss test takers as a…
Descriptors: Testing, Test Theory, Item Response Theory, Test Reliability
Nichols, Paul; Twing, Jon; Mueller, Canda D.; O'Malley, Kimberly – Educational Measurement: Issues and Practice, 2010
Some writers in the measurement literature have been skeptical of the meaningfulness of achievement standards and described the standard-setting process as blatantly arbitrary. We argue that standard setting is more appropriately conceived of as a measurement process similar to student assessment. The construct being measured is the panelists'…
Descriptors: Scaling, Achievement, Standard Setting (Scoring), Measurement

Frary, Robert B. – Educational Measurement: Issues and Practice, 1988
Formula scoring is designed to reduce multiple-choice test score irregularities due to guessing. It is inappropriate for most classroom testing, but may be desirable for speeded tests and difficult tests with low passing scores. An annotated bibliography and a Self-Test are provided. (SLD)
Descriptors: Multiple Choice Tests, Scoring, Testing Problems

Geisinger, Kurt F. – Educational Measurement: Issues and Practice, 1991
Ways to use standard-setting data to adjust cutoff scores on examinations are reviewed. Ten sources of information to be used in determining standards are listed. The decision to modify passing scores should be based on these types of information and consideration of adverse impact or rating process irregularities. (SLD)
Descriptors: Cutting Scores, Evaluation Utilization, Evaluators, Interrater Reliability

Plake, Barbara S.; And Others – Educational Measurement: Issues and Practice, 1991
Possible sources of intrajudge inconsistency in standard setting are reviewed, and approaches are presented to improve the accuracy of rating. Procedures for providing judges with feedback through discussion or computerized communication are discussed. Monitoring and maintaining judges' consistency throughout the rating process are essential. (SLD)
Descriptors: Computer Assisted Instruction, Evaluators, Examiners, Feedback

Mills, Craig N.; And Others – Educational Measurement: Issues and Practice, 1991
An approach is presented to the definition of minimal competence for judges to use in standard setting. Panelists in standard setting must receive training to ensure that differences in rating result from differences in perceptions of item difficulty, not from differences of opinion about the definition of minimal competence. (SLD)
Descriptors: Cutting Scores, Decision Making, Definitions, Difficulty Level