Showing 1 to 15 of 22 results
Peer reviewed
Xu, Yangmeng; Wind, Stefanie A. – Educational Measurement: Issues and Practice, 2025
Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…
Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods
Peer reviewed
Burkhardt, Amy; Lottridge, Susan; Woolf, Sherri – Educational Measurement: Issues and Practice, 2021
For some students, standardized tests serve as a conduit to disclose sensitive issues of harm or distress that may otherwise go unreported. By detecting this writing, known as "crisis papers," testing programs have a unique opportunity to assist in mitigating the risk of harm to these students. The use of machine learning to…
Descriptors: Scoring Rubrics, Identification, At Risk Students, Standardized Tests
Peer reviewed
Solano-Flores, Guillermo – Educational Measurement: Issues and Practice, 2021
This article proposes a Boolean approach to representing and analyzing interobserver agreement in dichotomous coding. Building on the notion that observations are samples of a universe of observations, it submits that coding can be viewed as a process in which observers sample pieces of evidence on constructs. It distinguishes between formal and…
Descriptors: Online Searching, Coding, Interrater Reliability, Evidence
Peer reviewed
Babcock, Ben; Risk, Nicole M.; Wyse, Adam E. – Educational Measurement: Issues and Practice, 2020
This study compared the statistical properties of four job analysis task survey response scale types: criticality, difficulty in learning, importance, and frequency. We used nine job analysis studies spanning two fields, medical imaging and allied health professionals, to compare the job analysis scales in terms of variability and interrater…
Descriptors: Job Analysis, Radiology, Allied Health Personnel, Surveys
Peer reviewed
Leighton, Jacqueline P.; Lehman, Blair – Educational Measurement: Issues and Practice, 2020
In this digital ITEMS module, Dr. Jacqueline Leighton and Dr. Blair Lehman review differences between think-aloud interviews to measure problem-solving processes and cognitive labs to measure comprehension processes. Learners are introduced to historical, theoretical, and procedural differences between these methods and how to use and analyze…
Descriptors: Protocol Analysis, Interviews, Problem Solving, Cognitive Processes
Peer reviewed
Flake, Jessica Kay; Petway, Kevin Terrance, II – Educational Measurement: Issues and Practice, 2019
Numerous studies merely note divergence in students' and teachers' ratings of student noncognitive constructs. However, given the increased attention and use of these constructs in educational research and practice, an in-depth study focused on this issue was needed. Using a variety of quantitative methodologies, we thoroughly investigate…
Descriptors: Teachers, Students, Achievement Rating, Interrater Reliability
Peer reviewed
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Peer reviewed
Traynor, A.; Merzdorf, H. E. – Educational Measurement: Issues and Practice, 2018
During the development of large-scale curricular achievement tests, recruited panels of independent subject-matter experts use systematic judgmental methods--often collectively labeled "alignment" methods--to rate the correspondence between a given test's items and the objective statements in a particular curricular standards document.…
Descriptors: Achievement Tests, Expertise, Alignment (Education), Test Items
Peer reviewed
Anderson, Daniel; Irvin, Shawn; Alonzo, Julie; Tindal, Gerald A. – Educational Measurement: Issues and Practice, 2015
The alignment of test items to content standards is critical to the validity of decisions made from standards-based tests. Generally, alignment is determined based on judgments made by a panel of content experts with either ratings averaged or via a consensus reached through discussion. When the pool of items to be reviewed is large, or the…
Descriptors: Test Items, Alignment (Education), Standards, Online Systems
Peer reviewed
McCaffrey, Daniel F.; Yuan, Kun; Savitsky, Terrance D.; Lockwood, J. R.; Edelen, Maria O. – Educational Measurement: Issues and Practice, 2015
We examine the factor structure of scores from the CLASS-S protocol obtained from observations of middle school classroom teaching. Factor analysis has been used to support both interpretations of scores from classroom observation protocols, like CLASS-S, and the theories about teaching that underlie them. However, classroom observations contain…
Descriptors: Factor Structure, Multivariate Analysis, Scores, Factor Analysis
Peer reviewed
Williamson, David M.; Xi, Xiaoming; Breyer, F. Jay – Educational Measurement: Issues and Practice, 2012
A framework for evaluation and use of automated scoring of constructed-response tasks is provided that entails both evaluation of automated scoring as well as guidelines for implementation and maintenance in the context of constantly evolving technologies. Consideration of validity issues and challenges associated with automated scoring are…
Descriptors: Automation, Scoring, Evaluation, Guidelines
Peer reviewed
Royal-Dawson, Lucy; Baird, Jo-Anne – Educational Measurement: Issues and Practice, 2009
Hundreds of thousands of raters are recruited internationally to score examinations, but little research has been conducted on the selection criteria for these raters. Many countries insist upon teaching experience as a selection criterion and this has frequently become embedded in the cultural expectations surrounding the tests. Shortages in…
Descriptors: National Curriculum, Scoring, Foreign Countries, Teaching Experience
Peer reviewed
Webb, Noreen M.; Herman, Joan L.; Webb, Norman L. – Educational Measurement: Issues and Practice, 2007
This article examines the role of reviewer agreement in judgments about alignment between tests and standards. We used case data from three state alignment studies to explore how different approaches to incorporating reviewer agreement change alignment conclusions. The three case studies showed varying degrees of reviewer agreement about…
Descriptors: Test Items, Case Studies, Mathematics, Interrater Reliability
Peer reviewed
Sykes, Robert C.; Ito, Kyoko; Wang, Zhen – Educational Measurement: Issues and Practice, 2008
Student responses to a large number of constructed response items in three Math and three Reading tests were scored on two occasions using three ways of assigning raters: single reader scoring, a different reader for each response (item-specific), and three readers each scoring a rater item block (RIB) containing approximately one-third of a…
Descriptors: Test Items, Mathematics Tests, Reading Tests, Scoring
Peer reviewed
Campbell, Cynthia; Collins, Vicki L. – Educational Measurement: Issues and Practice, 2007
We reviewed the five top-selling introductory assessment textbooks in both general and special education to identify topics contained in textbooks and to determine the extent of agreement among authors regarding the essentialness of topics within and across discipline. Content analysis across the 10 assessment textbooks yielded 73 topics related…
Descriptors: Special Education, Content Analysis, Textbook Evaluation, Textbook Content