Publication Date
In 2025 | 0 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 22 |
Since 2016 (last 10 years) | 54 |
Since 2006 (last 20 years) | 88 |
Descriptor
Interrater Reliability | 115 |
Item Response Theory | 115 |
Foreign Countries | 36 |
Scoring | 33 |
Scoring Rubrics | 23 |
Evaluators | 19 |
Rating Scales | 19 |
Test Items | 19 |
Scores | 17 |
Test Construction | 16 |
Psychometrics | 15 |
Author
Johnson, Evelyn S. | 6 |
Moylan, Laura A. | 6 |
Zheng, Yuzhu | 6 |
Crawford, Angela R. | 5 |
Lunz, Mary E. | 5 |
Engelhard, George, Jr. | 4 |
Wind, Stefanie A. | 4 |
Karakaya, Ismail | 3 |
O'Neill, Thomas R. | 3 |
Wyse, Adam E. | 3 |
Avery, Marybell | 2 |
Audience
Researchers | 2 |
Practitioners | 1 |
Location
Turkey | 9 |
Taiwan | 4 |
South Korea | 3 |
Australia | 2 |
Canada | 2 |
Finland | 2 |
Hong Kong | 2 |
Netherlands | 2 |
New Mexico | 2 |
United Kingdom | 2 |
California (Berkeley) | 1 |
Laws, Policies, & Programs
American Recovery and… | 1 |
Elementary and Secondary… | 1 |
Assessments and Surveys
Home Observation for… | 1 |
International English… | 1 |
Peabody Picture Vocabulary… | 1 |
Strengths and Difficulties… | 1 |
Test of English as a Foreign… | 1 |
Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024
This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated…
Descriptors: Foreign Countries, Interrater Reliability, Evaluators, Item Response Theory
Siqi Huang – North American Chapter of the International Group for the Psychology of Mathematics Education, 2023
The goal of this paper is twofold. First, the paper clarifies and elaborates on an important theoretical construct called orientation with respect to understanding in mathematics, which denotes the degree to which students exhibit an inclination towards and demonstrate an earnest concern for understanding in mathematical learning. Second, the…
Descriptors: Mathematics Instruction, Teaching Methods, Problem Solving, Reliability
Johnson, Evelyn S.; Zheng, Yuzhu; Crawford, Angela R.; Moylan, Laura A. – Journal of Learning Disabilities, 2021
In this study, we examined the relationship of special education teachers' performance on the Recognizing Effective Special Education Teachers (RESET) Explicit Instruction observation protocol with student growth on academic measures. Special education teachers provided video-recorded observations of three instructional lessons along with data…
Descriptors: Special Education Teachers, Teacher Effectiveness, Teacher Evaluation, Direct Instruction
Safak, Pinar; Cakmak, Salih; Karakoc, Tamer; Aydin O'Dwyer, Pinar – European Journal of Educational Research, 2021
This study aimed to develop a valid and reliable instrument that measures the functional vision of students with low vision. Thus, an assessment tool and performance activities were developed for three vision skill groups (near vision skills, distance vision skills, and visual field) that include functional vision skills. The universe was 1485…
Descriptors: Foreign Countries, Vision Tests, Diagnostic Tests, Vision
Gübes, Nese Öztürk – Participatory Educational Research, 2021
The aim of this study is to show how a many-facet Rasch measurement model (MFRM) can be used for quality control whilst monitoring a musical aptitude examination. The data used in this study were gathered from a musical aptitude examination administered in the 2019-2020 academic year to select teacher candidates for a music education department…
Descriptors: Foreign Countries, Music Education, Teacher Education Programs, Preservice Teacher Education
Nnamdi Chika Ezike – ProQuest LLC, 2022
Fitting wrongly specified models to observed data may lead to invalid inferences about the model parameters of interest. The current study investigated the performance of the posterior predictive model checking (PPMC) approach in detecting model-data misfit of the hierarchical rater model (HRM). The HRM is a rater-mediated model that incorporates…
Descriptors: Prediction, Models, Interrater Reliability, Item Response Theory
Donoghue, John R.; McClellan, Catherine A.; Hess, Melinda R. – ETS Research Report Series, 2022
When constructed-response items are administered for a second time, it is necessary to evaluate whether the current Time B administration's raters have drifted from the scoring of the original administration at Time A. To study this, Time A papers are sampled and rescored by Time B scorers. Commonly the scores are compared using the proportion of…
Descriptors: Item Response Theory, Test Construction, Scoring, Testing
Anthony, Christopher J.; Styck, Kara M.; Volpe, Robert J.; Robert, Christopher R. – School Psychology, 2023
Although originally conceived of as a marriage of direct behavioral observation and indirect behavior rating scales, recent research has indicated that Direct Behavior Ratings (DBRs) are affected by rater idiosyncrasies (rater effects) similar to other indirect forms of behavioral assessment. Most of this research has been conducted using…
Descriptors: Item Response Theory, Generalizability Theory, Interrater Reliability, Behavior Rating Scales
Yvette Jackson – ProQuest LLC, 2023
Rater-mediated activities in educational research occur when an expert judge or rater utilizes an instrument to judge persons or items and generates scale scores. Scale scores are from a subjective judgment and must undergo a quality control measure called rating quality. Rating quality in this study is broadly defined as the extent to which…
Descriptors: Educational Research, Evaluators, Test Theory, Item Response Theory
Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Uyar, Seyma; Yayla, Onur; Zunber, Hidayet – International Journal of Assessment Tools in Education, 2022
The purpose of the current study is to examine the map reading skills of Social Studies pre-service teachers with orienteering, which is an activity-based and more active practice. To this end, a total of 10 students attending the Department of Social Studies Teaching in the Education Faculty of Burdur Mehmet Akif Ersoy University and taking the…
Descriptors: Map Skills, Navigation, Item Response Theory, Social Studies
Kilic, Abdullah Faruk; Uysal, Ibrahim – International Journal of Assessment Tools in Education, 2022
Most researchers investigate the corrected item-total correlation of items when analyzing item discrimination in multi-dimensional structures under the Classical Test Theory, which might lead to underestimating item discrimination, thereby removing items from the test. Researchers might investigate the corrected item-total correlation with the…
Descriptors: Item Analysis, Correlation, Item Response Theory, Test Items
Martin, David; Jamieson-Proctor, Romina – International Journal of Research & Method in Education, 2020
In Australia, one of the key findings of the Teacher Education Ministerial Advisory Group was that not all graduating pre-service teachers possess adequate pedagogical content knowledge (PCK) to teach effectively. The concern is that higher education providers working with pre-service teachers are using pedagogical practices and assessments which…
Descriptors: Test Construction, Preservice Teachers, Pedagogical Content Knowledge, Foreign Countries
Johnson, Evelyn S.; Zheng, Yuzhu; Crawford, Angela R.; Moylan, Laura A. – Grantee Submission, 2020
In this study, we examined the relationship of special education teachers' performance on the RESET Explicit Instruction observation protocol with student growth on academic measures. Special education teachers provided video recorded observations of three instructional lessons along with data from standardized, curriculum-based academic measures…
Descriptors: Special Education Teachers, Teacher Effectiveness, Teacher Evaluation, Direct Instruction
Evaluating an Explicit Instruction Teacher Observation Protocol through a Validity Argument Approach
Johnson, Evelyn S.; Zheng, Yuzhu; Crawford, Angela R.; Moylan, Laura A. – Journal of Experimental Education, 2022
In this study, we examined the scoring and generalizability assumptions of an explicit instruction (EI) special education teacher observation protocol using many-faceted Rasch measurement (MFRM). Video observations of classroom instruction from 48 special education teachers across four states were collected. External raters (n = 20) were trained…
Descriptors: Direct Instruction, Teacher Education, Classroom Observation Techniques, Validity