ERIC - Search Results

Publication Date

In 2025	1
Since 2024	5
Since 2021 (last 5 years)	17
Since 2016 (last 10 years)	24
Since 2006 (last 20 years)	36

Descriptor

Evaluation Methods	78
Evaluators	78
Interrater Reliability	78
Scoring	16
Foreign Countries	15
Performance Based Assessment	15
Rating Scales	13
Comparative Analysis	12
Evaluation Criteria	12
Higher Education	11
Educational Assessment	10
Language Tests	10
Scores	10
Correlation	9
Teacher Evaluation	9
Student Evaluation	8
Test Validity	8
Decision Making	7
Elementary Secondary Education	7
Second Language Learning	7
Test Reliability	7
Accuracy	6
Training	6
College Students	5
Computer Software	5
More ▼

Publication Type

Reports - Research	52
Journal Articles	51
Speeches/Meeting Papers	18
Reports - Evaluative	17
Tests/Questionnaires	8
Information Analyses	5
Reports - Descriptive	5
Dissertations/Theses -…	3
Opinion Papers	2
Collected Works - Serials	1
ERIC Digests in Full Text	1
ERIC Publications	1
More ▼

Education Level

Higher Education	11
Postsecondary Education	10
Elementary Education	3
Elementary Secondary Education	3
Secondary Education	3
High Schools	2
Middle Schools	2
Adult Education	1
Grade 4	1
Grade 6	1
Grade 7	1
Intermediate Grades	1
Junior High Schools	1
More ▼

Audience

Practitioners	2
Researchers	2

Location

Israel	3
Australia	2
China	2
Tennessee	2
Vietnam	2
Canada	1
Cuba	1
Europe	1
Illinois (Urbana)	1
India	1
Netherlands	1
Spain	1
Sweden	1
Texas	1
United Kingdom	1
United Kingdom (England)	1
More ▼

Laws, Policies, & Programs

No Child Left Behind Act 2001	1
Race to the Top	1

Assessments and Surveys

National Assessment of…	1
Praxis Series	1
Stanford Achievement Tests	1
Test of English as a Foreign…	1
Torrance Tests of Creative…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 78 results Save | Export

Investigating the Effect of Classroom-Based Feedback on Speaking Assessment: A Multifaceted Rasch Analysis

Peer reviewed

Direct link

Bijani, Houman; Hashempour, Bahareh; Ibrahim, Khaled Ahmed Abdel-Al; Orabah, Salim Said Bani; Heydarnejad, Tahereh – Language Testing in Asia, 2022

Due to subjectivity in oral assessment, much concentration has been put on obtaining a satisfactory measure of consistency among raters. However, the process for obtaining more consistency might not result in valid decisions. One matter that is at the core of both reliability and validity in oral assessment is rater training. Recently,…

Descriptors: Oral Language, Language Tests, Feedback (Response), Bias

Same Grade for Different Reasons, Different Grades for the Same Reason?

Peer reviewed

Direct link

Ilona Rinne – Assessment & Evaluation in Higher Education, 2024

It is widely acknowledged in research that common criteria and aligned standards do not result in consistent assessment of such a complex performance as the final undergraduate thesis. Assessment is determined by examiners' understanding of rubrics and their views on thesis quality. There is still a gap in the research literature about how…

Descriptors: Foreign Countries, Undergraduate Students, Teacher Education Programs, Evaluation Criteria

Do Source Use Features Impact Raters' Judgment of Argumentation? An Experimental Study

Peer reviewed

Direct link

Ping-Lin Chuang – Language Testing, 2025

This experimental study explores how source use features impact raters' judgment of argumentation in a second language (L2) integrated writing test. One hundred four experienced and novice raters were recruited to complete a rating task that simulated the scoring assignment of a local English Placement Test (EPT). Sixty written responses were…

Descriptors: Interrater Reliability, Evaluators, Information Sources, Primary Sources

Raters' Scoring Process in Assessment of Interpreting: An Empirical Study Based on Eye Tracking and Retrospective Verbalisation

Peer reviewed

Direct link

Chao Han; Binghan Zheng; Mingqing Xie; Shirong Chen – Interpreter and Translator Trainer, 2024

Human raters' assessment of interpreting is a complex process. Previous researchers have mainly relied on verbal reports to examine this process. To advance our understanding, we conducted an empirical study, collecting raters' eye-movement and retrospection data in a computerised interpreting assessment in which three groups of raters (n = 35)…

Descriptors: Foreign Countries, College Students, College Graduates, Interrater Reliability

The Value of Expanding Perspectives on Assessment

Peer reviewed

Direct link

Janice Kinghorn; Katherine McGuire; Bethany L. Miller; Aaron Zimmerman – Assessment Update, 2024

In this article, the authors share their reflections on how different experiences and paradigms have broadened their understanding of the work of assessment in higher education. As they collaborated to create a panel for the 2024 International Conference on Assessing Quality in Higher Education, they recognized that they, as assessment…

Descriptors: Higher Education, Assessment Literacy, Evaluation Criteria, Evaluation Methods

Exploring an Alternative to Record Motor Competence Assessment: Interrater and Intrarater Audio-Video Reliability

Peer reviewed

Direct link

Cristina Menescardi; Aida Carballo-Fazanes; Núria Ortega-Benavent; Isaac Estevan – Journal of Motor Learning and Development, 2024

The Canadian Agility and Movement Skill Assessment (CAMSA) is a valid and reliable circuit-based test of motor competence which can be used to assess children's skills in a live or recorded performance and then coded. We aimed to analyze the intrarater reliability of the CAMSA scores (total, time, and skill score) and time measured, by comparing…

Descriptors: Interrater Reliability, Evaluators, Scoring, Psychomotor Skills

The Whole Is More than the Sum of Its Parts -- Assessing Writing Using the Consensual Assessment Technique

Peer reviewed

Direct link

Zahn, Daniela; Canton, Ursula; Boyd, Victoria; Hamilton, Laura; Mamo, Josianne; McKay, Jane; Proudfoot, Linda; Telfer, Dickson; Williams, Kim; Wilson, Colin – Studies in Higher Education, 2021

Evaluating the impact of Academic Literacies teaching (Lea and Street [1998. "Student Writing in Higher Education: An Academic Literacies Approach." "Studies in Higher Education" 23 (2): 157-72. doi:10.1080/03075079812331380364]) is difficult, as it involves gauging whether writers: (1) gain better understanding of what…

Descriptors: Writing Evaluation, Evaluation Methods, Undergraduate Students, Foreign Countries

Do You Mean What I Mean? Comparing Teacher Performance Self-Scores and Evaluator-Generated Scores

Peer reviewed

Direct link

Hunter, Seth B. – Journal of Education Human Resources, 2023

Teacher performance scores inform education leaders' management of teacher human resources. However, prior research has implied that different interpretations of performance criteria between teachers and their evaluators suppress teacher development. Although research has examined teacher perceptions of performance scores and compared teacher…

Descriptors: Teacher Evaluation, Teacher Effectiveness, Self Evaluation (Individuals), Interrater Reliability

How Many Raters Can Be Enough: G Theory Applied to Assessment and Measurement of L2 Speech Perception

Peer reviewed
PDF on ERIC

Download full text

Kevin Hirschi; Okim Kang – Language Teaching Research Quarterly, 2023

This paper extends the use of Generalizability Theory to the measurement of extemporaneous L2 speech through the lens of speech perception. Using six datasets of previous studies, it reports on "G studies"--a method of breaking down measurement variance--and "D studies"--a predictive study of the impact on reliability when…

Descriptors: Evaluators, Generalization, Evaluation Methods, Speech Communication

A Model-Data-Fit-Informed Approach to Score Resolution in Performance Assessments

Peer reviewed

Direct link

Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021

Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…

Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making

Automated Assessment of Second Language Comprehensibility: Review, Training, Validation, and Generalization Studies

Peer reviewed

Direct link

Saito, Kazuya; Macmillan, Konstantinos; Kachlicka, Magdalena; Kunihara, Takuya; Minematsu, Nobuaki – Studies in Second Language Acquisition, 2023

Whereas many scholars have emphasized the relative importance of "comprehensibility" as an ecologically valid goal for L2 speech training, testing, and development, eliciting listeners' judgments is time-consuming. Following calls for research on more efficient L2 speech rating methods in applied linguistics, and growing attention toward…

Descriptors: Second Language Learning, Second Language Instruction, Interrater Reliability, Speech Communication

Teacher Perceptions of Psychological Reports: An Empirical Comparison of District Evaluators' and Contracted Evaluators' Report Styles

Direct link

Peter Stern – ProQuest LLC, 2021

Across the country, school districts are increasingly seeking out privately contracted psychologists to conduct psychological evaluations. As such, it is increasingly important that psychological reports adhere to best practices and are written to ensure comprehension by both parents and teachers. This study explored the potential differences…

Descriptors: Teachers, Special Education Teachers, Teacher Attitudes, Psychological Evaluation

Comparing Machine and Human Reviewers to Evaluate the Risk of Bias in Randomized Controlled Trials

Peer reviewed

Direct link

Armijo-Olivo, Susan; Craig, Rodger; Campbell, Sandy – Research Synthesis Methods, 2020

Background: Evidence from new health technologies is growing, along with demands for evidence to inform policy decisions, creating challenges in completing health technology assessments (HTAs)/systematic reviews (SRs) in a timely manner. Software can decrease the time and burden by automating the process, but evidence validating such software is…

Descriptors: Comparative Analysis, Computer Software, Decision Making, Randomized Controlled Trials

Fairness in Oral Language Assessment: Training Raters and Considering Examinees' Expectations

Peer reviewed
PDF on ERIC

Download full text

Doosti, Mehdi; Ahmadi Safa, Mohammad – International Journal of Language Testing, 2021

This study examined the effect of rater training on promoting inter-rater reliability in oral language assessment. It also investigated whether rater training and the consideration of the examinees' expectations by the examiners have any effect on test-takers' perceptions of being fairly evaluated. To this end, four raters scored 31 Iranian…

Descriptors: Oral Language, Language Tests, Interrater Reliability, Training

Cross-Validation and Application of a Scale Assessing School Band Performance

Peer reviewed

Direct link

Rossin, Emily G.; Bergee, Martin J. – Journal of Research in Music Education, 2021

This is the sixth and culminating study in a series whose purpose has been to acquire a conceptual understanding of school band performance and to develop an assessment based on this understanding. With the present study, we cross-validated and applied a rating scale for school band performance. In the cross-validation phase, college students…

Descriptors: Music Education, Music Activities, Music, Performance

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6

Language Testing	4
Applied Measurement in…	3
ProQuest LLC	3
Assessment Update	2
Educational Measurement:…	2
Journal of Personnel…	2
Personnel Psychology	2
Applied Psychological…	1
Assessment & Evaluation in…	1
Assessment and Evaluation in…	1
Australian Journal of…	1
Canadian Modern Language…	1
Child Abuse and Neglect: The…	1
Child Language Teaching and…	1
Creativity Research Journal	1
ETS Research Report Series	1
Education and Information…	1
Educational Assessment	1
Educational Research and…	1
Educational and Psychological…	1
English Language Teaching	1
Evaluation and the Health…	1
International Educational…	1
International Journal of…	1
Interpreter and Translator…	1
More ▼

Jaeger, Richard M.	3
Bejar, Isaac I.	2
Houston, Walter M.	2
Myford, Carol M.	2
Plake, Barbara S.	2
Wind, Stefanie A.	2
Aaron Zimmerman	1
Abedi, Jamal	1
Ahmadi Safa, Mohammad	1
Aida Carballo-Fazanes	1
Apache, R. R.	1
Armijo-Olivo, Susan	1
Backlund, Phil	1
Baer, John	1
Bakker, Mirjam E. J.	1
Beijaard, Douwe	1
Benyon, Howard E., III.	1
Bergee, Martin J.	1
Bethany L. Miller	1
Bijani, Houman	1
Binghan Zheng	1
Black, Don	1
Bourke, Sid	1
Boyd, Victoria	1
More ▼