ERIC - Search Results

Showing 1 to 15 of 63 results

Reliability and Validity of Using Structured Visual-Inspection Criteria to Interpret Latency-Based Functional Analysis Outcomes

Peer reviewed

Sunde, Eleah; Briggs, Adam M.; Mitteer, Daniel R. – Journal of Applied Behavior Analysis, 2022

Prior research has evaluated the reliability and validity of structured visual inspection (SVI) criteria for interpreting functional analysis (FA) outcomes (Hagopian et al., 1997; Roane et al., 2013). We adapted these criteria to meet the unique needs of interpreting latency-based FA outcomes and examined the reliability and validity of applying…

Descriptors: Reliability, Validity, Visual Perception, Evaluation Criteria

Evaluation Is Creation: Self and Social Judgments of Creativity across the Four-C Model

Peer reviewed

Denis Dumas; James C. Kaufman – Educational Psychology Review, 2024

Who should evaluate the originality and task-appropriateness of a given idea has been a perennial debate among psychologists of creativity. Here, we argue that the most relevant evaluator of a given idea depends crucially on the level of expertise of the person who generated it. To build this argument, we draw on two complementary theoretical…

Descriptors: Decision Making, Creativity, Task Analysis, Psychologists

The Development of an Instrument to Measure the College Student Entrepreneurship Skills

Peer reviewed

Nofrida, Eka R.; PH, Slamet; Prasojo, Lantip D.; Mahmudah, Fitri N. – Pegem Journal of Education and Instruction, 2022

This study aims to (1) produce an instrument for measuring college student entrepreneurial skills; (2) describe the quality of the measurement instrument for college students' entrepreneurial skills; (3) describe the practicality of the measurement instrument for college students' entrepreneurial skills. The method used in this study is the…

Descriptors: Measures (Individuals), College Students, Validity, Reliability

Utilizing Large Language Models for EFL Essay Grading: An Examination of Reliability and Validity in Rubric-Based Assessments

Peer reviewed

Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025

This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics

The Concurrent Validity of Comparative Judgement Outcomes Compared with Marks

Gill, Tim – Research Matters, 2022

In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…

Descriptors: Comparative Analysis, Decision Making, Scripts, Standards

Judges' Views on Pairwise Comparative Judgement and Rank Ordering as Alternatives to Analytical Essay Marking

Walland, Emma – Research Matters, 2022

In this article, I report on examiners' views and experiences of using Pairwise Comparative Judgement (PCJ) and Rank Ordering (RO) as alternatives to traditional analytical marking for GCSE English Language essays. Fifteen GCSE English Language examiners took part in the study. After each had judged 100 pairs of essays using PCJ and eight packs of…

Descriptors: Essays, Grading, Writing Evaluation, Evaluators

Revisiting the Effectiveness of a Performance Decision Tree-Style Rubric Compared to a Grid-Style Rubric

Peer reviewed

Yuichiro Yokouchi – Language Testing in Asia, 2025

The performance decision tree (PDT; Fulcher et al., 2011) is a rubric style that is applicable to performance assessment, with origins in Upshur and Turner's (1995) empirically derived binary-choice, boundary-definition (EBB) scale. It is easier for raters to assess performance by evaluating multiple binary-choice descriptors. Additionally,…

Descriptors: Scoring Rubrics, Second Language Learning, Second Language Instruction, Language Teachers

Peer Assessment Using Soft Computing Techniques

Peer reviewed

Pinargote-Ortega, Maricela; Bowen-Mendoza, Lorena; Meza, Jaime; Ventura, Sebastián – Journal of Computing in Higher Education, 2021

In this paper, we applied a peer assessment scenario at the Technical University of Manabí (Ecuador). Students and professors evaluated some works through rubrics, assigned a numerical score, and provided textual feedback grounding why such a numerical score was determined, to detect inaccuracy between both assessments. The proposed model uses…

Descriptors: Foreign Countries, College Students, Peer Evaluation, Scoring Rubrics

A Weighted Individual Performance-Based Assessment for Middle School Orchestral Strings: Establishing Validity and Reliability

Kevin Ward – ProQuest LLC, 2022

The study established the validity and reliability of a weighted individual performance-based assessment tool within the utility scope of middle school orchestral strings. The following research questions guided this study: 1. What specific string-playing behaviors and corresponding criteria validate a weighted individual performance-based…

Descriptors: Music Education, Musical Instruments, Psychometrics, Music

Subjective Ratings of Age-of-Acquisition: Exploring Issues of Validity and Rater Reliability

Peer reviewed

Wikse Barrow, Carla; Nilsson Bjorkenstam, Kristina; Strombergsson, Sofia – Journal of Child Language, 2019

This study aimed to investigate concerns of validity and reliability in subjective ratings of age-of-acquisition (AoA), through exploring characteristics of the individual rater. An additional aim was to validate the obtained AoA ratings against two corpora -- one of child speech and one of adult speech -- specifically exploring whether words…

Descriptors: Language Acquisition, Evaluators, Validity, Reliability

Calibrated Parsing Items Evaluation: A Step towards Objectifying the Translation Assessment

Peer reviewed

Akbari, Alireza; Shahnazari, Mohammadtaghi – Language Testing in Asia, 2019

The present research paper introduces a translation evaluation method called Calibrated Parsing Items Evaluation (CPIE hereafter). This evaluation method maximizes translators' performance through identifying the parsing items with an optimal p-docimology and d-index (item discrimination). This method checks all the possible parses (annotations)…

Descriptors: Test Items, Translation, Computer Software, Evaluators

Opportunities for Mathematics Engagement in Secondary Teachers' Practice: Validating an Observation Tool

Peer reviewed

Jansen, Amanda; Smith, Ethan P.; Middleton, James A.; Cullicott, Catherine E. – North American Chapter of the International Group for the Psychology of Mathematics Education, 2021

The purpose of this report is to present our process and results for establishing validity and reliability of an observation tool used to investigate teaching practices that high school mathematics teachers use to engage students. We developed our tool using established practices, such as reviewing literature to develop a framework for instruction…

Descriptors: Mathematics Instruction, Secondary School Teachers, High School Teachers, Scoring Rubrics

Validation of an Automated Procedure for Calculating Core Lexicon from Transcripts

Peer reviewed

Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022

Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…

Descriptors: Validity, Discourse Analysis, Databases, Scoring

Investigating Human Essay Rating Quality in a Large-Scale Assessment Using Many-Facet Rasch Measurement

Peer reviewed

Zhang, Xiuyuan – AERA Online Paper Repository, 2019

The main purpose of the study is to evaluate the qualities of human essay ratings for a large-scale assessment using Rasch measurement theory. Specifically, Many-Facet Rasch Measurement (MFRM) was utilized to examine the rating scale category structure and provide important information about interpretations of ratings in the large-scale…

Descriptors: Essays, Evaluators, Writing Evaluation, Reliability

Development and Validation of a Rating Scale for Iranian EFL Academic Writing Assessment: A Mixed-Methods Study

Peer reviewed

Ghanbari, Nasim; Barati, Hossein – Language Testing in Asia, 2020

The present study reports the process of development and validation of a rating scale in the Iranian EFL academic writing assessment context. To achieve this goal, the study was conducted in three distinct phases. Early in the study, the researcher interviewed a number of raters in different universities. Next, a questionnaire was developed based…

Descriptors: Rating Scales, Writing Evaluation, English for Academic Purposes, Second Language Learning
