ERIC - Search Results

Publication Date

In 2025	0
Since 2024	4
Since 2021 (last 5 years)	12
Since 2016 (last 10 years)	20
Since 2006 (last 20 years)	37

Descriptor

Correlation	42
Evaluators	42
Scoring	42
Second Language Learning	18
Computer Assisted Testing	15
English (Second Language)	15
Language Tests	15
Writing Evaluation	14
Comparative Analysis	13
Essays	13
Interrater Reliability	12
Reliability	11
Computer Software	10
Foreign Countries	10
Scores	8
Undergraduate Students	8
Accuracy	7
Rating Scales	7
Statistical Analysis	7
Evaluation Criteria	6
Holistic Approach	6
Performance Based Assessment	6
Writing Tests	6
Evaluation Methods	5
Oral Language	5

Publication Type

Reports - Research	37
Journal Articles	36
Tests/Questionnaires	4
Reports - Evaluative	3
Dissertations/Theses -…	2
Information Analyses	2
Speeches/Meeting Papers	2

Education Level

Higher Education	10
Postsecondary Education	9
Secondary Education	3
Elementary Education	1
High Schools	1

Location

China	3
India	2
Australia	1
Japan	1
Michigan	1
Nigeria	1
Singapore	1
Texas	1

Assessments and Surveys

Test of English as a Foreign…	7
Gates MacGinitie Reading Tests	1
International English…	1
Torrance Tests of Creative…	1

Showing 1 to 15 of 42 results

Evaluating ChatGPT as a Self-Learning Tool in Medical Biochemistry: A Performance Assessment in Undergraduate Medical University Examination

Peer reviewed

Krishna Mohan Surapaneni; Anusha Rajajagadeesan; Lakshmi Goudhaman; Shalini Lakshmanan; Saranya Sundaramoorthi; Dineshkumar Ravi; Kalaiselvi Rajendiran; Porchelvan Swaminathan – Biochemistry and Molecular Biology Education, 2024

The emergence of ChatGPT as one of the most advanced chatbots and its ability to generate diverse data has given room for numerous discussions worldwide regarding its utility, particularly in advancing medical education and research. This study seeks to assess the performance of ChatGPT in medical biochemistry to evaluate its potential as an…

Descriptors: Biochemistry, Science Instruction, Artificial Intelligence, Teaching Methods

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

Detecting Rater Centrality Effects in Performance Assessments: A Model-Based Comparison of Centrality Indices

Peer reviewed

Jin, Kuan-Yu; Eckes, Thomas – Measurement: Interdisciplinary Research and Perspectives, 2022

Recent research on rater effects in performance assessments has increasingly focused on rater centrality, the tendency to assign scores clustering around the rating scale's middle categories. In the present paper, we adopted Jin and Wang's (2018) extended facets modeling approach and constructed a centrality continuum, ranging from raters…

Descriptors: Performance Based Assessment, Evaluators, Scoring, Sample Size

Rater Connections and the Detection of Bias in Performance Assessment

Peer reviewed

Wind, Stefanie A. – Measurement: Interdisciplinary Research and Perspectives, 2022

In many performance assessments, one or two raters from the complete rater pool score each performance, resulting in a sparse rating design, where there are limited observations of each rater relative to the complete sample of students. Although sparse rating designs can be constructed to facilitate estimation of student achievement, the…

Descriptors: Evaluators, Bias, Identification, Performance Based Assessment

Meta-Analysis of Inter-Rater Agreement and Discrepancy Between Human and Automated English Essay Scoring

Peer reviewed

Jiyeo Yun – English Teaching, 2023

Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…

Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring

Measuring Original Thinking in Elementary School: Development and Validation of a Computational Psychometric Approach

Peer reviewed

Selcuk Acar; Denis Dumas; Peter Organisciak; Kelly Berthiaume – Grantee Submission, 2024

Creativity is highly valued in both education and the workforce, but assessing and developing creativity can be difficult without psychometrically robust and affordable tools. The open-ended nature of creativity assessments has made them difficult to score, expensive, often imprecise, and therefore impractical for school- or district-wide use. To…

Descriptors: Thinking Skills, Elementary School Students, Artificial Intelligence, Measurement Techniques

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Using Latent Semantic Analysis to Score Short Answer Constructed Responses: Automated Scoring of the Consequences Test

Peer reviewed

LaVoie, Noelle; Parker, James; Legree, Peter J.; Ardison, Sharon; Kilcullen, Robert N. – Educational and Psychological Measurement, 2020

Automated scoring based on Latent Semantic Analysis (LSA) has been successfully used to score essays and constrained short answer responses. Scoring tests that capture open-ended, short answer responses poses some challenges for machine learning approaches. We used LSA techniques to score short answer responses to the Consequences Test, a measure…

Descriptors: Semantics, Evaluators, Essays, Scoring

Validation of an Automated Procedure for Calculating Core Lexicon from Transcripts

Peer reviewed

Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022

Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…

Descriptors: Validity, Discourse Analysis, Databases, Scoring

Comparing Holistic and Analytic Marking Methods in Assessing Speech Act Production in L2 Chinese

Peer reviewed

Li, Shuai; Wen, Ting; Li, Xian; Feng, Yali; Lin, Chuan – Language Testing, 2023

This study compared holistic and analytic marking methods for their effects on parameter estimation (of examinees, raters, and items) and rater cognition in assessing speech act production in L2 Chinese. Seventy American learners of Chinese completed an oral Discourse Completion Test assessing requests and refusals. Four first-language (L1)…

Descriptors: Speech Acts, Second Language Learning, Second Language Instruction, Chinese

An Investigation of the Impact of Jagged Profile on L2 Speaking Test Ratings: Evidence from Rating and Eye-Tracking Data

Peer reviewed

Ma, Wenyue; Winke, Paula – Language Assessment Quarterly, 2022

The factors that influence rater scoring have been a subject of great interest to researchers in second language assessment. However, the research on the impact of test-takers' speech profiles (e.g., a jagged or a flat profile reflecting analytic subscores) on raters' scoring behaviors remains to be seen. To investigate the role of speech profiles…

Descriptors: Language Tests, Second Language Learning, Speech Communication, Profiles

Can Automated Machine Translation Evaluation Metrics Be Used to Assess Students' Interpretation in the Language Learning Classroom?

Peer reviewed

Han, Chao; Lu, Xiaolei – Computer Assisted Language Learning, 2023

The use of translation and interpreting (T&I) in the language learning classroom is commonplace, serving various pedagogical and assessment purposes. Previous utilization of T&I exercises is driven largely by their potential to enhance language learning, whereas the latest trend has begun to underscore T&I as a crucial skill to be…

Descriptors: Translation, Computational Linguistics, Correlation, Language Processing

The Influence of Rater Effects in Training Sets on the Psychometric Quality of Automated Scoring for Writing Assessments

Peer reviewed

Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018

Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…

Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring

The Use of Semantic Similarity Tools in Automated Content Scoring of Fact-Based Essays Written by EFL Learners

Peer reviewed

Wang, Qiao – Education and Information Technologies, 2022

This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring

Rater Attitude towards Emerging Varieties of English: A New Rater Effect?

Peer reviewed

Hsu, Tammy Huei-Lien – Language Testing in Asia, 2019

Background: A strong interest in researching World Englishes (WE) in relation to language assessment has become an emerging theme in language assessment studies over the past two decades. While research on WE has highlighted the status, function, and legitimacy of varieties of English language, it remains unclear how raters respond to the results…

Descriptors: Language Attitudes, Language Variation, Language Tests, Second Language Learning

Source

Language Testing	6
ETS Research Report Series	5
Applied Measurement in…	3
Educational and Psychological…	2
Grantee Submission	2
Measurement:…	2
ProQuest LLC	2
Advances in Physiology…	1
American Journal on Mental…	1
Assessment	1
Assessment in Education:…	1
Biochemistry and Molecular…	1
CALICO Journal	1
Computer Assisted Language…	1
Contemporary Issues in…	1
Education and Information…	1
English Teaching	1
International Journal of…	1
Journal of Educational and…	1
Journal of Speech, Language,…	1
Language Assessment Quarterly	1
Language Testing in Asia	1
Online Submission	1
Perceptual and Motor Skills	1
World Journal of Education	1

Author

Bridgeman, Brent	2
Linn, Robert L.	2
Wind, Stefanie A.	2
Xi, Xiaoming	2
Abdul Gafoor, K.	1
Allan S. Cohen	1
Amanda Huee-Ping Wong	1
Anusha Rajajagadeesan	1
Apple, Kristen	1
Ardison, Sharon	1
Attali, Yigal	1
Barkaoui, Khaled	1
Blair, William O.	1
Boccaccini, Marcus T.	1
Breyer, F. Jay	1
Brown, Michelle Stallone	1
Carifio, James	1
Crossley, Scott A.	1
Dalton, Sarah Grace	1
Davey, Tim	1
Davis, Lawrence Edward	1
Denis Dumas	1
Dineshkumar Ravi	1
Duchnowski, Matthew P.	1