ERIC - Search Results

Showing 1 to 15 of 17 results

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Semantic Distance and the Alternate Uses Task: Recommendations for Reliable Automated Assessment of Originality

Peer reviewed

Beaty, Roger E.; Johnson, Dan R.; Zeitlen, Daniel C.; Forthmann, Boris – Creativity Research Journal, 2022

Semantic distance is increasingly used for automated scoring of originality on divergent thinking tasks, such as the Alternate Uses Task (AUT). Despite some psychometric support for semantic distance -- including positive correlations with human creativity ratings -- additional work is needed to optimize its reliability and validity, including…

Descriptors: Semantics, Scoring, Creative Thinking, Creativity

Validation of an Automated Procedure for Calculating Core Lexicon from Transcripts

Peer reviewed

Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022

Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…

Descriptors: Validity, Discourse Analysis, Databases, Scoring

The Influence of Rater Effects in Training Sets on the Psychometric Quality of Automated Scoring for Writing Assessments

Peer reviewed

Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018

Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…

Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring

The "Reading the Mind in the Eyes" Test: Investigation of Psychometric Properties and Test-Retest Reliability of the Persian Version

Peer reviewed

Khorashad, Behzad S.; Baron-Cohen, Simon; Roshan, Ghasem M.; Kazemian, Mojtaba; Khazai, Ladan; Aghili, Zahra; Talaei, Ali; Afkhamizadeh, Mozhgan – Journal of Autism and Developmental Disorders, 2015

The psychometric properties of the Persian "Reading the Mind in the Eyes" test were investigated, so were the predictions from the Empathizing-Systemizing theory of psychological sex differences. Adults aged 16-69 years old (N = 545, female = 51.7%) completed the test online. The analysis of items showed them to be generally acceptable.…

Descriptors: Psychometrics, Theory of Mind, Gender Differences, Measures (Individuals)

Human Rights Attitude Scale: A Validity and Reliability Study

Peer reviewed

Ercan, Recep; Yaman, Tugba; Demir, Selcuk Besir – Journal of Education and Training Studies, 2015

The objective of this study is to develop a valid and reliable attitude scale having quality psychometric features that can measure secondary school students' attitudes towards human rights. The study group of the research is comprised by 710 6th, 7th and 8th grade students who study at 4 secondary schools in the centre of Sivas. The study group…

Descriptors: Civil Rights, Attitude Measures, Factor Analysis, Construct Validity

Improving Measurement Precision of Hierarchical Latent Traits Using Adaptive Testing

Peer reviewed

Wang, Chun – Journal of Educational and Behavioral Statistics, 2014

Many latent traits in social sciences display a hierarchical structure, such as intelligence, cognitive ability, or personality. Usually a second-order factor is linearly related to a group of first-order factors (also called domain abilities in cognitive ability measures), and the first-order factors directly govern the actual item responses.…

Descriptors: Measurement, Accuracy, Item Response Theory, Adaptive Testing

Investigating the Application of Automated Writing Evaluation to Chinese Undergraduate English Majors: A Case Study of "WriteToLearn"

Peer reviewed

Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016

This study investigated the application of "WriteToLearn" on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…

Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning

Comparison of Automated Scoring Methods for a Computerized Performance Assessment of Clinical Judgment

Peer reviewed

Harik, Polina; Baldwin, Peter; Clauser, Brian – Applied Psychological Measurement, 2013

Growing reliance on complex constructed response items has generated considerable interest in automated scoring solutions. Many of these solutions are described in the literature; however, relatively few studies have been published that "compare" automated scoring strategies. Here, comparisons are made among five strategies for…

Descriptors: Computer Assisted Testing, Automation, Scoring, Comparative Analysis

Automated Trait Scores for "GRE"® Writing Tasks. Research Report. ETS RR-15-15

Peer reviewed

Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015

The "e-rater"® automated essay scoring system is used operationally in the scoring of the argument and issue tasks that form the Analytical Writing measure of the "GRE"® General Test. For each of these tasks, this study explored the value added of reporting 4 trait scores for each of these 2 tasks over the total e-rater score.…

Descriptors: Scores, Computer Assisted Testing, Computer Software, Grammar

Predicting End-of-Year Achievement Test Performance: A Comparison of Assessment Methods

Peer reviewed

Kettler, Ryan J.; Elliott, Stephen N.; Kurz, Alexander; Zigmond, Naomi; Lemons, Christopher J.; Kloo, Amanda; Shrago, Jacqueline; Beddow, Peter A.; Williams, Leila; Bruen, Charles; Lupp, Lynda; Farmer, Jeanie; Mosiman, Melanie – Assessment for Effective Intervention, 2014

Motivated by the multiple-measures clause of recent federal policy regarding student eligibility for alternate assessments based on modified academic achievement standards (AA-MASs), this study examined how scores or combinations of scores from a diverse set of assessments predicted students' end-of-year proficiency status on statewide achievement…

Descriptors: Eligibility, Alternative Assessment, Academic Achievement, Predictive Validity

Rater Expertise in a Second Language Speaking Assessment: The Influence of Training and Experience

Davis, Lawrence Edward – ProQuest LLC, 2012

Speaking performance tests typically employ raters to produce scores; accordingly, variability in raters' scoring decisions has important consequences for test reliability and validity. One such source of variability is the rater's level of expertise in scoring. Therefore, it is important to understand how raters' performance is influenced by…

Descriptors: Evaluators, Expertise, Scores, Second Language Learning

Toward Automated Multi-Trait Scoring of Essays: Investigating Links among Holistic, Analytic, and Text Feature Scores

Peer reviewed

Lee, Yong-Won; Gentile, Claudia; Kantor, Robert – Applied Linguistics, 2010

The main purpose of the study was to investigate the distinctness and reliability of analytic (or multi-trait) rating dimensions and their relationships to holistic scores and "e-rater"® essay feature variables in the context of the TOEFL® computer-based test (TOEFL CBT) writing assessment. Data analyzed in the study were holistic…

Descriptors: Writing Evaluation, Writing Tests, Scoring, Essays

Effect of Immediate Feedback and Revision on Psychometric Properties of Open-Ended Sentence-Completion Items. ETS GRE Board Research Report No. 03-15. ETS RR-08-16

Peer reviewed

Attali, Yigal; Powers, Don; Hawthorn, John – ETS Research Report Series, 2008

Registered examinees for the GRE® General Test answered open-ended sentence-completion items. For half of the items, participants received immediate feedback on the correctness of their answers and up to two opportunities to revise their answers. A significant feedback-and-revision effect was found. Participants were able to correct many of their…

Descriptors: College Entrance Examinations, Graduate Study, Sentences, Psychometrics

E-Assessment within the Bologna Paradigm: Evidence from Portugal

Peer reviewed

Ferrao, Maria – Assessment & Evaluation in Higher Education, 2010

The Bologna Declaration brought reforms into higher education that imply changes in teaching methods, didactic materials and textbooks, infrastructures and laboratories, etc. Statistics and mathematics are disciplines that traditionally have the worst success rates, particularly in non-mathematics core curricula courses. This research project,…

Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Educational Assessment
