NotesFAQContact Us
Collection
Advanced
Search Tips
Audience
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 17 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…
Descriptors: Accuracy, Reliability, Computational Linguistics, Standards
Peer reviewed Peer reviewed
Direct linkDirect link
Beaty, Roger E.; Johnson, Dan R.; Zeitlen, Daniel C.; Forthmann, Boris – Creativity Research Journal, 2022
Semantic distance is increasingly used for automated scoring of originality on divergent thinking tasks, such as the Alternate Uses Task (AUT). Despite some psychometric support for semantic distance -- including positive correlations with human creativity ratings -- additional work is needed to optimize its reliability and validity, including…
Descriptors: Semantics, Scoring, Creative Thinking, Creativity
Peer reviewed Peer reviewed
Direct linkDirect link
Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022
Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…
Descriptors: Validity, Discourse Analysis, Databases, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Khorashad, Behzad S.; Baron-Cohen, Simon; Roshan, Ghasem M.; Kazemian, Mojtaba; Khazai, Ladan; Aghili, Zahra; Talaei, Ali; Afkhamizadeh, Mozhgan – Journal of Autism and Developmental Disorders, 2015
The psychometric properties of the Persian "Reading the Mind in the Eyes" test were investigated, so were the predictions from the Empathizing-Systemizing theory of psychological sex differences. Adults aged 16-69 years old (N = 545, female = 51.7%) completed the test online. The analysis of items showed them to be generally acceptable.…
Descriptors: Psychometrics, Theory of Mind, Gender Differences, Measures (Individuals)
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Ercan, Recep; Yaman, Tugba; Demir, Selcuk Besir – Journal of Education and Training Studies, 2015
The objective of this study is to develop a valid and reliable attitude scale having quality psychometric features that can measure secondary school students' attitudes towards human rights. The study group of the research is comprised by 710 6th, 7th and 8th grade students who study at 4 secondary schools in the centre of Sivas. The study group…
Descriptors: Civil Rights, Attitude Measures, Factor Analysis, Construct Validity
Peer reviewed Peer reviewed
Direct linkDirect link
Wang, Chun – Journal of Educational and Behavioral Statistics, 2014
Many latent traits in social sciences display a hierarchical structure, such as intelligence, cognitive ability, or personality. Usually a second-order factor is linearly related to a group of first-order factors (also called domain abilities in cognitive ability measures), and the first-order factors directly govern the actual item responses.…
Descriptors: Measurement, Accuracy, Item Response Theory, Adaptive Testing
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016
This study investigated the application of "WriteToLearn" on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…
Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning
Peer reviewed Peer reviewed
Direct linkDirect link
Harik, Polina; Baldwin, Peter; Clauser, Brian – Applied Psychological Measurement, 2013
Growing reliance on complex constructed response items has generated considerable interest in automated scoring solutions. Many of these solutions are described in the literature; however, relatively few studies have been published that "compare" automated scoring strategies. Here, comparisons are made among five strategies for…
Descriptors: Computer Assisted Testing, Automation, Scoring, Comparative Analysis
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of the argument and issue tasks that form the Analytical Writing measure of the "GRE"® General Test. For each of these tasks, this study explored the value added of reporting 4 trait scores for each of these 2 tasks over the total e-rater score.…
Descriptors: Scores, Computer Assisted Testing, Computer Software, Grammar
Peer reviewed Peer reviewed
Direct linkDirect link
Kettler, Ryan J.; Elliott, Stephen N.; Kurz, Alexander; Zigmond, Naomi; Lemons, Christopher J.; Kloo, Amanda; Shrago, Jacqueline; Beddow, Peter A.; Williams, Leila; Bruen, Charles; Lupp, Lynda; Farmer, Jeanie; Mosiman, Melanie – Assessment for Effective Intervention, 2014
Motivated by the multiple-measures clause of recent federal policy regarding student eligibility for alternate assessments based on modified academic achievement standards (AA-MASs), this study examined how scores or combinations of scores from a diverse set of assessments predicted students' end-of-year proficiency status on statewide achievement…
Descriptors: Eligibility, Alternative Assessment, Academic Achievement, Predictive Validity
Davis, Lawrence Edward – ProQuest LLC, 2012
Speaking performance tests typically employ raters to produce scores; accordingly, variability in raters' scoring decisions has important consequences for test reliability and validity. One such source of variability is the rater's level of expertise in scoring. Therefore, it is important to understand how raters' performance is influenced by…
Descriptors: Evaluators, Expertise, Scores, Second Language Learning
Peer reviewed Peer reviewed
Direct linkDirect link
Lee, Yong-Won; Gentile, Claudia; Kantor, Robert – Applied Linguistics, 2010
The main purpose of the study was to investigate the distinctness and reliability of analytic (or multi-trait) rating dimensions and their relationships to holistic scores and "e-rater"[R] essay feature variables in the context of the TOEFL[R] computer-based test (TOEFL CBT) writing assessment. Data analyzed in the study were holistic…
Descriptors: Writing Evaluation, Writing Tests, Scoring, Essays
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Attali, Yigal; Powers, Don; Hawthorn, John – ETS Research Report Series, 2008
Registered examinees for the GRE® General Test answered open-ended sentence-completion items. For half of the items, participants received immediate feedback on the correctness of their answers and up to two opportunities to revise their answers. A significant feedback-and-revision effect was found. Participants were able to correct many of their…
Descriptors: College Entrance Examinations, Graduate Study, Sentences, Psychometrics
Peer reviewed Peer reviewed
Direct linkDirect link
Ferrao, Maria – Assessment & Evaluation in Higher Education, 2010
The Bologna Declaration brought reforms into higher education that imply changes in teaching methods, didactic materials and textbooks, infrastructures and laboratories, etc. Statistics and mathematics are disciplines that traditionally have the worst success rates, particularly in non-mathematics core curricula courses. This research project,…
Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Educational Assessment
Previous Page | Next Page »
Pages: 1  |  2