Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…
Descriptors: Accuracy, Reliability, Computational Linguistics, Standards
Beaty, Roger E.; Johnson, Dan R.; Zeitlen, Daniel C.; Forthmann, Boris – Creativity Research Journal, 2022
Semantic distance is increasingly used for automated scoring of originality on divergent thinking tasks, such as the Alternate Uses Task (AUT). Despite some psychometric support for semantic distance -- including positive correlations with human creativity ratings -- additional work is needed to optimize its reliability and validity, including…
Descriptors: Semantics, Scoring, Creative Thinking, Creativity
Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022
Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…
Descriptors: Validity, Discourse Analysis, Databases, Scoring
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
Khorashad, Behzad S.; Baron-Cohen, Simon; Roshan, Ghasem M.; Kazemian, Mojtaba; Khazai, Ladan; Aghili, Zahra; Talaei, Ali; Afkhamizadeh, Mozhgan – Journal of Autism and Developmental Disorders, 2015
The psychometric properties of the Persian "Reading the Mind in the Eyes" test were investigated, as were the predictions from the Empathizing-Systemizing theory of psychological sex differences. Adults aged 16-69 years (N = 545, female = 51.7%) completed the test online. The analysis of items showed them to be generally acceptable.…
Descriptors: Psychometrics, Theory of Mind, Gender Differences, Measures (Individuals)
Ercan, Recep; Yaman, Tugba; Demir, Selcuk Besir – Journal of Education and Training Studies, 2015
The objective of this study is to develop a valid and reliable attitude scale with sound psychometric properties that can measure secondary school students' attitudes towards human rights. The study group comprises 710 6th-, 7th-, and 8th-grade students at 4 secondary schools in the centre of Sivas. The study group…
Descriptors: Civil Rights, Attitude Measures, Factor Analysis, Construct Validity
Wang, Chun – Journal of Educational and Behavioral Statistics, 2014
Many latent traits in social sciences display a hierarchical structure, such as intelligence, cognitive ability, or personality. Usually a second-order factor is linearly related to a group of first-order factors (also called domain abilities in cognitive ability measures), and the first-order factors directly govern the actual item responses.…
Descriptors: Measurement, Accuracy, Item Response Theory, Adaptive Testing
Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016
This study investigated the application of "WriteToLearn" on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…
Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning
Harik, Polina; Baldwin, Peter; Clauser, Brian – Applied Psychological Measurement, 2013
Growing reliance on complex constructed response items has generated considerable interest in automated scoring solutions. Many of these solutions are described in the literature; however, relatively few studies have been published that "compare" automated scoring strategies. Here, comparisons are made among five strategies for…
Descriptors: Computer Assisted Testing, Automation, Scoring, Comparative Analysis
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of the argument and issue tasks that form the Analytical Writing measure of the "GRE"® General Test. This study explored the value added of reporting 4 trait scores for each of these 2 tasks over the total e-rater score.…
Descriptors: Scores, Computer Assisted Testing, Computer Software, Grammar
Kettler, Ryan J.; Elliott, Stephen N.; Kurz, Alexander; Zigmond, Naomi; Lemons, Christopher J.; Kloo, Amanda; Shrago, Jacqueline; Beddow, Peter A.; Williams, Leila; Bruen, Charles; Lupp, Lynda; Farmer, Jeanie; Mosiman, Melanie – Assessment for Effective Intervention, 2014
Motivated by the multiple-measures clause of recent federal policy regarding student eligibility for alternate assessments based on modified academic achievement standards (AA-MASs), this study examined how scores or combinations of scores from a diverse set of assessments predicted students' end-of-year proficiency status on statewide achievement…
Descriptors: Eligibility, Alternative Assessment, Academic Achievement, Predictive Validity
Davis, Lawrence Edward – ProQuest LLC, 2012
Speaking performance tests typically employ raters to produce scores; accordingly, variability in raters' scoring decisions has important consequences for test reliability and validity. One such source of variability is the rater's level of expertise in scoring. Therefore, it is important to understand how raters' performance is influenced by…
Descriptors: Evaluators, Expertise, Scores, Second Language Learning
Lee, Yong-Won; Gentile, Claudia; Kantor, Robert – Applied Linguistics, 2010
The main purpose of the study was to investigate the distinctness and reliability of analytic (or multi-trait) rating dimensions and their relationships to holistic scores and "e-rater"® essay feature variables in the context of the TOEFL® computer-based test (TOEFL CBT) writing assessment. Data analyzed in the study were holistic…
Descriptors: Writing Evaluation, Writing Tests, Scoring, Essays
Attali, Yigal; Powers, Don; Hawthorn, John – ETS Research Report Series, 2008
Registered examinees for the GRE® General Test answered open-ended sentence-completion items. For half of the items, participants received immediate feedback on the correctness of their answers and up to two opportunities to revise their answers. A significant feedback-and-revision effect was found. Participants were able to correct many of their…
Descriptors: College Entrance Examinations, Graduate Study, Sentences, Psychometrics
Ferrao, Maria – Assessment & Evaluation in Higher Education, 2010
The Bologna Declaration brought reforms into higher education that imply changes in teaching methods, didactic materials and textbooks, infrastructures and laboratories, etc. Statistics and mathematics are disciplines that traditionally have the worst success rates, particularly in non-mathematics core curricula courses. This research project,…
Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Educational Assessment