ERIC - Search Results

Showing 1 to 15 of 71 results

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Online Administration of the Test of Narrative Language--Second Edition: Psychometrics and Considerations for Remote Assessment

Peer reviewed
PDF on ERIC

Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Grantee Submission, 2022

Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and considerable absence of psychometric studies of spoken language assessments administered online.…

Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments

Online Administration of the Test of Narrative Language--Second Edition: Psychometrics and Considerations for Remote Assessment

Peer reviewed
PDF on ERIC

Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Language, Speech, and Hearing Services in Schools, 2022

Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments

The Influence of Rater Effects in Training Sets on the Psychometric Quality of Automated Scoring for Writing Assessments

Peer reviewed

Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018

Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…

Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring

Psychometric Properties of Online Adolescent Anger Instrument

Peer reviewed
PDF on ERIC

Ahmad, Nor Shafrin; Zaharudin, Rozniza; Khairani, Ahmad Zamri – International Journal of Educational Methodology, 2022

Anger is a topic that requires intervention from teachers, counsellors, psychologists, parents, and all communities. The expressions of anger are subjective and sometimes hard to identify. Thus, anger should be measured more objectively, while the expressions need to be examined closely. The purpose of this study is to provide valid confirmation…

Descriptors: Psychological Patterns, Test Validity, Psychometrics, Adolescents

Meta-Analysis of Inter-Rater Agreement and Discrepancy Between Human and Automated English Essay Scoring

Peer reviewed
PDF on ERIC

Jiyeo Yun – English Teaching, 2023

Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…

Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring

Semantic Distance and the Alternate Uses Task: Recommendations for Reliable Automated Assessment of Originality

Peer reviewed

Beaty, Roger E.; Johnson, Dan R.; Zeitlen, Daniel C.; Forthmann, Boris – Creativity Research Journal, 2022

Semantic distance is increasingly used for automated scoring of originality on divergent thinking tasks, such as the Alternate Uses Task (AUT). Despite some psychometric support for semantic distance -- including positive correlations with human creativity ratings -- additional work is needed to optimize its reliability and validity, including…

Descriptors: Semantics, Scoring, Creative Thinking, Creativity

Validation of an Automated Procedure for Calculating Core Lexicon from Transcripts

Peer reviewed

Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022

Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…

Descriptors: Validity, Discourse Analysis, Databases, Scoring

Validating Human and Automated Scoring of Essays against "True" Scores

Peer reviewed

Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018

In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…

Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing

Development of the English Listening and Reading Computerized Revised Token Test into Cantonese: Validity, Reliability, and Sensitivity/Specificity in People with Aphasia and Healthy Controls

Peer reviewed

Bakhtiar, Mehdi; Wong, Min Ney; Tsui, Emily Ka Yin; McNeil, Malcolm R. – Journal of Speech, Language, and Hearing Research, 2020

Purpose: This study reports the psychometric development of the Cantonese versions of the English Computerized Revised Token Test (CRTT) for persons with aphasia (PWAs) and healthy controls (HCs). Method: The English CRTT was translated into standard Chinese for the Reading--Word Fade version (CRTT-R-WF-Cantonese) and into formal…

Descriptors: Psychometrics, Sino Tibetan Languages, Computer Assisted Testing, Aphasia

Tablets Instead of Paper-Based Tests for Young Children? Comparability between Paper and Tablet Versions of the Mathematical Heidelberger Rechen Test 1-4

Peer reviewed

Hassler Hallstedt, Martin; Ghaderi, Ata – Educational Assessment, 2018

Tablets can be used to facilitate systematic testing of academic skills. Yet, when using validated paper tests on tablet, comparability between the mediums must be established. Comparability between a tablet and a paper version of a basic math skills test (HRT: Heidelberger Rechen Test 1-4) was investigated. Five samples with second and third…

Descriptors: Handheld Devices, Scores, Test Format, Computer Assisted Testing

Validating an Online Assessment of Developmental Spelling in Grades Five through Eight

Peer reviewed

Gehsmann, Kristin; Spichtig, Alexandra; Tousley, Elias – Literacy Research: Theory, Method, and Practice, 2017

Assessments of developmental spelling, also called spelling inventories, are commonly used to understand students' orthographic knowledge (i.e., knowledge of how written words work) and to determine their stages of spelling and reading development. The information generated by these assessments is used to inform teachers' grouping practices and…

Descriptors: Spelling, Computer Assisted Testing, Grouping (Instructional Purposes), Teaching Methods

Subjective Mental Health, Peer Relations, Family, and School Environment in Adolescents with Intellectual Developmental Disorder: A First Report of a New Questionnaire Administered on Tablet PCs

Peer reviewed

Boström, Petra; Johnels, Jakob Åsberg; Thorson, Maria; Broberg, Malin – Journal of Mental Health Research in Intellectual Disabilities, 2016

Few studies have explored the subjective mental health of adolescents with intellectual disabilities, while proxy ratings indicate an overrepresentation of mental health problems. The present study reports on the design and an initial empirical evaluation of the Well-being in Special Education Questionnaire (WellSEQ). Questions, response scales,…

Descriptors: Mental Health, Peer Relationship, Family Environment, Educational Environment

Reliability and Validity of the Computerized Revised Token Test: Comparison of Reading and Listening Versions in Persons with and without Aphasia

Peer reviewed

McNeil, Malcolm R.; Pratt, Sheila R.; Szuminsky, Neil; Sung, Jee Eun; Fossett, Tepanta R. D.; Fassbinder, Wiltrud; Lim, Kyoung Yuel – Journal of Speech, Language, and Hearing Research, 2015

Purpose: This study assessed the reliability and validity of intermodality associations and differences in persons with aphasia (PWA) and healthy controls (HC) on a computerized listening and 3 reading versions of the Revised Token Test (RTT; McNeil & Prescott, 1978). Method: Thirty PWA and 30 HC completed the test versions, including a…

Descriptors: Aphasia, Test Validity, Test Reliability, Scores

The Influence of Training and Experience on Rater Performance in Scoring Spoken Language

Peer reviewed

Davis, Larry – Language Testing, 2016

Two factors were investigated that are thought to contribute to consistency in rater scoring judgments: rater training and experience in scoring. Also considered were the relative effects of scoring rubrics and exemplars on rater performance. Experienced teachers of English (N = 20) scored recorded responses from the TOEFL iBT speaking test prior…

Descriptors: Evaluators, Oral Language, Scores, Language Tests
