NotesFAQContact Us
Collection
Advanced
Search Tips
Showing 1 to 15 of 258 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
LaFlair, Geoffrey T.; Langenfeld, Thomas; Baig, Basim; Horie, André Kenji; Attali, Yigal; von Davier, Alina A. – Journal of Computer Assisted Learning, 2022
Background: Digital-first assessments leverage the affordances of technology in all elements of the assessment process--from design and development to score reporting and evaluation to create test taker-centric assessments. Objectives: The goal of this paper is to describe the engineering, machine learning, and psychometric processes and…
Descriptors: Computer Assisted Testing, Affordances, Scoring, Engineering
Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025
Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…
Descriptors: Value Added Models, Tests, Testing, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025
In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…
Descriptors: Automation, Grading, Computer Assisted Testing, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Julie Sriken; Bradley T. Erford; Martin F. Sherman; Kristen Watson; Heather L. Smith – Measurement and Evaluation in Counseling and Development, 2024
Psychometric characteristics of CESD-R scores were explored on a sample of 966 undergraduate students. Internal consistency ([alpha] = 0.92), external convergent and discriminant validity, and response bias were adequate to excellent. Strong measurement invariance was evident for gender and race comparisons, and the unidimensional model fit the…
Descriptors: Symptoms (Individual Disorders), Depression (Psychology), Measures (Individuals), Undergraduate Students
Peer reviewed Peer reviewed
Direct linkDirect link
Troy L. Cox; Gregory L. Thompson; Steven S. Stokes – Foreign Language Annals, 2025
This study investigated the differences between the ACTFL Oral Proficiency Interview (OPI) and the ACTFL Oral Proficiency Interview - Computer (OPIc) among Spanish learners at a U.S. university. Participants (N = 154) were randomly assigned to take both tests in a counterbalanced order to mitigate test order effects. Data were analyzed using an…
Descriptors: Oral Language, Language Proficiency, Interviews, Computer Uses in Education
Peer reviewed Peer reviewed
Direct linkDirect link
VanDerHeyden, Amanda M.; Codding, Robin; Solomon, Benjamin G. – Remedial and Special Education, 2023
Computer-based curriculum-based measurement (CBM) is a relatively common practice, but surprisingly few studies have examined the reliability of computer-based CBM. This study sought to examine the reliability of CBM administered via paper/pencil versus the computer. Twenty-one of 25 students in two third-grade classes (N = 21) participated in two…
Descriptors: Curriculum Based Assessment, Computer Assisted Testing, Test Format, Grade 3
Peer reviewed Peer reviewed
Direct linkDirect link
Jiayi Wang; Michael T. Kalkbrenner; Riley Schaner – Psychology in the Schools, 2025
Teaching is a stressful profession with a high turnover rate. Schools and related institutions need to take more action to support teachers and keep teacher stress at a manageable level. The continued research and practical effort require measures to examine teachers' stress in a briefer and accurate manner. The Teacher Stress Scale is a recently…
Descriptors: Elementary School Teachers, Secondary School Teachers, Preschool Teachers, Stress Variables
Jeff Allen; Ty Cruce – ACT Education Corp., 2025
This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…
Descriptors: College Entrance Examinations, Testing, Change, Scores
Peer reviewed Peer reviewed
Direct linkDirect link
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Peer reviewed Peer reviewed
Direct linkDirect link
Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025
This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…
Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Senel, Selma; Kutlu, Ömer – European Journal of Special Needs Education, 2018
This paper examines listening comprehension skills of visually impaired students (VIS) using computerised adaptive testing (CAT) and reader-assisted paper-pencil testing (raPPT) and student views about them. Explanatory mixed method design was used in this study. Sample is comprised of 51 VIS, in 7th and 8th grades. 9 of these students were…
Descriptors: Computer Assisted Testing, Adaptive Testing, Visual Impairments, Student Attitudes
Peer reviewed Peer reviewed
Direct linkDirect link
Pauline Frizelle; Ana Oliveira-Buckley; Tricia Biancone; Jorge Oliveira; Paul Fletcher; Dorothy V. M. Bishop; Cristina McKean – International Journal of Language & Communication Disorders, 2025
Introduction: The present study investigated English-speaking 5-9 year olds' (n = 600, normative sample) comprehension of relative, adverbial and complement clauses using the Test of Complex Syntax-Electronic (TECS-E), an online interactive assessment. with strong test-retest reliability, concurrent validity and internal consistency. Method: Using…
Descriptors: Syntax, Child Language, Young Children, Language Tests
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
Pearson product-moment correlation coefficient between item g and test score X, known as item-test or item-total correlation ("Rit"), and item-rest correlation ("Rir") are two of the most used classical estimators for item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP caused by the…
Descriptors: Correlation, Test Items, Scores, Difficulty Level
Petscher, Y.; Pentimonti, J.; Stanley, C. – National Center on Improving Literacy, 2019
Reliability is the consistency of a set of scores that are designed to measure the same thing. Reliability is a statistical property of scores that must be demonstrated rather than assumed.
Descriptors: Scores, Measurement, Test Reliability, Error Patterns
Peer reviewed Peer reviewed
Direct linkDirect link
Lenz, A. Stephen; Ault, Haley; Balkin, Richard S.; Barrio Minton, Casey; Erford, Bradley T.; Hays, Danica G.; Kim, Bryan S. K.; Li, Chi – Measurement and Evaluation in Counseling and Development, 2022
In April 2021, The Association for Assessment and Research in Counseling Executive Council commissioned a time-referenced task group to revise the Responsibilities of Users of Standardized Tests (RUST) Statement (3rd edition) published by the Association for Assessment in Counseling (AAC) in 2003. The task group developed a work plan to implement…
Descriptors: Responsibility, Standardized Tests, Counselor Training, Ethics
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  18