Showing all 10 results
Peer reviewed
Jonas Flodén – British Educational Research Journal, 2025
This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…
Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring
Peer reviewed
PDF on ERIC
Choi, Ikkyu; Hao, Jiangang; Deane, Paul; Zhang, Mo – ETS Research Report Series, 2021
"Biometrics" are physical or behavioral human characteristics that can be used to identify a person. It is widely known that keystroke or typing dynamics for short, fixed texts (e.g., passwords) could serve as a behavioral biometric. In this study, we investigate whether keystroke data from essay responses can lead to a reliable…
Descriptors: Accuracy, High Stakes Tests, Writing Tests, Benchmarking
Peer reviewed
Yarnell, Jordy B.; Pfeiffer, Steven I. – Journal of Psychoeducational Assessment, 2015
The present study examined the psychometric equivalence of administering a computer-based version of the Gifted Rating Scale (GRS) compared with the traditional paper-and-pencil GRS-School Form (GRS-S). The GRS-S is a teacher-completed rating scale used in gifted assessment. The GRS-Electronic Form provides an alternative method of administering…
Descriptors: Gifted, Psychometrics, Rating Scales, Computer Assisted Testing
Peer reviewed
PDF on ERIC
Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016
This study investigated the application of "WriteToLearn" on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…
Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning
Peer reviewed
Thomas, Michael L. – Assessment, 2011
Item response theory (IRT) and related latent variable models represent modern psychometric theory, the successor to classical test theory in psychological assessment. Although IRT has become prevalent in the measurement of ability and achievement, its contributions to clinical domains have been less extensive. Applications of IRT to clinical…
Descriptors: Item Response Theory, Psychological Evaluation, Reliability, Error of Measurement
Peer reviewed
Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M. – International Journal of Testing, 2010
Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…
Descriptors: Monte Carlo Methods, Simulation, Computer Assisted Testing, Adaptive Testing
Peer reviewed
Ferrao, Maria – Assessment & Evaluation in Higher Education, 2010
The Bologna Declaration brought reforms into higher education that imply changes in teaching methods, didactic materials and textbooks, infrastructures and laboratories, etc. Statistics and mathematics are disciplines that traditionally have the worst success rates, particularly in non-mathematics core curricula courses. This research project,…
Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Educational Assessment
Peer reviewed
Scherbaum, Charles A.; Cohen-Charash, Yochi; Kern, Michael J. – Educational and Psychological Measurement, 2006
General self-efficacy (GSE), individuals' belief in their ability to perform well in a variety of situations, has been the subject of increasing research attention. However, the psychometric properties (e.g., reliability, validity) associated with the scores on GSE measures have been criticized, which has hindered efforts to further establish the…
Descriptors: Self Efficacy, Measures (Individuals), Psychometrics, Reliability
Ree, Malcom James; Jensen, Harald E. – 1980
By means of computer simulation of test responses, the reliability of item analysis data and the accuracy of equating were examined for hypothetical samples of 250, 500, 1000, and 2000 subjects for two tests with 20 equating items plus 60 additional items on the same scale. Birnbaum's three-parameter logistic model was used for the simulation. The…
Descriptors: Computer Assisted Testing, Equated Scores, Error of Measurement, Item Analysis
Brick, J. Michael; West, Jerry – 1992
In the spring of 1991 the first full-scale National Household Education Survey (NHES:91) was conducted for the National Center for Education Statistics (NCES). The NHES:91 was a national random digit dial telephone survey of about 14,000 parents of 3- to 8-year-old children concerning the educational experiences of young children. A reinterview…
Descriptors: Computer Assisted Testing, Early Childhood Education, Educational Attitudes, Educational Experience