Publication Date
In 2025: 6
Since 2024: 18
Since 2021 (last 5 years): 99
Since 2016 (last 10 years): 243
Since 2006 (last 20 years): 511
Showing 1 to 15 of 511 results
Peer reviewed
Direct link
Tiffany Wu; Christina Weiland; Meghan McCormick; JoAnn Hsueh; Catherine Snow; Jason Sachs – Grantee Submission, 2024
The Hearts and Flowers (H&F) task is a computerized executive functioning (EF) assessment that has been used to measure EF from early childhood to adulthood. It provides data on accuracy and reaction time (RT) across three different task blocks (hearts, flowers, and mixed). However, there is a lack of consensus in the field on how to score the…
Descriptors: Scoring, Executive Function, Kindergarten, Young Children
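The abstract above notes that the Hearts and Flowers task yields accuracy and reaction time (RT) across three blocks. As a minimal illustrative sketch (not the authors' scoring code; trial data, field names, and the choice to average RT over correct trials only are assumptions), per-block summaries might be computed like this:

```python
# Illustrative sketch: summarizing Hearts and Flowers trial records
# into per-block accuracy and mean reaction time (correct trials only).
from collections import defaultdict
from statistics import mean

def summarize_hf(trials):
    """trials: list of (block, correct: bool, rt_ms: float) tuples."""
    by_block = defaultdict(list)
    for block, correct, rt in trials:
        by_block[block].append((correct, rt))
    summary = {}
    for block, recs in by_block.items():
        accuracy = sum(c for c, _ in recs) / len(recs)
        correct_rts = [rt for c, rt in recs if c]
        summary[block] = {
            "accuracy": accuracy,
            "mean_rt": mean(correct_rts) if correct_rts else None,
        }
    return summary

trials = [("hearts", True, 420.0), ("hearts", True, 480.0),
          ("flowers", True, 610.0), ("flowers", False, 550.0),
          ("mixed", True, 700.0)]
print(summarize_hf(trials))
```

The lack of consensus the abstract mentions concerns exactly these choices: whether to combine accuracy and RT, and which blocks and trials to include.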
Peer reviewed
PDF on ERIC: Download full text
Güntay Tasçi – Science Insights Education Frontiers, 2024
The present study has aimed to develop and validate a protein concept inventory (PCI) consisting of 25 multiple-choice (MC) questions to assess students' understanding of protein, which is a fundamental concept across different biology disciplines. The development process of the PCI involved a literature review to identify protein-related content,…
Descriptors: Science Instruction, Science Tests, Multiple Choice Tests, Biology
Peer reviewed
Direct link
Ferrara, Steve; Qunbar, Saed – Journal of Educational Measurement, 2022
In this article, we argue that automated scoring engines should be transparent and construct relevant--that is, as much as is currently feasible. Many current automated scoring engines cannot achieve high degrees of scoring accuracy without allowing in some features that may not be easily explained and understood and may not be obviously and…
Descriptors: Artificial Intelligence, Scoring, Essays, Automation
Peer reviewed
Direct link
Matt Homer – Advances in Health Sciences Education, 2024
Quantitative measures of systematic differences in OSCE scoring across examiners (often termed examiner stringency) can threaten the validity of examination outcomes. Such effects are usually conceptualised and operationalised based solely on checklist/domain scores in a station, and global grades are not often used in this type of analysis. In…
Descriptors: Examiners, Scoring, Validity, Cutting Scores
Peer reviewed
Direct link
Plucker, Jonathan A. – Creativity Research Journal, 2023
In 1998, Plucker and Runco provided an overview of creativity assessment, noting current issues (fluency confounds, generality vs. specificity), recent advances (predictive validity, implicit theories), and promising future directions (moving beyond divergent thinking measures, reliance on batteries of assessments, translation into practice). In…
Descriptors: Creativity, Creativity Tests, Creative Thinking, Semantics
Peer reviewed
Direct link
Michael D. Wray; Matthew R. Reynolds – Journal of Psychoeducational Assessment, 2025
The KeyMath-3 Diagnostic Assessment (KM-3) is an individually-administered math assessment used in educational placement and diagnostic decisions. It includes 10 subtests making up Basic Concepts, Operations, and Applications indexes and a "Total Test" composite that measures overall math ability. Here, covariances among subtests from…
Descriptors: Diagnostic Tests, Mathematics Tests, Arithmetic, Factor Analysis
Peer reviewed
PDF on ERIC: Download full text
Er, Zübeyde; Dinç Artut, Perihan; Bal, Ayten Pinar – Pegem Journal of Education and Instruction, 2023
The aim of this research is to develop a reliable and valid scale to determine the mathematical thinking skills of gifted students. In addition, with the developed scale, the thinking skills of gifted students were examined in terms of various variables. In this context, the research was carried out on two different study groups. The first stage of…
Descriptors: Measures (Individuals), Rating Scales, Test Construction, Construct Validity
Peer reviewed
Direct link
Beisemann, Marie; Forthmann, Boris; Bürkner, Paul-Christian; Holling, Heinz – Journal of Creative Behavior, 2020
The Remote Associates Test (RAT; Mednick, 1962; Mednick & Mednick, 1967) is a commonly employed test of creative convergent thinking. The RAT is scored dichotomously: correct answers score 1 and all other answers score 0. Based on recent research into the information processing underlying RAT performance, we argued that the…
Descriptors: Psychometrics, Scoring, Tests, Semantics
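The dichotomous scoring rule the abstract describes is simple to state precisely. A minimal sketch (the item wording and the case/whitespace normalization are assumptions, not part of the published scoring key):

```python
# Dichotomous RAT scoring as described above: the keyed answer
# scores 1, every other response scores 0.
def score_rat_item(response: str, keyed_answer: str) -> int:
    return 1 if response.strip().lower() == keyed_answer.strip().lower() else 0

# e.g., for the classic cue triad "sour / dream / ice", the keyed answer is "cream"
print(score_rat_item("cream", "cream"))   # 1
print(score_rat_item("cheese", "cream"))  # 0
```

The article's argument proceeds from the limitation visible here: all incorrect responses collapse to 0, discarding information about how semantically close a response was to the keyed answer.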
Peer reviewed
Direct link
Wise, Steven; Kuhfeld, Megan – Applied Measurement in Education, 2021
Effort-moderated (E-M) scoring is intended to estimate how well a disengaged test taker would have performed had they been fully engaged. It accomplishes this adjustment by excluding disengaged responses from scoring and estimating performance from the remaining responses. The scoring method, however, assumes that the remaining responses are not…
Descriptors: Scoring, Achievement Tests, Identification, Validity
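The adjustment the abstract describes — drop disengaged responses, then estimate performance from what remains — can be sketched in a few lines. This is an illustrative simplification, not Wise and Kuhfeld's implementation: the response-time threshold, the tuple format, and the use of proportion-correct rather than an IRT-based estimate are all assumptions.

```python
# Sketch of effort-moderated (E-M) scoring: responses faster than an
# assumed response-time threshold are treated as disengaged and
# excluded; the score is estimated from the remaining responses.
def effort_moderated_score(responses, rt_threshold=3.0):
    """responses: list of (correct: bool, response_time_seconds: float)."""
    engaged = [(c, rt) for c, rt in responses if rt >= rt_threshold]
    if not engaged:
        return None  # no engaged responses left to score
    return sum(c for c, _ in engaged) / len(engaged)

responses = [(True, 12.4), (False, 0.8), (False, 9.0), (True, 15.0)]
print(effort_moderated_score(responses))  # scores only the 3 engaged responses
```

The assumption the article probes is visible in the last line: the method treats the surviving (engaged) responses as representative of what the test taker could do, which may not hold.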
Peer reviewed
Direct link
Aloisi, Cesare – European Journal of Education, 2023
This article considers the challenges of using artificial intelligence (AI) and machine learning (ML) to assist high-stakes standardised assessment. It focuses on the detrimental effect that even state-of-the-art AI and ML systems could have on the validity of national exams of secondary education, and how lower validity would negatively affect…
Descriptors: Standardized Tests, Test Validity, Credibility, Algorithms
Selcuk Acar; Kelly Berthiaume; Katalin Grajzel; Denis Dumas; Charles Flemister; Peter Organisciak – Gifted Child Quarterly, 2023
In this study, we applied different text-mining methods to the originality scoring of the Unusual Uses Test (UUT) and Just Suppose Test (JST) from the Torrance Tests of Creative Thinking (TTCT)--Verbal. Responses from 102 and 123 participants who completed Form A and Form B, respectively, were scored using three different text-mining methods. The…
Descriptors: Creative Thinking, Creativity Tests, Scoring, Automation
Peer reviewed
Direct link
Shermis, Mark D. – Journal of Educational Measurement, 2022
One of the challenges of discussing validity arguments for machine scoring of essays centers on the absence of a commonly held definition and theory of good writing. At best, the algorithms attempt to measure select attributes of writing and calibrate them against human ratings with the goal of accurate prediction of scores for new essays.…
Descriptors: Scoring, Essays, Validity, Writing Evaluation
Peer reviewed
Direct link
Marcos Jiménez; María Zapata-Cáceres; Marcos Román-González; Gregorio Robles; Jesús Moreno-León; Estefanía Martín-Barroso – Journal of Science Education and Technology, 2024
Computational thinking (CT) is a multidimensional term that encompasses a wide variety of problem-solving skills related to the field of computer science. Unfortunately, standardized, valid, and reliable methods to assess CT skills in preschool children are lacking, compromising the reliability of the results reported in CT interventions. To…
Descriptors: Computation, Thinking Skills, Student Evaluation, Preschool Children
Peer reviewed
PDF on ERIC: Download full text
Deborah Oluwadele; Yashik Singh; Timothy Adeliyi – Electronic Journal of e-Learning, 2024
Validation is needed for any newly developed model or framework, and it requires several real-life applications. The investment made into e-learning in medical education is daunting, as is the expectation for a positive return on investment. The medical education domain requires data-wise implementation of e-learning as the debate continues…
Descriptors: Electronic Learning, Evaluation Methods, Medical Education, Sustainability
Peer reviewed
Direct link
Huawei, Shi; Aryadoust, Vahid – Education and Information Technologies, 2023
Automated writing evaluation (AWE) systems are developed based on interdisciplinary research and technological advances such as natural language processing, computer sciences, and latent semantic analysis. Despite a steady increase in research publications in this area, the results of AWE investigations are often mixed, and their validity may be…
Descriptors: Writing Evaluation, Writing Tests, Computer Assisted Testing, Automation