Publication Date
In 2025 | 1 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 9 |
Since 2016 (last 10 years) | 24 |
Since 2006 (last 20 years) | 127 |
Descriptor
Evaluation Methods | 177 |
Psychometrics | 177 |
Educational Assessment | 54 |
Student Evaluation | 50 |
Measurement Techniques | 49 |
Test Construction | 49 |
Testing | 47 |
Measurement | 45 |
Educational Testing | 40 |
Computer Assisted Testing | 39 |
Test Validity | 34 |
Author
Rupp, Andre A. | 3 |
Thurlow, Martha | 3 |
Bielinski, John | 2 |
Cui, Ying | 2 |
Dunne, Michael P. | 2 |
Engelhard, George, Jr. | 2 |
Ferrara, Steve | 2 |
Frey, Andreas | 2 |
Holling, Heinz | 2 |
Jiao, Hong | 2 |
Kato, Pamela M. | 2 |
Audience
Practitioners | 5 |
Researchers | 4 |
Counselors | 2 |
Students | 1 |
Location
United Kingdom | 7 |
Australia | 6 |
United States | 4 |
Germany | 3 |
United Kingdom (England) | 3 |
Connecticut | 2 |
Florida | 2 |
Massachusetts | 2 |
Netherlands | 2 |
Spain | 2 |
Taiwan | 2 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 5 |
Education of the Handicapped… | 1 |
Individuals with Disabilities… | 1 |
Morris, Scott B.; Bass, Michael; Howard, Elizabeth; Neapolitan, Richard E. – International Journal of Testing, 2020
The standard error (SE) stopping rule, which terminates a computer adaptive test (CAT) when the "SE" is less than a threshold, is effective when there are informative questions for all trait levels. However, in domains such as patient-reported outcomes, the items in a bank might all target one end of the trait continuum (e.g., negative…
Descriptors: Computer Assisted Testing, Adaptive Testing, Item Banks, Item Response Theory
Weicong Lyu – ProQuest LLC, 2023
Item response theory (IRT) is currently the dominant methodological paradigm in educational and psychological measurement. IRT models are based on assumptions about the relationship between latent traits and observed responses, so the accuracy of the methodology depends heavily on the reasonableness of these assumptions. This dissertation consists…
Descriptors: Item Response Theory, Educational Assessment, Psychological Testing, Psychometrics
Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Elizabeth Talbott; Andres De Los Reyes; Devin M. Kearns; Jeannette Mancilla-Martinez; Mo Wang – Exceptional Children, 2023
Evidence-based assessment (EBA) requires that investigators employ scientific theories and research findings to guide decisions about what domains to measure, how and when to measure them, and how to make decisions and interpret results. To implement EBA, investigators need high-quality assessment tools along with evidence-based processes. We…
Descriptors: Evidence Based Practice, Evaluation Methods, Special Education, Educational Research
Practices in Instrument Use and Development in "Chemistry Education Research and Practice" 2010-2021
Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023
Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, "etc.") of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…
Descriptors: Chemistry, Periodicals, Journal Articles, Science Education
Blackshear, Tara B. – International Journal of Kinesiology in Higher Education, 2022
Many physical education teacher education (PETE) programs have adopted FITNESSGRAM® as the preferred method to assess teacher candidate fitness levels. The rationale, however, is unclear. This article presents fitness testing results of PETE candidates using FITNESSGRAM® with the aim to evaluate its appropriateness. 86 PETE students participated…
Descriptors: Physical Education Teachers, Teacher Education Programs, Physical Fitness, Preservice Teachers
Hsueh, JoAnn; Portilla, Ximena; McCormick, Meghan; Balu, Rekha; Najafi, Behnosh – MDRC, 2022
The Measures for Early Success Initiative aims to reimagine the landscape of early learning assessments for the millions of 3- to 5-year-olds enrolled in Pre-K, so that more equitable data can be applied to meaningfully support and strengthen early learning experiences for all young children. This document outlines design parameters for child…
Descriptors: Early Childhood Education, Preschool Children, Student Evaluation, Child Development
Gliksman, Yarden; Berebbi, Shir; Hershman, Ronen; Henik, Avishai – Applied Cognitive Psychology, 2022
Math fluency (MF) is the ability to quickly and accurately solve simple math exercises. Proficiency in MF is one of the building blocks of arithmetic achievement during school. However, so far only paper-and-pencil tests have been used to assess MF. In the current study, we present the BGU-MF (Ben-Gurion University Math Fluency) test, a new computerized…
Descriptors: Foreign Countries, Mathematics Skills, Mathematics Tests, Computer Assisted Testing
Ng, Zi Jia; Willner, Cynthia J.; Mannweiler, Morgan D.; Hoffmann, Jessica D.; Bailey, Craig S.; Cipriano, Christina – Educational Psychology Review, 2022
Many emotion regulation assessments have been developed for research purposes, but few are frequently used in schools despite the rapid growth of social and emotional learning programs with an explicit focus on emotion regulation in schools. This systematic review provides an overview of emotion regulation assessments that have been utilized with…
Descriptors: Emotional Response, Self Control, Elementary School Students, Secondary School Students
Heritage, Margaret; Kingston, Neal M. – Journal of Educational Measurement, 2019
Classroom assessment and large-scale assessment have, for the most part, existed in mutual isolation. Some experts have felt this is for the best and others have been concerned that the schism limits the potential contribution of both forms of assessment. Margaret Heritage has long been a champion of best practices in classroom assessment. Neal…
Descriptors: Measurement, Psychometrics, Context Effect, Classroom Environment
Pérez, Jorge; Vizcarro, Carmen; García, Javier; Bermúdez, Aurelio; Cobos, Ruth – IEEE Transactions on Education, 2017
In the context of higher education, a competence may be understood as the combination of skills, knowledge, attitudes, values, and abilities that underpin effective and/or superior performance in a professional area. The aim of the work reported here was to design a set of procedures to assess a transferable competence, i.e., problem solving, that…
Descriptors: Problem Solving, Computer Science Education, Minimum Competency Testing, Competency Based Education
de Klerk, Sebastiaan; Kato, Pamela M. – Journal of Applied Testing Technology, 2017
Game-based assessments will most likely be an increasing part of testing programs in future generations because they provide promising possibilities for more valid and reliable measurement of students' skills as compared to the traditional methods of assessment like paper-and-pencil tests or performance-based assessments. The current status of…
Descriptors: Futures (of Society), Educational Games, Testing, Educational Benefits
Kato, Pamela M.; de Klerk, Sebastiaan – Journal of Applied Testing Technology, 2017
Serious games are increasingly being explored for use as assessment tools in broad domains. Drawing from research in these domains, we present important advantages and challenges that arise when using games for assessment. In light of this context and as an introduction to this special issue on Serious Games and Assessments, we introduce the…
Descriptors: Evaluation Methods, Formative Evaluation, Design, Educational Games
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage