Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Practices in Instrument Use and Development in "Chemistry Education Research and Practice" 2010-2021
Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023
Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, etc.) of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…
Descriptors: Chemistry, Periodicals, Journal Articles, Science Education
Blackshear, Tara B. – International Journal of Kinesiology in Higher Education, 2022
Many physical education teacher education (PETE) programs have adopted FITNESSGRAM® as the preferred method to assess teacher candidate fitness levels. The rationale, however, is unclear. This article presents fitness testing results of PETE candidates using FITNESSGRAM® with the aim of evaluating its appropriateness. Eighty-six PETE students participated…
Descriptors: Physical Education Teachers, Teacher Education Programs, Physical Fitness, Preservice Teachers
Hsueh, JoAnn; Portilla, Ximena; McCormick, Meghan; Balu, Rekha; Najafi, Behnosh – MDRC, 2022
The Measures for Early Success Initiative aims to reimagine the landscape of early learning assessments for the millions of 3- to 5-year-olds enrolled in Pre-K, so that more equitable data can be applied to meaningfully support and strengthen early learning experiences for all young children. This document outlines design parameters for child…
Descriptors: Early Childhood Education, Preschool Children, Student Evaluation, Child Development
Ng, Zi Jia; Willner, Cynthia J.; Mannweiler, Morgan D.; Hoffmann, Jessica D.; Bailey, Craig S.; Cipriano, Christina – Educational Psychology Review, 2022
Many emotion regulation assessments have been developed for research purposes, but few are frequently used in schools despite the rapid growth of social and emotional learning programs with an explicit focus on emotion regulation in schools. This systematic review provides an overview of emotion regulation assessments that have been utilized with…
Descriptors: Emotional Response, Self Control, Elementary School Students, Secondary School Students
Heritage, Margaret; Kingston, Neal M. – Journal of Educational Measurement, 2019
Classroom assessment and large-scale assessment have, for the most part, existed in mutual isolation. Some experts have felt this is for the best and others have been concerned that the schism limits the potential contribution of both forms of assessment. Margaret Heritage has long been a champion of best practices in classroom assessment. Neal…
Descriptors: Measurement, Psychometrics, Context Effect, Classroom Environment
de Klerk, Sebastiaan; Kato, Pamela M. – Journal of Applied Testing Technology, 2017
Game-based assessments will most likely be an increasing part of testing programs in future generations because they provide promising possibilities for more valid and reliable measurement of students' skills as compared to the traditional methods of assessment like paper-and-pencil tests or performance-based assessments. The current status of…
Descriptors: Futures (of Society), Educational Games, Testing, Educational Benefits
Dumas, Denis G.; McNeish, Daniel M. – Educational Researcher, 2017
Single-timepoint educational measurement practices are capable of assessing student ability at the time of testing but are not designed to be informative of student capacity for developing in any particular academic domain, despite commonly being used in such a manner. For this reason, such measurement practice systematically underestimates the…
Descriptors: Measurement Techniques, Student Evaluation, Evaluation Methods, Testing
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Haro, Elizabeth K.; Haro, Luis S. – Journal of Chemical Education, 2014
The multiple-choice question (MCQ) is the foundation of knowledge assessment in K-12, higher education, and standardized entrance exams (including the GRE, MCAT, and DAT). However, standard MCQ exams are limited with respect to the types of questions that can be asked when there are only five choices. MCQs offering additional choices more…
Descriptors: Multiple Choice Tests, Coding, Scoring Rubrics, Test Scoring Machines
Royal, Kenneth D.; Gilliland, Kurt O.; Kernick, Edward T. – Anatomical Sciences Education, 2014
Any examination that involves moderate to high stakes implications for examinees should be psychometrically sound and legally defensible. Currently, there are two broad and competing families of test theories that are used to score examination data. The majority of instructors outside the high-stakes testing arena rely on classical test theory…
Descriptors: Item Response Theory, Scoring, Evaluation Methods, Anatomy
Adams, Wendy K.; Wieman, Carl E. – International Journal of Science Education, 2011
This paper describes the process for creating and validating an assessment test that measures the effectiveness of instruction by probing how well that instruction causes students in a class to think like experts about specific areas of science. The design principles and process are laid out and it is shown how these align with professional…
Descriptors: Expertise, Psychological Testing, Disabilities, Psychometrics
Kubinger, Klaus D.; Rasch, Dieter; Yanagida, Takuya – Educational Research and Evaluation, 2011
Though calibration of an achievement test within a psychological and educational context is very often carried out with the Rasch model, data sampling is rarely designed according to statistical foundations. However, Kubinger, Rasch, and Yanagida (2009) recently suggested an approach for the determination of sample size according to a given Type I and…
Descriptors: Sample Size, Simulation, Testing, Achievement Tests
Engelhard, George, Jr.; Perkins, Aminah F. – Measurement: Interdisciplinary Research and Perspectives, 2011
Humphry (this issue) has written a thought-provoking piece on the interpretation of item discrimination parameters as scale units in item response theory. One of the key features of his work is the description of an item response theory (IRT) model that he calls the logistic measurement function that combines aspects of two traditions in IRT that…
Descriptors: Foreign Countries, Social Sciences, Item Response Theory, Testing
Matson, Johnny L.; Wilkins, Jonathan – Research in Developmental Disabilities: A Multidisciplinary Journal, 2009
Social skill excesses and deficits have garnered considerable attention from researchers and clinicians over the last three decades. This trend is undoubtedly due to the central role these problems play in psychopathology and the general adjustment of children of all ages. Not surprisingly, these concerns and attention to such problems have also…
Descriptors: Testing, Psychopathology, Psychometrics, Interpersonal Competence