Publication Date
In 2025: 257
Since 2024: 1178
Since 2021 (last 5 years): 4984
Since 2016 (last 10 years): 13492
Since 2006 (last 20 years): 29421
Audience
Policymakers: 491
Practitioners: 487
Researchers: 347
Teachers: 332
Administrators: 187
Parents: 68
Community: 67
Students: 44
Counselors: 33
Media Staff: 7
Support Staff: 3
Location
Turkey: 1150
Texas: 781
California: 733
Florida: 593
United States: 562
Canada: 506
Australia: 498
China: 470
North Carolina: 437
New York: 381
United Kingdom: 370
What Works Clearinghouse Rating
Meets WWC Standards without Reservations: 65
Meets WWC Standards with or without Reservations: 112
Does Not Meet WWC Standards: 116
Kim, Stella Y. – Educational Measurement: Issues and Practice, 2022
In this digital ITEMS module, Dr. Stella Kim provides an overview of multidimensional item response theory (MIRT) equating. Traditional unidimensional item response theory (IRT) equating methods impose the sometimes untenable restriction that the data reflect only a single ability. This module discusses potential sources of multidimensionality…
Descriptors: Item Response Theory, Models, Equated Scores, Evaluation Methods
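The unidimensional IRT equating that the module contrasts with MIRT can be illustrated by mean-sigma linking, a standard method for placing two test forms' parameters on one scale via common anchor items. A minimal sketch (the function name and anchor parameters are hypothetical, not from the module):

```python
import numpy as np

def mean_sigma_link(a_old, b_old, a_new, b_new):
    """Mean-sigma linking: find constants A, B mapping the new form's
    theta scale onto the old form's, using anchor-item difficulties.
    a_*, b_*: 2PL discrimination and difficulty of the common items."""
    b_old, b_new = np.asarray(b_old, float), np.asarray(b_new, float)
    A = np.std(b_old, ddof=1) / np.std(b_new, ddof=1)
    B = np.mean(b_old) - A * np.mean(b_new)
    # Transform the new form's parameters onto the old scale
    a_linked = np.asarray(a_new, float) / A
    b_linked = A * b_new + B
    return A, B, a_linked, b_linked
```

When the two forms' anchor parameters already agree, the method returns the identity transformation (A = 1, B = 0), a quick sanity check for an implementation.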
Johnson, Matthew S.; Liu, Xiang; McCaffrey, Daniel F. – Journal of Educational Measurement, 2022
With the increasing use of automated scores in operational testing settings comes the need to understand the ways in which they can yield biased and unfair results. In this paper, we provide a brief survey of some of the ways in which the predictive methods used in automated scoring can lead to biased, and thus unfair automated scores. After…
Descriptors: Psychometrics, Measurement Techniques, Bias, Automation
Sadhwani, Anjali; Wheeler, Anne; Gwaltney, Angela; Peters, Sarika U.; Barbieri-Welge, Rene L.; Horowitz, Lucia T.; Noll, Lisa M.; Hundley, Rachel J.; Bird, Lynne M.; Tan, Wen-Hann – Journal of Autism and Developmental Disorders, 2023
We describe the development of 236 children with Angelman syndrome (AS) using the Bayley Scales of Infant and Toddler Development, Third Edition. Multilevel linear mixed modeling approaches were used to explore differences between molecular subtypes and over time. Individuals with AS continue to make slow gains in development through at least age…
Descriptors: Child Development, Developmental Disabilities, Psychomotor Skills, Infants
Moore, C. Missy; Mullen, Patrick R.; Hinchey, Kaitlin J.; Lambie, Glenn W. – Counselor Education and Supervision, 2023
Our study examines the differential item functioning of the Counselor Competencies Scale--Revised (CCS-R) scores due to respondents' gender, the type of evaluation, and a combination of these two variables using a large sample (N = 1614). Implications of the findings are offered to inform counselor educators and supervisors using the CCS-R and…
Descriptors: Item Analysis, Measures (Individuals), Counselors, Competence
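Differential item functioning of the kind this study examines is often tested with the logistic-regression procedure of Swaminathan and Rogers (1990): condition on the matching variable (e.g., total score), then ask whether adding a group indicator improves fit. A sketch with simulated dichotomous data (the data and effect sizes are illustrative, not the CCS-R):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: one dichotomized item, a matching total score,
# and a binary group indicator (e.g., gender) with built-in uniform DIF.
rng = np.random.default_rng(2)
n = 400
total = rng.normal(size=n)
group = rng.integers(0, 2, size=n)
logit = 0.8 * total + 0.5 * group            # 0.5 = injected uniform DIF
item = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Compare deviances of nested logistic models (C large ~ unpenalized MLE)
X0 = total.reshape(-1, 1)                    # matching variable only
X1 = np.column_stack([total, group])         # + group term
p0 = LogisticRegression(C=1e6).fit(X0, item).predict_proba(X0)
p1 = LogisticRegression(C=1e6).fit(X1, item).predict_proba(X1)
dev0 = -2 * np.sum(np.log(p0[np.arange(n), item]))
dev1 = -2 * np.sum(np.log(p1[np.arange(n), item]))
lr_chi2 = dev0 - dev1   # ~ chi-square(1) under no uniform DIF
```

A large `lr_chi2` flags uniform DIF; an interaction term `total * group` would extend the check to nonuniform DIF.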
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2023
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the l_z and l*_z person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through…
Descriptors: Test Items, Scores, Goodness of Fit, Statistics
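The l_z statistic this paper extends is the standardized log-likelihood person-fit index of Drasgow, Levine, and Williams (1985). For dichotomous responses it can be computed directly from the model-implied correct-response probabilities at the examinee's ability estimate:

```python
import numpy as np

def lz_statistic(u, p):
    """Standardized log-likelihood person-fit statistic l_z for
    dichotomous responses u (0/1) given model-implied probabilities p
    of a correct response. Large negative values flag aberrant patterns."""
    u, p = np.asarray(u, float), np.asarray(p, float)
    l0 = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))
    mean = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    var = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (l0 - mean) / np.sqrt(var)
```

An examinee who answers high-probability items correctly gets a positive l_z, while missing them all yields a strongly negative value, which is the aberrance signal the statistic is built to detect.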
Folger, Timothy D.; Bostic, Jonathan; Krupa, Erin E. – Educational Measurement: Issues and Practice, 2023
Validity is a fundamental consideration of test development and test evaluation. The purpose of this study is to define and reify three key aspects of validity and validation, namely test-score interpretation, test-score use, and the claims supporting interpretation and use. This study employed a Delphi methodology to explore how experts in…
Descriptors: Test Interpretation, Scores, Test Use, Test Validity
Deschênes, Marie-France; Dionne, Éric; Dorion, Michelle; Grondin, Julie – Practical Assessment, Research & Evaluation, 2023
The use of the aggregate scoring method for scoring concordance tests requires the weighting of test items to be derived from the performance of a group of experts who take the test under the same conditions as the examinees. However, the average score of experts constituting the reference panel remains a critical issue in the use of these tests.…
Descriptors: Scoring, Tests, Evaluation Methods, Test Items
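Aggregate scoring of a concordance-test item credits each response in proportion to how many panel experts endorsed it, which is why the composition and performance of the expert panel matter so much. A simplified sketch of that per-item credit rule (a common variant; the paper's exact weighting may differ):

```python
import numpy as np

def aggregate_item_score(examinee_choice, expert_choices):
    """Aggregate (panel-based) credit for one concordance-test item:
    the examinee's chosen response earns the number of experts who
    endorsed it, divided by the modal endorsement count, so the
    majority answer earns full credit and unendorsed answers earn zero."""
    choices, counts = np.unique(expert_choices, return_counts=True)
    tally = dict(zip(choices.tolist(), counts.tolist()))
    return tally.get(examinee_choice, 0) / counts.max()
```

With a panel voting A, A, A, B, choosing A earns 1.0, choosing B earns 1/3, and any other response earns 0, showing directly how panel disagreement softens the scoring key.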
Jiang, Zhehan; Han, Yuting; Xu, Lingling; Shi, Dexin; Liu, Ren; Ouyang, Jinying; Cai, Fen – Educational and Psychological Measurement, 2023
The responses that are absent under the nonequivalent groups with anchor test (NEAT) design can be treated as a planned-missing scenario. In the context of small sample sizes, we present a machine learning (ML)-based imputation technique called chaining random forests (CRF) to perform equating tasks within the NEAT design. Specifically, seven…
Descriptors: Test Items, Equated Scores, Sample Size, Artificial Intelligence
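The NEAT design's planned-missing structure (each group sees only its own form plus the anchor) is the setting for the paper's CRF imputation. A related sketch using scikit-learn's experimental `IterativeImputer` with a random-forest base learner, in the spirit of chained random forests but not the authors' implementation; the data here are simulated:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

# Hypothetical NEAT-style matrix: 200 examinees x 6 item scores; the
# first group's scores on the last two items are missing by design.
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 6))
scores[:100, 4:] = np.nan

# Chained imputation: each incomplete column is modeled from the others
# by a random forest, cycling until the imputations stabilize.
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=10, random_state=0),
    max_iter=3, random_state=0,
)
completed = imputer.fit_transform(scores)
```

After imputation the completed matrix can feed a standard equating procedure as if all examinees had taken both forms.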
Mosquera, Jose Miguel Llanos; Suarez, Carlos Giovanny Hidalgo; Guerrero, Victor Andres Bucheli – Education and Information Technologies, 2023
This paper proposes to evaluate learning efficiency by implementing the flipped classroom and automatic source code evaluation based on the Kirkpatrick evaluation model in students of CS1 programming course. The experimentation was conducted with 82 students from two CS1 courses; an experimental group (EG = 56) and a control group (CG = 26). Each…
Descriptors: Flipped Classroom, Coding, Programming, Evaluation Methods
Yanxuan Qu; Sandip Sinharay – ETS Research Report Series, 2023
Though a substantial amount of research exists on imputing missing scores in educational assessments, there is little research on cases where responses or scores to an item are missing for all test takers. In this paper, we tackled the problem of imputing missing scores for tests for which the responses to an item are missing for all test takers.…
Descriptors: Scores, Test Items, Accuracy, Psychometrics
Tresansky, Lindsay M. – ProQuest LLC, 2023
The Annual Professional Performance Review (APPR) system in New York State (NYS) has been called into question by educators since its adoption nearly 10 years ago, yet it remains the mandated evaluation system in NYS schools today. Much of the concern has been over changes such as assigning teachers final evaluation scores, as well as for the…
Descriptors: Foreign Countries, Comparative Education, Teacher Evaluation, Alternative Assessment
Ebrahim Azimi; Jane Friesen; Simon Woodcock – Education Finance and Policy, 2023
We investigate the effects of private schools on reading and numeracy scores using rich population data. Conditional on lagged test scores and narrowly defined neighborhood indicators, Catholic and non-Christian faith private schools on average raise test scores by 0.18 standard deviations or more relative to the average public school, while…
Descriptors: Private Schools, Academic Achievement, Catholic Schools, Scores
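The identification strategy here, conditioning on lagged test scores when estimating a school-sector effect, is a value-added-style regression. A minimal sketch with simulated data (the 0.18 effect is borrowed from the abstract purely to label the coefficient; everything else is illustrative):

```python
import numpy as np

# Simulated value-added setup: current score on lagged score plus a
# private-school indicator, all in standard-deviation units.
rng = np.random.default_rng(3)
n = 1000
lagged = rng.normal(size=n)
private = rng.integers(0, 2, size=n)
score = 0.7 * lagged + 0.18 * private + rng.normal(scale=0.5, size=n)

# OLS via least squares: intercept, lagged-score slope, sector gap
X = np.column_stack([np.ones(n), lagged, private])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
# beta[2] estimates the conditional private-school gap
```

The paper's specification additionally conditions on narrowly defined neighborhood indicators, which in this sketch would just be more columns of `X`.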
Wendy Chan; Jimin Oh; Chen Li; Jiexuan Huang; Yeran Tong – Society for Research on Educational Effectiveness, 2023
Background: The generalizability of a study's results continues to be at the forefront of concerns in evaluation research in education (Tipton & Olsen, 2018). Over the past decade, statisticians have developed methods, mainly based on propensity scores, to improve generalizations in the absence of random sampling (Stuart et al., 2011; Tipton,…
Descriptors: Generalizability Theory, Probability, Scores, Sampling
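The propensity-score methods referenced here model the probability that a unit is in the study sample given its covariates, then reweight sample units to resemble the target population. A minimal inverse-odds-weighting sketch with simulated data (variable names and the selection model are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative data: s = 1 for in-sample units, s = 0 for population
# units; X holds covariates observed in both groups.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
s = (rng.random(500) < 1 / (1 + np.exp(-X[:, 0]))).astype(int)

# Sampling propensity: P(in sample | covariates)
ps = LogisticRegression().fit(X, s).predict_proba(X)[:, 1]

# Inverse-odds weights push the sample toward the population's
# covariate distribution; population units keep weight zero here.
w = np.where(s == 1, (1 - ps) / ps, 0.0)
```

A weighted treatment-effect estimate over the sample then generalizes, under the usual ignorability-of-selection assumptions, to the population the weights target.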
Hess, Jessica – ProQuest LLC, 2023
This study was conducted to further research into the impact of student-group item parameter drift (SIPD), referred to as subpopulation item parameter drift in previous research, on ability estimates and proficiency classification accuracy when occurring in the discrimination parameter of a 2-PL item response theory (IRT) model. Using Monte…
Descriptors: Test Items, Groups, Ability, Item Response Theory
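Drift in the discrimination parameter of a 2PL model means one subgroup's item response curve flattens or steepens around the same difficulty. A short sketch of the 2PL response function with an illustrative drift (the parameter values are hypothetical, not from the dissertation):

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL item response function: P(correct | theta) with
    discrimination a and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical discrimination drift for one subgroup: difficulty is
# unchanged (b = 0), but the curve flattens as a drops from 1.5 to 0.8.
theta = np.linspace(-3, 3, 7)
p_ref = p_2pl(theta, a=1.5, b=0.0)
p_drift = p_2pl(theta, a=0.8, b=0.0)
```

Both curves still cross 0.5 at theta = b, but the drifted item separates low- and high-ability examinees less sharply, which is what degrades ability estimates and proficiency classifications.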
Fahruddin; Merci Robbi Kurniawanti; T. Heru Nurgiansah; Dhiniaty Gularso – Journal of Education and Learning (EduLearn), 2025
This study has two aims: first, to identify the qualifications for developing teaching materials that evaluate observation-based history learning, and second, to gauge the level of students' critical thinking skills. The results of this research contribute to improving students' critical thinking skills through the development of teaching materials. This research…
Descriptors: Critical Thinking, Scores, History Instruction, Thinking Skills