Publication Date
In 2025 | 39 |
Since 2024 | 192 |
Since 2021 (last 5 years) | 495 |
Since 2016 (last 10 years) | 996 |
Since 2006 (last 20 years) | 2028 |
Audience
Researchers | 93 |
Practitioners | 23 |
Teachers | 22 |
Policymakers | 10 |
Administrators | 5 |
Students | 4 |
Counselors | 2 |
Parents | 2 |
Community | 1 |
Location
United States | 47 |
Germany | 42 |
Australia | 34 |
Canada | 27 |
Turkey | 27 |
California | 22 |
United Kingdom (England) | 20 |
Netherlands | 18 |
China | 16 |
New York | 15 |
United Kingdom | 15 |
What Works Clearinghouse Rating
Does not meet standards | 1 |
Nikola Ebenbeck; Morten Bastian; Andreas Mühling; Markus Gebhardt – Journal of Computer Assisted Learning, 2024
Background: Computerised adaptive tests (CATs) are tests that provide personalised, efficient and accurate measurement while reducing testing time, depending on the desired level of precision. Schools have different types of assessments that can benefit from a significant reduction in testing time to varying degrees, depending on the area of…
Descriptors: Computer Assisted Testing, Elementary Secondary Education, Public Schools, Special Schools
José Manuel Arencibia Alemán; Astrid Marie Jorde Sandsør; Henrik Daae Zachrisson; Sigrid Blömeke – Assessment in Education: Principles, Policy & Practice, 2024
Modest correlations between teacher-assigned grades and external assessments of academic achievement (r = 0.40-0.60) have led many educational stakeholders to deem grades subjective and unreliable. However, theoretical and methodological challenges, such as construct misalignment, data unavailability and sample unrepresentativeness, limit the…
Descriptors: Grades (Scholastic), Grading, Achievement Tests, Test Validity
Wang, Shaojie; Zhang, Minqiang; Lee, Won-Chan; Huang, Feifei; Li, Zonglong; Li, Yixing; Yu, Sufang – Journal of Educational Measurement, 2022
Traditional IRT characteristic curve linking methods ignore parameter estimation errors, which may undermine the accuracy of estimated linking constants. Two new linking methods are proposed that take into account parameter estimation errors. The item- (IWCC) and test-information-weighted characteristic curve (TWCC) methods employ weighting…
Descriptors: Item Response Theory, Error of Measurement, Accuracy, Monte Carlo Methods
Guler, Gul; Cikrikci, Rahime Nukhet – International Journal of Assessment Tools in Education, 2022
The purpose of this study was to investigate the Type I Error findings and power rates of the methods used to determine dimensionality in unidimensional and bidimensional psychological constructs for various conditions (characteristic of the distribution, sample size, length of the test, and interdimensional correlation) and to examine the joint…
Descriptors: Comparative Analysis, Error of Measurement, Decision Making, Factor Analysis
Ben-Michael, Eli; Feller, Avi; Rothstein, Jesse – Grantee Submission, 2022
Staggered adoption of policies by different units at different times creates promising opportunities for observational causal inference. Estimation remains challenging, however, and common regression methods can give misleading results. A promising alternative is the synthetic control method (SCM), which finds a weighted average of control units…
Descriptors: Causal Models, Statistical Inference, Computation, Evaluation Methods
Evan Rosenman; Rina Friedberg; Michael Baiocchi – Society for Research on Educational Effectiveness, 2022
Background and Context: In 2016, our team designed and implemented a cluster-randomized trial of a school-based empowerment training program, targeting adolescent girls in Nairobi, Kenya (Baiocchi et al., 2019; Rosenman et al., 2020). In that study, the primary outcome was the experience of sexual violence in the prior year. Participants disclosed…
Descriptors: Foreign Countries, Adolescents, Females, Sexual Abuse
Gao, Ruiqin – ProQuest LLC, 2023
This multiple-manuscript dissertation explored the measurement invariance (MI) testing with multiple-group confirmatory factor analysis (MG-CFA) approach from different perspectives. Study 1 explored MI from a theoretical perspective by conducting a systematic review study on MI practices in education. The findings of this study indicated…
Descriptors: Error of Measurement, Factor Analysis, Simulation, Elementary School Students
Gilbert, Joshua B.; Kim, James S.; Miratrix, Luke W. – Journal of Educational and Behavioral Statistics, 2023
Analyses that reveal how treatment effects vary allow researchers, practitioners, and policymakers to better understand the efficacy of educational interventions. In practice, however, standard statistical methods for addressing heterogeneous treatment effects (HTE) fail to address the HTE that may exist "within" outcome measures. In…
Descriptors: Test Items, Item Response Theory, Computer Assisted Testing, Program Effectiveness
Haberman, Shelby J. – ETS Research Report Series, 2019
Measures of agreement are compared to measures of prediction accuracy within a general context. Differences in appropriate use are emphasized, and approaches are examined for both numerical and nominal variables. General estimation methods are developed, and their large-sample properties are compared.
Descriptors: Measurement Techniques, Classification, Prediction, Accuracy
Kelsey Harkness; Signe Bray; Chelsea M. Durber; Deborah Dewey; Kara Murias – Journal of Autism and Developmental Disorders, 2025
Attention and executive function (EF) dysregulation are common in a number of disorders including autism and attention-deficit/hyperactivity disorder (ADHD). Better understanding of the relationship between indirect and direct measures of attention and EF and common neurodevelopmental diagnoses may contribute to more efficient and effective…
Descriptors: Adolescents, Autism Spectrum Disorders, Attention Deficit Hyperactivity Disorder, Executive Function
Kim, Stella Y.; Lee, Won-Chan – Journal of Educational Measurement, 2020
The current study aims to evaluate the performance of three non-IRT procedures (i.e., normal approximation, Livingston-Lewis, and compound multinomial) for estimating classification indices when the observed score distribution shows atypical patterns: (a) bimodality, (b) structural (i.e., systematic) bumpiness, or (c) structural zeros (i.e., no…
Descriptors: Classification, Accuracy, Scores, Cutting Scores
Gauly, Britta; Daikeler, Jessica; Gummer, Tobias; Rammstedt, Beatrice – International Journal of Social Research Methodology, 2020
One question frequently included in surveys asks about respondents' earnings. As this information serves, for example, as a basis for evaluating policy interventions, it must be of high quality. This study aims to advance knowledge about possible measurement errors in earnings data and the potential of data linkage to improve substantive…
Descriptors: Foreign Countries, Research Methodology, Surveys, Data
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020
A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…
Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items
Koçak, Duygu – International Journal of Progressive Education, 2020
The aim of this study was to determine the effect of chance success on test equating. For this purpose, artificially generated data sets with sample sizes of 500 and 1000 were equated using linear and equipercentile equating methods. In the simulated data, a total of four cases were created with no…
Descriptors: Test Theory, Equated Scores, Error of Measurement, Sample Size
Peabody, Michael R. – Applied Measurement in Education, 2020
The purpose of the current article is to introduce the equating and evaluation methods used in this special issue. Although a comprehensive review of all existing models and methodologies would be impractical given the format, a brief introduction to some of the more popular models will be provided. A brief discussion of the conditions required…
Descriptors: Evaluation Methods, Equated Scores, Sample Size, Item Response Theory