Walters, Glenn D. – International Journal of Social Research Methodology, 2019
Identifying mediators in variable chains as part of a causal mediation analysis can shed light on issues of causation, assessment, and intervention. However, coefficients and effect sizes in a causal mediation analysis are nearly always small. This can lead those less familiar with the approach to reject the results of causal mediation analysis.…
Descriptors: Effect Size, Statistical Analysis, Sampling, Statistical Inference
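The product-of-coefficients logic behind the mediation chain Walters describes (X affects M, M affects Y, and the indirect effect is the product of the two path slopes) can be sketched in a few lines. This is a minimal illustrative sketch with invented data, not the article's analysis; a full causal mediation analysis would also control for X in the M-to-Y path and estimate confidence intervals.

```python
# Hypothetical sketch of the indirect effect in a simple mediation
# chain X -> M -> Y: indirect effect = a * b, the product of the
# two path coefficients. Data are invented for illustration.

def ols_slope(x, y):
    """Least-squares slope of y on x (single predictor, with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

X = [0, 1, 2, 3, 4, 5]
M = [0.1, 0.9, 2.1, 2.9, 4.2, 4.9]   # M tracks X closely (a near 1)
Y = [0.0, 0.5, 1.1, 1.4, 2.1, 2.4]   # Y tracks M at roughly half scale

a = ols_slope(X, M)       # path X -> M
b = ols_slope(M, Y)       # path M -> Y (X omitted here for brevity)
indirect = a * b          # product-of-coefficients indirect effect
```

Note how the indirect effect is a product of two slopes, each typically below 1 in standardized terms, which is one reason mediation effect sizes tend to look small.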
Moses, Tim; Kim, YoungKoung – Journal of Educational Measurement, 2017
The focus of this article is on scale score transformations that can be used to stabilize conditional standard errors of measurement (CSEMs). Three transformations for stabilizing the estimated CSEMs are reviewed, including the traditional arcsine transformation, a recently developed general variance stabilization transformation, and a new method…
Descriptors: Error of Measurement, Scores, Comparative Analysis, Item Response Theory
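The traditional arcsine transformation that Moses and Kim review can be sketched directly: under a binomial error model for an n-item number-correct score, mapping the proportion correct through asin(sqrt(p)) makes the conditional standard error of measurement approximately constant at 1/(2*sqrt(n)) across the score range. The test length below is an invented example, not taken from the article.

```python
import math

# Sketch of the traditional arcsine variance-stabilizing transformation
# for number-correct scores on an n-item test (binomial error model).
# On the transformed scale, the CSEM is approximately constant.

def arcsine_transform(x, n):
    """Map a raw score x on an n-item test to the arcsine scale."""
    return math.asin(math.sqrt(x / n))

n = 40
transformed = [arcsine_transform(x, n) for x in range(n + 1)]
approx_csem = 1 / (2 * math.sqrt(n))   # roughly constant CSEM on this scale
```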
da Silva, M. A. Salgueiro; Seixas, T. M. – Physics Teacher, 2017
Measuring one physical quantity as a function of another often requires making some choices prior to the measurement process. Two of these choices are the data range on which measurements should focus and the number (n) of data points to acquire in the chosen data range. Here, we consider data range as the interval of variation of the independent…
Descriptors: Physics, Regression (Statistics), Measurement, Measurement Techniques
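The effect of the data-range choice that da Silva and Seixas discuss can be made concrete: for a straight-line fit, the slope's standard error is sigma / sqrt(Sxx), where Sxx is the sum of squared deviations of the x-values from their mean, so widening the range at fixed n and noise level tightens the slope estimate. The numbers below are illustrative, not from the article.

```python
# Sketch: why data range matters when fitting a straight line.
# SE(slope) = sigma / sqrt(Sxx); Sxx grows with the square of the
# x-range, so a 10x wider range gives a 10x smaller slope SE.

def sxx(xs):
    """Sum of squared deviations of xs from their mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

n = 11
narrow = [i * 0.1 for i in range(n)]   # x in [0, 1]
wide = [i * 1.0 for i in range(n)]     # x in [0, 10], same n

sigma = 0.5                            # assumed measurement noise
se_narrow = sigma / sxx(narrow) ** 0.5
se_wide = sigma / sxx(wide) ** 0.5
ratio = se_narrow / se_wide            # 10x larger SE on the narrow range
```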
Burke, Danielle L.; Ensor, Joie; Snell, Kym I. E.; van der Windt, Danielle; Riley, Richard D. – Research Synthesis Methods, 2018
Percentage study weights in meta-analysis reveal the contribution of each study toward the overall summary results and are especially important when some studies are considered outliers or at high risk of bias. In meta-analyses of test accuracy reviews, such as a bivariate meta-analysis of sensitivity and specificity, the percentage study weights…
Descriptors: Meta Analysis, Research Reports, Statistical Analysis, Sample Size
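The percentage study weights Burke et al. discuss are simplest in the univariate fixed-effect case: each study is weighted by the inverse of its effect-size variance, and its percentage weight is that share of the total. The article's concern is the harder bivariate test-accuracy setting, where such percentages are not straightforward; this sketch with invented variances only illustrates the basic idea.

```python
# Sketch of percentage study weights in a univariate fixed-effect
# meta-analysis: weight_i = 1 / variance_i, expressed as a share of
# the total weight. Per-study variances are invented.

variances = [0.04, 0.10, 0.25, 0.50]        # per-study sampling variances
weights = [1 / v for v in variances]        # inverse-variance weights
total = sum(weights)
pct_weights = [100 * w / total for w in weights]  # percentage contributions
```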
Sinharay, Sandip – Journal of Educational Measurement, 2018
The value-added method of Haberman is arguably one of the most popular methods to evaluate the quality of subscores. The method is based on the classical test theory and deems a subscore to be of added value if the subscore predicts the corresponding true subscore better than does the total score. Sinharay provided an interpretation of the added…
Descriptors: Scores, Value Added Models, Raw Scores, Item Response Theory
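Haberman's added-value criterion that Sinharay builds on reduces to a comparison of two prediction accuracies: a subscore has added value only if it predicts its own true subscore better than the total score does (in PRMSE terms). The sketch below encodes just that decision rule; the numeric PRMSE values are invented for illustration.

```python
# Sketch of Haberman's added-value rule for subscores: report the
# subscore only if its PRMSE (its reliability as a predictor of the
# true subscore) exceeds the PRMSE of the total score.

def has_added_value(prmse_subscore, prmse_total):
    """Haberman's rule: keep the subscore only if it beats the total."""
    return prmse_subscore > prmse_total

# e.g. a subscore with PRMSE .82 vs. a total score whose PRMSE for the
# same true subscore is .78 would be deemed of added value
decision = has_added_value(0.82, 0.78)
```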
Gorgun, Guher; Bulut, Okan – Educational and Psychological Measurement, 2021
In low-stakes assessments, some students may not reach the end of the test and leave some items unanswered due to various reasons (e.g., lack of test-taking motivation, poor time management, and test speededness). Not-reached items are often treated as incorrect or not-administered in the scoring process. However, when the proportion of…
Descriptors: Scoring, Test Items, Response Style (Tests), Mathematics Tests
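The two scoring treatments of not-reached items that Gorgun and Bulut contrast can be shown side by side: coding them as incorrect lowers the proportion-correct score, while treating them as not-administered scores only the attempted items. The response vector below is invented for illustration.

```python
# Sketch of two common treatments of not-reached items (coded None)
# when computing a proportion-correct score:
#   (a) score them as incorrect (denominator = all items)
#   (b) treat them as not-administered (denominator = attempted items)

responses = [1, 1, 0, 1, 0, 1, None, None, None]  # last 3 not reached

as_incorrect = sum(r or 0 for r in responses) / len(responses)

attempted = [r for r in responses if r is not None]
as_not_administered = sum(attempted) / len(attempted)
```

With a third of the test unreached, the two rules already diverge sharply (4/9 vs. 4/6 here), which is why the choice matters when the proportion of not-reached items is large.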
Rohm, Theresa; Carstensen, Claus H.; Fischer, Luise; Gnambs, Timo – Large-scale Assessments in Education, 2021
Background: After elementary school, students in Germany are separated into different school tracks (i.e., school types) with the aim of creating homogeneous student groups in secondary school. Consequently, the development of students' reading achievement diverges across school types. Findings on this achievement gap have been criticized as…
Descriptors: Achievement Gap, Reading Achievement, Test Bias, Error of Measurement
Tsaousis, Ioannis; Sideridis, Georgios D.; AlGhamdi, Hannan M. – Journal of Psychoeducational Assessment, 2021
This study evaluated the psychometric quality of a computerized adaptive testing (CAT) version of the general cognitive ability test (GCAT), using a simulation study protocol put forth by Han, K. T. (2018a). For the needs of the analysis, three different sets of items were generated, providing an item pool of 165 items. Before evaluating the…
Descriptors: Computer Assisted Testing, Adaptive Testing, Cognitive Tests, Cognitive Ability
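The core step of the computerized adaptive testing procedure Tsaousis et al. evaluate can be sketched generically: at each stage, the item selected from the pool is the one with maximum Fisher information at the current ability estimate. The 2PL model and the tiny item pool below are generic illustrations, not the GCAT's actual items or Han's protocol.

```python
import math

# Sketch of the basic CAT item-selection step: under a 2PL model,
# pick the pool item with maximum Fisher information at the current
# ability estimate theta_hat. Item parameters (a, b) are invented.

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1 / (1 + math.exp(-a * (theta - b)))
    return a * a * p * (1 - p)

pool = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.4), (1.0, 2.0)]  # (a, b) pairs
theta_hat = 0.5
next_item = max(range(len(pool)),
                key=lambda i: info_2pl(theta_hat, *pool[i]))
```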
Reardon, Sean F.; Kalogrides, Demetra; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2021
Linking score scales across different tests is considered speculative and fraught, even at the aggregate level. We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that aggregate linkages can be validated both…
Descriptors: Equated Scores, Validity, Methods, School Districts
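A minimal version of the aggregate linkage Reardon et al. study is a linear linking: a district's mean is standardized against its own state's mean and SD, then mapped onto a common reference scale. The reference scale and all numbers below are invented for illustration; the article's contribution is the validation of such linkages, not this arithmetic.

```python
# Sketch of a linear aggregate linkage: place district mean scores
# from different state tests on one scale by z-scoring within state
# and mapping to a reference scale. All values are illustrative.

def link(district_mean, state_mean, state_sd, ref_mean=250.0, ref_sd=35.0):
    """Map a district mean onto the reference scale via linear linking."""
    z = (district_mean - state_mean) / state_sd
    return ref_mean + ref_sd * z

# a district 0.5 SD above its state mean lands at 250 + 35 * 0.5
linked = link(district_mean=320.0, state_mean=300.0, state_sd=40.0)
```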
Ford, Andrea L. B.; Johnson, LeAnne D. – Journal of Speech, Language, and Hearing Research, 2021
Purpose: A myriad of features can impact the nature, frequency, and length of adult-child interactions important for language learning. Empirical investigations of language learning opportunities for young children with autism spectrum disorder (ASD) provide limited generalizable insight, with inferences more constrained to the sample than is often…
Descriptors: Autism, Pervasive Developmental Disorders, Preschool Children, Measurement Techniques
Martín-Puga, M. Eva; Pelegrina, Santiago; Gómez-Pérez, M. Mar; Justicia-Galiano, M. José – Journal of Psychoeducational Assessment, 2022
The objectives were to examine the factorial structure of the Academic Procrastination Scale-Short Form (APS-S) and its measurement invariance across gender and educational levels, and to determine possible differences in procrastination across gender, educational levels, and grades. The sample was formed of 1486 Spanish primary and secondary school…
Descriptors: Psychometrics, Measures (Individuals), Study Habits, Scores
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
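The rapid-guessing behavior Abulela and Rios simulate is commonly operationalized through response-time thresholds: responses faster than an item's cutoff are flagged as rapid guesses and can be excluded or rescored before invariance testing. The flat 3-second cutoff and the times below are a generic rule-of-thumb illustration, not the study's design.

```python
# Sketch of flagging rapid guessing (RG) with a response-time
# threshold: responses faster than the cutoff are flagged. The flat
# 3 s cutoff and the times are illustrative assumptions.

THRESHOLD_SECONDS = 3.0

response_times = [12.4, 1.1, 8.0, 2.5, 15.2, 0.9]
rg_flags = [t < THRESHOLD_SECONDS for t in response_times]
rg_rate = sum(rg_flags) / len(rg_flags)   # proportion of rapid guesses
```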
Zieger, Laura Raffaella; Jerrim, J.; Anders, J.; Shure, N. – Assessment in Education: Principles, Policy & Practice, 2022
The OECD's Programme for International Student Assessment (PISA) has become one of the key studies for evidence-based education policymaking across the globe. PISA has, however, received considerable methodological criticism, including of how the test scores are created. The aim of this paper is to investigate the so-called 'conditioning model', where…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Kilic, Abdullah Faruk; Uysal, Ibrahim; Atar, Burcu – International Journal of Assessment Tools in Education, 2020
This Monte Carlo simulation study aimed to investigate confirmatory factor analysis (CFA) estimation methods under different conditions, such as sample size, distribution of indicators, test length, average factor loading, and factor structure. Binary data were generated to compare the performance of maximum likelihood (ML), mean and variance…
Descriptors: Factor Analysis, Computation, Methods, Sample Size
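The data-generation step of the Monte Carlo design Kilic et al. describe can be sketched: binary indicators are produced from a one-factor model by dichotomizing continuous responses (loading times factor plus error) at a threshold. The loadings, threshold, and sample size below are invented, not the study's conditions.

```python
import random

# Sketch of generating binary indicator data from a one-factor model
# for a CFA simulation: y = loading * factor + error, dichotomized at
# a threshold. All parameter values are illustrative assumptions.

random.seed(1)
N, n_items, loading, threshold = 500, 6, 0.7, 0.0
resid_sd = (1 - loading ** 2) ** 0.5      # keeps total variance at 1

data = []
for _ in range(N):
    f = random.gauss(0, 1)                # common factor score
    row = [int(loading * f + resid_sd * random.gauss(0, 1) > threshold)
           for _ in range(n_items)]
    data.append(row)

proportions = [sum(col) / N for col in zip(*data)]  # per-item p-values
```

With the threshold at 0, each item's endorsement proportion should hover around .5; shifting the threshold is how such simulations vary the distribution of the indicators.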
Lee, HyeSun; Smith, Weldon Z. – Educational and Psychological Measurement, 2020
Based on the framework of testlet models, the current study suggests the Bayesian random block item response theory (BRB IRT) model to fit forced-choice formats where an item block is composed of three or more items. To account for local dependence among items within a block, the BRB IRT model incorporated a random block effect into the response…
Descriptors: Bayesian Statistics, Item Response Theory, Monte Carlo Methods, Test Format