Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 4 |
Since 2006 (last 20 years) | 11 |
Source
Journal of Educational and… | 11 |
Author
Sinharay, Sandip | 2 |
Bennink, Margot | 1 |
Boyd, Donald | 1 |
Croon, Marcel A. | 1 |
Haberman, Shelby J. | 1 |
Ho, Andrew D. | 1 |
Jeon, Minjeong | 1 |
Kalogrides, Demetra | 1 |
Keuning, Jos | 1 |
Lankford, Hamilton | 1 |
Lewis, Charles | 1 |
Publication Type
Journal Articles | 11 |
Reports - Research | 7 |
Reports - Evaluative | 3 |
Reports - Descriptive | 1 |
Tests/Questionnaires | 1 |
Education Level
Grade 4 | 2 |
Grade 8 | 2 |
Junior High Schools | 2 |
Middle Schools | 2 |
Secondary Education | 2 |
Elementary Education | 1 |
Grade 3 | 1 |
Grade 5 | 1 |
Grade 6 | 1 |
Grade 7 | 1 |
Grade 9 | 1 |
Location
Netherlands | 1 |
New York | 1 |
Assessments and Surveys
Iowa Tests of Basic Skills | 1 |
Measures of Academic Progress | 1 |
National Assessment of… | 1 |
Program for International… | 1 |
Trends in International… | 1 |
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2022
Takers of educational tests often receive proficiency levels instead of or in addition to scaled scores. For example, proficiency levels are reported for the Advanced Placement (AP®) and U.S. Medical Licensing examinations. Technical difficulties and other unforeseen events occasionally lead to missing item scores and hence to incomplete data on…
Descriptors: Computation, Data Analysis, Educational Testing, Accuracy
Sinharay, Sandip; van Rijn, Peter W. – Journal of Educational and Behavioral Statistics, 2020
Response time models (RTMs) are of increasing interest in educational and psychological testing. This article focuses on the lognormal model for response times, which is one of the most popular RTMs. Several existing statistics for testing normality and the fit of factor analysis models are repurposed for testing the fit of the lognormal model. A…
Descriptors: Educational Testing, Psychological Testing, Goodness of Fit, Factor Analysis
Liu, Yang; Wang, Xiaojing – Journal of Educational and Behavioral Statistics, 2020
Parametric methods, such as autoregressive models or latent growth modeling, are usually inflexible to model the dependence and nonlinear effects among the changes of latent traits whenever the time gap is irregular and the recorded time points are individually varying. Often in practice, the growth trend of latent traits is subject to certain…
Descriptors: Bayesian Statistics, Nonparametric Statistics, Regression (Statistics), Item Response Theory
Reardon, Sean F.; Kalogrides, Demetra; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2021
Linking score scales across different tests is considered speculative and fraught, even at the aggregate level. We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that aggregate linkages can be validated both…
Descriptors: Equated Scores, Validity, Methods, School Districts
van der Linden, Wim J.; Jeon, Minjeong – Journal of Educational and Behavioral Statistics, 2012
The probability of test takers changing answers upon review of their initial choices is modeled. The primary purpose of the model is to check erasures on answer sheets recorded by an optical scanner for numbers and patterns that may be indicative of irregular behavior, such as teachers or school administrators changing answer sheets after their…
Descriptors: Probability, Models, Test Items, Educational Testing
Bennink, Margot; Croon, Marcel A.; Keuning, Jos; Vermunt, Jeroen K. – Journal of Educational and Behavioral Statistics, 2014
In educational measurement, responses of students on items are used not only to measure the ability of students, but also to evaluate and compare the performance of schools. Analysis should ideally account for the multilevel structure of the data, and school-level processes not related to ability, such as working climate and administration…
Descriptors: Academic Ability, Educational Assessment, Educational Testing, Test Bias
Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013
Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…
Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement
Haberman, Shelby J. – Journal of Educational and Behavioral Statistics, 2008
In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…
Descriptors: Testing Programs, Regression (Statistics), Scores, Student Evaluation
Livingston, Samuel A. – Journal of Educational and Behavioral Statistics, 2006
This article suggests a graphic technique that uses P-P plots to show the extent to which two groups differ on two variables. It can be used even if the variables are measured in completely different, noncomparable units. The comparison is symmetric with respect to the variables and the groups. It reflects the differences between the groups over…
Descriptors: Comparative Analysis, Groups, Differences, Graphs
Liu, Yuming; Schulz, E. Matthew; Yu, Lei – Journal of Educational and Behavioral Statistics, 2008
A Markov chain Monte Carlo (MCMC) method and a bootstrap method were compared in the estimation of standard errors of item response theory (IRT) true score equating. Three test form relationships were examined: parallel, tau-equivalent, and congeneric. Data were simulated based on Reading Comprehension and Vocabulary tests of the Iowa Tests of…
Descriptors: Reading Comprehension, Test Format, Markov Processes, Educational Testing
Lewis, Charles – Journal of Educational and Behavioral Statistics, 2006
In the context of reviewing an article for this journal (van der Linden & Sotaridona, this issue, pp. 283-304) the topic of unconditional and conditional hypothesis testing came under consideration. While this is hardly a new issue (consider, for example, arguments regarding the chi square vs. Fisher exact test of independence for a 2 x 2…
Descriptors: Hypothesis Testing, Educational Testing, Item Response Theory, Research Problems