ERIC - Search Results

Showing all 11 results

Reporting Proficiency Levels for Examinees with Incomplete Data

Peer reviewed

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2022

Takers of educational tests often receive proficiency levels instead of or in addition to scaled scores. For example, proficiency levels are reported for the Advanced Placement (AP®) and U.S. Medical Licensing examinations. Technical difficulties and other unforeseen events occasionally lead to missing item scores and hence to incomplete data on…

Descriptors: Computation, Data Analysis, Educational Testing, Accuracy

Assessing Fit of the Lognormal Model for Response Times

Peer reviewed

Sinharay, Sandip; van Rijn, Peter W. – Journal of Educational and Behavioral Statistics, 2020

Response time models (RTMs) are of increasing interest in educational and psychological testing. This article focuses on the lognormal model for response times, which is one of the most popular RTMs. Several existing statistics for testing normality and the fit of factor analysis models are repurposed for testing the fit of the lognormal model. A…

Descriptors: Educational Testing, Psychological Testing, Goodness of Fit, Factor Analysis

Bayesian Nonparametric Monotone Regression of Dynamic Latent Traits in Item Response Theory Models

Peer reviewed

Liu, Yang; Wang, Xiaojing – Journal of Educational and Behavioral Statistics, 2020

Parametric methods, such as autoregressive models or latent growth modeling, are usually inflexible to model the dependence and nonlinear effects among the changes of latent traits whenever the time gap is irregular and the recorded time points are individually varying. Often in practice, the growth trend of latent traits is subject to certain…

Descriptors: Bayesian Statistics, Nonparametric Statistics, Regression (Statistics), Item Response Theory

Validation Methods for Aggregate-Level Test Scale Linking: A Case Study Mapping School District Test Score Distributions to a Common Scale

Peer reviewed

Reardon, Sean F.; Kalogrides, Demetra; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2021

Linking score scales across different tests is considered speculative and fraught, even at the aggregate level. We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that aggregate linkages can be validated both…

Descriptors: Equated Scores, Validity, Methods, School Districts

Modeling Answer Changes on Test Items

Peer reviewed

van der Linden, Wim J.; Jeon, Minjeong – Journal of Educational and Behavioral Statistics, 2012

The probability of test takers changing answers upon review of their initial choices is modeled. The primary purpose of the model is to check erasures on answer sheets recorded by an optical scanner for numbers and patterns that may be indicative of irregular behavior, such as teachers or school administrators changing answer sheets after their…

Descriptors: Probability, Models, Test Items, Educational Testing

Measuring Student Ability, Classifying Schools, and Detecting Item Bias at School Level, Based on Student-Level Dichotomous Items

Peer reviewed

Bennink, Margot; Croon, Marcel A.; Keuning, Jos; Vermunt, Jeroen K. – Journal of Educational and Behavioral Statistics, 2014

In educational measurement, responses of students on items are used not only to measure the ability of students, but also to evaluate and compare the performance of schools. Analysis should ideally account for the multilevel structure of the data, and school-level processes not related to ability, such as working climate and administration…

Descriptors: Academic Ability, Educational Assessment, Educational Testing, Test Bias

Measuring Test Measurement Error: A General Approach

Peer reviewed

Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013

Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…

Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement

When Can Subscores Have Value?

Peer reviewed

Haberman, Shelby J. – Journal of Educational and Behavioral Statistics, 2008

In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…

Descriptors: Testing Programs, Regression (Statistics), Scores, Student Evaluation

Double P-P Plots for Comparing Differences between Two Groups

Peer reviewed

Livingston, Samuel A. – Journal of Educational and Behavioral Statistics, 2006

This article suggests a graphic technique that uses P-P plots to show the extent to which two groups differ on two variables. It can be used even if the variables are measured in completely different, noncomparable units. The comparison is symmetric with respect to the variables and the groups. It reflects the differences between the groups over…

Descriptors: Comparative Analysis, Groups, Differences, Graphs

Standard Error Estimation of 3PL IRT True Score Equating with an MCMC Method

Peer reviewed

Liu, Yuming; Schulz, E. Matthew; Yu, Lei – Journal of Educational and Behavioral Statistics, 2008

A Markov chain Monte Carlo (MCMC) method and a bootstrap method were compared in the estimation of standard errors of item response theory (IRT) true score equating. Three test form relationships were examined: parallel, tau-equivalent, and congeneric. Data were simulated based on Reading Comprehension and Vocabulary tests of the Iowa Tests of…

Descriptors: Reading Comprehension, Test Format, Markov Processes, Educational Testing

Note on Unconditional and Conditional Hypothesis Testing: A Discussion of an Issue Raised by van der Linden and Sotaridona

Peer reviewed

Lewis, Charles – Journal of Educational and Behavioral Statistics, 2006

In the context of reviewing an article for this journal (van der Linden & Sotaridona, this issue, pp. 283-304) the topic of unconditional and conditional hypothesis testing came under consideration. While this is hardly a new issue (consider, for example, arguments regarding the chi square vs. Fisher exact test of independence for a 2 x 2…

Descriptors: Hypothesis Testing, Educational Testing, Item Response Theory, Research Problems
