ERIC - Search Results

Showing all 15 results

A Collection of Numerical Recipes Useful for Building Scalable Psychometric Applications

Peer reviewed

Doran, Harold – Journal of Educational and Behavioral Statistics, 2023

This article is concerned with a subset of numerically stable and scalable algorithms useful to support computationally complex psychometric models in the era of machine learning and massive data. The subset selected here is a core set of numerical methods that should be familiar to computational psychometricians and considers whitening transforms…

Descriptors: Scaling, Algorithms, Psychometrics, Computation

Mean Comparisons of Many Groups in the Presence of DIF: An Evaluation of Linking and Concurrent Scaling Approaches

Peer reviewed

Robitzsch, Alexander; Lüdtke, Oliver – Journal of Educational and Behavioral Statistics, 2022

One of the primary goals of international large-scale assessments in education is the comparison of country means in student achievement. This article introduces a framework for discussing differential item functioning (DIF) for such mean comparisons. We compare three different linking methods: concurrent scaling based on full invariance,…

Descriptors: Test Bias, International Assessment, Scaling, Comparative Analysis

A Scaled Threshold Model for Measuring Extreme Response Style

Peer reviewed

Lubbe, Dirk; Schuster, Christof – Journal of Educational and Behavioral Statistics, 2020

Extreme response style is the tendency of individuals to prefer the extreme categories of a rating scale irrespective of item content. It has been shown repeatedly that individual response style differences affect the reliability and validity of item responses and should, therefore, be considered carefully. To account for extreme response style…

Descriptors: Response Style (Tests), Rating Scales, Item Response Theory, Models

Regression Discontinuity Designs with an Ordinal Running Variable: Evaluating the Effects of Extended Time Accommodations for English-Language Learners

Peer reviewed

Suk, Youmi; Steiner, Peter M.; Kim, Jee-Seon; Kang, Hyunseung – Journal of Educational and Behavioral Statistics, 2022

Regression discontinuity (RD) designs are commonly used for program evaluation with continuous treatment assignment variables. But in practice, treatment assignment is frequently based on ordinal variables. In this study, we propose an RD design with an ordinal running variable to assess the effects of extended time accommodations (ETA) for…

Descriptors: Regression (Statistics), Program Evaluation, Research Design, English Language Learners

The New (Educational) Statistics: Properties of Scales That Matter

Peer reviewed

Ho, Andrew Dean – Journal of Educational and Behavioral Statistics, 2016

In this article, Andrew Dean Ho responds to David Thissen's essay, "Bad Questions: An Essay Involving Item Response Theory" (2016), calling it an excellent contribution to the genre of commentaries on the field, one that joins the likes of the piece by Thissen's frequent collaborator, Howard Wainer (2010), who published "14…

Descriptors: Item Response Theory, Statistics, Psychometrics, Goodness of Fit

Toward Education Quality Improvement in China: A Brief Overview of the National Assessment of Education Quality

Peer reviewed

Jiang, Yu; Zhang, Jiahui; Xin, Tao – Journal of Educational and Behavioral Statistics, 2019

This article is an overview of the National Assessment of Education Quality (NAEQ) of China in reading, mathematics, sciences, arts, physical education, and moral education at Grades 4 and 8. After a review of the background and history of NAEQ, we present the assessment framework with students' holistic development at the core and the design for…

Descriptors: Foreign Countries, Educational Quality, Educational Improvement, National Competency Tests

TIMSS 2015: Illustrating Advancements in Large-Scale International Assessments

Peer reviewed

Martin, Michael O.; Mullis, Ina V. S. – Journal of Educational and Behavioral Statistics, 2019

International large-scale assessments of student achievement such as International Association for the Evaluation of Educational Achievement's Trends in International Mathematics and Science Study (TIMSS) and Progress in International Reading Literacy Study and Organization for Economic Cooperation and Development's Program for International…

Descriptors: Achievement Tests, International Assessment, Mathematics Tests, Science Achievement

On the Hedges Correction for a "t"-Test

Peer reviewed

VanHoudnos, Nathan M.; Greenhouse, Joel B. – Journal of Educational and Behavioral Statistics, 2016

When cluster randomized experiments are analyzed as if units were independent, test statistics for treatment effects can be anticonservative. Hedges proposed a correction for such tests by scaling them to control their Type I error rate. This article generalizes the Hedges correction from a posttest-only experimental design to more common designs…

Descriptors: Statistical Analysis, Randomized Controlled Trials, Error of Measurement, Scaling

Accounting for Individual Differences in Bradley-Terry Models by Means of Recursive Partitioning

Peer reviewed

Strobl, Carolin; Wickelmaier, Florian; Zeileis, Achim – Journal of Educational and Behavioral Statistics, 2011

The preference scaling of a group of subjects may not be homogeneous, but different groups of subjects with certain characteristics may show different preference scalings, each of which can be derived from paired comparisons by means of the Bradley-Terry model. Usually, either different models are fit in predefined subsets of the sample or the…

Descriptors: Individual Differences, Scaling, Statistical Analysis, Models

The Gains from Vertical Scaling

Peer reviewed

Briggs, Derek C.; Domingue, Ben – Journal of Educational and Behavioral Statistics, 2013

It is often assumed that a vertical scale is necessary when value-added models depend upon the gain scores of students across two or more points in time. This article examines the conditions under which the scale transformations associated with the vertical scaling process would be expected to have a significant impact on normative interpretations…

Descriptors: Evaluation Methods, Scaling, Scores, Achievement Tests

A Model for Teacher Effects from Longitudinal Data without Assuming Vertical Scaling

Peer reviewed

Mariano, Louis T.; McCaffrey, Daniel F.; Lockwood, J. R. – Journal of Educational and Behavioral Statistics, 2010

There is an increasing interest in using longitudinal measures of student achievement to estimate individual teacher effects. Current multivariate models assume each teacher has a single effect on student outcomes that persists undiminished to all future test administrations (complete persistence [CP]) or can diminish with time but remains…

Descriptors: Persistence, Academic Achievement, Data Analysis, Teacher Influence

Origin of the Scaling Constant "d" = 1.7 in Item Response Theory.

Peer reviewed

Camilli, Gregory – Journal of Educational and Behavioral Statistics, 1994

Describes the scaling constant "d" = 1.702, used in Item Response Theory, which minimizes the maximum difference between the normal and logistic distribution functions. Recapitulates the theoretical and numerical derivation of "d" given by D. Haley (1952). (SLD)

Descriptors: Item Response Theory, Scaling
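
The minimax property described in the abstract above can be checked numerically. The following is a minimal sketch, not Camilli's or Haley's derivation: it assumes a golden-section search over a fixed grid of x values, and the function names, search interval, and grid settings are all illustrative choices rather than anything taken from the article.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic_cdf(x: float) -> float:
    """Standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-x))


def max_abs_difference(d: float, grid_step: float = 0.001, x_max: float = 6.0) -> float:
    """Largest |Phi(x) - logistic(d * x)| on a grid over [0, x_max].

    The absolute difference is symmetric about x = 0, so x >= 0 is enough.
    Grid resolution and range are assumptions of this sketch.
    """
    n = int(x_max / grid_step)
    return max(
        abs(normal_cdf(i * grid_step) - logistic_cdf(d * i * grid_step))
        for i in range(n + 1)
    )


def best_scaling_constant(lo: float = 1.5, hi: float = 1.9, tol: float = 1e-5) -> float:
    """Golden-section search for the d that minimizes the maximum difference.

    Assumes the objective is unimodal on [lo, hi] (an assumption of this sketch).
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = max_abs_difference(c), max_abs_difference(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = max_abs_difference(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = max_abs_difference(d)
    return (a + b) / 2.0


if __name__ == "__main__":
    d_hat = best_scaling_constant()
    print(f"estimated d: {d_hat:.4f}")
    print(f"maximum absolute difference at that d: {max_abs_difference(d_hat):.5f}")
```

Run as a script, the search should land close to the 1.702 quoted in the abstract, with a maximum difference between the two distribution functions below 0.01. Only the standard library is used, so the sketch has no dependencies.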

The Hierarchical Rater Model for Rated Test Items and Its Application to Large-Scale Educational Assessment Data.

Peer reviewed

Patz, Richard J.; Junker, Brian W.; Johnson, Matthew S.; Mariano, Louis T. – Journal of Educational and Behavioral Statistics, 2002

Discusses the hierarchical rater model (HRM) of R. Patz (1996) and shows how it can be used to scale examinees and items, model aspects of consensus among raters, and model individual rater severity and consistency effects. Also shows how the HRM fits into the generalizability theory framework. Compares the HRM to the conventional item response…

Descriptors: Educational Assessment, Generalizability Theory, Item Response Theory, Scaling

A Maximum Test for Scale: Type I Error Rates and Power.

Peer reviewed

Algina, James; And Others – Journal of Educational and Behavioral Statistics, 1995

A maximum test, in which the test statistic is the more extreme of the Brown-Forsythe and O'Brien test statistics, is developed, with estimated Type I error rates and power for all three tests. For study conditions, Type I error rates for the maximum test are near the nominal level. (SLD)

Descriptors: Error of Measurement, Estimation (Mathematics), Power (Statistics), Scaling

A Multilevel Bayesian Item Response Theory Method for Scaling Socioeconomic Status in International Studies of Education

Peer reviewed

May, Henry – Journal of Educational and Behavioral Statistics, 2006

In this article, a new method is presented and implemented for deriving a scale of socioeconomic status (SES) from international survey data using a multilevel Bayesian item response theory (IRT) model. The proposed model incorporates both international anchor items and nation-specific items and is able to (a) produce student family SES scores…

Descriptors: Item Response Theory, Bayesian Statistics, Socioeconomic Status, Scaling
