ERIC - Search Results

Publication Date

In 2025	4
Since 2024	7
Since 2021 (last 5 years)	7
Since 2016 (last 10 years)	22
Since 2006 (last 20 years)	38

Publication Type

Reports - Research	38
Journal Articles	31
Tests/Questionnaires	2
Numerical/Quantitative Data	1
Speeches/Meeting Papers	1

Education Level

Secondary Education	11
Elementary Education	9
Middle Schools	8
Junior High Schools	6
Elementary Secondary Education	5
Higher Education	4
Postsecondary Education	4
Grade 6	3
Grade 8	3
High Schools	3
Intermediate Grades	3
Grade 10	2
Grade 11	2
Grade 5	2
Grade 7	2
Grade 9	2
Early Childhood Education	1
Grade 1	1
Grade 12	1
Grade 4	1
Primary Education	1

Location

Texas	2
Canada	1
Chile	1
Florida	1
Germany	1
Kentucky (Louisville)	1
Maine	1
Netherlands	1
North Carolina	1
Qatar	1
Spain (Madrid)	1
Turkey	1

Laws, Policies, & Programs

Race to the Top

Assessments and Surveys

Program for International…	3
Graduate Record Examinations	1
Motivated Strategies for…	1
National Assessment of…	1
SAT (College Admission Test)	1
Wechsler Intelligence Scale…	1

Showing 1 to 15 of 38 results

Using Plausible Values When Fitting Multilevel Models with Large-Scale Assessment Data Using R

Peer reviewed

Francis L. Huang – Large-scale Assessments in Education, 2024

The use of large-scale assessments (LSAs) in education has grown in the past decade though analysis of LSAs using multilevel models (MLMs) using R has been limited. A reason for its limited use may be due to the complexity of incorporating both plausible values and weighted analyses in the multilevel analyses of LSA data. We provide additional…

Descriptors: Hierarchical Linear Modeling, Evaluation Methods, Educational Assessment, Data Analysis

Small Sample Methods in Multilevel Analysis

Peer reviewed

Yasuhiro Yamamoto; Yasuo Miyazaki – Journal of Experimental Education, 2025

Bayesian methods have been said to solve small sample problems in frequentist methods by reflecting prior knowledge in the prior distribution. However, there are dangers in strongly reflecting prior knowledge or situations where much prior knowledge cannot be used. In order to address the issue, in this article, we considered to apply two Bayesian…

Descriptors: Sample Size, Hierarchical Linear Modeling, Bayesian Statistics, Prior Learning

IRT Observed-Score Equating for Rater-Mediated Assessments Using a Hierarchical Rater Model

Peer reviewed

Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025

While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…

Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity

Dynamic Fit Index Cutoffs for Hierarchical and Second-Order Factor Models

Peer reviewed

Daniel McNeish; Patrick D. Manapat – Structural Equation Modeling: A Multidisciplinary Journal, 2024

A recent review found that 11% of published factor models are hierarchical models with second-order factors. However, dedicated recommendations for evaluating hierarchical model fit have yet to emerge. Traditional benchmarks like RMSEA <0.06 or CFI >0.95 are often consulted, but they were never intended to generalize to hierarchical models.…

Descriptors: Factor Analysis, Goodness of Fit, Hierarchical Linear Modeling, Benchmarking

Predictive Performance of Bayesian Stacking in Multilevel Education Data

Peer reviewed

Mingya Huang; David Kaplan – Journal of Educational and Behavioral Statistics, 2025

The issue of model uncertainty has been gaining interest in education and the social sciences community over the years, and the dominant methods for handling model uncertainty are based on Bayesian inference, particularly, Bayesian model averaging. However, Bayesian model averaging assumes that the true data-generating model is within the…

Descriptors: Bayesian Statistics, Hierarchical Linear Modeling, Statistical Inference, Predictor Variables

DIF Detection for Multiple Groups: Comparing Three-Level GLMMs and Multiple-Group IRT Models

Peer reviewed

Carmen Köhler; Lale Khorramdel; Artur Pokropek; Johannes Hartig – Journal of Educational Measurement, 2024

For assessment scales applied to different groups (e.g., students from different states; patients in different countries), multigroup differential item functioning (MG-DIF) needs to be evaluated in order to ensure that respondents with the same trait level but from different groups have equal response probabilities on a particular item. The…

Descriptors: Measures (Individuals), Test Bias, Models, Item Response Theory

Re-Examining Measurement Invariance of School Climate Surveys across Race/Ethnicity

Peer reviewed

Stephen M. Leach; Jason C. Immekus; Jeffrey C. Valentine; Prathiba Batley; Dena Dossett; Tamara Lewis; Thomas Reece – Assessment for Effective Intervention, 2025

Educators commonly use school climate survey scores to inform and evaluate interventions for equitably improving learning and reducing educational disparities. Unfortunately, validity evidence to support these (and other) score uses often falls short. In response, Whitehouse et al. proposed a collaborative, two-part validity testing framework for…

Descriptors: School Surveys, Measurement, Hierarchical Linear Modeling, Educational Environment

Validation Methods for Aggregate-Level Test Scale Linking: A Case Study Mapping School District Test Score Distributions to a Common Scale. CEPA Working Paper No. 16-09

Reardon, Sean F.; Ho, Andrew D.; Kalogrides, Demetra – Stanford Center for Education Policy Analysis, 2019

Linking score scales across different tests is considered speculative and fraught, even at the aggregate level (Feuer et al., 1999; Thissen, 2007). We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that…

Descriptors: Test Validity, Evaluation Methods, School Districts, Scores

Going beyond the Mean: Using Variances to Enhance Understanding of the Impact of Educational Interventions for Multilevel Models

Peer reviewed

Peralta, Yadira; Moreno, Mario; Harwell, Michael; Guzey, S. Selcen; Moore, Tamara J. – Educational Research Quarterly, 2018

Variance heterogeneity is a common feature of educational data when treatment differences expressed through means are present, and often reflects a treatment by subject interaction with respect to an outcome variable. Identifying variables that account for this interaction can enhance understanding of whom a treatment does and does not benefit in…

Descriptors: Educational Research, Hierarchical Linear Modeling, Engineering, Design

A Comparison of Split-Half and Multilevel Methods to Assess the Reliability of Progress Monitoring Outcomes

Peer reviewed

Van Norman, Ethan R.; Parker, David C. – Journal of Psychoeducational Assessment, 2018

One consideration for selecting progress monitoring tools is the reliability in which the instrument measures student response to instruction. Researchers and vendors establish reliability of growth using two analytic methods: (a) calculating slopes to even and odd observations for each student and correlating the resulting slopes (split-half),…

Descriptors: Curriculum Based Assessment, Hierarchical Linear Modeling, Reliability, Progress Monitoring

Comparability of Computer-Based and Paper-Based Science Assessments

Peer reviewed

Herrmann-Abell, Cari F.; Hardcastle, Joseph; DeBoer, George E. – Grantee Submission, 2018

We compared students' performance on a paper-based test (PBT) and three computer-based tests (CBTs). The three computer-based tests used different test navigation and answer selection features, allowing us to examine how these features affect student performance. The study sample consisted of 9,698 fourth through twelfth grade students from across…

Descriptors: Evaluation Methods, Tests, Computer Assisted Testing, Scores

Orthogonal Higher Order Structure of the WISC-IV Spanish Using Hierarchical Exploratory Factor Analytic Procedures

Peer reviewed

McGill, Ryan J.; Canivez, Gary L. – Journal of Psychoeducational Assessment, 2016

As recommended by Carroll, the present study examined the factor structure of the Wechsler Intelligence Scale for Children-Fourth Edition Spanish (WISC-IV Spanish) normative sample using higher order exploratory factor analytic techniques not included in the WISC-IV Spanish Technical Manual. Results indicated that the WISC-IV Spanish subtests were…

Descriptors: Children, Intelligence Tests, Spanish, Factor Analysis

Do Schools Affect Girls' and Boys' Reading Performance Differently? A Multilevel Study on the Gendered Effects of School Resources and School Practices

Peer reviewed

van Hek, Margriet; Kraaykamp, Gerbert; Pelzer, Ben – School Effectiveness and School Improvement, 2018

Few studies on male-female inequalities in education have elaborated on whether school characteristics affect girls' and boys' educational performance differently. This study investigated how school resources, being schools' socioeconomic composition, proportion of girls, and proportion of highly educated teachers, and school practices, being…

Descriptors: Gender Differences, Reading Achievement, Institutional Characteristics, Educational Resources

Two Teacher Quality Measures and the Role of Context: Evidence from Chile

Peer reviewed

Santelices, Maria Veronica; Valencia, Edgar; Gonzalez, Jorge; Taut, Sandy – Educational Assessment, Evaluation and Accountability, 2017

This research examines empirically the relationship between two measures of teacher quality: one based on professional standards and a second one using teacher value-added estimates. It also studies the extent to which teacher observable characteristics, such as teacher training variables, are associated to better performance on either of these…

Descriptors: Teacher Effectiveness, Context Effect, Foreign Countries, Value Added Models

The Factorial Survey: Design Selection and its Impact on Reliability and Internal Validity

Peer reviewed

Dülmer, Hermann – Sociological Methods & Research, 2016

The factorial survey is an experimental design consisting of varying situations (vignettes) that have to be judged by respondents. For more complex research questions, it quickly becomes impossible for an individual respondent to judge all vignettes. To overcome this problem, random designs are recommended most of the time, whereas quota designs…

Descriptors: Factor Analysis, Reliability, Validity, Benchmarking


Source

Journal of Educational…	3
Educational Assessment,…	2
Educational and Psychological…	2
Grantee Submission	2
Journal of Educational and…	2
Journal of Psychoeducational…	2
National Center for Education…	2
Society for Research on…	2
American Journal of Evaluation	1
Assessment & Evaluation in…	1
Assessment for Effective…	1
Assessment in Education:…	1
ETS Research Report Series	1
Educational Research Quarterly	1
Eurasian Journal of…	1
Journal of Chemical Education	1
Journal of Direct Instruction	1
Journal of Experimental…	1
Journal of Experimental…	1
Large-scale Assessments in…	1
Measurement in Physical…	1
Prospects: Quarterly Review…	1
Psychological Assessment	1
Reading & Writing Quarterly	1
School Effectiveness and…	1

Descriptor

Evaluation Methods	38
Hierarchical Linear Modeling	38
Correlation	12
Foreign Countries	11
Statistical Analysis	10
Models	8
Scores	8
Academic Achievement	7
Comparative Analysis	7
Factor Analysis	7
Achievement Tests	6
Equations (Mathematics)	6
Intervention	6
Regression (Statistics)	6
Educational Research	5
Measures (Individuals)	5
Test Validity	5
Computation	4
Data Analysis	4
Elementary Secondary Education	4
Item Response Theory	4
Randomized Controlled Trials	4
Secondary School Students	4
Simulation	4
Teacher Effectiveness	4

Author

Schochet, Peter Z.	2
Al-bakr, Fawziah	1
Albano, Anthony D.	1
Artur Pokropek	1
Bakia, Marianne	1
Benassi, Victor A.	1
Canivez, Gary L.	1
Carl Westine	1
Carmen Köhler	1
Choi, Kilchan	1
Crowley, Ryann	1
Dai, Yunyun	1
Daniel McNeish	1
David Kaplan	1
DeBoer, George E.	1
Delvaux, Eva	1
Dena Dossett	1
Devos, Geert	1
Dülmer, Hermann	1
Feng, Mingyu	1
Francis L. Huang	1
Gaviria, Jose Luis	1
Gonzalez, Jorge	1
Guzey, S. Selcen	1
Hampton, David D.	1