ERIC - Search Results

Showing 1 to 15 of 20 results

Reconceptualization of Coefficient Alpha Reliability for Test Summed and Scaled Scores

Peer reviewed

Almehrizi, Rashid S. – Educational Measurement: Issues and Practice, 2022

Coefficient alpha reliability persists as the most common reliability coefficient reported in research. The assumptions for its use are, however, not well-understood. The current paper challenges the commonly used expressions of coefficient alpha and argues that while these expressions are correct when estimating reliability for summed scores,…

Descriptors: Reliability, Scores, Scaling, Statistical Analysis

Digital Module 16: Longitudinal Data Analysis

Peer reviewed

Harring, Jeffrey R.; Johnson, Tessa L. – Educational Measurement: Issues and Practice, 2020

In this digital ITEMS module, Dr. Jeffrey Harring and Ms. Tessa Johnson introduce the linear mixed effects (LME) model as a flexible general framework for simultaneously modeling continuous repeated measures data with a scientifically defensible function that adequately summarizes both individual change as well as the average response. The module…

Descriptors: Educational Assessment, Data Analysis, Longitudinal Studies, Case Studies

Easier Said than Done: Rejoinder on Sijtsma and on Green and Yang

Peer reviewed

Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U. – Educational Measurement: Issues and Practice, 2016

The main points of Sijtsma and Green and Yang in Educational Measurement: Issues and Practice (34, 4) are that reliability, internal consistency, and unidimensionality are distinct and that Cronbach's alpha may be problematic. Neither of these assertions are at odds with Davenport, Davison, Liou, and Love in the same issue. However, many authors…

Descriptors: Educational Assessment, Reliability, Validity, Test Construction

Automated Scoring of Students' Small-Group Discussions to Assess Reading Ability

Peer reviewed

Kosh, Audra E.; Greene, Jeffrey A.; Murphy, P. Karen; Burdick, Hal; Firetto, Carla M.; Elmore, Jeff – Educational Measurement: Issues and Practice, 2018

We explored the feasibility of using automated scoring to assess upper-elementary students' reading ability through analysis of transcripts of students' small-group discussions about texts. Participants included 35 fourth-grade students across two classrooms that engaged in a literacy intervention called Quality Talk. During the course of one…

Descriptors: Computer Assisted Testing, Small Group Instruction, Group Discussion, Student Evaluation

A Synthesis of the Peer-Reviewed Differential Bundle Functioning Research

Peer reviewed

Banks, Kathleen – Educational Measurement: Issues and Practice, 2013

The purpose of this article was to present a synthesis of the peer-reviewed differential bundle functioning (DBF) research that has been conducted to date. A total of 16 studies were synthesized according to the following characteristics: tests used and learner groups, organizing principles used for developing bundles, DBF detection methods used,…

Descriptors: Test Bias, Research, Tests, Student Characteristics

Components of Variance of Scales with a Bifactor Subscale Structure from Two Calculations of Alpha

Peer reviewed

Andrich, David – Educational Measurement: Issues and Practice, 2016

Since Cronbach's (1951) elaboration of α from its introduction by Guttman (1945), this coefficient has become ubiquitous in characterizing assessment instruments in education, psychology, and other social sciences. Also ubiquitous are caveats on the calculation and interpretation of this coefficient. This article summarizes a recent contribution…

Descriptors: Computation, Correlation, Test Theory, Measures (Individuals)

Validating Student Score Inferences with Person-Fit Statistic and Verbal Reports: A Person-Fit Study for Cognitive Diagnostic Assessment

Peer reviewed

Cui, Ying; Roberts, Mary Roduta – Educational Measurement: Issues and Practice, 2013

The goal of this study was to investigate the usefulness of person-fit analysis in validating student score inferences in a cognitive diagnostic assessment. In this study, a two-stage procedure was used to evaluate person fit for a diagnostic test in the domain of statistical hypothesis testing. In the first stage, the person-fit statistic, the…

Descriptors: Scores, Validity, Cognitive Tests, Diagnostic Tests

A Meta-Analysis of Research on the Read Aloud Accommodation

Peer reviewed

Buzick, Heather; Stone, Elizabeth – Educational Measurement: Issues and Practice, 2014

Read aloud is a testing accommodation that has been studied by many researchers, and its use on K-12 assessments continues to be debated because of its potential to change the measured construct or unfairly increase test scores. This study is a summary of quantitative research on the read aloud accommodation. Previous studies contributed…

Descriptors: Meta Analysis, Reading Aloud to Others, Educational Research, Statistical Analysis

Evaluating the Predictive Value of Growth Prediction Models

Peer reviewed

Murphy, Daniel L.; Gaertner, Matthew N. – Educational Measurement: Issues and Practice, 2014

This study evaluates four growth prediction models--projection, student growth percentile, trajectory, and transition table--commonly used to forecast (and give schools credit for) middle school students' future proficiency. Analyses focused on vertically scaled summative mathematics assessments, and two performance standards conditions (high…

Descriptors: Prediction, Models, Achievement Gains, Middle School Students

Universal Design and Multimethod Approaches to Item Review

Peer reviewed

Johnstone, Christopher J.; Thompson, Sandra J.; Bottsford-Miller, Nicole A.; Thurlow, Martha L. – Educational Measurement: Issues and Practice, 2008

Test items undergo multiple iterations of review before states and vendors deem them acceptable to be placed in a live statewide assessment. This article reviews three approaches that can add validity evidence to states' item review processes. The first process is a structured sensitivity review process that focuses on universal design…

Descriptors: Test Items, Disabilities, Test Construction, Testing Programs

Subscores Based on Classical Test Theory: To Report or Not to Report

Peer reviewed

Sinharay, Sandip; Haberman, Shelby; Puhan, Gautam – Educational Measurement: Issues and Practice, 2007

There is an increasing interest in reporting subscores, both at examinee level and at aggregate levels. However, it is important to ensure reasonable subscore performance in terms of high reliability and validity to minimize incorrect instructional and remediation decisions. This article employs a statistical measure based on classical test theory…

Descriptors: Test Reliability, Test Theory, Test Validity, Statistical Analysis

The Technical Quality of Performance Assessments: Standard Errors of Percents of Pupils Reaching Standards.

Peer reviewed

Yen, Wendy M. – Educational Measurement: Issues and Practice, 1997

The accuracy of statistics based on performance assessments that represent percentages of students reaching standards is explored using data from a large-scale performance assessment, the Maryland School Performance Assessment Program. Results with students in grades 3, 5, and 8 support the accuracy of pooling results to produce the statistics.…

Descriptors: Achievement Tests, Elementary Education, Error of Measurement, Performance Based Assessment

Using Statistical Procedures To Identify Differentially Functioning Test Items. An NCME Instructional Module.

Peer reviewed

Clauser, Brian E.; Mazor, Kathleen M. – Educational Measurement: Issues and Practice, 1998

This module prepares the reader to use statistical procedures to detect differentially functioning test items. The Mantel-Haenszel statistic, logistic regression, the SIBTEST procedure, the Standardization procedure, and various item response theory-based procedures are presented. Theoretical frameworks, strengths and weaknesses, and…

Descriptors: Item Bias, Item Response Theory, Statistical Analysis, Teaching Methods

Book Review: Educational Measurement, Third Edition.

Peer reviewed

Cronbach, Lee J. – Educational Measurement: Issues and Practice, 1989

The book reviewed is a compendium of current thinking about measurement theory and test use. It includes content by 26 authors at 3 levels: (1) accessible to educators, policy makers, and graduate students; (2) suited for technical students; and (3) written for qualified measurement specialists. Strengths and weaknesses are noted. (SLD)

Descriptors: Book Reviews, Educational Assessment, Evaluation Methods, Measurement Techniques

Using Dimensionality-Based DIF Analyses to Identify and Interpret Constructs That Elicit Group Differences

Peer reviewed

Gierl, Mark J. – Educational Measurement: Issues and Practice, 2005

In this paper I describe and illustrate the Roussos-Stout (1996) multidimensionality-based DIF analysis paradigm, with emphasis on its implication for the selection of a matching and studied subtest for DIF analyses. Standard DIF practice encourages an exploratory search for matching subtest items based on purely statistical criteria, such as a…

Descriptors: Models, Test Items, Test Bias, Statistical Analysis
