ERIC - Search Results

Showing all 14 results

Effect Size Measures for Differential Item Functioning in Cognitive Diagnostic Models

Yanan Feng – ProQuest LLC, 2021

This dissertation aims to investigate the effect size measures of differential item functioning (DIF) detection in the context of cognitive diagnostic models (CDMs). A variety of DIF detection techniques have been developed in the context of CDMs. However, most of the DIF detection procedures focus on the null hypothesis significance test. Few…

Descriptors: Effect Size, Item Response Theory, Cognitive Measurement, Models

Rejoinder: Response To--"An Examination of Plausible Score Correlation from the Trend in Mathematics and Science Study"

Peer reviewed

Wang, Jianjun; Ma, Xin – Athens Journal of Education, 2019

This rejoinder keeps the original focus on statistical computing pertaining to the correlation of student achievement between mathematics and science from the Trend in Mathematics and Science Study (TIMSS). Albeit the availability of student performance data in TIMSS and the emphasis of the inter-subject connection in the Next Generation Science…

Descriptors: Scores, Correlation, Achievement Tests, Elementary Secondary Education

Determining Differential Item Functioning with the Mixture Item Response Theory

Peer reviewed

Yalcin, Seher – Eurasian Journal of Educational Research, 2018

Purpose: Studies in the literature have generally demonstrated that the causes of differential item functioning (DIF) are complex and not directly related to defined groups. The purpose of this study is to determine the DIF according to the mixture item response theory (MixIRT) model, based on the latent group approach, as well as the…

Descriptors: Item Response Theory, Test Items, Test Bias, Error of Measurement

Multilevel Multidimensional Item Response Model with a Multilevel Latent Covariate

Peer reviewed

Cho, Sun-Joo; Bottge, Brian A. – Grantee Submission, 2015

In a pretest-posttest cluster-randomized trial, one of the methods commonly used to detect an intervention effect involves controlling pre-test scores and other related covariates while estimating an intervention effect at post-test. In many applications in education, the total post-test and pre-test scores that ignores measurement error in the…

Descriptors: Item Response Theory, Hierarchical Linear Modeling, Pretests Posttests, Scores

Detecting Intervention Effects in a Cluster-Randomized Design Using Multilevel Structural Equation Modeling for Binary Responses

Peer reviewed

Cho, Sun-Joo; Preacher, Kristopher J.; Bottge, Brian A. – Grantee Submission, 2015

Multilevel modeling (MLM) is frequently used to detect group differences, such as an intervention effect in a pre-test--post-test cluster-randomized design. Group differences on the post-test scores are detected by controlling for pre-test scores as a proxy variable for unobserved factors that predict future attributes. The pre-test and post-test…

Descriptors: Structural Equation Models, Hierarchical Linear Modeling, Intervention, Program Effectiveness

ACT Reporting Category Interpretation Guide: Version 1.0. ACT Working Paper 2016 (05)

Powers, Sonya; Li, Dongmei; Suh, Hongwook; Harris, Deborah J. – ACT, Inc., 2016

ACT reporting categories and ACT Readiness Ranges are new features added to the ACT score reports starting in fall 2016. For each reporting category, the number correct score, the maximum points possible, the percent correct, and the ACT Readiness Range, along with an indicator of whether the reporting category score falls within the Readiness…

Descriptors: Scores, Classification, College Entrance Examinations, Error of Measurement

An Application of Generalizability Theory to Evaluate the Technical Quality of an Alternate Assessment

Peer reviewed

Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013

Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…

Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores

Multi-Population Invariance with Dichotomous Measures: Combining Multi-Group and MIMIC Methodologies in Evaluating the General Aptitude Test in the Arabic Language

Peer reviewed

Sideridis, Georgios D.; Tsaousis, Ioannis; Al-harbi, Khaleel A. – Journal of Psychoeducational Assessment, 2015

The purpose of the present study was to extend the model of measurement invariance by simultaneously estimating invariance across multiple populations in the dichotomous instrument case using multi-group confirmatory factor analytic and multiple indicator multiple causes (MIMIC) methodologies. Using the Arabic version of the General Aptitude Test…

Descriptors: Semitic Languages, Aptitude Tests, Error of Measurement, Factor Analysis

Detecting Differential Item Functioning Using Generalized Logistic Regression in the Context of Large-Scale Assessments

Peer reviewed

Svetina, Dubravka; Rutkowski, Leslie – Large-scale Assessments in Education, 2014

Background: When studying student performance across different countries or cultures, an important aspect for comparisons is that of score comparability. In other words, it is imperative that the latent variable (i.e., construct of interest) is understood and measured equivalently across all participating groups or countries, if our inferences…

Descriptors: Test Items, Item Response Theory, Item Analysis, Regression (Statistics)

Comparison of the One- and Bi-Direction Chained Equipercentile Equating

Peer reviewed

Oh, Hyeonjoo; Moses, Tim – Journal of Educational Measurement, 2012

This study investigated differences between two approaches to chained equipercentile (CE) equating (one- and bi-direction CE equating) in nearly equal groups and relatively unequal groups. In one-direction CE equating, the new form is linked to the anchor in one sample of examinees and the anchor is linked to the reference form in the other…

Descriptors: Equated Scores, Statistical Analysis, Comparative Analysis, Differences

Multiple Imputation Using Chained Equations for Missing Data in TIMSS: A Case Study

Peer reviewed

Bouhlila, Donia Smaali; Sellaouti, Fethi – Large-scale Assessments in Education, 2013

In this paper, we document a study that involved applying a multiple imputation technique with chained equations to data drawn from the 2007 iteration of the TIMSS database. More precisely, we imputed missing variables contained in the student background datafile for Tunisia (one of the TIMSS 2007 participating countries), by using Van Buuren,…

Descriptors: Databases, Student Characteristics, Error of Measurement, Intervals

An Investigation into the Psychometric Properties of the Proportional Reduction of Mean Squared Error and Augmented Scores

Stephens, Christopher Neil – ProQuest LLC, 2012

Augmentation procedures are designed to provide better estimates for a given test or subtest through the use of collateral information. The main purpose of this dissertation was to use Haberman's and Wainer's augmentation procedures on a large-scale, standardized achievement test to understand the relationship between reliability and…

Descriptors: Psychometrics, Error of Measurement, Scores, Reliability

Measuring Test Measurement Error: A General Approach

Peer reviewed

Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013

Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…

Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement

The Determination of Empirical Standard Errors of Equating the Scores on SAT-Verbal and SAT-Mathematical.

Angoff, William H. – 1991

An attempt was made to evaluate the standard error of equating (at the mean of the scores) in an ongoing testing program. The interest in estimating the empirical standard error of equating is occasioned by some discomfort with the error normally reported for test scores. Data used for this evaluation came from the Admissions Testing Program of…

Descriptors: College Entrance Examinations, Equated Scores, Error of Measurement, High School Students
