Showing 1 to 15 of 31 results
Peer reviewed | Direct link
John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring of constructed-response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
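The distinction the abstract draws is between drawing all N Time A x Time B cells from a single multinomial and fixing the Time A margins so that each row is its own multinomial. A minimal Python sketch of the two sampling models (cell probabilities and sample sizes are invented, not taken from the cited study):

# Illustrative only: multinomial vs. product-multinomial sampling for a
# 3x3 trend-scoring table (Time A score x Time B rescore).
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.30, 0.05, 0.01],
              [0.04, 0.25, 0.05],
              [0.01, 0.04, 0.25]])      # joint P(Time A = i, Time B = j)
N = 1000

# Multinomial: one draw of N responses over all nine cells jointly.
multinomial_table = rng.multinomial(N, P.ravel()).reshape(3, 3)

# Product multinomial: the Time A row totals are fixed by design, and each
# row is an independent multinomial over the Time B categories.
row_totals = np.array([360, 340, 300])
row_probs = P / P.sum(axis=1, keepdims=True)
product_table = np.vstack([rng.multinomial(n, p)
                           for n, p in zip(row_totals, row_probs)])

print(multinomial_table)
print(product_table)    # same expected cells, different sampling model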
Peer reviewed | Direct link
Pere J. Ferrando; David Navarro-González; Fabia Morales-Vives – Educational and Psychological Measurement, 2025
The problem of local item dependencies (LIDs) is very common in personality and attitude measures, particularly in those that measure narrow-bandwidth dimensions. At the structural level, these dependencies can be modeled by using extended factor analytic (FA) solutions that include correlated residuals. However, the effects that LIDs have on the…
Descriptors: Scores, Accuracy, Evaluation Methods, Factor Analysis
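As a rough illustration of how a local item dependency enters a factor-analytic solution, the sketch below (Python, with invented loadings and residual covariance, not taken from the cited study) builds a one-factor model-implied covariance matrix with one correlated residual and shows its effect on an omega-type reliability for the sum score:

# Illustrative only: one-factor model with a single correlated residual
# (a local item dependency between items 1 and 2).
import numpy as np

lam = np.array([0.7, 0.7, 0.6, 0.5])     # standardized loadings (invented)
theta = np.diag(1 - lam**2)              # unique variances
theta[0, 1] = theta[1, 0] = 0.20         # residual covariance = the LID

sigma = np.outer(lam, lam) + theta       # model-implied correlation matrix

# Omega-type reliability of the unweighted sum score: the LID inflates the
# total variance without adding common (true-score) variance.
omega = lam.sum() ** 2 / sigma.sum()
print(round(omega, 3))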
Peer reviewed | Direct link
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
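The Rasch model referred to here specifies P(correct) = 1 / (1 + exp(-(theta - b))) for person ability theta and item difficulty b. A minimal Python sketch of a small-sample calibration (25 simulated respondents and crude logit-based difficulty estimates; not the estimation method used in the cited study):

# Illustrative only: simulate Rasch (1-PL) responses for a small sample and
# compute crude difficulty estimates from item proportions correct.
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 25, 10
theta = rng.normal(0.0, 1.0, n_persons)          # person abilities
b = np.linspace(-1.5, 1.5, n_items)              # true item difficulties

p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))   # Rasch IRF
responses = rng.binomial(1, p)                   # 25 x 10 response matrix

p_correct = np.clip(responses.mean(axis=0), 0.02, 0.98)
b_hat = np.log((1 - p_correct) / p_correct)      # crude logit estimate
b_hat -= b_hat.mean()                            # fix the scale (mean 0)
print(np.round(b_hat, 2))
print(np.round(b, 2))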
Peer reviewed | Full text PDF available on ERIC
Jewsbury, Paul A. – ETS Research Report Series, 2019
When an assessment undergoes changes to the administration or instrument, bridge studies are typically used to try to ensure comparability of scores before and after the change. Among the most common and powerful is the common population linking design, with the use of a linear transformation to link scores to the metric of the original…
Descriptors: Evaluation Research, Scores, Error Patterns, Error of Measurement
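A minimal sketch of the linear linking transformation mentioned in the abstract, which places new-form scores on the original metric by matching the mean and standard deviation in the common (linking) population (Python, invented data, not the report's actual procedure):

# Illustrative only: mean/sigma linear linking of new-form scores onto the
# original form's metric, using a common linking population.
import numpy as np

rng = np.random.default_rng(2)
old_form = rng.normal(250, 35, 2000)      # original-form scores, linking sample
new_form = rng.normal(500, 90, 2000)      # new-form scores, same population

# Linear transformation: match mean and standard deviation.
a = old_form.std(ddof=1) / new_form.std(ddof=1)
b = old_form.mean() - a * new_form.mean()
linked = a * new_form + b                 # new-form scores on the old metric

print(round(linked.mean(), 1), round(linked.std(ddof=1), 1))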
Peer reviewed | Full text PDF available on ERIC
Wang, Jianjun; Ma, Xin – Athens Journal of Education, 2019
This rejoinder keeps the original focus on statistical computing pertaining to the correlation of student achievement between mathematics and science from the Trends in International Mathematics and Science Study (TIMSS). Despite the availability of student performance data in TIMSS and the emphasis on the inter-subject connection in the Next Generation Science…
Descriptors: Scores, Correlation, Achievement Tests, Elementary Secondary Education
Guerrero, Tricia A.; Griffin, Thomas D.; Wiley, Jennifer – Grantee Submission, 2020
The Predict-Observe-Explain (POE) learning cycle improves understanding of the connection between empirical results and theoretical concepts when students engage in hands-on experimentation. This study explored whether training students to use a POE strategy when learning from social science texts that describe theories and experimental results…
Descriptors: Prediction, Observation, Reading Comprehension, Correlation
Peer reviewed | Direct link
Reardon, Sean F.; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2015
In an earlier paper, we presented methods for estimating achievement gaps when test scores are coarsened into a small number of ordered categories, preventing fine-grained distinctions between individual scores. We demonstrated that gaps can nonetheless be estimated with minimal bias across a broad range of simulated and real coarsened data…
Descriptors: Achievement Gap, Performance Factors, Educational Practices, Scores
Reardon, Sean F.; Ho, Andrew D. – Grantee Submission, 2015
Ho and Reardon (2012) present methods for estimating achievement gaps when test scores are coarsened into a small number of ordered categories, preventing fine-grained distinctions between individual scores. They demonstrate that gaps can nonetheless be estimated with minimal bias across a broad range of simulated and real coarsened data…
Descriptors: Achievement Gap, Performance Factors, Educational Practices, Scores
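Both entries above concern estimating a standardized achievement gap when only counts in a few ordered proficiency categories are available. A minimal sketch under the assumption that each group's latent score distribution is normal (invented counts; not necessarily the exact estimator developed in these papers): transform each group's cumulative category proportions with the probit function, recover the second group's mean and SD from the implied cut locations, and form a pooled-SD gap.

# Illustrative only: a standardized gap from coarsened scores, assuming
# respective normality within groups. Counts are invented.
import numpy as np
from scipy.stats import norm

counts_a = np.array([100, 300, 400, 200])   # group A counts in 4 ordered levels
counts_b = np.array([220, 360, 300, 120])   # group B counts in the same levels

def probit_cuts(counts):
    cum = np.cumsum(counts)[:-1] / counts.sum()   # cumulative proportions at cuts
    return norm.ppf(cum)

za = probit_cuts(counts_a)   # = (c_k - mu_a) / sigma_a, with mu_a = 0, sigma_a = 1
zb = probit_cuts(counts_b)   # = (c_k - mu_b) / sigma_b

# Regress the group-A cut locations on the group-B probits:
# c_k = mu_b + sigma_b * zb_k, so slope = sigma_b and intercept = mu_b.
sigma_b, mu_b = np.polyfit(zb, za, 1)
gap = (0.0 - mu_b) / np.sqrt((1.0 + sigma_b**2) / 2.0)   # A minus B, pooled SD
print(round(gap, 3))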
Peer reviewed | Direct link
He, Qingping; Anwyll, Steve; Glanville, Matthew; Opposs, Dennis – Research Papers in Education, 2014
Since 2010, the whole-cohort Key Stage 2 (KS2) National Curriculum test in science in England has been replaced with a sampling test taken annually by pupils aged 11 from a nationally representative sample of schools. The study reported in this paper compares the performance of different subgroups of the samples (classified by…
Descriptors: National Curriculum, Sampling, Foreign Countries, Factor Analysis
Haberman, Shelby J.; Dorans, Neil J. – Educational Testing Service, 2011
For testing programs that administer multiple forms within a year and across years, score equating is used to ensure that scores can be used interchangeably. In an ideal world, sample sizes are large and representative of populations that hardly change over time, and very reliable alternate test forms are built with nearly identical psychometric…
Descriptors: Scores, Reliability, Equated Scores, Test Construction
Peer reviewed | Direct link
Leue, Anja; Lange, Sebastian – Assessment, 2011
The assessment of positive affect (PA) and negative affect (NA) by means of the Positive Affect and Negative Affect Schedule has gained remarkable popularity in the social sciences. Using a meta-analytic tool, namely reliability generalization (RG), population reliability scores of both scales have been investigated on the basis of a random…
Descriptors: Social Sciences, True Scores, Generalization, Affective Behavior
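Reliability generalization pools study-level reliability estimates much like an ordinary meta-analysis. A minimal sketch of a DerSimonian-Laird random-effects pooling (Python, invented estimates and standard errors; not the analysis reported in the cited article):

# Illustrative only: DerSimonian-Laird random-effects pooling of study-level
# reliability estimates, the kind of computation an RG study performs.
import numpy as np

r  = np.array([0.86, 0.89, 0.84, 0.91, 0.88])   # study reliability estimates
se = np.array([0.02, 0.015, 0.03, 0.01, 0.02])  # their standard errors

w_fixed = 1.0 / se**2
mean_fixed = np.sum(w_fixed * r) / np.sum(w_fixed)

# Between-study variance (DerSimonian-Laird).
q = np.sum(w_fixed * (r - mean_fixed)**2)
c = np.sum(w_fixed) - np.sum(w_fixed**2) / np.sum(w_fixed)
tau2 = max(0.0, (q - (len(r) - 1)) / c)

w_random = 1.0 / (se**2 + tau2)
mean_random = np.sum(w_random * r) / np.sum(w_random)
se_random = np.sqrt(1.0 / np.sum(w_random))
print(round(mean_random, 3), round(se_random, 3))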
Peer reviewed | Direct link
Marsh, Herbert W.; Ludtke, Oliver; Nagengast, Benjamin; Trautwein, Ulrich; Morin, Alexandre J. S.; Abduljabbar, Adel S.; Koller, Olaf – Educational Psychologist, 2012
Classroom context and climate are inherently classroom-level (L2) constructs, but applied researchers sometimes--inappropriately--represent them by student-level (L1) responses in single-level models rather than more appropriate multilevel models. Here we focus on important conceptual issues (distinctions between climate and contextual variables;…
Descriptors: Foreign Countries, Classroom Environment, Educational Research, Research Design
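To illustrate the contrast the abstract draws, the sketch below simulates students nested in classrooms, treats the classroom mean of a student-level rating as the climate (L2) variable, and compares a single-level regression on the L1 rating with a multilevel model on the L2 aggregate (Python with pandas and statsmodels; simulated data, not the authors' analysis):

# Illustrative only: single-level analysis of an L1 climate rating vs. a
# multilevel model using the classroom-level (L2) aggregate.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_class, n_per = 40, 25
classroom = np.repeat(np.arange(n_class), n_per)
true_climate = rng.normal(0, 1, n_class)[classroom]          # L2 construct
rating = true_climate + rng.normal(0, 1, n_class * n_per)    # noisy L1 report
outcome = 0.5 * true_climate + rng.normal(0, 1, n_class * n_per)

df = pd.DataFrame({"classroom": classroom, "rating": rating, "outcome": outcome})
df["climate_l2"] = df.groupby("classroom")["rating"].transform("mean")

single_level = smf.ols("outcome ~ rating", data=df).fit()
multilevel = smf.mixedlm("outcome ~ climate_l2", data=df,
                         groups=df["classroom"]).fit()
print(round(single_level.params["rating"], 3),
      round(multilevel.params["climate_l2"], 3))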
Wilson, Celia M. – ProQuest LLC, 2010
Research pertaining to the distortion of the squared canonical correlation coefficient has traditionally been limited to the effects of sampling error and associated correction formulas. The purpose of this study was to compare the degree of attenuation of the squared canonical correlation coefficient under varying conditions of score reliability.…
Descriptors: Monte Carlo Methods, Measurement, Multivariate Analysis, Error of Measurement
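The attenuation at issue follows the classical pattern: with score reliabilities r_xx and r_yy, the expected observed correlation is roughly rho * sqrt(r_xx * r_yy), so the squared coefficient shrinks accordingly. A small Monte Carlo sketch of that mechanism in the bivariate case (Python, invented values; the cited dissertation studies the squared canonical correlation, which is more involved):

# Illustrative only: Monte Carlo attenuation of a squared correlation
# under unreliable scores.
import numpy as np

rng = np.random.default_rng(4)
n, reps = 200, 2000
rho, rxx, ryy = 0.6, 0.80, 0.70        # true correlation and score reliabilities

obs_r2 = []
for _ in range(reps):
    true_x = rng.normal(0, 1, n)
    true_y = rho * true_x + np.sqrt(1 - rho**2) * rng.normal(0, 1, n)
    # Add error so that reliability = true variance / total variance.
    x = true_x + rng.normal(0, np.sqrt(1 / rxx - 1), n)
    y = true_y + rng.normal(0, np.sqrt(1 / ryy - 1), n)
    obs_r2.append(np.corrcoef(x, y)[0, 1] ** 2)

print(round(np.mean(obs_r2), 3))                  # observed squared correlation
print(round((rho * np.sqrt(rxx * ryy))**2, 3))    # classical attenuation formula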
Doorey, Nancy A. – Council of Chief State School Officers, 2011
The work reported in this paper reflects a collaborative effort of many individuals representing multiple organizations. It began during a session at the October 2008 meeting of TILSA, when a representative of a member state asked the group whether any of their programs had experienced unexpected fluctuations in annual state assessment scores, and…
Descriptors: Testing, Sampling, Expertise, Testing Programs
Peer reviewed | Full text PDF available on ERIC
Olsen, Robert B.; Unlu, Fatih; Price, Cristofer; Jaciw, Andrew P. – National Center for Education Evaluation and Regional Assistance, 2011
This report examines the differences in impact estimates and standard errors that arise when these are derived using state achievement tests only (as pre-tests and post-tests), study-administered tests only, or some combination of state- and study-administered tests. State tests may yield different evaluation results relative to a test that is…
Descriptors: Achievement Tests, Standardized Tests, State Standards, Reading Achievement
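One reason the choice of test matters for precision is how strongly the pretest covariate correlates with the outcome. A minimal sketch of a covariate-adjusted impact estimate and its standard error under two hypothetical pretests (Python with statsmodels; simulated data and made-up parameters, not the report's design or estimator):

# Illustrative only: impact estimate and standard error with a weaker vs.
# stronger pretest covariate.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, effect = 1000, 0.20
treat = rng.binomial(1, 0.5, n)

for pretest_corr in (0.5, 0.8):          # two hypothetical pretest measures
    ability = rng.normal(0, 1, n)
    pretest = pretest_corr * ability + np.sqrt(1 - pretest_corr**2) * rng.normal(0, 1, n)
    posttest = ability + effect * treat + rng.normal(0, 0.5, n)
    X = sm.add_constant(np.column_stack([treat, pretest]))
    fit = sm.OLS(posttest, X).fit()
    # Coefficient on treatment and its standard error.
    print(pretest_corr, round(fit.params[1], 3), round(fit.bse[1], 3))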