ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	3
Since 2006 (last 20 years)	9

Descriptor

Accuracy	9
Comparative Analysis	9
Foreign Countries	3
Item Response Theory	3
Secondary School Students	3
Achievement Tests	2
Classification	2
Educational Assessment	2
Elementary School Students	2
Equated Scores	2
Grade 3	2
International Assessment	2
Mathematics	2
Sampling	2
Scores	2
Scoring	2
Simulation	2
Statistical Analysis	2
Statistical Bias	2
Test Construction	2
Test Items	2
Validity	2
Adolescents	1
Athletics	1
Benchmarking	1

Source

ETS Research Report Series	2
ACT, Inc.	1
Behavioral Research and…	1
Cambridge Assessment	1
Journal of Experimental…	1
MDRC	1
Office of Planning,…	1
Research Quarterly for…	1

Publication Type

Numerical/Quantitative Data	9
Reports - Research	8
Journal Articles	4
Reports - Evaluative	1
Speeches/Meeting Papers	1
Tests/Questionnaires	1

Education Level

Secondary Education	4
Early Childhood Education	2
Elementary Education	2
Elementary Secondary Education	2
Grade 3	2
Primary Education	2
Grade 10	1
Grade 4	1
Grade 5	1
Grade 6	1
Grade 7	1
Grade 8	1
High Schools	1
Intermediate Grades	1
Junior High Schools	1
Middle Schools	1

Location

California (Los Angeles)	1
China	1
Florida	1
Hawaii	1
Maryland (Baltimore)	1
Ohio	1
Rhode Island	1
Texas	1
Texas (Houston)	1
United Kingdom (England)	1

Laws, Policies, & Programs

American Recovery and…	1
Elementary and Secondary…	1

Assessments and Surveys

Program for International…

Showing all 9 results

Criterion Validity and Classification Accuracy of easyCBM: Grades 3-8. (Technical Report # 2401)

Gerald Tindal; Joseph F. T. Nese – Behavioral Research and Teaching, 2024

We present two types of validity evidence to support inferences and decisions about use of easyCBMs in relation to state testing programs. The first type involves the use of Benchmarks in reading to use in making predictions of performance on the Smarter Balanced (SB) test. These predictions can be made both well in advance (several months) or…

Descriptors: Classification, Accuracy, Validity, Criteria

Evaluating the 'Similar Items Method' for Standard Maintaining. Conference Paper

Bramley, Tom – Cambridge Assessment, 2018

The aim of the research reported here was to get some idea of the accuracy of grade boundaries (cut-scores) obtained by applying the 'similar items method' described in Bramley & Wilson (2016). In this method experts identify items on the current version of a test that are sufficiently similar to items on previous versions for them to be…

Descriptors: Accuracy, Cutting Scores, Test Items, Item Analysis

Comparing Data Treatments on Item-Level Nonresponse and Their Effects on Data Analysis of Large-Scale Assessments: 2009 PISA Study. Research Report. ETS RR-15-12

Peer reviewed

Chen, Haiwen H.; von Davier, Matthias; Yamamoto, Kentaro; Kong, Nan – ETS Research Report Series, 2015

One major issue with large-scale assessments is that the respondents might give no responses to many items, resulting in less accurate estimations of both assessed abilities and item parameters. This report studies how the types of items affect the item-level nonresponse rates and how different methods of treating item-level nonresponses have an…

Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students

A Comparison of Four Linear Equating Methods for the Common-Item Nonequivalent Groups Design Using Simulation Methods. ACT Research Report Series, 2013 (2)

Topczewski, Anna; Cui, Zhongmin; Woodruff, David; Chen, Hanwei; Fang, Yu – ACT, Inc., 2013

This paper investigates four methods of linear equating under the common item nonequivalent groups design. Three of the methods are well known: Tucker, Angoff-Levine, and Congeneric-Levine. A fourth method is presented as a variant of the Congeneric-Levine method. Using simulation data generated from the three-parameter logistic IRT model we…

Descriptors: Comparative Analysis, Equated Scores, Methods, Simulation

Exploring the Quality of School-Level Expenditure Data: Practices and Lessons Learned in Nine Sites

Atchison, Drew; Baker, Bruce; Levin, Jesse; Manship, Karen – Office of Planning, Evaluation and Policy Development, US Department of Education, 2017

Concerns about the equitable distribution of school funding within and across school districts have led to new federal data collections on school-level expenditures. The American Recovery and Reinvestment Act of 2009 (ARRA) required states to collect and report, for the first time, school-level data on both personnel and non-personnel expenditures…

Descriptors: Educational Quality, Expenditures, School Districts, Data Collection

A Bootstrap Procedure of Propensity Score Estimation

Peer reviewed

Bai, Haiyan – Journal of Experimental Education, 2013

Propensity score estimation plays a fundamental role in propensity score matching for reducing group selection bias in observational data. To increase the accuracy of propensity score estimation, the author developed a bootstrap propensity score. The commonly used propensity score matching methods: nearest neighbor matching, caliper matching, and…

Descriptors: Statistical Inference, Sampling, Probability, Computation

The Validity and Precision of the Comparative Interrupted Time Series Design and the Difference-in-Difference Design in Educational Evaluation

Somers, Marie-Andrée; Zhu, Pei; Jacob, Robin; Bloom, Howard – MDRC, 2013

In this paper, we examine the validity and precision of two nonexperimental study designs (NXDs) that can be used in educational evaluation: the comparative interrupted time series (CITS) design and the difference-in-difference (DD) design. In a CITS design, program impacts are evaluated by looking at whether the treatment group deviates from its…

Descriptors: Research Design, Educational Assessment, Time, Intervals

Predicting Chinese Children and Youth's Energy Expenditure Using ActiGraph Accelerometers: A Calibration and Cross-Validation Study

Peer reviewed

Zhu, Zheng; Chen, Peijie; Zhuang, Jie – Research Quarterly for Exercise and Sport, 2013

Purpose: The purpose of this study was to develop and cross-validate an equation based on ActiGraph accelerometer GT3X output to predict children and youth's energy expenditure (EE) of physical activity (PA). Method: Participants were 367 Chinese children and youth (179 boys and 188 girls, aged 9 to 17 years old) who wore 1 ActiGraph GT3X…

Descriptors: Foreign Countries, Physical Activities, Physical Activity Level, Children

Studies of a Latent-Class Signal-Detection Model for Constructed-Response Scoring. Research Report. ETS RR-08-63

Peer reviewed

DeCarlo, Lawrence T. – ETS Research Report Series, 2008

Rater behavior in essay grading can be viewed as a signal-detection task, in that raters attempt to discriminate between latent classes of essays, with the latent classes being defined by a scoring rubric. The present report examines basic aspects of an approach to constructed-response (CR) scoring via a latent-class signal-detection model. The…

Descriptors: Scoring, Responses, Test Format, Bias

Author

Atchison, Drew	1
Bai, Haiyan	1
Baker, Bruce	1
Bloom, Howard	1
Bramley, Tom	1
Chen, Haiwen H.	1
Chen, Hanwei	1
Chen, Peijie	1
Cui, Zhongmin	1
DeCarlo, Lawrence T.	1
Fang, Yu	1
Gerald Tindal	1
Jacob, Robin	1
Joseph F. T. Nese	1
Kong, Nan	1
Levin, Jesse	1
Manship, Karen	1
Somers, Marie-Andrée	1
Topczewski, Anna	1
Woodruff, David	1
Yamamoto, Kentaro	1
Zhu, Pei	1
Zhu, Zheng	1
Zhuang, Jie	1
von Davier, Matthias	1