Tindal, Gerald; Nese, Joseph F. T. – Behavioral Research and Teaching, 2024
We present two types of validity evidence to support inferences and decisions about the use of easyCBMs in relation to state testing programs. The first type involves using Benchmarks in reading to predict performance on the Smarter Balanced (SB) test. These predictions can be made both well in advance (several months) or…
Descriptors: Classification, Accuracy, Validity, Criteria
Bramley, Tom – Cambridge Assessment, 2018
The aim of the research reported here was to get some idea of the accuracy of grade boundaries (cut-scores) obtained by applying the 'similar items method' described in Bramley & Wilson (2016). In this method experts identify items on the current version of a test that are sufficiently similar to items on previous versions for them to be…
Descriptors: Accuracy, Cutting Scores, Test Items, Item Analysis
Chen, Haiwen H.; von Davier, Matthias; Yamamoto, Kentaro; Kong, Nan – ETS Research Report Series, 2015
One major issue with large-scale assessments is that the respondents might give no responses to many items, resulting in less accurate estimations of both assessed abilities and item parameters. This report studies how the types of items affect the item-level nonresponse rates and how different methods of treating item-level nonresponses have an…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
Topczewski, Anna; Cui, Zhongmin; Woodruff, David; Chen, Hanwei; Fang, Yu – ACT, Inc., 2013
This paper investigates four methods of linear equating under the common item nonequivalent groups design. Three of the methods are well known: Tucker, Angoff-Levine, and Congeneric-Levine. A fourth method is presented as a variant of the Congeneric-Levine method. Using simulation data generated from the three-parameter logistic IRT model we…
Descriptors: Comparative Analysis, Equated Scores, Methods, Simulation
Atchison, Drew; Baker, Bruce; Levin, Jesse; Manship, Karen – Office of Planning, Evaluation and Policy Development, US Department of Education, 2017
Concerns about the equitable distribution of school funding within and across school districts have led to new federal data collections on school-level expenditures. The American Recovery and Reinvestment Act of 2009 (ARRA) required states to collect and report, for the first time, school-level data on both personnel and non-personnel expenditures…
Descriptors: Educational Quality, Expenditures, School Districts, Data Collection
Bai, Haiyan – Journal of Experimental Education, 2013
Propensity score estimation plays a fundamental role in propensity score matching for reducing group selection bias in observational data. To increase the accuracy of propensity score estimation, the author developed a bootstrap propensity score. The commonly used propensity score matching methods: nearest neighbor matching, caliper matching, and…
Descriptors: Statistical Inference, Sampling, Probability, Computation
Somers, Marie-Andrée; Zhu, Pei; Jacob, Robin; Bloom, Howard – MDRC, 2013
In this paper, we examine the validity and precision of two nonexperimental study designs (NXDs) that can be used in educational evaluation: the comparative interrupted time series (CITS) design and the difference-in-difference (DD) design. In a CITS design, program impacts are evaluated by looking at whether the treatment group deviates from its…
Descriptors: Research Design, Educational Assessment, Time, Intervals
Zhu, Zheng; Chen, Peijie; Zhuang, Jie – Research Quarterly for Exercise and Sport, 2013
Purpose: The purpose of this study was to develop and cross-validate an equation based on ActiGraph accelerometer GT3X output to predict children and youth's energy expenditure (EE) of physical activity (PA). Method: Participants were 367 Chinese children and youth (179 boys and 188 girls, aged 9 to 17 years old) who wore 1 ActiGraph GT3X…
Descriptors: Foreign Countries, Physical Activities, Physical Activity Level, Children
DeCarlo, Lawrence T. – ETS Research Report Series, 2008
Rater behavior in essay grading can be viewed as a signal-detection task, in that raters attempt to discriminate between latent classes of essays, with the latent classes being defined by a scoring rubric. The present report examines basic aspects of an approach to constructed-response (CR) scoring via a latent-class signal-detection model. The…
Descriptors: Scoring, Responses, Test Format, Bias