Guangming Li; Zhengyan Liang – SAGE Open, 2024
To investigate the influence of the separation of grade distributions and the ratio of common items on the precision of vertical scaling, this simulation study uses a common-item design with first grade as the base grade. Four grades of 1,000 students each take a 100-item test. The Monte Carlo simulation method is used…
Descriptors: Elementary School Students, Grade 1, Grade 2, Grade 3
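The simulation setup described in this abstract (four grades, 1,000 students each, 100 items) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the Rasch response model, the 0.5-logit separation between adjacent grade means, the random seed, and the simplification that every grade answers the same item set are all hypothetical choices, since the abstract does not specify them.

```python
import math
import random

def simulate_grade(n_students, difficulties, mean_ability, rng):
    """Simulate 0/1 item responses for one grade under a Rasch model (assumed)."""
    responses = []
    for _ in range(n_students):
        theta = rng.gauss(mean_ability, 1.0)  # student ability
        row = [
            1 if rng.random() < 1.0 / (1.0 + math.exp(-(theta - b))) else 0
            for b in difficulties
        ]
        responses.append(row)
    return responses

rng = random.Random(2024)   # fixed seed, hypothetical
n_items = 100               # from the abstract
n_students = 1000           # per grade, from the abstract
separation = 0.5            # hypothetical gap between adjacent grade means

# Hypothetical item difficulties; grade 1 is the base grade with mean ability 0.
difficulties = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
data = {
    grade: simulate_grade(n_students, difficulties, separation * (grade - 1), rng)
    for grade in range(1, 5)
}

# Mean raw score per grade; higher grades should score higher on average.
mean_scores = {
    grade: sum(sum(row) for row in rows) / len(rows)
    for grade, rows in data.items()
}
```

A fuller replication would give each grade its own form, vary the proportion of items shared between adjacent grades, and repeat the draw many times to estimate scaling precision.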
Kim, Dong-In; Julian, Marc; Hermann, Pam – Online Submission, 2022
In test equating, one critical equating property is the group invariance property, which indicates that the equating function used to convert performance on each alternate form to the reporting scale should be the same for various subgroups. To mitigate the impact of disrupted learning on the item parameters during the COVID-19 pandemic, a…
Descriptors: COVID-19, Pandemics, Test Format, Equated Scores
Gübes, Nese; Uyar, Seyma – International Journal of Progressive Education, 2020
This study aims to compare the performance of different small sample equating methods in the presence and absence of differential item functioning (DIF) in common items. In this research, Tucker linear equating, Levine linear equating, unsmoothed and pre-smoothed (C=4) chained equipercentile equating, and simplified circle arc equating methods…
Descriptors: Test Bias, Equated Scores, Test Items, Methods
Akin Arikan, Cigdem – Eurasian Journal of Educational Research, 2019
Problem Statement: Equating can be defined as a statistical process that adjusts for differences between test forms with similar content and difficulty so that the scores obtained from these forms can be used interchangeably. In the literature, there are many equating methods, one of which is Kernel equating. Trends in International…
Descriptors: Equated Scores, Foreign Countries, Achievement Tests, International Assessment
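To make the definition of equating in this abstract concrete, here is a minimal sketch of linear equating, which maps a Form X score to the Form Y scale by matching means and standard deviations. It is not the Kernel equating method the study examines. The function name and the score lists are hypothetical, and the sketch assumes both forms were taken by equivalent groups.

```python
from statistics import mean, pstdev

def linear_equate(x, scores_x, scores_y):
    """Map score x on Form X to the Form Y scale by matching mean and SD."""
    mu_x, mu_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)

# Hypothetical raw scores from two equivalent groups.
form_x = [10, 12, 14, 16, 18]
form_y = [13, 16, 19, 22, 25]

equated = linear_equate(14, form_x, form_y)  # the Form X mean maps to the Form Y mean
```

Linear equating assumes the two score distributions differ only in location and spread; equipercentile and Kernel methods relax that assumption.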
Tomkowicz, Joanna; Kim, Dong-In; Wan, Ping – Online Submission, 2022
In this study we evaluated the stability of item parameters and student scores, using the pre-equated (pre-pandemic) parameters from Spring 2019 and post-equated (post-pandemic) parameters from Spring 2021 in two calibration and equating designs related to item parameter treatment: re-estimating all anchor parameters (Design 1) and holding the…
Descriptors: Equated Scores, Test Items, Evaluation Methods, Pandemics
Reardon, Sean F.; Kalogrides, Demetra; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2021
Linking score scales across different tests is considered speculative and fraught, even at the aggregate level. We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that aggregate linkages can be validated both…
Descriptors: Equated Scores, Validity, Methods, School Districts
Lim, Hwanggyu; Sireci, Stephen G. – Education Policy Analysis Archives, 2017
The Trends in International Mathematics and Science Study (TIMSS) makes it possible to compare the performance of students in the US in Mathematics and Science to the performance of students in other countries. TIMSS uses four international benchmarks for describing student achievement: Low, Intermediate, High, and Advanced. In this study, we…
Descriptors: Achievement Tests, Mathematics Achievement, Mathematics Tests, International Assessment
Ozdemir, Burhanettin – International Journal of Progressive Education, 2017
The purpose of this study is to equate Trends in International Mathematics and Science Study (TIMSS) mathematics subtest scores obtained from TIMSS 2011 to scores obtained from the TIMSS 2007 form, using different nonlinear observed-score equating methods under the Non-Equivalent Anchor Test (NEAT) design, where common items are used to link two or more test…
Descriptors: Achievement Tests, Elementary Secondary Education, Foreign Countries, International Assessment
Schoen, Robert C.; Yang, Xiaotong; Paek, Insu – Grantee Submission, 2018
This report provides evidence of the substantive and structural validity of the Knowledge for Teaching Elementary Fractions Test. Field-test data were gathered with a sample of 241 elementary educators, including teachers, administrators, and instructional support personnel, in spring 2017, as part of a larger study involving a multisite…
Descriptors: Psychometrics, Pedagogical Content Knowledge, Mathematics Tests, Mathematics Instruction
Da'as, Rima'a – Compare: A Journal of Comparative and International Education, 2017
Despite substantial interest and research in measuring leaders' skills, little is known about the measurement equivalence and mean differences in the scores measuring principals' skills (cognitive, interpersonal, strategic) across cultures (collectivism versus individualism). The aim of the present study was to assess measurement…
Descriptors: Principals, Leadership Qualities, Measurement Techniques, Cross Cultural Studies
Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017
This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…
Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing
Barr, Christopher D.; Reutebuch, Colleen K.; Carlson, Coleen D.; Vaughn, Sharon; Francis, David J. – Journal of Research on Educational Effectiveness, 2019
Beginning in 2002, researchers developed, implemented, and evaluated the efficacy of an English reading intervention for first-grade English learners using multiple randomized control trials (RCTs). As a result of this efficacy work, researchers successfully competed for an IES Goal 4 effectiveness study using the same intervention. Unlike the…
Descriptors: Intervention, English Language Learners, Grade 1, Elementary School Students
Strietholt, Rolf; Rosén, Monica – Measurement: Interdisciplinary Research and Perspectives, 2016
Since the start of the new millennium, international comparative large-scale studies have become one of the most well-known areas in the field of education. However, the International Association for the Evaluation of Educational Achievement (IEA) has already been conducting international comparative studies for about half a century. The present…
Descriptors: Reading Tests, Comparative Analysis, Comparative Education, Trend Analysis
Ye, Meng; Xin, Tao – Educational and Psychological Measurement, 2014
The authors explored the effects of drifting common items on vertical scaling within the higher order framework of item parameter drift (IPD). The results showed that if IPD occurred between a pair of test levels, the scaling performance started to deviate from the ideal state, as indicated by bias of scaling. When there were two items drifting…
Descriptors: Scaling, Test Items, Equated Scores, Achievement Gains
Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016
The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving the equity property in mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under the common-item nonequivalent groups design.…
Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores
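As a rough illustration of the IRT true-score equating (TSE) idea named in this abstract: find the ability at which the Form X test characteristic curve equals the given score, then evaluate the Form Y curve at that ability. The sketch below assumes dichotomous 2PL items whose parameters are already on a common scale. The item parameters and function names are hypothetical, and it does not cover the mixed-format items the study considers.

```python
import math

def tcc(theta, items):
    """Test characteristic curve: expected raw score at ability theta (2PL)."""
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for a, b in items)

def true_score_equate(x, items_x, items_y, lo=-10.0, hi=10.0, tol=1e-10):
    """Equate true score x on Form X to the Form Y scale.

    Solves tcc(theta, items_x) == x by bisection (the curve is increasing
    in theta), then returns tcc(theta, items_y).
    """
    if not 0 < x < len(items_x):
        raise ValueError("x must lie strictly between 0 and the number of items")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if tcc(mid, items_x) < x:
            lo = mid
        else:
            hi = mid
    return tcc((lo + hi) / 2.0, items_y)

# Hypothetical (discrimination a, difficulty b) pairs on a common scale.
items_x = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.5), (1.0, 1.0)]
items_y = [(a, b - 0.3) for a, b in items_x]  # Form Y is uniformly easier

equated = true_score_equate(2, items_x, items_y)
```

Because Form Y is easier here, a Form X score of 2 equates to a Form Y score above 2. Equating a form to itself returns the input score.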