Publication Date
| In 2026 | 0 |
| Since 2025 | 7 |
| Since 2022 (last 5 years) | 42 |
| Since 2017 (last 10 years) | 126 |
| Since 2007 (last 20 years) | 479 |
Descriptor
Source
Author
| Bianchini, John C. | 35 |
| von Davier, Alina A. | 34 |
| Dorans, Neil J. | 33 |
| Kolen, Michael J. | 31 |
| Loret, Peter G. | 31 |
| Kim, Sooyeon | 26 |
| Moses, Tim | 24 |
| Livingston, Samuel A. | 22 |
| Holland, Paul W. | 20 |
| Puhan, Gautam | 20 |
| Liu, Jinghua | 19 |
| More ▼ | |
Publication Type
Education Level
Location
| Canada | 9 |
| Australia | 8 |
| Florida | 8 |
| United Kingdom (England) | 8 |
| Netherlands | 7 |
| New York | 7 |
| United States | 7 |
| Israel | 6 |
| Turkey | 6 |
| United Kingdom | 6 |
| California | 5 |
| More ▼ | |
Laws, Policies, & Programs
| Elementary and Secondary… | 12 |
| No Child Left Behind Act 2001 | 5 |
| Education Consolidation… | 3 |
| Hawkins Stafford Act 1988 | 1 |
| Race to the Top | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
Meng, Yu – ProQuest LLC, 2012
The kernel method of test equating is a unified approach to test equating with some advantages over traditional equating methods. Therefore, it is important to evaluate in a comprehensive way the usefulness and appropriateness of the Kernel equating (KE) method, as well as its advantages and disadvantages compared with several popular item…
Descriptors: Equated Scores, Evaluation Methods, Item Response Theory, Comparative Analysis
Lee, Eunjung – ProQuest LLC, 2013
The purpose of this research was to compare the equating performance of various equating procedures for the multidimensional tests. To examine the various equating procedures, simulated data sets were used that were generated based on a multidimensional item response theory (MIRT) framework. Various equating procedures were examined, including…
Descriptors: Equated Scores, Tests, Comparative Analysis, Item Response Theory
Cummings, Kelli D.; Park, Yonghan; Schaper, Holle A. Bauer – Assessment for Effective Intervention, 2013
The purpose of this article is to describe passage effects on "Dynamic Indicators of Basic Early Literacy Skills--Next Edition Oral Reading Fluency" ("DIBELS Next ORF") progress-monitoring measures for Grades 1 through 6. Approximately 572 students per grade (total "N" with at least one data point = 3,092) read all…
Descriptors: Reading Fluency, Emergent Literacy, Equated Scores, Oral Reading
El Masri, Yasmine H.; Baird, Jo-Anne; Graesser, Art – Assessment in Education: Principles, Policy & Practice, 2016
We investigate the extent to which language versions (English, French and Arabic) of the same science test are comparable in terms of item difficulty and demands. We argue that language is an inextricable part of the scientific literacy construct, be it intended or not by the examiner. This argument has considerable implications on methodologies…
Descriptors: International Assessment, Difficulty Level, Test Items, Language Variation
Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016
The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…
Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores
Puhan, Gautam – Journal of Educational Measurement, 2012
Tucker and chained linear equatings were evaluated in two testing scenarios. In Scenario 1, referred to as rater comparability scoring and equating, the anchor-to-total correlation is often very high for the new form but moderate for the reference form. This may adversely affect the results of Tucker equating, especially if the new and reference…
Descriptors: Testing, Scoring, Equated Scores, Statistical Analysis
Wiberg, Marie; van der Linden, Wim J. – Journal of Educational Measurement, 2011
Two methods of local linear observed-score equating for use with anchor-test and single-group designs are introduced. In an empirical study, the two methods were compared with the current traditional linear methods for observed-score equating. As a criterion, the bias in the equated scores relative to true equating based on Lord's (1980)…
Descriptors: Equated Scores, Statistical Analysis, Comparative Analysis, Statistical Bias
Kim, Sooyeon; Livingston, Samuel A.; Lewis, Charles – Applied Measurement in Education, 2011
This article describes a preliminary investigation of an empirical Bayes (EB) procedure for using collateral information to improve equating of scores on test forms taken by small numbers of examinees. Resampling studies were done on two different forms of the same test. In each study, EB and non-EB versions of two equating methods--chained linear…
Descriptors: Sample Size, Equated Scores, Bayesian Statistics, Accuracy
Li, Yanmei – ETS Research Report Series, 2012
In a common-item (anchor) equating design, the common items should be evaluated for item parameter drift. Drifted items are often removed. For a test that contains mostly dichotomous items and only a small number of polytomous items, removing some drifted polytomous anchor items may result in anchor sets that no longer resemble mini-versions of…
Descriptors: Scores, Item Response Theory, Equated Scores, Simulation
GED Testing Service, 2014
This manual was written to provide technical information regarding the General Educational Development (GED®) test as evidence that the GED® test is technically sound. Throughout this manual, documentation is provided regarding the development of the GED® test and data collection activities, as well as evidence of reliability and validity. This…
Descriptors: High School Equivalency Programs, Equivalency Tests, Testing Programs, Test Validity
Moses, Tim; Liu, Jinghua – Educational Testing Service, 2011
In equating research and practice, equating functions that are smooth are typically assumed to be more accurate than equating functions with irregularities. This assumption presumes that population test score distributions are relatively smooth. In this study, two examples were used to reconsider common beliefs about smoothing and equating. The…
Descriptors: Equated Scores, Data Analysis, Scores, Methods
Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013
The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…
Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation
Saven, Jessica L.; Farley, Dan; Tindal, Gerald – National Center on Assessment and Accountability for Special Education, 2013
Longitudinally modeling the growth of students with significant cognitive disabilities (SWSCDs) on alternate assessments based on alternate achievement standards (AA-AAS) presents many challenges for states. The number of students in Grades 3-8 who remain in a cohort group varies over time, depending on the methods used to construct the…
Descriptors: Alternative Assessment, Academic Achievement, Intellectual Disability, Growth Models
Huggins, Anne Corinne; Elbaum, Batya – Educational Assessment, 2013
The purpose of this study is to utilize Score Equity Assessment (SEA) to examine measurement comparability and equity in reported scores on a statewide fifth-grade science assessment with respect to groups of students defined by disability status, English Language Learner status and use of test accommodations. Benefits of SEA include a focus on…
Descriptors: Testing Accommodations, Science Tests, Equated Scores, Grade 5
Jurich, Daniel P.; DeMars, Christine E.; Goodman, Joshua T. – Applied Psychological Measurement, 2012
The prevalence of high-stakes test scores as a basis for significant decisions necessitates the dissemination of accurate and fair scores. However, the magnitude of these decisions has created an environment in which examinees may be prone to resort to cheating. To reduce the risk of cheating, multiple test forms are commonly administered. When…
Descriptors: High Stakes Tests, Scores, Prevention, Cheating

Direct link
Peer reviewed
