Publication Date
In 2025 | 4 |
Since 2024 | 9 |
Since 2021 (last 5 years) | 58 |
Since 2016 (last 10 years) | 147 |
Since 2006 (last 20 years) | 496 |
Author
Bianchini, John C. | 35 |
von Davier, Alina A. | 34 |
Dorans, Neil J. | 33 |
Kolen, Michael J. | 31 |
Loret, Peter G. | 31 |
Kim, Sooyeon | 26 |
Moses, Tim | 24 |
Livingston, Samuel A. | 22 |
Holland, Paul W. | 20 |
Puhan, Gautam | 20 |
Liu, Jinghua | 19 |
Location
Canada | 9 |
Australia | 8 |
Florida | 8 |
United Kingdom (England) | 8 |
Netherlands | 7 |
New York | 7 |
United States | 7 |
Israel | 6 |
Turkey | 6 |
United Kingdom | 6 |
California | 5 |
Laws, Policies, & Programs
Elementary and Secondary… | 12 |
No Child Left Behind Act 2001 | 5 |
Education Consolidation… | 3 |
Hawkins Stafford Act 1988 | 1 |
Race to the Top | 1 |
What Works Clearinghouse Rating
Meets WWC Standards without Reservations | 1 |
Meets WWC Standards with or without Reservations | 1 |
Dong, Yixiao; Clements, Douglas H.; Day-Hess, Crystal A.; Sarama, Julie; Dumas, Denis – Journal of Psychoeducational Assessment, 2021
Psychometric work with young children faces the particular challenge that children's attention spans are relatively short, and therefore, shorter assessments are required while retaining comprehensive coverage. This article reports on three empirical studies that encompass the development and validation of the research-based early mathematics…
Descriptors: Young Children, Numeracy, Test Construction, Test Validity
Uysal, Ibrahim; Sahin-Kürsad, Merve; Kiliç, Abdullah Faruk – Participatory Educational Research, 2022
The aim of the study was to examine whether the common items in mixed-format tests (e.g., multiple-choice and essay items) exhibit parameter drift in test equating performed with the common-item nonequivalent groups design. In this study, which was carried out using Monte Carlo simulation with a fully crossed design, the factors of test…
Descriptors: Test Items, Test Format, Item Response Theory, Equated Scores
Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2019
When equating two test forms, the equated scores will be biased if the test groups differ in ability. To adjust for the ability imbalance between nonequivalent groups, a set of common items is often used. When no common items are available, it has been suggested to use covariates correlated with the test scores instead. In this article, we reduce…
Descriptors: Equated Scores, Test Items, Probability, College Entrance Examinations
Ayanwale, Musa Adekunle – Discover Education, 2023
Examination scores obtained by students from the West African Examinations Council (WAEC), and National Business and Technical Examinations Board (NABTEB) may not be directly comparable due to differences in examination administration, item characteristics of the subject in question, and student abilities. For more accurate comparisons, scores…
Descriptors: Equated Scores, Mathematics Tests, Test Items, Test Format
McGill, Ryan J.; Ward, Thomas J.; Canivez, Gary L. – School Psychology International, 2020
The Wechsler Intelligence Scale for Children (WISC) is the most widely used intelligence test in the world. Now in its fifth edition, the WISC-V has been translated and adapted for use in nearly a dozen countries. Despite its popularity, numerous concerns have been raised about some of the procedures used to develop and validate translated and…
Descriptors: Children, Intelligence Tests, Translation, Test Validity
Dowling, N. Maritza; Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2020
Equating of psychometric scales and tests is frequently required and conducted in educational, behavioral, and clinical research. Construct comparability or equivalence between measuring instruments is a necessary condition for making decisions about linking and equating resulting scores. This article is concerned with a widely applicable method…
Descriptors: Evaluation Methods, Psychometrics, Screening Tests, Dementia
Furter, Robert T.; Dwyer, Andrew C. – Applied Measurement in Education, 2020
Maintaining equivalent performance standards across forms is a psychometric challenge exacerbated by small samples. In this study, the accuracy of two equating methods (Rasch anchored calibration and nominal weights mean) and four anchor item selection methods were investigated in the context of very small samples (N = 10). Overall, nominal…
Descriptors: Classification, Accuracy, Item Response Theory, Equated Scores
Goodman, Joshua T.; Dallas, Andrew D.; Fan, Fen – Applied Measurement in Education, 2020
Recent research has suggested that re-setting the standard for each administration of a small sample examination, in addition to the high cost, does not adequately maintain similar performance expectations year after year. Small-sample equating methods have shown promise with samples between 20 and 30. For groups that have fewer than 20 students,…
Descriptors: Equated Scores, Sample Size, Sampling, Weighted Scores
Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020
The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…
Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level
Kim, Dong-In; Julian, Marc; Hermann, Pam – Online Submission, 2022
In test equating, one critical equating property is the group invariance property which indicates that the equating function used to convert performance on each alternate form to the reporting scale should be the same for various subgroups. To mitigate the impact of disrupted learning on the item parameters during the COVID-19 pandemic, a…
Descriptors: COVID-19, Pandemics, Test Format, Equated Scores
van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2019
Lord's (1980) equity theorem claims observed-score equating to be possible only when two test forms are perfectly reliable or strictly parallel. An analysis of its proof reveals use of an incorrect statistical assumption. The assumption does not invalidate the theorem itself though, which can be shown to follow directly from the discrete nature of…
Descriptors: Equated Scores, Testing Problems, Item Response Theory, Evaluation Methods
Zhang, Zhonghua – Applied Measurement in Education, 2020
The characteristic curve methods have been applied to estimate the equating coefficients in test equating under the graded response model (GRM). However, the approaches for obtaining the standard errors for the estimates of these coefficients have not been developed and examined. In this study, the delta method was applied to derive the…
Descriptors: Error of Measurement, Computation, Equated Scores, True Scores
Lee, Yi-Hsuan; Haberman, Shelby J.; Dorans, Neil J. – Journal of Educational Measurement, 2019
In many educational tests, both multiple-choice (MC) and constructed-response (CR) sections are used to measure different constructs. In many common cases, security concerns lead to the use of form-specific CR items that cannot be used for equating test scores, along with MC sections that can be linked to previous test forms via common items. In…
Descriptors: Scores, Multiple Choice Tests, Test Items, Responses
Zheng, Xiaying; Yang, Ji Seung – Measurement: Interdisciplinary Research and Perspectives, 2021
The purpose of this paper is to briefly introduce the two most common applications of multiple-group item response theory (IRT) models, namely differential item functioning (DIF) detection and nonequivalent group score linking with simultaneous calibration. We illustrate how to conduct those analyses using the "Stata" item…
Descriptors: Item Response Theory, Test Bias, Computer Software, Statistical Analysis
Babcock, Ben; Hodge, Kari J. – Educational and Psychological Measurement, 2020
Equating and scaling in the context of small-sample exams, such as credentialing exams for highly specialized professions, have received increased attention in recent research. Investigators have proposed a variety of both classical and Rasch-based approaches to the problem. This study attempts to extend past research by (1) directly comparing…
Descriptors: Item Response Theory, Equated Scores, Scaling, Sample Size