ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	4
Since 2016 (last 10 years)	9
Since 2006 (last 20 years)	17

Descriptor

Equated Scores	98
Testing Problems	98
Test Interpretation	22
Test Items	22
Test Construction	21
Latent Trait Theory	20
Test Reliability	19
College Entrance Examinations	18
Testing Programs	17
Comparative Analysis	16
Higher Education	16
Scaling	15
Test Validity	15
Achievement Tests	14
Educational Testing	14
Elementary Secondary Education	14
Scoring	14
Statistical Analysis	14
Item Response Theory	12
Measurement Techniques	12
Test Theory	12
Educational Assessment	11
Item Analysis	11
Mathematical Models	11
Evaluation Methods	10

Publication Type

Reports - Research	47
Journal Articles	35
Speeches/Meeting Papers	31
Reports - Evaluative	21
Opinion Papers	15
Reports - Descriptive	5
Collected Works - Proceedings	4
Information Analyses	4
Guides - Non-Classroom	3
Numerical/Quantitative Data	3
Collected Works - General	2
Books	1
Collected Works - Serials	1
Dissertations/Theses -…	1
Legal/Legislative/Regulatory…	1
Tests/Questionnaires	1

Education Level

Elementary Secondary Education	6
Secondary Education	2
Higher Education	1
Postsecondary Education	1

Audience

Researchers

Location

United Kingdom (England)	3
United States	3
Australia	2
Netherlands	2
New York	2
United Kingdom	2
United Kingdom (Wales)	2
Alaska	1
California	1
Hawaii	1
Idaho	1
Oregon	1
Washington	1

Laws, Policies, & Programs

Elementary and Secondary…

Showing 1 to 15 of 98 results

Population Invariance in Composite-Score Equating with the Random Groups Design

Chang, Kuo-Feng – ProQuest LLC, 2022

This dissertation was designed to foster a deeper understanding of population invariance in the context of composite-score equating and provide practitioners with guidelines for addressing score equity concerns at the composite score level. The purpose of this dissertation was threefold. The first was to compare different composite equating…

Descriptors: Test Items, Equated Scores, Methods, Design

Adjusting for Ability Differences of Equating Samples When Randomization Is Suboptimal

Peer reviewed

Kim, Sooyeon; Walker, Michael E. – Educational Measurement: Issues and Practice, 2022

Test equating requires collecting data to link the scores from different forms of a test. Problems arise when equating samples are not equivalent and the test forms to be linked share no common items by which to measure or adjust for the group nonequivalence. Using data from five operational test forms, we created five pairs of research forms for…

Descriptors: Ability, Tests, Equated Scores, Testing Problems

Which Assessment Is Harder? Some Limits of Statistical Linking

Benton, Tom; Williamson, Joanna – Research Matters, 2022

Equating methods are designed to adjust between alternate versions of assessments targeting the same content at the same level, with the aim that scores from the different versions can be used interchangeably. The statistical processes used in equating have, however, been extended to statistically "link" assessments that differ, such as…

Descriptors: Statistical Analysis, Equated Scores, Definitions, Alternative Assessment

Effect of Item Parameter Drift in Mixed Format Common Items on Test Equating

Peer reviewed

Uysal, Ibrahim; Sahin-Kürsad, Merve; Kiliç, Abdullah Faruk – Participatory Educational Research, 2022

The aim of the study was to examine whether the common items in the mixed format (e.g., multiple-choice and essay items) contain parameter drifts in the test equating processes performed with the common item nonequivalent groups design. In this study, which was carried out using Monte Carlo simulation with a fully crossed design, the factors of test…

Descriptors: Test Items, Test Format, Item Response Theory, Equated Scores

Use of Translated and Adapted Versions of the WISC-V: Caveat Emptor

Peer reviewed

McGill, Ryan J.; Ward, Thomas J.; Canivez, Gary L. – School Psychology International, 2020

The Wechsler Intelligence Scale for Children (WISC) is the most widely used intelligence test in the world. Now in its fifth edition, the WISC-V has been translated and adapted for use in nearly a dozen countries. Despite its popularity, numerous concerns have been raised about some of the procedures used to develop and validate translated and…

Descriptors: Children, Intelligence Tests, Translation, Test Validity

Lord's Equity Theorem Revisited

Peer reviewed

van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2019

Lord's (1980) equity theorem claims observed-score equating to be possible only when two test forms are perfectly reliable or strictly parallel. An analysis of its proof reveals use of an incorrect statistical assumption. The assumption does not invalidate the theorem itself though, which can be shown to follow directly from the discrete nature of…

Descriptors: Equated Scores, Testing Problems, Item Response Theory, Evaluation Methods

Interpretation of the Translated WISC-V: Caveat Venditor and Caveat Emptor

Peer reviewed

Kettler, Ryan J. – School Psychology International, 2020

This article is a commentary on McGill et al.'s (2020) article "Use of Translated and Adapted Versions of the WISC-V: Caveat Emptor." McGill et al. use caveat emptor in their title to indicate that the buyer of an assessment must be careful about the product being purchased, presumably because the seller of the assessment is not being…

Descriptors: Children, Intelligence Tests, Translation, Test Reliability

Investigating Repeater Effects on Small Sample Equating: Include or Exclude?

Peer reviewed

Diao, Hongyu; Keller, Lisa – Applied Measurement in Education, 2020

Examinees who attempt the same test multiple times are often referred to as "repeaters." Previous studies suggested that repeaters should be excluded from the total sample before equating because repeater groups are distinguishable from non-repeater groups. In addition, repeaters might memorize anchor items, causing item drift under a…

Descriptors: Licensing Examinations (Professions), College Entrance Examinations, Repetition, Testing Problems

Language Effects in International Testing: The Case of PISA 2006 Science Items

Peer reviewed

El Masri, Yasmine H.; Baird, Jo-Anne; Graesser, Art – Assessment in Education: Principles, Policy & Practice, 2016

We investigate the extent to which language versions (English, French and Arabic) of the same science test are comparable in terms of item difficulty and demands. We argue that language is an inextricable part of the scientific literacy construct, be it intended or not by the examiner. This argument has considerable implications on methodologies…

Descriptors: International Assessment, Difficulty Level, Test Items, Language Variation

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

Limits on the Accuracy of Linking. Research Report. ETS RR-10-22

Haberman, Shelby J. – Educational Testing Service, 2010

Sampling errors limit the accuracy with which forms can be linked. Limitations on accuracy are especially important in testing programs in which a very large number of forms are employed. Standard inequalities in mathematical statistics may be used to establish lower bounds on the achievable linking accuracy. To illustrate results, a variety of…

Descriptors: Testing Programs, Equated Scores, Sampling, Accuracy

Educational Measurement Issues and Implications of High Stakes Decision Making in Final Examinations in Secondary Education in the Netherlands

Peer reviewed

van Rijn, P. W.; Beguin, A. A.; Verstralen, H. H. F. M. – Assessment in Education: Principles, Policy & Practice, 2012

While measurement precision is relatively easy to establish for single tests and assessments, it is much more difficult to determine for decision making with multiple tests on different subjects. This latter is the situation in the system of final examinations for secondary education in the Netherlands and is used as an example in this paper. This…

Descriptors: Secondary Education, Tests, Foreign Countries, Decision Making

Defending the Quality of Links between Scores from Different Tests and Exams

Peer reviewed

Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010

Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Conceptualizing Comparability

Peer reviewed

Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2010

This article presents the author's rejoinder to thinking about linking from issue 8(1). Particularly within the more embracing linking frameworks, e.g., Holland & Dorans (2006) and Holland (2007), there appears to be a major disjunction between (1) classification discourse: the supposed basis for classification, that is, the underlying theory…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Linking through Improved Design, Not Redefinition: Commentary on Newton

Peer reviewed

Walker, Michael E. – Measurement: Interdisciplinary Research and Perspectives, 2010

"Linking" is a term given to a general class of procedures by which one represents scores X on one test or measure in terms of scores Y on another test or measure. A recent taxonomy by Holland and Dorans (2006; Holland, 2007) organizes the various types of links into three broad categories: prediction, scale aligning, and equating. In…

Descriptors: Foreign Countries, Test Construction, Test Validity, Measurement Techniques

Source

Journal of Educational…	7
Educational Measurement:…	5
Measurement:…	5
Applied Measurement in…	3
Educational and Psychological…	3
Assessment in Education:…	2
School Psychology…	2
Applied Psychological…	1
Educational Testing Service	1
Evaluation and the Health…	1
Journal of Educational and…	1
New Directions for Testing…	1
Participatory Educational…	1
Popular Measurement	1
ProQuest LLC	1
Psychometrika	1
Research Matters	1
Review of Educational Research	1
Studies in Educational…	1
Today's Education	1

Author

Angoff, William H.	4
Andrulis, Richard S.	2
Baird, Jo-Anne	2
Cohen, Allan S.	2
Cook, Linda L.	2
Gilmer, Jerry S.	2
Hoover, H. D.	2
Jaeger, Richard M.	2
Kim, Seock-Ho	2
Kolen, Michael J.	2
Lissitz, Robert W.	2
Lord, Frederic M.	2
Modu, Christopher C.	2
Schrader, William B.	2
Skaggs, Gary	2
Walker, Michael E.	2
Al-Karni, Ali	1
Algina, James	1
Anderson, Patricia S.	1
Arter, Judith A.	1
Baker, Frank B.	1
Beaton, Albert E.	1
Beguin, A. A.	1
Bejar, Isaac I.	1

Assessments and Surveys

SAT (College Admission Test)	10
California Achievement Tests	4
Advanced Placement…	3
Comprehensive Tests of Basic…	3
Graduate Record Examinations	3
National Assessment of…	3
Wechsler Intelligence Scale…	3
College Board Achievement…	2
Graduate Management Admission…	2
Iowa Tests of Basic Skills	2
Sequential Tests of…	2
Armed Services Vocational…	1
General Educational…	1
Metropolitan Achievement Tests	1
National Longitudinal Study…	1
Program for International…	1
Test of English as a Foreign…	1
Test of Standard Written…	1
Wechsler Intelligence Scales…	1