ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	3
Since 2006 (last 20 years)	30

Descriptor

Equated Scores	30
Testing Programs	30
Item Response Theory	14
Error of Measurement	10
Evaluation Methods	8
Mathematics Tests	8
Test Items	8
Scaling	7
Statistical Analysis	7
Test Reliability	7
Academic Achievement	6
Comparative Analysis	6
Scoring	6
Test Construction	6
Test Validity	6
Achievement Tests	5
Sampling	5
Standardized Tests	5
College Entrance Examinations	4
Correlation	4
Data Collection	4
Language Tests	4
Probability	4
Reading Tests	4
Sample Size	4

Source

ETS Research Report Series	5
Applied Psychological…	4
Applied Measurement in…	3
New York State Education…	3
Educational Measurement:…	2
GED Testing Service	2
Northwest Evaluation…	2
ACT, Inc.	1
British Columbia Ministry of…	1
Educational Testing Service	1
Educational and Psychological…	1
Journal of Educational and…	1
Language Testing	1
Pearson	1
ProQuest LLC	1
Psychometrika	1

Publication Type

Journal Articles	18
Reports - Research	11
Numerical/Quantitative Data	8
Reports - Evaluative	8
Reports - Descriptive	5
Guides - Non-Classroom	2
Information Analyses	2
Speeches/Meeting Papers	2
Dissertations/Theses -…	1
Guides - General	1
Tests/Questionnaires	1

Location

New York	3
Canada	2
Minnesota	1
North Carolina	1
United States	1

Assessments and Surveys

General Educational…	2
SAT (College Admission Test)	1

Showing 1 to 15 of 30 results

Impact of Accumulated Error on Item Response Theory Pre-Equating with Mixed Format Tests

Peer reviewed

Keller, Lisa A.; Keller, Robert; Cook, Robert J.; Colvin, Kimberly F. – Applied Measurement in Education, 2016

The equating of tests is an essential process in high-stakes, large-scale testing conducted over multiple forms or administrations. By adjusting for differences in difficulty and placing scores from different administrations of a test on a common scale, equating allows scores from these different forms and administrations to be directly compared…

Descriptors: Item Response Theory, Equated Scores, Test Format, Testing Programs

Equating without an Anchor for Nonequivalent Groups of Examinees

Peer reviewed

Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2015

An equating procedure for a testing program with evolving distribution of examinee profiles is developed. No anchor is available because the original scoring scheme was based on expert judgment of the item difficulties. Pairs of examinees from two administrations are formed by matching on coarsened propensity scores derived from a set of…

Descriptors: Equated Scores, Testing Programs, College Entrance Examinations, Scoring

Equating in Small-Scale Language Testing Programs

Peer reviewed

LaFlair, Geoffrey T.; Isbell, Daniel; May, L. D. Nicolas; Gutierrez Arvizu, Maria Nelly; Jamieson, Joan – Language Testing, 2017

Language programs need multiple test forms for secure administrations and effective placement decisions, but can they have confidence that scores on alternate test forms have the same meaning? In large-scale testing programs, various equating methods are available to ensure the comparability of forms. The choice of equating method is informed by…

Descriptors: Language Tests, Equated Scores, Testing Programs, Comparative Analysis

Demographically Adjusted Groups for Equating Test Scores. Research Report. ETS RR-14-30

Peer reviewed

Livingston, Samuel A. – ETS Research Report Series, 2014

In this study, I investigated 2 procedures intended to create test-taker groups of equal ability by poststratifying on a composite variable created from demographic information. In one procedure, the stratifying variable was the composite variable that best predicted the test score. In the other procedure, the stratifying variable was the…

Descriptors: Demography, Equated Scores, Cluster Grouping, Ability Grouping

Test Technical Manual 2014 GED® Test

GED Testing Service, 2014

This manual was written to provide technical information regarding the General Educational Development (GED®) test as evidence that the GED® test is technically sound. Throughout this manual, documentation is provided regarding the development of the GED® test and data collection activities, as well as evidence of reliability and validity. This…

Descriptors: High School Equivalency Programs, Equivalency Tests, Testing Programs, Test Validity

Exploring Alternative Test Form Linking Designs with Modified Equating Sample Size and Anchor Test Length. Research Report. ETS RR-13-02

Peer reviewed

Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013

The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…

Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation

Multiple Linking in Equating and Random Scale Drift. Research Report. ETS RR-11-46

Peer reviewed

Guo, Hongwen; Liu, Jinghua; Dorans, Neil; Feigenbaum, Miriam – ETS Research Report Series, 2011

Maintaining score stability is crucial for an ongoing testing program that administers several tests per year over many years. One way to stall the drift of the score scale is to use an equating design with multiple links. In this study, we use the operational and experimental SAT® data collected from 44 administrations to investigate the effect…

Descriptors: Equated Scores, College Entrance Examinations, Reliability, Testing Programs

A Graphical Approach to Evaluating Equating Using Test Characteristic Curves

Peer reviewed

Wyse, Adam E.; Reckase, Mark D. – Applied Psychological Measurement, 2011

An essential concern in the application of any equating procedure is determining whether tests can be considered equated after the tests have been placed onto a common scale. This article clarifies one equating criterion, the first-order equity property of equating, and develops a new method for evaluating equating that is linked to this…

Descriptors: Lawyers, Licensing Examinations (Professions), Testing Programs, Graphs

The Long-Term Sustainability of Different Item Response Theory Scaling Methods

Peer reviewed

Keller, Lisa A.; Keller, Robert R. – Educational and Psychological Measurement, 2011

This article investigates the accuracy of examinee classification into performance categories and the estimation of the theta parameter for several item response theory (IRT) scaling techniques when applied to six administrations of a test. Previous research has investigated only two administrations; however, many testing programs equate tests…

Descriptors: Item Response Theory, Scaling, Sustainability, Classification

Practical Application of a Synthetic Linking Function on Small-Sample Equating

Peer reviewed

Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Applied Measurement in Education, 2011

The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different…

Descriptors: Testing Programs, Equated Scores, Investigations, Data Analysis

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

New York State Testing Program 2016: English Language Arts and Mathematics Grades 3-8. Technical Report

New York State Education Department, 2016

This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2016 Operational Tests. This report includes information about test content and test development, item (i.e.,…

Descriptors: Testing Programs, English, Language Arts, Mathematics Tests

Minnesota Linking Study: A Study of the Alignment of the NWEA RIT Scale with the Minnesota Comprehensive Assessments (MCA) Testing Program

Northwest Evaluation Association, 2014

Recently, Northwest Evaluation Association (NWEA) completed a study to connect the scale of the Minnesota Comprehensive Assessments (MCA) Testing Program used for Minnesota's mathematics and reading assessments with NWEA's RIT (Rasch Unit) scale. Information from the state assessments was used in a study to establish performance-level scores on…

Descriptors: Alignment (Education), Testing Programs, State Programs, Mathematics Tests

Accumulative Equating Error after a Chain of Linear Equatings

Peer reviewed

Guo, Hongwen – Psychometrika, 2010

After many equatings have been conducted in a testing program, equating errors can accumulate to a degree that is not negligible compared to the standard error of measurement. In this paper, the author investigates the asymptotic accumulative standard error of equating (ASEE) for linear equating methods, including chained linear, Tucker, and…

Descriptors: Testing Programs, Testing, Error of Measurement, Equated Scores

North Carolina Linking Study: A Study of the Alignment of the NWEA RIT Scale with the North Carolina State End of Grade (EOG) Testing Program

Northwest Evaluation Association, 2014

Recently, the Northwest Evaluation Association (NWEA) completed a study to connect the scale of the North Carolina State End of Grade (EOG) Testing Program used for North Carolina's mathematics and reading assessments with NWEA's Rasch Unit (RIT) scale. Information from the state assessments was used in a study to establish…

Descriptors: Alignment (Education), Testing Programs, Equated Scores, Standard Setting

Education Level

Higher Education	8
Secondary Education	5
Elementary Education	4
Adult Education	3
Early Childhood Education	3
Grade 3	3
Grade 4	3
Grade 5	3
Grade 6	3
Grade 7	3
Grade 8	3
Intermediate Grades	3
Junior High Schools	3
Middle Schools	3
Postsecondary Education	3
Primary Education	3
Elementary Secondary Education	2
High School Equivalency…	2
Adult Basic Education	1
High Schools	1

Author

von Davier, Alina A.	3
Guo, Hongwen	2
Haberman, Shelby	2
Keller, Lisa A.	2
Kim, Sooyeon	2
Brennan, Robert L.	1
Chen, Hanwei	1
Colvin, Kimberly F.	1
Cook, Robert J.	1
Cui, Zhongmin	1
Dorans, Neil	1
Dorans, Neil J.	1
Ezzelle, Carol	1
Feigenbaum, Miriam	1
Gao, Rui	1
Gao, Xiaohong	1
Goodman, Joshua	1
Gutierrez Arvizu, Maria Nelly	1
Haberman, Shelby J.	1
Isbell, Daniel	1
Jamieson, Joan	1
Keller, Robert	1
Keller, Robert R.	1
Kelley, Ronald Scott	1
LaFlair, Geoffrey T.	1