ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	4
Since 2016 (last 10 years)	6
Since 2006 (last 20 years)	18

Descriptor

Test Items	20
Equated Scores	12
Comparative Analysis	10
Test Format	9
Multiple Choice Tests	6
Licensing Examinations…	5
Responses	5
Statistical Analysis	5
Accuracy	4
Sample Size	4
Sampling	4
Scores	4
Scoring	4
Ability	3
Cutting Scores	3
Difficulty Level	3
Item Analysis	3
Raw Scores	3
Test Bias	3
Testing	3
Correlation	2
Distance Education	2
Error of Measurement	2
Gender Differences	2
Item Banks	2
More ▼

Source

ETS Research Report Series	12
Journal of Educational…	3
Applied Measurement in…	1
Educational Measurement:…	1
Educational Testing Service	1
International Journal of…	1

Author

Kim, Sooyeon	20
Walker, Michael E.	6
Livingston, Samuel A.	3
McHale, Frederick	3
Lu, Ru	2
Moses, Tim	2
Puhan, Gautam	2
Walker, Michael	2
Boughton, Keith A.	1
Cohen, Allan S.	1
DiStefano, Christine A.	1
Haberman, Shelby	1
Kim, Seock-Ho	1
Robin, Frederic	1
von Davier, Alina A.	1
More ▼

Publication Type

Journal Articles	18
Reports - Research	16
Reports - Evaluative	3
Numerical/Quantitative Data	1
Reports - Descriptive	1
Speeches/Meeting Papers	1

Education Level

Elementary Secondary Education

Audience

Location

Laws, Policies, & Programs

Assessments and Surveys

What Works Clearinghouse Rating

Showing 1 to 15 of 20 results Save | Export

Adjusting for Ability Differences of Equating Samples When Randomization Is Suboptimal

Peer reviewed

Direct link

Kim, Sooyeon; Walker, Michael E. – Educational Measurement: Issues and Practice, 2022

Test equating requires collecting data to link the scores from different forms of a test. Problems arise when equating samples are not equivalent and the test forms to be linked share no common items by which to measure or adjust for the group nonequivalence. Using data from five operational test forms, we created five pairs of research forms for…

Descriptors: Ability, Tests, Equated Scores, Testing Problems

Effect of Statistically Matching Equating Samples for Common-Item Equating. Research Report. ETS RR-21-02

Peer reviewed
PDF on ERIC

Download full text

Lu, Ru; Kim, Sooyeon – ETS Research Report Series, 2021

This study evaluated the impact of subgroup weighting for equating through a common-item anchor. We used data from a single test form to create two research forms for which the equating relationship was known. The results showed that equating was most accurate when the new form and reference form samples were weighted to be similar to the target…

Descriptors: Equated Scores, Weighted Scores, Raw Scores, Test Items

Score Comparability Issues with At-Home Testing and How to Address Them

Peer reviewed

Direct link

Puhan, Gautam; Kim, Sooyeon – Journal of Educational Measurement, 2022

As a result of the COVID-19 pandemic, at-home testing has become a popular delivery mode in many testing programs. When programs offer at-home testing to expand their service, the score comparability between test takers testing remotely and those testing in a test center is critical. This article summarizes statistical procedures that could be…

Descriptors: Scores, Scoring, Comparative Analysis, Testing

Assessing Mode Effects of At-Home Testing without a Randomized Trial. Research Report. ETS RR-21-10

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Walker, Michael – ETS Research Report Series, 2021

In this investigation, we used real data to assess potential differential effects associated with taking a test in a test center (TC) versus testing at home using remote proctoring (RP). We used a pseudo-equivalent groups (PEG) approach to examine group equivalence at the item level and the total score level. If our assumption holds that the PEG…

Descriptors: Testing, Distance Education, Comparative Analysis, Test Items

The Pseudo-Equivalent Groups Approach as an Alternative to Common-Item Equating. Research Report. ETS RR-18-02

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Lu, Ru – ETS Research Report Series, 2018

The purpose of this study was to evaluate the effectiveness of linking test scores by using test takers' background data to form pseudo-equivalent groups (PEG) of test takers. Using 4 operational test forms that each included 100 items and were taken by more than 30,000 test takers, we created 2 half-length research forms that had either 20…

Descriptors: Test Items, Item Banks, Difficulty Level, Comparative Analysis

An Empirical Investigation of the Potential Impact of Item Misfit on Test Scores. Research Report. ETS RR-17-60

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Robin, Frederic – ETS Research Report Series, 2017

In this study, we examined the potential impact of item misfit on the reported scores of an admission test from the subpopulation invariance perspective. The target population of the test consisted of 3 major subgroups with different geographic regions. We used the logistic regression function to estimate item parameters of the operational items…

Descriptors: Scores, Test Items, Test Bias, International Assessment

Determining When Single Scoring for Constructed-Response Items Is as Effective as Double Scoring in Mixed-Format Licensure Tests

Peer reviewed

Direct link

Kim, Sooyeon; Moses, Tim – International Journal of Testing, 2013

The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…

Descriptors: Scoring, Test Format, Licensing Examinations (Professions), Test Items

Determining the Anchor Composition for a Mixed-Format Test: Evaluation of Subpopulation Invariance of Linking Functions

Peer reviewed

Direct link

Kim, Sooyeon; Walker, Michael – Applied Measurement in Education, 2012

This study examined the appropriateness of the anchor composition in a mixed-format test, which includes both multiple-choice (MC) and constructed-response (CR) items, using subpopulation invariance indices. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using two types of anchor sets: (a) MC only and (b)…

Descriptors: Multiple Choice Tests, Test Format, Test Items, Equated Scores

An Empirical Comparison of Methods for Equating with Randomly Equivalent Groups of 50 to 400 Test Takers. Research Report. ETS RR-10-05

Peer reviewed
PDF on ERIC

Download full text

Livingston, Samuel A.; Kim, Sooyeon – ETS Research Report Series, 2010

A series of resampling studies investigated the accuracy of equating by four different methods in a random groups equating design with samples of 400, 200, 100, and 50 test takers taking each form. Six pairs of forms were constructed. Each pair was constructed by assigning items from an existing test taken by 9,000 or more test takers. The…

Descriptors: Equated Scores, Accuracy, Sample Size, Sampling

Comparisons among Small Sample Equating Methods in a Common-Item Design

Peer reviewed

Direct link

Kim, Sooyeon; Livingston, Samuel A. – Journal of Educational Measurement, 2010

Score equating based on small samples of examinees is often inaccurate for the examinee populations. We conducted a series of resampling studies to investigate the accuracy of five methods of equating in a common-item design. The methods were chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating,…

Descriptors: Equated Scores, Test Items, Item Sampling, Item Response Theory

Does Linking Mixed-Format Tests Using a Multiple-Choice Anchor Produce Comparable Results for Male and Female Subgroups? Research Report. ETS RR-11-44

Download full text

Kim, Sooyeon; Walker, Michael E. – Educational Testing Service, 2011

This study examines the use of subpopulation invariance indices to evaluate the appropriateness of using a multiple-choice (MC) item anchor in mixed-format tests, which include both MC and constructed-response (CR) items. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using an MC-only anchor set for 4…

Descriptors: Test Format, Multiple Choice Tests, Test Items, Gender Differences

Evaluating Subpopulation Invariance of Linking Functions to Determine the Anchor Composition for a Mixed-Format Test. Research Report. ETS RR-09-36

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Walker, Michael E. – ETS Research Report Series, 2009

We examined the appropriateness of the anchor composition in a mixed-format test, which includes both multiple-choice (MC) and constructed-response (CR) items, using subpopulation invariance indices. We derived linking functions in the nonequivalent groups with anchor test (NEAT) design using two types of anchor sets: (a) MC only and (b) a mix of…

Descriptors: Test Format, Equated Scores, Test Items, Multiple Choice Tests

Methods of Linking with Small Samples in a Common-Item Design: An Empirical Comparison. Research Report. ETS RR-09-38

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2009

A series of resampling studies was conducted to compare the accuracy of equating in a common item design using four different methods: chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating, and the circle-arc method. Four operational test forms, each containing more than 100 items, were used for…

Descriptors: Sampling, Sample Size, Accuracy, Test Items

Equating of Mixed-Format Tests in Large-Scale Assessments. Research Report. ETS RR-08-26

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – ETS Research Report Series, 2008

This study examined variations of the nonequivalent-groups equating design for mixed-format tests--tests containing both multiple-choice (MC) and constructed-response (CR) items--to determine which design was most effective in producing equivalent scores across the two tests to be equated. Four linking designs were examined: (a) an anchor with…

Descriptors: Equated Scores, Test Format, Multiple Choice Tests, Responses

Comparisons among Designs for Equating Mixed-Format Tests in Large-Scale Assessments

Peer reviewed

Direct link

Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010

In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…

Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias

Previous Page | Next Page »

Pages: 1 | 2