ERIC - Search Results

Showing 1 to 15 of 20 results

Section Preequating under the Equivalent Groups Design without IRT

Peer reviewed

Guo, Hongwen; Puhan, Gautam – Journal of Educational Measurement, 2014

In this article, we introduce a section preequating (SPE) method (linear and nonlinear) under the randomly equivalent groups design. In this equating design, sections of Test X (a future new form) and another existing Test Y (an old form already on scale) are administered. The sections of Test X are equated to Test Y, after adjusting for the…

Descriptors: Equated Scores, Correlation, Simulation, Testing

Preequating with Empirical Item Characteristic Curves: An Observed-Score Preequating Method

Peer reviewed

Zu, Jiyun; Puhan, Gautam – Journal of Educational Measurement, 2014

Preequating is in demand because it reduces score reporting time. In this article, we evaluated an observed-score preequating method: the empirical item characteristic curve (EICC) method, which makes preequating without item response theory (IRT) possible. EICC preequating results were compared with a criterion equating and with IRT true-score…

Descriptors: Item Response Theory, Equated Scores, Item Analysis, Item Sampling

Rater Comparability Scoring and Equating: Does Choice of Target Population Weights Matter in This Context?

Peer reviewed

Puhan, Gautam – Journal of Educational Measurement, 2013

When a constructed-response test form is reused, raw scores from the two administrations of the form may not be comparable. The solution to this problem requires a rescoring, at the current administration, of examinee responses from the previous administration. The scores from this "rescoring" can be used as an anchor for equating. In…

Descriptors: Scoring, Equated Scores, Testing, Correlation

Choice of Target Population Weights in Rater Comparability Scoring and Equating. Research Report. ETS RR-13-03

Peer reviewed

Puhan, Gautam – ETS Research Report Series, 2013

The purpose of this study was to demonstrate that the choice of sample weights when defining the target population under poststratification equating can be a critical factor in determining the accuracy of the equating results under a unique equating scenario, known as "rater comparability scoring and equating." The nature of data…

Descriptors: Scoring, Equated Scores, Sampling, Accuracy

A Criterion to Evaluate the Individual Raw-to-Scale Equating Conversions. Research Report. ETS RR-13-05

Peer reviewed

Guo, Hongwen; Puhan, Gautam; Walker, Michael – ETS Research Report Series, 2013

In this study we investigated when an equating conversion line is problematic in terms of gaps and clumps. We suggest using the conditional standard error of measurement (CSEM) to measure the scale scores that are inappropriate in the overall raw-to-scale transformation.

Descriptors: Equated Scores, Test Items, Evaluation Criteria, Error of Measurement

Equating Subscores under the Nonequivalent Anchor Test (NEAT) Design

Peer reviewed

Puhan, Gautam; Liang, Longjuan – Educational Measurement: Issues and Practice, 2011

The study examined two approaches for equating subscores. They are (1) equating subscores using internal common items as the anchor to conduct the equating, and (2) equating subscores using equated and scaled total scores as the anchor to conduct the equating. Since equated total scores are comparable across the new and old forms, they can be used…

Descriptors: Equated Scores, Test Items, Methods

Equating Subscores Using Total Scaled Scores as an Anchor. Research Report. ETS RR-11-07

Puhan, Gautam; Liang, Longjuan – Educational Testing Service, 2011

Because the demand for subscores is ever increasing, this study examined two different approaches for equating subscores: (a) equating a subscore on the new form to the same subscore in the old form using internal common items as the anchor to conduct the equating, and (b) equating a subscore on the new form to the same subscore in the old form…

Descriptors: Equated Scores, Scaling, Raw Scores, Methods

Choosing among Tucker or Chained Linear Equating in Two Testing Situations: Rater Comparability Scoring and Randomly Equivalent Groups with an Anchor

Peer reviewed

Puhan, Gautam – Journal of Educational Measurement, 2012

Tucker and chained linear equatings were evaluated in two testing scenarios. In Scenario 1, referred to as rater comparability scoring and equating, the anchor-to-total correlation is often very high for the new form but moderate for the reference form. This may adversely affect the results of Tucker equating, especially if the new and reference…

Descriptors: Testing, Scoring, Equated Scores, Statistical Analysis

Can Smoothing Help When Equating with Unrepresentative Small Samples? Research Report. ETS RR-11-09

Puhan, Gautam – Educational Testing Service, 2011

The study evaluated the effectiveness of log-linear presmoothing (Holland & Thayer, 1987) on the accuracy of small sample chained equipercentile equatings under two conditions (i.e., using small samples that differed randomly in ability from the target population versus using small samples that were distinctly different from the…

Descriptors: Equated Scores, Data Analysis, Accuracy, Sample Size

Futility of Log-Linear Smoothing When Equating with Unrepresentative Small Samples

Peer reviewed

Puhan, Gautam – Journal of Educational Measurement, 2011

The impact of log-linear presmoothing on the accuracy of small sample chained equipercentile equating was evaluated under two conditions. In the first condition the small samples differed randomly in ability from the target population. In the second condition the small samples were systematically different from the target population. Results…

Descriptors: Equated Scores, Data Analysis, Sample Size, Accuracy

A Brief Report on How Impossible Scores Affect Smoothing and Equating

Peer reviewed

Puhan, Gautam; von Davier, Alina A.; Gupta, Shaloo – Educational and Psychological Measurement, 2010

Equating under the external anchor design is frequently conducted using scaled scores on the anchor test. However, scaled scores often lead to the unique problem of creating zero frequencies in the score distribution because there may not always be a one-to-one correspondence between raw and scaled scores. For example, raw scores of 17 and 18 may…

Descriptors: Statistical Distributions, Raw Scores, Equated Scores, Scaling

A Comparison of Chained Linear and Poststratification Linear Equating under Different Testing Conditions

Peer reviewed

Puhan, Gautam – Journal of Educational Measurement, 2010

In this study I compared results of chained linear, Tucker, and Levine-observed score equatings under conditions where the new and old forms samples were similar in ability and also when they were different in ability. The length of the anchor test was also varied to examine its effect on the three different equating methods. The three equating…

Descriptors: Testing, Equated Scores, Comparative Analysis, Causal Models

Chained versus Post-Stratification Equating in a Linear Context: An Evaluation Using Empirical Data. Research Report. ETS RR-10-06

Puhan, Gautam – Educational Testing Service, 2010

This study used real data to construct testing conditions for comparing results of chained linear, Tucker, and Levine-observed score equatings. The comparisons were made under conditions where the new- and old-form samples were similar in ability and when they differed in ability. The length of the anchor test was also varied to enable examination…

Descriptors: Equated Scores, Comparative Analysis, Statistical Analysis, Statistical Bias

Single- versus Double-Scoring of Trend Responses in Trend Score Equating with Constructed-Response Tests. Research Report. ETS RR-10-12

Tan, Xuan; Ricker, Kathryn L.; Puhan, Gautam – Educational Testing Service, 2010

This study examines the differences in equating outcomes between two trend score equating designs resulting from two different scoring strategies for trend scoring when operational constructed-response (CR) items are double-scored: the single group (SG) design, where each trend CR item is double-scored, and the nonequivalent groups with anchor…

Descriptors: Equated Scores, Scoring, Responses, Test Items

Impact of Inclusion or Exclusion of Repeaters on Test Equating

Peer reviewed

Puhan, Gautam – International Journal of Testing, 2011

This study examined the effect of including or excluding repeaters on the equating process and results. New forms of two tests were equated to their respective old forms using either all examinees or only the first timer examinees in the new form sample. Results showed that for both tests used in this study, including or excluding repeaters in the…

Descriptors: Equated Scores, Educational Testing, Student Evaluation, Sample Size
