ERIC - Search Results

Showing all 11 results

Variance Estimation with Complex Data and Finite Population Correction--A Paradigm for Comparing Jackknife and Formula-Based Methods for Variance Estimation. Research Report. ETS RR-20-11

Peer reviewed

Qian, Jiahe – ETS Research Report Series, 2020

The finite population correction (FPC) factor is often used to adjust variance estimators for survey data sampled from a finite population without replacement. As a replicated resampling approach, the jackknife approach is usually implemented without the FPC factor incorporated in its variance estimates. A paradigm is proposed to compare the…

Descriptors: Computation, Sampling, Data, Statistical Analysis

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Different Methods of Adjusting for Form Difficulty under the Rasch Model: Impact on Consistency of Assessment Results. Research Report. ETS RR-19-08

Peer reviewed

Manna, Venessa F.; Gu, Lixiong – ETS Research Report Series, 2019

When using the Rasch model, equating with a nonequivalent groups anchor test design is commonly achieved by adjustment of new form item difficulty using an additive equating constant. Using simulated 5-year data, this report compares 4 approaches to calculating the equating constants and the subsequent impact on equating results. The 4 approaches…

Descriptors: Item Response Theory, Test Items, Test Construction, Sample Size

Exploring Alternative Test Form Linking Designs with Modified Equating Sample Size and Anchor Test Length. Research Report. ETS RR-13-02

Peer reviewed

Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013

The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…

Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation

A Review of ETS Differential Item Functioning Assessment Procedures: Flagging Rules, Minimum Sample Size Requirements, and Criterion Refinement. Research Report. ETS RR-12-08

Peer reviewed

Zwick, Rebecca – ETS Research Report Series, 2012

Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…

Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods

Methods of Linking with Small Samples in a Common-Item Design: An Empirical Comparison. Research Report. ETS RR-09-38

Peer reviewed

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2009

A series of resampling studies was conducted to compare the accuracy of equating in a common item design using four different methods: chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating, and the circle-arc method. Four operational test forms, each containing more than 100 items, were used for…

Descriptors: Sampling, Sample Size, Accuracy, Test Items

An Alternative Data Collection Design for Equating with Very Small Samples. Research Report. ETS RR-08-11

Peer reviewed

Puhan, Gautam; Moses, Tim; Grant, Mary; McHale, Fred – ETS Research Report Series, 2008

A single-group (SG) equating with nearly equivalent test forms (SiGNET) design was developed by Grant (2006) to equate small-volume tests. The basis of this design is that examinees take two largely overlapping test forms within a single administration. The scored items for the operational form are divided into mini-tests called testlets.…

Descriptors: Data Collection, Equated Scores, Item Sampling, Sample Size

Improved Reliability Estimates for Small Samples Using Empirical Bayes Techniques. Research Report. ETS RR-09-46

Peer reviewed

Oh, Hyeonjoo J.; Guo, Hongwen; Walker, Michael E. – ETS Research Report Series, 2009

Issues of equity and fairness across subgroups of the population (e.g., gender or ethnicity) must be seriously considered in any standardized testing program. For this reason, many testing programs require some means for assessing test characteristics, such as reliability, for subgroups of the population. However, often only small sample sizes are…

Descriptors: Standardized Tests, Test Reliability, Sample Size, Bayesian Statistics

Investigating the Effectiveness of Collateral Information on Small-Sample Equating. Research Report. ETS RR-08-52

Peer reviewed

Kim, Sooyeon; Livingston, Samuel A.; Lewis, Charles – ETS Research Report Series, 2008

This paper describes an empirical evaluation of a Bayesian procedure for equating scores on test forms taken by small numbers of examinees, using collateral information from the equating of other test forms. In this procedure, a separate Bayesian estimate is derived for the equated score at each raw-score level, making it unnecessary to specify a…

Descriptors: Equated Scores, Statistical Analysis, Sample Size, Bayesian Statistics

Kernel and Traditional Equipercentile Equating with Degrees of Presmoothing. Research Report. ETS RR-07-15

Peer reviewed

Moses, Tim; Holland, Paul – ETS Research Report Series, 2007

The purpose of this study was to empirically evaluate the impact of loglinear presmoothing accuracy on equating bias and variability across chained and post-stratification equating methods, kernel and percentile-rank continuization methods, and sample sizes. The results of evaluating presmoothing on equating accuracy generally agreed with those of…

Descriptors: Equated Scores, Statistical Analysis, Accuracy, Sample Size

Using the Kernel Method of Test Equating for Estimating the Standard Errors of Population Invariance Measures. Research Report. ETS RR-06-20

Peer reviewed

Moses, Tim – ETS Research Report Series, 2006

Population invariance is an important requirement of test equating. An equating function is said to be population invariant when the choice of (sub)population used to compute the equating function does not matter. In recent studies, the extent to which equating functions are population invariant is typically addressed in terms of practical…

Descriptors: Equated Scores, Computation, Error of Measurement, Statistical Analysis
