Showing 1 to 15 of 29 results
Atalay Kabasakal, Kübra; Arsan, Nihan; Gök, Bilge; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2014
This simulation study compared the performance (Type I error and power) of the Mantel-Haenszel (MH), SIBTEST, and item response theory likelihood ratio (IRT-LR) methods under certain conditions. The manipulated factors were sample size, ability difference between groups, test length, the percentage of differential item functioning (DIF), and underlying…
Descriptors: Comparative Analysis, Item Response Theory, Statistical Analysis, Test Bias
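Of the three methods compared above, the Mantel-Haenszel procedure is the most widely implemented. As a point of reference, the sketch below computes the MH common odds ratio and chi-square statistic for a single studied item, matching examinees on total score; the function name and data layout are illustrative, not taken from the study.

```python
import numpy as np

def mantel_haenszel_dif(correct, group, match_score):
    """MH DIF check for one studied item.

    correct:     0/1 responses to the studied item
    group:       0 = reference group, 1 = focal group
    match_score: matching criterion (e.g., total test score)
    """
    num = den = 0.0                 # terms of the MH common odds ratio
    sum_a = sum_ea = sum_va = 0.0   # terms of the MH chi-square
    for k in np.unique(match_score):
        m = match_score == k
        a = np.sum((group[m] == 0) & (correct[m] == 1))  # reference, correct
        b = np.sum((group[m] == 0) & (correct[m] == 0))  # reference, incorrect
        c = np.sum((group[m] == 1) & (correct[m] == 1))  # focal, correct
        d = np.sum((group[m] == 1) & (correct[m] == 0))  # focal, incorrect
        t = a + b + c + d
        if t < 2:
            continue                 # score group too sparse to contribute
        num += a * d / t
        den += b * c / t
        n_ref, n_right = a + b, a + c
        sum_a += a
        sum_ea += n_ref * n_right / t                       # E[a] under no DIF
        sum_va += (n_ref * (c + d) * n_right * (t - n_right)
                   / (t**2 * (t - 1)))                      # Var[a] under no DIF
    alpha_mh = num / den             # > 1 favors the reference group
    chi2 = (abs(sum_a - sum_ea) - 0.5) ** 2 / sum_va        # continuity-corrected
    return alpha_mh, chi2
```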
Kang, Taehoon; Petersen, Nancy S. – Asia Pacific Education Review, 2012
This paper compares three methods of item calibration--concurrent calibration, separate calibration with linking, and fixed item parameter calibration--that are frequently used for linking item parameters to a base scale. Concurrent and separate calibrations were implemented using BILOG-MG. The Stocking and Lord in "Appl Psychol Measure"…
Descriptors: Methods, Comparative Analysis, Test Items, Item Response Theory
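The Stocking and Lord approach referenced above chooses a slope A and intercept B for the ability scale by minimizing the squared distance between the two forms' test characteristic curves over a grid of ability points. A minimal sketch under the 2PL model, with hypothetical parameter arrays, might look like this:

```python
import numpy as np
from scipy.optimize import minimize

def p2pl(theta, a, b):
    """2PL item response function; rows index theta values, columns items."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

def stocking_lord(a_new, b_new, a_base, b_base,
                  theta=np.linspace(-4, 4, 41)):
    """Find slope A and intercept B placing the new form's anchor items
    on the base scale by matching test characteristic curves."""
    def loss(coef):
        A, B = coef
        tcc_new = p2pl(theta, a_new / A, A * b_new + B).sum(axis=1)
        tcc_base = p2pl(theta, a_base, b_base).sum(axis=1)
        return np.sum((tcc_new - tcc_base) ** 2)
    res = minimize(loss, x0=[1.0, 0.0], method="Nelder-Mead")
    return res.x   # (A, B); transformed parameters: a* = a/A, b* = A*b + B
```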
Wang, Wei – ProQuest LLC, 2013
Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests are often considered superior to tests containing only MC items, although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…
Descriptors: Equated Scores, Test Format, Test Items, Test Length
Carvajal-Espinoza, Jorge E. – ProQuest LLC, 2011
The Non-Equivalent groups with Anchor Test (NEAT) equating design is widely used in large-scale testing; it involves two groups that need not be of equal ability. One group, P, takes form X together with a set of anchor items A, while the other group, Q, takes form Y with the same anchor items A. One of the most commonly used equating methods in…
Descriptors: Sample Size, Equated Scores, Psychometrics, Measurement
Wang, Wen-Chung; Huang, Sheng-Yun – Educational and Psychological Measurement, 2011
The one-parameter logistic model with ability-based guessing (1PL-AG) has recently been developed to account for the effect of ability on guessing behavior in multiple-choice items. In this study, the authors developed algorithms for computerized classification testing under the 1PL-AG and conducted a series of simulations to evaluate their…
Descriptors: Computer Assisted Testing, Classification, Item Analysis, Probability
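The abstract does not restate the model, but in one published parameterization of the 1PL-AG the guessing probability itself rises with ability. The function below implements a response function of that general form; the parameter names are an assumption, not the authors' exact specification.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_1pl_ag(theta, beta, gamma, lam):
    """1PL with ability-based guessing (one common parameterization):
    the examinee either solves the item (plain 1PL term) or, failing
    that, guesses correctly with a probability that increases in theta.

    theta: ability           beta:  item difficulty
    gamma: guessing intercept lam:  weight of ability in guessing
    """
    solve = logistic(theta - beta)          # ordinary 1PL success
    guess = logistic(lam * theta + gamma)   # ability-dependent guessing
    return solve + (1.0 - solve) * guess
```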
Zhang, Jinming; Lu, Ting – ETS Research Report Series, 2007
In practical applications of item response theory (IRT), item parameters are usually estimated first from a calibration sample. After treating these estimates as fixed and known, ability parameters are then estimated. However, the statistical inferences based on the estimated abilities can be misleading if the uncertainty of the item parameter…
Descriptors: Item Response Theory, Ability, Error of Measurement, Maximum Likelihood Statistics
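To make the two-stage practice concrete: after calibration, an examinee's ability is typically found by maximizing the likelihood with the item estimates plugged in as known constants, so their sampling error never enters the ability standard error. A minimal sketch under the 2PL, with hypothetical values:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def neg_log_lik(theta, responses, a, b):
    """2PL negative log-likelihood of one examinee's 0/1 responses,
    with item parameters (a, b) treated as fixed and known."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))

# Hypothetical calibrated item parameters and one response vector
a_hat = np.array([1.2, 0.8, 1.5, 1.0])
b_hat = np.array([-0.5, 0.3, 1.1, 0.0])
y = np.array([1, 1, 0, 1])

theta_mle = minimize_scalar(neg_log_lik, bounds=(-4, 4),
                            args=(y, a_hat, b_hat), method="bounded").x
```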
Clauser, Brian; And Others – 1992
Previous research examining the effects of reducing the number of score groups used in the matching criterion of the Mantel-Haenszel procedure, when screening for differential item functioning, has produced ambiguous results. The goal of this study was to resolve the ambiguity by examining the problem with a simulated data set. The main results…
Descriptors: Ability, Comparative Analysis, Computer Simulation, Item Bias
MacDonald, Paul; Paunonen, Sampo V. – Educational and Psychological Measurement, 2002
Examined the behavior of item and person statistics from item response theory and classical test theory frameworks through Monte Carlo methods with simulated test data. Findings suggest that item difficulty and person ability estimates are highly comparable for both approaches. (SLD)
Descriptors: Ability, Comparative Analysis, Difficulty Level, Item Response Theory
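The comparability the authors report is easy to see in a small simulation: classical item difficulty (the proportion correct) tracks the IRT difficulty parameter closely, if inversely. The sketch below, with arbitrary generating parameters, illustrates the idea:

```python
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 1000, 20
theta = rng.normal(size=(n_persons, 1))
a = rng.uniform(0.7, 1.6, size=n_items)        # true 2PL discriminations
b = rng.normal(size=n_items)                   # true 2PL difficulties
p = 1 / (1 + np.exp(-a * (theta - b)))
x = rng.binomial(1, p)                         # simulated 0/1 responses

p_values = x.mean(axis=0)                      # CTT item difficulty
total = x.sum(axis=1)
r_pbis = np.array([np.corrcoef(x[:, j], total)[0, 1]
                   for j in range(n_items)])   # CTT discrimination
print(np.corrcoef(p_values, b)[0, 1])          # strongly negative relation
```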
Whitmore, Marjorie L.; Schumacker, Randall E. – Educational and Psychological Measurement, 1999
Compared differential item functioning detection rates for logistic regression and analysis of variance for dichotomously scored items using simulated data and varying test length, sample size, discrimination rate, and underlying ability. Explains why the logistic regression method is recommended for most applications. (SLD)
Descriptors: Ability, Analysis of Variance, Comparative Analysis, Item Bias
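The recommended logistic regression method models item success from the matching score, group membership, and their interaction; a significant group term signals uniform DIF and a significant interaction signals nonuniform DIF. A minimal sketch with simulated data (the generating coefficients are arbitrary):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
score = rng.integers(0, 41, size=n)            # matching total score
group = rng.integers(0, 2, size=n)             # 0 = reference, 1 = focal
logit = -4 + 0.2 * score + 0.4 * group         # built-in uniform DIF
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([score, group, score * group]))
fit = sm.Logit(y, X).fit(disp=False)
print(fit.params)   # coefficients on group / score*group flag DIF
```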
Zhang, Jinming – ETS Research Report Series, 2005
Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…
Descriptors: Statistical Bias, Maximum Likelihood Statistics, Computation, Ability
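For context, Warm's weighted likelihood estimator removes much of the MLE's outward bias by maximizing the likelihood times the square root of the test information, which for the 1PL and 2PL is exactly the weighted likelihood correction. A sketch under the 2PL, again treating the item parameters as known and using hypothetical values:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def wle_objective(theta, y, a, b):
    """Negative of (log-likelihood + 0.5 * log information); minimizing
    this gives Warm's weighted likelihood estimate under the 2PL."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    info = np.sum(a**2 * p * (1 - p))           # 2PL test information
    return -(loglik + 0.5 * np.log(info))

a = np.array([1.0, 1.3, 0.7, 1.1])              # hypothetical known params
b = np.array([0.0, -0.8, 0.5, 1.2])
y = np.array([1, 1, 1, 0])
theta_wle = minimize_scalar(wle_objective, bounds=(-4, 4),
                            args=(y, a, b), method="bounded").x
```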
Kirisci, Levent; Hsu, Tse-Chi – 1995
The main goal of this study was to assess how sensitive unidimensional parameter estimates derived from BILOG were when the unidimensionality assumption was violated and the underlying ability distribution was not multivariate normal. A multidimensional three-parameter logistic distribution that was a straightforward generalization of the…
Descriptors: Ability, Comparative Analysis, Correlation, Difficulty Level
Frey, Sharon L. – 1996
The Mantel-Haenszel procedure (N. Mantel and W. Haenszel, 1959) and its extension to constructed-response items, the Generalized Mantel-Haenszel (A. Agresti, 1990), compare the performance of subgroups across different score groups to determine differential item functioning (DIF). At each level of comparison, or score group, the subgroups are…
Descriptors: Ability, Comparative Analysis, Constructed Response, Ethnic Groups
Pommerich, Mary; And Others – 1994
The functioning of two population-based Mantel-Haenszel (MH) common-odds ratios was compared. One ratio is conditioned on the observed test score, while the other is conditioned on a latent trait or true ability score. When the comparison group distributions are incongruent or nonoverlapping to some degree, the observed score represents different…
Descriptors: Ability, Comparative Analysis, Item Bias, Performance
Vale, C. David; Weiss, David J. – 1975
A conventional test and two forms of a stradaptive test were administered to thousands of simulated subjects by minicomputer. Characteristics of the three tests using several scoring techniques were investigated while varying the discriminating power of the items, the lengths of the tests, and the availability of prior information about the…
Descriptors: Ability, Branching, Comparative Analysis, Computer Oriented Programs
Thomasson, Gary L. – 1997
Score comparability is important to those who take tests and those who use them. One important concept related to test score comparability is that of "equity," which is defined as existing when examinees are indifferent as to which of two alternate forms of a test they would prefer to take. By their nature, computerized adaptive tests…
Descriptors: Ability, Adaptive Testing, Comparative Analysis, Computer Assisted Testing