Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 3
Since 2006 (last 20 years): 7
Descriptor
Correlation: 9
Difficulty Level: 9
Test Items: 8
Item Response Theory: 5
Test Bias: 4
College Entrance Examinations: 3
Comparative Analysis: 3
Evaluation Methods: 3
Item Analysis: 3
Latent Trait Theory: 2
Mathematics Tests: 2
Source
Educational and Psychological Measurement: 9
Author
Ahn, Soyeon: 1
DeMars, Christine E.: 1
Devine, Patrick J.: 1
Hamby, Tyler: 1
Kim, YoungKoung: 1
Kobrin, Jennifer L.: 1
Matlock, Ki Lynn: 1
Park, Sung Eun: 1
Raju, Nambury S.: 1
Sackett, Paul R.: 1
Skorupski, William P.: 1
Publication Type
Journal Articles: 9
Reports - Research: 8
Reports - Evaluative: 1
Education Level
Higher Education: 2
Postsecondary Education: 1
Assessments and Surveys
SAT (College Admission Test): 2
Graduate Record Examinations: 1
Rosenberg Self Esteem Scale: 1
SRA Achievement Series: 1
Park, Sung Eun; Ahn, Soyeon; Zopluoglu, Cengiz – Educational and Psychological Measurement, 2021
This study presents a new approach to synthesizing differential item functioning (DIF) effect size: First, using correlation matrices from each study, we perform a multigroup confirmatory factor analysis (MGCFA) that examines measurement invariance of a test item between two subgroups (i.e., focal and reference groups). Then we synthesize, across…
Descriptors: Item Analysis, Effect Size, Difficulty Level, Monte Carlo Methods
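The sketch below is not the MGCFA-based synthesis the abstract describes; it only illustrates, with simulated data, the basic focal-vs-reference comparison underlying a DIF effect size, expressed as a standardized difference in item difficulty. All data and values are assumed for illustration.

```python
# Hypothetical sketch: a crude DIF "effect size" for one item, computed as the
# standardized difference in proportion correct between a reference and a focal
# group. This is NOT the authors' MGCFA-based procedure.
import numpy as np

rng = np.random.default_rng(0)

# Simulated 0/1 responses to a single item for two subgroups (assumed data).
reference = rng.binomial(1, 0.70, size=500)   # reference group, true p = .70
focal = rng.binomial(1, 0.60, size=500)       # focal group, true p = .60

p_ref, p_foc = reference.mean(), focal.mean()

# Pooled standard deviation of the 0/1 responses.
pooled_sd = np.sqrt((reference.var(ddof=1) + focal.var(ddof=1)) / 2)

# Standardized difficulty difference, analogous to a d-type effect size.
dif_effect = (p_ref - p_foc) / pooled_sd
print(f"reference p = {p_ref:.3f}, focal p = {p_foc:.3f}, effect size = {dif_effect:.3f}")
```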
Matlock, Ki Lynn; Turner, Ronna – Educational and Psychological Measurement, 2016
When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…
Descriptors: Item Response Theory, Computation, Test Items, Difficulty Level
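As a rough illustration of the form-assembly idea (assumed item bank and b-parameters, not the study's simulation design), the sketch below builds two forms with equal item counts per subcontent area; each form mixes an easier half of one area with a harder half of the other, so the overall means tend to be close while the within-area means differ.

```python
# Illustrative sketch only: two forms with equal item counts per area, where
# within-area difficulty deliberately differs across forms.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical bank: 40 items in two subcontent areas with IRT b-parameters.
bank = [{"area": "algebra" if i < 20 else "geometry",
         "b": rng.normal(0, 1)} for i in range(40)]

algebra = sorted((it for it in bank if it["area"] == "algebra"), key=lambda it: it["b"])
geometry = sorted((it for it in bank if it["area"] == "geometry"), key=lambda it: it["b"])

# Form A: easier algebra items + harder geometry items; Form B: the complements.
form_a = algebra[:10] + geometry[10:]
form_b = algebra[10:] + geometry[:10]

def mean_b(form):
    return np.mean([it["b"] for it in form])

print("overall mean b:", round(mean_b(form_a), 2), round(mean_b(form_b), 2))
print("algebra mean b:", round(mean_b(form_a[:10]), 2), round(mean_b(form_b[:10]), 2))
```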
Hamby, Tyler; Taylor, Wyn – Educational and Psychological Measurement, 2016
This study examined the predictors and psychometric outcomes of survey satisficing, wherein respondents provide quick, "good enough" answers (satisficing) rather than carefully considered answers (optimizing). We administered surveys to university students and respondents--half of whom held college degrees--from a for-pay survey website,…
Descriptors: Surveys, Test Reliability, Test Validity, Comparative Analysis
Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Descriptors: Sample Size, Test Length, Correlation, Test Format
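The following is not DIMTEST or ATFIND; it is only a rough, assumption-laden look at why unidimensional models can mislead with two-dimensional data, using the eigenvalues of the inter-item correlation matrix from simulated binary responses.

```python
# Rough dimensionality check (not DIMTEST/ATFIND): simulate two-dimensional
# binary response data and inspect the eigenvalues of the item correlations.
import numpy as np

rng = np.random.default_rng(2)
n_persons, n_items = 1000, 20

# Two weakly correlated latent traits; items 0-9 load on trait 1, items 10-19 on trait 2.
traits = rng.multivariate_normal([0, 0], [[1, 0.3], [0.3, 1]], size=n_persons)
b = rng.normal(0, 1, n_items)
theta = np.column_stack([traits[:, 0]] * 10 + [traits[:, 1]] * 10)
prob = 1 / (1 + np.exp(-(theta - b)))
responses = rng.binomial(1, prob)

# Two dominant eigenvalues suggest two dimensions (a crude stand-in for a
# formal dimensionality test).
eigvals = np.linalg.eigvalsh(np.corrcoef(responses, rowvar=False))[::-1]
print("largest eigenvalues:", np.round(eigvals[:4], 2))
```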
Wolkowitz, Amanda A.; Skorupski, William P. – Educational and Psychological Measurement, 2013
When missing values are present in item response data, there are a number of ways one might impute a correct or incorrect response to a multiple-choice item. There are significantly fewer methods for imputing the actual response option an examinee may have provided if he or she had not omitted the item either purposely or accidentally. This…
Descriptors: Multiple Choice Tests, Statistical Analysis, Models, Accuracy
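Purely as a hypothetical illustration of imputing the specific response option (not the method evaluated in the article), the sketch below fills an omitted multiple-choice response by sampling from the option choices of examinees with similar total scores; all names and data are assumed.

```python
# Hypothetical sketch: impute the omitted option by sampling from the option
# frequencies of examinees with similar total scores who answered the item.
import numpy as np

rng = np.random.default_rng(3)

# Assumed data: total scores and observed option choices ('A'-'D', None = omitted).
total_scores = np.array([18, 17, 16, 16, 15, 15, 14, 12, 11, 9])
options = np.array(["A", "A", "A", "B", None, "B", "C", "C", "D", "D"], dtype=object)
answered = np.array([o is not None for o in options])

def impute_option(idx, window=2):
    """Sample an option from respondents whose total score is within `window`
    points of the omitting examinee's score."""
    close = np.abs(total_scores - total_scores[idx]) <= window
    pool = options[close & answered]
    return rng.choice(pool) if len(pool) else None

for i in np.where(~answered)[0]:
    print(f"examinee {i} (score {total_scores[i]}): imputed option {impute_option(i)}")
```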
Kobrin, Jennifer L.; Kim, YoungKoung; Sackett, Paul R. – Educational and Psychological Measurement, 2012
There is much debate on the merits and pitfalls of standardized tests for college admission, with questions regarding the format (multiple-choice vs. constructed response), cognitive complexity, and content of these assessments (achievement vs. aptitude) at the forefront of the discussion. This study addressed these questions by investigating the…
Descriptors: Grade Point Average, Standardized Tests, Predictive Validity, Predictor Variables
Weitzman, R. A. – Educational and Psychological Measurement, 2009
Building on the Kelley and Gulliksen versions of classical test theory, this article shows that a logistic model having only a single item parameter can account for varying item discrimination, as well as difficulty, by using item-test correlations to adjust incorrect-correct (0-1) item responses prior to an initial model fit. The fit occurs…
Descriptors: Item Response Theory, Test Items, Difficulty Level, Test Bias
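A minimal sketch of the quantities the abstract names, item-test correlations and a one-parameter logistic response function, is shown below with simulated data; the article's specific correlation-based adjustment of the 0-1 responses is not reproduced.

```python
# Minimal sketch: simulate 1PL (Rasch-type) responses and compute item-test
# (point-biserial) correlations for each item.
import numpy as np

rng = np.random.default_rng(4)
n_persons, n_items = 500, 10

theta = rng.normal(0, 1, n_persons)                 # simulated abilities
b = np.linspace(-1.5, 1.5, n_items)                 # item difficulties
prob = 1 / (1 + np.exp(-(theta[:, None] - b)))      # 1PL response probabilities
x = rng.binomial(1, prob)                           # 0-1 item responses

total = x.sum(axis=1)
# Item-test correlations: correlation of each 0-1 item score with the total score.
item_test_r = np.array([np.corrcoef(x[:, j], total)[0, 1] for j in range(n_items)])
print(np.round(item_test_r, 2))
```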

Devine, Patrick J.; Raju, Nambury S. – Educational and Psychological Measurement, 1982
Four methods of item bias detection--transformed item difficulty, item discrimination expressed as Clemans' lambda, chi-square, and the three-parameter item characteristic curve--were studied to determine the degree of correspondence among them in identifying biased and unbiased items in reading and mathematics subtests of the 1978 SRA Achievement…
Descriptors: Correlation, Difficulty Level, Item Analysis, Latent Trait Theory
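For one of the four methods named above, transformed item difficulty, a brief sketch follows: proportions correct in each group are mapped to the ETS delta scale (mean 13, SD 4), and items with unusually large between-group delta discrepancies are flagged. The proportions are assumed, and a full delta-plot analysis would fit a line to the paired deltas rather than use the crude screen shown here.

```python
# Transformed item difficulty (delta) sketch: map proportions correct to the
# delta scale and flag items with unusually large between-group differences.
import numpy as np
from scipy.stats import norm

# Assumed proportions correct for 6 items in two groups.
p_group1 = np.array([0.85, 0.70, 0.60, 0.55, 0.40, 0.30])
p_group2 = np.array([0.80, 0.68, 0.45, 0.52, 0.38, 0.29])

def to_delta(p):
    # Delta = 13 + 4 * z, where z is the normal deviate for the proportion incorrect.
    return 13 + 4 * norm.ppf(1 - p)

d1, d2 = to_delta(p_group1), to_delta(p_group2)

# Crude screen: flag items whose delta difference departs markedly from the rest.
diff = d2 - d1
flagged = np.where(np.abs(diff - diff.mean()) > 2 * diff.std(ddof=1))[0]
print("delta differences:", np.round(diff, 2), "flagged items:", flagged)
```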

Stricker, Lawrence J. – Educational and Psychological Measurement, 1984
The stability of a partial correlation index, comparisons of item characteristic curves, and comparisons of item difficulties was evaluated in assessing race and sex differences in the performance of verbal items on the Graduate Record Examination Aptitude Test. All three indexes exhibited consistency in identifying the same items in different…
Descriptors: College Entrance Examinations, Comparative Analysis, Correlation, Difficulty Level
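An illustrative sketch of a partial correlation index of the kind mentioned above (not necessarily the one used in the study): the correlation between an item score and group membership after partialling out the total score, computed here from regression residuals on simulated data.

```python
# Sketch: partial correlation between item score and group membership,
# controlling for total test score, on simulated data with built-in bias.
import numpy as np

rng = np.random.default_rng(5)
n = 400

group = rng.binomial(1, 0.5, n)                     # 0 = reference, 1 = focal
ability = rng.normal(0, 1, n)
total = 5 * ability + rng.normal(0, 1, n) + 20      # simulated total score
# Item made slightly harder for the focal group at equal ability (built-in bias).
item = rng.binomial(1, 1 / (1 + np.exp(-(ability - 0.4 * group))))

def residuals(y, x):
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

partial_r = np.corrcoef(residuals(item, total), residuals(group, total))[0, 1]
print(f"partial correlation (item, group | total) = {partial_r:.3f}")
```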