Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 6 |
Author
Hambleton, Ronald K. | 45 |
Rogers, H. Jane | 7 |
Jones, Russell W. | 4 |
Sireci, Stephen G. | 3 |
Wells, Craig S. | 3 |
Zenisky, April L. | 3 |
Cook, Linda L. | 2 |
Robin, Frederic | 2 |
Xing, Dehui | 2 |
Allalouf, Avi | 1 |
Baldwin, Peter | 1 |
Education Level
Elementary Secondary Education | 2 |
Grade 5 | 1 |
Grade 6 | 1 |
Grade 8 | 1 |
Higher Education | 1 |
Audience
Researchers | 4 |
Practitioners | 1 |
Location
California | 1 |
Estonia | 1 |
Florida | 1 |
Israel | 1 |
Massachusetts | 1 |
Netherlands | 1 |
New York | 1 |
North Carolina | 1 |
Oklahoma | 1 |
Texas | 1 |
Assessments and Surveys
Graduate Management Admission… | 1 |
Medical College Admission Test | 1 |
National Assessment of… | 1 |
United States Medical… | 1 |
Yoo, Hanwook; Hambleton, Ronald K. – Educational Measurement: Issues and Practice, 2019
Item analysis is an integral part of operational test development and is typically conducted within two popular statistical frameworks: classical test theory (CTT) and item response theory (IRT). In this digital ITEMS module, Hanwook Yoo and Ronald K. Hambleton provide an accessible overview of operational item analysis approaches within these…
Descriptors: Item Analysis, Item Response Theory, Guidelines, Test Construction
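The module itself is interactive; as a rough companion, here is a minimal sketch (not taken from the module) of the two classical item statistics such an analysis typically reports, item difficulty (p-value) and corrected point-biserial discrimination, computed on a hypothetical 0/1 response matrix.

    import numpy as np

    def ctt_item_analysis(responses):
        """Classical item statistics for a 0/1-scored matrix (examinees x items):
        item difficulty (proportion correct) and corrected point-biserial
        discrimination against the rest score."""
        responses = np.asarray(responses, dtype=float)
        total = responses.sum(axis=1)
        difficulty = responses.mean(axis=0)
        discrimination = []
        for j in range(responses.shape[1]):
            rest = total - responses[:, j]      # exclude the item from its own criterion
            discrimination.append(np.corrcoef(responses[:, j], rest)[0, 1])
        return difficulty, np.array(discrimination)

    # Illustrative data: 5 examinees, 3 items
    p, rpb = ctt_item_analysis([[1, 0, 1], [1, 1, 1], [0, 0, 1], [1, 1, 0], [0, 0, 1]])
    print(p, rpb)
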
Clauser, Jerome C.; Hambleton, Ronald K.; Baldwin, Peter – Educational and Psychological Measurement, 2017
The Angoff standard setting method relies on content experts to review exam items and make judgments about the performance of the minimally proficient examinee. Unfortunately, at times content experts may have gaps in their understanding of specific exam content. These gaps are particularly likely to occur when the content domain is broad and/or…
Descriptors: Scores, Item Analysis, Classification, Decision Making
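For readers unfamiliar with the method, a minimal worked sketch of one common Angoff variant (illustrative numbers, not data from the study): each judge estimates the probability that a minimally proficient examinee answers each item correctly, and the raw cut score is the sum of the item-level mean ratings.

    import numpy as np

    # Hypothetical ratings: rows are judges, columns are items
    ratings = np.array([
        [0.60, 0.75, 0.40, 0.85],
        [0.55, 0.80, 0.45, 0.90],
        [0.65, 0.70, 0.50, 0.80],
    ])
    item_means = ratings.mean(axis=0)   # expected performance of the borderline examinee
    cut_score = item_means.sum()        # cut score on the raw (number-correct) scale
    print(item_means, round(cut_score, 2))
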
Wells, Craig S.; Hambleton, Ronald K.; Kirkpatrick, Robert; Meng, Yu – Applied Measurement in Education, 2014
The purpose of the present study was to develop and evaluate two procedures for flagging consequential item parameter drift (IPD) in an operational testing program. The first procedure was based on flagging items that exhibit a meaningful magnitude of IPD using a critical value that was defined to represent barely tolerable IPD. The second procedure…
Descriptors: Test Items, Test Bias, Equated Scores, Item Response Theory
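A minimal sketch of the flagging logic in the first procedure, with placeholder numbers; the study's actual criterion for "barely tolerable" drift is not reproduced here.

    # Compare item difficulty (b) estimates across administrations and flag
    # items whose shift exceeds a tolerance; 0.3 is an arbitrary placeholder.
    old_b = {"item1": -0.50, "item2": 0.20, "item3": 1.10}
    new_b = {"item1": -0.45, "item2": 0.65, "item3": 1.05}
    TOLERANCE = 0.3
    flagged = [item for item, b in old_b.items() if abs(new_b[item] - b) > TOLERANCE]
    print(flagged)  # ['item2']
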
Han, Kyung T.; Wells, Craig S.; Hambleton, Ronald K. – Practical Assessment, Research & Evaluation, 2015
In item response theory test scaling/equating with the three-parameter model, the scaling coefficients A and B have no impact on the c-parameter estimates of the test items since the c-parameter estimates are not adjusted in the scaling/equating procedure. The main research question in this study concerned how serious the consequences would be if…
Descriptors: Item Response Theory, Monte Carlo Methods, Scaling, Test Items
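The point about the c-parameter follows from the standard linear scale transformation under the three-parameter model; a minimal sketch (not the study's code):

    # If theta* = A * theta + B, the 3PL item parameters transform as
    # a* = a / A, b* = A * b + B, c* = c; the lower asymptote is untouched,
    # which is why errors in A and B cannot be absorbed by the c-estimates.
    def rescale_3pl(a, b, c, A, B):
        return a / A, A * b + B, c

    print(rescale_3pl(a=1.2, b=0.5, c=0.2, A=1.1, B=-0.3))
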
Lyren, Per-Erik; Hambleton, Ronald K. – International Journal of Testing, 2011
The equal ability distribution assumption associated with the equivalent groups equating design was investigated in the context of a selection test for admission to higher education. The purpose was to assess the consequences for the test-takers in terms of receiving improperly high or low scores compared to their peers, and to find strong…
Descriptors: Evidence, Test Items, Ability Grouping, Item Response Theory
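As context for the assumption being tested, a toy sketch of mean equating under the equivalent (random) groups design, which only holds if the two groups really are of equal ability (illustrative scores, not data from the study):

    import numpy as np

    form_x = np.array([22, 25, 30, 28, 26])   # reference form, group 1
    form_y = np.array([20, 24, 27, 25, 23])   # new form, group 2 (assumed equal ability)
    shift = form_x.mean() - form_y.mean()      # any mean difference is attributed to form difficulty
    equated_y = form_y + shift                 # Form Y scores placed on the Form X scale
    print(shift, equated_y)
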
Wells, Craig S.; Baldwin, Su; Hambleton, Ronald K.; Sireci, Stephen G.; Karatonis, Ana; Jirka, Stephen – Applied Measurement in Education, 2009
Score equity assessment is an important analysis to ensure inferences drawn from test scores are comparable across subgroups of examinees. The purpose of the present evaluation was to assess the extent to which the Grade 8 NAEP Math and Reading assessments for 2005 were equivalent across selected states. More specifically, the present study…
Descriptors: National Competency Tests, Test Bias, Equated Scores, Grade 8

Mazor, Kathleen M.; Hambleton, Ronald K.; Clauser, Brian E. – Applied Psychological Measurement, 1998
Used simulation to study whether matching on multiple test scores would reduce false-positive error rates compared with matching on a single number-correct score. False-positive error rates were reduced for most datasets. Findings suggest that assessing the dimensional structure of a test can be important in analysis of differential item functioning…
Descriptors: Error of Measurement, Item Bias, Scores, Test Items
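A minimal Mantel-Haenszel sketch of the matching idea (not the authors' code): the conditioning score is passed in directly, so it can be a single number-correct score or a level formed from multiple subscores, which is the comparison the study makes.

    import numpy as np

    def mh_odds_ratio(item, match, group):
        """item: 0/1 responses to the studied item; match: matching-score level
        per examinee; group: 0 = reference, 1 = focal."""
        item, match, group = map(np.asarray, (item, match, group))
        num = den = 0.0
        for k in np.unique(match):
            m = match == k
            A = np.sum(m & (group == 0) & (item == 1))   # reference correct
            B = np.sum(m & (group == 0) & (item == 0))   # reference incorrect
            C = np.sum(m & (group == 1) & (item == 1))   # focal correct
            D = np.sum(m & (group == 1) & (item == 0))   # focal incorrect
            T = A + B + C + D
            if T:
                num += A * D / T
                den += B * C / T
        return num / den if den else float("nan")
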

Muniz, Jose; Hambleton, Ronald K.; Xing, Dehui – International Journal of Testing, 2001
Studied two procedures for detecting potentially flawed items in translated tests with small samples: (1) conditional item "p" value comparisons; and (2) delta plots. Varied several factors in this simulation study. Findings show that the two procedures can be valuable in identifying flawed test items, especially when the size of the…
Descriptors: Identification, Sample Size, Simulation, Test Items
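For reference, a minimal sketch of the delta transform behind the delta-plot procedure (not the authors' implementation): each group's item p-values are converted to the delta scale, and items falling far from the principal-axis line through the two groups' deltas are flagged.

    from scipy.stats import norm

    def delta(p):
        """ETS delta scale: Delta = 13 + 4 * z, where z cuts off the upper
        p proportion of the normal curve; harder items get larger deltas."""
        return 13.0 + 4.0 * norm.ppf(1.0 - p)

    print(delta(0.50))   # 13.0
    print(delta(0.84))   # about 9.0 (an easy item)
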

Zenisky, April L.; Hambleton, Ronald K.; Robin, Frederic – Educational and Psychological Measurement, 2003
Studied a two-stage methodology for evaluating differential item functioning (DIF) in large-scale assessment data using a sample of 60,000 students taking a large-scale assessment. Findings illustrate the merit of iterative approaches for DIF detection, since items identified at one stage were not necessarily the same as those identified at the…
Descriptors: Item Bias, Large Scale Assessment, Research Methodology, Test Items
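A compact sketch of the two-stage (purification) idea the findings support, under the assumption of a generic DIF index; the crude conditional p-value difference below is a stand-in, not the statistic used in the study.

    import numpy as np

    def dif_index(item, match, group):
        # Stand-in DIF index: mean group difference in proportion correct
        # within matched score levels.
        diffs = []
        for k in np.unique(match):
            m = match == k
            ref, foc = item[m & (group == 0)], item[m & (group == 1)]
            if len(ref) and len(foc):
                diffs.append(ref.mean() - foc.mean())
        return abs(np.mean(diffs)) if diffs else 0.0

    def iterative_dif(responses, group, threshold=0.10, max_iter=10):
        """Stage 1 flags items against the total score; later stages re-form the
        matching score from non-flagged items only, until the flagged set stabilizes."""
        responses, group = np.asarray(responses, float), np.asarray(group)
        flagged = set()
        for _ in range(max_iter):
            keep = [j for j in range(responses.shape[1]) if j not in flagged]
            match = responses[:, keep].sum(axis=1)
            new_flags = {j for j in range(responses.shape[1])
                         if dif_index(responses[:, j], match, group) > threshold}
            if new_flags == flagged:
                break
            flagged = new_flags
        return sorted(flagged)
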

Oakland, Thomas; Poortinga, Ype H.; Schlegel, Justin; Hambleton, Ronald K. – International Journal of Testing, 2001
Traces the history of the International Test Commission (ITC), reviewing the context in which it was formed, its goals, and major milestones in its development. Suggests ways the ITC may continue to impact test development positively, and introduces this inaugural journal issue. (SLD)
Descriptors: Educational History, Educational Testing, International Education, Test Construction
Xing, Dehui; Hambleton, Ronald K. – Educational and Psychological Measurement, 2004
Computer-based testing by credentialing agencies has become common; however, selecting a test design is difficult because several good ones are available: parallel forms, computer adaptive (CAT), and multistage (MST). In this study, three computer-based test designs under some common examination conditions were investigated. Item bank size and…
Descriptors: Test Construction, Psychometrics, Item Banks, Computer Assisted Testing
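As a toy illustration of what distinguishes the multistage design from a fixed parallel form (not the configurations studied here), routing decisions send an examinee to a harder or easier second-stage module based on the first-stage score:

    def route(stage1_score, cut_low=4, cut_high=7):
        """Hypothetical two-cut routing rule for a second-stage module."""
        if stage1_score <= cut_low:
            return "easier module"
        if stage1_score >= cut_high:
            return "harder module"
        return "middle module"

    print(route(3), route(5), route(8))
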
Zenisky, April L.; Hambleton, Ronald K.; Robin, Frederic – Educational Assessment, 2004
Differential item functioning (DIF) analyses are a routine part of the development of large-scale assessments. Less common are studies to understand the potential sources of DIF. The goals of this study were (a) to identify gender DIF in a large-scale science assessment and (b) to look for trends in the DIF and non-DIF items due to content,…
Descriptors: Program Effectiveness, Test Format, Science Tests, Test Items
Zenisky, April L.; Hambleton, Ronald K.; Sireci, Stephen G. – 2001
Measurement specialists routinely assume examinee responses to test items are independent of one another. However, previous research has shown that many contemporary tests contain item dependencies, and that not accounting for these dependencies leads to misleading estimates of item, test, and ability parameters. In this study, methods for detecting…
Descriptors: Ability, College Applicants, College Entrance Examinations, Higher Education
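One widely used index for such dependencies is Yen's Q3, the correlation between model residuals for pairs of items; a minimal sketch assuming known 2PL item parameters and abilities (the paper's own detection methods may differ):

    import numpy as np

    def q3_matrix(responses, theta, a, b):
        """Q3 for a 0/1 response matrix under a 2PL with known parameters:
        correlate the examinee-by-item residuals (observed minus expected)."""
        responses = np.asarray(responses, float)
        theta = np.asarray(theta, float)[:, None]
        p = 1.0 / (1.0 + np.exp(-np.asarray(a) * (theta - np.asarray(b))))
        residuals = responses - p
        return np.corrcoef(residuals, rowvar=False)   # off-diagonal entries are Q3

    # Illustrative call with 3 examinees and 2 items
    print(q3_matrix([[0, 0], [1, 0], [1, 1]], theta=[-1.0, 0.0, 1.0], a=[1.0, 1.2], b=[0.0, 0.5]))
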

Hambleton, Ronald K.; Jones, Russell W. – Educational Research Quarterly, 1994
A judgmental method for determining item bias was applied to test data from 2,000 Native American and 2,000 Anglo-American students for a statewide proficiency test. Results indicated some shortcomings of the judgmental method but supported the use of cross-validation in empirically identifying potential bias. (SLD)
Descriptors: American Indians, Anglo Americans, Comparative Analysis, Decision Making
Hambleton, Ronald K.; And Others – 1990
Item response theory (IRT) model parameter estimates have considerable merit and open up new directions for test development, but misleading results are often obtained because of errors in the item parameter estimates. The problem of the effects of item parameter estimation errors on the test development process is discussed, and the seriousness…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Response Theory, Sampling
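To make the concern concrete, a toy sketch (parameter values and error sizes are illustrative only) showing how test information computed from error-laden 3PL estimates can diverge from information under the true parameters, the kind of distortion that misleads test assembly:

    import numpy as np

    def p3pl(theta, a, b, c):
        return c + (1 - c) / (1 + np.exp(-1.7 * a * (theta - b)))

    def item_info(theta, a, b, c):
        P = p3pl(theta, a, b, c)
        return (1.7 * a) ** 2 * ((P - c) ** 2 / (1 - c) ** 2) * ((1 - P) / P)

    theta = 0.0
    true_params = [(1.0, -0.5, 0.20), (1.3, 0.0, 0.20), (0.8, 0.6, 0.20)]
    est_params  = [(1.2, -0.3, 0.25), (1.1, 0.2, 0.15), (0.9, 0.4, 0.20)]   # with estimation error
    print(sum(item_info(theta, *p) for p in true_params))   # "true" test information at theta = 0
    print(sum(item_info(theta, *p) for p in est_params))    # information implied by the estimates
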