Showing 1 to 15 of 67 results
Peer reviewed
Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023
Online calibration is a key technique for calibrating new items in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…
Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items
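The abstract is truncated before any technical detail, so the following is a minimal, hedged sketch of one classic online-calibration idea (in the spirit of Stocking's Method A, not necessarily the authors' method): provisional abilities from the operational CAT are treated as known, and a pretest item's 2PL parameters are estimated by maximum likelihood on simulated data.

```python
# Sketch: calibrate one new (pretest) item online, treating the provisional
# CAT ability estimates as fixed and maximizing the 2PL likelihood.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
theta = rng.normal(size=500)            # provisional abilities (assumed known)
a_true, b_true = 1.2, 0.4               # generating values for the new item
p = 1 / (1 + np.exp(-a_true * (theta - b_true)))
y = rng.binomial(1, p)                  # responses to the pretest item

def neg_loglik(params):
    a, b = params
    q = 1 / (1 + np.exp(-a * (theta - b)))
    q = np.clip(q, 1e-9, 1 - 1e-9)      # guard the log against 0 and 1
    return -np.sum(y * np.log(q) + (1 - y) * np.log(1 - q))

est = minimize(neg_loglik, x0=[1.0, 0.0], method="Nelder-Mead")
print("estimated (a, b):", np.round(est.x, 2))   # close to (1.2, 0.4)
```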
Peer reviewed
Lim, Hwanggyu; Choe, Edison M.; Han, Kyung T. – Journal of Educational Measurement, 2022
Differential item functioning (DIF) of test items should be evaluated using practical methods that can produce accurate and useful results. Among a plethora of DIF detection techniques, we introduce the new "Residual DIF" (RDIF) framework, which stands out for its accessibility without sacrificing efficacy. This framework consists of…
Descriptors: Test Items, Item Response Theory, Identification, Robustness (Statistics)
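The RDIF statistics themselves fall outside the truncated abstract, so the sketch below only illustrates the general residual idea behind such methods: compare mean Rasch residuals (observed response minus model-implied probability) between reference and focal groups. It is not Lim, Choe, and Han's exact estimator.

```python
# Sketch: flag DIF by comparing mean model residuals across groups.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
group = rng.integers(0, 2, n)               # 0 = reference, 1 = focal
theta = rng.normal(size=n)
b = 0.0                                     # studied item's difficulty
p_true = 1 / (1 + np.exp(-(theta - b - 0.5 * group)))   # uniform DIF
y = rng.binomial(1, p_true)

p_model = 1 / (1 + np.exp(-(theta - b)))    # DIF-free model prediction
resid = y - p_model
ref, foc = resid[group == 0], resid[group == 1]
d = foc.mean() - ref.mean()
se = np.sqrt(foc.var(ddof=1) / foc.size + ref.var(ddof=1) / ref.size)
print("standardized residual difference:", round(d / se, 2))  # large |z| flags DIF
```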
Peer reviewed
Sun-Joo Cho; Amanda Goodwin; Matthew Naveiras; Paul De Boeck – Journal of Educational Measurement, 2024
Explanatory item response models (EIRMs) have been applied to investigate the effects of person covariates, item covariates, and their interactions in the fields of reading education and psycholinguistics. In practice, it is often assumed that the relationships between the covariates and the logit transformation of item response probability are…
Descriptors: Item Response Theory, Test Items, Models, Maximum Likelihood Statistics
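As a hedged illustration of an explanatory item response model with item covariates (an LLTM-style EIRM, simplified for brevity), the sketch below treats person abilities as known offsets in a binomial GLM; real EIRM estimation would typically model persons as random effects rather than fix them.

```python
# Sketch: recover item-covariate effects in logit P(correct) = theta + X @ beta.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
P, I = 300, 20
theta = rng.normal(size=P)                               # person abilities
X_item = rng.integers(0, 2, size=(I, 2)).astype(float)   # two item covariates
beta = np.array([-0.8, 0.5])                             # covariate effects
logits = theta[:, None] + X_item @ beta
y = rng.binomial(1, 1 / (1 + np.exp(-logits))).ravel()   # person-major order

X_long = np.tile(X_item, (P, 1))          # item covariates in long format
offset = np.repeat(theta, I)              # abilities treated as known offsets
fit = sm.GLM(y, X_long, family=sm.families.Binomial(), offset=offset).fit()
print("estimated covariate effects:", np.round(fit.params, 2))  # ~(-0.8, 0.5)
```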
Peer reviewed
Bengs, Daniel; Kroehne, Ulf; Brefeld, Ulf – Journal of Educational Measurement, 2021
By tailoring test forms to the test-taker's proficiency, Computerized Adaptive Testing (CAT) enables substantial increases in testing efficiency over fixed-form testing. When used for formative assessment, the alignment of task difficulty with proficiency increases the chance that teachers can derive useful feedback from assessment data. The…
Descriptors: Computer Assisted Testing, Formative Evaluation, Group Testing, Program Effectiveness
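The abstract truncates before the selection rule, so the toy loop below just assumes the standard Rasch fact that item information peaks where difficulty equals ability: each step administers the unused item closest to the provisional estimate, then takes one Newton-Raphson step toward the ML ability.

```python
# Sketch: Rasch CAT that matches item difficulty to the provisional ability.
import numpy as np

rng = np.random.default_rng(3)
bank = np.sort(rng.uniform(-3, 3, 200))   # item difficulties in the pool
theta_true, theta_hat = 1.0, 0.0
used, responses = [], []

for _ in range(20):
    free = [i for i in range(len(bank)) if i not in used]
    item = min(free, key=lambda i: abs(bank[i] - theta_hat))  # max information
    used.append(item)
    p = 1 / (1 + np.exp(-(theta_true - bank[item])))
    responses.append(rng.binomial(1, p))
    p_hat = 1 / (1 + np.exp(-(theta_hat - bank[used])))
    score = np.sum(responses) - np.sum(p_hat)        # ML score function
    info = np.sum(p_hat * (1 - p_hat))               # test information
    theta_hat = float(np.clip(theta_hat + score / max(info, 1e-6), -4, 4))

print("final ability estimate:", round(theta_hat, 2))  # approaches 1.0
```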
Peer reviewed
Kim, Kyung Yong – Journal of Educational Measurement, 2020
New items are often evaluated prior to their operational use to obtain item response theory (IRT) item parameter estimates for quality control purposes. Fixed parameter calibration is one linking method that is widely used to estimate parameters for new items and place them on the desired scale. This article provides detailed descriptions of two…
Descriptors: Item Response Theory, Evaluation Methods, Test Items, Simulation
Peer reviewed
Wyse, Adam E.; McBride, James R. – Journal of Educational Measurement, 2021
A key consideration when giving any computerized adaptive test (CAT) is how much adaptation is present when the test is used in practice. This study introduces a new framework to measure the amount of adaptation of Rasch-based CATs based on the differences between the selected item locations (Rasch item difficulty parameters) of the…
Descriptors: Item Response Theory, Computer Assisted Testing, Adaptive Testing, Test Items
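The framework's actual statistics are cut off by the truncation; the sketch below is only one plausible ratio-type adaptation summary in the same spirit, comparing how the difficulties an examinee received spread out and track ability across examinees.

```python
# Sketch: a fully adaptive CAT makes administered difficulty track ability,
# so the SD ratio and the correlation both move toward 1; a fixed form does not.
import numpy as np

rng = np.random.default_rng(4)
n = 1000
theta = rng.normal(size=n)
# mean difficulty of the items each examinee saw (simulated, not a real CAT)
b_adaptive = theta + rng.normal(scale=0.3, size=n)   # tracks ability
b_fixed = rng.normal(scale=0.1, size=n)              # same form for everyone

for label, b_mean in [("adaptive", b_adaptive), ("fixed", b_fixed)]:
    sd_ratio = b_mean.std(ddof=1) / theta.std(ddof=1)
    r = np.corrcoef(b_mean, theta)[0, 1]
    print(f"{label}: SD ratio = {sd_ratio:.2f}, corr with ability = {r:.2f}")
```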
Peer reviewed
Chen, Chia-Wen; Wang, Wen-Chung; Chiu, Ming Ming; Ro, Sage – Journal of Educational Measurement, 2020
The use of computerized adaptive testing algorithms for ranking items (e.g., college preferences, career choices) involves two major challenges: unacceptably high computation times (selecting from a large item pool with many dimensions) and biased results (enhanced preferences or intensified examinee responses because of repeated statements across…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
Peer reviewed
DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023
A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness
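As a hedged illustration of the SDT framing (not DeCarlo's exact estimator), the sketch below proxies the latent "true split" with a rest-score split and computes a distance-style discrimination d and a criterion-style difficulty c from hit and false-alarm rates.

```python
# Sketch: SDT-style item statistics from a proxy know/don't-know split.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
n, k = 2000, 25
theta = rng.normal(size=n)
b = rng.uniform(-1.5, 1.5, size=k)
y = rng.binomial(1, 1 / (1 + np.exp(-(theta[:, None] - b))))

item = 0
rest = y.sum(axis=1) - y[:, item]          # total score excluding the item
know = rest >= np.median(rest)             # proxy for the latent "true split"
hit = y[know, item].mean()                 # P(correct | "know")
fa = y[~know, item].mean()                 # P(correct | "don't know")
d = norm.ppf(hit) - norm.ppf(fa)           # discrimination as a distance
c = -0.5 * (norm.ppf(hit) + norm.ppf(fa))  # criterion: larger = harder item
print(f"item 0: d = {d:.2f}, c = {c:.2f}")
```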
Peer reviewed
Wyse, Adam E.; Babcock, Ben – Journal of Educational Measurement, 2019
One common phenomenon in Angoff standard setting is that panelists regress their ratings toward the middle of the probability scale. This study describes two indices, based on taking ratios of standard deviations, that can be used with a scatterplot of item ratings versus expected probabilities of success to identify whether ratings are…
Descriptors: Item Analysis, Standard Setting, Probability, Feedback (Response)
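The core of the ratio-of-standard-deviations idea is easy to show: if panelists pull ratings toward the middle of the probability scale, the ratings vary less than the empirical success probabilities and the ratio drops below 1. The article defines the two published indices; this is only a minimal sketch.

```python
# Sketch: an SD ratio well below 1 signals regressed Angoff ratings.
import numpy as np

rng = np.random.default_rng(6)
expected_p = rng.uniform(0.1, 0.95, size=40)   # empirical success probabilities
# ratings pulled halfway toward 0.5, plus judge noise
ratings = 0.5 + 0.5 * (expected_p - 0.5) + rng.normal(scale=0.04, size=40)

sd_ratio = ratings.std(ddof=1) / expected_p.std(ddof=1)
print(f"SD ratio = {sd_ratio:.2f}")            # near 0.5 here
```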
Peer reviewed
Berger, Stéphanie; Verschoor, Angela J.; Eggen, Theo J. H. M.; Moser, Urs – Journal of Educational Measurement, 2019
Calibration of an item bank for computer adaptive testing requires substantial resources. In this study, we investigated whether the efficiency of calibration under the Rasch model could be enhanced by improving the match between item difficulty and student ability. We introduced targeted multistage calibration designs, a design type that…
Descriptors: Simulation, Computer Assisted Testing, Test Items, Difficulty Level
Peer reviewed
Ip, Edward H.; Strachan, Tyler; Fu, Yanyan; Lay, Alexandra; Willse, John T.; Chen, Shyh-Huei; Rutkowski, Leslie; Ackerman, Terry – Journal of Educational Measurement, 2019
Test items must often be broad in scope to be ecologically valid. It is therefore almost inevitable that secondary dimensions are introduced into a test during test development. A cognitive test may require one or more abilities besides the primary ability to correctly respond to an item, in which case a unidimensional test score overestimates the…
Descriptors: Test Items, Test Bias, Test Construction, Scores
Peer reviewed
Chun Wang; Ping Chen; Shengyu Jiang – Journal of Educational Measurement, 2020
Many large-scale educational surveys have moved from linear form design to multistage testing (MST) design. One advantage of MST is that it can provide more accurate latent trait (θ) estimates using fewer items than required by linear tests. However, MST generates incomplete response data by design; hence, questions remain as to how to…
Descriptors: Test Construction, Test Items, Adaptive Testing, Maximum Likelihood Statistics
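One concrete consequence of MST's missing-by-design data is that the ability likelihood sums only over the items a student was actually routed to; nothing is imputed. A minimal Rasch sketch, with unadministered items coded as NaN:

```python
# Sketch: ML ability estimation that skips items missing by design.
import numpy as np
from scipy.optimize import minimize_scalar

b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 1.5])   # pooled item difficulties
y = np.array([1, 1, np.nan, np.nan, 1, 0])       # NaN = not administered

def neg_loglik(theta):
    seen = ~np.isnan(y)                           # administered items only
    p = 1 / (1 + np.exp(-(theta - b[seen])))
    return -np.sum(y[seen] * np.log(p) + (1 - y[seen]) * np.log(1 - p))

est = minimize_scalar(neg_loglik, bounds=(-4, 4), method="bounded")
print("theta_hat:", round(est.x, 2))
```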
Peer reviewed
Svetina, Dubravka; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2019
This study investigates the effect of several design and administration choices on item exposure and person/item parameter recovery under a multistage test (MST) design. In a simulation study, we examine whether number-correct (NC) or item response theory (IRT) methods are differentially effective at routing students to the correct next stage(s)…
Descriptors: Measurement, Item Analysis, Test Construction, Item Response Theory
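A hedged sketch of the two routing families compared in this literature: a number-correct (NC) cut on the routing module versus a cut on a provisional IRT ability estimate. The cutoffs below are arbitrary illustrations, not values from the study.

```python
# Sketch: NC routing vs. IRT routing after a four-item routing module.
import numpy as np
from scipy.optimize import minimize_scalar

b_router = np.array([-0.5, 0.0, 0.5, 1.0])   # routing-module difficulties
y = np.array([1, 1, 1, 0])                   # one student's responses

nc_route = "hard" if y.sum() >= 3 else "easy"          # number-correct rule

def neg_loglik(theta):                        # Rasch ML from the module
    p = 1 / (1 + np.exp(-(theta - b_router)))
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

theta_hat = minimize_scalar(neg_loglik, bounds=(-4, 4), method="bounded").x
irt_route = "hard" if theta_hat >= 0.5 else "easy"     # IRT rule
print(f"NC: {nc_route}, IRT: {irt_route} (theta_hat = {theta_hat:.2f})")
```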
Peer reviewed
Drabinová, Adéla; Martinková, Patrícia – Journal of Educational Measurement, 2017
In this article we present a general approach, not relying on item response theory models (non-IRT), to detect differential item functioning (DIF) in dichotomous items in the presence of guessing. The proposed nonlinear regression (NLR) procedure for DIF detection extends a method based on logistic regression. As a non-IRT approach, NLR can…
Descriptors: Test Items, Regression (Statistics), Guessing (Tests), Identification
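As a hedged illustration (the article specifies the exact NLR model and tests), the sketch below extends the logistic-regression DIF model with a lower asymptote for guessing and fits it by nonlinear least squares; a nonzero group coefficient indicates uniform DIF.

```python
# Sketch: logistic-regression DIF extended with a guessing asymptote c.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(7)
n = 3000
x = rng.normal(size=n)            # matching criterion (e.g., standardized score)
g = rng.integers(0, 2, n)         # 0 = reference, 1 = focal
p = 0.2 + 0.8 / (1 + np.exp(-(-0.2 + 1.3 * x - 0.6 * g)))
y = rng.binomial(1, p).astype(float)

def nlr(X, b0, b1, b2, c):
    x, g = X
    return c + (1 - c) / (1 + np.exp(-(b0 + b1 * x + b2 * g)))

params, _ = curve_fit(nlr, (x, g), y, p0=[0.0, 1.0, 0.0, 0.1],
                      bounds=([-5, 0, -5, 0], [5, 5, 5, 0.5]))
print("b0, b1, b2 (group/DIF), c:", np.round(params, 2))  # ~(-0.2, 1.3, -0.6, 0.2)
```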
Peer reviewed
Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020
Although the root mean square deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity for detecting misfit may depend strongly on the proficiency distribution of the considered…
Descriptors: Test Items, Goodness of Fit, Probability, Accuracy
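The dependence on the proficiency distribution is easy to demonstrate: RMSD weights the squared gap between the country-specific and international item characteristic curves by the country's proficiency density, so identical misfit can yield very different RMSD values depending on where that density sits.

```python
# Sketch: the same ICC misfit produces different RMSD values as the
# country's proficiency distribution shifts along the theta scale.
import numpy as np

theta = np.linspace(-4, 4, 401)
p_model = 1 / (1 + np.exp(-theta))      # international ICC
p_obs = 0.2 + 0.8 * p_model             # country ICC: misfit concentrated low

def rmsd(mu, sd=1.0):                   # country proficiency ~ N(mu, sd)
    w = np.exp(-0.5 * ((theta - mu) / sd) ** 2)
    w /= w.sum()
    return np.sqrt(np.sum(w * (p_obs - p_model) ** 2))

for mu in (-2.0, 0.0, 2.0):
    print(f"country mean {mu:+.0f}: RMSD = {rmsd(mu):.3f}")  # shrinks as mu rises
```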