ERIC - Search Results

Showing 1 to 15 of 127 results

A Dual-Purpose Model for Binary Data: Estimating Ability and Misconceptions

Peer reviewed

Wenchao Ma; Miguel A. Sorrel; Xiaoming Zhai; Yuan Ge – Journal of Educational Measurement, 2024

Most existing diagnostic models are developed to detect whether students have mastered a set of skills of interest, but few have focused on identifying what scientific misconceptions students possess. This article developed a general dual-purpose model for simultaneously estimating students' overall ability and the presence and absence of…

Descriptors: Models, Misconceptions, Diagnostic Tests, Ability

Modeling Slipping Effects in a Large-Scale Assessment with Innovative Item Formats

Peer reviewed

Cuhadar, Ismail; Binici, Salih – Educational Measurement: Issues and Practice, 2022

This study employs the 4-parameter logistic item response theory model to account for the unexpected incorrect responses or slipping effects observed in a large-scale Algebra 1 End-of-Course assessment, including several innovative item formats. It investigates whether modeling the misfit at the upper asymptote has any practical impact on the…

Descriptors: Item Response Theory, Measurement, Student Evaluation, Algebra

Robust Estimation of Ability and Mental Speed Employing the Hierarchical Model for Responses and Response Times

Peer reviewed

Ranger, Jochen; Kuhn, Jörg-Tobias; Wolgast, Anett – Journal of Educational Measurement, 2021

Van der Linden's hierarchical model for responses and response times can be used in order to infer the ability and mental speed of test takers from their responses and response times in an educational test. A standard approach for this is maximum likelihood estimation. In real-world applications, the data of some test takers might be partly…

Descriptors: Models, Reaction Time, Item Response Theory, Tests

Assessing Ability Recovery of the Sequential IRT Model with Unstructured Multiple-Attempt Data

Peer reviewed

Ziying Li; A. Corinne Huggins-Manley; Walter L. Leite; M. David Miller; Eric A. Wright – Educational and Psychological Measurement, 2022

The unstructured multiple-attempt (MA) item response data in virtual learning environments (VLEs) are often from student-selected assessment data sets, which include missing data, single-attempt responses, multiple-attempt responses, and unknown growth ability across attempts, leading to a complex and complicated scenario for using this kind of…

Descriptors: Sequential Approach, Item Response Theory, Data, Simulation

Parameter Estimation Bias of Dichotomous Logistic Item Response Theory Models Using Different Variables

Peer reviewed

Köse, Alper; Dogan, C. Deha – International Journal of Evaluation and Research in Education, 2019

The aim of this study was to examine the precision of item parameter estimation in different sample sizes and test lengths under the three-parameter logistic (3PL) item response theory (IRT) model, where the trait measured by a test was not normally distributed or had a skewed distribution. In the study, number of categories (1-0), and item…

Descriptors: Statistical Bias, Item Response Theory, Simulation, Accuracy

A Short Note on the Relationship between Pass Rate and Multiple Attempts

Peer reviewed

Cheng, Ying; Liu, Cheng – Journal of Educational Measurement, 2016

For a certification, licensure, or placement exam, allowing examinees to take multiple attempts at the test could effectively change the pass rate. Change in the pass rate can occur without any change in the underlying latent trait, and can be an artifact of multiple attempts and imperfect reliability of the test. By deriving formulae to compute…

Descriptors: Testing, Computation, Change, Simulation

Effects of Calibration Sample Size and Item Bank Size on Ability Estimation in Computerized Adaptive Testing

Peer reviewed

Sahin, Alper; Weiss, David J. – Educational Sciences: Theory and Practice, 2015

This study aimed to investigate the effects of calibration sample size and item bank size on examinee ability estimation in computerized adaptive testing (CAT). For this purpose, a 500-item bank pre-calibrated using the three-parameter logistic model with 10,000 examinees was simulated. Calibration samples of varying sizes (150, 250, 350, 500,…

Descriptors: Adaptive Testing, Computer Assisted Testing, Sample Size, Item Banks

A Nonparametric Approach to Estimate Classification Accuracy and Consistency

Peer reviewed

Lathrop, Quinn N.; Cheng, Ying – Journal of Educational Measurement, 2014

When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…

Descriptors: Cutting Scores, Classification, Computation, Nonparametric Statistics

The Influence of Item Calibration Error on Variable-Length Computerized Adaptive Testing

Peer reviewed

Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi – Applied Psychological Measurement, 2013

Variable-length computerized adaptive testing (VL-CAT) allows both items and test length to be "tailored" to examinees, thereby achieving the measurement goal (e.g., scoring precision or classification) with as few items as possible. Several popular test termination rules depend on the standard error of the ability estimate, which in turn depends…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Length, Ability

An Assessment of the Nonparametric Approach for Evaluating the Fit of Item Response Models

Peer reviewed

Liang, Tie; Wells, Craig S.; Hambleton, Ronald K. – Journal of Educational Measurement, 2014

As item response theory has been more widely applied, investigating the fit of a parametric model becomes an important part of the measurement process. There is a lack of promising solutions to the detection of model misfit in IRT. Douglas and Cohen introduced a general nonparametric approach, RISE (Root Integrated Squared Error), for detecting…

Descriptors: Item Response Theory, Measurement Techniques, Nonparametric Statistics, Models

Deriving Stopping Rules for Multidimensional Computerized Adaptive Testing

Peer reviewed

Wang, Chun; Chang, Hua-Hua; Boughton, Keith A. – Applied Psychological Measurement, 2013

Multidimensional computerized adaptive testing (MCAT) is able to provide a vector of ability estimates for each examinee, which could be used to provide a more informative profile of an examinee's performance. The current literature on MCAT focuses on the fixed-length tests, which can generate less accurate results for those examinees whose…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Length, Item Banks

Item-Weighted Likelihood Method for Ability Estimation in Tests Composed of Both Dichotomous and Polytomous Items

Peer reviewed

Tao, Jian; Shi, Ning-Zhong; Chang, Hua-Hua – Journal of Educational and Behavioral Statistics, 2012

For mixed-type tests composed of both dichotomous and polytomous items, polytomous items often yield more information than dichotomous ones. To reflect the difference between the two types of items, polytomous items are usually pre-assigned with larger weights. We propose an item-weighted likelihood method to better assess examinees' ability…

Descriptors: Test Items, Weighted Scores, Maximum Likelihood Statistics, Statistical Bias

Comparing Performances (Type I Error and Power) of IRT Likelihood Ratio SIBTEST and Mantel-Haenszel Methods in the Determination of Differential Item Functioning

Peer reviewed

Atalay Kabasakal, Kübra; Arsan, Nihan; Gök, Bilge; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2014

This simulation study compared the performances (Type I error and power) of Mantel-Haenszel (MH), SIBTEST, and item response theory-likelihood ratio (IRT-LR) methods under certain conditions. Manipulated factors were sample size, ability differences between groups, test length, the percentage of differential item functioning (DIF), and underlying…

Descriptors: Comparative Analysis, Item Response Theory, Statistical Analysis, Test Bias

Linking Item Parameters to a Base Scale

Peer reviewed

Kang, Taehoon; Petersen, Nancy S. – Asia Pacific Education Review, 2012

This paper compares three methods of item calibration--concurrent calibration, separate calibration with linking, and fixed item parameter calibration--that are frequently used for linking item parameters to a base scale. Concurrent and separate calibrations were implemented using BILOG-MG. The Stocking and Lord in "Appl Psychol Measure"…

Descriptors: Methods, Comparative Analysis, Test Items, Item Response Theory

Establishing the Criterion-Related, Construct, and Content Validities of a Simulation-Based Assessment of Inquiry Abilities

Peer reviewed

Wu, Pai-Hsing; Wu, Hsin-Kai; Hsu, Ying-Shao – International Journal of Science Education, 2014

The emphasis on scientific inquiry has increased the importance in developing the fundamental abilities to conduct scientific investigations and urged a need for valid assessments of students' inquiry abilities. We took advantage of the advanced technology to develop a simulation-based assessment of inquiry abilities (SAIA) that allowed…

Descriptors: Construct Validity, Content Validity, Inquiry, Scientific Research
