Publication Date
  In 2025: 0
  Since 2024: 1
  Since 2021 (last 5 years): 3
  Since 2016 (last 10 years): 8
  Since 2006 (last 20 years): 13
Descriptor
  Simulation: 16
  Student Evaluation: 16
  Test Items: 16
  Item Response Theory: 10
  Evaluation Methods: 7
  Computer Assisted Testing: 4
  Test Bias: 4
  Test Construction: 4
  Accuracy: 3
  Measurement: 3
  Models: 3
Author
  Wang, Wen-Chung: 2
  Abayeva, Nella F.: 1
  Binici, Salih: 1
  Boughton, Keith A.: 1
  Bramley, Tom: 1
  Cao, Jing: 1
  Cuhadar, Ismail: 1
  Cui, Ying: 1
  Deng, Jien-Han: 1
  Diakow, Ronli Phyllis: 1
  Frey, Andreas: 1
Publication Type
  Journal Articles: 14
  Reports - Research: 9
  Reports - Evaluative: 4
  Dissertations/Theses -…: 2
  Reports - Descriptive: 2
Education Level
  Elementary Education: 1
  Higher Education: 1
  Secondary Education: 1
Location
  United Kingdom (England): 1
Assessments and Surveys
  Program for International…: 1
Yue Liu; Zhen Li; Hongyun Liu; Xiaofeng You – Applied Measurement in Education, 2024
Low test-taking effort of examinees has been considered a source of construct-irrelevant variance in item response modeling, with serious consequences for parameter estimation. This study aims to investigate how non-effortful response (NER) influences the estimation of item and person parameters in item-pool scale linking (IPSL) and whether…
Descriptors: Item Response Theory, Computation, Simulation, Responses
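The non-effortful response mechanism that Liu et al. study can be illustrated with a minimal simulation. This is a hypothetical sketch, not the authors' design: the 2PL response model, the item parameters, and the non-effort rate are all assumed for illustration.

```python
import math
import random

def simulate_item(thetas, a, b, noneffort_rate, n_options=4, seed=0):
    """Proportion correct on one 2PL item when a fraction of
    examinees respond non-effortfully (blind random guessing)."""
    rng = random.Random(seed)
    correct = 0
    for theta in thetas:
        if rng.random() < noneffort_rate:
            p = 1.0 / n_options                            # rapid guess
        else:
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))   # 2PL model
        correct += rng.random() < p
    return correct / len(thetas)

# Able examinees (theta = 2) look far weaker once half of the
# responses are non-effortful, biasing item statistics downward.
effortful = simulate_item([2.0] * 2000, a=1.5, b=0.0, noneffort_rate=0.0)
mixed = simulate_item([2.0] * 2000, a=1.5, b=0.0, noneffort_rate=0.5)
```

Comparing `effortful` and `mixed` shows why treating NER as ordinary responses contaminates item-difficulty estimates: the item appears much harder than it is for the effortful population.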
Cuhadar, Ismail; Binici, Salih – Educational Measurement: Issues and Practice, 2022
This study employs the 4-parameter logistic item response theory model to account for the unexpected incorrect responses or slipping effects observed in a large-scale Algebra 1 End-of-Course assessment, including several innovative item formats. It investigates whether modeling the misfit at the upper asymptote has any practical impact on the…
Descriptors: Item Response Theory, Measurement, Student Evaluation, Algebra
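For context, the 4-parameter logistic model referenced above adds an upper asymptote d < 1 to the familiar 3PL, so even high-ability examinees can "slip" on an item. A minimal sketch, with parameter values invented for illustration:

```python
import math

def p_4pl(theta, a, b, c, d):
    """4PL IRT: probability of a correct response.
    a: discrimination, b: difficulty, c: lower asymptote
    (guessing), d: upper asymptote (d < 1 models slipping)."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta = b the probability is halfway between c and d;
# as theta grows it approaches d, not 1.
mid = p_4pl(0.0, a=1.2, b=0.0, c=0.2, d=0.95)   # -> 0.575
high = p_4pl(4.0, a=1.2, b=0.0, c=0.2, d=0.95)
```

Setting d = 1 recovers the 3PL, which is why misfit at the upper asymptote only matters when unexpected incorrect responses from strong examinees are actually present in the data.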
Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023
Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents, and the number of ability levels, this study aims to provide a closed formula for adaptive tests of medium difficulty (probability of solution p = 1/2) to determine the accuracy of the parameters for each item and in…
Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level
Mousavi, Amin; Cui, Ying – Education Sciences, 2020
Important decisions regarding accountability and the placement of students in performance categories are often made on the basis of test scores, so it is important to evaluate the validity of the inferences derived from test results. One threat to the validity of such inferences is aberrant responding. Several…
Descriptors: Student Evaluation, Educational Testing, Psychological Testing, Item Response Theory
Steinkamp, Susan Christa – ProQuest LLC, 2017
The use and interpretation of test scores that rely on accurate ability estimation via an IRT model depend on the assumption that the IRT model fits the data. Examinees who do not put forth full effort in answering test questions, have prior knowledge of test content, or do not approach a test with the intent of answering…
Descriptors: Test Items, Item Response Theory, Scores, Test Wiseness
Rutkowski, Leslie; Rutkowski, David; Zhou, Yan – International Journal of Testing, 2016
Using an empirically-based simulation study, we show that typically used methods of choosing an item calibration sample have significant impacts on achievement bias and system rankings. We examine whether recent PISA accommodations, especially for lower performing participants, can mitigate some of this bias. Our findings indicate that standard…
Descriptors: Simulation, International Programs, Adolescents, Student Evaluation
Bramley, Tom – Research in Mathematics Education, 2017
This study compared models of assessment structure for achieving differentiation across the range of examinee attainment in the General Certificate of Secondary Education (GCSE) examination taken by 16-year-olds in England. The focus was on the "adjacent levels" model, where papers are targeted at three specific non-overlapping ranges of…
Descriptors: Foreign Countries, Mathematics Education, Student Certification, Student Evaluation
Golovachyova, Viktoriya N.; Menlibekova, Gulbakhyt Zh.; Abayeva, Nella F.; Ten, Tatyana L.; Kogaya, Galina D. – International Journal of Environmental and Science Education, 2016
Computer-based monitoring systems that rely on tests may be the most effective way to evaluate knowledge. The problem of objective knowledge assessment through testing takes on a new dimension in the context of new educational paradigms. Analysis of existing test methods enabled us to conclude that tests with selected…
Descriptors: Expertise, Computer Assisted Testing, Student Evaluation, Knowledge Level
Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2014
Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…
Descriptors: Student Evaluation, Item Response Theory, Models, Simulation
Wang, Chu-Fu; Lin, Chih-Lung; Deng, Jien-Han – Turkish Online Journal of Educational Technology - TOJET, 2012
Testing is an important stage of teaching, as it helps teachers assess students' learning results. A good test accurately reflects a learner's capability. Computer-Assisted Testing (CAT) is now greatly improving on traditional testing, since computers can automatically and quickly compose a proper test sheet to meet user…
Descriptors: Simulation, Test Items, Student Evaluation, Test Construction
Diakow, Ronli Phyllis – ProQuest LLC, 2013
This dissertation comprises three papers that propose, discuss, and illustrate models to make improved inferences about research questions regarding student achievement in education. Addressing the types of questions common in educational research today requires three different "extensions" to traditional educational assessment: (1)…
Descriptors: Inferences, Educational Assessment, Academic Achievement, Educational Research
Frey, Andreas; Seitz, Nicki-Nils – Studies in Educational Evaluation, 2009
The paper gives an overview of multidimensional adaptive testing (MAT) and evaluates its applicability in educational and psychological testing. The approach of Segall (1996) is described as a general framework for MAT. The main advantage of MAT is its capability to increase measurement efficiency. In simulation studies conceptualizing situations…
Descriptors: Psychological Testing, Adaptive Testing, Simulation, Evaluation Methods
Cao, Jing; Stokes, S. Lynne – Psychometrika, 2008
According to the recent Nation's Report Card, 12th-graders failed to produce gains on the 2005 National Assessment of Educational Progress (NAEP) despite earning better grades on average. One possible explanation is that 12th-graders were not motivated when taking the NAEP, which is a low-stakes test. We develop three Bayesian IRT mixture models to…
Descriptors: Test Items, Simulation, National Competency Tests, Item Response Theory
Gierl, Mark J.; Gotzmann, Andrea; Boughton, Keith A. – Applied Measurement in Education, 2004
Differential item functioning (DIF) analyses are used to identify items that operate differently between two groups, after controlling for ability. The Simultaneous Item Bias Test (SIBTEST) is a popular DIF detection method that matches examinees on a true-score estimate of ability. However, in some testing situations, like test translation and…
Descriptors: True Scores, Simulation, Test Bias, Student Evaluation
Wang, Wen-Chung; Su, Ya-Hui – Applied Measurement in Education, 2004
In this study we investigated the effects of the average signed area (ASA) between the item characteristic curves of the reference and focal groups and three test purification procedures on the uniform differential item functioning (DIF) detection via the Mantel-Haenszel (M-H) method through Monte Carlo simulations. The results showed that ASA,…
Descriptors: Test Bias, Student Evaluation, Evaluation Methods, Test Items
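As background for the Mantel-Haenszel method used in the two DIF studies above, here is a minimal sketch of the common odds-ratio statistic it rests on. The counts are invented for illustration; real analyses stratify examinees on total test score or another matching variable.

```python
def mh_common_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across ability strata.
    Each stratum is (a, b, c, d): reference-group correct/incorrect
    and focal-group correct/incorrect counts on the studied item.
    A value near 1 suggests no uniform DIF."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Balanced counts in every stratum: odds ratio = 1 (no DIF signal).
no_dif = mh_common_odds_ratio([(50, 50, 50, 50), (30, 20, 30, 20)])
# Focal group answers correctly less often at matched ability: OR > 1.
dif = mh_common_odds_ratio([(60, 40, 40, 60), (40, 10, 25, 25)])
```

Because the statistic pools within-stratum odds ratios after matching on ability, impurities in the matching criterion (the "purification" issue Wang and Su examine) feed directly into the DIF decision.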