ERIC - Search Results

Showing 1 to 15 of 19 results

A Generalized Objective Function for Computer Adaptive Item Selection

Peer reviewed

Harold Doran; Tetsuhiro Yamada; Ted Diaz; Emre Gonulates; Vanessa Culver – Journal of Educational Measurement, 2025

Computer adaptive testing (CAT) is an increasingly common mode of test administration offering improved test security, better measurement precision, and the potential for shorter testing experiences. This article presents a new item selection algorithm based on a generalized objective function to support multiple types of testing conditions and…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Algorithms

The Vulnerability of AI-Based Scoring Systems to Gaming Strategies: A Case Study

Peer reviewed

Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025

Recent developments in the use of large language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…

Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy

Modeling Directional Testlet Effects on Multiple Open-Ended Questions

Peer reviewed

Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025

Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies caused between items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…

Descriptors: Models, Test Items, Educational Assessment, Scores

Anchoring Validity Evidence for Automated Essay Scoring

Peer reviewed

Shermis, Mark D. – Journal of Educational Measurement, 2022

One of the challenges of discussing validity arguments for machine scoring of essays centers on the absence of a commonly held definition and theory of good writing. At best, the algorithms attempt to measure select attributes of writing and calibrate them against human ratings with the goal of accurate prediction of scores for new essays.…

Descriptors: Scoring, Essays, Validity, Writing Evaluation

Bayesian Model Selection Methods for Multilevel IRT Models: A Comparison of Five DIC-Based Indices

Peer reviewed

Zhang, Xue; Tao, Jian; Wang, Chun; Shi, Ning-Zhong – Journal of Educational Measurement, 2019

Model selection is important in any statistical analysis, and the primary goal is to find the preferred (or most parsimonious) model, based on certain criteria, from a set of candidate models given data. Several recent publications have employed the deviance information criterion (DIC) to do model selection among different forms of multilevel item…

Descriptors: Bayesian Statistics, Item Response Theory, Measurement, Models

Modeling Skipped and Not-Reached Items Using IRTrees

Peer reviewed

Debeer, Dries; Janssen, Rianne; De Boeck, Paul – Journal of Educational Measurement, 2017

When dealing with missing responses, two types of omissions can be discerned: items can be skipped or not reached by the test taker. When the occurrence of these omissions is related to the proficiency process, the missingness is nonignorable. The purpose of this article is to present a tree-based IRT framework for modeling responses and omissions…

Descriptors: Item Response Theory, Test Items, Responses, Testing Problems

Item Response Theory Models for Performance Decline during Testing

Peer reviewed

Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2014

Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…

Descriptors: Student Evaluation, Item Response Theory, Models, Simulation

Item Response Models for Examinee-Selected Items

Peer reviewed

Wang, Wen-Chung; Jin, Kuan-Yu; Qiu, Xue-Lan; Wang, Lei – Journal of Educational Measurement, 2012

In some tests, examinees are required to choose a fixed number of items from a set of given items to answer. This practice creates a challenge to standard item response models, because more capable examinees may have an advantage by making wiser choices. In this study, we developed a new class of item response models to account for the choice…

Descriptors: Item Response Theory, Test Items, Selection, Models

The Random-Effect Generalized Rating Scale Model

Peer reviewed

Wang, Wen-Chung; Wu, Shiu-Lien – Journal of Educational Measurement, 2011

Rating scale items have been widely used in educational and psychological tests. These items require people to make subjective judgments, and these subjective judgments usually involve randomness. To account for this randomness, Wang, Wilson, and Shih proposed the random-effect rating scale model in which the threshold parameters are treated as…

Descriptors: Rating Scales, Models, Statistical Analysis, Computation

Estimation Methods for One-Parameter Testlet Models

Peer reviewed

Jiao, Hong; Wang, Shudong; He, Wei – Journal of Educational Measurement, 2013

This study demonstrated the equivalence between the Rasch testlet model and the three-level one-parameter testlet model and explored the Markov Chain Monte Carlo (MCMC) method for model parameter estimation in WinBUGS. The estimation accuracy from the MCMC method was compared with those from the marginalized maximum likelihood estimation (MMLE)…

Descriptors: Computation, Item Response Theory, Models, Monte Carlo Methods

Automated Test Assembly for Cognitive Diagnosis Models Using a Genetic Algorithm

Peer reviewed

Finkelman, Matthew; Kim, Wonsuk; Roussos, Louis A. – Journal of Educational Measurement, 2009

Much recent psychometric literature has focused on cognitive diagnosis models (CDMs), a promising class of instruments used to measure the strengths and weaknesses of examinees. This article introduces a genetic algorithm to perform automated test assembly alongside CDMs. The algorithm is flexible in that it can be applied whether the goal is to…

Descriptors: Identification, Genetics, Test Construction, Mathematics

Examining Teacher Grades Using Rasch Measurement Theory

Peer reviewed

Randall, Jennifer; Engelhard, George, Jr. – Journal of Educational Measurement, 2009

In this study, we present an approach to questionnaire design within educational research based on Guttman's mapping sentences and Many-Facet Rasch Measurement Theory. We designed a 54-item questionnaire using Guttman's mapping sentences to examine the grading practices of teachers. Each item in the questionnaire represented a unique student…

Descriptors: Student Evaluation, Educational Research, Grades (Scholastic), Public School Teachers

The National Assessment of Educational Progress Information Retrieval System (NAEPIRS), Version 1.15 (Computer Program). [Review].

Peer reviewed

Brzezinski, Evelyn J. – Journal of Educational Measurement, 1985

The National Assessment of Educational Progress Information Retrieval System is a single purpose database program. It is well constructed, runs without problems, and serves as a model for dissemination of research and evaluation study results. The program seems more useful as an index to documents than as an independent database. (Author/DWH)

Descriptors: Computer Software, Databases, Information Retrieval, Microcomputers

Generalizability in Item Response Modeling

Peer reviewed

Briggs, Derek C.; Wilson, Mark – Journal of Educational Measurement, 2007

An approach called generalizability in item response modeling (GIRM) is introduced in this article. The GIRM approach essentially incorporates the sampling model of generalizability theory (GT) into the scaling model of item response theory (IRT) by making distributional assumptions about the relevant measurement facets. By specifying a random…

Descriptors: Markov Processes, Generalizability Theory, Item Response Theory, Computation

MicroCAT Testing System Version 3.0.

Peer reviewed

Patience, Wayne – Journal of Educational Measurement, 1990

The four main subsystems of the MicroCAT Testing System for developing, administering, scoring, and analyzing computerized tests using conventional or item response theory methods are described. Judgments of three users of the system are included in the evaluation of this software. (SLD)

Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Software, Computer Software Reviews
