Search results from Educational Measurement: Issues and Practice (59 results: 59 journal articles, 59 research reports, 1 opinion paper, 1 speech/meeting paper; 1 result related to the Every Student Succeeds Act).
Showing 1 to 15 of 59 results
Peer reviewed
Stella Y. Kim; Sungyeun Kim – Educational Measurement: Issues and Practice, 2025
This study presents several multivariate generalizability theory designs for analyzing test forms built through automatic item generation (AIG). The study used real data to illustrate the analysis procedure and discuss practical considerations. We collected the data from two groups of students, each group receiving a different AIG-generated form. A…
Descriptors: Generalizability Theory, Automation, Test Items, Students
Peer reviewed
Ye Ma; Deborah J. Harris – Educational Measurement: Issues and Practice, 2025
Item position effect (IPE) refers to situations where an item performs differently when administered in different positions on a test. Most previous research has investigated IPE under linear testing; IPE under adaptive testing remains underexplored. In addition, the existence of IPE might violate Item…
Descriptors: Computer Assisted Testing, Adaptive Testing, Item Response Theory, Test Items
Peer reviewed
Zopluoglu, Cengiz; Kasli, Murat; Toton, Sarah L. – Educational Measurement: Issues and Practice, 2021
Response time information has recently attracted significant attention in the literature as it may provide meaningful information about item preknowledge. The methods that use response time information to identify examinees with potential item preknowledge make an implicit assumption that the examinees with item preknowledge differ in their…
Descriptors: Reaction Time, Cheating, Test Items
Peer reviewed
Xiangyi Liao; Daniel M Bolt – Educational Measurement: Issues and Practice, 2024
Traditional approaches to the modeling of multiple-choice item response data (e.g., 3PL, 4PL models) emphasize slips and guesses as random events. In this paper, an item response model is presented that characterizes both disjunctively interacting guessing and conjunctively interacting slipping processes as proficiency-related phenomena. We show…
Descriptors: Item Response Theory, Test Items, Error Correction, Guessing (Tests)
Peer reviewed
Fu, Yanyan; Choe, Edison M.; Lim, Hwanggyu; Choi, Jaehwa – Educational Measurement: Issues and Practice, 2022
This case study applied the "weak theory" of Automatic Item Generation (AIG) to generate isomorphic item instances (i.e., unique but psychometrically equivalent items) for a large-scale assessment. Three representative instances were selected from each item template (i.e., model) and pilot-tested. In addition, a new analytical framework,…
Descriptors: Test Items, Measurement, Psychometrics, Test Construction
Peer reviewed
Berenbon, Rebecca F.; McHugh, Bridget C. – Educational Measurement: Issues and Practice, 2023
To assemble a high-quality test, psychometricians rely on subject matter experts (SMEs) to write high-quality items. However, SMEs are not typically given the opportunity to provide input on which content standards are most suitable for multiple-choice questions (MCQs). In the present study, we explored the relationship between perceived MCQ…
Descriptors: Test Items, Multiple Choice Tests, Standards, Difficulty Level
Peer reviewed
Pan, Yiqin; Livne, Oren; Wollack, James A.; Sinharay, Sandip – Educational Measurement: Issues and Practice, 2023
In computerized adaptive testing, overexposure of items in the bank is a serious problem and might result in item compromise. We develop an item selection algorithm that utilizes the entire bank well and reduces the overexposure of items. The algorithm is based on collaborative filtering and selects an item in two stages. In the first stage, a set…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Algorithms
Peer reviewed
Almehrizi, Rashid S. – Educational Measurement: Issues and Practice, 2022
Coefficient alpha reliability persists as the most common reliability coefficient reported in research. The assumptions for its use are, however, not well-understood. The current paper challenges the commonly used expressions of coefficient alpha and argues that while these expressions are correct when estimating reliability for summed scores,…
Descriptors: Reliability, Scores, Scaling, Statistical Analysis
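As background to the entry above, the commonly used expression for coefficient alpha that the paper refers to can be sketched as follows. This is a minimal illustration of the standard summed-score formula, not the paper's proposed alternative; the function name `coefficient_alpha` is my own.

```python
import numpy as np

def coefficient_alpha(scores):
    """Standard coefficient alpha for an (examinees x items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of summed scores)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # per-item sample variances
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of examinees' summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```

When every item is a perfect copy of every other (all covariance, no unique variance), this expression returns exactly 1, which is a quick sanity check on an implementation.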
Peer reviewed
Zhang, Susu; Li, Anqi; Wang, Shiyu – Educational Measurement: Issues and Practice, 2023
In computer-based tests allowing revision and reviews, examinees' sequence of visits and answer changes to questions can be recorded. The variable-length revision log data introduce new complexities to the collected data but, at the same time, provide additional information on examinees' test-taking behavior, which can inform test development and…
Descriptors: Computer Assisted Testing, Test Construction, Test Wiseness, Test Items
Peer reviewed
Hwanggyu Lim; Kyung T. Han – Educational Measurement: Issues and Practice, 2024
Computerized adaptive testing (CAT) has gained deserved popularity in the administration of educational and professional assessments, but continues to face test security challenges. To ensure sustained quality assurance and testing integrity, it is imperative to establish and maintain multiple stable item pools that are consistent in terms of…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Item Banks
Peer reviewed
Pan, Yiqin; Wollack, James A. – Educational Measurement: Issues and Practice, 2023
Pan and Wollack (PW) proposed a machine learning method to detect compromised items. We extend the work of PW to an approach detecting compromised items and examinees with item preknowledge simultaneously and draw on ideas in ensemble learning to relax several limitations in the work of PW. The suggested approach also provides a confidence score,…
Descriptors: Artificial Intelligence, Prior Learning, Item Analysis, Test Content
Peer reviewed
Li, Dongmei; Kapoor, Shalini – Educational Measurement: Issues and Practice, 2022
Population invariance is a desirable property of test equating which might not hold when significant changes occur in the test population, such as those brought about by the COVID-19 pandemic. This research aims to investigate whether equating functions are reasonably invariant when the test population is impacted by the pandemic. Based on…
Descriptors: Test Items, Equated Scores, COVID-19, Pandemics
Peer reviewed
Kim, Sooyeon; Walker, Michael E. – Educational Measurement: Issues and Practice, 2022
Test equating requires collecting data to link the scores from different forms of a test. Problems arise when equating samples are not equivalent and the test forms to be linked share no common items by which to measure or adjust for the group nonequivalence. Using data from five operational test forms, we created five pairs of research forms for…
Descriptors: Ability, Tests, Equated Scores, Testing Problems
Peer reviewed
Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022
Setting cut scores on multistage tests (MSTs) is difficult, particularly when the test spans several grade levels and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…
Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis
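For context on the entry above: in a conventional (non-MST) Angoff procedure, each panelist judges the probability that a minimally competent examinee answers each item correctly, and the raw-score cut is the mean over panelists of each panelist's summed ratings. The sketch below shows only that baseline computation, not the three MST scale-mapping methods the paper evaluates; the function name `angoff_cut_score` is my own.

```python
def angoff_cut_score(ratings):
    """Baseline Angoff raw-score cut.

    ratings[p][i]: panelist p's judged probability that a minimally
    competent examinee answers item i correctly (values in [0, 1]).
    Returns the mean over panelists of each panelist's summed ratings.
    """
    per_panelist_sums = [sum(panelist) for panelist in ratings]
    return sum(per_panelist_sums) / len(per_panelist_sums)
```

Mapping such a raw cut onto the scale underlying an adaptive MST is exactly where the methods described in the paper come in, since different panels contain different items.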
Peer reviewed
Cuhadar, Ismail; Binici, Salih – Educational Measurement: Issues and Practice, 2022
This study employs the 4-parameter logistic item response theory model to account for the unexpected incorrect responses or slipping effects observed in a large-scale Algebra 1 End-of-Course assessment, including several innovative item formats. It investigates whether modeling the misfit at the upper asymptote has any practical impact on the…
Descriptors: Item Response Theory, Measurement, Student Evaluation, Algebra
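As background to the entry above, the 4-parameter logistic (4PL) model adds an upper asymptote d < 1 to the 3PL model, so that even high-ability examinees have a nonzero chance of slipping. A minimal sketch of the standard 4PL item response function (the function name `p_4pl` is my own):

```python
import math

def p_4pl(theta, a, b, c, d):
    """Standard 4PL item response function.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower (guessing) asymptote; d: upper (slipping) asymptote.
    Probability rises from c toward d as theta increases.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
```

At theta = b the probability is exactly midway between c and d, and no ability level can push it above d, which is how the model absorbs unexpected incorrect responses from strong examinees.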