Showing 706 to 720 of 9,552 results
Peer reviewed
Russell, Michael; Kaplan, Larry – Practical Assessment, Research & Evaluation, 2021
Differential Item Functioning (DIF) is commonly employed to examine measurement bias of test scores. Current approaches to DIF compare item functioning separately for select demographic identities such as gender, racial stratification, and economic status. Examining potential item bias fails to recognize and capture the intersecting configurations…
Descriptors: Test Bias, Test Items, Demography, Identification
Peer reviewed
Wind, Stefanie A.; Guo, Wenjing – Educational Assessment, 2021
Scoring procedures for the constructed-response (CR) items in large-scale mixed-format educational assessments often involve checks for rater agreement or rater reliability. Although these analyses are important, researchers have documented rater effects that persist despite rater training and that are not always detected in rater agreement and…
Descriptors: Scoring, Responses, Test Items, Test Format
Peer reviewed
Man, Kaiwen; Harring, Jeffrey R. – Educational and Psychological Measurement, 2021
Many approaches have been proposed to jointly analyze item responses and response times to understand behavioral differences between normally and aberrantly behaved test-takers. Biometric information, such as data from eye trackers, can be used to better identify these deviant testing behaviors in addition to more conventional data types. Given…
Descriptors: Cheating, Item Response Theory, Reaction Time, Eye Movements
Peer reviewed
Su, Shiyang; Wang, Chun; Weiss, David J. – Educational and Psychological Measurement, 2021
S-X² is a popular item fit index that is available in commercial software packages such as flexMIRT. However, no research has systematically examined the performance of S-X² for detecting item misfit within the context of the multidimensional graded response model (MGRM). The primary goal of this study was…
Descriptors: Statistics, Goodness of Fit, Test Items, Models
Peer reviewed
Cook, Ryan M.; Fye, Heather J.; Wind, Stefanie A. – Measurement and Evaluation in Counseling and Development, 2021
We examined the psychometric properties of the Counselor Burnout Inventory (CBI) with 560 early career, post-master's counselors. We tested the dimensional structure of the CBI, item ordering, and the function of the rating scale using item response theory. Implications of the findings for researchers, counselors, and counselor educators are…
Descriptors: Counselors, Burnout, Item Response Theory, Entry Workers
Peer reviewed
Liu, Chunyan; Jurich, Daniel; Morrison, Carol; Grabovsky, Irina – Applied Measurement in Education, 2021
The existence of outliers in the anchor items can be detrimental to the estimation of examinee ability and undermine the validity of score interpretation across forms. However, in practice, anchor item performance can become distorted due to various reasons. This study compares the performance of modified "INFIT" and "OUTFIT"…
Descriptors: Equated Scores, Test Items, Item Response Theory, Difficulty Level
Peer reviewed
Yi-Hsuan Lee; Yue Jia – Applied Measurement in Education, 2024
Test-taking experience is a consequence of the interaction between students and assessment properties. We define a new notion, rapid-pacing behavior, to reflect two types of test-taking experience -- disengagement and speededness. To identify rapid-pacing behavior, we extend existing methods to develop response-time thresholds for individual items…
Descriptors: Adaptive Testing, Reaction Time, Item Response Theory, Test Format
Peer reviewed
Lawrence T. DeCarlo – Educational and Psychological Measurement, 2024
A psychological framework for different types of items commonly used with mixed-format exams is proposed. A choice model based on signal detection theory (SDT) is used for multiple-choice (MC) items, whereas an item response theory (IRT) model is used for open-ended (OE) items. The SDT and IRT models are shown to share a common conceptualization…
Descriptors: Test Format, Multiple Choice Tests, Item Response Theory, Models
Peer reviewed
Ato Kwamina Arhin – Acta Educationis Generalis, 2024
Introduction: This article aimed at digging deep into distractors used for mathematics multiple-choice items. The quality of distractors may be more important than their number and the stem in a multiple-choice question. Little attention is given to this aspect of item writing, especially for mathematics multiple-choice questions. This article…
Descriptors: Testing, Multiple Choice Tests, Test Items, Mathematics Tests
Peer reviewed
Tatiana Chaiban; Zeinab Nahle; Ghaith Assi; Michelle Cherfane – Discover Education, 2024
Background: Since it was first launched, ChatGPT, a Large Language Model (LLM), has been widely used across different disciplines, particularly the medical field. Objective: The main aim of this review is to thoroughly assess the performance of the distinct version of ChatGPT in subspecialty written medical proficiency exams and the factors that…
Descriptors: Medical Education, Accuracy, Artificial Intelligence, Computer Software
Peer reviewed
Kevser Arslan; Asli Görgülü Ari – Shanlax International Journal of Education, 2024
This study aimed to develop a valid and reliable multiple-choice achievement test for the subject area of ecology. The study was conducted within the framework of exploratory sequential design based on mixed research methods, and the study group consisted of a total of 250 middle school students studying at the sixth and seventh grade level. In…
Descriptors: Ecology, Science Tests, Test Construction, Multiple Choice Tests
Reima Al-Jarf – Online Submission, 2024
Expressions of impossibility refer to events that can never or rarely happen, tasks that are difficult or impossible to perform, people or things that are of no use and things that are impossible to find. This study explores the similarities and differences between English and Arabic expressions of impossibility, and the difficulties that…
Descriptors: English (Second Language), Second Language Learning, Arabic, Translation
Peer reviewed
Kevin Ackermans; Marjoke Bakker; Pierre Gorissen; Anne-Marieke Loon; Marijke Kral; Gino Camp – Journal of Computer Assisted Learning, 2024
Background: A practical test that measures the information and communication technology (ICT) skills students need for effectively using ICT in primary education has yet to be developed (Oh et al., 2021). This paper reports on the development, validation, and reliability of a test measuring primary school students' ICT skills required for…
Descriptors: Test Construction, Test Validity, Measures (Individuals), Elementary School Students
Peer reviewed
Kyeng Gea Lee; Mark J. Lee; Soo Jung Lee – International Journal of Technology in Education and Science, 2024
Online assessment is an essential part of online education, and if conducted properly, has been found to effectively gauge student learning. Generally, text-based questions have been the cornerstone of online assessment. Recently, however, the emergence of generative artificial intelligence has added a significant challenge to the integrity of…
Descriptors: Artificial Intelligence, Computer Software, Biology, Science Instruction
Peer reviewed
Tim Stoeckel; Liang Ye Tan; Hung Tan Ha; Nam Thi Phuong Ho; Tomoko Ishii; Young Ae Kim; Chunmei Huang; Stuart McLean – Vocabulary Learning and Instruction, 2024
Local item dependency (LID) occurs when test-takers' responses to one test item are affected by their responses to another. It can be problematic if it causes inflated reliability estimates or distorted person and item measures. The cued-recall reading comprehension test in Hu and Nation's (2000) well-known and influential coverage--comprehension…
Descriptors: Reading Comprehension, English (Second Language), Second Language Instruction, Second Language Learning