Showing 1 to 15 of 92 results
Peer reviewed
Andreea Dutulescu; Stefan Ruseti; Mihai Dascalu; Danielle S. McNamara – Grantee Submission, 2024
Assessing the difficulty of reading comprehension questions is crucial to educational methodologies and language understanding technologies. Traditional methods of assessing question difficulty frequently rely on human judgments or shallow metrics, which often fail to capture the intricate cognitive demands of answering a question. This…
Descriptors: Difficulty Level, Reading Tests, Test Items, Reading Comprehension
Olney, Andrew M. – Grantee Submission, 2022
Multi-angle question answering models that promise to perform related tasks, such as question generation, have recently been proposed. However, performance on these related tasks has not been thoroughly studied. We investigate a leading model called Macaw on the task of multiple choice question generation and evaluate its performance on three angles that…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Models
Peer reviewed
Weicong Lyu; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Data harmonization is an emerging approach to strategically combining data from multiple independent studies, enabling researchers to address new research questions that no single contributing study can answer. A fundamental psychometric challenge for data harmonization is to create commensurate measures for the constructs of interest across…
Descriptors: Data Analysis, Test Items, Psychometrics, Item Response Theory
Scott P. Ardoin; Katherine S. Binder; Paulina A. Kulesz; Eloise Nimocks; Joshua A. Mellott – Grantee Submission, 2024
Understanding test-taking strategies (TTSs) and the variables that influence TTSs is crucial to understanding what reading comprehension tests measure. We examined how passage and student characteristics were associated with TTSs and their impact on response accuracy. Third (n = 78), fifth (n = 86), and eighth (n = 86) graders read and answered…
Descriptors: Test Wiseness, Eye Movements, Reading Comprehension, Reading Tests
Egamaria Alacam; Craig K. Enders; Han Du; Brian T. Keller – Grantee Submission, 2023
Composite scores are an exceptionally important psychometric tool for behavioral science research applications. A prototypical example occurs with self-report data, where researchers routinely use questionnaires with multiple items that tap into different features of a target construct. Item-level missing data are endemic to composite score…
Descriptors: Regression (Statistics), Scores, Psychometrics, Test Items
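The problem this abstract describes can be made concrete with a small sketch. Below is a minimal Python illustration of the common proration shortcut, where a composite is scored from whichever items a respondent answered; the questionnaire, data, and item count are invented for illustration, and this is background to the problem rather than the strategy the paper itself evaluates.

```python
import numpy as np

# Hypothetical 5-item questionnaire for four respondents; np.nan marks
# item-level missing data (invented values, for illustration only).
items = np.array([
    [4, 5, 3, 4, 5],
    [2, np.nan, 3, 2, 1],
    [np.nan, np.nan, 4, 5, 4],
    [3, 3, 3, 3, 3],
], dtype=float)

# Prorated composite: the mean of the observed items, rescaled to the
# full test length. This shortcut is only trustworthy under restrictive
# assumptions about why items are missing.
n_items = items.shape[1]
prorated = np.nanmean(items, axis=1) * n_items
print(prorated)  # one composite score per respondent
```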
Sinharay, Sandip – Grantee Submission, 2021
Drasgow, Levine, and Zickar (1996) suggested a statistic based on the Neyman-Pearson lemma (e.g., Lehmann & Romano, 2005, p. 60) for detecting preknowledge on a known set of items. The statistic is a special case of the optimal appropriateness indices of Levine and Drasgow (1988) and is the most powerful statistic for detecting item…
Descriptors: Robustness (Statistics), Hypothesis Testing, Statistics, Test Items
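For context, the Neyman-Pearson lemma implies that the most powerful test of preknowledge on a known item set is a likelihood ratio; a schematic form, with notation assumed here rather than taken from the paper, is

\[
\Lambda(\mathbf{x}) \;=\; \frac{P(\mathbf{x} \mid \text{preknowledge of the known item set})}{P(\mathbf{x} \mid \text{no preknowledge})},
\]

where \(\mathbf{x}\) is the examinee's response vector on the suspect items and an examinee is flagged when \(\Lambda(\mathbf{x})\) exceeds a threshold set by the desired Type I error rate.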
Peer reviewed
Andrew M. Olney – Grantee Submission, 2023
Multiple choice questions are traditionally expensive to produce. Recent advances in large language models (LLMs) have led to fine-tuned LLMs that generate questions competitive with human-authored questions. However, the relative capabilities of ChatGPT-family models have not yet been established for this task. We present a carefully controlled…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Algorithms
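As a rough illustration of the task being benchmarked, here is a minimal zero-shot sketch of multiple choice question generation with a chat-style LLM in Python; the prompt wording, passage, and model name are assumptions for illustration, not the controlled protocol or models compared in the paper.

```python
# Minimal zero-shot multiple choice question generation sketch.
# Prompt wording, passage, and model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

passage = "Photosynthesis converts light energy into chemical energy ..."
prompt = (
    "Write one multiple choice question about the passage below. "
    "Provide four options labeled A-D and indicate the correct answer.\n\n"
    f"Passage: {passage}"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any ChatGPT-family model fits the sketch
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```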
Peer reviewed
Sun-Joo Cho; Amanda Goodwin; Matthew Naveiras; Paul De Boeck – Grantee Submission, 2024
Explanatory item response models (EIRMs) have been applied to investigate the effects of person covariates, item covariates, and their interactions in the fields of reading education and psycholinguistics. In practice, it is often assumed that the relationships between the covariates and the logit transformation of item response probability are…
Descriptors: Item Response Theory, Test Items, Models, Maximum Likelihood Statistics
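For readers new to EIRMs, the linearity assumption the abstract refers to can be written schematically (notation assumed here, not taken from the paper) as

\[
\operatorname{logit} P(Y_{pi} = 1 \mid \theta_p) \;=\; \theta_p \;+\; \sum_{k} \beta_k X_{ik} \;+\; \sum_{m} \gamma_m Z_{pm},
\]

where \(\theta_p\) is person ability, \(X_{ik}\) are item covariates, and \(Z_{pm}\) are person covariates; the covariates enter linearly on the logit scale, which is the often-made assumption the abstract mentions.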
Davison, Mark L.; Davenport, Ernest C., Jr.; Jia, Hao; Seipel, Ben; Carlson, Sarah E. – Grantee Submission, 2022
A regression model of predictor trade-offs is described. Each regression parameter equals the expected change in Y obtained by trading 1 point from one predictor to a second predictor. The model applies to predictor variables that sum to a constant T for all observations; for example, proportions summing to T=1.0 or percentages summing to T=100…
Descriptors: Regression (Statistics), Prediction, Predictor Variables, Models
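A schematic version of the setup may help (notation assumed here, not taken from the paper). With predictors constrained to a constant sum, one predictor is dropped and the rest are regressed on:

\[
\sum_{j=1}^{p} X_j = T, \qquad \hat{Y} = b_0 + \sum_{j=1}^{p-1} b_j X_j .
\]

Because the predictors must still sum to \(T\), increasing \(X_j\) by one point while holding the other included predictors fixed forces the omitted predictor \(X_p\) down by one point, so \(b_j\) is read as the expected change in \(Y\) from trading one point from \(X_p\) to \(X_j\), e.g., moving one percentage point between two components that sum to 100.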
Peter Organisciak; Selcuk Acar; Denis Dumas; Kelly Berthiaume – Grantee Submission, 2023
Automated scoring for divergent thinking (DT) seeks to overcome a key obstacle to creativity measurement: the effort, cost, and reliability of scoring open-ended tests. For a common test of DT, the Alternate Uses Task (AUT), the primary automated approach casts the problem as a semantic distance between a prompt and the resulting idea in a text…
Descriptors: Automation, Computer Assisted Testing, Scoring, Creative Thinking
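The semantic-distance formulation mentioned here is easy to sketch: originality is proxied by how far a response's embedding sits from the prompt's embedding. The Python sketch below uses an off-the-shelf sentence encoder; the model choice and the sample responses are assumptions for illustration, not the paper's fine-tuned system.

```python
# Semantic-distance scoring sketch for the Alternate Uses Task (AUT).
# Model choice and responses are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

prompt = "brick"
responses = ["build a wall", "use as a paperweight", "grind into pigment for paint"]

vecs = model.encode([prompt] + responses)
prompt_vec, response_vecs = vecs[0], vecs[1:]

for text, vec in zip(responses, response_vecs):
    cos = np.dot(prompt_vec, vec) / (np.linalg.norm(prompt_vec) * np.linalg.norm(vec))
    # Larger distance from the prompt = more semantically original response.
    print(f"{text!r}: semantic distance = {1 - cos:.3f}")
```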
Peer reviewed
Xue, Kang; Huggins-Manley, Anne Corinne; Leite, Walter – Grantee Submission, 2020
In data collected from virtual learning environments (VLEs), item response theory (IRT) models can be used to guide the ongoing measurement of student ability. However, such applications of IRT rely on unbiased item parameter estimates associated with test items in the VLE. Without formal piloting of the items, one can expect a large amount of…
Descriptors: Virtual Classrooms, Item Response Theory, Test Bias, Test Items
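For context, a common IRT model in such applications is the two-parameter logistic (the specific model used in the paper may differ):

\[
P(Y_{pi} = 1 \mid \theta_p) \;=\; \frac{1}{1 + \exp\{-a_i(\theta_p - b_i)\}},
\]

where \(\theta_p\) is student ability and \(a_i\) (discrimination) and \(b_i\) (difficulty) are the item parameters whose estimates, per the abstract, may be biased when VLE items are not formally piloted.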
Martha L. Epstein; Hamza Malik; Kun Wang; Chandra Hawley Orrill – Grantee Submission, 2022
Response Process Validity (RPV) reflects the degree to which items are interpreted as intended by item developers. In this study, teacher responses to constructed response (CR) items designed to assess the pedagogical content knowledge (PCK) of middle school mathematics teachers were evaluated to determine which types of teacher responses signaled weak RPV. We…
Descriptors: Teacher Response, Test Items, Pedagogical Content Knowledge, Mathematics Teachers
Peer reviewed
Herrmann-Abell, Cari F.; Hardcastle, Joseph; DeBoer, George E. – Grantee Submission, 2022
As implementation of the "Next Generation Science Standards" moves forward, there is a need for new assessments that can measure students' integrated three-dimensional science learning. The National Research Council has suggested that these assessments be multicomponent tasks that utilize a combination of item formats including…
Descriptors: Multiple Choice Tests, Conditioning, Test Items, Item Response Theory
Martha L. Epstein; Hamza Malik; Kun Wang; Chandra H. Orrill – Grantee Submission, 2023
It is essential for items in assessments of mathematics teacher knowledge to evoke the desired response processes -- that is, to be interpreted and responded to by teachers as intended by item developers. In this study, we sought to unpack evidence that middle school mathematics teachers were not consistently interacting as intended with constructed…
Descriptors: Pedagogical Content Knowledge, Mathematics Teachers, Mathematics Instruction, Protocol Analysis
Cari F. Herrmann-Abell; George E. DeBoer – Grantee Submission, 2023
This study describes the role that Rasch measurement played in the development of assessments aligned to the "Next Generation Science Standards," tasks that require students to use the three dimensions of science practices, disciplinary core ideas, and crosscutting concepts to make sense of energy-related phenomena. A set of 27…
Descriptors: Item Response Theory, Computer Simulation, Science Tests, Energy
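For reference, the dichotomous Rasch model underlying such analyses takes the standard form

\[
P(X_{pi} = 1 \mid \theta_p, b_i) \;=\; \frac{\exp(\theta_p - b_i)}{1 + \exp(\theta_p - b_i)},
\]

with a single difficulty parameter \(b_i\) per item and an ability \(\theta_p\) per student; notation here is standard rather than taken from the paper.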