ERIC - Search Results

Showing all 6 results

A Workflow for Minimizing Errors in Template-Based Automated Item-Generation Development

Peer reviewed

Yanyan Fu – Educational Measurement: Issues and Practice, 2024

The template-based automated item-generation (TAIG) approach that involves template creation, item generation, item selection, field-testing, and evaluation has more steps than the traditional item development method. Consequentially, there is more margin for error in this process, and any template errors can be cascaded to the generated items.…

Descriptors: Error Correction, Automation, Test Items, Test Construction

Revisiting the Usage of Alpha in Scale Evaluation: Effects of Scale Length and Sample Size

Peer reviewed

Leifeng Xiao; Kit-Tai Hau; Melissa Dan Wang – Educational Measurement: Issues and Practice, 2024

Short scales are time-efficient for participants and cost-effective in research. However, researchers often mistakenly expect short scales to have the same reliability as long ones without considering the effect of scale length. We argue that applying a universal benchmark for alpha is problematic as the impact of low-quality items is greater on…

Descriptors: Measurement, Benchmarking, Item Sampling, Sample Size

Instruction-Tuned Large-Language Models for Quality Control in Automatic Item Generation: A Feasibility Study

Peer reviewed

Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025

Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet, the evaluation of item quality persists to be a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large-language models, specifically Llama 3-8B, for…

Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation

Item Response Theory Models for Polytomous Multidimensional Forced-Choice Items to Measure Construct Differentiation

Peer reviewed

Xuelan Qiu; Jimmy de la Torre; You-Gan Wang; Jinran Wu – Educational Measurement: Issues and Practice, 2024

Multidimensional forced-choice (MFC) items have been found to be useful to reduce response biases in personality assessments. However, conventional scoring methods for the MFC items result in ipsative data, hindering the wider applications of the MFC format. In the last decade, a number of item response theory (IRT) models have been developed,…

Descriptors: Item Response Theory, Personality Traits, Personality Measures, Personality Assessment

Reframing Research and Assessment Practices: Advancing an Antiracist and Anti-Ableist Research Agenda

Peer reviewed

Angela Johnson; Elizabeth Barker; Marcos Viveros Cespedes – Educational Measurement: Issues and Practice, 2024

Educators and researchers strive to build policies and practices on data and evidence, especially on academic achievement scores. When assessment scores are inaccurate for specific student populations or when scores are inappropriately used, even data-driven decisions will be misinformed. To maximize the impact of the research-practice-policy…

Descriptors: Equal Education, Inclusion, Evaluation Methods, Error of Measurement

Evolving Educational Testing to Meet Students' Needs: Design-in-Real-Time Assessment

Peer reviewed

Stephen G. Sireci; Javier Suárez-Álvarez; April L. Zenisky; Maria Elena Oliveri – Educational Measurement: Issues and Practice, 2024

The goal in personalized assessment is to best fit the needs of each individual test taker, given the assessment purposes. Design-in-Real-Time (DIRTy) assessment reflects the progressive evolution in testing from a single test, to an adaptive test, to an adaptive assessment "system." In this article, we lay the foundation for DIRTy…

Descriptors: Educational Assessment, Student Needs, Test Format, Test Construction
