Yanyan Fu – Educational Measurement: Issues and Practice, 2024
The template-based automated item-generation (TAIG) approach, which involves template creation, item generation, item selection, field-testing, and evaluation, has more steps than the traditional item development method. Consequently, there is more margin for error in this process, and any template errors can cascade to the generated items.…
Descriptors: Error Correction, Automation, Test Items, Test Construction
Leifeng Xiao; Kit-Tai Hau; Melissa Dan Wang – Educational Measurement: Issues and Practice, 2024
Short scales are time-efficient for participants and cost-effective in research. However, researchers often mistakenly expect short scales to have the same reliability as long ones without considering the effect of scale length. We argue that applying a universal benchmark for alpha is problematic as the impact of low-quality items is greater on…
Descriptors: Measurement, Benchmarking, Item Sampling, Sample Size
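The scale-length effect that Xiao, Hau, and Wang describe is commonly formalized by the Spearman-Brown prophecy formula, which predicts how reliability changes when a scale is lengthened or shortened. A minimal sketch (the formula is standard psychometric background, not drawn from the article itself, and it assumes the retained items are parallel to the originals — precisely the assumption the article questions for low-quality items):

```python
def spearman_brown(reliability: float, length_ratio: float) -> float:
    """Predict the reliability of a scale whose length is multiplied by
    length_ratio, via the Spearman-Brown prophecy formula:
        rho_new = k * rho / (1 + (k - 1) * rho),  k = length_ratio.
    Assumes the added/removed items are parallel to the existing ones.
    """
    k = length_ratio
    return (k * reliability) / (1 + (k - 1) * reliability)

# Shortening a 20-item scale with alpha = 0.90 to 5 items (k = 5/20 = 0.25):
short_alpha = spearman_brown(0.90, 5 / 20)
print(round(short_alpha, 3))  # 0.692
```

The drop from 0.90 to roughly 0.69 illustrates why expecting a short form to match the long form's reliability benchmark is unrealistic even under ideal (parallel-item) assumptions.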
Fu, Yanyan; Choe, Edison M.; Lim, Hwanggyu; Choi, Jaehwa – Educational Measurement: Issues and Practice, 2022
This case study applied the "weak theory" of Automatic Item Generation (AIG) to generate isomorphic item instances (i.e., unique but psychometrically equivalent items) for a large-scale assessment. Three representative instances were selected from each item template (i.e., model) and pilot-tested. In addition, a new analytical framework,…
Descriptors: Test Items, Measurement, Psychometrics, Test Construction
Zhang, Susu; Li, Anqi; Wang, Shiyu – Educational Measurement: Issues and Practice, 2023
In computer-based tests allowing revision and reviews, examinees' sequence of visits and answer changes to questions can be recorded. The variable-length revision log data introduce new complexities to the collected data but, at the same time, provide additional information on examinees' test-taking behavior, which can inform test development and…
Descriptors: Computer Assisted Testing, Test Construction, Test Wiseness, Test Items
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet evaluating item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022
We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…
Descriptors: Science Tests, Test Validity, Test Items, Test Construction
Wang, Wenhao; Kingston, Neal M.; Davis, Marcia H.; Tiemann, Gail C.; Tonks, Stephen; Hock, Michael – Educational Measurement: Issues and Practice, 2021
Through the use of item response theory, adaptive tests are more efficient than fixed-length tests and present students with questions tailored to their proficiency level. Although the adaptive algorithm is straightforward, developing a multidimensional computer adaptive test (MCAT) measure is complex. Evidence-centered design…
Descriptors: Evidence Based Practice, Reading Motivation, Adaptive Testing, Computer Assisted Testing
Xuelan Qiu; Jimmy de la Torre; You-Gan Wang; Jinran Wu – Educational Measurement: Issues and Practice, 2024
Multidimensional forced-choice (MFC) items have been found useful for reducing response biases in personality assessments. However, conventional scoring methods for MFC items result in ipsative data, hindering wider application of the MFC format. In the last decade, a number of item response theory (IRT) models have been developed,…
Descriptors: Item Response Theory, Personality Traits, Personality Measures, Personality Assessment
Langenfeld, Thomas – Educational Measurement: Issues and Practice, 2020
The COVID-19 pandemic has accelerated the shift toward online learning, necessitating the development of online assessment solutions. Vendors offer online assessment delivery systems with varying security levels designed to minimize unauthorized behaviors. Combating cheating and securing assessment content, however, is not solely the…
Descriptors: Computer Assisted Testing, Justice, COVID-19, Pandemics
Luecht, Richard M. – Educational Measurement: Issues and Practice, 2020
The educational testing landscape is changing in many significant ways as evidence-based, principled assessment design (PAD) approaches are formally adopted. This article discusses the challenges and presents some score scale- and task-focused strategies for developing useful performance-level descriptors (PLDs) under a PAD approach. Details of…
Descriptors: Test Construction, Academic Standards, Science Education, Educational Testing
Angela Johnson; Elizabeth Barker; Marcos Viveros Cespedes – Educational Measurement: Issues and Practice, 2024
Educators and researchers strive to build policies and practices on data and evidence, especially on academic achievement scores. When assessment scores are inaccurate for specific student populations or when scores are inappropriately used, even data-driven decisions will be misinformed. To maximize the impact of the research-practice-policy…
Descriptors: Equal Education, Inclusion, Evaluation Methods, Error of Measurement
Fidler, James R.; Risk, Nicole M. – Educational Measurement: Issues and Practice, 2019
Credentialing examination developers rely on task (job) analyses for establishing inventories of task and knowledge areas in which competency is required for safe and successful practice in target occupations. There are many ways in which task-related information may be gathered from practitioner ratings, each with its own advantage and…
Descriptors: Job Analysis, Scaling, Licensing Examinations (Professions), Test Construction
Leventhal, Brian C.; Grabovsky, Irina – Educational Measurement: Issues and Practice, 2020
Standard setting is arguably one of the most subjective techniques in test development and psychometrics. The decisions made when scores are compared to standards, however, are arguably the most consequential outcomes of testing. Providing licensure to practice in a profession has high-stakes consequences for the public. Denying graduation or forcing…
Descriptors: Standard Setting (Scoring), Weighted Scores, Test Construction, Psychometrics
Wilkerson, Judy R. – Educational Measurement: Issues and Practice, 2020
Validity and reliability are a major focus in teacher education accreditation by the Council for the Accreditation of Educator Preparation (CAEP). CAEP requires the use of "accepted research standards," but many faculty and administrators are unsure how to meet this requirement. The Standards for Educational and Psychological Testing…
Descriptors: Test Construction, Test Validity, Test Reliability, Teacher Education Programs
Jones, Andrew T.; Kopp, Jason P.; Ong, Thai Q. – Educational Measurement: Issues and Practice, 2020
Studies investigating invariance have often been limited to measurement or prediction invariance. Selection invariance, wherein the use of test scores for classification results in equivalent classification accuracy between groups, has received comparatively little attention in the psychometric literature. Previous research suggests that some form…
Descriptors: Test Construction, Test Bias, Classification, Accuracy