ERIC - Search Results

The template-based automated item-generation (TAIG) approach that involves template creation, item generation, item selection, field-testing, and evaluation has more steps than the traditional item development method. Consequently, there is more room for error in this process, and any template errors can cascade to the generated items.…

Descriptors: Error Correction, Automation, Test Items, Test Construction

The Multidimensionality of Measurement Bias in High-Stakes Testing: Using Machine Learning to Evaluate Complex Sources of Differential Item Functioning

Peer reviewed

Belzak, William C. M. – Educational Measurement: Issues and Practice, 2023

Test developers and psychometricians have historically examined measurement bias and differential item functioning (DIF) across a single categorical variable (e.g., gender), independently of other variables (e.g., race, age). This is problematic when more complex forms of measurement bias may adversely affect test responses and, ultimately,…

Descriptors: Test Bias, High Stakes Tests, Artificial Intelligence, Test Items

Supporting the Interpretive Validity of Student-Level Claims in Science Assessment with Tiered Claim Structures

Peer reviewed

Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022

We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…

Descriptors: Science Tests, Test Validity, Test Items, Test Construction

Digital Module 13: Monte Carlo Simulation Studies in Item Response Theory

Peer reviewed

Leventhal, Brian; Ames, Allison – Educational Measurement: Issues and Practice, 2020

In this digital ITEMS module, Dr. Brian Leventhal and Dr. Allison Ames provide an overview of "Monte Carlo simulation studies" (MCSS) in "item response theory" (IRT). MCSS are utilized for a variety of reasons, one of the most compelling being that they can be used when analytic solutions are impractical or nonexistent because…

Descriptors: Item Response Theory, Monte Carlo Methods, Simulation, Test Items

Evaluating Content-Related Validity Evidence Using a Text-Based Machine Learning Procedure

Peer reviewed

Anderson, Daniel; Rowley, Brock; Stegenga, Sondra; Irvin, P. Shawn; Rosenberg, Joshua M. – Educational Measurement: Issues and Practice, 2020

Validity evidence based on test content is critical to meaningful interpretation of test scores. Within high-stakes testing and accountability frameworks, content-related validity evidence is typically gathered via alignment studies, with panels of experts providing qualitative judgments on the degree to which test items align with the…

Descriptors: Content Validity, Artificial Intelligence, Test Items, Vocabulary

How Can Released State Test Items Support Interim Assessment Purposes in an Educational Crisis?

Peer reviewed

Klugman, Emma M.; Ho, Andrew D. – Educational Measurement: Issues and Practice, 2020

State testing programs regularly release previously administered test items to the public. We provide an open-source recipe for state, district, and school assessment coordinators to combine these items flexibly to produce scores linked to established state score scales. These would enable estimation of student score distributions and achievement…

Descriptors: Testing Programs, State Programs, Test Items, Scores

Digital Module 16: Longitudinal Data Analysis

Peer reviewed

Harring, Jeffrey R.; Johnson, Tessa L. – Educational Measurement: Issues and Practice, 2020

In this digital ITEMS module, Dr. Jeffrey Harring and Ms. Tessa Johnson introduce the linear mixed effects (LME) model as a flexible general framework for simultaneously modeling continuous repeated measures data with a scientifically defensible function that adequately summarizes both individual change as well as the average response. The module…

Descriptors: Educational Assessment, Data Analysis, Longitudinal Studies, Case Studies

Digital Module 08: Foundations of Operational Item Analysis https://ncme.elevate.commpartners.com

Peer reviewed

Yoo, Hanwook; Hambleton, Ronald K. – Educational Measurement: Issues and Practice, 2019

Item analysis is an integral part of operational test development and is typically conducted within two popular statistical frameworks: classical test theory (CTT) and item response theory (IRT). In this digital ITEMS module, Hanwook Yoo and Ronald K. Hambleton provide an accessible overview of operational item analysis approaches within these…

Descriptors: Item Analysis, Item Response Theory, Guidelines, Test Construction

On the Choice of Anchor Tests in Equating

Peer reviewed

Sinharay, Sandip – Educational Measurement: Issues and Practice, 2018

The choice of anchor tests is crucial in applications of the nonequivalent groups with anchor test design of equating. Sinharay and Holland (2006, 2007) suggested "miditests," which are anchor tests that are content-representative and have the same mean item difficulty as the total test but have a smaller spread of item difficulties.…

Descriptors: Test Content, Difficulty Level, Test Items, Test Construction

Digital Module 17: Data Visualizations: Effective Evidence-Based Practices https://ncme.elevate.commpartners.com

Peer reviewed

Gregg, Nikole; Leventhal, Brian C. – Educational Measurement: Issues and Practice, 2020

In this digital ITEMS module, Nikole Gregg and Dr. Brian Leventhal discuss strategies to ensure data visualizations achieve graphical excellence. Data visualizations are commonly used by measurement professionals to communicate results to examinees, the public, educators, and other stakeholders. To do so effectively, it is important that these…

Descriptors: Data Analysis, Evidence Based Practice, Visualization, Test Results

Understanding Examinees' Responses to Items: Implications for Measurement

Peer reviewed

Embretson, Susan E. – Educational Measurement: Issues and Practice, 2016

Examinees' thinking processes have become an increasingly important concern in testing. The response processes aspect is a major component of validity, and contemporary tests increasingly involve specifications about the cognitive complexity of examinees' response processes. Yet, empirical research findings on examinees' cognitive processes are…

Descriptors: Testing, Cognitive Processes, Test Construction, Test Items

Easier Said than Done: Rejoinder on Sijtsma and on Green and Yang

Peer reviewed

Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U. – Educational Measurement: Issues and Practice, 2016

The main points of Sijtsma and Green and Yang in Educational Measurement: Issues and Practice (34, 4) are that reliability, internal consistency, and unidimensionality are distinct and that Cronbach's alpha may be problematic. Neither of these assertions is at odds with Davenport, Davison, Liou, and Love in the same issue. However, many authors…

Descriptors: Educational Assessment, Reliability, Validity, Test Construction

An NCME Instructional Module on Polytomous Item Response Theory Models

Peer reviewed

Penfield, Randall David – Educational Measurement: Issues and Practice, 2014

A polytomous item is one for which the responses are scored according to three or more categories. Given the increasing use of polytomous items in assessment practices, item response theory (IRT) models specialized for polytomous items are becoming increasingly common. The purpose of this ITEMS module is to provide an accessible overview of…

Descriptors: Item Response Theory, Test Items, Models, Equations (Mathematics)

Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

Peer reviewed

Gierl, Mark J.; Lai, Hollis – Educational Measurement: Issues and Practice, 2013

Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

Descriptors: Educational Assessment, Test Items, Automation, Computer Assisted Testing

First Language of Test Takers and Fairness Assessment Procedures

Peer reviewed

Sinharay, Sandip; Dorans, Neil J.; Liang, Longjuan – Educational Measurement: Issues and Practice, 2011

Over the past few decades, those who take tests in the United States have exhibited increasing diversity with respect to native language. Standard psychometric procedures for ensuring item and test fairness that have existed for some time were developed when test-taking groups were predominantly native English speakers. A better understanding of…

Descriptors: Test Bias, Testing Programs, Psychometrics, Language Proficiency
