Publication Date
| Period | Results |
| --- | --- |
| In 2025 | 0 |
| Since 2024 | 3 |
| Since 2021 (last 5 years) | 10 |
| Since 2016 (last 10 years) | 33 |
| Since 2006 (last 20 years) | 62 |
Descriptor
| Descriptor | Results |
| --- | --- |
| Scoring | 122 |
| Simulation | 122 |
| Test Items | 37 |
| Item Response Theory | 36 |
| Computer Assisted Testing | 29 |
| Scores | 26 |
| Test Construction | 26 |
| Comparative Analysis | 25 |
| Adaptive Testing | 23 |
| Models | 23 |
| Evaluation Methods | 19 |
Audience
| Audience | Results |
| --- | --- |
| Practitioners | 2 |
| Administrators | 1 |
| Researchers | 1 |
| Teachers | 1 |
Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024
Item response theory (IRT) models the relationship between the possible scores on a test item and a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…
Descriptors: Item Response Theory, Test Items, Models, Scoring
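The abstract truncates before naming the parametric model that the OS model is compared against, so purely as an illustration of parametric polytomous IRT, here is a minimal sketch of category probabilities under Samejima's graded response model; all parameter values are made up:

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    # Cumulative probabilities P(X >= k), k = 1..K-1, under a 2PL-type
    # link; thresholds must be strictly increasing.
    cum = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(thresholds, float))))
    upper = np.concatenate(([1.0], cum))  # P(X >= 0) = 1
    lower = np.concatenate((cum, [0.0]))  # P(X >= K) = 0
    return upper - lower                  # P(X = k), k = 0..K-1

# A 4-category item at theta = 0.5 (illustrative parameters)
print(grm_category_probs(0.5, a=1.2, thresholds=[-1.0, 0.0, 1.5]))
```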
Reagan Mozer; Luke Miratrix – Society for Research on Educational Effectiveness, 2023
Background: For randomized trials that use text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by trained human raters. These hand-coded scores are then used as a measured outcome for an impact analysis, with the average scores of the treatment group…
Descriptors: Artificial Intelligence, Coding, Randomized Controlled Trials, Research Methodology
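A minimal sketch of the traditional pipeline the abstract describes, with hypothetical hand-coded ratings standing in for real data: the coded scores become the outcome, and impact is the difference in group means.

```python
import numpy as np
from scipy import stats

# Hypothetical hand-coded outcome scores (e.g., 1-5 rubric ratings)
treatment = np.array([3.2, 4.1, 3.8, 4.5, 3.9, 4.2])
control = np.array([3.0, 3.4, 3.1, 3.7, 3.3, 3.5])

# Estimated treatment impact: difference in mean coded scores
impact = treatment.mean() - control.mean()
t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"impact = {impact:.2f}, t = {t_stat:.2f}, p = {p_value:.3f}")
```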
Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023
Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…
Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items
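The abstract does not specify the calibration method, but one common fixed-ability approach (akin to Stocking's Method A) holds examinees' ability estimates fixed and fits the new item's parameters by maximum likelihood. A Rasch-only sketch on simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)

def calibrate_difficulty(thetas, responses, n_iter=25):
    """Newton-Raphson MLE of a Rasch difficulty for one new item,
    treating the examinees' ability estimates as known."""
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(thetas - b)))
        # Gradient of the log-likelihood is sum(p - x); the Fisher
        # information is sum(p * (1 - p)).
        b += np.sum(p - responses) / np.sum(p * (1.0 - p))
    return b

# Simulate 500 examinees answering a new item with true difficulty 0.8
thetas = rng.normal(size=500)
p_true = 1.0 / (1.0 + np.exp(-(thetas - 0.8)))
responses = rng.binomial(1, p_true)
print(calibrate_difficulty(thetas, responses))  # should land near 0.8
```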
Chan, Wendy – American Journal of Evaluation, 2022
Over the past ten years, propensity score methods have made an important contribution to improving generalizations from studies that do not select samples randomly from a population of inference. However, these methods require assumptions, and recent work has considered the role of bounding approaches that provide a range of treatment impact…
Descriptors: Probability, Scores, Scoring, Generalization
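As a hedged sketch of one standard propensity-score approach to generalization (not necessarily the bounding methods this abstract examines): model the probability of trial membership given covariates, then reweight the trial sample toward the population of inference. All data here are simulated:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Covariates for trial participants (s = 1) and the inference population (s = 0)
X_trial = rng.normal(0.5, 1.0, size=(200, 2))
X_pop = rng.normal(0.0, 1.0, size=(800, 2))
X = np.vstack([X_trial, X_pop])
s = np.concatenate([np.ones(200), np.zeros(800)])

# Estimated propensity of sample membership for each trial participant
ps = LogisticRegression().fit(X, s).predict_proba(X_trial)[:, 1]

# Inverse-odds weights push the trial sample toward the population profile
weights = (1 - ps) / ps
print(weights[:5])
```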
Hacer Karamese – ProQuest LLC, 2022
Multistage adaptive testing (MST) has become popular in the testing industry because research has shown that it combines the advantages of both linear tests and item-level computer adaptive testing (CAT). Previous research efforts primarily focused on MST design issues such as panel design, module length, test length, distribution of test…
Descriptors: Adaptive Testing, Scoring, Computer Assisted Testing, Design
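For readers unfamiliar with MST: after a routing module, examinees are sent to easier or harder next-stage modules by a simple decision rule. A toy sketch with hypothetical number-correct cutoffs:

```python
def route(nc_score, cutoffs=(3, 6)):
    """Route an examinee to a next-stage module from the
    number-correct score on the routing module (made-up cutoffs)."""
    low, high = cutoffs
    if nc_score < low:
        return "easy"
    if nc_score < high:
        return "medium"
    return "hard"

print([route(s) for s in (2, 4, 7)])  # ['easy', 'medium', 'hard']
```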
Chenchen Ma; Jing Ouyang; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Survey instruments and assessments are frequently used in many domains of social science. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and convenient statistical tool for item analysis, calibration, and scoring. However, the computational…
Descriptors: Algorithms, Item Response Theory, Scoring, Accuracy
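The abstract does not name the specific model, but a standard MIRT formulation is the multidimensional 2PL, whose item response function is (illustrative notation):

```latex
P(X_{ij} = 1 \mid \boldsymbol{\theta}_i)
  = \frac{1}{1 + \exp\!\bigl[-(\mathbf{a}_j^{\top}\boldsymbol{\theta}_i + d_j)\bigr]}
```

Here \(\boldsymbol{\theta}_i\) is examinee \(i\)'s vector of latent traits, \(\mathbf{a}_j\) the item's discrimination (loading) vector, and \(d_j\) an intercept. Estimation requires integrating the likelihood over the multidimensional \(\boldsymbol{\theta}\), which is what makes MIRT computationally demanding as dimensionality grows.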
Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024
Topic models are mathematical and statistical models used to analyze textual data. Their objective is to gain information about the latent semantic space of a set of related texts, which captures the relationships between documents and words and how those words are used. Topic models are becoming…
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability
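A minimal sketch of fitting a topic model with scikit-learn; latent Dirichlet allocation (LDA) is one common choice, though the abstract does not commit to a specific model, and the toy documents below are made up:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "students answered the algebra items correctly",
    "raters scored the essays for argument quality",
    "the algebra test items were equated across forms",
    "essay raters disagreed on scoring the rubric",
]

# Bag-of-words counts, then a 2-topic LDA fit
counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Each row: a document's estimated mixture over the two latent topics
print(lda.transform(counts).round(2))
```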
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods, including number-correct scoring, IRT theta scoring, and hybrid scoring, in terms of scale-score stability over time. A simulation study was conducted to examine how well five scoring methods preserve the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
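To make the contrast concrete, a small sketch comparing number-correct scoring with IRT theta scoring (EAP under a Rasch model with known difficulties and a standard-normal prior; all values illustrative):

```python
import numpy as np

def eap_theta(responses, b, quad=np.linspace(-4, 4, 61)):
    """EAP estimate of theta via numeric quadrature, Rasch model."""
    p = 1.0 / (1.0 + np.exp(-(quad[:, None] - b[None, :])))
    lik = np.prod(np.where(responses, p, 1 - p), axis=1)
    post = lik * np.exp(-0.5 * quad**2)  # likelihood x N(0,1) prior
    return np.sum(quad * post) / np.sum(post)

b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])  # item difficulties
x = np.array([1, 1, 1, 0, 0])              # one response pattern
print("number-correct:", x.sum(), "| IRT theta (EAP):", round(eap_theta(x, b), 2))
```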
Curran, Patrick J.; Georgeson, A. R.; Bauer, Daniel J.; Hussong, Andrea M. – International Journal of Behavioral Development, 2021
Conducting valid and reliable empirical research in the prevention sciences is an inherently difficult task that poses many challenges. Chief among these is the need to obtain numerical scores of underlying theoretical constructs for use in subsequent analysis. This challenge is further exacerbated by the increasingly common need to consider multiple…
Descriptors: Psychometrics, Scoring, Prevention, Scores
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Han, Kyung T.; Dimitrov, Dimiter M.; Al-Mashary, Faisal – Educational and Psychological Measurement, 2019
The "D"-scoring method for scoring and equating tests with binary items proposed by Dimitrov offers some of the advantages of item response theory, such as item-level difficulty information and score computation that reflects the item difficulties, while retaining the merits of classical test theory such as the simplicity of number…
Descriptors: Test Construction, Scoring, Test Items, Adaptive Testing
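The exact formulas are in Dimitrov's papers; the following is only a loose, assumption-laden sketch of the core idea as the abstract presents it, namely scoring that reflects item difficulties so harder items contribute more (the weighting form below is an assumption, not Dimitrov's published definition):

```python
import numpy as np

# Hypothetical data: proportion-correct p for 4 binary items (norming sample)
p = np.array([0.90, 0.75, 0.50, 0.30])
delta = 1.0 - p                    # difficulty weights (assumed form)
x = np.array([1, 1, 1, 0])         # one examinee's responses

# Difficulty-weighted score in [0, 1]: harder items count for more
d_score = np.sum(delta * x) / np.sum(delta)
print(round(d_score, 3))
```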
Forthmann, Boris; Oyebade, Oluwatosin; Ojo, Adebusola; Günther, Fritz; Holling, Heinz – Journal of Creative Behavior, 2019
Scoring divergent-thinking response sets has always been challenging because such responses are not only open-ended in the number of ideas, but each idea may also be expressed through a varying number of concepts and, thus, a varying number of words (elaboration). While many current studies have attempted to score the semantic distance in…
Descriptors: Semantics, Creative Thinking, Simulation, Correlation
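Semantic-distance scoring typically compares corpus-derived vectors for the prompt and each response; a toy sketch with made-up vectors standing in for real embeddings:

```python
import numpy as np

def cosine_distance(u, v):
    """1 - cosine similarity: larger = semantically farther apart."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Toy vectors in place of corpus-derived word embeddings
prompt = np.array([0.9, 0.1, 0.2])   # e.g., "brick"
resp_a = np.array([0.8, 0.2, 0.1])   # common use: "build a wall"
resp_b = np.array([0.1, 0.9, 0.6])   # remote use: "art installation"

print(cosine_distance(prompt, resp_a))  # small distance, common idea
print(cosine_distance(prompt, resp_b))  # large distance, original idea
```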
Norris, Dennis; Kalm, Kristjan; Hall, Jane – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2020
Memory for verbal material improves when words form familiar chunks. But how does the improvement due to chunking come about? Two possible explanations are that the input might be actively recoded into chunks, each of which takes up less memory capacity than items not forming part of a chunk (a form of data compression), or that chunking is based…
Descriptors: Phrase Structure, Short Term Memory, Recognition (Psychology), Linguistic Input
Svetina, Dubravka; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2019
This study investigates the effect of several design and administration choices on item exposure and person/item parameter recovery under a multistage test (MST) design. In a simulation study, we examine whether number-correct (NC) or item response theory (IRT) methods are differentially effective at routing students to the correct next stage(s)…
Descriptors: Measurement, Item Analysis, Test Construction, Item Response Theory
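Item exposure, one of the outcomes studied here, has a simple operationalization: the proportion of examinees who receive each item. A toy computation on a simulated administration log:

```python
import numpy as np

# administrations[i, j] = 1 if examinee i saw item j (simulated log)
rng = np.random.default_rng(2)
administrations = rng.binomial(1, 0.3, size=(1000, 12))

# Exposure rate per item; values near 1.0 flag overexposed items
exposure = administrations.mean(axis=0)
print(exposure.round(2))
```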
Greifer, Noah – ProQuest LLC, 2018
There has been some research on the use of propensity scores in the presence of measurement error in the confounding variables; one recommended method is to generate estimates of the mis-measured covariate using a latent variable model and to use those estimates (i.e., factor scores) in place of the covariate. I describe a simulation study…
Descriptors: Evaluation Methods, Probability, Scores, Statistical Analysis
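A rough sketch of the workaround the abstract describes: compute a score for the latent covariate from its noisy indicators, then use that score in the propensity model. Here a simple standardized composite stands in for a model-based factor score, and all data are simulated:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 500

# Latent confounder eta and three noisy indicators of it
eta = rng.normal(size=n)
indicators = eta[:, None] + rng.normal(scale=0.7, size=(n, 3))

# Crude stand-in for a factor score: mean of standardized indicators
z = (indicators - indicators.mean(0)) / indicators.std(0)
factor_score = z.mean(axis=1)

# Treatment depends on the latent confounder; the propensity model
# conditions on the estimated score instead of the unobserved eta
treat = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
ps = LogisticRegression().fit(factor_score[:, None], treat)
print(ps.predict_proba(factor_score[:, None])[:5, 1].round(2))
```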