ERIC - Search Results

Showing 1 to 15 of 16 results

Analyzing Polytomous Test Data: A Comparison between an Information-Based IRT Model and the Generalized Partial Credit Model

Peer reviewed

Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024

Item response theory (IRT) models the relationship between the possible scores on a test item against a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…

Descriptors: Item Response Theory, Test Items, Models, Scoring

Using Machine Learning to Decrease the Human Coding Burden in Experimental Assessments of Text

Peer reviewed

Reagan Mozer; Luke Miratrix – Society for Research on Educational Effectiveness, 2023

Background: For randomized trials that use text as an outcome, traditional approaches for assessing treatment impact require each document first be manually coded for constructs of interest by trained human raters. These hand-coded scores are then used as a measured outcome for an impact analysis, with the average scores of the treatment group…

Descriptors: Artificial Intelligence, Coding, Randomized Controlled Trials, Research Methodology

Online Calibration in Multidimensional Computerized Adaptive Testing with Polytomously Scored Items

Peer reviewed

Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023

Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…

Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items

The Role of Distributional Overlap on the Precision Gain of Bounds for Generalization

Peer reviewed

Chan, Wendy – American Journal of Evaluation, 2022

Over the past ten years, propensity score methods have made an important contribution to improving generalizations from studies that do not select samples randomly from a population of inference. However, these methods require assumptions and recent work has considered the role of bounding approaches that provide a range of treatment impact…

Descriptors: Probability, Scores, Scoring, Generalization

A Comparison of Final Scoring Methods under the Multistage Adaptive Testing Framework

Hacer Karamese – ProQuest LLC, 2022

Multistage adaptive testing (MST) has become popular in the testing industry because the research has shown that it combines the advantages of both linear tests and item-level computer adaptive testing (CAT). The previous research efforts primarily focused on MST design issues such as panel design, module length, test length, distribution of test…

Descriptors: Adaptive Testing, Scoring, Computer Assisted Testing, Design

A Note on Improving Variational Estimation for Multidimensional Item Response Theory

Peer reviewed

Chenchen Ma; Jing Ouyang; Chun Wang; Gongjun Xu – Grantee Submission, 2024

Survey instruments and assessments are frequently used in many domains of social science. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and convenient statistical tool for item analysis, calibration, and scoring. However, the computational…

Descriptors: Algorithms, Item Response Theory, Scoring, Accuracy

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

Maintaining Score Scales over Time: A Comparison of Five Scoring Methods

Peer reviewed

Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023

This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…

Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation

Psychometric Models for Scoring Multiple Reporter Assessments: Applications to Integrative Data Analysis in Prevention Science and Beyond

Peer reviewed

Curran, Patrick J.; Georgeson, A. R.; Bauer, Daniel J.; Hussong, Andrea M. – International Journal of Behavioral Development, 2021

Conducting valid and reliable empirical research in the prevention sciences is an inherently difficult and challenging task. Chief among these is the need to obtain numerical scores of underlying theoretical constructs for use in subsequent analysis. This challenge is further exacerbated by the increasingly common need to consider multiple…

Descriptors: Psychometrics, Scoring, Prevention, Scores

Teacher Use of Digital Technologies for School-Based Assessment: A Scoping Review

Peer reviewed

Blundell, Christopher N. – Assessment in Education: Principles, Policy & Practice, 2021

This paper presents a scoping review of, firstly, how teachers use digital technologies for school-based assessment, and secondly, how these assessment-purposed digital technologies are used in teacher- and student-centred pedagogies. It draws on research about the use of assessment-purposed digital technologies in school settings, published from…

Descriptors: Computer Uses in Education, Student Evaluation, Student Centered Learning, Computer Assisted Testing

Item Order and Speededness: Implications for Test Fairness in Higher Educational High-Stakes Testing

Peer reviewed

Becker, Benjamin; van Rijn, Peter; Molenaar, Dylan; Debeer, Dries – Assessment & Evaluation in Higher Education, 2022

A common approach to increase test security in higher educational high-stakes testing is the use of different test forms with identical items but different item orders. The effects of such varied item orders are relatively well studied, but findings have generally been mixed. When multiple test forms with different item orders are used, we argue…

Descriptors: Information Security, High Stakes Tests, Computer Security, Test Items

Developing Collective Eyes for Iranian EFL Teachers' Computer-Assisted Language Assessment Literacy through Internet-Based Collaborative Reflection

Peer reviewed

Rajab Esfandiari; Mohammad Hossein Arefian – Education and Information Technologies, 2024

Computer-assisted language assessment literacy (LAL) has gained momentum in English as a foreign language (EFL) teaching since computer-based education became particularly common in language education. As EFL teachers play a critical role in administering computer-assisted language assessments, Iranian EFL teachers need to learn, relearn, and…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Foreign Countries

Uses of Artificial Intelligence in STEM Education

Xiaoming Zhai, Editor; Joseph Krajcik, Editor – Oxford University Press, 2025

In the age of rapid technological advancements, the integration of Artificial Intelligence (AI), machine learning (ML), and large language models (LLMs) in Science, Technology, Engineering, and Mathematics (STEM) education has emerged as a transformative force, reshaping pedagogical approaches and assessment methodologies. "Uses of AI in STEM…

Descriptors: Artificial Intelligence, STEM Education, Technology Uses in Education, Educational Technology

Features of a Pan Balance That May Support Students' Developing Understanding of Mathematical Equivalence

Peer reviewed

Bajwa, Neet Priya; Perry, Michelle – Mathematical Thinking and Learning: An International Journal, 2021

Elementary school students struggle in interpreting the equal sign as a symbol denoting equivalence. Although many have advocated using a pan-balance scale to help students develop this understanding, less is known about what features associated with this model support learning. To attempt to control and examine these features, the investigators…

Descriptors: Mathematics Skills, Mathematics Instruction, Elementary School Students, Concept Formation

From a Distance: Comparison of In-Person and Virtual Assessments with Adult-Child Dyads from Linguistically Diverse Backgrounds

Peer reviewed

Pratt, Amy S.; Anaya, Jissel B.; Ramos, Michelle N.; Pham, Giang; Muñoz, Miriam; Bedore, Lisa M.; Peña, Elizabeth D. – Language, Speech, and Hearing Services in Schools, 2022

Purpose: Our proof-of-concept study tested the feasibility of virtual testing using child assessments that were originally validated for in-person testing only. Method: Ten adult-child dyads were assigned to complete both in-person and virtual tests of language, cognition, and narratives. Child participants fell between the ages of 4 and 8 years;…

Descriptors: Evaluation Methods, Language Tests, Intelligence Tests, Narration
