ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	5
Since 2016 (last 10 years)	9
Since 2006 (last 20 years)	12

Descriptor

Computer Software	15
Item Analysis	15
Models	15
Item Response Theory	8
Scoring	7
Test Items	7
Comparative Analysis	6
Computer Assisted Testing	6
Foreign Countries	5
Accuracy	4
Evaluation Methods	4
Evaluators	4
Computational Linguistics	3
Correlation	3
Goodness of Fit	3
Language Tests	3
Measurement	3
Psychometrics	3
Scores	3
Second Language Learning	3
Achievement Tests	2
Adaptive Testing	2
Classification	2
Diagnostic Tests	2
Error Patterns	2

Source

International Journal of…	2
Annual Review of Applied…	1
Applied Psychological…	1
ETS Research Report Series	1
Educational and Psychological…	1
International Educational…	1
International Journal of…	1
Journal of Educational Data…	1
Journal of Educational and…	1
Journal of Experimental…	1
Journal of Speech, Language,…	1
Measurement:…	1
Practical Assessment,…	1
Routledge, Taylor & Francis…	1

Publication Type

Journal Articles	13
Reports - Research	9
Reports - Descriptive	3
Books	1
Information Analyses	1
Reports - Evaluative	1
Speeches/Meeting Papers	1
Tests/Questionnaires	1

Education Level

Higher Education	2
Adult Education	1
Elementary Secondary Education	1
Postsecondary Education	1
Secondary Education	1

Audience

Researchers	2
Practitioners	1
Students	1

Location

Denmark	1
Germany	1

Assessments and Surveys

Program for International…	1
Rosenberg Self Esteem Scale	1
Trends in International…	1

Showing all 15 results

Modeling and Analyzing Scorer Preferences in Short-Answer Math Questions

Peer reviewed
PDF on ERIC

Zhang, Mengxue; Heffernan, Neil; Lan, Andrew – International Educational Data Mining Society, 2023

Automated scoring of student responses to open-ended questions, including short-answer questions, has great potential to scale to a large number of responses. Recent approaches for automated scoring rely on supervised learning, i.e., training classifiers or fine-tuning language models on a small number of responses with human-provided score…

Descriptors: Scoring, Computer Assisted Testing, Mathematics Instruction, Mathematics Tests

Refining Semantic Similarity of Paraphasias Using a Contextual Language Model

Peer reviewed

Salem, Alexandra C.; Gale, Robert; Casilio, Marianne; Fleegle, Mikala; Fergadiotis, Gerasimos; Bedrick, Steven – Journal of Speech, Language, and Hearing Research, 2023

Purpose: ParAlg (Paraphasia Algorithms) is software that automatically categorizes a person with aphasia's naming error (paraphasia) in relation to its intended target on a picture-naming test. These classifications (based on lexicality as well as semantic, phonological, and morphological similarity to the target) are important for…

Descriptors: Semantics, Computer Software, Aphasia, Classification

Using the eRm Package for Rasch Modeling

Peer reviewed

Padgett, R. Noah; Morgan, Grant B. – Measurement: Interdisciplinary Research and Perspectives, 2020

The "extended Rasch modeling" (eRm) package in R provides users with a comprehensive set of tools for Rasch modeling for scale evaluation and general modeling. We provide a brief introduction to Rasch modeling followed by a review of literature that utilizes the eRm package. Then, the key features of the eRm package for scale evaluation…

Descriptors: Computer Software, Programming Languages, Self Esteem, Self Concept Measures

Investigating Concept Definition and Skill Modeling for Cognitive Diagnosis in Language Learning

Peer reviewed
PDF on ERIC

Boxuan Ma; Sora Fukui; Yuji Ando; Shinichi Konomi – Journal of Educational Data Mining, 2024

Language proficiency diagnosis is essential to extract fine-grained information about the linguistic knowledge states and skill mastery levels of test takers based on their performance on language tests. Different from comprehensive standardized tests, many language learning apps often revolve around word-level questions. Therefore, knowledge…

Descriptors: Language Proficiency, Brain Hemisphere Functions, Language Processing, Task Analysis

An Introduction to the Analysis of Ranked Response Data

Peer reviewed
PDF on ERIC

Finch, Holmes – Practical Assessment, Research & Evaluation, 2022

Researchers in many disciplines work with ranking data. This data type is unique in that it is often deterministic in nature (the ranks of items "k"-1 determine the rank of item "k"), and the difference in a pair of rank scores separated by "k" units is equivalent regardless of the actual values of the two ranks in…

Descriptors: Data Analysis, Statistical Inference, Models, College Faculty

Scoring Graphical Responses in TIMSS 2019 Using Artificial Neural Networks

Peer reviewed

von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023

Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We are comparing classification accuracy of convolutional and feed-forward approaches. Our…

Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education

Beyond Arousal: Prediction Error Related to Aversive Events Promotes Episodic Memory Formation

Peer reviewed

Kalbe, Felix; Schwabe, Lars – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2020

Stimuli encoded shortly before an aversive event are typically well remembered. Traditionally, this emotional memory enhancement has been attributed to beneficial effects of physiological arousal on memory formation. Here, we proposed an additional mechanism and tested whether memory formation is driven by the unpredictable nature of aversive…

Descriptors: Prediction, Memory, Fear, Conditioning

Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items

Peer reviewed
PDF on ERIC

Aybek, Eren Can; Demirtasli, R. Nukhet – International Journal of Research in Education and Science, 2017

This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…

Descriptors: Computer Assisted Testing, Adaptive Testing, Item Response Theory, Test Items

Item Response Data Analysis Using Stata Item Response Theory Package

Peer reviewed

Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018

The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from the Programme for International Student Assessment, we review the IRT package from…

Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis

A Note on Item-Restscore Association in Rasch Models

Peer reviewed

Kreiner, Svend – Applied Psychological Measurement, 2011

To rule out the need for a two-parameter item response theory (IRT) model during item analysis by Rasch models, it is important to check the Rasch model's assumption that all items have the same item discrimination. Biserial and polyserial correlation coefficients measuring the association between items and restscores are often used in an informal…

Descriptors: Item Analysis, Correlation, Item Response Theory, Models

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed
PDF on ERIC

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

Handbook of Polytomous Item Response Theory Models

Nering, Michael L., Ed.; Ostini, Remo, Ed. – Routledge, Taylor & Francis Group, 2010

This comprehensive "Handbook" focuses on the most used polytomous item response theory (IRT) models. These models help us understand the interaction between examinees and test questions where the questions have various response categories. The book reviews all of the major models and includes discussions about how and where the models…

Descriptors: Guides, Item Response Theory, Test Items, Correlation

Item Response Modeling with BILOG-MG and MULTILOG for Windows

Peer reviewed

Rupp, Andre A. – International Journal of Testing, 2003

Item response theory (IRT) has become one of the most popular scoring frameworks for measurement data. IRT models are used frequently in computerized adaptive testing, cognitively diagnostic assessment, and test equating. This article reviews two of the most popular software packages for IRT model estimation, BILOG-MG (Zimowski, Muraki, Mislevy, &…

Descriptors: Test Items, Adaptive Testing, Item Response Theory, Computer Software

Multiple Evaluation: A New Testing Paradigm that Exorcizes Guessing

Peer reviewed

Dirkzwager, Arie – International Journal of Testing, 2003

The crux in psychometrics is how to estimate the probability that a respondent answers an item correctly on one occasion out of many. Under the current testing paradigm this probability is estimated using all kinds of statistical techniques and mathematical modeling. Multiple evaluation is a new testing paradigm using the person's own personal…

Descriptors: Psychometrics, Probability, Models, Measurement

Technologies for Language Assessment.

Peer reviewed

Burstein, Jill; And Others – Annual Review of Applied Linguistics, 1996

Reviews current and developing technology uses that are relevant to language assessment and discusses examples of recent linguistic applications from the laboratory at the Educational Testing Service. The processes of language test development are described and the functions they serve from the perspective of a large testing organization are…

Descriptors: Computer Assisted Testing, Computer Software, Educational Technology, Interactive Video

Author

Aybek, Eren Can	1
Bedrick, Steven	1
Boxuan Ma	1
Breyer, F. Jay	1
Burstein, Jill	1
Casilio, Marianne	1
Demirtasli, R. Nukhet	1
Dirkzwager, Arie	1
Fergadiotis, Gerasimos	1
Finch, Holmes	1
Fleegle, Mikala	1
Gale, Robert	1
Heffernan, Neil	1
Kalbe, Felix	1
Khorramdel, Lale	1
Kreiner, Svend	1
Lan, Andrew	1
Lorenz, Florian	1
Morgan, Grant B.	1
Nering, Michael L., Ed.	1
Ostini, Remo, Ed.	1
Padgett, R. Noah	1
Rupp, Andre A.	1
Salem, Alexandra C.	1