Showing all 10 results
Peer reviewed
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
Computerized adaptive tests (CAT) apply an adaptive process in which items are tailored to individuals' ability scores. Multidimensional CAT (MCAT) designs differ in the item selection, ability estimation, and termination methods they use. This study aims to investigate the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
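The abstract does not say which item-selection rule the compared MCAT designs use; as a point of reference only, here is a minimal sketch of the most common unidimensional baseline (maximum Fisher information under a 2PL model), with a hypothetical item bank:

```python
import numpy as np

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of each 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

def select_next_item(theta_hat, a, b, administered):
    """Pick the unadministered item with maximum information at the current ability estimate."""
    info = item_information(theta_hat, a, b)
    info[list(administered)] = -np.inf  # exclude items already given
    return int(np.argmax(info))

# hypothetical item bank
rng = np.random.default_rng(0)
a = rng.uniform(0.8, 2.0, size=50)  # discriminations
b = rng.normal(0.0, 1.0, size=50)   # difficulties

theta_hat, administered = 0.0, set()
for _ in range(10):  # fixed test length as a stand-in for the termination rules the study varies
    item = select_next_item(theta_hat, a, b, administered)
    administered.add(item)
    # ... administer and score the item, then update theta_hat (e.g., by EAP or MLE)
print(sorted(administered))
```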
Joshua B. Gilbert; James S. Kim; Luke W. Miratrix – Annenberg Institute for School Reform at Brown University, 2024
Longitudinal models of individual growth typically emphasize between-person predictors of change but ignore how growth may vary "within" persons, because each person contributes only one data point to the model at each time. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift…
Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development
Peer reviewed
Joshua B. Gilbert; James S. Kim; Luke W. Miratrix – Applied Measurement in Education, 2024
Longitudinal models typically emphasize between-person predictors of change but ignore how growth varies "within" persons because each person contributes only one data point at each time. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift over time. While traditionally…
Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development
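One way to read the modeling idea in these two entries: allow relative item difficulty to shift across measurement occasions, turning a longitudinal Rasch-type model into an explanatory item response model with an item-by-time term. A sketch of such a formulation (notation mine, not necessarily the authors'):

```latex
% Person p, item i, occasion t: \theta_{pt} is ability at occasion t,
% b_i is baseline item difficulty, and \delta_{it} captures drift in
% relative item functioning over time (the item-by-time interaction).
\operatorname{logit} \Pr(Y_{pit} = 1) = \theta_{pt} - (b_i + \delta_{it}),
\qquad \theta_{pt} = \theta_{p0} + \beta_t + u_{pt}.
```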
Steinkamp, Susan Christa – ProQuest LLC, 2017
The use and interpretation of test scores that rely on accurate estimation of ability via an IRT model depend on the assumption that the IRT model fits the data. Examinees who do not put forth full effort in answering test questions, have prior knowledge of test content, or do not approach a test with the intent of answering…
Descriptors: Test Items, Item Response Theory, Scores, Test Wiseness
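Low effort, preknowledge, and similar behaviors show up as person-level misfit to the IRT model; one standard screening index is the standardized log-likelihood statistic l_z. A minimal sketch under a 2PL model (my choice of index for illustration, not necessarily the dissertation's method):

```python
import numpy as np

def lz_person_fit(responses, theta, a, b):
    """Standardized log-likelihood person-fit statistic (l_z) under a 2PL model.
    Large negative values flag response patterns that fit the model poorly."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    log_lik = np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    variance = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (log_lik - expected) / np.sqrt(variance)

# toy check: an examinee who misses the easy items but answers the hard ones
a = np.ones(6)
b = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
aberrant = np.array([0, 0, 0, 1, 1, 1])
print(lz_person_fit(aberrant, theta=0.0, a=a, b=b))  # markedly negative
```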
Peer reviewed
Pichette, François; Béland, Sébastien; Jolani, Shahab; Lesniewska, Justyna – Studies in Second Language Learning and Teaching, 2015
Researchers are frequently confronted with unanswered questions or items on their questionnaires and tests, due to factors such as item difficulty, lack of testing time, or participant distraction. This paper first presents results from a poll confirming previous claims (Rietveld & van Hout, 2006; Schafer & Graham, 2002) that data…
Descriptors: Language Research, Data Analysis, Simulation, Item Analysis
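How blanks are treated changes the resulting scores; below is a small simulation sketch contrasting two common treatments on the same data (scoring unanswered items as incorrect vs. using only the answered items), purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 200, 30

# simulate complete Rasch-like responses, then delete ~10% completely at random
ability = rng.normal(size=n_persons)
difficulty = rng.normal(size=n_items)
prob = 1.0 / (1.0 + np.exp(-(ability[:, None] - difficulty[None, :])))
complete = (rng.uniform(size=prob.shape) < prob).astype(float)
observed = complete.copy()
observed[rng.uniform(size=observed.shape) < 0.10] = np.nan

as_incorrect = np.nan_to_num(observed, nan=0.0).mean(axis=1)  # blanks scored as wrong
answered_only = np.nanmean(observed, axis=1)                  # proportion correct of answered items

# how well each treatment recovers the complete-data proportion-correct score
print(np.corrcoef(as_incorrect, complete.mean(axis=1))[0, 1])
print(np.corrcoef(answered_only, complete.mean(axis=1))[0, 1])
```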
Peer reviewed
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests
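Local dependencies among the gaps of a C-test passage are commonly screened with residual correlations; here is a compact sketch of Yen's Q3 computed from Rasch expectations, as one way to quantify the dependence the article then models explicitly (data and parameters are hypothetical):

```python
import numpy as np

def q3_matrix(responses, theta, b):
    """Yen's Q3: correlations of IRT residuals across item pairs.
    responses: persons x items 0/1 matrix; theta: abilities; b: Rasch difficulties.
    Clearly positive off-diagonal values suggest local item dependence."""
    expected = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    residuals = responses - expected
    return np.corrcoef(residuals, rowvar=False)

# toy usage; in practice theta and b would come from a fitted model
rng = np.random.default_rng(2)
theta = rng.normal(size=500)
b = rng.normal(size=20)
expected = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
responses = (rng.uniform(size=expected.shape) < expected).astype(float)
print(q3_matrix(responses, theta, b)[0, 1])  # near zero: these simulated items are locally independent
```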
Peer reviewed
Gan, Zhengdong – Changing English: Studies in Culture and Education, 2012
Leung and Lewkowicz remind us that the debate over the past two decades that is most relevant to ELT (English language teaching) pedagogy and curriculum concerns test-task authenticity. This paper first reviews how the authenticity debate in the literature of second language acquisition, pedagogy and testing has evolved. Drawing on a body of…
Descriptors: Teaching Methods, English (Second Language), Second Language Learning, Second Language Instruction
Peer reviewed
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
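Human-machine agreement in this kind of study is typically summarized with statistics such as exact agreement and quadratic weighted kappa; below is a small generic implementation of the latter (not ETS's code) for integer score scales:

```python
import numpy as np

def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Quadratic weighted kappa between two integer score vectors on [min_score, max_score]."""
    k = max_score - min_score + 1
    observed = np.zeros((k, k))
    for h, m in zip(human, machine):
        observed[h - min_score, m - min_score] += 1
    observed /= observed.sum()
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    weights = np.array([[(i - j) ** 2 for j in range(k)] for i in range(k)]) / (k - 1) ** 2
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

# hypothetical scores on a 1-6 essay scale
human = [3, 4, 4, 5, 2, 3, 4, 5]
machine = [3, 4, 5, 5, 2, 3, 3, 5]
print(quadratic_weighted_kappa(human, machine, min_score=1, max_score=6))
```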
Peer reviewed
Cohen, Andrew D.; Upton, Thomas A. – ETS Research Report Series, 2006
This study describes the reading and test-taking strategies that test takers used in the Reading section of the LanguEdge courseware (ETS, 2002a). These materials were developed to familiarize prospective respondents with the new TOEFL®. The investigation focused on strategies used to respond to more traditional single selection multiple-choice…
Descriptors: Reading Tests, Test Items, Courseware, Item Analysis
Peer reviewed
von Davier, Matthias – ETS Research Report Series, 2005
Probabilistic models with more than one latent variable are designed to report profiles of skills or cognitive attributes. Using these skill profiles, testing programs want to offer additional information beyond what a single test score can provide. Many recent approaches to skill profile models are limited to dichotomous data and have made use of…
Descriptors: Models, Diagnostic Tests, Language Tests, Language Proficiency
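The report concerns general diagnostic models for skill profiles; one commonly cited partial-credit-style form of such a model is sketched below (notation may differ from the report's):

```latex
% Item i with categories x = 0, ..., m_i; skill profile a = (a_1, ..., a_K);
% q_{ik} is the Q-matrix entry, \gamma_{ik} a slope, \beta_{ix} a category parameter.
P(X_i = x \mid a) =
  \frac{\exp\bigl(\beta_{ix} + x \sum_{k=1}^{K} \gamma_{ik} \, q_{ik} \, a_k\bigr)}
       {\sum_{y=0}^{m_i} \exp\bigl(\beta_{iy} + y \sum_{k=1}^{K} \gamma_{ik} \, q_{ik} \, a_k\bigr)}
```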