Joshua B. Gilbert; James S. Kim; Luke W. Miratrix – Annenberg Institute for School Reform at Brown University, 2024
Longitudinal models of individual growth typically emphasize between-person predictors of change but ignore how growth may vary "within" persons because each person contributes only one data point at each time to the model. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift…
Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development
Joshua B. Gilbert; James S. Kim; Luke W. Miratrix – Applied Measurement in Education, 2024
Longitudinal models typically emphasize between-person predictors of change but ignore how growth varies "within" persons because each person contributes only one data point at each time. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift over time. While traditionally…
Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
In'nami, Yo; Koizumi, Rie – International Journal of Testing, 2013
The importance of sample size, although widely discussed in the literature on structural equation modeling (SEM), has not been widely recognized among applied SEM researchers. To narrow this gap, we focus on second language testing and learning studies and examine the following: (a) Is the sample size sufficient in terms of precision and power of…
Descriptors: Structural Equation Models, Sample Size, Second Language Instruction, Monte Carlo Methods
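The study above uses Monte Carlo simulation to judge whether a sample size yields adequate precision and power. A minimal sketch of that general logic, assuming a simple two-group mean comparison rather than a full SEM model (the effect size, sample size, and test statistic here are illustrative, not the authors'):

```python
# Monte Carlo power check: simulate many datasets at a candidate
# sample size and count how often the null is (correctly) rejected.
import math
import random
import statistics

random.seed(1)

def reject(n, effect=0.5, alpha_z=1.96):
    """Simulate one two-group study (n per group) and test the mean difference."""
    a = [random.gauss(0.0, 1.0) for _ in range(n)]
    b = [random.gauss(effect, 1.0) for _ in range(n)]
    # Large-sample z approximation with unpooled variances
    se = math.sqrt(statistics.variance(a) / n + statistics.variance(b) / n)
    z = (statistics.mean(b) - statistics.mean(a)) / se
    return abs(z) > alpha_z

reps = 500
power = sum(reject(64) for _ in range(reps)) / reps  # proportion rejected
print(round(power, 2))  # near the conventional 0.80 for d = 0.5, n = 64/group
```

In a real SEM application, each replication would instead generate data from the hypothesized model and refit it, but the accept/reject tally is the same idea.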
Liu, Yuming; Schulz, E. Matthew; Yu, Lei – Journal of Educational and Behavioral Statistics, 2008
A Markov chain Monte Carlo (MCMC) method and a bootstrap method were compared in the estimation of standard errors of item response theory (IRT) true score equating. Three test form relationships were examined: parallel, tau-equivalent, and congeneric. Data were simulated based on Reading Comprehension and Vocabulary tests of the Iowa Tests of…
Descriptors: Reading Comprehension, Test Format, Markov Processes, Educational Testing
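The bootstrap side of the comparison above rests on resampling the data with replacement and taking the standard deviation of the statistic across resamples as its standard error. A minimal sketch of that idea for the mean of synthetic scores (not the authors' IRT true-score equating code):

```python
# Nonparametric bootstrap standard error of a sample mean.
import random
import statistics

random.seed(0)
scores = [random.gauss(50, 10) for _ in range(200)]  # synthetic test scores

B = 1000
boot_means = []
for _ in range(B):
    resample = random.choices(scores, k=len(scores))  # draw with replacement
    boot_means.append(statistics.mean(resample))

se = statistics.stdev(boot_means)  # bootstrap SE estimate
print(round(se, 2))  # close to the analytic sigma/sqrt(n) = 10/sqrt(200)
```

For equating, each resample would be re-calibrated and re-equated before computing the statistic, which is what makes the MCMC alternative attractive when refitting is expensive.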
Carlson, James E.; Spray, Judith A. – 1986
This paper discusses methods currently under study for use with multiple-response data. Besides using Bonferroni inequality methods to control the Type I error rate over a set of inferences involving multiple-response data, a recently proposed methodology of plotting the p-values resulting from multiple significance tests was explored. Proficiency…
Descriptors: Cutting Scores, Data Analysis, Difficulty Level, Error of Measurement
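The Bonferroni inequality approach mentioned above bounds the family-wise Type I error rate by testing each of m hypotheses at level alpha/m. A minimal sketch with illustrative p-values (not data from the paper):

```python
# Bonferroni control of the family-wise Type I error rate:
# reject H0_i only when p_i <= alpha / m.
p_values = [0.003, 0.012, 0.049, 0.21]  # placeholder p-values
alpha = 0.05
m = len(p_values)

rejected = [p <= alpha / m for p in p_values]  # per-test level 0.0125
print(rejected)  # [True, True, False, False]
```

The guarantee follows from the union bound: the probability of any false rejection is at most the sum of the m per-test levels, i.e. at most alpha.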
Stricker, Lawrence J.; Rock, Donald A.; Lee, Yong-Won – ETS Research Report Series, 2005
This study assessed the factor structure of the LanguEdge™ test and the invariance of its factors across language groups. Confirmatory factor analyses of individual tasks and subsets of items in the four sections of the test, Listening, Reading, Speaking, and Writing, were carried out for Arabic-, Chinese-, and Spanish-speaking test takers. Two…
Descriptors: Factor Structure, Language Tests, Factor Analysis, Semitic Languages