ERIC - Search Results

Publication Date

In 2025	0
Since 2024	2
Since 2021 (last 5 years)	3
Since 2016 (last 10 years)	6
Since 2006 (last 20 years)	7

Descriptor

Error of Measurement	10
Language Tests	10
Test Items	10
English (Second Language)	4
Monte Carlo Methods	4
Second Language Learning	4
Computer Assisted Testing	3
Foreign Countries	3
Grade 2	3
Grade 3	3
Item Response Theory	3
Simulation	3
Statistical Analysis	3
Test Construction	3
Test Format	3
Achievement Tests	2
College Students	2
Cutting Scores	2
Difficulty Level	2
Elementary School Students	2
English for Academic Purposes	2
Grade 1	2
Item Analysis	2
Language Proficiency	2
Learner Engagement	2

Source

Annenberg Institute for…	1
Applied Measurement in…	1
ETS Research Report Series	1
Education and Information…	1
LEARN Journal: Language…	1
Language Assessment Quarterly	1
Language Teaching Research	1
NWEA	1

Publication Type

Reports - Research	10
Journal Articles	6
Speeches/Meeting Papers	1

Education Level

Early Childhood Education	3
Elementary Education	3
Grade 2	3
Grade 3	3
Higher Education	3
Postsecondary Education	3
Primary Education	3
Grade 1	2
Grade 4	1
Grade 5	1
Grade 6	1
Grade 7	1
Grade 8	1
Intermediate Grades	1
Junior High Schools	1
Middle Schools	1
Secondary Education	1

Audience

Researchers

Location

Japan	2
Europe	1
Thailand	1

Assessments and Surveys

Test of English as a Foreign…	2
Test of English for…	2
ACT Assessment	1
Measures of Academic Progress	1

Showing all 10 results

Measuring Language Ability of Students with Compensatory Multidimensional CAT: A Post-Hoc Simulation Study

Peer reviewed

Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022

The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…

Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency

Leveraging Item Parameter Drift to Assess Transfer Effects in Vocabulary Learning. EdWorkingPaper No. 23-868

Download full text

Joshua B. Gilbert; James S. Kim; Luke W. Miratrix – Annenberg Institute for School Reform at Brown University, 2024

Longitudinal models of individual growth typically emphasize between-person predictors of change but ignore how growth may vary "within" persons because each person contributes only one point at each time to the model. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift…

Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development

Leveraging Item Parameter Drift to Assess Transfer Effects in Vocabulary Learning

Peer reviewed

Joshua B. Gilbert; James S. Kim; Luke W. Miratrix – Applied Measurement in Education, 2024

Longitudinal models typically emphasize between-person predictors of change but ignore how growth varies "within" persons because each person contributes only one data point at each time. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift over time. While traditionally…

Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development

Mapping the CU-TEP to the Common European Framework of Reference (CEFR)

Peer reviewed
PDF on ERIC

Download full text

Wudthayagorn, Jirada – LEARN Journal: Language Education and Acquisition Research Network, 2018

The purpose of this study was to map the Chulalongkorn University Test of English Proficiency, or the CU-TEP, to the Common European Framework of Reference (CEFR) by employing a standard setting methodology. Thirteen experts judged 120 items of the CU-TEP using the Yes/No Angoff technique. The experts decided whether or not a borderline student at…

Descriptors: Guidelines, Rating Scales, English (Second Language), Language Tests

Simulation Study for Evaluating MAP® Growth™ Item Pools with Grade-Level Constraints

Download full text

Li, Sylvia; Meyer, Patrick – NWEA, 2019

This simulation study examines the measurement precision, item exposure rates, and the depth of the MAP® Growth™ item pools under various grade-level restrictions. Unlike most summative assessments, MAP Growth allows examinees to see items from any grade level, regardless of the examinee's actual grade level. It does not limit the test to items…

Descriptors: Achievement Tests, Item Banks, Test Items, Instructional Program Divisions

Guessing and the Rasch Model

Peer reviewed

Holster, Trevor A.; Lake, J. – Language Assessment Quarterly, 2016

Stewart questioned Beglar's use of Rasch analysis of the Vocabulary Size Test (VST) and advocated the use of 3-parameter logistic item response theory (3PLIRT) on the basis that it models a non-zero lower asymptote for items, often called a "guessing" parameter. In support of this theory, Stewart presented fit statistics derived from…

Descriptors: Guessing (Tests), Item Response Theory, Vocabulary, Language Tests

The Creation and Validation of a Listening Vocabulary Levels Test

Peer reviewed

McLean, Stuart; Kramer, Brandon; Beglar, David – Language Teaching Research, 2015

An important gap in the field of second language vocabulary assessment concerns the lack of validated tests measuring aural vocabulary knowledge. The primary purpose of this study is to introduce and provide preliminary validity evidence for the Listening Vocabulary Levels Test (LVLT), which has been designed as a diagnostic tool to measure…

Descriptors: Test Construction, Test Validity, English (Second Language), Second Language Learning

Test-Retest Analyses of the Test of English as a Foreign Language. TOEFL Research Reports Report 45.

Download full text

Henning, Grant – 1993

This study provides information about the total and component scores of the Test of English as a Foreign Language (TOEFL). First, the study provides comparative global and component estimates of test-retest, alternate-form, and internal-consistency reliability, controlling for sources of measurement error inherent in the examinees and the testing…

Descriptors: Difficulty Level, English (Second Language), Error of Measurement, Estimation (Mathematics)

Analysis of Contingency Tables Involving Multiple-Response Data.

Carlson, James E.; Spray, Judith A. – 1986

This paper discussed methods currently under study for use with multiple-response data. Besides using Bonferroni inequality methods to control type one error rate over a set of inferences involving multiple response data, a recently proposed methodology of plotting the p-values resulting from multiple significance tests was explored. Proficiency…

Descriptors: Cutting Scores, Data Analysis, Difficulty Level, Error of Measurement

Factor Structure of the LanguEdge™ Test across Language Groups. TOEFL® Monograph Series. MS-32. ETS RR-05-12

Peer reviewed
PDF on ERIC

Download full text

Stricker, Lawrence J.; Rock, Donald A.; Lee, Yong-Won – ETS Research Report Series, 2005

This study assessed the factor structure of the LanguEdge™ test and the invariance of its factors across language groups. Confirmatory factor analyses of individual tasks and subsets of items in the four sections of the test, Listening, Reading, Speaking, and Writing, were carried out for Arabic-, Chinese-, and Spanish-speaking test takers. Two…

Descriptors: Factor Structure, Language Tests, Factor Analysis, Semitic Languages

Author

James S. Kim	2
Joshua B. Gilbert	2
Luke W. Miratrix	2
Beglar, David	1
Carlson, James E.	1
Gelbal, Selahattin	1
Henning, Grant	1
Holster, Trevor A.	1
Kramer, Brandon	1
Lake, J.	1
Lee, Yong-Won	1
Li, Sylvia	1
McLean, Stuart	1
Meyer, Patrick	1
Ozdemir, Burhanettin	1
Rock, Donald A.	1
Spray, Judith A.	1
Stricker, Lawrence J.	1
Wudthayagorn, Jirada	1