ERIC - Search Results

Publication Date

In 2025	1
Since 2024	2
Since 2021 (last 5 years)	7
Since 2016 (last 10 years)	7
Since 2006 (last 20 years)	12

Descriptor

Item Response Theory	13
Second Language Learning	13
Test Format	13
English (Second Language)	10
Test Items	10
Language Tests	9
Comparative Analysis	6
Second Language Instruction	6
Foreign Countries	5
Computer Assisted Testing	4
Item Analysis	4
Scores	4
Test Construction	4
College Entrance Examinations	3
Construct Validity	3
Difficulty Level	3
Graduate Students	3
Language Proficiency	3
Listening Comprehension Tests	3
Test Reliability	3
Content Analysis	2
Correlation	2
Cutting Scores	2
High School Students	2
Language Skills	2
More ▼

Source

Language Testing	5
Language Assessment Quarterly	2
College Board	1
Educational and Psychological…	1
Intelligence	1
Language Testing in Asia	1
ProQuest LLC	1
ReCALL	1

Publication Type

Journal Articles	11
Reports - Research	6
Reports - Evaluative	3
Dissertations/Theses -…	1
Information Analyses	1
Non-Print Media	1
Reference Materials - General	1
Reports - Descriptive	1
Tests/Questionnaires	1

Education Level

Higher Education	6
Postsecondary Education	5
Secondary Education	4
High Schools	2
Elementary Education	1

Audience

Location

Australia	1
Hong Kong	1
Indonesia	1
Iowa	1
South Korea	1
Turkey (Ankara)	1

Laws, Policies, & Programs

Assessments and Surveys

Test of English as a Foreign…	2
SAT (College Admission Test)	1

What Works Clearinghouse Rating

Showing all 13 results Save | Export

A Systematic Review of Differential Item Functioning in Second Language Assessment

Peer reviewed

Direct link

Xueliang Chen; Vahid Aryadoust; Wenxin Zhang – Language Testing, 2025

The growing diversity among test takers in second or foreign language (L2) assessments makes the importance of fairness front and center. This systematic review aimed to examine how fairness in L2 assessments was evaluated through differential item functioning (DIF) analysis. A total of 83 articles from 27 journals were included in a systematic…

Descriptors: Second Language Learning, Language Tests, Test Items, Item Analysis

A Comparative Study of AI-Human-Made and Human-Made Test Forms for a University TESOL Theory Course

Peer reviewed

Direct link

Kyung-Mi O. – Language Testing in Asia, 2024

This study examines the efficacy of artificial intelligence (AI) in creating parallel test items compared to human-made ones. Two test forms were developed: one consisting of 20 existing human-made items and another with 20 new items generated with ChatGPT assistance. Expert reviews confirmed the content parallelism of the two test forms.…

Descriptors: Comparative Analysis, Artificial Intelligence, Computer Software, Test Items

Technology-Enhanced Items in Grades 1-12 English Language Proficiency Assessments

Peer reviewed

Direct link

Kim, Ahyoung Alicia; Tywoniw, Rurik L.; Chapman, Mark – Language Assessment Quarterly, 2022

Technology-enhanced items (TEIs) are innovative, computer-delivered test items that allow test takers to better interact with the test environment compared to traditional multiple-choice items (MCIs). The interactive nature of TEIs offer improved construct coverage compared with MCIs but little research exists regarding students' performance on…

Descriptors: Language Tests, Test Items, Computer Assisted Testing, English (Second Language)

Examining the Effects of Different English Speech Varieties on an L2 Academic Listening Comprehension Test at the Item Level

Peer reviewed

Direct link

Shin, Sun-Young; Lee, Senyung; Lidster, Ryan – Language Testing, 2021

In this study we investigated the potential for a shared-first-language (shared-L1) effect on second language (L2) listening test scores using differential item functioning (DIF) analyses. We did this in order to understand how accented speech may influence performance at the item level, while controlling for key variables including listening…

Descriptors: Listening Comprehension Tests, Language Tests, Native Language, Scores

IRT-Based Classification Analysis of an English Language Reading Proficiency Subtest

Peer reviewed

Direct link

Kaya, Elif; O'Grady, Stefan; Kalender, Ilker – Language Testing, 2022

Language proficiency testing serves an important function of classifying examinees into different categories of ability. However, misclassification is to some extent inevitable and may have important consequences for stakeholders. Recent research suggests that classification efficacy may be enhanced substantially using computerized adaptive…

Descriptors: Item Response Theory, Test Items, Language Tests, Classification

Scenario-Based Language Assessment: Developing a Language Assessment Literacy Test for Indonesian Teachers of English as a Foreign Language

Direct link

Agustinus Hardi Prasetyo – ProQuest LLC, 2023

Studies have shown that language assessment literacy (LAL) is important for language teachers since they make important classroom decisions to improve student learning based on their assessment. However, some studies have shown that teachers need more knowledge and skills in assessment. Teachers also seem unconfident in assessing their students…

Descriptors: Language Tests, English (Second Language), Second Language Learning, Second Language Instruction

Towards Improved Assessment of L2 Collocation Knowledge

Peer reviewed

Direct link

Lee, Senyung; Shin, Sun-Young – Language Assessment Quarterly, 2021

Multiple test tasks are available for assessing L2 collocation knowledge. However, few studies have investigated the characteristics of a variety of recognition and recall tasks of collocation simultaneously, and most research on L2 collocations has focused on verb-noun and adjective-noun collocations. This study investigates (1) the relative…

Descriptors: Phrase Structure, Second Language Learning, Language Tests, Recall (Psychology)

Causes of Gender DIF on an EFL Language Test: A Multiple-Data Analysis over Nine Years

Peer reviewed

Direct link

Pae, Tae-Il – Language Testing, 2012

This study tracked gender differential item functioning (DIF) on the English subtest of the Korean College Scholastic Aptitude Test (KCSAT) over a nine-year period across three data points, using both the Mantel-Haenszel (MH) and item response theory likelihood ratio (IRT-LR) procedures. Further, the study identified two factors (i.e. reading…

Descriptors: Aptitude Tests, Academic Aptitude, Language Tests, Test Items

Distinguishing Verbal, Quantitative, and Figural Facets of Fluid Intelligence in Young Students

Peer reviewed

Direct link

Lakin, Joni M.; Gambrell, James L. – Intelligence, 2012

Measures of broad fluid abilities including verbal, quantitative, and figural reasoning are commonly used in the K-12 school context for a variety of purposes. However, differentiation of these domains is difficult for young children (grades K-2) who lack basic linguistic and mathematical literacy. This study examined the latent factor structure…

Descriptors: Evidence, Validity, Item Response Theory, Numeracy

Do Questions Written in the Target Language Make Foreign Language Listening Comprehension Tests More Difficult?

Peer reviewed

Direct link

Filipi, Anna – Language Testing, 2012

The Assessment of Language Competence (ALC) certificates is an annual, international testing program developed by the Australian Council for Educational Research to test the listening and reading comprehension skills of lower to middle year levels of secondary school. The tests are developed for three levels in French, German, Italian and…

Descriptors: Listening Comprehension Tests, Item Response Theory, Statistical Analysis, Foreign Countries

How Reliable Are TOEFL Scores?

Peer reviewed

Wainer, Howard; Lukhele, Robert – Educational and Psychological Measurement, 1997

The reliability of scores from four forms of the Test of English as a Foreign Language (TOEFL) was estimated using a hybrid item response theory model. It was found that there was very little difference between overall reliability when the testlet items were assumed to be independent and when their dependence was modeled. (Author/SLD)

Descriptors: English (Second Language), Item Response Theory, Scores, Second Language Learning

The Effect of Using Different Weights for Multiple-Choice and Free-Response Item Sections

Download full text

Hendrickson, Amy; Patterson, Brian; Melican, Gerald – College Board, 2008

Presented at the Annual National Council on Measurement in Education (NCME) in New York in March 2008. This presentation explores how different item weighting can affect the effective weights, validity coefficents and test reliability of composite scores among test takers.

Descriptors: Multiple Choice Tests, Test Format, Test Validity, Test Reliability

Evaluating Computer-Based and Paper-Based Versions of an English-Language Listening Test

Peer reviewed

Direct link

Coniam, David – ReCALL, 2006

This paper describes an English language listening test intended as computer-based testing material for secondary school students in Hong Kong, where considerable attention is being invested in online and computer-based testing. As well as providing a school-based testing facility, the study aims to contribute to the knowledge base regarding the…

Descriptors: Listening Comprehension Tests, Computer Assisted Testing, Foreign Countries, Grade 12

Lee, Senyung	2
Shin, Sun-Young	2
Agustinus Hardi Prasetyo	1
Chapman, Mark	1
Coniam, David	1
Filipi, Anna	1
Gambrell, James L.	1
Hendrickson, Amy	1
Kalender, Ilker	1
Kaya, Elif	1
Kim, Ahyoung Alicia	1
Kyung-Mi O.	1
Lakin, Joni M.	1
Lidster, Ryan	1
Lukhele, Robert	1
Melican, Gerald	1
O'Grady, Stefan	1
Pae, Tae-Il	1
Patterson, Brian	1
Tywoniw, Rurik L.	1
Vahid Aryadoust	1
Wainer, Howard	1
Wenxin Zhang	1
Xueliang Chen	1
More ▼