Joshua B. Gilbert; James S. Kim; Luke W. Miratrix – Annenberg Institute for School Reform at Brown University, 2024
Longitudinal models of individual growth typically emphasize between-person predictors of change but ignore how growth may vary "within" persons because each person contributes only one point at each time to the model. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift…
Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development
Joshua B. Gilbert; James S. Kim; Luke W. Miratrix – Applied Measurement in Education, 2024
Longitudinal models typically emphasize between-person predictors of change but ignore how growth varies "within" persons because each person contributes only one data point at each time. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift over time. While traditionally…
Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development
Eckes, Thomas; Jin, Kuan-Yu – International Journal of Testing, 2021
Severity and centrality are two main kinds of rater effects posing threats to the validity and fairness of performance assessments. Adopting Jin and Wang's (2018) extended facets modeling approach, we separately estimated the magnitude of rater severity and centrality effects in the web-based TestDaF (Test of German as a Foreign Language) writing…
Descriptors: Language Tests, German, Second Languages, Writing Tests
Azhar, Aqil Zainal; Segal, Avi; Gal, Kobi – International Educational Data Mining Society, 2022
This paper studies the use of Reinforcement Learning (RL) policies for optimizing the sequencing of online learning materials to students. Our approach provides an end-to-end pipeline for automatically deriving and evaluating robust representations of students' interactions and policies for content sequencing in online educational settings. We…
Descriptors: Reinforcement, Instructional Materials, Learning Analytics, Policy Analysis
Baghaei, Samira; Bagheri, Mohammad Sadegh; Yamini, Mortaza – Cogent Education, 2020
The main purpose of this quantitative-qualitative content analysis study was to compare IELTS and TOEFL listening and reading tests based on the representation of the learning objectives of Revised Bloom's taxonomy. To this end, 12 Academic IELTS listening and reading tests and 12 TOEFL iBT listening and reading tests were analyzed qualitatively…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Reading Tests
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Dashti, Laleh; Razmjoo, Seyyed Ayatollah – Cogent Education, 2020
The purpose of this mixed-methods study was to explore Iranian IELTS candidates' strengths and weaknesses in the IELTS Speaking Test in terms of IELTS's four speaking assessment criteria, namely Fluency and Coherence (FlC), Lexical Resource (LR), Grammar Range and Accuracy (GRA), and Pronunciation (Pro). It also aimed to examine the discourse features…
Descriptors: English (Second Language), Second Language Learning, Language Tests, Speech Communication
Wu, Mike; Davis, Richard L.; Domingue, Benjamin W.; Piech, Chris; Goodman, Noah – International Educational Data Mining Society, 2020
Item Response Theory (IRT) is a ubiquitous model for understanding humans based on their responses to questions, used in fields as diverse as education, medicine and psychology. Large modern datasets offer opportunities to capture more nuances in human behavior, potentially improving test scoring and better informing public policy. Yet larger…
Descriptors: Item Response Theory, Accuracy, Data Analysis, Public Policy
Lowie, Wander; van Dijk, Marijn; Chan, Huiping; Verspoor, Marjolijn – Studies in Second Language Learning and Teaching, 2017
A large body of studies into individual differences in second language learning has shown that success in second language learning is strongly affected by a set of relevant learner characteristics ranging from the age of onset to motivation, aptitude, and personality. Most studies have concentrated on a limited number of learner characteristics and…
Descriptors: Second Language Learning, Individual Differences, Learning Motivation, Personality Traits
Enkin, Elizabeth – Canadian Journal of Applied Linguistics / Revue canadienne de linguistique appliquée, 2016
The maze task is a psycholinguistic experimental procedure that measures real-time incremental sentence processing. The task has recently been tested as a language learning tool with promising results. Therefore, the present study examines the merits of a contextualized version of this task: the story maze. The findings are consistent with…
Descriptors: Task Analysis, Psycholinguistics, English, Spanish
Xie, Qin – Educational Psychology, 2017
The study utilised a fine-grained diagnostic checklist to assess first-year undergraduates in Hong Kong and evaluated its validity and usefulness for diagnosing academic writing in English. Ten English language instructors marked 472 academic essays with the checklist. They also agreed on a Q-matrix, which specified the relationships among the…
Descriptors: Academic Discourse, College Students, College English, Foreign Countries