Publication Date
In 2025 | 0 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 6 |
Since 2016 (last 10 years) | 8 |
Since 2006 (last 20 years) | 11 |
Descriptor
Comparative Analysis | 11 |
Error of Measurement | 11 |
Language Tests | 11 |
English (Second Language) | 7 |
Second Language Learning | 6 |
Foreign Countries | 5 |
Language Proficiency | 5 |
Scores | 4 |
Second Language Instruction | 4 |
Item Response Theory | 3 |
Native Language | 3 |
More ▼ |
Source
Author
Publication Type
Journal Articles | 9 |
Reports - Research | 9 |
Dissertations/Theses -… | 2 |
Education Level
Higher Education | 4 |
Postsecondary Education | 4 |
Secondary Education | 2 |
Elementary Education | 1 |
Grade 6 | 1 |
High Schools | 1 |
Intermediate Grades | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
International English… | 1 |
Iowa Tests of Basic Skills | 1 |
SAT (College Admission Test) | 1 |
What Works Clearinghouse Rating
Takehiro Iizuka – ProQuest LLC, 2024
This study examined the significance of the mode of delivery--aural versus written--in second language (L2) vocabulary knowledge and L2 comprehension skills. One of the unique aspects of listening comprehension that sets it apart from reading comprehension is the mode of delivery--language input is delivered not visually but aurally. Somewhat…
Descriptors: Reading Comprehension, Listening Comprehension, Language Skills, Error of Measurement
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Polat, Murat – International Online Journal of Education and Teaching, 2022
Foreign language testing is a multi-dimensional phenomenon and obtaining objective and error-free scores on learners' language skills is often problematic. While assessing foreign language performance on high-stakes tests, using different testing approaches including Classical Test Theory (CTT), Generalizability Theory (GT) and/or Item Response…
Descriptors: Second Language Learning, Second Language Instruction, Item Response Theory, Language Tests
Investigating the Impact of Rater Training on Rater Errors in the Process of Assessing Writing Skill
Sata, Mehmet; Karakaya, Ismail – International Journal of Assessment Tools in Education, 2022
In the process of measuring and assessing high-level cognitive skills, interference of rater errors in measurements brings about a constant concern and low objectivity. The main purpose of this study was to investigate the impact of rater training on rater errors in the process of assessing individual performance. The study was conducted with a…
Descriptors: Evaluators, Training, Comparative Analysis, Academic Language
Jeffry White – Journal of Educational Research and Practice, 2024
Violations of normality and homogeneity are common in educational data. When this occurs, the use of parametric statistics may be inappropriate. A generalized form of nonparametric analyses based on the Puri and Sen L statistic provides an alternative approach. Using a chi-square distribution, this technique is easy to apply and has significant…
Descriptors: Nonparametric Statistics, Learning Analytics, Evaluation Methods, Guidance
Schnoor, Birger; Hartig, Johannes; Klinger, Thorsten; Naumann, Alexander; Usanova, Irina – Language Testing, 2023
Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed. In such settings, tests for measuring language development must meet high standards of test quality such as validity, reliability, and objectivity, as…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Longitudinal Studies
LaFlair, Geoffrey T.; Isbell, Daniel; May, L. D. Nicolas; Gutierrez Arvizu, Maria Nelly; Jamieson, Joan – Language Testing, 2017
Language programs need multiple test forms for secure administrations and effective placement decisions, but can they have confidence that scores on alternate test forms have the same meaning? In large-scale testing programs, various equating methods are available to ensure the comparability of forms. The choice of equating method is informed by…
Descriptors: Language Tests, Equated Scores, Testing Programs, Comparative Analysis
Rastegar, Behnaz; Safari, Fatemeh – International Journal of Education and Literacy Studies, 2017
Language learners' productive role in teaching and learning processes has recently been the focus of attention. Therefore, this study aimed at investigating the effect of oral vs. written output-based instruction on English as a foreign language (EFL) learners' vocabulary learning with a focus on reflective vs. impulsive learning styles. To this…
Descriptors: Cognitive Style, English (Second Language), Second Language Learning, Foreign Countries
Hsieh, Mingchuan – Language Assessment Quarterly, 2013
The Yes/No Angoff and Bookmark method for setting standards on educational assessment are currently two of the most popular standard-setting methods. However, there is no research into the comparability of these two methods in the context of language assessment. This study compared results from the Yes/No Angoff and Bookmark methods as applied to…
Descriptors: Standard Setting (Scoring), Comparative Analysis, Language Tests, Multiple Choice Tests
Wang, Huan – ProQuest LLC, 2010
Multiple uses of the same assessment may present challenges for both the design and use of an assessment. Little advice, however, has been given to assessment developers as to how to understand the phenomena of multiple assessment use and meet the challenges these present. Particularly problematic is the case in which an assessment is used for…
Descriptors: Test Use, Testing Programs, Program Effectiveness, Test Construction
Liu, Yuming; Schulz, E. Matthew; Yu, Lei – Journal of Educational and Behavioral Statistics, 2008
A Markov chain Monte Carlo (MCMC) method and a bootstrap method were compared in the estimation of standard errors of item response theory (IRT) true score equating. Three test form relationships were examined: parallel, tau-equivalent, and congeneric. Data were simulated based on Reading Comprehension and Vocabulary tests of the Iowa Tests of…
Descriptors: Reading Comprehension, Test Format, Markov Processes, Educational Testing