Publication Date
In 2025: 0
Since 2024: 3
Since 2021 (last 5 years): 14
Since 2016 (last 10 years): 37
Since 2006 (last 20 years): 55
Descriptor
Difficulty Level: 58
High Stakes Tests: 58
Test Items: 31
Foreign Countries: 28
English (Second Language): 18
Comparative Analysis: 15
Test Construction: 14
Language Tests: 13
Second Language Learning: 13
College Entrance Examinations: 12
Item Analysis: 12
Location
Iran: 6
United Kingdom: 4
China: 3
Florida: 3
Australia: 2
Brazil: 2
Germany: 2
Ireland: 2
Massachusetts: 2
Nigeria: 2
Ohio: 2
Laws, Policies, & Programs
No Child Left Behind Act 2001: 3
Anja Riemenschneider; Zarah Weiss; Pauline Schröter; Detmar Meurers – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2024
The linguistic characteristics of text productions depend on various factors, including individual language proficiency and the tasks used to elicit them. To date, little attention has been paid to whether some writing tasks are more suitable than others for representing and differentiating students' proficiency levels. This issue is…
Descriptors: English (Second Language), Writing (Composition), Difficulty Level, Language Proficiency
Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine the effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE), using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
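For readers unfamiliar with the outcome measures named in the abstract above, equating bias and the standard error of equating are commonly estimated in simulation studies roughly as follows; the notation is a generic sketch and is not drawn from the study itself.
\[
\mathrm{Bias}(x) = \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{e}_r(x) - e(x)\bigr),
\qquad
\mathrm{SEE}(x) = \sqrt{\frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{e}_r(x) - \bar{e}(x)\bigr)^{2}}
\]
where \(\hat{e}_r(x)\) is the equated score at raw score \(x\) in replication \(r\), \(e(x)\) is the criterion equating function, and \(\bar{e}(x)\) is the mean equated score across the \(R\) replications.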
Liu, Jinghua; Becker, Kirk – Journal of Educational Measurement, 2022
For any testing program that administers multiple forms across multiple years, maintaining score comparability via equating is essential. With continuous testing and high-stakes results, especially with less secure online administrations, testing programs must consider the potential for cheating on their exams. This study used empirical and…
Descriptors: Cheating, Item Response Theory, Scores, High Stakes Tests
Tsang, Chi Lai; Isaacs, Talia – Language Testing, 2022
This sequential mixed-methods study investigates washback on learning in a high-stakes school exit examination by examining learner perceptions and reported behaviours in relation to learners' beliefs and language learning experience, the role of other stakeholders in the washback mechanism, and socio-educational forces. The focus is the graded…
Descriptors: Foreign Countries, Secondary School Students, Student Attitudes, High Stakes Tests
Rafatbakhsh, Elaheh; Ahmadi, Alireza – Practical Assessment, Research & Evaluation, 2022
The purpose of this study was to investigate the validity of the vocabulary subsection of a high-stakes university entrance exam for Ph.D. programs using the argument-based approach. All three versions of the test administered over a five-year period, along with the responses of 12,500 test-takers, were studied. The study focused on four…
Descriptors: Vocabulary, College Entrance Examinations, Doctoral Programs, Test Validity
Yumei Zou; Sathiamoorthy Kannan; Gurnam Kaur Sidhu – SAGE Open, 2024
Task design has long been viewed as essential in the context of language assessment. This study investigated whether increasing task complexity affects learners' writing performance. It employed three writing tasks with different levels of complexity based on Robinson's Componential Framework. A cohort of 278 participants was selected using a simple…
Descriptors: Difficulty Level, College Students, Foreign Countries, Writing Achievement
White, Patricia; Martin, Barbara Nell – AERA Online Paper Repository, 2021
This quantitative inquiry explored principals' and teachers' perceptions concerning the role of play in early childhood programs. All early childhood participants identified play as a learning tool but noted it was being eliminated from the curriculum due to high-stakes accountability. There was a significant difference between…
Descriptors: Play, Early Childhood Education, Accountability, Administrator Attitudes
Marcom, Guilherme Stecca; Villar, Renato Pacheco; Kleinke, Maurício Urban – Physics Education, 2022
Research on problem-solving strategies has been conducted since the 1940s. Experts and novices use different strategies in problem solving; the main difference is in the familiarity with which they face new problems. For multiple-choice problems, an analysis of distractors can provide possible clues about the resolution processes developed by…
Descriptors: Problem Solving, High Stakes Tests, Mechanics (Physics), Physics
Item Order and Speededness: Implications for Test Fairness in Higher Educational High-Stakes Testing
Becker, Benjamin; van Rijn, Peter; Molenaar, Dylan; Debeer, Dries – Assessment & Evaluation in Higher Education, 2022
A common approach to increase test security in higher educational high-stakes testing is the use of different test forms with identical items but different item orders. The effects of such varied item orders are relatively well studied, but findings have generally been mixed. When multiple test forms with different item orders are used, we argue…
Descriptors: Information Security, High Stakes Tests, Computer Security, Test Items
Lions, Séverin; Dartnell, Pablo; Toledo, Gabriela; Godoy, María Inés; Córdova, Nora; Jiménez, Daniela; Lemarié, Julie – Educational and Psychological Measurement, 2023
Even though the impact of the position of response options on answers to multiple-choice items has been investigated for decades, it remains debated. Research on this topic is inconclusive, perhaps because too few studies have obtained experimental data from large-sized samples in a real-world context and have manipulated the position of both…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Responses
Omarov, Nazarbek Bakytbekovich; Mohammed, Aisha; Alghurabi, Ammar Muhi Khleel; Alallo, Hajir Mahmood Ibrahim; Ali, Yusra Mohammed; Hassan, Aalaa Yaseen; Demeuova, Lyazat; Viktorovna, Shvedova Irina; Nazym, Bekenova; Al Khateeb, Nashaat Sultan Afif – International Journal of Language Testing, 2023
The Multiple-choice (MC) item format is commonly used in educational assessments due to its economy and effectiveness across a variety of content domains. However, numerous studies have examined the quality of MC items in high-stakes and higher-education assessments and found many flawed items, especially in terms of distractors. These faulty…
Descriptors: Test Items, Multiple Choice Tests, Item Response Theory, English (Second Language)
Srisunakrua, Thanaporn; Chumworatayee, Tipamas – Arab World English Journal, 2019
Readability has long been regarded as a significant aspect of English language teaching, as it provides an overall picture of a text's difficulty level, especially in the contexts of teaching and testing. Readability is a practical consideration when selecting materials to match a text with target readers' proficiency. However, few…
Descriptors: Readability Formulas, English (Second Language), Textbook Content, Reading Comprehension
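For context on what a readability formula computes, one widely used example is the Flesch Reading Ease score shown below; it is offered only as an illustration and is not necessarily among the formulas applied in the study above.
\[
\mathrm{FRE} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)
\]
Higher scores indicate easier text; the constants are those of the original Flesch formulation.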
Munoz, Albert; Mackay, Jonathon – Journal of University Teaching and Learning Practice, 2019
Online testing is a popular practice among tertiary educators, largely owing to its efficiency through automation, its scalability, and its capacity to add depth and breadth to subject offerings. As with all assessments, designs need to consider whether they may inadvertently make student cheating easier and harder to detect. Cheating can jeopardise the…
Descriptors: Cheating, Test Construction, Computer Assisted Testing, Classification
Raymond, Mark R.; Stevens, Craig; Bucak, S. Deniz – Advances in Health Sciences Education, 2019
Research suggests that the three-option format is optimal for multiple choice questions (MCQs). This conclusion is supported by numerous studies showing that most distractors (i.e., incorrect answers) are selected by so few examinees that they are essentially nonfunctional. However, nearly all studies have defined a distractor as nonfunctional if…
Descriptors: Multiple Choice Tests, Credentials, Test Format, Test Items
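As a rough illustration of the kind of distractor-frequency screening this line of research relies on, the sketch below flags distractors chosen by fewer than a given share of examinees. The 5% cutoff, the function name, and the example data are illustrative assumptions, not the definition used in the study above (which is truncated here).

from collections import Counter

def nonfunctional_distractors(responses, options, key, cutoff=0.05):
    """Return the incorrect options (distractors) chosen by fewer than
    `cutoff` of examinees. `responses` lists the option label selected by
    each examinee on one item; `key` is the correct option."""
    counts = Counter(responses)
    n = len(responses)
    return [opt for opt in options
            if opt != key and counts.get(opt, 0) / n < cutoff]

# Illustrative data: "B" is the key; "D" is picked by 1 of 40 examinees.
item_responses = ["B"] * 25 + ["A"] * 8 + ["C"] * 6 + ["D"]
print(nonfunctional_distractors(item_responses, ["A", "B", "C", "D"], key="B"))
# -> ['D'] under the assumed 0.05 cutoff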
Lyness, Scott A.; Peterson, Kent; Yates, Kenneth – Education Sciences, 2021
The Performance Assessment for California Teachers (PACT) is a high stakes summative assessment that was designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen's weighted kappa, the overall IRR estimate was 0.17 (poor strength of…
Descriptors: High Stakes Tests, Performance Based Assessment, Teacher Effectiveness, Academic Language
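As a hedged illustration of the statistic reported in the abstract above, Cohen's weighted kappa takes the standard form below; the quadratic weighting scheme shown is an assumption for illustration, not necessarily the one used in the study.
\[
\kappa_w = 1 - \frac{\sum_{i,j} w_{ij}\, o_{ij}}{\sum_{i,j} w_{ij}\, e_{ij}},
\qquad
w_{ij} = \frac{(i-j)^{2}}{(k-1)^{2}}
\]
where \(o_{ij}\) and \(e_{ij}\) are the observed and chance-expected proportions of ratings in cell \((i, j)\), \(w_{ij}\) is the disagreement weight, and \(k\) is the number of rating categories.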