Publication Date
In 2025 | 2 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 4 |
Since 2006 (last 20 years) | 7 |
Descriptor
Comparative Analysis | 7 |
Difficulty Level | 7 |
Evaluators | 7 |
Test Items | 4 |
Foreign Countries | 3 |
Item Analysis | 3 |
Computer Software | 2 |
English (Second Language) | 2 |
Item Response Theory | 2 |
Language Tests | 2 |
Measurement Techniques | 2 |
Source
Cambridge Assessment | 1 |
Evaluation & Research in… | 1 |
International Journal of… | 1 |
International Journal of… | 1 |
Journal of MultiDisciplinary… | 1 |
ProQuest LLC | 1 |
Teaching of Psychology | 1 |
Author
Alexander Kah | 1 |
Coleman, Tori | 1 |
Crisp, Victoria | 1 |
Darlington, Ellie | 1 |
Elliott, Gill | 1 |
Emily Courtney | 1 |
Golam Reza Rohani | 1 |
Greatorex, Jackie | 1 |
Hamdollah Ravand | 1 |
Kouame, Julien B. | 1 |
Lamprianou, Iasonas | 1 |
Publication Type
Journal Articles | 5 |
Reports - Evaluative | 3 |
Reports - Research | 3 |
Dissertations/Theses -… | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 2 |
Postsecondary Education | 2 |
Secondary Education | 2 |
High Schools | 1 |
Assessments and Surveys
Flesch Kincaid Grade Level… | 1 |
Fry Readability Formula | 1 |
National Adult Literacy… | 1 |
Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025
The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…
Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction
Roger Young; Emily Courtney; Alexander Kah; Mariah Wilkerson; Yi-Hsin Chen – Teaching of Psychology, 2025
Background: Multiple-choice item (MCI) assessments are burdensome for instructors to develop. Artificial intelligence (AI, e.g., ChatGPT) can streamline the process without sacrificing quality. The quality of AI-generated MCIs is comparable to that of MCIs written by human experts. However, whether the quality of AI-generated MCIs is equally good across various domain-…
Descriptors: Item Response Theory, Multiple Choice Tests, Psychology, Textbooks
Susan Rowe – ProQuest LLC, 2023
This dissertation explored whether unnecessary linguistic complexity (LC) in mathematics and biology assessment items changes the direction and significance of differential item functioning (DIF) between two subgroups: emergent bilinguals (EBs) and English proficient students (EPs). Due to inconsistencies in measuring LC in items, Study One adapted a…
Descriptors: Difficulty Level, English for Academic Purposes, Second Language Learning, Second Language Instruction
Greatorex, Jackie; Rushton, Nicky; Coleman, Tori; Darlington, Ellie; Elliott, Gill – Cambridge Assessment, 2019
A curriculum map is a visualisation of relationships within and between a curriculum or curricula. Curriculum mapping refers to the method for creating and using the curriculum map; however, this term is used broadly and encompasses a variety of methodological approaches. Often, researchers in the field of curriculum studies conduct curriculum…
Descriptors: Comparative Analysis, Visualization, Curriculum, Maps
Crisp, Victoria; Novakovic, Nadezda – Evaluation & Research in Education, 2009
Maintaining standards over time is a much debated topic in the context of national examinations in the UK. This study used a pilot method to compare the demands, over time, of two examination units testing administration. The method involved 15 experts revising a framework of demand types and making paired comparisons of examinations from…
Descriptors: Pilot Projects, Test Reliability, Difficulty Level, Comparative Analysis
Kouame, Julien B. – Journal of MultiDisciplinary Evaluation, 2010
Background: Readability tests are indicators that measure how easily a document can be read and understood. Simple but very often ignored, readability statistics can not only provide information about how difficult particular documents are to read but can also increase an evaluator's credibility. Purpose: The purpose of this…
Descriptors: Readability, Readability Formulas, Evaluation Methods, Literacy
Lamprianou, Iasonas – International Journal of Testing, 2008
This study investigates the effect of reporting the unadjusted raw scores in a high-stakes language exam when raters differ significantly in severity and self-selected questions differ significantly in difficulty. More sophisticated models, introducing meaningful facets and parameters, are successively used to investigate the characteristics of…
Descriptors: High Stakes Tests, Raw Scores, Item Response Theory, Language Tests