Publication Date
In 2025 | 30 |
Since 2024 | 93 |
Since 2021 (last 5 years) | 327 |
Since 2016 (last 10 years) | 642 |
Since 2006 (last 20 years) | 987 |
Descriptor
Test Items | 1017 |
Foreign Countries | 501 |
College Students | 283 |
Undergraduate Students | 282 |
Test Construction | 279 |
Difficulty Level | 211 |
Test Validity | 197 |
Test Reliability | 191 |
Multiple Choice Tests | 188 |
Item Analysis | 185 |
Scores | 177 |
Author
Liu, Ou Lydia | 9 |
Raker, Jeffrey R. | 8 |
Baghaei, Purya | 7 |
Bridgeman, Brent | 7 |
Gierl, Mark J. | 7 |
Liu, Jinghua | 7 |
Murphy, Kristen L. | 7 |
Dorans, Neil J. | 5 |
Holme, Thomas A. | 5 |
Ling, Guangming | 5 |
Attali, Yigal | 4 |
Audience
Teachers | 4 |
Administrators | 1 |
Counselors | 1 |
Practitioners | 1 |
Location
Turkey | 75 |
Iran | 37 |
Canada | 32 |
China | 30 |
Japan | 27 |
Australia | 25 |
Germany | 25 |
United Kingdom | 19 |
Indonesia | 17 |
Taiwan | 17 |
United States | 15 |
Laws, Policies, & Programs
Civil Rights Act 1964 Title… | 1 |
Higher Education Act… | 1 |
Higher Education Opportunity… | 1 |
Improving Americas Schools… | 1 |
Jeanne Clery Disclosure of… | 1 |
Morrill Act 1862 | 1 |
United Nations Convention on… | 1 |
Abdullah Faruk Kiliç; Meltem Acar Güvendir; Gül Güler; Tugay Kaçak – Measurement: Interdisciplinary Research and Perspectives, 2025
In this study, the extent to which wording effects impact structure, factor loadings, internal consistency, and measurement invariance was outlined. The modified form, which includes items that are semantically reversed, explains 21.5% more variance than the original form. Also, the reversed items' factor loadings are higher. As a result of CFA, indexes…
Descriptors: Test Items, Factor Structure, Test Reliability, Semantics
Janet Mee; Ravi Pandian; Justin Wolczynski; Amy Morales; Miguel Paniagua; Polina Harik; Peter Baldwin; Brian E. Clauser – Advances in Health Sciences Education, 2024
Recent advances in automated scoring technology have made it practical to replace multiple-choice questions (MCQs) with short-answer questions (SAQs) in large-scale, high-stakes assessments. However, most previous research comparing these formats has used small examinee samples testing under low-stakes conditions. Additionally, previous studies…
Descriptors: Multiple Choice Tests, High Stakes Tests, Test Format, Test Items
Xu, Yufeng; Liu, Huinan; Chen, Bo; Huang, Sihui; Zhong, Chongyu – Chemistry Education Research and Practice, 2023
Scientific methods have received widespread attention in recent years. Based on the analytical framework derived from Brandon's matrix consisting of four categories of scientific methods, this paper aims to conduct a content analysis to examine how the diversity of scientific methods is represented in college entrance chemistry examination papers…
Descriptors: College Entrance Examinations, Chemistry, Scientific Methodology, Test Items
Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025
This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…
Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis
Yunting Liu; Shreya Bhandari; Zachary A. Pardos – British Journal of Educational Technology, 2025
Effective educational measurement relies heavily on the curation of well-designed item pools. However, item calibration is time consuming and costly, requiring a sufficient number of respondents to estimate the psychometric properties of items. In this study, we explore the potential of six different large language models (LLMs; GPT-3.5, GPT-4,…
Descriptors: Artificial Intelligence, Test Items, Psychometrics, Educational Assessment
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
Lars Andersson Hult; Anders Persson – Journal of Social Science Education, 2025
Purpose: This article's purpose is to examine the manifestations of the evolving modern society and what we now identify as civics or other contemporary social issues in the final examination questions from 1914 to 1937 at four teacher education institutions in Uppsala, Falun, Lund, and Landskrona. Design/methodology/approach: The method can be…
Descriptors: Civics, Tests, Preservice Teacher Education, Test Items
David G. Schreurs; Jaclyn M. Trate; Shalini Srinivasan; Melonie A. Teichert; Cynthia J. Luxford; Jamie L. Schneider; Kristen L. Murphy – Chemistry Education Research and Practice, 2024
With the already widespread nature of multiple-choice assessments and the increasing popularity of answer-until-correct, it is important to have methods available for exploring the validity of these types of assessments as they are developed. This work analyzes a 20-question multiple choice assessment covering introductory undergraduate chemistry…
Descriptors: Multiple Choice Tests, Test Validity, Introductory Courses, Science Tests
Fu Chen; Ying Cui; Alina Lutsyk-King; Yizhu Gao; Xiaoxiao Liu; Maria Cutumisu; Jacqueline P. Leighton – Education and Information Technologies, 2024
Post-secondary data literacy education is critical to students' academic and career success. However, the literature has not adequately addressed the conceptualization and assessment of data literacy for post-secondary students. In this study, we introduced a novel digital performance-based assessment for teaching and evaluating post-secondary…
Descriptors: Performance Based Assessment, College Students, Information Literacy, Evaluation Methods
Pentecost, Thomas C.; Raker, Jeffrey R.; Murphy, Kristen L. – Practical Assessment, Research & Evaluation, 2023
Using multiple versions of an assessment has the potential to introduce item environment effects. These types of effects result in version dependent item characteristics (i.e., difficulty and discrimination). Methods to detect such effects and resulting implications are important for all levels of assessment where multiple forms of an assessment…
Descriptors: Item Response Theory, Test Items, Test Format, Science Tests
Melissa Whatley; Dominique Foster; Stephen Paul – Journal of Studies in International Education, 2024
The purpose of this study was to develop a measurement instrument that scholars and practitioners in international education can use as a means of exploring whether and how individuals who come into contact with international education programs develop a greater sense of cultural humility. Specifically, the study described here outlines the four…
Descriptors: Foreign Students, Cultural Awareness, Consciousness Raising, Test Construction
Emily A. Holt; Jessica Duke; Ryan Dunk; Krystal Hinerman – Environmental Education Research, 2024
Student understanding of climate change is an active and growing area of research, but little research has documented undergraduate students' knowledge about the biotic impacts of climate change. Here, we address this literature gap by presenting the Inventory of Biotic Climate Literacy (IBCL), a concept inventory developed to assess undergraduate…
Descriptors: Climate, Undergraduate Students, Knowledge Level, Test Construction
Lahza, Hatim; Smith, Tammy G.; Khosravi, Hassan – British Journal of Educational Technology, 2023
Traditional item analyses such as classical test theory (CTT) use exam-taker responses to assessment items to approximate their difficulty and discrimination. The increased adoption by educational institutions of electronic assessment platforms (EAPs) provides new avenues for assessment analytics by capturing detailed logs of an exam-taker's…
Descriptors: Medical Students, Evaluation, Computer Assisted Testing, Time Factors (Learning)
Corrin Moss; Sharon Kwabi; Scott P. Ardoin; Katherine S. Binder – Reading and Writing: An Interdisciplinary Journal, 2024
The ability to form a mental model of a text is an essential component of successful reading comprehension (RC), and purpose for reading can influence mental model construction. Participants were assigned to one of two conditions during an RC test to alter their purpose for reading: concurrent (texts and questions were presented simultaneously)…
Descriptors: Eye Movements, Reading Comprehension, Test Format, Short Term Memory
Mahdi Ghorbankhani; Keyvan Salehi – SAGE Open, 2025
Academic procrastination, the tendency to delay academic tasks without reasonable justification, has significant implications for students' academic performance and overall well-being. To measure this construct, numerous scales have been developed, among which the Academic Procrastination Scale (APS) has shown promise in assessing academic…
Descriptors: Psychometrics, Measures (Individuals), Time Management, Foreign Countries