Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025
This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…
Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level
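Several entries in these results rest on the same building block: an item response theory model in which the probability of a correct answer depends on the gap between person ability and item difficulty. A minimal Rasch-model sketch, with illustrative parameter values that are not estimates from the paper:

```python
import math

# Rasch model: P(correct) depends only on the difference between
# person ability (theta) and item difficulty (b).
def rasch_probability(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# An able examinee (theta = 1.0) facing an easy item (b = -1.0)
# answers correctly most of the time.
print(round(rasch_probability(1.0, -1.0), 3))  # → 0.881
```

When ability equals difficulty the probability is exactly 0.5, which is what makes difficulty estimates directly interpretable on the ability scale.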
Aiman Mohammad Freihat; Omar Saleh Bani Yassin – Educational Process: International Journal, 2025
Background/purpose: This study aimed to reveal the accuracy of parameter estimation for multiple-choice test items under item response theory models. Materials/methods: The researchers relied on measurement accuracy indicators, which express the absolute difference between the estimated and actual values of the…
Descriptors: Accuracy, Computation, Multiple Choice Tests, Test Items
Kuan-Yu Jin; Thomas Eckes – Educational and Psychological Measurement, 2024
Insufficient effort responding (IER) refers to a lack of effort when answering survey or questionnaire items. Such items typically offer more than two ordered response categories, with Likert-type scales as the most prominent example. The underlying assumption is that the successive categories reflect increasing levels of the latent variable…
Descriptors: Item Response Theory, Test Items, Test Wiseness, Surveys
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
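Cronbach's alpha, the first of the classical reliability measures this study compares, can be computed directly from a respondents-by-items score matrix: it contrasts the sum of the individual item variances with the variance of the total scores. A minimal sketch on synthetic data (not the study's actual C-test responses):

```python
# Cronbach's alpha = (k / (k - 1)) * (1 - sum(item variances) / var(total score)).
# Data below are synthetic 0/1 item scores for five respondents, four items.
def cronbach_alpha(scores):
    k = len(scores[0])   # number of items
    def var(xs):         # sample variance (ddof = 1)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    item_vars = [var([row[i] for row in scores]) for i in range(k)]
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(round(cronbach_alpha(data), 3))  # → 0.8
```

The Guttman-style response pattern in the toy data is what drives alpha up; real C-test data would be noisier.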
Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024
This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…
Descriptors: Korean, Test Validity, Test Reliability, Imitation
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
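The effort-moderated idea the abstract examines is to flag responses faster than an item's time threshold as rapid guesses and exclude them from scoring. The study's EM procedure is IRT-based; the sketch below shows the same filtering logic on a classical proportion-correct score, with illustrative thresholds and data that are not from the study:

```python
# Effort-moderated scoring sketch: responses with response time below an
# item threshold are treated as rapid guesses and dropped before scoring.
def em_score(responses, times, thresholds):
    kept = [r for r, t, th in zip(responses, times, thresholds) if t >= th]
    if not kept:
        return None  # no effortful responses left to score
    return sum(kept) / len(kept)

answers = [1, 0, 1, 1, 0]          # 1 = correct, 0 = incorrect
rt_secs = [12.0, 2.1, 9.5, 1.0, 15.3]
cutoffs = [3.0] * 5                # common 3-second threshold, for illustration
print(em_score(answers, rt_secs, cutoffs))  # 2 of 3 effortful responses correct
```

Items 2 and 4 fall under the threshold and are dropped, so the score is computed over the remaining three responses.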
Mohammad M. Khajah – Journal of Educational Data Mining, 2024
Bayesian Knowledge Tracing (BKT) is a popular interpretable computational model in the educational data mining community that infers a student's knowledge state and predicts future performance from practice history, enabling tutoring systems to adaptively select exercises matched to the student's competency level. Existing BKT implementations do…
Descriptors: Students, Bayesian Statistics, Intelligent Tutoring Systems, Cognitive Development
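The core of standard BKT is a two-step Bayesian update per practice opportunity: condition the mastery probability on the observed response (allowing for slips and guesses), then apply a learning transition. A minimal sketch with illustrative parameter values, not ones fitted in the paper:

```python
# Standard BKT update for one practice opportunity.
# p_slip: P(wrong | known), p_guess: P(right | unknown), p_learn: P(transition).
def bkt_update(p_known, correct, p_slip=0.1, p_guess=0.2, p_learn=0.3):
    if correct:
        num = p_known * (1 - p_slip)
        den = num + (1 - p_known) * p_guess
    else:
        num = p_known * p_slip
        den = num + (1 - p_known) * (1 - p_guess)
    posterior = num / den            # Bayes step: condition on the response
    return posterior + (1 - posterior) * p_learn  # learning transition

p = 0.4  # prior probability of mastery
for obs in [True, True, False, True]:
    p = bkt_update(p, obs)
print(round(p, 3))
```

Note how the single incorrect response pulls the estimate down only partially, because the slip parameter lets the model attribute some errors to mastered students.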
Andrés Christiansen; Rianne Janssen – Educational Assessment, Evaluation and Accountability, 2024
In international large-scale assessments, students may not be compelled to answer every test item: a student can decide to skip a seemingly difficult item or may drop out before the end of the test is reached. The way these missing responses are treated will affect the estimation of the item difficulty and student ability, and ultimately affect…
Descriptors: Test Items, Item Response Theory, Grade 4, International Assessment
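The choice the abstract describes, scoring omitted responses as incorrect versus treating them as not administered, visibly shifts even a classical proportion-correct difficulty estimate. A toy illustration (synthetic responses, not actual assessment data):

```python
# One item's responses across eight students; None marks an omitted response.
responses = [1, 0, None, 1, None, 0, 1, 1]

# Treatment A: omits scored as wrong (denominator includes them).
as_wrong = sum(r or 0 for r in responses) / len(responses)

# Treatment B: omits ignored (score only the observed responses).
observed = [r for r in responses if r is not None]
as_ignored = sum(observed) / len(observed)

print(as_wrong, as_ignored)  # 0.5 vs ~0.667: the item looks harder under A
```

The same mechanism operates inside IRT calibration: scoring omits as wrong inflates estimated item difficulty and depresses ability estimates for students who skip items.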
Y. Yokhebed; Rexy Maulana Dwi Karmadi; Luvia Ranggi Nastiti – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2025
Although self-assessment in critical thinking is thought to help students recognise their strengths and weaknesses, the reliability and validity of such assessment tools are still questionable, so a more objective evaluation is needed. The objective of this investigation is to assess self-assessment tools for evaluating students' critical thinking…
Descriptors: Self Evaluation (Individuals), Critical Thinking, Science and Society, Test Validity
Agus Santoso; Heri Retnawati; Timbul Pardede; Ibnu Rafi; Munaya Nikma Rosyada; Gulzhaina K. Kassymova; Xu Wenxin – Practical Assessment, Research & Evaluation, 2024
The test blueprint is important in test development, as it guides the test item writer in creating items according to the desired objectives and specifications (so-called a priori item characteristics), such as the intended item difficulty category and the distribution of items across difficulty levels.…
Descriptors: Foreign Countries, Undergraduate Students, Business English, Test Construction
Roger Young; Emily Courtney; Alexander Kah; Mariah Wilkerson; Yi-Hsin Chen – Teaching of Psychology, 2025
Background: Multiple-choice item (MCI) assessments are burdensome for instructors to develop. Artificial intelligence (AI, e.g., ChatGPT) can streamline the process without sacrificing quality, and the quality of AI-generated MCIs is comparable to that of items written by human experts. However, whether the quality of AI-generated MCIs is equally good across various domain-…
Descriptors: Item Response Theory, Multiple Choice Tests, Psychology, Textbooks
Rodrigo Moreta-Herrera; Xavier Oriol-Granado; Mònica González; Jose A. Rodas – Infant and Child Development, 2025
This study evaluates the Children's Worlds Psychological Well-Being Scale (CW-PSWBS) within a diverse international cohort of children aged 10 and 12, utilising Classical Test Theory (CTT) and Item Response Theory (IRT) methodologies. Through a detailed psychometric analysis, this research assesses the CW-PSWBS's structural integrity, focusing on…
Descriptors: Well Being, Rating Scales, Children, Item Response Theory
Mimi Ismail; Ahmed Al-Badri; Said Al-Senaidi – Journal of Education and e-Learning Research, 2025
This study aimed to reveal the differences in individuals' abilities, their standard errors, and the psychometric properties of the test across two modes of test administration (electronic and paper-based). The descriptive approach was used to achieve the study's objectives. The study sample consisted of 74 male and female students at the…
Descriptors: Achievement Tests, Computer Assisted Testing, Psychometrics, Item Response Theory
Alicia A. Stoltenberg – ProQuest LLC, 2024
Multiple-select multiple-choice items, or multiple-choice items with more than one correct answer, are used to quickly assess content on standardized assessments. Because there are multiple keys to these item types, there are also multiple ways to score student responses to these items. The purpose of this study was to investigate how changing the…
Descriptors: Scoring, Evaluation Methods, Multiple Choice Tests, Standardized Tests
Moritz Waitzmann; Ruediger Scholz; Susanne Wessnigk – Physical Review Physics Education Research, 2024
Clear and rigorous quantum reasoning is needed to explain quantum physical phenomena. As pillars of genuinely quantum explanations, we suggest specific lines of quantum reasoning derived from key ideas of quantum physics. An experiment is suggested to support such reasoning, in which a quantized radiation field interacts with an optical beam…
Descriptors: Physics, Science Instruction, Teaching Methods, Quantum Mechanics