Publication Date
In 2025: 5
Since 2024: 7
Since 2021 (last 5 years): 22
Since 2016 (last 10 years): 49
Since 2006 (last 20 years): 86
Descriptor
Difficulty Level: 98
Item Response Theory: 98
Psychometrics: 98
Test Items: 76
Foreign Countries: 32
Test Construction: 25
Test Reliability: 25
Item Analysis: 21
Test Validity: 19
Models: 16
Statistical Analysis: 15
Author
Paek, Insu: 4
Schoen, Robert C.: 3
Yang, Xiaotong: 3
Bejar, Isaac I.: 2
Chen, Ching-I: 2
Domingue, Benjamin W.: 2
Engelhard, George, Jr.: 2
Ferrando, Pere J.: 2
Gilbert, Joshua B.: 2
Liu, Sicong: 2
Miratrix, Luke W.: 2
Location
Greece: 4
Nigeria: 4
Taiwan: 4
Germany: 3
United States: 3
Indonesia: 2
Jordan: 2
South Korea: 2
Turkey: 2
Asia: 1
Australia: 1
Aiman Mohammad Freihat; Omar Saleh Bani Yassin – Educational Process: International Journal, 2025
Background/purpose: This study aimed to determine the accuracy of estimating multiple-choice test item parameters under item response theory models of measurement. Materials/methods: The researchers relied on measurement accuracy indicators, which express the absolute difference between the estimated and actual values of the…
Descriptors: Accuracy, Computation, Multiple Choice Tests, Test Items
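The accuracy indicator this abstract describes — the absolute difference between estimated and true item parameters — can be sketched as follows. The parameter values below are hypothetical, purely for illustration; they are not taken from the study.

```python
import numpy as np

# Hypothetical true and estimated 2PL item parameters
# (a = discrimination, b = difficulty), e.g. from a simulation study.
true_a = np.array([1.0, 1.2, 0.8])
est_a = np.array([1.1, 1.25, 0.8])
true_b = np.array([-0.5, 0.0, 0.5])
est_b = np.array([-0.4, 0.1, 0.5])

def mean_abs_error(est, true):
    """Mean absolute difference between estimated and true values."""
    return float(np.mean(np.abs(np.asarray(est) - np.asarray(true))))

print(mean_abs_error(est_a, true_a))  # 0.05
print(round(mean_abs_error(est_b, true_b), 3))  # 0.067
```

Smaller values indicate that the estimation method recovers the generating parameters more accurately.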
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
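Cronbach's alpha, the first of the classical reliability measures this study compares, can be computed directly from an item-score matrix. A minimal sketch with toy data (the scores below are invented for illustration):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_persons, n_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # per-item variances
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Toy data: 4 respondents x 3 items
data = [[3, 4, 3],
        [2, 2, 3],
        [5, 4, 4],
        [4, 5, 4]]
print(round(cronbach_alpha(data), 3))  # 0.848
```

Alpha rises toward 1.0 as items covary more strongly relative to their individual variances; the Rasch and Mokken analyses in the study address reliability from a different, model-based angle.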
Dahl, Laura S.; Staples, B. Ashley; Mayhew, Matthew J.; Rockenbach, Alyssa N. – Innovative Higher Education, 2023
Surveys with rating scales are often used in higher education research to measure student learning and development, yet testing and reporting on the longitudinal psychometric properties of these instruments is rare. Rasch techniques allow scholars to map item difficulty and individual aptitude on the same linear, continuous scale to compare…
Descriptors: Surveys, Rating Scales, Higher Education, Educational Research
Rodrigo Moreta-Herrera; Xavier Oriol-Granado; Mònica González; Jose A. Rodas – Infant and Child Development, 2025
This study evaluates the Children's Worlds Psychological Well-Being Scale (CW-PSWBS) within a diverse international cohort of children aged 10 and 12, utilising Classical Test Theory (CTT) and Item Response Theory (IRT) methodologies. Through a detailed psychometric analysis, this research assesses the CW-PSWBS's structural integrity, focusing on…
Descriptors: Well Being, Rating Scales, Children, Item Response Theory
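The IRT side of analyses like this one rests on an item response function. A minimal sketch of the two-parameter logistic (2PL) model, with illustrative parameter values:

```python
import math

def irt_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL IRT model.

    theta: person trait level; a: item discrimination; b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A person located exactly at the item's difficulty endorses it
# with probability 0.5, regardless of discrimination.
print(irt_2pl(theta=0.0, a=1.5, b=0.0))  # 0.5
print(round(irt_2pl(theta=2.0, a=1.0, b=0.0), 3))
```

Unlike CTT statistics, the item parameters here are (in principle) invariant across samples, which is why studies often report both frameworks side by side.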
Mimi Ismail; Ahmed Al-Badri; Said Al-Senaidi – Journal of Education and e-Learning Research, 2025
This study aimed to reveal differences in individuals' abilities, their standard errors, and the psychometric properties of the test between the two modes of test administration (electronic and paper). The descriptive approach was used to achieve the study's objectives. The study sample consisted of 74 male and female students at the…
Descriptors: Achievement Tests, Computer Assisted Testing, Psychometrics, Item Response Theory
Zenger, Tim; Bitzenbauer, Philipp – Science Education International, 2022
This article reports on the development and piloting of a German version of a concept test to assess students' conceptual knowledge of density. The concept test was administered in paper-pencil format to 222 German secondary school students as a post-test after instruction in all relevant concepts of density. We provide a psychometric…
Descriptors: Foreign Countries, Secondary School Students, Concept Formation, Psychometrics
Yoo Jeong Jang – ProQuest LLC, 2022
Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…
Descriptors: Classification, Accuracy, Item Response Theory, Correlation
Rodriguez, Rebekah M.; Silvia, Paul J.; Kaufman, James C.; Reiter-Palmon, Roni; Puryear, Jeb S. – Creativity Research Journal, 2023
The original 90-item Creative Behavior Inventory (CBI) was a landmark self-report scale in creativity research, and the 28-item brief form developed nearly 20 years ago continues to be a popular measure of everyday creativity. Relatively little is known, however, about the psychometric properties of this widely used scale. In the current research,…
Descriptors: Creativity Tests, Creativity, Creative Thinking, Psychometrics
Joshua B. Gilbert; Luke W. Miratrix; Mridul Joshi; Benjamin W. Domingue – Journal of Educational and Behavioral Statistics, 2025
Analyzing heterogeneous treatment effects (HTEs) plays a crucial role in understanding the impacts of educational interventions. A standard practice for HTE analysis is to examine interactions between treatment status and preintervention participant characteristics, such as pretest scores, to identify how different groups respond to treatment.…
Descriptors: Causal Models, Item Response Theory, Statistical Inference, Psychometrics
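The standard HTE practice this abstract names — regressing the outcome on a treatment × pretest interaction — can be sketched with simulated data. Everything below (sample size, effect sizes, noise level) is an assumption for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
pretest = rng.normal(size=n)            # preintervention covariate
treat = rng.integers(0, 2, size=n)      # random treatment assignment
# Simulated outcome: the treatment effect grows with pretest score,
# which is exactly the heterogeneity the interaction term targets.
y = 0.5 * pretest + treat * (0.3 + 0.4 * pretest) + rng.normal(scale=0.5, size=n)

# Design matrix: intercept, pretest, treatment, treatment x pretest.
X = np.column_stack([np.ones(n), pretest, treat, treat * pretest])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta[3] estimates how the treatment effect varies with pretest.
print(beta)
```

The paper's contribution concerns what happens when the pretest is itself a latent-variable (IRT-scored) measure rather than an error-free covariate; the sketch above is only the conventional baseline.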
Stephanie M. Werner; Ying Chen; Mike Stieff – Journal of Chemical Education, 2021
The Chemistry Self-Concept Inventory (CSCI) is a widely used instrument within chemistry education research. Yet, agreement on its overall reliability and validity is lacking, and psychometric analyses of the instrument remain outstanding. This study examined the psychometric properties of the subscale and item function of the CSCI on 1140 high…
Descriptors: Self Concept Measures, Chemistry, Psychometrics, Item Response Theory
Musa Adekunle Ayanwale – Discover Education, 2023
Examination scores obtained by students from the West African Examinations Council (WAEC), and National Business and Technical Examinations Board (NABTEB) may not be directly comparable due to differences in examination administration, item characteristics of the subject in question, and student abilities. For more accurate comparisons, scores…
Descriptors: Equated Scores, Mathematics Tests, Test Items, Test Format
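Making scores from two examination boards comparable is a score-equating problem. A minimal sketch of mean/sigma linear equating, with made-up form statistics (the actual study's method and values may differ):

```python
def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Map a Form X score onto the Form Y scale by matching means and SDs."""
    return mean_y + sd_y * (x - mean_x) / sd_x

# Hypothetical statistics: Form X (mean 50, SD 10), Form Y (mean 55, SD 12).
# A Form X score of 60 is one SD above its mean, so it maps to one SD
# above the Form Y mean.
print(linear_equate(60, 50, 10, 55, 12))  # 67.0
```

IRT-based equating, as used in comparisons like this one, instead places both forms' item parameters on a common scale before converting scores.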
Hussein, Rasha Abed; Sabit, Shaker Holh; Alwan, Merriam Ghadhanfar; Wafqan, Hussam Mohammed; Baqer, Abeer Ameen; Ali, Muneam Hussein; Hachim, Safa K.; Sahi, Zahraa Tariq; AlSalami, Huda Takleef; Sulaiman, Bahaa Aldin Fawzi – International Journal of Language Testing, 2022
Dictation is a traditional technique for both teaching and testing overall language ability and listening comprehension. In a dictation, a passage is read aloud by the teacher and examinees write down what they hear. Due to the peculiar form of dictations, psychometric analysis of dictations is challenging. In a dictation, there is no clear…
Descriptors: Psychometrics, Verbal Communication, Teaching Methods, Language Skills
Roelofs, Erik C.; Emons, Wilco H. M.; Verschoor, Angela J. – International Journal of Testing, 2021
This study reports on an Evidence Centered Design (ECD) project in the Netherlands, involving the theory exam for prospective car drivers. In particular, we illustrate how cognitive load theory, task-analysis, response process models, and explanatory item-response theory can be used to systematically develop and refine task models. Based on a…
Descriptors: Foreign Countries, Psychometrics, Test Items, Evidence Based Practice
Fadillah, Sarah Meilani; Ha, Minsu; Nuraeni, Eni; Indriyanti, Nurma Yunita – Malaysian Journal of Learning and Instruction, 2023
Purpose: Researchers discovered that when students were given the opportunity to change their answers, a majority changed their responses from incorrect to correct, and this change often increased the overall test results. What prompts students to modify their answers? This study aims to examine the modification of scientific reasoning test, with…
Descriptors: Science Tests, Multiple Choice Tests, Test Items, Decision Making
Qi Huang; Daniel M. Bolt; Weicong Lyu – Large-scale Assessments in Education, 2024
Large scale international assessments depend on invariance of measurement across countries. An important consideration when observing cross-national differential item functioning (DIF) is whether the DIF actually reflects a source of bias, or might instead be a methodological artifact reflecting item response theory (IRT) model misspecification.…
Descriptors: Test Items, Item Response Theory, Test Bias, Test Validity
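A common screen for the cross-national DIF this abstract discusses is the Mantel-Haenszel common odds ratio, which compares item performance between groups within matched ability strata. A minimal sketch with invented counts:

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio across score strata.

    Each stratum is (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    A value near 1.0 suggests no DIF on the item; large departures flag
    items that behave differently for equally able examinees.
    """
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

# Hypothetical 2x2 counts at three ability strata.
strata = [(40, 10, 35, 15), (30, 20, 28, 22), (20, 30, 18, 32)]
print(round(mantel_haenszel_or(strata), 2))  # 1.31
```

The paper's point is subtler: apparent DIF like this can be an artifact of IRT model misspecification rather than genuine bias, so a flagged odds ratio is a starting point for investigation, not a verdict.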