Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 4 |
Since 2016 (last 10 years) | 13 |
Since 2006 (last 20 years) | 40 |
Descriptor
Item Analysis | 141 |
Test Theory | 141 |
Test Items | 70 |
Latent Trait Theory | 45 |
Test Construction | 42 |
Test Reliability | 38 |
Test Validity | 36 |
Mathematical Models | 30 |
Difficulty Level | 23 |
Psychometrics | 23 |
Foreign Countries | 22 |
More ▼ |
Source
Author
Beddow, Peter A. | 3 |
Dorans, Neil J. | 3 |
Haladyna, Tom | 3 |
Hambleton, Ronald K. | 3 |
Bennett, Randy Elliot | 2 |
Cook, Linda L. | 2 |
Kettler, Ryan J. | 2 |
Lee, Young-Sun | 2 |
Lord, Frederic M. | 2 |
McKinley, Robert L. | 2 |
Reckase, Mark D. | 2 |
More ▼ |
Publication Type
Education Level
Audience
Researchers | 19 |
Practitioners | 2 |
Teachers | 2 |
Students | 1 |
Location
Netherlands | 3 |
Israel | 2 |
Italy | 2 |
Singapore | 2 |
Spain | 2 |
Alabama | 1 |
Australia | 1 |
Europe | 1 |
Finland (Helsinki) | 1 |
Illinois | 1 |
Indonesia | 1 |
More ▼ |
Laws, Policies, & Programs
Elementary and Secondary… | 1 |
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Eray Selçuk; Ergül Demir – International Journal of Assessment Tools in Education, 2024
This research aims to compare the ability and item parameter estimations of Item Response Theory according to Maximum likelihood and Bayesian approaches in different Monte Carlo simulation conditions. For this purpose, depending on the changes in the priori distribution type, sample size, test length, and logistics model, the ability and item…
Descriptors: Item Response Theory, Item Analysis, Test Items, Simulation
DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023
A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness
Tyrone B. Pretorius; P. Paul Heppner; Anita Padmanabhanunni; Serena Ann Isaacs – SAGE Open, 2023
In previous studies, problem solving appraisal has been identified as playing a key role in promoting positive psychological well-being. The Problem Solving Inventory is the most widely used measure of problem solving appraisal and consists of 32 items. The length of the instrument, however, may limit its applicability to large-scale surveys…
Descriptors: Problem Solving, Measures (Individuals), Test Construction, Item Response Theory
Eaton, Philip; Johnson, Keith; Barrett, Frank; Willoughby, Shannon – Physical Review Physics Education Research, 2019
For proper assessment selection understanding the statistical similarities amongst assessments that measure the same, or very similar, topics is imperative. This study seeks to extend the comparative analysis between the brief electricity and magnetism assessment (BEMA) and the conceptual survey of electricity and magnetism (CSEM) presented by…
Descriptors: Test Theory, Item Response Theory, Comparative Analysis, Energy
Different Analyses, Different Conclusions? Validity Evidence from the EGMA Spatial Reasoning Subtask
Perry, Lindsey – Global Education Review, 2018
As the global development community shifts its focus from improving access to education to improving learning and instruction, the need for instruments that accurately measure student achievement in mathematics and meet technical standards is increasing. This paper explores the importance of collecting high-quality validity evidence that aligns…
Descriptors: Mathematics Tests, Test Validity, Spatial Ability, Foreign Countries
Kim, Peter – Language Teaching Research Quarterly, 2021
Foreign language aptitude is defined as one's potential to learn a second language. A language learner with higher aptitude is predicted to learn more, faster, and reach a higher level of proficiency. If this is the case, one way to validate the construct of aptitude and its measure is to conduct a validation study in which measures of aptitude is…
Descriptors: Morphology (Languages), Syntax, Second Language Learning, Second Language Instruction
Shanmugam, S. Kanageswari Suppiah; Wong, Vincent; Rajoo, Murugan – Malaysian Journal of Learning and Instruction, 2020
Purpose: This study examined the quality of English test items using psychometric and linguistic characteristics among Grade Six pupils. Method: Contrary to the conventional approach of relying only on statistics when investigating item quality, this study adopted a mixed-method approach by employing psychometric analysis and cognitive interviews.…
Descriptors: English (Second Language), Second Language Instruction, Language Tests, Psychometrics
Powers, Donald; Schedl, Mary; Papageorgiou, Spiros – Language Testing, 2017
The aim of this study was to develop, for the benefit of both test takers and test score users, enhanced "TOEFL ITP"® test score reports that go beyond the simple numerical scores that are currently reported. To do so, we applied traditional scale anchoring (proficiency scaling) to item difficulty data in order to develop performance…
Descriptors: English (Second Language), Second Language Learning, Language Proficiency, Scores
Bichi, Ado Abdu; Talib, Rohaya – International Journal of Evaluation and Research in Education, 2018
Testing in educational system perform a number of functions, the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that, testing is an important element of education. To effectively utilize the tests in educational policies and quality assurance its validity and…
Descriptors: Item Response Theory, Test Items, Test Construction, Decision Making
Traxler, Adrienne; Henderson, Rachel; Stewart, John; Stewart, Gay; Papak, Alexis; Lindell, Rebecca – Physical Review Physics Education Research, 2018
Research on the test structure of the Force Concept Inventory (FCI) has largely ignored gender, and research on FCI gender effects (often reported as "gender gaps") has seldom interrogated the structure of the test. These rarely crossed streams of research leave open the possibility that the FCI may not be structurally valid across…
Descriptors: Physics, Science Instruction, Sex Fairness, Gender Differences
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2016
The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…
Descriptors: Test Theory, Item Response Theory, Models, Correlation
Muhson, Ali; Lestari, Barkah; Supriyanto; Baroroh, Kiromim – International Journal of Instruction, 2017
Item analysis has essential roles in the learning assessment. The item analysis program is designed to measure student achievement and instructional effectiveness. This study was aimed to develop item-analysis program and verify its feasibility. This study uses a Research and Development (R & D) model. The procedure includes designing and…
Descriptors: Item Analysis, Questionnaires, Interviews, Test Items
Choi, Kyong Mi; Lee, Young-Sun; Park, Yoon Soo – EURASIA Journal of Mathematics, Science & Technology Education, 2015
International trended assessments have long attempted to provide instructional information to educational researchers and classroom teachers. Studies have shown that traditional methods of item analysis have not provided specific information that can be directly applicable to improve student performance. To this end, cognitive diagnosis models…
Descriptors: International Assessment, Mathematics Tests, Grade 8, Models
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Maydeu-Olivares, Alberto – Measurement: Interdisciplinary Research and Perspectives, 2013
In this rejoinder, Maydeu-Olivares states that, in item response theory (IRT) measurement applications, the application of goodness-of-fit (GOF) methods informs researchers of the discrepancy between the model and the data being fitted (the room for improvement). By routinely reporting the GOF of IRT models, together with the substantive results…
Descriptors: Goodness of Fit, Models, Evaluation Methods, Item Response Theory