Publication Date
In 2025 | 3 |
Since 2024 | 12 |
Since 2021 (last 5 years) | 41 |
Since 2016 (last 10 years) | 126 |
Since 2006 (last 20 years) | 395 |
Descriptor
Test Theory | 1161 |
Test Items | 261 |
Test Reliability | 252 |
Test Construction | 245 |
Test Validity | 245 |
Psychometrics | 181 |
Scores | 176 |
Item Response Theory | 165 |
Foreign Countries | 159 |
Item Analysis | 141 |
Statistical Analysis | 134 |
More ▼ |
Source
Author
Publication Type
Education Level
Location
United States | 17 |
United Kingdom (England) | 15 |
Canada | 14 |
Australia | 13 |
Turkey | 12 |
Sweden | 8 |
United Kingdom | 8 |
Netherlands | 7 |
Texas | 7 |
New York | 6 |
Taiwan | 6 |
More ▼ |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 4 |
Elementary and Secondary… | 3 |
Individuals with Disabilities… | 3 |
Assessments and Surveys
What Works Clearinghouse Rating
Baird, Jo-Anne; Andrich, David; Hopfenbeck, Therese N.; Stobart, Gordon – Assessment in Education: Principles, Policy & Practice, 2017
Educational assessments define what aspects of learning will formally be given credit and therefore have a huge impact upon teaching and learning. Although the impact of high-stakes national and international assessments on teaching and learning is considered in the literature, remarkably, there is little research on the connection between…
Descriptors: Educational Assessment, Learning Theories, Psychological Evaluation, Test Theory
Coggins, Joanne V.; Kim, Jwa K.; Briggs, Laura C. – Research in the Schools, 2017
The Gates-MacGinitie Reading Comprehension Test, fourth edition (GMRT-4) and the ACT Reading Tests (ACT-R) were administered to 423 high school students in order to explore the similarities and dissimilarities of data produced through classical test theory (CTT) and item response theory (IRT) analysis. Despite the many advantages of IRT…
Descriptors: Item Response Theory, Test Theory, Reading Comprehension, Reading Tests
Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J. – Educational Assessment, 2017
This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…
Descriptors: Scores, Test Construction, Test Reliability, Test Validity
Dirlik, Ezgi Mor – International Journal of Progressive Education, 2019
Item response theory (IRT) has so many advantages than its precedent Classical Test Theory (CTT) such as non-changing item parameters, ability parameter estimations free from the items. However, in order to get these advantages, some assumptions should be met and they are; unidimensionality, normality and local independence. However, it is not…
Descriptors: Comparative Analysis, Nonparametric Statistics, Item Response Theory, Models
van der Linden, Wim J. – Journal of Educational Measurement, 2013
This article is a response to the commentaries on the position paper on observed-score equating by van der Linden (this issue). The response focuses on the more general issues in these commentaries, such as the nature of the observed scores that are equated, the importance of test-theory assumptions in equating, the necessity to use multiple…
Descriptors: Equated Scores, Test Theory, Transformations (Mathematics)
Ramsay, James O.; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2017
This article promotes the use of modern test theory in testing situations where sum scores for binary responses are now used. It directly compares the efficiencies and biases of classical and modern test analyses and finds an improvement in the root mean squared error of ability estimates of about 5% for two designed multiple-choice tests and…
Descriptors: Scoring, Test Theory, Computation, Maximum Likelihood Statistics
Traxler, Adrienne; Henderson, Rachel; Stewart, John; Stewart, Gay; Papak, Alexis; Lindell, Rebecca – Physical Review Physics Education Research, 2018
Research on the test structure of the Force Concept Inventory (FCI) has largely ignored gender, and research on FCI gender effects (often reported as "gender gaps") has seldom interrogated the structure of the test. These rarely crossed streams of research leave open the possibility that the FCI may not be structurally valid across…
Descriptors: Physics, Science Instruction, Sex Fairness, Gender Differences
Li, Feifei – ETS Research Report Series, 2017
An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…
Descriptors: Item Response Theory, Generalizability Theory, Tests, Error of Measurement
Holland, Paul W. – Journal of Educational Measurement, 2013
While agreeing with van der Linden (this issue) that test equating needs better theoretical underpinnings, my comments criticize several aspects of his article. His examples are, for the most part, worthless; he does not use well-established terminology correctly; his view of 100 years of attempts to give a theoretical basis for equating is…
Descriptors: Equated Scores, Test Theory, Transformations (Mathematics), Computation
Vangrieken, Katrien; Boon, Anne; Dochy, Filip; Kyndt, Eva – Frontline Learning Research, 2017
The current gap between traditional team research and research focusing on non-strict teams or groups such as teacher teams hampers boundary-crossing investigations of and theorising on teamwork and collaboration. The main aim of this study includes bridging this gap by proposing a continuum-based team concept, describing the distinction between…
Descriptors: Teamwork, Teacher Researchers, Teacher Collaboration, Questionnaires
Wolkowitz, Amanda; Davis-Becker, Susan – Practical Assessment, Research & Evaluation, 2015
This study evaluates the impact of common item characteristics on the outcome of equating in credentialing examinations when traditionally recommended representation is not possible. This research used real data sets from several credentialing exams to test the impact of content representation, item statistics, and number of common items on…
Descriptors: Test Items, Equated Scores, Licensing Examinations (Professions), Test Content
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2016
The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…
Descriptors: Test Theory, Item Response Theory, Models, Correlation
Powell, J. C. – International Association for Development of the Information Society, 2013
This reflection paper challenges current test scoring practices on the grounds that most wrong-answer selections are thoughtful not random, presenting research supporting this proposition. An alternative test scoring system is presented, described and its outcomes discussed. This new scoring system increases the number of variables considered,…
Descriptors: Test Theory, Test Interpretation, Scoring, Multiple Choice Tests
Opitz, Ansgar; Heene, Moritz; Fischer, Frank – Educational Research and Evaluation, 2017
Education systems increasingly emphasize the importance of scientific reasoning skills such as "generating hypotheses" and "evaluating evidence." Despite this importance, we do not know which tests of scientific reasoning exist, which skills they emphasize, how they conceptualize scientific reasoning, and how well they are…
Descriptors: Thinking Skills, Logical Thinking, Science Process Skills, Science Instruction
Çokluk, Ömay; Gül, Emrah; Dogan-Gül, Çilem – Educational Sciences: Theory and Practice, 2016
The study aims to examine whether differential item function is displayed in three different test forms that have item orders of random and sequential versions (easy-to-hard and hard-to-easy), based on Classical Test Theory (CTT) and Item Response Theory (IRT) methods and bearing item difficulty levels in mind. In the correlational research, the…
Descriptors: Test Bias, Test Items, Difficulty Level, Test Theory