Publication Date
  In 2025: 17
  Since 2024: 46
  Since 2021 (last 5 years): 143
  Since 2016 (last 10 years): 301
  Since 2006 (last 20 years): 621
Descriptor
  Psychometrics: 819
  Test Items: 819
  Test Construction: 299
  Item Response Theory: 291
  Test Reliability: 224
  Test Validity: 223
  Foreign Countries: 206
  Item Analysis: 160
  Difficulty Level: 154
  Scores: 137
  Models: 113
Author
  Gierl, Mark J.: 13
  Dorans, Neil J.: 8
  Liu, Ou Lydia: 7
  Schoen, Robert C.: 7
  Reckase, Mark D.: 6
  Bejar, Isaac I.: 5
  Embretson, Susan E.: 5
  Katz, Irvin R.: 5
  Mislevy, Robert J.: 5
  Sinharay, Sandip: 5
  Baghaei, Purya: 4
Location
  Turkey: 20
  Canada: 17
  Germany: 15
  United States: 12
  China: 10
  Australia: 9
  Taiwan: 9
  Florida: 7
  Netherlands: 7
  South Korea: 7
  Nigeria: 6
Laws, Policies, & Programs
  No Child Left Behind Act 2001: 8
  Individuals with Disabilities…: 2
  Elementary and Secondary…: 1
  Lau v Nichols: 1
  National Defense Education Act: 1
  Race to the Top: 1
Marcoulides, Katerina M. – Measurement: Interdisciplinary Research and Perspectives, 2023
Integrative data analyses have recently been shown to be an effective tool for researchers interested in synthesizing datasets from multiple studies in order to draw statistical or substantive conclusions. The actual process of integrating the different datasets depends on the availability of some common measures or items reflecting the same…
Descriptors: Data Analysis, Synthesis, Test Items, Simulation
Camilla M. McMahon; Maryellen Brunson McClain; Savannah Wells; Sophia Thompson; Jeffrey D. Shahidullah – Journal of Autism and Developmental Disorders, 2025
Purpose: The goal of the current study was to conduct a substantive validity review of four autism knowledge assessments with prior psychometric support (Gillespie-Lynch in J Autism and Dev Disord 45(8):2553-2566, 2015; Harrison in J Autism and Dev Disord 47(10):3281-3295, 2017; McClain in J Autism and Dev Disord 50(3):998-1006, 2020; McMahon…
Descriptors: Measures (Individuals), Psychometrics, Test Items, Accuracy
Aiman Mohammad Freihat; Omar Saleh Bani Yassin – Educational Process: International Journal, 2025
Background/purpose: This study aimed to reveal the accuracy of estimation of multiple-choice test items parameters following the models of the item-response theory in measurement. Materials/methods: The researchers depended on the measurement accuracy indicators, which express the absolute difference between the estimated and actual values of the…
Descriptors: Accuracy, Computation, Multiple Choice Tests, Test Items
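The accuracy indicator this abstract describes — the absolute difference between estimated and actual item parameter values — can be sketched in a few lines. The snippet below is an invented illustration, not the authors' actual procedure: the choice of a 2PL model and all parameter values are assumptions for demonstration only.

```python
import math

def two_pl_probability(theta, a, b):
    """2PL IRT model: probability of a correct response at ability theta,
    given discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def parameter_recovery_error(true_params, estimated_params):
    """Mean absolute difference between true and estimated item parameters,
    the kind of accuracy indicator the study describes."""
    diffs = [abs(t - e) for t, e in zip(true_params, estimated_params)]
    return sum(diffs) / len(diffs)

# Hypothetical difficulty (b) parameters for five multiple-choice items
true_b      = [-1.2, -0.5, 0.0, 0.6, 1.3]
estimated_b = [-1.1, -0.4, 0.1, 0.5, 1.5]
print(round(parameter_recovery_error(true_b, estimated_b), 3))  # → 0.12
```

In a real simulation study the estimated values would come from fitting an IRT model to generated response data; here they are hard-coded to keep the accuracy computation itself visible.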
Kaja Haugen; Cecilie Hamnes Carlsen; Christine Möller-Omrani – Language Awareness, 2025
This article presents the process of constructing and validating a test of metalinguistic awareness (MLA) for young school children (age 8-10). The test was developed between 2021 and 2023 as part of the MetaLearn research project, financed by The Research Council of Norway. The research team defines MLA as using metalinguistic knowledge at a…
Descriptors: Language Tests, Test Construction, Elementary School Students, Metalinguistics
Harold Doran; Testsuhiro Yamada; Ted Diaz; Emre Gonulates; Vanessa Culver – Journal of Educational Measurement, 2025
Computer adaptive testing (CAT) is an increasingly common mode of test administration offering improved test security, better measurement precision, and the potential for shorter testing experiences. This article presents a new item selection algorithm based on a generalized objective function to support multiple types of testing conditions and…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Algorithms
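The article's generalized objective function is not reproduced in this snippet; shown instead is the classic maximum-information rule that such CAT item-selection algorithms typically generalize. The item pool and ability value are invented for illustration.

```python
import math

def fisher_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_next_item(theta, pool, administered):
    """Pick the not-yet-administered item with maximum information at theta."""
    candidates = [(i, fisher_information(theta, a, b))
                  for i, (a, b) in enumerate(pool) if i not in administered]
    return max(candidates, key=lambda t: t[1])[0]

# Hypothetical pool of (discrimination, difficulty) pairs
pool = [(1.0, -1.0), (1.5, 0.0), (0.8, 0.5), (2.0, 1.2)]
print(select_next_item(0.0, pool, administered={3}))  # → 1
```

The highly discriminating item whose difficulty sits closest to the current ability estimate wins, which is why pure maximum-information selection tends to overexpose a few items — one motivation for the broader objective functions the article studies.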
Fu, Yanyan; Choe, Edison M.; Lim, Hwanggyu; Choi, Jaehwa – Educational Measurement: Issues and Practice, 2022
This case study applied the "weak theory" of Automatic Item Generation (AIG) to generate isomorphic item instances (i.e., unique but psychometrically equivalent items) for a large-scale assessment. Three representative instances were selected from each item template (i.e., model) and pilot-tested. In addition, a new analytical framework,…
Descriptors: Test Items, Measurement, Psychometrics, Test Construction
Deschênes, Marie-France; Dionne, Éric; Dorion, Michelle; Grondin, Julie – Practical Assessment, Research & Evaluation, 2023
The use of the aggregate scoring method for scoring concordance tests requires the weighting of test items to be derived from the performance of a group of experts who take the test under the same conditions as the examinees. However, the average score of experts constituting the reference panel remains a critical issue in the use of these tests.…
Descriptors: Scoring, Tests, Evaluation Methods, Test Items
Yanxuan Qu; Sandip Sinharay – ETS Research Report Series, 2023
Though a substantial amount of research exists on imputing missing scores in educational assessments, there is little research on cases where responses or scores to an item are missing for all test takers. In this paper, we tackled the problem of imputing missing scores for tests for which the responses to an item are missing for all test takers.…
Descriptors: Scores, Test Items, Accuracy, Psychometrics
Berenbon, Rebecca F.; McHugh, Bridget C. – Educational Measurement: Issues and Practice, 2023
To assemble a high-quality test, psychometricians rely on subject matter experts (SMEs) to write high-quality items. However, SMEs are not typically given the opportunity to provide input on which content standards are most suitable for multiple-choice questions (MCQs). In the present study, we explored the relationship between perceived MCQ…
Descriptors: Test Items, Multiple Choice Tests, Standards, Difficulty Level
Ntumi, Simon; Agbenyo, Sheilla; Bulala, Tapela – Shanlax International Journal of Education, 2023
There is no need or point in testing the knowledge, attributes, traits, behaviours, or abilities of an individual if the information obtained from the test is inaccurate. However, by and large, the estimation of the psychometric properties of test items in classrooms appears to have been largely ignored, or to be slowly dying out, in most testing environments. In…
Descriptors: Psychometrics, Accuracy, Test Validity, Factor Analysis
Yunting Liu; Shreya Bhandari; Zachary A. Pardos – British Journal of Educational Technology, 2025
Effective educational measurement relies heavily on the curation of well-designed item pools. However, item calibration is time consuming and costly, requiring a sufficient number of respondents to estimate the psychometric properties of items. In this study, we explore the potential of six different large language models (LLMs; GPT-3.5, GPT-4,…
Descriptors: Artificial Intelligence, Test Items, Psychometrics, Educational Assessment
Lin Ma – ProQuest LLC, 2024
This dissertation presents an innovative approach to examining the keying method, wording method, and construct validity on psychometric instruments. By employing a mixed methods explanatory sequential design, the effects of keying and wording in two psychometric assessments were examined and validated. Those two self-report psychometric…
Descriptors: Evaluation, Psychometrics, Measures (Individuals), Instrumentation
Hauke Hermann; Annemieke Witte; Gloria Kempelmann; Brian F. Barrett; Sandra Zaal; Jolanda Vonk; Filip Morisse; Anna Pöhlmann; Paula S. Sterkenburg; Tanja Sappok – Journal of Applied Research in Intellectual Disabilities, 2024
Background: Valid and reliable instruments for measuring emotional development are critical for a proper diagnostic assignment in individuals with intellectual disabilities. This exploratory study examined the psychometric properties of the items on the Scale of Emotional Development--Short (SED-S). Method: The sample included 612 adults with…
Descriptors: Measures (Individuals), Emotional Development, Intellectual Disability, Psychometrics
Mingjia Ma – ProQuest LLC, 2023
Response time is an important research topic in the field of psychometrics. This dissertation tries to explore some response time properties across several item characteristics and examinee characteristics, as well as the interactions between response time and response outcomes, using data from a statewide mathematics assessment in two grades.…
Descriptors: Reaction Time, Mathematics Tests, Standardized Tests, State Standards
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
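Cronbach's alpha, the first of the classical reliability measures this study compares, can be computed directly from an examinee-by-item score matrix. The sketch below uses an invented 5-person, 3-item matrix; it illustrates the formula alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), not the study's data.

```python
def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix of rows (persons) x columns (items)."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical scores: 5 examinees x 3 items
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
    [5, 4, 5],
]
print(round(cronbach_alpha(scores), 3))  # → 0.956
```

Test-retest and parallel-forms reliability, by contrast, need two administrations rather than one matrix, which is part of why single-administration internal-consistency indices like alpha are so widely reported.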