Publication Date
In 2025 | 34 |
Since 2024 | 91 |
Since 2021 (last 5 years) | 277 |
Since 2016 (last 10 years) | 539 |
Since 2006 (last 20 years) | 842 |
Descriptor
Test Items | 1384 |
Test Validity | 1384 |
Test Construction | 693 |
Test Reliability | 663 |
Foreign Countries | 386 |
Item Analysis | 297 |
Difficulty Level | 239 |
Psychometrics | 225 |
Item Response Theory | 192 |
Scores | 178 |
Factor Analysis | 167 |
More ▼ |
Source
Author
Schoen, Robert C. | 8 |
Stansfield, Charles W. | 7 |
Baghaei, Purya | 5 |
Hambleton, Ronald K. | 5 |
LaVenia, Mark | 5 |
Roid, Gale | 5 |
Wainer, Howard | 5 |
Bejar, Isaac I. | 4 |
Bennett, Randy Elliot | 4 |
Benson, Jeri | 4 |
Filby, Nikola N. | 4 |
More ▼ |
Publication Type
Education Level
Audience
Practitioners | 43 |
Researchers | 38 |
Teachers | 28 |
Administrators | 14 |
Students | 5 |
Support Staff | 3 |
Community | 2 |
Parents | 2 |
Counselors | 1 |
Policymakers | 1 |
Location
Turkey | 59 |
Indonesia | 23 |
Canada | 22 |
Iran | 22 |
Australia | 21 |
California | 19 |
Germany | 19 |
China | 17 |
Florida | 14 |
United Kingdom | 14 |
Japan | 12 |
More ▼ |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Shadi Noroozi; Hossein Karami – Language Testing in Asia, 2024
Recently, psychometricians and researchers have voiced their concern over the exploration of language test items in light of Messick's validation framework. Validity has been central to test development and use; however, it has not received due attention in language tests having grave consequences for test takers. The present study sought to…
Descriptors: Foreign Countries, Doctoral Students, Graduate Students, Language Proficiency
Eyüp Yurt – International Journal of Education in Mathematics, Science and Technology, 2025
This study aimed to develop and validate the Creative Problem-Solving Skills Test (CPSS-T), grounded in Torrance's creativity theory, to assess these skills in university students. The CPSS-T consists of five open-ended question types, each designed to measure different aspects of creative problem-solving: Alternative Use, Hypothetical Scenario,…
Descriptors: Creativity Tests, Creativity, Creative Thinking, Problem Solving
Fadime Hatice Inci; Ferhat Çelik – Psychology in the Schools, 2025
The aim of this study is to examine the validity, reliability, and responsiveness of the Turkish version of the Adolescent Health Promotion-Short Form (AHP-SF). This cross-sectional study was completed with 1483 students. Confirmatory factor analysis (CFA) supported the construct validity of the scale, demonstrating a good model fit with…
Descriptors: Foreign Countries, Measures (Individuals), Adolescents, Health Promotion
Sarah K. Cowan; Michael Hout; Stuart Perrett – Sociological Methods & Research, 2024
Long-running surveys need a systematic way to reflect social change and to keep items relevant to respondents, especially when they ask about controversial subjects, or they threaten the items' validity. We propose a protocol for updating measures that preserves content and construct validity. First, substantive experts articulate the current and…
Descriptors: Surveys, Public Opinion, Social Attitudes, Pregnancy
Hartono, Wahyu; Hadi, Samsul; Rosnawati, Raden; Retnawati, Heri – Pegem Journal of Education and Instruction, 2023
Researchers design diagnostic assessments to measure students' knowledge structures and processing skills to provide information about their cognitive attribute. The purpose of this study is to determine the instrument's validity and score reliability, as well as to investigate the use of classical test theory to identify item characteristics. The…
Descriptors: Diagnostic Tests, Test Validity, Item Response Theory, Content Validity
Purwo Susongko; Chockchai Yuenyong; Mobinta Kusuma; Yuni Arfiani – Pegem Journal of Education and Instruction, 2024
This study aims to develop Nature of Science (NOS) measurement instrument for high school students in the Mathematics and Natural Sciences program using COVID-19 cases that had occurred since the end of 2019. This study also attempts to validate test items with the Rasch model approach. The participants in this study are 194 high school students…
Descriptors: Test Construction, Science Tests, Scientific Principles, COVID-19
Endang Susantini; Yurizka Melia Sari; Prima Vidya Asteria; Muhammad Ilyas Marzuqi – Journal of Education and Learning (EduLearn), 2025
Assessing preservice' higher order thinking skills (HOTS) in science and mathematics is essential. Teachers' HOTS ability is closely related to their ability to create HOTS-type science and mathematics problems. Among various types of HOTS, one is Bloomian HOTS. To facilitate the preservice teacher to create problems in those subjects, an Android…
Descriptors: Content Validity, Mathematics Instruction, Decision Making, Thinking Skills
Anne Traynor; Sara C. Christopherson – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Meyer, J. Patrick; Hu, Ann; Li, Sylvia – NWEA, 2023
The Content Proximity Project was designed to improve the content validity of the MAP® Growth™ assessments while retaining the ability for the test to adapt off-grade and meet students wherever they are in their learning. Two main features of the project were the development of an enhanced item selection algorithm, and a spring pilot study…
Descriptors: Achievement Tests, Mathematics Achievement, Content Validity, Mathematics Tests
Camilla M. McMahon; Maryellen Brunson McClain; Savannah Wells; Sophia Thompson; Jeffrey D. Shahidullah – Journal of Autism and Developmental Disorders, 2025
Purpose: The goal of the current study was to conduct a substantive validity review of four autism knowledge assessments with prior psychometric support (Gillespie-Lynch in J Autism and Dev Disord 45(8):2553-2566, 2015; Harrison in J Autism and Dev Disord 47(10):3281-3295, 2017; McClain in J Autism and Dev Disord 50(3):998-1006, 2020; McMahon…
Descriptors: Measures (Individuals), Psychometrics, Test Items, Accuracy
Güntay Tasçi – Science Insights Education Frontiers, 2024
The present study has aimed to develop and validate a protein concept inventory (PCI) consisting of 25 multiple-choice (MC) questions to assess students' understanding of protein, which is a fundamental concept across different biology disciplines. The development process of the PCI involved a literature review to identify protein-related content,…
Descriptors: Science Instruction, Science Tests, Multiple Choice Tests, Biology
Anna Planas-Lladó; Xavier Úcar – American Journal of Evaluation, 2024
Empowerment is a concept that has become increasingly used over recent years. However, little research has been undertaken into how empowerment can be evaluated, particularly in the case of young people. The aim of this article is to present an inventory of dimensions and indicators of youth empowerment. The article describes the various phases in…
Descriptors: Youth, Empowerment, Test Construction, Test Validity
Yi Zou; Ying Zheng; Jingwen Wang – International Journal of Language Testing, 2025
The Pearson Test of English Academic (PTE-A), a widely used high-stakes language proficiency test for university admissions and migration purposes, underwent a notable change from a three-hour to a two-hour version in November 2021. The implementation of the new version has prompted inquiries into the washback effects on various stakeholders.…
Descriptors: Testing Problems, Test Preparation, High Stakes Tests, English (Second Language)
Sherwin E. Balbuena – Online Submission, 2024
This study introduces a new chi-square test statistic for testing the equality of response frequencies among distracters in multiple-choice tests. The formula uses the information from the number of correct answers and wrong answers, which becomes the basis of calculating the expected values of response frequencies per distracter. The method was…
Descriptors: Multiple Choice Tests, Statistics, Test Validity, Testing
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods