Publication Date
  In 2025: 0
  Since 2024: 2
  Since 2021 (last 5 years): 16
  Since 2016 (last 10 years): 32
  Since 2006 (last 20 years): 72
Descriptor
  Licensing Examinations (Professions): 153
  Test Items: 153
  Test Construction: 49
  Difficulty Level: 34
  Item Response Theory: 33
  Multiple Choice Tests: 31
  Higher Education: 27
  Certification: 24
  Scores: 24
  Cutting Scores: 23
  Standard Setting (Scoring): 22
Publication Type
  Reports - Research: 95
  Journal Articles: 79
  Speeches/Meeting Papers: 52
  Reports - Evaluative: 44
  Reports - Descriptive: 8
  Dissertations/Theses -…: 4
  Tests/Questionnaires: 2
  Guides - General: 1
Location
  Canada: 5
  Tennessee: 4
  Saudi Arabia: 3
  Arizona: 2
  California: 2
  United States: 2
  Australia: 1
  China: 1
  Colorado: 1
  Florida: 1
  Georgia: 1
Laws, Policies, & Programs
  Comprehensive Education…: 2
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2022
Administrative problems such as computer malfunctions and power outages occasionally lead to missing item scores, and hence to incomplete data, on credentialing tests such as the United States Medical Licensing Examination. Feinberg compared four approaches for reporting pass-fail decisions to the examinees with incomplete data on credentialing…
Descriptors: Testing Problems, High Stakes Tests, Credentials, Test Items
Emily K. Toutkoushian; Huaping Sun; Mark T. Keegan; Ann E. Harman – Measurement: Interdisciplinary Research and Perspectives, 2024
Linear logistic test models (LLTMs), leveraging item response theory and linear regression, offer an elegant method for learning about item characteristics in complex content areas. This study used LLTMs to model single-best-answer, multiple-choice-question response data from two medical subspecialty certification examinations in multiple years…
Descriptors: Licensing Examinations (Professions), Certification, Medical Students, Test Items
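An LLTM explains Rasch item difficulties as a weighted sum of item-feature effects. As a rough illustration (not the authors' estimation code; a full LLTM fits the feature weights inside the IRT likelihood), here is a two-stage least-squares approximation with made-up difficulties and a made-up feature matrix:

```python
# A minimal two-stage approximation of an LLTM-style analysis.
# Item difficulties and the Q matrix are invented example values.
import numpy as np

# Q[i, k] = 1 if item i involves feature k (e.g., content area, cognitive level)
Q = np.array([
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
])
b = np.array([0.8, 1.1, 0.4, 0.3, 0.7])  # estimated Rasch item difficulties

# Least-squares estimate of feature weights eta, so that b ≈ Q @ eta
eta, *_ = np.linalg.lstsq(Q, b, rcond=None)
print("feature weights:", eta.round(3))
print("reconstructed difficulties:", (Q @ eta).round(3))
```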
Alahmadi, Sarah; Jones, Andrew T.; Barry, Carol L.; Ibáñez, Beatriz – Applied Measurement in Education, 2023
Rasch common-item equating is often used in high-stakes testing to maintain equivalent passing standards across test administrations. If unaddressed, item parameter drift poses a major threat to the accuracy of Rasch common-item equating. We compared the performance of well-established and newly developed drift detection methods in small and large…
Descriptors: Equated Scores, Item Response Theory, Sample Size, Test Items
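One widely used screen for the drift problem the abstract describes is to re-estimate the equating constant after dropping anchor items whose residual displacement exceeds a cutoff. The sketch below is a minimal version of that idea; the 0.5-logit threshold and the toy difficulty values are illustrative choices, not the paper's methods:

```python
# Iterative displacement screen for item parameter drift in
# Rasch common-item equating (toy values, conventional 0.5-logit rule).
import numpy as np

b_old = np.array([-1.2, -0.4, 0.0, 0.5, 1.1, 1.8])   # bank difficulties
b_new = np.array([-1.1, -0.5, 0.9, 0.6, 1.2, 1.9])   # new-form estimates

def flag_drift(b_old, b_new, threshold=0.5, max_iter=10):
    keep = np.ones(len(b_old), dtype=bool)
    shift = 0.0
    for _ in range(max_iter):
        shift = (b_new[keep] - b_old[keep]).mean()   # equating constant
        disp = np.abs((b_new - b_old) - shift)       # residual displacement
        new_keep = keep & (disp <= threshold)
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep                              # drop drifting anchors, refit
    return keep, shift

keep, shift = flag_drift(b_old, b_new)
print("equating constant:", round(shift, 3))
print("stable anchor items:", np.where(keep)[0])     # item 2 is flagged as drifting
```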
Sinharay, Sandip – Grantee Submission, 2021
Drasgow, Levine, and Zickar (1996) suggested a statistic based on the Neyman-Pearson lemma (e.g., Lehmann & Romano, 2005, p. 60) for detecting preknowledge on a known set of items. The statistic is a special case of the optimal appropriateness indices of Levine and Drasgow (1988) and is the most powerful statistic for detecting item…
Descriptors: Robustness (Statistics), Hypothesis Testing, Statistics, Test Items
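The statistic in question is a log-likelihood ratio: the likelihood of the responses on the suspected items under a preknowledge model versus under the fitted IRT model. A minimal sketch, assuming a Rasch model and a fixed success probability of 0.95 under preknowledge (both illustrative simplifications):

```python
# Likelihood-ratio (Neyman-Pearson) check for preknowledge on a known
# compromised item subset. All values are illustrative, not from the paper.
import numpy as np

def rasch_p(theta, b):
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def np_lr_statistic(x, theta, b, compromised, p_know=0.95):
    """Log-likelihood ratio on the compromised items only.

    x: 0/1 responses, b: item difficulties, compromised: boolean mask.
    Large positive values favor the preknowledge hypothesis.
    """
    p_null = rasch_p(theta, b[compromised])          # honest responding
    p_alt = np.full(p_null.shape, p_know)            # preknowledge model
    xc = x[compromised]
    ll_null = np.sum(xc * np.log(p_null) + (1 - xc) * np.log(1 - p_null))
    ll_alt = np.sum(xc * np.log(p_alt) + (1 - xc) * np.log(1 - p_alt))
    return ll_alt - ll_null

b = np.array([-0.5, 0.2, 0.9, 1.5, 2.0])
x = np.array([1, 1, 1, 1, 1])                        # all correct
mask = np.array([False, False, True, True, True])    # suspected items
print(round(np_lr_statistic(x, theta=-0.5, b=b, compromised=mask), 3))
```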
Sakworawich, Arnond; Wainer, Howard – Journal of Educational and Behavioral Statistics, 2020
Test scoring models vary in their generality; some even adjust for examinees answering multiple-choice items correctly by accident (guessing), but no models that we are aware of automatically adjust an examinee's score when there is internal evidence of cheating. In this study, we use a combination of jackknife technology with an adaptive robust…
Descriptors: Scoring, Cheating, Test Items, Licensing Examinations (Professions)
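As a rough illustration of the jackknife ingredient (only that ingredient; the authors' adaptive robust estimator is not reproduced here), one can re-estimate ability with each item deleted in turn and inspect how strongly single items pull the estimate:

```python
# Leave-one-item-out (jackknife) ability estimates under a Rasch model.
# Responses and difficulties are invented; the last item is an unlikely success.
import numpy as np
from scipy.optimize import minimize_scalar

def rasch_nll(theta, x, b):
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

def mle_theta(x, b):
    return minimize_scalar(rasch_nll, args=(x, b), bounds=(-6, 6),
                           method="bounded").x

b = np.array([-1.5, -0.5, 0.0, 0.5, 1.0, 1.5, 2.5])
x = np.array([1, 1, 1, 0, 0, 0, 1])   # note the unlikely success on the last item

theta_full = mle_theta(x, b)
jack = np.array([mle_theta(np.delete(x, i), np.delete(b, i))
                 for i in range(len(b))])
print("full-data theta:", round(theta_full, 2))
print("leave-one-out thetas:", jack.round(2))  # the last item moves the estimate most
```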
Puhan, Gautam; Kim, Sooyeon – Journal of Educational Measurement, 2022
As a result of the COVID-19 pandemic, at-home testing has become a popular delivery mode in many testing programs. When programs offer at-home testing to expand their service, the score comparability between test takers testing remotely and those testing in a test center is critical. This article summarizes statistical procedures that could be…
Descriptors: Scores, Scoring, Comparative Analysis, Testing
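The abstract does not enumerate the procedures, but as a placeholder, the most basic comparability check is a standardized mean difference between delivery modes; operational analyses would also condition on examinee covariates before attributing any gap to mode:

```python
# Standardized mean difference between remote and test-center scores
# on simulated data (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
center = rng.normal(500, 100, size=400)   # simulated test-center scores
remote = rng.normal(505, 100, size=400)   # simulated at-home scores

pooled_sd = np.sqrt((center.var(ddof=1) + remote.var(ddof=1)) / 2)
d = (remote.mean() - center.mean()) / pooled_sd
print(f"standardized mean difference: {d:.3f}")
```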
Jang, Jung Un; Kim, Eun Joo – Journal of Curriculum and Teaching, 2022
This study examines the validity of pen-and-paper and smart-device-based versions of an optician's examination. The questions developed for each medium were based on the national optician's simulation test. The subjects of this study were 60 students enrolled at E University. The data analysis was performed to verify the equivalence of the two…
Descriptors: Optometry, Licensing Examinations (Professions), Test Format, Test Validity
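One simple way to check score equivalence across two test media, sketched here on simulated data for 60 examinees (the study's actual analysis may differ), is a paired comparison of the same students' scores on both versions:

```python
# Paired comparison of scores on two test media (simulated data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_ability = rng.normal(70, 8, size=60)            # 60 students, as in the study
paper = true_ability + rng.normal(0, 3, size=60)     # pen-and-paper scores
device = true_ability + rng.normal(0, 3, size=60)    # smart-device scores

t, p = stats.ttest_rel(paper, device)
print(f"paired t = {t:.2f}, p = {p:.3f}")            # no built-in mode effect here
```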
Betts, Joe; Muntean, William; Kim, Doyoung; Kao, Shu-chuan – Educational and Psychological Measurement, 2022
The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw…
Descriptors: Scoring, Test Items, Test Format, Raw Scores
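Three generic raw-score rules for a multiple-response item, named here informally rather than after the paper's taxonomy, can be sketched as follows:

```python
# Three common raw-score rules for multiple-response items:
# all-or-nothing, per-option partial credit, and plus/minus with a floor at zero.
def score_all_or_nothing(selected, key):
    return 1.0 if selected == key else 0.0

def score_partial(selected, key, n_options):
    # credit for each option handled correctly (selected iff keyed)
    agree = sum((opt in selected) == (opt in key) for opt in range(n_options))
    return agree / n_options

def score_plus_minus(selected, key):
    raw = len(selected & key) - len(selected - key)
    return max(raw, 0) / len(key)

key = {0, 2, 3}          # keyed options of a 5-option item
resp = {0, 2, 4}         # examinee picked two right, one wrong
print(score_all_or_nothing(resp, key))   # 0.0
print(score_partial(resp, key, 5))       # 0.6
print(score_plus_minus(resp, key))       # (2 - 1) / 3 ≈ 0.333
```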
Peabody, Michael R. – Measurement: Interdisciplinary Research and Perspectives, 2023
Many organizations use some form of automation in the test assembly process, whether fully algorithmic or heuristically constructed. However, one issue with heuristic models is that when the test assembly problem changes, the entire model may need to be re-conceptualized and recoded. In contrast, mixed-integer programming (MIP) is a mathematical…
Descriptors: Programming Languages, Algorithms, Heuristics, Mathematical Models
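A toy MIP assembly model in the spirit of the abstract, written with the open-source PuLP package (an assumption made for illustration; the paper is not tied to any particular solver): select 3 of 6 items to maximize information at the cut score subject to a content constraint.

```python
# Toy mixed-integer programming model for automated test assembly.
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary, PULP_CBC_CMD

info = [0.9, 0.7, 0.8, 0.4, 0.6, 0.5]      # item information at the cut score
content = [0, 0, 1, 1, 0, 1]               # content area of each item

prob = LpProblem("test_assembly", LpMaximize)
x = [LpVariable(f"x{i}", cat=LpBinary) for i in range(6)]

prob += lpSum(info[i] * x[i] for i in range(6))                # objective
prob += lpSum(x) == 3                                          # test length
prob += lpSum(x[i] for i in range(6) if content[i] == 1) >= 1  # content coverage

prob.solve(PULP_CBC_CMD(msg=False))
print("selected items:", [i for i in range(6) if x[i].value() == 1])
```

The advantage the abstract points to is visible even at this scale: changing the assembly problem means editing or adding constraint lines, not redesigning a heuristic.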
Wolkowitz, Amanda A.; Foley, Brett; Zurn, Jared – Practical Assessment, Research & Evaluation, 2023
The purpose of this study is to introduce a method for converting scored 4-option multiple-choice (MC) items into scored 3-option MC items without re-pretesting the 3-option MC items. This study describes a six-step process for achieving this goal. Data from a professional credentialing exam were used in this study, and the method was applied to 24…
Descriptors: Multiple Choice Tests, Test Items, Accuracy, Test Format
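The six-step method itself is not spelled out in the abstract, but the core fact any such conversion must handle is that dropping one of four options raises the blind-guessing rate from 1/4 to 1/3, so a 4-option p-value cannot be reused as-is. A crude illustrative adjustment (not the authors' procedure) rescales the above-chance part of the p-value:

```python
# Map a 4-option item p-value to an approximate 3-option p-value,
# assuming the simple chance-rate model of guessing (illustrative only).
def adjust_p_value(p4, c4=0.25, c3=1/3):
    mastery = (p4 - c4) / (1 - c4)       # proportion answering from knowledge
    return c3 + (1 - c3) * mastery

for p4 in (0.40, 0.60, 0.80):
    print(f"p(4-option) = {p4:.2f} -> p(3-option) ≈ {adjust_p_value(p4):.3f}")
```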
Dianne S. McCarthy – Journal of Inquiry and Action in Education, 2023
Teacher certification exams are supposed to assess whether a student is likely to succeed in teaching. What if an exam seems to be inappropriate? This article is an inquiry into the New York State Content Specialty Test for Early Childhood Candidates, particularly the math section. It raises the issue of whether we are asking the right questions and…
Descriptors: Teacher Certification, Licensing Examinations (Professions), Preservice Teachers, Early Childhood Teachers
Feinberg, Richard A. – Educational Measurement: Issues and Practice, 2021
Unforeseen complications during the administration of large-scale testing programs are inevitable and can prevent examinees from accessing all test material. For classification tests in which the primary purpose is to yield a decision, such as a pass/fail result, the current study investigated a model-based standard error approach, Bayesian…
Descriptors: High Stakes Tests, Classification, Decision Making, Bayesian Statistics
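A simplified version of the kind of decision rule at issue: score only the items the examinee reached, attach a model-based standard error, and report pass/fail only when the ability estimate is decisively on one side of the cut. The Rasch model and the z = 1.645 criterion are illustrative choices, not the paper's:

```python
# Pass/fail classification from incomplete response data with a
# model-based standard error (illustrative sketch).
import numpy as np
from scipy.optimize import minimize_scalar

def rasch_nll(theta, x, b):
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

def classify(x, b, cut, z=1.645):
    answered = ~np.isnan(x)
    xa, ba = x[answered].astype(int), b[answered]
    theta = minimize_scalar(rasch_nll, args=(xa, ba), bounds=(-6, 6),
                            method="bounded").x
    p = 1.0 / (1.0 + np.exp(-(theta - ba)))
    se = 1.0 / np.sqrt(np.sum(p * (1 - p)))      # 1 / sqrt(test information)
    if theta - z * se > cut:
        return theta, se, "pass"
    if theta + z * se < cut:
        return theta, se, "fail"
    return theta, se, "indeterminate: too few items"

rng = np.random.default_rng(2)
b = np.linspace(-2, 2, 40)
x = (rng.random(40) < 0.7).astype(float)
x[25:] = np.nan                                  # outage: last 15 items lost
theta, se, decision = classify(x, b, cut=0.0)
print(f"theta = {theta:.2f}, SE = {se:.2f}, decision: {decision}")
```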
Lee, Chansoon; Qian, Hong – Educational and Psychological Measurement, 2022
Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential…
Descriptors: Computer Assisted Testing, Adaptive Testing, Licensing Examinations (Professions), Item Response Theory
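A bare-bones sequential check in the same spirit: monitor an item's stream of 0/1 responses and raise a flag when a one-sided CUSUM of upward departures from the item's baseline p-value crosses a threshold. The baseline, slack k, and threshold h below are illustrative choices, not the authors' hybrid procedure:

```python
# One-sided CUSUM on an item's response stream to flag possible compromise.
import numpy as np

def cusum_flag(responses, baseline, k=0.05, h=2.0):
    """Returns the index of the first alarm, or -1 if none."""
    s = 0.0
    for t, x in enumerate(responses):
        s = max(0.0, s + (x - baseline - k))
        if s > h:
            return t
    return -1

rng = np.random.default_rng(3)
clean = rng.random(200) < 0.60                  # item behaves at baseline
leaked = rng.random(200) < 0.85                 # item suddenly much easier
responses = np.concatenate([clean, leaked]).astype(float)

print("alarm at response #", cusum_flag(responses, baseline=0.60))
```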
Tang, Xiaodan; Schultz, Matthew – Practical Assessment, Research & Evaluation, 2020
This study examines the potential impact of reusing simulation-based items in a high-stakes standardized assessment on repeat examinees' performance. We examined changes in item scores, ability estimates, score patterns, and response times, and compared the performance of repeat examinees who received repeated items with those who…
Descriptors: Test Items, Repetition, Simulation, Standardized Tests
Babcock, Ben; Siegel, Zachary D. – Practical Assessment, Research & Evaluation, 2022
Research on repeated testing has revealed that retaking the same exam form generally does not advantage or disadvantage failing candidates on selected-response credentialing exams. Feinberg, Raymond, and Haist (2015) found a contributing factor to this phenomenon: people answering items incorrectly on both attempts give the same incorrect…
Descriptors: Multiple Choice Tests, Item Analysis, Test Items, Response Style (Tests)
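The Feinberg, Raymond, and Haist (2015) observation the abstract builds on can be made concrete with a small invented example: among items missed on both attempts, count how often the same wrong option was chosen.

```python
# Agreement of incorrect option choices across two attempts (invented data).
import numpy as np

key    = np.array(list("ABCDABCDAB"))
first  = np.array(list("ACCDBBCDCB"))   # attempt 1 responses
second = np.array(list("ACCDABCDDB"))   # attempt 2 responses

wrong_both = (first != key) & (second != key)
same_wrong = wrong_both & (first == second)

print("items wrong on both attempts:", wrong_both.sum())
print("same incorrect option chosen:", same_wrong.sum())
print("agreement rate:", same_wrong.sum() / wrong_both.sum())
```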