Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 7 |
Since 2006 (last 20 years) | 13 |
Descriptor
Psychometrics | 16 |
Simulation | 16 |
Test Reliability | 16 |
Test Items | 7 |
Test Validity | 6 |
Item Response Theory | 5 |
Scores | 5 |
Test Construction | 4 |
Adaptive Testing | 3 |
Computation | 3 |
Difficulty Level | 3 |
More ▼ |
Source
Author
Edwards, Michael C. | 2 |
Guo, Hongwen | 2 |
Azaan Vhora | 1 |
Betz, Nancy E. | 1 |
Borsman, Denny | 1 |
Davis, John L. | 1 |
Doolen, Jessica | 1 |
Flora, David B. | 1 |
Frankel, Lois | 1 |
Galindo-Garre, Francisca | 1 |
Harshman, Jordan | 1 |
More ▼ |
Publication Type
Journal Articles | 13 |
Reports - Research | 12 |
Reports - Evaluative | 2 |
Dissertations/Theses -… | 1 |
Reports - Descriptive | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Higher Education | 2 |
Postsecondary Education | 2 |
Elementary Secondary Education | 1 |
High Schools | 1 |
Secondary Education | 1 |
Audience
Location
Australia | 1 |
Germany | 1 |
North America | 1 |
Sweden | 1 |
United Kingdom | 1 |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Azaan Vhora; Ryan L. Davies; Kylie Rice – Psychology Learning and Teaching, 2024
Background: Objective Structured Clinical Examinations (OSCEs) are a simulation-based assessment tool used extensively in medical education for evaluating clinical competence. OSCEs are widely regarded as more valid, reliable, and valuable compared to traditional assessment measures, and are now emerging within professional psychology training…
Descriptors: Psychology, Higher Education, Psychometrics, Objective Tests
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Madison, Matthew J. – Educational Measurement: Issues and Practice, 2019
Recent advances have enabled diagnostic classification models (DCMs) to accommodate longitudinal data. These longitudinal DCMs were developed to study how examinees change, or transition, between different attribute mastery statuses over time. This study examines using longitudinal DCMs as an approach to assessing growth and serves three purposes:…
Descriptors: Longitudinal Studies, Item Response Theory, Psychometrics, Criterion Referenced Tests
Raborn, Anthony W.; Leite, Walter L.; Marcoulides, Katerina M. – International Educational Data Mining Society, 2019
Short forms of psychometric scales have been commonly used in educational and psychological research to reduce the burden of test administration. However, it is challenging to select items for a short form that preserve the validity and reliability of the scores of the original scale. This paper presents and evaluates multiple automated methods…
Descriptors: Psychometrics, Measures (Individuals), Mathematics, Heuristics
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick – ETS Research Report Series, 2018
For a multiple-choice test under development or redesign, it is important to choose the optimal number of options per item so that the test possesses the desired psychometric properties. On the basis of available data for a multiple-choice assessment with 8 options, we evaluated the effects of changing the number of options on test properties…
Descriptors: Multiple Choice Tests, Test Items, Simulation, Test Construction
Harshman, Jordan; Yezierski, Ellen – Journal of Chemical Education, 2016
Determining the error of measurement is a necessity for researchers engaged in bench chemistry, chemistry education research (CER), and a multitude of other fields. Discussions regarding what constructs measurement error entails and how to best measure them have occurred, but the critiques about traditional measures have yielded few alternatives.…
Descriptors: Science Instruction, Chemistry, Error of Measurement, Psychometrics
Stanley, Leanne M.; Edwards, Michael C. – Educational and Psychological Measurement, 2016
The purpose of this article is to highlight the distinction between the reliability of test scores and the fit of psychometric measurement models, reminding readers why it is important to consider both when evaluating whether test scores are valid for a proposed interpretation and/or use. It is often the case that an investigator judges both the…
Descriptors: Test Reliability, Goodness of Fit, Scores, Patients
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests
Edwards, Michael C.; Flora, David B.; Thissen, David – Applied Measurement in Education, 2012
This article describes a computerized adaptive test (CAT) based on the uniform item exposure multi-form structure (uMFS). The uMFS is a specialization of the multi-form structure (MFS) idea described by Armstrong, Jones, Berliner, and Pashley (1998). In an MFS CAT, the examinee first responds to a small fixed block of items. The items comprising…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Format, Test Items
Parker, Richard I.; Vannest, Kimberly J.; Davis, John L. – Journal of School Psychology, 2013
The use of multi-category scales is increasing for the monitoring of IEP goals, classroom and school rules, and Behavior Improvement Plans (BIPs). Although they require greater inference than traditional data counting, little is known about the inter-rater reliability of these scales. This simulation study examined the performance of nine…
Descriptors: Rating Scales, Scaling, Interrater Reliability, Test Reliability
Zhang, Jinming – Psychometrika, 2013
In some popular test designs (including computerized adaptive testing and multistage testing), many item pairs are not administered to any test takers, which may result in some complications during dimensionality analyses. In this paper, a modified DETECT index is proposed in order to perform dimensionality analyses for response data from such…
Descriptors: Adaptive Testing, Simulation, Computer Assisted Testing, Test Reliability
Doolen, Jessica – ProQuest LLC, 2012
High fidelity simulation has become a widespread and costly learning strategy in nursing education because it can fill the gap left by a shortage of clinical sites. In addition, high fidelity simulation is an active learning strategy that is thought to increase higher order thinking such as clinical reasoning and judgment skills in nursing…
Descriptors: Simulation, Nursing Education, Simulated Environment, Psychometrics
Borsman, Denny; Romeijn, Jan-Willem; Wicherts, Jelte M. – Psychological Methods, 2008
This article shows that measurement invariance (defined in terms of an invariant measurement model in different groups) is generally inconsistent with selection invariance (defined in terms of equal sensitivity and specificity across groups). In particular, when a unidimensional measurement instrument is used and group differences are present in…
Descriptors: Test Items, Minority Groups, Measurement, Scores
Galindo-Garre, Francisca; Vermunt, Jeroen K. – Psychometrika, 2004
This paper presents a row-column (RC) association model in which the estimated row and column scores are forced to be in agreement with a priori specified ordering. Two efficient algorithms for finding the order-restricted maximum likelihood (ML) estimates are proposed and their reliability under different degrees of association is investigated by…
Descriptors: Mathematics, Test Reliability, Computation, Testing
Betz, Nancy E.; Weiss, David J. – 1974
Monte Carlo simulation procedures were used to study the psychometric characteristics of two two-stage adaptive tests and a conventional "peaked" ability test. Results showed that scores yielded by both two-stage tests better reflected the normal distribution of underlying ability. Ability estimates yielded by one of the two stage tests…
Descriptors: Ability, Academic Ability, Adaptive Testing, Computers
Previous Page | Next Page ยป
Pages: 1 | 2