Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
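In the common-item nonequivalent groups design mentioned above, linking places separately calibrated item parameters on a shared scale via a linear transformation. As a minimal illustration (the classical unidimensional mean/sigma method, not the bifactor MIRT methods the study compares), the constants A and B are estimated from the common items' difficulty parameters:

```python
import numpy as np

def mean_sigma_linking(b_source, b_target):
    """Estimate linking constants A, B from common-item difficulties.

    Mean/sigma linking assumes the target-scale difficulty of each
    common item satisfies b_target = A * b_source + B.
    """
    b_source = np.asarray(b_source, dtype=float)
    b_target = np.asarray(b_target, dtype=float)
    A = b_target.std() / b_source.std()
    B = b_target.mean() - A * b_source.mean()
    return A, B

def rescale(a, b, A, B):
    """Place source-form 2PL item parameters on the target scale."""
    return np.asarray(a) / A, A * np.asarray(b) + B

# Hypothetical common-item difficulties calibrated on each form:
b_src = [-1.2, -0.4, 0.3, 1.1]
b_tgt = [-1.0, -0.2, 0.5, 1.3]
A, B = mean_sigma_linking(b_src, b_tgt)  # here a pure shift: A = 1, B = 0.2
```

Discriminations divide by A while difficulties and abilities multiply by it, so the response probabilities are unchanged after rescaling.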
Wind, Stefanie A.; Guo, Wenjing – Educational Assessment, 2021
Scoring procedures for the constructed-response (CR) items in large-scale mixed-format educational assessments often involve checks for rater agreement or rater reliability. Although these analyses are important, researchers have documented rater effects that persist despite rater training and that are not always detected in rater agreement and…
Descriptors: Scoring, Responses, Test Items, Test Format
Qian Liu; Navé Wald; Chandima Daskon; Tony Harland – Innovations in Education and Teaching International, 2024
This qualitative study looks at multiple-choice questions (MCQs) in examinations and their effectiveness in testing higher-order cognition. While there are claims that MCQs can do this, we consider many assertions problematic because of the difficulty in interpreting what higher-order cognition consists of and whether or not assessment tasks…
Descriptors: Multiple Choice Tests, Critical Thinking, College Faculty, Student Evaluation
Kim, Dong-In; Julian, Marc; Hermann, Pam – Online Submission, 2022
In test equating, one critical equating property is the group invariance property which indicates that the equating function used to convert performance on each alternate form to the reporting scale should be the same for various subgroups. To mitigate the impact of disrupted learning on the item parameters during the COVID-19 pandemic, a…
Descriptors: COVID-19, Pandemics, Test Format, Equated Scores
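The group invariance property described in this abstract can be checked empirically by equating within each subgroup and comparing the resulting conversion tables. A minimal sketch with a simple percentile-rank (equipercentile-style) equating function, using made-up subgroup score data rather than anything from the study:

```python
import numpy as np

def equipercentile(x, y):
    """Return a function mapping form-X scores to the form-Y scale
    by matching percentile ranks (a simplified equipercentile equating)."""
    xs = np.sort(np.asarray(x, dtype=float))
    ys = np.sort(np.asarray(y, dtype=float))
    def convert(score):
        pr = np.searchsorted(xs, score, side="right") / len(xs)
        return np.quantile(ys, pr)  # Y score at the same percentile rank
    return convert

def invariance_gap(groups_x, groups_y):
    """Max disagreement between subgroup equating functions on a score grid.

    A small gap supports group invariance: each subgroup's conversion
    stays close to the average conversion.
    """
    grid = np.linspace(min(map(min, groups_x)), max(map(max, groups_x)), 21)
    tables = np.array([[equipercentile(gx, gy)(s) for s in grid]
                       for gx, gy in zip(groups_x, groups_y)])
    return np.abs(tables - tables.mean(axis=0)).max()
```

If every subgroup yields the same conversion table the gap is zero; in practice a tolerance (e.g., a fraction of a score point) would be set in advance.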
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Wolf, Raffaela – ProQuest LLC, 2013
Preservation of equity properties was examined using four equating methods--IRT True Score, IRT Observed Score, Frequency Estimation, and Chained Equipercentile--in a mixed-format test under a common-item nonequivalent groups (CINEG) design. Equating of mixed-format tests under a CINEG design can be influenced by factors such as attributes of the…
Descriptors: Testing, Item Response Theory, Equated Scores, Test Items
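Of the four equating methods this dissertation examines, IRT true-score equating is the most compact to sketch: find the ability at which the form-X true score equals the observed score, then report the form-Y true score at that ability. A minimal 2PL version using bisection (illustrative only; item parameters below are invented):

```python
import numpy as np

def p2pl(theta, a, b):
    """2PL item response probability."""
    return 1.0 / (1.0 + np.exp(-np.asarray(a) * (theta - np.asarray(b))))

def true_score(theta, a, b):
    """Expected number-correct (true) score at ability theta."""
    return float(p2pl(theta, a, b).sum())

def irt_true_score_equate(x, a_X, b_X, a_Y, b_Y, lo=-6.0, hi=6.0):
    """Map a number-correct score x on form X to the form-Y scale.

    Bisection solves true_score_X(theta) = x (the true score is
    monotone in theta), then returns the form-Y true score there.
    """
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if true_score(mid, a_X, b_X) < x:
            lo = mid
        else:
            hi = mid
    theta = (lo + hi) / 2.0
    return true_score(theta, a_Y, b_Y)
```

Note that x must lie strictly between the minimum and maximum attainable true scores on form X for the bisection to be meaningful.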
Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Descriptors: Sample Size, Test Length, Correlation, Test Format
Albano, Anthony D. – Journal of Educational Measurement, 2013
In many testing programs it is assumed that the context or position in which an item is administered does not have a differential effect on examinee responses to the item. Violations of this assumption may bias item response theory estimates of item and person parameters. This study examines the potentially biasing effects of item position. A…
Descriptors: Test Items, Item Response Theory, Test Format, Questioning Techniques
Kirschner, Sophie; Borowski, Andreas; Fischer, Hans E.; Gess-Newsome, Julie; von Aufschnaiter, Claudia – International Journal of Science Education, 2016
Teachers' professional knowledge is assumed to be a key variable for effective teaching. As teacher education aims to enhance the professional knowledge of current and future teachers, this knowledge should be described and assessed. Nevertheless, only a limited number of studies quantitatively measure physics teachers' professional…
Descriptors: Evaluation Methods, Tests, Test Format, Science Instruction
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
Scarpati, Stanley E.; Wells, Craig S.; Lewis, Christine; Jirka, Stephen – Journal of Special Education, 2011
The purpose of this study was to use differential item functioning (DIF) and latent mixture model analyses to explore factors that explain performance differences on a large-scale mathematics assessment between examinees allowed to use a calculator or who were afforded item presentation accommodations versus those who did not receive the same…
Descriptors: Testing Accommodations, Test Items, Test Format, Validity
Miyazaki, Kei; Hoshino, Takahiro; Mayekawa, Shin-ichi; Shigemasu, Kazuo – Psychometrika, 2009
This study proposes a new item parameter linking method for the common-item nonequivalent groups design in item response theory (IRT). Previous studies assumed that examinees are randomly assigned to either test form. However, examinees can frequently select their own test forms and tests often differ according to examinees' abilities. In such…
Descriptors: Test Format, Item Response Theory, Test Items, Test Bias
Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen – Applied Measurement in Education, 2008
In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, genetic algorithm (GA), to construct parallel…
Descriptors: Test Format, Measurement Techniques, Equations (Mathematics), Item Response Theory
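The GA approach this article proposes can be sketched in miniature: encode a candidate form as a set of item indices, score it by how closely its test information function (TIF) tracks a target, and evolve the population by selection, crossover, and mutation. The sketch below is a generic GA under these assumptions, not the authors' algorithm or constraints; the item bank is simulated:

```python
import numpy as np

rng = np.random.default_rng(0)

def item_info(theta, a, b):
    """2PL Fisher information per item: a^2 * P * (1 - P), on a theta grid."""
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))
    return a[None, :] ** 2 * p * (1.0 - p)

def tif(form, info):
    """Test information function of a form (array of item indices)."""
    return info[:, form].sum(axis=1)

def ga_parallel_form(info, target, n_items, pop=60, gens=150, mut=0.1):
    """Evolve an item subset whose TIF tracks `target`."""
    n_bank = info.shape[1]
    popn = [rng.choice(n_bank, n_items, replace=False) for _ in range(pop)]
    def fitness(c):
        return -np.sum((tif(c, info) - target) ** 2)
    for _ in range(gens):
        scores = np.array([fitness(c) for c in popn])
        popn = [popn[i] for i in np.argsort(scores)[::-1][: pop // 2]]
        children = []
        while len(popn) + len(children) < pop:
            p1, p2 = rng.choice(len(popn), 2, replace=False)
            genes = np.union1d(popn[p1], popn[p2])     # crossover: pool parents
            child = rng.choice(genes, n_items, replace=False)
            if rng.random() < mut:                     # mutation: swap one item
                out = rng.integers(n_items)
                child[out] = rng.choice(np.setdiff1d(np.arange(n_bank), child))
            children.append(child)
        popn += children
    return max(popn, key=fitness)

# Simulated 100-item bank; the target TIF comes from a random reference form.
theta = np.linspace(-3, 3, 13)
a = rng.uniform(0.5, 2.0, 100)
b = rng.uniform(-2.5, 2.5, 100)
info = item_info(theta, a, b)
target = tif(rng.choice(100, 20, replace=False), info)
form = ga_parallel_form(info, target, n_items=20)
```

Truncation selection keeps the best half each generation, so the best form found never regresses; real test-assembly GAs would add content and format constraints on top of the TIF criterion.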
Hertenstein, Matthew J.; Wayand, Joseph F. – Journal of Instructional Psychology, 2008
Many psychology instructors present videotaped examples of behavior at least occasionally during their courses. However, few include video clips during examinations. We provide examples of video-based questions, offer guidelines for their use, and discuss their benefits and drawbacks. In addition, we provide empirical evidence to support the use…
Descriptors: Student Evaluation, Video Technology, Evaluation Methods, Test Construction
Hagtvet, Knut A.; Nasser, Fadia M. – Structural Equation Modeling, 2004
This article presents a methodology for examining the content and nature of item parcels as indicators of a conceptually defined latent construct. An essential component of this methodology is the 2-facet measurement model, which includes items and parcels as facets of construct indicators. The 2-facet model tests assumptions required for…
Descriptors: Evaluation Methods, Validity, Test Anxiety, Content Validity