Publication Date
In 2025: 9
Since 2024: 23
Descriptor
Evaluation Methods: 23
Item Analysis: 23
Item Response Theory: 10
Models: 7
Factor Analysis: 6
Comparative Analysis: 5
Accuracy: 4
Classification: 4
Simulation: 4
Test Items: 4
Validity: 4
Author
Abbie Raikes: 1
Alicia A. Stoltenberg: 1
Anders Sjöberg: 1
Ann M. Aviles: 1
Apantee Poonputta: 1
Ashley Karls: 1
Ben Kelcey: 1
Boxuan Ma: 1
Brian F. French: 1
Carolyn Maxwell: 1
Cecelia Cassell: 1
Publication Type
Journal Articles: 19
Reports - Research: 18
Dissertations/Theses -…: 3
Reports - Descriptive: 1
Reports - Evaluative: 1
Education Level
Secondary Education: 7
Elementary Education: 4
Middle Schools: 3
High Schools: 2
Junior High Schools: 2
Early Childhood Education: 1
Grade 11: 1
Grade 5: 1
Grade 6: 1
Grade 7: 1
Higher Education: 1
Location
China: 1
Colombia: 1
Liberia: 1
Michigan: 1
Mississippi: 1
Thailand: 1
United States: 1
Assessments and Surveys
Classroom Assessment Scoring…: 1
National Longitudinal Study…: 1
Social Skills Rating System: 1
Wechsler Adult Intelligence…: 1
Wechsler Intelligence Scale…: 1
Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
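For orientation, a generic two-parameter logistic bifactor model of the kind referenced above can be written as follows; this is the standard textbook form, not necessarily the exact parameterization used by Kim and Cole.

    P(X_{ij} = 1 \mid \theta_{i0}, \theta_{is}) =
      \frac{\exp\left( a_{j0}\,\theta_{i0} + a_{js}\,\theta_{i s(j)} + d_j \right)}
           {1 + \exp\left( a_{j0}\,\theta_{i0} + a_{js}\,\theta_{i s(j)} + d_j \right)}

Here θ_{i0} is the general factor, θ_{i s(j)} is the specific factor to which item j belongs, and a_{j0}, a_{js}, d_j are slopes and intercept. In the common item nonequivalent group design, linking methods estimate a transformation that places the new form's parameters on the base form's metric using the items shared by both forms.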
Martin Bäckström; Fredrik Björklund – Educational and Psychological Measurement, 2024
The forced-choice response format is often considered superior to the standard Likert-type format for controlling social desirability in personality inventories. We performed simulations and found that the trait information based on the two formats converges when the number of items is high and forced-choice items are mixed with regard to…
Descriptors: Likert Scales, Item Analysis, Personality Traits, Personality Measures
Kazuhiro Yamaguchi – Journal of Educational and Behavioral Statistics, 2025
This study proposes a Bayesian method for diagnostic classification models (DCMs) for a partially known Q-matrix setting between exploratory and confirmatory DCMs. This Q-matrix setting is practical and useful because test experts have pre-knowledge of the Q-matrix but cannot readily specify it completely. The proposed method employs priors for…
Descriptors: Models, Classification, Bayesian Statistics, Evaluation Methods
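A minimal sketch of what a "partially known" Q-matrix looks like may be useful. The encoding of uncertain entries as Bernoulli(0.5) draws below is an illustrative assumption for exposition, not Yamaguchi's prior specification.

    import numpy as np

    # Illustrative only: a partially known Q-matrix for 4 items and 2 attributes.
    # 1 = item requires the attribute, 0 = it does not, None = expert is unsure.
    Q_partial = [[1, 0],
                 [0, 1],
                 [1, None],   # entry to be inferred from the data
                 [None, 1]]

    # One simple way to encode partial knowledge as a prior (an assumption here):
    # fix the known entries and place a Bernoulli(0.5) prior on the unknown ones.
    rng = np.random.default_rng(0)

    def sample_Q(Q_partial, rng):
        Q = np.zeros((len(Q_partial), len(Q_partial[0])), dtype=int)
        for j, row in enumerate(Q_partial):
            for k, q in enumerate(row):
                Q[j, k] = q if q is not None else rng.integers(0, 2)  # prior draw
        return Q

    print(sample_Q(Q_partial, rng))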
Jiaying Xiao; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Accurate item parameters and standard errors (SEs) are crucial for many multidimensional item response theory (MIRT) applications. A recent study proposed the Gaussian Variational Expectation Maximization (GVEM) algorithm to improve computational efficiency and estimation accuracy (Cho et al., 2021). However, the SE estimation procedure has yet to…
Descriptors: Error of Measurement, Models, Evaluation Methods, Item Analysis
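As background, variational EM estimators of this kind maximize a Gaussian-approximated lower bound on the marginal likelihood rather than the likelihood itself; the generic bound is sketched below. The specific GVEM updates and the proposed standard-error corrections are in the paper, not here.

    \log p(\mathbf{Y} \mid \boldsymbol{\xi})
      \;\ge\; \mathbb{E}_{q(\boldsymbol{\theta})}\!\left[ \log p(\mathbf{Y}, \boldsymbol{\theta} \mid \boldsymbol{\xi}) \right]
            - \mathbb{E}_{q(\boldsymbol{\theta})}\!\left[ \log q(\boldsymbol{\theta}) \right]

Here Y is the response matrix, ξ collects the item parameters, θ the latent traits, and q(θ) is constrained to be Gaussian; the E-step updates q and the M-step updates ξ.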
Zachary K. Collier; Minji Kong; Olushola Soyoye; Kamal Chawla; Ann M. Aviles; Yasser Payne – Journal of Educational and Behavioral Statistics, 2024
Asymmetric Likert-type items in research studies can present several challenges in data analysis, particularly concerning missing data. These items are often characterized by a skewed scaling, where either there is no neutral response option or an unequal number of possible positive and negative responses. The use of conventional techniques, such…
Descriptors: Likert Scales, Test Items, Item Analysis, Evaluation Methods
Erik Forsberg; Anders Sjöberg – Measurement: Interdisciplinary Research and Perspectives, 2025
This paper reports a validation study based on descriptive multidimensional item response theory (DMIRT), implemented in the R package "D3mirt" by using the ERS-C, an extended version of the Relevance subscale from the Moral Foundations Questionnaire including two new items for collectivism (17 items in total). Two latent models are…
Descriptors: Evaluation Methods, Programming Languages, Altruism, Collectivism
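For reference, descriptive MIRT typically summarizes each item with Reckase-style quantities such as those below; the exact statistics reported by "D3mirt" may differ.

    \mathrm{MDISC}_j = \sqrt{\textstyle\sum_k a_{jk}^2}, \qquad
    \mathrm{MDIFF}_j = \frac{-d_j}{\mathrm{MDISC}_j}, \qquad
    \cos \omega_{jk} = \frac{a_{jk}}{\mathrm{MDISC}_j}

MDISC is the item's overall discrimination, MDIFF its multidimensional difficulty, and the direction cosines give the orientation of the item vector in the latent space, which is what descriptive MIRT visualizations plot.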
Hannah Gadd Ardrey – ProQuest LLC, 2024
The purpose of the study was to investigate secondary choral music educators' and administrators' perceptions of the use of the Mississippi Professional Growth System (PGS) as an applicable tool for evaluating secondary choral music educators. While there is limited research regarding the evaluation of choral music educators, this study aimed to…
Descriptors: Secondary School Teachers, Music Teachers, Singing, Teacher Evaluation
Stephen Humphry; Paul Montuoro; Carolyn Maxwell – Journal of Psychoeducational Assessment, 2024
This article builds upon a prominent definition of construct validity that focuses on variation in attributes causing variation in measurement outcomes. This article synthesizes the definition and uses Rasch measurement modeling to explicate a modified conceptualization of construct validity for assessments of developmental attributes. If…
Descriptors: Construct Validity, Measurement Techniques, Developmental Stages, Item Analysis
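For readers unfamiliar with the model, the dichotomous Rasch model referenced above is

    P(X_{ni} = 1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}

so that, by construction, variation in the person attribute θ_n and the item difficulty δ_i is the only systematic source of variation in the response probabilities, which is the causal reading of construct validity the article builds on.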
Yang Du; Susu Zhang – Journal of Educational and Behavioral Statistics, 2025
Item compromise has long posed challenges in educational measurement, jeopardizing both test validity and test security of continuous tests. Detecting compromised items is therefore crucial to address this concern. The present literature on compromised item detection reveals two notable gaps: First, the majority of existing methods are based upon…
Descriptors: Item Response Theory, Item Analysis, Bayesian Statistics, Educational Assessment
Markus T. Jansen; Ralf Schulze – Educational and Psychological Measurement, 2024
Thurstonian forced-choice modeling is considered to be a powerful new tool to estimate item and person parameters while simultaneously testing the model fit. This assessment approach is associated with the aim of reducing faking and other response tendencies that plague traditional self-report trait assessments. As a result of major recent…
Descriptors: Factor Analysis, Models, Item Analysis, Evaluation Methods
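As background, a standard Thurstonian IRT formulation for a binary forced-choice pair is sketched below; Jansen and Schulze's exact parameterization may differ.

    t_i = \mu_i + \lambda_i \eta_a + \varepsilon_i, \qquad
    y_{ik} = 1 \ \text{iff}\ t_i \ge t_k, \qquad
    P(y_{ik} = 1 \mid \eta_a, \eta_b) =
      \Phi\!\left( \frac{(\mu_i - \mu_k) + \lambda_i \eta_a - \lambda_k \eta_b}{\sqrt{\psi_i^2 + \psi_k^2}} \right)

Each item i has a latent utility t_i loading on its trait; the respondent endorses whichever item in the pair has the larger utility, and the model recovers trait scores and item parameters from those comparisons while permitting the factor-analytic fit testing the abstract mentions.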
Abbie Raikes; Rebecca Sayre Mojgani; Jem Heinzel-Nelson Alvarenga Lima; Dawn Davis; Cecelia Cassell; Marcus Waldman; Elsa Escalante – International Journal of Early Childhood, 2024
Quality early childhood care and education (ECCE) is important for young children's holistic healthy development. As ECCE scales, contextually relevant and feasible measurement is needed to inform policy and programs on strengths and areas for improvement. However, few measures have been designed for use across diverse contexts. Drawing on…
Descriptors: Foreign Countries, Early Childhood Education, Educational Quality, Program Effectiveness
Traditional vs Intersectional DIF Analysis: Considerations and a Comparison Using State Testing Data
Tony Albano; Brian F. French; Thao Thu Vo – Applied Measurement in Education, 2024
Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…
Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests
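A minimal, hypothetical sketch of the contrast may help: traditional DIF tests one grouping variable at a time, while intersectional DIF crosses the grouping variables into a single factor. The logistic-regression approach and all variable names below are illustrative assumptions, not the procedure or data of Albano, French, and Vo.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 2000
    df = pd.DataFrame({
        "total": rng.normal(size=n),       # matching criterion (e.g., rest score)
        "g1": rng.integers(0, 2, n),       # hypothetical grouping variable 1
        "g2": rng.integers(0, 2, n),       # hypothetical grouping variable 2
    })
    df["group"] = df["g1"].astype(str) + "_" + df["g2"].astype(str)   # intersectional group
    logit = -0.5 + 1.2 * df["total"] + 0.4 * (df["group"] == "1_1")   # DIF simulated in one cell
    df["item"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

    # Uniform-DIF check: does the intersectional grouping improve fit over the criterion alone?
    base = sm.Logit(df["item"], sm.add_constant(df[["total"]])).fit(disp=0)
    dummies = pd.get_dummies(df["group"], drop_first=True, dtype=float)
    full = sm.Logit(df["item"],
                    sm.add_constant(pd.concat([df[["total"]], dummies], axis=1))).fit(disp=0)
    lr = 2 * (full.llf - base.llf)   # likelihood-ratio statistic, ~ chi-square(3) under no DIF
    print(round(lr, 2))

A traditional analysis would instead enter g1 (or g2) alone, which can mask DIF that appears only in a specific combination of the two variables.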
Yu-Sheng Su; Xiao Wang; Li Zhao – IEEE Transactions on Education, 2024
Research Purpose and Contribution: The study aimed to construct an evaluation framework for assessing pupils' computational thinking (CT) during classroom learning problem solving. As a self-report evaluation scale for pupils, this evaluation framework further enriched the CT assessment instruments for pupils and provided a specialized instrument…
Descriptors: Computation, Thinking Skills, Student Evaluation, Evaluation Methods
Yuanfang Liu; Mark H. C. Lai; Ben Kelcey – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Measurement invariance holds when a latent construct is measured in the same way across different levels of background variables (continuous or categorical) while controlling for the true value of that construct. Using Monte Carlo simulation, this paper compares the multiple indicators, multiple causes (MIMIC) model and MIMIC-interaction to a…
Descriptors: Classification, Accuracy, Error of Measurement, Correlation
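For context, a generic MIMIC specification with a direct effect (the usual non-invariance term) is sketched below; the MIMIC-interaction variant additionally includes a covariate-by-factor product term. This is the textbook form, not necessarily the exact models compared in the simulation.

    \eta_i = \gamma x_i + \zeta_i, \qquad
    y_{ij} = \nu_j + \lambda_j \eta_i + \beta_j x_i + \varepsilon_{ij}

A nonzero direct effect β_j indicates that indicator j functions differently across levels of the background variable x after controlling for the latent construct η; adding an x_i η_i term (MIMIC-interaction) allows loading, i.e. non-uniform, non-invariance to be detected as well.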
Yicheng Sun – ProQuest LLC, 2024
We study how to automatically generate cloze questions from given texts to assess reading comprehension, where a cloze question consists of a stem with a blank placeholder for the answer key, and three distractors for generating confusion. We present a generative method called CQG (Cloze Question Generator) for constructing cloze questions from…
Descriptors: Cloze Procedure, Reading Processes, Questioning Techniques, Computational Linguistics
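A minimal, hand-rolled illustration of the cloze-item format described above follows; it only blanks a key and samples distractors from a supplied pool, and does not reproduce the generative CQG method itself. The function name, pool, and example sentence are all hypothetical.

    import random

    def make_cloze(sentence: str, answer: str, pool: list[str], k: int = 3, seed: int = 0):
        """Blank the answer key in the sentence and sample k distractors from the pool."""
        stem = sentence.replace(answer, "_____", 1)   # stem with a blank placeholder
        rng = random.Random(seed)
        distractors = rng.sample([w for w in pool if w != answer], k)
        return stem, answer, distractors

    stem, key, distractors = make_cloze(
        "The mitochondria produce most of the cell's energy.",
        "energy",
        ["proteins", "oxygen", "waste", "sugar"],
    )
    print(stem)
    print("Key:", key, "| Distractors:", distractors)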