Showing 46 to 60 of 9,400 results
Peer reviewed
Raykov, Tenko; Huber, Chuck; Marcoulides, George A.; Pusic, Martin; Menold, Natalja – Measurement: Interdisciplinary Research and Perspectives, 2021
A readily and widely applicable procedure is discussed that can be used to point and interval estimate the probabilities of particular responses on polytomous items at pre-specified points along underlying latent continua. The items are thereby assumed to be part of unidimensional multi-component measuring instruments that may also contain binary…
Descriptors: Probability, Computation, Test Items, Responses
Peer reviewed
Xiaowen Liu – International Journal of Testing, 2024
Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…
Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation
Peer reviewed
Sherry Everett Jones; Nancy D. Brener; Barbara Queen; Molly Hershey-Arista; William Harris; J. Michael Underwood – Journal of School Health, 2024
Background: School Health Profiles assesses school health policies and practices among US secondary schools. Methods: The 2020 School Health Profiles principal and teacher questionnaires were used for a test-retest reliability study. Cohen's kappa coefficients tested the agreement in dichotomous responses to each questionnaire variable at 2 time…
Descriptors: Administrator Surveys, Teacher Surveys, Questionnaires, Pretests Posttests
Peer reviewed
Xiangyi Liao; Daniel M Bolt – Educational Measurement: Issues and Practice, 2024
Traditional approaches to the modeling of multiple-choice item response data (e.g., 3PL, 4PL models) emphasize slips and guesses as random events. In this paper, an item response model is presented that characterizes both disjunctively interacting guessing and conjunctively interacting slipping processes as proficiency-related phenomena. We show…
Descriptors: Item Response Theory, Test Items, Error Correction, Guessing (Tests)
Peer reviewed
Yue Liu; Zhen Li; Hongyun Liu; Xiaofeng You – Applied Measurement in Education, 2024
Low test-taking effort of examinees has been considered a source of construct-irrelevant variance in item response modeling, leading to serious consequences on parameter estimation. This study aims to investigate how non-effortful response (NER) influences the estimation of item and person parameters in item-pool scale linking (IPSL) and whether…
Descriptors: Item Response Theory, Computation, Simulation, Responses
Peer reviewed
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Deven Carlson; Adam Shepardson – Annenberg Institute for School Reform at Brown University, 2024
As students are exposed to extreme temperatures with ever-increasing frequency, it is important to understand how such exposure affects student learning. In this paper we draw upon detailed student achievement data, combined with high-resolution weather records, to paint a clear portrait of the effect of temperature on student learning across a…
Descriptors: Weather, Climate, Heat, Academic Achievement
Peer reviewed
Weese, James D.; Turner, Ronna C.; Ames, Allison; Crawford, Brandon; Liang, Xinya – Educational and Psychological Measurement, 2022
A simulation study was conducted to investigate the heuristics of the SIBTEST procedure and how it compares with ETS classification guidelines used with the Mantel-Haenszel procedure. Prior heuristics have been used for nearly 25 years, but they are based on a simulation study that was restricted due to computer limitations and that modeled item…
Descriptors: Test Bias, Heuristics, Classification, Statistical Analysis
Peer reviewed
Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Educational and Psychological Measurement, 2022
This study offers an approach to testing for differential item functioning (DIF) in a recently developed measurement framework, referred to as "D"-scoring method (DSM). Under the proposed approach, called "P-Z" method of testing for DIF, the item response functions of two groups (reference and focal) are compared by…
Descriptors: Test Bias, Methods, Test Items, Scoring
Chang, Kuo-Feng – ProQuest LLC, 2022
This dissertation was designed to foster a deeper understanding of population invariance in the context of composite-score equating and provide practitioners with guidelines for addressing score equity concerns at the composite score level. The purpose of this dissertation was threefold. The first was to compare different composite equating…
Descriptors: Test Items, Equated Scores, Methods, Design
Peer reviewed
Fu, Yanyan; Choe, Edison M.; Lim, Hwanggyu; Choi, Jaehwa – Educational Measurement: Issues and Practice, 2022
This case study applied the "weak theory" of Automatic Item Generation (AIG) to generate isomorphic item instances (i.e., unique but psychometrically equivalent items) for a large-scale assessment. Three representative instances were selected from each item template (i.e., model) and pilot-tested. In addition, a new analytical framework,…
Descriptors: Test Items, Measurement, Psychometrics, Test Construction
Peer reviewed
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2023
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the l_z and l*_z person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through…
Descriptors: Test Items, Scores, Goodness of Fit, Statistics
Peer reviewed
PDF on ERIC
Deschênes, Marie-France; Dionne, Éric; Dorion, Michelle; Grondin, Julie – Practical Assessment, Research & Evaluation, 2023
The use of the aggregate scoring method for scoring concordance tests requires the weighting of test items to be derived from the performance of a group of experts who take the test under the same conditions as the examinees. However, the average score of experts constituting the reference panel remains a critical issue in the use of these tests.…
Descriptors: Scoring, Tests, Evaluation Methods, Test Items
Peer reviewed
Randall, Jennifer – Educational Assessment, 2023
In a justice-oriented antiracist assessment process, attention to the disruption of white supremacy must occur at every stage--from construct articulation to score reporting. An important step in the assessment development process is the item review stage often referred to as Bias/Fairness and Sensitivity Review. I argue that typical approaches to…
Descriptors: Social Justice, Racism, Test Bias, Test Items
Peer reviewed
Jiang, Zhehan; Han, Yuting; Xu, Lingling; Shi, Dexin; Liu, Ren; Ouyang, Jinying; Cai, Fen – Educational and Psychological Measurement, 2023
The portion of responses that is absent in the nonequivalent groups with anchor test (NEAT) design can be treated as a planned missing data scenario. In the context of small sample sizes, we present a machine learning (ML)-based imputation technique called chaining random forests (CRF) to perform equating tasks within the NEAT design. Specifically, seven…
Descriptors: Test Items, Equated Scores, Sample Size, Artificial Intelligence