NotesFAQContact Us
Collection
Advanced
Search Tips
Showing 76 to 90 of 9,530 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Sherry Everett Jones; Nancy D. Brener; Barbara Queen; Molly Hershey-Arista; William Harris; J. Michael Underwood – Journal of School Health, 2024
Background: School Health Profiles assesses school health policies and practices among US secondary schools. Methods: The 2020 School Health Profiles principal and teacher questionnaires were used for a test-retest reliability study. Cohen's kappa coefficients tested the agreement in dichotomous responses to each questionnaire variable at 2 time…
Descriptors: Administrator Surveys, Teacher Surveys, Questionnaires, Pretests Posttests
Peer reviewed Peer reviewed
Direct linkDirect link
Xiangyi Liao; Daniel M Bolt – Educational Measurement: Issues and Practice, 2024
Traditional approaches to the modeling of multiple-choice item response data (e.g., 3PL, 4PL models) emphasize slips and guesses as random events. In this paper, an item response model is presented that characterizes both disjunctively interacting guessing and conjunctively interacting slipping processes as proficiency-related phenomena. We show…
Descriptors: Item Response Theory, Test Items, Error Correction, Guessing (Tests)
Deven Carlson; Adam Shepardson – Annenberg Institute for School Reform at Brown University, 2024
As students are exposed to extreme temperatures with ever-increasing frequency, it is important to understand how such exposure affects student learning. In this paper we draw upon detailed student achievement data, combined with high-resolution weather records, to paint a clear portrait of the effect of temperature on student learning across a…
Descriptors: Weather, Climate, Heat, Academic Achievement
Peer reviewed Peer reviewed
Direct linkDirect link
Yue Liu; Zhen Li; Hongyun Liu; Xiaofeng You – Applied Measurement in Education, 2024
Low test-taking effort of examinees has been considered a source of construct-irrelevant variance in item response modeling, leading to serious consequences on parameter estimation. This study aims to investigate how non-effortful response (NER) influences the estimation of item and person parameters in item-pool scale linking (IPSL) and whether…
Descriptors: Item Response Theory, Computation, Simulation, Responses
Peer reviewed Peer reviewed
Direct linkDirect link
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Peer reviewed Peer reviewed
Direct linkDirect link
Zopluoglu, Cengiz; Kasli, Murat; Toton, Sarah L. – Educational Measurement: Issues and Practice, 2021
Response time information has recently attracted significant attention in the literature as it may provide meaningful information about item preknowledge. The methods that use response time information to identify examinees with potential item preknowledge make an implicit assumption that the examinees with item preknowledge differ in their…
Descriptors: Reaction Time, Cheating, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Raykov, Tenko; Huber, Chuck; Marcoulides, George A.; Pusic, Martin; Menold, Natalja – Measurement: Interdisciplinary Research and Perspectives, 2021
A readily and widely applicable procedure is discussed that can be used to point and interval estimate the probabilities of particular responses on polytomous items at pre-specified points along underlying latent continua. The items are assumed thereby to be part of unidimensional multi-component measuring instruments that may contain also binary…
Descriptors: Probability, Computation, Test Items, Responses
Peer reviewed Peer reviewed
Direct linkDirect link
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2023
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the l[subscript z] and l*[subscript z] person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through…
Descriptors: Test Items, Scores, Goodness of Fit, Statistics
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Deschênes, Marie-France; Dionne, Éric; Dorion, Michelle; Grondin, Julie – Practical Assessment, Research & Evaluation, 2023
The use of the aggregate scoring method for scoring concordance tests requires the weighting of test items to be derived from the performance of a group of experts who take the test under the same conditions as the examinees. However, the average score of experts constituting the reference panel remains a critical issue in the use of these tests.…
Descriptors: Scoring, Tests, Evaluation Methods, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Randall, Jennifer – Educational Assessment, 2023
In a justice-oriented antiracist assessment process, attention to the disruption of white supremacy must occur at every stage--from construct articulation to score reporting. An important step in the assessment development process is the item review stage often referred to as Bias/Fairness and Sensitivity Review. I argue that typical approaches to…
Descriptors: Social Justice, Racism, Test Bias, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Jiang, Zhehan; Han, Yuting; Xu, Lingling; Shi, Dexin; Liu, Ren; Ouyang, Jinying; Cai, Fen – Educational and Psychological Measurement, 2023
The part of responses that is absent in the nonequivalent groups with anchor test (NEAT) design can be managed to a planned missing scenario. In the context of small sample sizes, we present a machine learning (ML)-based imputation technique called chaining random forests (CRF) to perform equating tasks within the NEAT design. Specifically, seven…
Descriptors: Test Items, Equated Scores, Sample Size, Artificial Intelligence
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Yanxuan Qu; Sandip Sinharay – ETS Research Report Series, 2023
Though a substantial amount of research exists on imputing missing scores in educational assessments, there is little research on cases where responses or scores to an item are missing for all test takers. In this paper, we tackled the problem of imputing missing scores for tests for which the responses to an item are missing for all test takers.…
Descriptors: Scores, Test Items, Accuracy, Psychometrics
Hess, Jessica – ProQuest LLC, 2023
This study was conducted to further research into the impact of student-group item parameter drift (SIPD) --referred to as subpopulation item parameter drift in previous research-- on ability estimates and proficiency classification accuracy when occurring in the discrimination parameter of a 2-PL item response theory (IRT) model. Using Monte…
Descriptors: Test Items, Groups, Ability, Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Su, Kun; Henson, Robert A. – Journal of Educational and Behavioral Statistics, 2023
This article provides a process to carefully evaluate the suitability of a content domain for which diagnostic classification models (DCMs) could be applicable and then optimized steps for constructing a test blueprint for applying DCMs and a real-life example illustrating this process. The content domains were carefully evaluated using a set of…
Descriptors: Classification, Models, Science Tests, Physics
Peer reviewed Peer reviewed
Direct linkDirect link
Weese, James D.; Turner, Ronna C.; Ames, Allison; Crawford, Brandon; Liang, Xinya – Educational and Psychological Measurement, 2022
A simulation study was conducted to investigate the heuristics of the SIBTEST procedure and how it compares with ETS classification guidelines used with the Mantel-Haenszel procedure. Prior heuristics have been used for nearly 25 years, but they are based on a simulation study that was restricted due to computer limitations and that modeled item…
Descriptors: Test Bias, Heuristics, Classification, Statistical Analysis
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  636