NotesFAQContact Us
Collection
Advanced
Search Tips
Publication Date
In 20254
Since 202414
Since 2021 (last 5 years)79
Audience
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 79 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Rodgers, Emily; D'Agostino, Jerome V.; Berenbon, Rebecca; Johnson, Tracy; Winkler, Christa – Journal of Early Childhood Literacy, 2023
Running Records are thought to be an excellent formative assessment tool because they generate results that educators can use to make their teaching more responsive. Despite the technical nature of scoring Running Records and the kinds of important decisions that are attached to their analysis, few studies have investigated assessor accuracy. We…
Descriptors: Formative Evaluation, Scoring, Accuracy, Difficulty Level
Peer reviewed Peer reviewed
Direct linkDirect link
Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Educational and Psychological Measurement, 2022
This study offers an approach to testing for differential item functioning (DIF) in a recently developed measurement framework, referred to as "D"-scoring method (DSM). Under the proposed approach, called "P-Z" method of testing for DIF, the item response functions of two groups (reference and focal) are compared by…
Descriptors: Test Bias, Methods, Test Items, Scoring
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Deschênes, Marie-France; Dionne, Éric; Dorion, Michelle; Grondin, Julie – Practical Assessment, Research & Evaluation, 2023
The use of the aggregate scoring method for scoring concordance tests requires the weighting of test items to be derived from the performance of a group of experts who take the test under the same conditions as the examinees. However, the average score of experts constituting the reference panel remains a critical issue in the use of these tests.…
Descriptors: Scoring, Tests, Evaluation Methods, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Rios, Joseph – Applied Measurement in Education, 2022
To mitigate the deleterious effects of rapid guessing (RG) on ability estimates, several rescoring procedures have been proposed. Underlying many of these procedures is the assumption that RG is accurately identified. At present, there have been minimal investigations examining the utility of rescoring approaches when RG is misclassified, and…
Descriptors: Accuracy, Guessing (Tests), Scoring, Classification
Peer reviewed Peer reviewed
Direct linkDirect link
Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024
Item response theory (IRT) models the relationship between the possible scores on a test item against a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…
Descriptors: Item Response Theory, Test Items, Models, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
Jing Ma – ProQuest LLC, 2024
This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, lengths, number and location of polytomous item. Results showed that while…
Descriptors: Scoring, Adaptive Testing, Test Items, Classification
Peer reviewed Peer reviewed
Direct linkDirect link
Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023
Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…
Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Pearson, Christopher; Penna, Nigel – Assessment & Evaluation in Higher Education, 2023
E-assessments are becoming increasingly common and progressively more complex. Consequently, how these longer, more complex questions are designed and marked is imperative. This article uses the NUMBAS e-assessment tool to investigate the best practice for creating longer questions and their mark schemes on surveying modules taken by engineering…
Descriptors: Automation, Scoring, Engineering Education, Foreign Countries
Peer reviewed Peer reviewed
Direct linkDirect link
Raykov, Tenko – Measurement: Interdisciplinary Research and Perspectives, 2023
This software review discusses the capabilities of Stata to conduct item response theory modeling. The commands needed for fitting the popular one-, two-, and three-parameter logistic models are initially discussed. The procedure for testing the discrimination parameter equality in the one-parameter model is then outlined. The commands for fitting…
Descriptors: Item Response Theory, Models, Comparative Analysis, Item Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Measurement: Interdisciplinary Research and Perspectives, 2021
This study offers an approach to test equating under the latent D-scoring method (DSM-L) using the nonequivalent groups with anchor tests (NEAT) design. The accuracy of the test equating was examined via a simulation study under a 3 × 3 design by two conditions: group ability at three levels and test difficulty at three levels. The results for…
Descriptors: Equated Scores, Scoring, Test Items, Accuracy
Das, Bidyut; Majumder, Mukta; Phadikar, Santanu; Sekh, Arif Ahmed – Research and Practice in Technology Enhanced Learning, 2021
Learning through the internet becomes popular that facilitates learners to learn anything, anytime, anywhere from the web resources. Assessment is most important in any learning system. An assessment system can find the self-learning gaps of learners and improve the progress of learning. The manual question generation takes much time and labor.…
Descriptors: Automation, Test Items, Test Construction, Computer Assisted Testing
Peer reviewed Peer reviewed
Direct linkDirect link
van Rijn, Peter W.; Attali, Yigal; Ali, Usama S. – Journal of Experimental Education, 2023
We investigated whether and to what extent different scoring instructions, timing conditions, and direct feedback affect performance and speed. An experimental study manipulating these factors was designed to address these research questions. According to the factorial design, participants were randomly assigned to one of twelve study conditions.…
Descriptors: Scoring, Time, Feedback (Response), Performance
Laura Laclede – ProQuest LLC, 2023
Because non-cognitive constructs can influence student success in education beyond academic achievement, it is essential that they are reliably conceptualized and measured. Within this context, there are several gaps in the literature related to correctly interpreting the meaning of scale scores when a non-standard response option like I do not…
Descriptors: High School Students, Test Wiseness, Models, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Krishna Mohan Surapaneni; Anusha Rajajagadeesan; Lakshmi Goudhaman; Shalini Lakshmanan; Saranya Sundaramoorthi; Dineshkumar Ravi; Kalaiselvi Rajendiran; Porchelvan Swaminathan – Biochemistry and Molecular Biology Education, 2024
The emergence of ChatGPT as one of the most advanced chatbots and its ability to generate diverse data has given room for numerous discussions worldwide regarding its utility, particularly in advancing medical education and research. This study seeks to assess the performance of ChatGPT in medical biochemistry to evaluate its potential as an…
Descriptors: Biochemistry, Science Instruction, Artificial Intelligence, Teaching Methods
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5  |  6