ERIC - Search Results

Showing 1 to 15 of 79 results

Scoring Running Records: Complexities and Affordances

Peer reviewed

Rodgers, Emily; D'Agostino, Jerome V.; Berenbon, Rebecca; Johnson, Tracy; Winkler, Christa – Journal of Early Childhood Literacy, 2023

Running Records are thought to be an excellent formative assessment tool because they generate results that educators can use to make their teaching more responsive. Despite the technical nature of scoring Running Records and the kinds of important decisions that are attached to their analysis, few studies have investigated assessor accuracy. We…

Descriptors: Formative Evaluation, Scoring, Accuracy, Difficulty Level

Testing for Differential Item Functioning under the "D"-Scoring Method

Peer reviewed

Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Educational and Psychological Measurement, 2022

This study offers an approach to testing for differential item functioning (DIF) in a recently developed measurement framework, referred to as "D"-scoring method (DSM). Under the proposed approach, called "P-Z" method of testing for DIF, the item response functions of two groups (reference and focal) are compared by…

Descriptors: Test Bias, Methods, Test Items, Scoring

Examination of the Aggregate Scoring Method in a Judgment Concordance Test

Peer reviewed
PDF on ERIC

Deschênes, Marie-France; Dionne, Éric; Dorion, Michelle; Grondin, Julie – Practical Assessment, Research & Evaluation, 2023

The use of the aggregate scoring method for scoring concordance tests requires the weighting of test items to be derived from the performance of a group of experts who take the test under the same conditions as the examinees. However, the average score of experts constituting the reference panel remains a critical issue in the use of these tests.…

Descriptors: Scoring, Tests, Evaluation Methods, Test Items

An Examination of Individual Ability Estimation and Classification Accuracy under Rapid Guessing Misidentifications

Peer reviewed

Rios, Joseph – Applied Measurement in Education, 2022

To mitigate the deleterious effects of rapid guessing (RG) on ability estimates, several rescoring procedures have been proposed. Underlying many of these procedures is the assumption that RG is accurately identified. At present, there have been minimal investigations examining the utility of rescoring approaches when RG is misclassified, and…

Descriptors: Accuracy, Guessing (Tests), Scoring, Classification

Analyzing Polytomous Test Data: A Comparison between an Information-Based IRT Model and the Generalized Partial Credit Model

Peer reviewed

Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024

Item response theory (IRT) models the relationship between the possible scores on a test item and a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…

Descriptors: Item Response Theory, Test Items, Models, Scoring

Is Effort Moderated Scoring Robust to Multidimensional Rapid Guessing?

Peer reviewed

Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025

To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…

Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory

The Impact of Scoring Later on Mixed Format Adaptive Testing

Jing Ma – ProQuest LLC, 2024

This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, test lengths, and numbers and locations of polytomous items. Results showed that while…

Descriptors: Scoring, Adaptive Testing, Test Items, Classification

Online Calibration in Multidimensional Computerized Adaptive Testing with Polytomously Scored Items

Peer reviewed

Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023

Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…

Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items

Automated Marking of Longer Computational Questions in Engineering Subjects

Peer reviewed

Pearson, Christopher; Penna, Nigel – Assessment & Evaluation in Higher Education, 2023

E-assessments are becoming increasingly common and progressively more complex. Consequently, how these longer, more complex questions are designed and marked is imperative. This article uses the NUMBAS e-assessment tool to investigate the best practice for creating longer questions and their mark schemes on surveying modules taken by engineering…

Descriptors: Automation, Scoring, Engineering Education, Foreign Countries

Item Response Theory and Modeling with Stata

Peer reviewed

Raykov, Tenko – Measurement: Interdisciplinary Research and Perspectives, 2023

This software review discusses the capabilities of Stata to conduct item response theory modeling. The commands needed for fitting the popular one-, two-, and three-parameter logistic models are initially discussed. The procedure for testing the discrimination parameter equality in the one-parameter model is then outlined. The commands for fitting…

Descriptors: Item Response Theory, Models, Comparative Analysis, Item Analysis

An Approach to Test Equating under the Latent "D"-Scoring Method

Peer reviewed

Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Measurement: Interdisciplinary Research and Perspectives, 2021

This study offers an approach to test equating under the latent D-scoring method (DSM-L) using the nonequivalent groups with anchor tests (NEAT) design. The accuracy of the test equating was examined via a simulation study under a 3 × 3 design by two conditions: group ability at three levels and test difficulty at three levels. The results for…

Descriptors: Equated Scores, Scoring, Test Items, Accuracy

Automatic Question Generation and Answer Assessment: A Survey

Das, Bidyut; Majumder, Mukta; Phadikar, Santanu; Sekh, Arif Ahmed – Research and Practice in Technology Enhanced Learning, 2021

Learning through the internet has become popular because it lets learners learn anything, anytime, anywhere from web resources. Assessment is central to any learning system. An assessment system can find learners' self-learning gaps and improve their learning progress. Manual question generation takes much time and labor.…

Descriptors: Automation, Test Items, Test Construction, Computer Assisted Testing

Impact of Scoring Instructions, Timing, and Feedback on Measurement: An Experimental Study

Peer reviewed

van Rijn, Peter W.; Attali, Yigal; Ali, Usama S. – Journal of Experimental Education, 2023

We investigated whether and to what extent different scoring instructions, timing conditions, and direct feedback affect performance and speed. An experimental study manipulating these factors was designed to address these research questions. According to the factorial design, participants were randomly assigned to one of twelve study conditions.…

Descriptors: Scoring, Time, Feedback (Response), Performance

Using Nominal Models to Examine How High School Students Use an I Do Not Know Response Option When Answering Scale Items

Laura Laclede – ProQuest LLC, 2023

Because non-cognitive constructs can influence student success in education beyond academic achievement, it is essential that they are reliably conceptualized and measured. Within this context, there are several gaps in the literature related to correctly interpreting the meaning of scale scores when a non-standard response option like I do not…

Descriptors: High School Students, Test Wiseness, Models, Test Items

Evaluating ChatGPT as a Self-Learning Tool in Medical Biochemistry: A Performance Assessment in Undergraduate Medical University Examination

Peer reviewed

Krishna Mohan Surapaneni; Anusha Rajajagadeesan; Lakshmi Goudhaman; Shalini Lakshmanan; Saranya Sundaramoorthi; Dineshkumar Ravi; Kalaiselvi Rajendiran; Porchelvan Swaminathan – Biochemistry and Molecular Biology Education, 2024

The emergence of ChatGPT as one of the most advanced chatbots and its ability to generate diverse data has given room for numerous discussions worldwide regarding its utility, particularly in advancing medical education and research. This study seeks to assess the performance of ChatGPT in medical biochemistry to evaluate its potential as an…

Descriptors: Biochemistry, Science Instruction, Artificial Intelligence, Teaching Methods
