David DiSabito; Lisa Hansen; Thomas Mennella; Josephine Rodriguez – New Directions for Teaching and Learning, 2025
This chapter investigates the integration of generative AI (GenAI), specifically ChatGPT, into institutional and course-level assessment at Western New England University. It explores the potential of GenAI to streamline the assessment process, making it more efficient, equitable, and objective. Through the development of a proprietary GenAI tool,…
Descriptors: Artificial Intelligence, Technology Uses in Education, Man Machine Systems, Educational Assessment
Akif Avcu – Malaysian Online Journal of Educational Technology, 2025
This scoping review presents the milestones in how Hierarchical Rater Models (HRMs) became operable for use in automated essay scoring (AES) to improve instructional evaluation. Although essay evaluations--a useful instrument for evaluating higher-order cognitive abilities--have always depended on human raters, concerns regarding rater bias,…
Descriptors: Automation, Scoring, Models, Educational Assessment
Jie Yang; Ehsan Latif; Yuze He; Xiaoming Zhai – Journal of Science Education and Technology, 2025
The development of explanations for scientific phenomena is crucial in science assessment. However, the scoring of students' written explanations is a challenging and resource-intensive process. Large language models (LLMs) have demonstrated the potential to address these challenges, particularly when the explanations are written in English, an…
Descriptors: Artificial Intelligence, Technology Uses in Education, Automation, Scoring
Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024
Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability
Ashish Gurung; Anthony F. Botelho; Russell Thompson; Adam C. Sales; Sami Baral; Neil T. Heffernan – Grantee Submission, 2022
It is particularly important to identify and address issues of fairness and equity in educational contexts as academic performance can have large impacts on the types of opportunities that are made available to students. While it is always the hope that educators approach student assessment with these issues in mind, there are a number of factors…
Descriptors: Equal Education, Middle School Mathematics, Middle School Students, Educational Assessment
Kelly, Anthony – Assessment & Evaluation in Higher Education, 2023
The Research Excellence Framework is a high-stakes exercise used by the UK government to allocate billions of pounds of quality-related research (QR) funding and used by the media to rank universities and their departments in national league tables. The 2008, 2014 and 2021 assessments were zero-sum games in terms of league table position because…
Descriptors: Foreign Countries, Educational Assessment, Educational Research, Educational Quality
Dorsey, David W.; Michaels, Hillary R. – Journal of Educational Measurement, 2022
We have dramatically advanced our ability to create rich, complex, and effective assessments across a range of uses through technology advancement. Artificial Intelligence (AI) enabled assessments represent one such area of advancement--one that has captured our collective interest and imagination. Scientists and practitioners within the domains…
Descriptors: Validity, Ethics, Artificial Intelligence, Evaluation Methods
Tahereh Firoozi; Okan Bulut; Mark J. Gierl – International Journal of Assessment Tools in Education, 2023
The proliferation of large language models represents a paradigm shift in the landscape of automated essay scoring (AES) systems, fundamentally elevating their accuracy and efficacy. This study presents an extensive examination of large language models, with a particular emphasis on the transformative influence of transformer-based models, such as…
Descriptors: Turkish, Writing Evaluation, Essays, Accuracy
Wind, Stefanie A.; Guo, Wenjing – Educational Assessment, 2021
Scoring procedures for the constructed-response (CR) items in large-scale mixed-format educational assessments often involve checks for rater agreement or rater reliability. Although these analyses are important, researchers have documented rater effects that persist despite rater training and that are not always detected in rater agreement and…
Descriptors: Scoring, Responses, Test Items, Test Format
Gardner, John; O'Leary, Michael; Yuan, Li – Journal of Computer Assisted Learning, 2021
Artificial Intelligence is at the heart of modern society with computers now capable of making process decisions in many spheres of human activity. In education, there has been intensive growth in systems that make formal and informal learning an anytime, anywhere activity for billions of people through online open educational resources and…
Descriptors: Artificial Intelligence, Educational Assessment, Formative Evaluation, Summative Evaluation
Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2020
Educational tests are standardized so that all examinees are tested on the same material, under the same testing conditions, and with the same scoring protocols. This uniformity is designed to provide a level "playing field" for all examinees so that the test is "the same" for everyone. Thus, standardization is designed to…
Descriptors: Standards, Educational Assessment, Culture Fair Tests, Scoring
Ninomiya, Shuichi – Assessment in Education: Principles, Policy & Practice, 2019
PISA presents a new image for academic achievement, which has prompted Japanese education reforms over the past decade to innovate teaching and learning for 'PISA-style literacy'. Supported by theoretical foundations, particularly with regard to the concept of 'PISA literacy' and 'authentic assessment', these reforms have accomplished progress in…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Lottridge, Sue; Burkhardt, Amy; Boyer, Michelle – Educational Measurement: Issues and Practice, 2020
In this digital ITEMS module, Dr. Sue Lottridge, Amy Burkhardt, and Dr. Michelle Boyer provide an overview of automated scoring. Automated scoring is the use of computer algorithms to score unconstrained open-ended test items by mimicking human scoring. The use of automated scoring is increasing in educational assessment programs because it allows…
Descriptors: Computer Assisted Testing, Scoring, Automation, Educational Assessment
Palermo, Corey; Bunch, Michael B.; Ridge, Kirk – Journal of Educational Measurement, 2019
Although much attention has been given to rater effects in rater-mediated assessment contexts, little research has examined the overall stability of leniency and severity effects over time. This study examined longitudinal scoring data collected during three consecutive administrations of a large-scale, multi-state summative assessment program.…
Descriptors: Scoring, Interrater Reliability, Measurement, Summative Evaluation
Rotou, Ourania; Rupp, André A. – ETS Research Report Series, 2020
This research report provides a description of the processes of evaluating the "deployability" of automated scoring (AS) systems from the perspective of large-scale educational assessments in operational settings. It discusses a comprehensive psychometric evaluation that entails analyses that take into consideration the specific purpose…
Descriptors: Computer Assisted Testing, Scoring, Educational Assessment, Psychometrics