| Publication Date | Results |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 178 |
| Since 2022 (last 5 years) | 1058 |
| Since 2017 (last 10 years) | 2880 |
| Since 2007 (last 20 years) | 6165 |
| Audience | Results |
| --- | --- |
| Teachers | 480 |
| Practitioners | 358 |
| Researchers | 152 |
| Administrators | 122 |
| Policymakers | 51 |
| Students | 44 |
| Parents | 32 |
| Counselors | 25 |
| Community | 15 |
| Media Staff | 5 |
| Support Staff | 3 |
| Location | Results |
| --- | --- |
| Australia | 183 |
| Turkey | 156 |
| California | 133 |
| Canada | 123 |
| New York | 118 |
| United States | 112 |
| Florida | 107 |
| China | 103 |
| Texas | 72 |
| United Kingdom | 72 |
| Japan | 70 |
| What Works Clearinghouse Rating | Results |
| --- | --- |
| Meets WWC Standards without Reservations | 5 |
| Meets WWC Standards with or without Reservations | 11 |
| Does not meet standards | 8 |
Olaghere, Ajima; Wilson, David B.; Kimbrell, Catherine – Research Synthesis Methods, 2023
A diversity of approaches for critically appraising qualitative and quantitative evidence exists, each emphasizing different aspects. These approaches lack clear processes to facilitate rating the overall quality of the evidence for aggregated findings that combine qualitative and quantitative evidence. We draw on a meta-aggregation of implementation…
Descriptors: Evidence, Synthesis, Scoring Rubrics, Standardized Tests
Rodgers, Emily; D'Agostino, Jerome V.; Berenbon, Rebecca; Johnson, Tracy; Winkler, Christa – Journal of Early Childhood Literacy, 2023
Running Records are thought to be an excellent formative assessment tool because they generate results that educators can use to make their teaching more responsive. Despite the technical nature of scoring Running Records and the kinds of important decisions that are attached to their analysis, few studies have investigated assessor accuracy. We…
Descriptors: Formative Evaluation, Scoring, Accuracy, Difficulty Level
Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025
This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…
Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level
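To make the modeling idea concrete, the following is a minimal sketch of a cross-classified IRT specification in which both person and item effects are treated as random and item difficulty is regressed on a target performance level. The notation (theta_p, b_i, L_i, gamma) is an illustrative assumption, not the parameterization used by Kim and Cai.

```latex
% Minimal sketch (assumed notation): Rasch-type model with crossed random effects.
% X_{pi}  : scored response of person p to item i
% theta_p : random person effect
% b_i     : random item effect, with mean depending on the item's target level
% L_i     : target performance level assigned to item i (the ESS design variable)
\[
  \Pr(X_{pi} = 1 \mid \theta_p, b_i)
    = \frac{\exp(\theta_p - b_i)}{1 + \exp(\theta_p - b_i)},
\]
\[
  \theta_p \sim \mathcal{N}(0, \sigma_\theta^2), \qquad
  b_i \sim \mathcal{N}(\gamma_0 + \gamma_1 L_i, \sigma_b^2).
\]
```

Treating items as random alongside persons is what makes the model cross-classified; regressing the item effect on L_i is what ties estimated item difficulty to the intended performance level.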
Glory Tobiason; Adrienne Lavine – Change: The Magazine of Higher Learning, 2025
Current methods for evaluating faculty teaching fall short, and one way to address this is through campus-wide initiatives that focus on change at the level of academic units. The complex context of higher education makes meaningful teaching evaluation difficult; in particular, four sobering realities of this context must be taken into account in…
Descriptors: Teacher Evaluation, Evaluation Methods, Testing Problems, Educational Change
Roduta Roberts, Mary; Gotch, Chad M.; Cook, Megan; Werther, Karin; Chao, Iris C. I. – Measurement: Interdisciplinary Research and Perspectives, 2022
Performance-based assessment is a common approach to assess the development and acquisition of practice competencies among health professions students. Judgments related to the quality of performance are typically operationalized as ratings against success criteria specified within a rubric. The extent to which the rubric is understood,…
Descriptors: Protocol Analysis, Scoring Rubrics, Interviews, Performance Based Assessment
Bamdev, Pakhi; Grover, Manraj Singh; Singla, Yaman Kumar; Vafaee, Payman; Hama, Mika; Shah, Rajiv Ratn – International Journal of Artificial Intelligence in Education, 2023
English proficiency assessments have become a necessary metric for filtering and selecting prospective candidates for both academia and industry. With the rise in demand for such assessments, it has become increasingly necessary to produce automated, human-interpretable results to prevent inconsistencies and ensure meaningful feedback to the…
Descriptors: Language Proficiency, Automation, Scoring, Speech Tests
Doewes, Afrizal; Pechenizkiy, Mykola – International Educational Data Mining Society, 2021
Scoring essays is generally an exhausting and time-consuming task for teachers. Automated Essay Scoring (AES) makes the scoring process faster and more consistent. The most logical way to assess the performance of an automated scorer is by measuring its score agreement with human raters. However, we provide empirical evidence that…
Descriptors: Man Machine Systems, Automation, Computer Assisted Testing, Scoring
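As a point of reference, human-machine score agreement in AES work is often summarized with quadratic weighted kappa. The snippet below is an illustrative sketch using scikit-learn, not the evaluation code from this study, and the scores are made-up values on an assumed 1-6 holistic scale.

```python
# Illustrative sketch: agreement between human and automated essay scores,
# summarized with quadratic weighted kappa (a common AES agreement metric).
from sklearn.metrics import cohen_kappa_score

# Hypothetical holistic scores (1-6) for ten essays.
human_scores = [3, 4, 4, 2, 5, 3, 6, 4, 2, 5]
machine_scores = [3, 4, 5, 2, 5, 3, 5, 4, 3, 5]

qwk = cohen_kappa_score(human_scores, machine_scores, weights="quadratic")
print(f"Quadratic weighted kappa: {qwk:.3f}")
```

The "however" in the abstract suggests that an agreement statistic like this may not tell the whole story about scorer quality on its own.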
A Review of the Inquiry and Utility of Mineral and Rock Labs for Use in Introductory Geology Courses
Meryssa Piper; Jessica Frankle; Sophia Owens; Blake Stubbins; Lancen Tully; Katherine Ryker – Journal of Geoscience Education, 2025
Rock and mineral labs are fundamental in traditional introductory geology courses. Successful implementation of these lab activities provides students opportunities to apply content knowledge. Inquiry-based instruction may be one way to increase student success. Prior examination of published STEM labs indicates that geology labs, particularly…
Descriptors: Geology, Introductory Courses, Laboratory Experiments, Inquiry
Yangmeng Xu; Stefanie A. Wind – Educational Measurement: Issues and Practice, 2025
Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…
Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods
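For readers unfamiliar with the two designs, the sketch below contrasts random double-scoring with one simplified version of targeted selection (flagging first-rater scores near a decision cut). The selection rules, function names, and ratings are assumptions for illustration, not the procedures evaluated in the study.

```python
# Illustrative sketch: choosing which constructed responses get a second rating.
# Random double-scoring draws a fixed proportion at random; this simplified
# "targeted" rule flags responses whose first-rater score sits near a cut score.
import random

def random_double_scoring(first_scores, proportion=0.2, seed=42):
    """Randomly select a proportion of responses for a second rating."""
    rng = random.Random(seed)
    n_select = max(1, int(len(first_scores) * proportion))
    return sorted(rng.sample(range(len(first_scores)), n_select))

def targeted_double_scoring(first_scores, cut_score, tolerance=1):
    """Select responses whose first-rater score is within `tolerance` of a cut."""
    return [i for i, s in enumerate(first_scores) if abs(s - cut_score) <= tolerance]

first_scores = [1, 4, 3, 2, 5, 3, 0, 4, 2, 3]      # hypothetical 0-5 ratings
print(random_double_scoring(first_scores))          # indices chosen at random
print(targeted_double_scoring(first_scores, cut_score=3))
```

The practical difference is that the targeted rule concentrates the second-rating budget where a scoring disagreement could change a classification decision.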
Alexandra Jackson; Cheryl Bodnar; Elise Barrella; Juan Cruz; Krista Kecskemety – Journal of STEM Education: Innovations and Research, 2025
Recent curricular interventions in engineering education have focused on encouraging students to develop an entrepreneurial mindset (EM) to equip them with the skills needed to generate innovative ideas and address complex global problems upon entering the workforce. Methods to evaluate these interventions have been inconsistent due to the lack of…
Descriptors: Engineering Education, Entrepreneurship, Concept Mapping, Student Evaluation
Danwei Cai; Ben Naismith; Maria Kostromitina; Zhongwei Teng; Kevin P. Yancey; Geoffrey T. LaFlair – Language Learning, 2025
Globalization and increases in the numbers of English language learners have led to a growing demand for English proficiency assessments of spoken language. In this paper, we describe the development of an automatic pronunciation scorer built on state-of-the-art deep neural network models. The model is trained on a bespoke human-rated dataset that…
Descriptors: Automation, Scoring, Pronunciation, Speech Tests
Janika Saretzki; Rosalie Andrae; Boris Forthmann; Mathias Benedek – Journal of Creative Behavior, 2025
Divergent thinking (DT) ability is widely regarded as a central cognitive capacity underlying creativity, but its assessment is challenged by the fact that DT tasks yield a variable number of responses. Various approaches for the scoring of DT tasks have been proposed, which differ in how responses are evaluated and aggregated within a task. The…
Descriptors: Creative Thinking, Creativity Tests, Scoring, Metacognition
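To illustrate why the aggregation choice matters, the sketch below applies several common summary rules (sum, mean, maximum, top-k average) to per-response creativity ratings for a single DT task. The ratings and rule names are illustrative assumptions, not the specific scoring approaches compared in the article.

```python
# Illustrative sketch: aggregating per-response ratings for one divergent
# thinking task. Different rules reward fluency (sum), average quality (mean),
# or peak quality (max / top-k), which can change how examinees are ranked.
from statistics import mean

def aggregate_dt(ratings, top_k=2):
    top = sorted(ratings, reverse=True)[:top_k]
    return {
        "sum": sum(ratings),            # sensitive to the number of responses
        "mean": mean(ratings),          # controls for response count
        "max": max(ratings),            # single best idea
        f"top_{top_k}_mean": mean(top), # best few ideas
    }

# Two hypothetical examinees: many average ideas vs. a few strong ones.
print(aggregate_dt([2, 2, 3, 2, 2, 3]))
print(aggregate_dt([4, 5]))
```

Under the sum rule the first examinee scores higher; under the max or top-2 rule the second does, which is exactly the kind of divergence between scoring approaches the abstract points to.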
Michael O. Martin, Editor; Julian Fraillon, Editor; Heiko Sibberns, Editor; Betina Borisova, Contributor; Ekaterina Buzkich, Contributor; David Ebbs, Contributor; Eugenio Gonzalez, Contributor; Seamus Hegarty, Contributor; Sabine Meinck, Contributor; Sebastian Meyer, Contributor; Irini Moustaki, Contributor; Lauren Musu, Contributor; Keith Rust, Contributor; Ulrich Sievers, Contributor; Matthias von Davier, Contributor; Kentaro Yamamoto, Contributor – International Association for the Evaluation of Educational Achievement, 2025
This publication presents "IEA's Technical Standards for International Large-Scale Assessment." The initial standards, published in 1999, aimed to consolidate the best practices and methodological rigor in IEA's approach to educational assessment, addressing the unique needs of international studies. The standards presented in this…
Descriptors: International Assessment, Standards, Test Construction, Data Collection
Rosaline Tandiono; Amelia Limijaya – Asia-Pacific Education Researcher, 2025
Self and peer assessment in group work offers numerous benefits but is also susceptible to bias. Yet, research examining bias in self and peer assessments often overlooks cultural perspectives and predominantly favors Western contexts. This study aims to address this gap by examining how culture influences rater bias in self and peer assessments…
Descriptors: Evaluators, Bias, Self Evaluation (Individuals), Peer Evaluation
Frank Morley; Emma Walland – Research Matters, 2025
The recent development of Large Language Models (LLMs) such as Claude, Gemini, and GPT has led to widespread attention on potential applications of these models. Marking exams is a domain which requires the ability to interpret and evaluate student responses (often consisting of written text), and the potential for artificial intelligence (AI)…
Descriptors: Ethics, Artificial Intelligence, Automation, Scoring

