Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024
Item response theory (IRT) models the relationship between the possible scores on a test item and a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…
Descriptors: Item Response Theory, Test Items, Models, Scoring
Kevin C. Haudek; Xiaoming Zhai – International Journal of Artificial Intelligence in Education, 2024
Argumentation, a key scientific practice presented in the "Framework for K-12 Science Education," requires students to construct and critique arguments, but timely evaluation of arguments in large-scale classrooms is challenging. Recent work has shown the potential of automated scoring systems for open response assessments, leveraging…
Descriptors: Accuracy, Persuasive Discourse, Artificial Intelligence, Learning Management Systems
Sinclair, Andrea L., Ed.; Thacker, Arthur, Ed. – Human Resources Research Organization (HumRRO), 2019
California's Commission on Teacher Credentialing (Commission) requires all programs of preliminary multiple and single subject teacher preparation to use a Commission-approved Teaching Performance Assessment (TPA) as one of the program completion requirements for prospective teacher candidates. Three TPA models were approved by the Commission: (1)…
Descriptors: Preservice Teachers, Performance Based Assessment, Models, Credentials
Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025
Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction
Calderon, Angel – Journal of Studies in International Education, 2023
Higher education institutions (HEIs) have long embraced a path towards sustainability and engaged in supporting sustainable development. The adoption of the sustainability development goals has forced HEIs to assess how they engage with these goals and how they address societal challenges head on. However, the emergence of sustainability rankings…
Descriptors: Sustainability, Institutional Evaluation, Sustainable Development, Objectives
Reagan Mozer; Luke Miratrix; Jackie Eunjung Relyea; James S. Kim – Journal of Educational and Behavioral Statistics, 2024
In a randomized trial that collects text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by human raters. An impact analysis can then be conducted to compare treatment and control groups, using the hand-coded scores as a measured outcome. This…
Descriptors: Scoring, Evaluation Methods, Writing Evaluation, Comparative Analysis
Taylor, Gemma; Kolak, Joanna; Bent, Eve M.; Monaghan, Padraic – British Journal of Educational Technology, 2022
In the present paper, we assess whether website rating systems are useful for selecting educational apps for preschool age children. We selected the 10 highest scoring and 10 lowest scoring apps for 2-4-year-olds from two widely used websites (Good App Guide; Common Sense Media). Apps rated highly by the two websites had a higher educational…
Descriptors: Computer Software, Preschool Children, Psycholinguistics, Feedback (Response)
Paquot, Magali; Rubin, Rachel; Vandeweerd, Nathan – Language Learning, 2022
The main objective of this Methods Showcase Article is to show how the technique of adaptive comparative judgment, coupled with a crowdsourcing approach, can offer practical solutions to reliability issues and address the time and cost difficulties associated with a text-based approach to proficiency assessment in L2 research. We…
Descriptors: Comparative Analysis, Decision Making, Language Proficiency, Reliability
Kandaiah, Thiruchelvam; Latip, Siti Halijah – Journal of Science and Mathematics Education in Southeast Asia, 2022
Purpose: The aim of this paper is to study the use of the FIS Response Analysis for Critical Thinking Assessment (FRACTA) method to assess critical thinking in STEM problem solving. The use of the FIS (facts, ideas and solutions) chart as a tool to elicit student critical thinking responses and the method of scoring the responses are investigated. Method:…
Descriptors: Scoring, Critical Thinking, Feedback (Response), Credibility
Al-Salmani, Fatema; Johnson, Jordan; Thacker, Beth – Physical Review Physics Education Research, 2023
We present an analysis of students' thinking skills as evidenced by free-response exam problems during the COVID-19 pandemic. We compare two inquiry-based, laboratory-based classical mechanics courses, one taught online and one taught in person during the pandemic, and two inquiry-based, laboratory-based electricity and magnetism courses, one…
Descriptors: Thinking Skills, Evaluation Methods, Comparative Analysis, Inquiry
Joe Olsen – ProQuest LLC, 2023
Instructional explanations are a ubiquitous component of classroom instruction, but are relatively neglected in science education when compared to other facets of teaching and learning. The ubiquity of instructional explanations and their potential to stimulate learning in students suggests that they should garner more attention from science…
Descriptors: Physics, Comparative Analysis, Student Attitudes, Educational Quality
Vercellotti, MaryLou – TESL-EJ, 2021
Analytic rubrics are promoted as important tools to assess learner performance and to improve learning outcomes. Rubrics, however, are not appropriate for every classroom assessment, particularly given the time and effort required to construct well-designed rubrics. In classroom assessment, instructors must balance the beneficial consequences of…
Descriptors: Scoring Rubrics, Evaluation Methods, Second Language Learning, Second Language Instruction
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2017
This article provides an overview of the Hofstee standard-setting method and illustrates several situations where the Hofstee method will produce undefined cut scores. The situations where the cut scores will be undefined involve cases where the line segment derived from the Hofstee ratings does not intersect the score distribution curve based on…
Descriptors: Cutting Scores, Evaluation Methods, Standard Setting (Scoring), Comparative Analysis
Jia, Lin; Cai, Jianyong; Wang, Jianqin – Language Assessment Quarterly, 2023
In Dynamic Assessment (DA), the observation that individuals respond differently to support, or mediation, is important for diagnoses of development. The concept of learning potential refers to openness to mediation, i.e., the extent of change to performance when mediation is available, which may suggest learners will need less overall instruction…
Descriptors: Learning Processes, Teaching Methods, Second Language Learning, Second Language Instruction
Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022
Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…
Descriptors: Validity, Discourse Analysis, Databases, Scoring