Hyemin Yoon; HyunJin Kim; Sangjin Kim – Measurement: Interdisciplinary Research and Perspectives, 2024
We have maintained the customer grade system that is being implemented to customers with excellent performance through customer segmentation for years. Currently, financial institutions that operate the customer grade system provide similar services based on the score calculation criteria, but the score calculation criteria vary from the financial…
Descriptors: Classification, Artificial Intelligence, Prediction, Decision Making
Marine Simon; Alexandra Budke – Journal of Geography in Higher Education, 2024
Comparison is an important geographic method and a common task in geography education. Mastering comparison is a complex competency and written comparisons are challenging tasks both for students and assessors. As yet, however, there is no set test for evaluating comparison competency nor tool for enhancing it. Moreover, little is known about…
Descriptors: Geography Instruction, Student Evaluation, Comparative Analysis, Reliability
Kelly, Kate Tremain; Richardson, Mary; Isaacs, Talia – Assessment in Education: Principles, Policy & Practice, 2022
Comparative judgment is gaining popularity as an assessment tool, including for high-stakes testing purposes, despite relatively little research on the use of the technique. Advocates claim two main rationales for its use: that comparative judgment is valid because humans are better at comparative than absolute judgment, and because it distils the…
Descriptors: Comparative Analysis, Evaluation Methods, Evaluative Thinking, High Stakes Tests
Gill, Tim – Research Matters, 2022
In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…
Descriptors: Comparative Analysis, Decision Making, Scripts, Standards
Weidlich, Joshua; Gašević, Dragan; Drachsler, Hendrik – Journal of Learning Analytics, 2022
As a research field geared toward understanding and improving learning, Learning Analytics (LA) must be able to provide empirical support for causal claims. However, as a highly applied field, tightly controlled randomized experiments are not always feasible nor desirable. Instead, researchers often rely on observational data, based on which they…
Descriptors: Causal Models, Inferences, Learning Analytics, Comparative Analysis
Walland, Emma – Research Matters, 2022
In this article, I report on examiners' views and experiences of using Pairwise Comparative Judgement (PCJ) and Rank Ordering (RO) as alternatives to traditional analytical marking for GCSE English Language essays. Fifteen GCSE English Language examiners took part in the study. After each had judged 100 pairs of essays using PCJ and eight packs of…
Descriptors: Essays, Grading, Writing Evaluation, Evaluators
Sims, Sam; Anders, Jake; Zieger, Laura – Journal of Research on Educational Effectiveness, 2022
Comparative interrupted time series (CITS) designs evaluate impact by modeling the relative deviation from trends among a treatment and comparison group after an intervention. The broad applicability of the design means it is widely used in education research. Like all non-experimental evaluation methods however, the internal validity of a given…
Descriptors: Validity, Comparative Analysis, Statistical Analysis, Intervention
Davies, Ben; Miller, David; Infante, Nicole – International Journal of Mathematical Education in Science and Technology, 2023
We report on a series of task-based interviews in which nine mathematicians were asked to evaluate a series of six mathematical arguments, purportedly produced either by fellow mathematicians or undergraduate students. In this paper, we attend to the role of context in mathematicians' responses, leading to four themes in expectations when…
Descriptors: Undergraduate Students, Validity, Mathematical Logic, Persuasive Discourse
Azman Ong, Mohd Hanafi; Mohd Yasin, Norazlina; Ibrahim, Nur Syafikah – Asian Association of Open Universities Journal, 2022
Purpose: Measuring internal response of online learning is seen as fundamental to absorptive capacity which stimulates knowledge assimilation. However, the evaluation of practice and research of validated instruments that could effectively measure online learning response behavior is limited. Thus, in this study, a new instrument was designed…
Descriptors: Online Courses, Student Surveys, Student Attitudes, Factor Analysis
Lu, Xiaofei; Wu, Jifeng – Modern Language Journal, 2022
This study proposed a set of measures for assessing noun phrase (NP) complexity in second language (L2) Chinese writing and compared the predictive power of these measures for L2 Chinese writing quality to that of a set of syntactic complexity measures based on the topic-comment unit (TC-unit). Our data consisted of 101 narratives written by…
Descriptors: Writing Instruction, Syntax, Chinese, Second Language Learning
Marshall, Neil; Shaw, Kirsten; Hunter, Jodie; Jones, Ian – New Zealand Journal of Educational Studies, 2020
There is growing interest in using comparative judgement to assess student work as an alternative to traditional marking. Comparative judgement requires no rubrics and is instead grounded in experts making pairwise judgements about the relative 'quality' of students' work according to a high level criterion. The resulting decision data are fitted…
Descriptors: Comparative Analysis, Decision Making, Student Evaluation, Evaluation Methods
Akbari, Alireza; Shahnazari, Mohammadtaghi – Language Testing in Asia, 2019
The present research paper introduces a translation evaluation method called Calibrated Parsing Items Evaluation (CPIE hereafter). This evaluation method maximizes translators' performance through identifying the parsing items with an optimal p-docimology and d-index (item discrimination). This method checks all the possible parses (annotations)…
Descriptors: Test Items, Translation, Computer Software, Evaluators
Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022
Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…
Descriptors: Validity, Discourse Analysis, Databases, Scoring
Zimmerman, Kathleen N.; Ledford, Jennifer R.; Severini, Katherine E.; Pustejovsky, James E.; Barton, Erin E.; Lloyd, Blair P. – Grantee Submission, 2018
Tools for evaluating the quality and rigor of single case research designs (SCD) are often used when conducting SCD syntheses. Preferred components include evaluations of design features related to the internal validity of SCD to obtain quality and/or rigor ratings. Three tools for evaluating the quality and rigor of SCD (Council for Exceptional…
Descriptors: Research Design, Evaluation Methods, Synthesis, Validity
Khamboonruang, Apichat – rEFLections, 2022
Although much research has compared the functioning between analytic and holistic rating scales, little research has compared the functioning of binary rating scales with other types of rating scales. This quantitative study set out to preliminarily and comparatively validate binary and analytic rating scales intended for use in formative…
Descriptors: Writing Evaluation, Evaluation Methods, Second Language Learning, Second Language Instruction