Lucy Chambers; Emma Walland; Jo Ireland – Research Matters, 2024
Comparative Judgement (CJ) is traditionally and primarily used to compare written texts. In this study we explored whether we could extend its use to comparing audio files. We used GCSE Music portfolios which contained a mix of audio recordings, musical scores and text documents. Fifteen judges completed two exercises: one comparing musical…
Descriptors: Evaluative Thinking, Judges, Comparative Analysis, Reliability
Marine Simon; Alexandra Budke – Journal of Geography in Higher Education, 2024
Comparison is an important geographic method and a common task in geography education. Mastering comparison is a complex competency and written comparisons are challenging tasks both for students and assessors. As yet, however, there is no set test for evaluating comparison competency nor tool for enhancing it. Moreover, little is known about…
Descriptors: Geography Instruction, Student Evaluation, Comparative Analysis, Reliability
Jönsson, Anders; Balan, Andreia – Practical Assessment, Research & Evaluation, 2018
Research on teachers' grading has shown that there is great variability among teachers regarding both the process and product of grading, resulting in low comparability and issues of inequality when using grades for selection purposes. Despite this situation, not much is known about the merits or disadvantages of different models for grading. In…
Descriptors: Grading, Models, Reliability, Validity
Igor Esnaola; Albert Sesé; Lorea Azpiazu; Yina Wang – British Journal of Educational Psychology, 2024
Background: Modelling academic self-concept through second-order factors or bifactor structures is an important issue with substantive and practical implications; besides, the bifactor model has not been analysed with a Chinese sample and cross-cultural studies in the academic self-concept are scarce. Likewise, latent structure validity evidence…
Descriptors: Academic Achievement, Self Concept, Psychometrics, Validity
Lind, Veronika; Svensson, Melanie; Harringe, Marita L. – Measurement in Physical Education and Exercise Science, 2022
Goniometry is commonly used to evaluate joint range of motion (ROM). The most widespread method, a manual universal goniometer (UG), is considered time-consuming and difficult to handle. The digital goniometer EasyAngle (EA) was developed to improve and simplify the evaluation of ROM. This study aimed to evaluate the reliability and validity of EA…
Descriptors: Motor Reactions, Measurement Techniques, Comparative Analysis, Measurement Equipment
Damian, Elena; Meuleman, Bart; van Oorschot, Wim – Sociological Methods & Research, 2022
In this article, we examine whether cross-national studies disclose enough information for independent researchers to evaluate the validity and reliability of the findings (evaluation transparency) or to perform a direct replication (replicability transparency). The first contribution is theoretical. We develop a heuristic theoretical model…
Descriptors: National Surveys, Cross Cultural Studies, Social Science Research, Periodicals
Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025
This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics
Gill, Tim – Research Matters, 2022
In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…
Descriptors: Comparative Analysis, Decision Making, Scripts, Standards
Walland, Emma – Research Matters, 2022
In this article, I report on examiners' views and experiences of using Pairwise Comparative Judgement (PCJ) and Rank Ordering (RO) as alternatives to traditional analytical marking for GCSE English Language essays. Fifteen GCSE English Language examiners took part in the study. After each had judged 100 pairs of essays using PCJ and eight packs of…
Descriptors: Essays, Grading, Writing Evaluation, Evaluators
Chantelle Bosch – Journal of Education and e-Learning Research, 2024
This study investigates the validity and reliability of the Intrinsic Motivation Inventory (IMI) in the context of blended learning. In the digital age, the fusion of online components with traditional classroom instruction has become integral to modern pedagogy, giving rise to blended learning--a flexible approach accommodating diverse learning…
Descriptors: Psychometrics, Student Motivation, Economics Education, Blended Learning
Di, Weiwei; Nie, Youyan; Chua, Bee Leng; Chye, Stefanie; Teo, Timothy – Journal of Psychoeducational Assessment, 2023
General self-efficacy represents the global sense of personal capability across various situations and tasks. The aim of the present study was to develop and validate a single-item general self-efficacy scale which balances practical demands and psychometric concerns. The psychometric properties of the proposed Single-Item General Self-Efficacy…
Descriptors: Self Efficacy, Self Concept Measures, Psychometrics, Adults
Wang, Faming; Wang, Yehui; Liu, Yaping; Leung, Shing On – Scandinavian Journal of Educational Research, 2023
The importance of the opportunity to learn (OTL) for mathematics achievement has been extensively researched. However, there were still unanswered questions regarding OTL's measurement, analytical level, and relationship with motivational beliefs. To fill in the gaps, we aimed to (1) scrutinize the reliability and validity of OTL, (2) investigate…
Descriptors: International Assessment, Foreign Countries, Achievement Tests, Secondary School Students
van der Scheer, Emmelien A.; Bijlsma, Hannah J. E.; Glas, Cees A. W. – School Effectiveness and School Improvement, 2019
A Bayesian IRT-model approach was used to investigate the validity and reliability of student perceptions of teaching quality. Furthermore, the student perceptions were compared with ratings of teaching quality by external observers. Grade 4 students (n = 675) filled out a questionnaire that was used to measure their opinions about the lessons of…
Descriptors: Student Attitudes, Validity, Interrater Reliability, Correlation
Jason E. Stone – ProQuest LLC, 2021
Personalized learning tracks within MOOCs remain underdeveloped. Despite MOOCs possessing tremendous potential for personalized learning, little individualization of the MOOC has occurred. Some students look at MOOCs as a textbook, others as a formal course, and others as an opportunity to socialize. Understanding student enrollment needs are a…
Descriptors: MOOCs, Student Motivation, Independent Study, Rating Scales
Deribo, Tobias; Goldhammer, Frank; Kroehne, Ulf – Educational and Psychological Measurement, 2023
As researchers in the social sciences, we are often interested in studying not directly observable constructs through assessments and questionnaires. But even in a well-designed and well-implemented study, rapid-guessing behavior may occur. Under rapid-guessing behavior, a task is skimmed shortly but not read and engaged with in-depth. Hence, a…
Descriptors: Reaction Time, Guessing (Tests), Behavior Patterns, Bias