Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Marjahan Begum; Pontus Haglund; Ari Korhonen; Violetta Lonati; Mattia Monga; Filip Strömbäck; Artturi Tilanterä – Informatics in Education, 2024
There can be many reasons why students fail to answer summative tests correctly in advanced computer science courses: often the cause is a lack of prerequisites or misconceptions about topics presented in previous courses. One of the ITiCSE 2020 working groups investigated the possibility of designing assessments suitable for differentiating…
Descriptors: Foreign Countries, College Students, Prerequisites, Computer Science Education
Yang Yang – Shanlax International Journal of Education, 2024
This paper explores the reliability of using ChatGPT in evaluating EFL writing by assessing its intra- and inter-rater reliability. Eighty-two compositions were randomly sampled from the Written English Corpus of Chinese Learners. These compositions were rated by three experienced raters with regard to 'language', 'content', and 'organization'.…
Descriptors: English (Second Language), Second Language Instruction, Writing (Composition), Evaluation Methods
Power, Jason Richard; Tanner, David – European Journal of Engineering Education, 2023
Self and peer assessments have been identified as effective strategies to develop a deeper understanding of complex concepts, enhance meta-cognitive capacity, and support learner self-efficacy. This study examines data related to peer and self-assessment exercises completed within a university engineering programme (n=61). Data related to…
Descriptors: Peer Evaluation, Self Evaluation (Individuals), Feedback (Response), Engineering Education
Marine Simon; Alexandra Budke – Journal of Geography in Higher Education, 2024
Comparison is an important geographic method and a common task in geography education. Mastering comparison is a complex competency and written comparisons are challenging tasks both for students and assessors. As yet, however, there is no set test for evaluating comparison competency nor tool for enhancing it. Moreover, little is known about…
Descriptors: Geography Instruction, Student Evaluation, Comparative Analysis, Reliability
Huiying Cai; Xun Yan – Language Testing, 2024
Rater comments tend to be qualitatively analyzed to indicate raters' application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The…
Descriptors: Natural Language Processing, Item Response Theory, Rating Scales, Writing Evaluation
Chao Han; Binghan Zheng; Mingqing Xie; Shirong Chen – Interpreter and Translator Trainer, 2024
Human raters' assessment of interpreting is a complex process. Previous researchers have mainly relied on verbal reports to examine this process. To advance our understanding, we conducted an empirical study, collecting raters' eye-movement and retrospection data in a computerised interpreting assessment in which three groups of raters (n = 35)…
Descriptors: Foreign Countries, College Students, College Graduates, Interrater Reliability
Rossin, Emily G.; Bergee, Martin J. – Journal of Research in Music Education, 2021
This is the sixth and culminating study in a series whose purpose has been to acquire a conceptual understanding of school band performance and to develop an assessment based on this understanding. With the present study, we cross-validated and applied a rating scale for school band performance. In the cross-validation phase, college students…
Descriptors: Music Education, Music Activities, Music, Performance
Flor de Lis González-Mujico – Education and Information Technologies, 2024
Over the past decade, self-assessment tools have garnered significant attention in the interest of measuring the skillset required by educators and students to function productively and ethically in digitally mediated environments, particularly in relation to education policy implementation. Since stated beliefs do not always align with actual…
Descriptors: Technological Literacy, Evaluation Methods, Test Validity, Test Construction
Roselyn Peterson; Robert D. Dvorak; Emily K. Burr; Ardhys N. De Leon; Samantha J. Klaver; Madison H. Maynard; Emma R. Hayden; Bradley Aguilar – Journal of Drug Education, 2024
Alcohol protective behavioral strategies (PBS) are commonly conceptualized with a three-factor model, as used in the Protective Behavioral Strategies Scale--20 (PBSS-20). However, inconsistencies exist between factors and drinking outcomes. The current study used factor analysis to test a two-factor structure directly via controlled consumption…
Descriptors: College Students, College Faculty, School Personnel, Alcohol Abuse
Chahna Gonsalves – Journal of Learning Development in Higher Education, 2025
Generative AI (GenAI) is transforming higher education. It has already challenged the validity of traditional assessment methods and revealed concerns about the authenticity and reliability of conventional approaches. This opinion piece proposes an expanded theoretical framework for contextual learning, incorporating practical, situational,…
Descriptors: Artificial Intelligence, Higher Education, Evaluation Methods, Technology Uses in Education
Keshavarz, Hamid; Norouzi, Yaghoub – New Review of Academic Librarianship, 2022
Owing to the extreme importance of evaluating the credibility of existing scientific websites, the present study sets out to measure a proposed model concerning the views and preferences of university students in Iran when evaluating information. Data were collected by administering a highly validated questionnaire among 487 students in ten top…
Descriptors: Credibility, Evaluation Methods, Scientific and Technical Information, Web Sites
Konstantin Vinokic; Lukas Begrich; Mareike Kunter; Susanne Kuger – Frontline Learning Research, 2024
Thin slices ratings (i.e., ratings based on first impressions) have yielded intriguingly accurate results in various domains. Among others, researchers have applied the thin slices technique to assess instructional quality, showing that teacher-student interactions can be reliably inferred from very short snippets of classroom instruction. The…
Descriptors: Teacher Effectiveness, Teacher Student Relationship, Foreign Countries, Classroom Observation Techniques
Abdel Azim Zumrawi; Leah P. Macfadyen – Cogent Education, 2023
Student Evaluations of Teaching (SETs) gather crucial feedback on student experiences of teaching and learning and have been used for decades to evaluate the quality of teaching and student experience of instruction. In this paper, we make the case for an important improvement to the analysis of SET data that can further refine its interpretation.…
Descriptors: Likert Scales, Student Evaluation of Teacher Performance, Student Attitudes, Reliability
Kübra Karakaya Özyer – Journal of Social Studies Education Research, 2024
The study aims to assess online assessment practices in a public university, addressing questions about self-efficacy levels, tools used, challenges faced, and proposed solutions. The chosen methodology employs a cross-sectional survey design, collecting both quantitative and qualitative data from 50 instructors in Türkiye through a convenience…
Descriptors: Foreign Countries, Student Evaluation, Computer Assisted Testing, College Students