NotesFAQContact Us
Collection
Advanced
Search Tips
Showing 76 to 90 of 3,126 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Emily W. Wang; Maria I. Grigos – Journal of Speech, Language, and Hearing Research, 2024
Purpose: The aim of this study was to describe changes in speech intelligibility and interrater and intrarater reliability of naive listeners' ratings of words produced by young children diagnosed with childhood apraxia of speech (CAS) over a period of motor-based intervention (dynamic temporal and tactile cueing [DTTC]). Method: A total of 120…
Descriptors: Speech Communication, Intelligibility, Speech Impairments, Perceptual Motor Learning
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Ebru Öztürk; Erol Duran – Educational Policy Analysis and Strategic Research, 2024
In this study, it was aimed to develop a rubric to evaluate the creative story writing skill levels of seventh grade secondary school students. The research was designed in quantitative research method and survey model. In the research, convenience sampling technique was used and 270 students studying at the seventh grade level of secondary school…
Descriptors: Scoring Rubrics, Writing Evaluation, Creative Writing, Middle School Students
Peer reviewed Peer reviewed
Direct linkDirect link
Maestrales, Sarah; Zhai, Xiaoming; Touitou, Israel; Baker, Quinton; Schneider, Barbara; Krajcik, Joseph – Journal of Science Education and Technology, 2021
In response to the call for promoting three-dimensional science learning (NRC, 2012), researchers argue for developing assessment items that go beyond rote memorization tasks to ones that require deeper understanding and the use of reasoning that can improve science literacy. Such assessment items are usually performance-based constructed…
Descriptors: Artificial Intelligence, Scoring, Evaluation Methods, Chemistry
Peer reviewed Peer reviewed
Direct linkDirect link
Gwet, Kilem L. – Educational and Psychological Measurement, 2021
Cohen's kappa coefficient was originally proposed for two raters only, and it later extended to an arbitrarily large number of raters to become what is known as Fleiss' generalized kappa. Fleiss' generalized kappa and its large-sample variance are still widely used by researchers and were implemented in several software packages, including, among…
Descriptors: Sample Size, Statistical Analysis, Interrater Reliability, Computation
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Hosseinali Gholami – Mathematics Teaching Research Journal, 2025
Scoring mathematics exam papers accurately is vital for fostering students' engagement and interest in the subject. Incorrect scoring practices can erode motivation and lead to the development of false self-confidence. Therefore, the implementation of appropriate scoring methods is essential for the success of mathematics education. This study…
Descriptors: Interrater Reliability, Mathematics Teachers, Scoring, Mathematics Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Peer reviewed Peer reviewed
Direct linkDirect link
Chase Young; Benjamin Mitchell-Yellin; George Kevin Randall – Active Learning in Higher Education, 2025
The purpose of this study was to develop a valid, reliable, and brief measure of active learning in college classrooms that is cheap and easy to complete and yields results that faculty can easily use to inform their development as instructors. Initial construct and face validity was achieved by modifying existing instruments and creating a draft…
Descriptors: College Faculty, College Students, Active Learning, Classroom Observation Techniques
Peer reviewed Peer reviewed
Direct linkDirect link
Louise Badham – Oxford Review of Education, 2025
Different sources of assessment evidence are reviewed during International Baccalaureate (IB) grade awarding to convert marks into grades and ensure fair results for students. Qualitative and quantitative evidence are analysed to determine grade boundaries, with statistical evidence weighed against examiner judgement and teachers' feedback on…
Descriptors: Advanced Placement Programs, Grading, Interrater Reliability, Evaluative Thinking
Peer reviewed Peer reviewed
Direct linkDirect link
Verdugo, Miguel Angel; Vicente, Eva; Guillén, Verónica Marina; Sánchez, Sergio; Ibáñez, Alba; Gómez, Laura Elisabet – International Journal of Developmental Disabilities, 2023
Background: Appropriate supports and instructional practices contribute to the development of self-determination. Also, research shows that the promotion of skills related to self-determination has been linked to the achievement of desired outcomes over the different life stages. Advances in self-determination require the development of assessment…
Descriptors: Measures (Individuals), Self Determination, Intellectual Disability, Test Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Pouls, Claudia; Jeandarme, Inge – Journal of Mental Health Research in Intellectual Disabilities, 2023
Background: The ARMIDILO-S is advocated as a promising tool for assessing dynamic risk factors in sex offenders with intellectual disabilities (SOIDs). However, research remains scarce. The present study aimed to further validate this instrument in SOIDs. Method: The study prospectively followed 38 SOIDs for up to one year to test the accuracy of…
Descriptors: Test Reliability, Test Validity, Sexual Abuse, Criminals
Peer reviewed Peer reviewed
Direct linkDirect link
Ilona Rinne – Assessment & Evaluation in Higher Education, 2024
It is widely acknowledged in research that common criteria and aligned standards do not result in consistent assessment of such a complex performance as the final undergraduate thesis. Assessment is determined by examiners' understanding of rubrics and their views on thesis quality. There is still a gap in the research literature about how…
Descriptors: Foreign Countries, Undergraduate Students, Teacher Education Programs, Evaluation Criteria
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Yang Yang – Shanlax International Journal of Education, 2024
This paper explores the reliability of using ChatGPT in evaluating EFL writing by assessing its intra- and inter-rater reliability. Eighty-two compositions were randomly sampled from the Written English Corpus of Chinese Learners. These compositions were rated by three experienced raters with regard to 'language', 'content', and 'organization'.…
Descriptors: English (Second Language), Second Language Instruction, Writing (Composition), Evaluation Methods
Peer reviewed Peer reviewed
Direct linkDirect link
Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024
This article focuses on rater severity and consistency and their relation to different types of rater experience over a long period of time. The article is based on longitudinal data collected from 2009 to 2019 from the second language Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. The study investigated…
Descriptors: Foreign Countries, Interrater Reliability, Error of Measurement, Experience
Peer reviewed Peer reviewed
Direct linkDirect link
Bijani, Houman; Hashempour, Bahareh; Ibrahim, Khaled Ahmed Abdel-Al; Orabah, Salim Said Bani; Heydarnejad, Tahereh – Language Testing in Asia, 2022
Due to subjectivity in oral assessment, much concentration has been put on obtaining a satisfactory measure of consistency among raters. However, the process for obtaining more consistency might not result in valid decisions. One matter that is at the core of both reliability and validity in oral assessment is rater training. Recently,…
Descriptors: Oral Language, Language Tests, Feedback (Response), Bias
Peer reviewed Peer reviewed
PDF on ERIC Download full text
McCaffrey, Daniel F.; Casabianca, Jodi M.; Ricker-Pedley, Kathryn L.; Lawless, René R.; Wendler, Cathy – ETS Research Report Series, 2022
This document describes a set of best practices for developing, implementing, and maintaining the critical process of scoring constructed-response tasks. These practices address both the use of human raters and automated scoring systems as part of the scoring process and cover the scoring of written, spoken, performance, or multimodal responses.…
Descriptors: Best Practices, Scoring, Test Format, Computer Assisted Testing
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  209