Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 16 |
Since 2006 (last 20 years) | 32 |
Descriptor
Comparative Analysis | 33 |
Evaluators | 33 |
Statistical Analysis | 33 |
Foreign Countries | 14 |
Second Language Learning | 14 |
English (Second Language) | 11 |
Second Language Instruction | 10 |
Scores | 9 |
Teaching Methods | 9 |
Correlation | 8 |
College Students | 7 |
Author
Coniam, David | 2 |
Ahmadi, Alireza | 1 |
Attali, Yigal | 1 |
Bradic, Lejla | 1 |
Buzick, Heather | 1 |
Christie, Christina A. | 1 |
Davis, James R. | 1 |
Davis, Stephen | 1 |
Dekle, Dawn J. | 1 |
Engels, Rutger C. M. E. | 1 |
Fajkic, Almir | 1 |
Publication Type
Journal Articles | 29 |
Reports - Research | 23 |
Reports - Evaluative | 6 |
Dissertations/Theses -… | 3 |
Opinion Papers | 1 |
Reports - Descriptive | 1 |
Speeches/Meeting Papers | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 16 |
Postsecondary Education | 10 |
Grade 1 | 1 |
Grade 11 | 1 |
Secondary Education | 1 |
Audience
Researchers | 1 |
Location
Canada | 2 |
Hong Kong | 2 |
Iran | 2 |
Argentina | 1 |
Bosnia and Herzegovina… | 1 |
Israel | 1 |
Mexico | 1 |
Mexico (Oaxaca) | 1 |
Mississippi | 1 |
Netherlands | 1 |
Spain (Barcelona) | 1 |
Assessments and Surveys
Graduate Record Examinations | 1 |
Rosenberg Self Esteem Scale | 1 |
White, Lisa – ProQuest LLC, 2017
Although used in the corporate world for decades, using a multi-rater tool to evaluate school leaders began relatively recently. With states seeking flexibility from the "Elementary and Secondary Education Act of 1965" (reauthorized as the "No Child Left Behind Act of 2001"), the requirement to develop and implement principal…
Descriptors: Principals, Administrator Evaluation, Surveys, Self Evaluation (Individuals)
Lamprianou, Iasonas – Educational and Psychological Measurement, 2018
It is common practice for assessment programs to organize qualifying sessions during which the raters (often known as "markers" or "judges") demonstrate their consistency before operational rating commences. Because of the high-stakes nature of many rating activities, the research community tends to continuously explore new…
Descriptors: Social Networks, Network Analysis, Comparative Analysis, Innovation
Yun, Jiyeo – ProQuest LLC, 2017
Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…
Descriptors: Interrater Reliability, Essays, Scoring, Evaluators
Morris, Darrell; Pennell, Ashley M.; Perney, Jan; Trathen, Woodrow – Reading Psychology, 2018
This study compared reading rate to reading fluency (as measured by a rating scale). After listening to first graders read short passages, we assigned an overall fluency rating (low, average, or high) to each reading. We then used predictive discriminant analyses to determine which of five measures--accuracy, rate (objective); accuracy, phrasing,…
Descriptors: Reading Fluency, Prediction, Grade 1, Elementary School Students
Silvey, Brian A.; Wacker, Aaron T.; Felder, Logan – International Journal of Music Education, 2017
The purpose of this study was to investigate the effects of baton usage on college musicians' perceptions of ensemble performance. Two conductors were videotaped while conducting a 1-minute excerpt from either a technical ("Pathfinder of Panama," John Philip Sousa) or lyrical ("Seal Lullaby," Eric Whitacre) piece of concert…
Descriptors: Musicians, College Students, Student Attitudes, Music Activities
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
Teker, Gulsen Tasdelen; Guler, Nese; Uyanik, Gulden Kaya – Educational Sciences: Theory and Practice, 2015
Generalizability theory (G theory) provides a broad conceptual framework for social sciences such as psychology and education, and a comprehensive construct for numerous measurement events by using analysis of variance, a strong statistical method. G theory, as an extension of both classical test theory and analysis of variance, is a model which…
Descriptors: Guidelines, Generalizability Theory, Computer Software, Statistical Analysis
Levis, John M.; Levis, Greta Muller – CATESOL Journal, 2018
Pronunciation features are not equal in how they affect listeners' ability to understand. Some are low value, while others are high value. This study explores whether contrastive stress is high value. Previous research has shown that identification of contrastive stress is learnable (Pennington & Ellis, 2000), and that explicit teaching about…
Descriptors: Pronunciation, Pronunciation Instruction, English (Second Language), Second Language Learning
Buzick, Heather; Oliveri, Maria Elena; Attali, Yigal; Flor, Michael – Applied Measurement in Education, 2016
Automated essay scoring is a developing technology that can provide efficient scoring of large numbers of written responses. Its use in higher education admissions testing provides an opportunity to collect validity and fairness evidence to support current uses and inform its emergence in other areas such as K-12 large-scale assessment. In this…
Descriptors: Essays, Learning Disabilities, Attention Deficit Hyperactivity Disorder, Scoring
Lehan, Tara; Hussey, Heather; Mika, Eva – Journal of University Teaching and Learning Practice, 2016
Throughout the dissertation process, the chair and committee members provide feedback regarding quality to help the doctoral candidate to produce the highest-quality document and become an independent scholar. Nevertheless, results of previous research suggest that overall dissertation quality generally is poor. Because much of the feedback about…
Descriptors: Graduate Students, Doctoral Dissertations, Student Evaluation, Feedback (Response)
Moshinsky, Avital; Ziegler, David; Gafni, Naomi – International Journal of Testing, 2017
Many medical schools have adopted multiple mini-interviews (MMI) as an advanced selection tool. MMIs are expensive and used to test only a few dozen candidates per day, making it infeasible to develop a different test version for each test administration. Therefore, some items are reused both within and across years. This study investigated the…
Descriptors: Interviews, Medical Schools, Test Validity, Test Reliability
Secic, Damir; Husremovic, Dzenana; Kapur, Eldan; Jatic, Zaim; Hadziahmetovic, Nina; Vojnikovic, Benjamin; Fajkic, Almir; Meholjic, Amir; Bradic, Lejla; Hadzic, Amila – Advances in Physiology Education, 2017
Testing strategies can either have a very positive or negative effect on the learning process. The aim of this study was to examine the degree of consistency in evaluating the practicality and logic of questions from a medical school pathophysiology test, between students and family medicine doctors. The study engaged 77 family medicine doctors…
Descriptors: Medical Students, Physicians, Medicine, Qualitative Research
Schissel, Jamie L.; Leung, Constant; López-Gopar, Mario; Davis, James R. – Language and Education, 2018
The assessments designed for and analyzed in this study used a task-based language design template rooted in theories of language reflecting heteroglossic language practices and funds of knowledge learning theories, which were understood as transforming classroom teaching, learning, and assessment through continua of biliteracy lenses. Using a…
Descriptors: Multilingualism, Spanish, Task Analysis, Preservice Teachers
Saito, Yukie; Saito, Kazuya – Language Teaching Research, 2017
The current study examined in depth the effects of suprasegmental-based instruction on the global (comprehensibility) and suprasegmental (word stress, rhythm, and intonation) development of Japanese learners of English as a foreign language (EFL). Students in the experimental group (n = 10) received a total of three hours of instruction over six…
Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Japanese
Ahmadi, Alireza; Sadeghi, Elham – Language Assessment Quarterly, 2016
In the present study we investigated the effect of test format on oral performance in terms of test scores and discourse features (accuracy, fluency, and complexity). Moreover, we explored how the scores obtained on different test formats relate to such features. To this end, 23 Iranian EFL learners participated in three test formats of monologue,…
Descriptors: Oral Language, Comparative Analysis, Language Fluency, Accuracy