Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 9 |
Since 2016 (last 10 years) | 14 |
Since 2006 (last 20 years) | 28 |
Descriptor
Interrater Reliability | 60 |
Scoring | 60 |
Test Reliability | 60 |
Test Validity | 30 |
Test Construction | 23 |
Writing Evaluation | 19 |
Evaluation Methods | 14 |
Test Items | 12 |
Higher Education | 10 |
Language Tests | 10 |
Psychometrics | 10 |
More ▼ |
Source
Author
Anna-Maria Fall | 2 |
Beula M. Magimairaj | 2 |
Breland, Hunter M. | 2 |
Carlson, Sybil B. | 2 |
Greg Roberts | 2 |
Philip Capin | 2 |
Ronald B. Gillam | 2 |
Sandra L. Gillam | 2 |
Sharon Vaughn | 2 |
Aksu, Gökhan | 1 |
Aktas, Mehtap | 1 |
More ▼ |
Publication Type
Education Level
Audience
Researchers | 6 |
Practitioners | 4 |
Teachers | 3 |
Administrators | 1 |
Location
New Mexico | 2 |
Turkey | 2 |
Australia | 1 |
California | 1 |
California (Los Angeles) | 1 |
Canada | 1 |
Florida | 1 |
Illinois | 1 |
New York | 1 |
Tennessee | 1 |
United Kingdom (England) | 1 |
More ▼ |
Laws, Policies, & Programs
Individuals with Disabilities… | 1 |
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients
Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022
The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…
Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory
Bolton, Tiffany; Stevenson, Brittney; Janes, William – Journal of Occupational Therapy, Schools & Early Intervention, 2023
Researchers utilized a cross-sectional secondary analysis of data within an ongoing non-randomized controlled trial study design to establish the reliability and internal consistency of a novel handwriting assessment for preschoolers, the Just Write! (JW), written by the authors. Seventy-eight children from an area preschool participated in the…
Descriptors: Handwriting, Writing Skills, Writing Evaluation, Preschool Children
Jones, Nathan; Bell, Courtney; Qi, Yi; Lewis, Jennifer; Kirui, David; Stickler, Leslie; Redash, Amanda – ETS Research Report Series, 2021
The observation systems being used in all 50 states require administrators to learn to accurately and reliably score their teachers' instruction using standardized observation systems. Although the literature on observation systems is growing, relatively few studies have examined the outcomes of trainings focused on developing administrators'…
Descriptors: Observation, Standardized Tests, Teacher Evaluation, Test Reliability
Parker, Mark A. J.; Hedgeland, Holly; Jordan, Sally E.; Braithwaite, Nicholas St. J. – European Journal of Science and Mathematics Education, 2023
The study covers the development and testing of the alternative mechanics survey (AMS), a modified force concept inventory (FCI), which used automatically marked free-response questions. Data were collected over a period of three academic years from 611 participants who were taking physics classes at high school and university level. A total of…
Descriptors: Test Construction, Scientific Concepts, Physics, Test Reliability
Lynsey Joohyun Lee – ProQuest LLC, 2021
Reliability and validity are two important topics that have been studied for many decades in the educational measurement field, including discussions of Writing Studies' subfield of writing assessment, since the establishment of the College Entrance Exam Board [CEEB] in 1899 (Huot et al., 2010). In recent years, scholarly conversations of fairness…
Descriptors: Writing Evaluation, Test Validity, Test Reliability, Case Studies
Wendler, Cathy; Glazer, Nancy; Cline, Frederick – ETS Research Report Series, 2019
One of the challenges in scoring constructed-response (CR) items and tasks is ensuring that rater drift does not occur during or across scoring windows. Rater drift reflects changes in how raters interpret and use established scoring criteria to assign essay scores. Calibration is a process used to help control rater drift and, as such, serves as…
Descriptors: College Entrance Examinations, Graduate Study, Accuracy, Test Reliability
Safak, Pinar; Cakmak, Salih; Karakoc, Tamer; Aydin O'Dwyer, Pinar – European Journal of Educational Research, 2021
This study aimed to develop a valid and reliable instrument that measures the functional vision of students with low vision. Thus, an assessment tool and performance activities were developed for three vision skill groups (near vision skills, distance vision skills, and visual field) that include functional vision skills. The universe was 1485…
Descriptors: Foreign Countries, Vision Tests, Diagnostic Tests, Vision
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Grantee Submission, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Uzun, N. Bilge; Alici, Devrim; Aktas, Mehtap – European Journal of Educational Research, 2019
The purpose of study is to examine the reliability of analytical rubrics and checklists developed for the assessment of story writing skills by means of generalizability theory. The study group consisted of 52 students attending the 5th grade at primary school and 20 raters in Mersin University. The G study was carried out with the fully crossed…
Descriptors: Foreign Countries, Scoring Rubrics, Check Lists, Writing Tests
Roohr, Katrina Crotts; Burkander, Kri; Mao, Liyang – ETS Research Report Series, 2018
Oral communication has been identified as an important skill by higher education institutions and by the workforce community. Despite its importance, minimal research has been conducted around the development of tasks to measure oral communication skills and behaviors. The purpose of this preliminary study is to evaluate the different factors…
Descriptors: Speech Communication, Video Technology, Test Construction, Scoring
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Language, Speech, and Hearing Services in Schools, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Maxwell, Bruce; Boon, Helen; Tanchuk, Nicolas; Rauwerda, Bryan – Journal of Moral Education, 2021
This article documents the adaptation, piloting and validation of a measure of teachers' ethical sensitivity. To create the test, we modified a measure from dentistry drawing on literature in teacher professional ethics and drew on the expertise of professional ethics scholars and practitioners. Based on the results of Rasch analysis combined with…
Descriptors: Ethics, Moral Values, Scores, Teacher Education Programs
Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…
Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment
Moser, Gary P.; Sudweeks, Richard R.; Morrison, Timothy G.; Wilcox, Brad – Reading Psychology, 2014
This study examined ratings of fourth graders' oral reading expression. Randomly assigned participants (n = 36) practiced repeated readings using narrative or informational passages for 7 weeks. After this period raters used the "Multidimensional Fluency Scale" (MFS) on two separate occasions to rate students' expressive…
Descriptors: Elementary School Students, Oral Reading, Reading Skills, Suprasegmentals
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage