ERIC - Search Results

Showing 1 to 15 of 20 results

Raters' Scoring Process in Assessment of Interpreting: An Empirical Study Based on Eye Tracking and Retrospective Verbalisation

Peer reviewed

Chao Han; Binghan Zheng; Mingqing Xie; Shirong Chen – Interpreter and Translator Trainer, 2024

Human raters' assessment of interpreting is a complex process. Previous researchers have mainly relied on verbal reports to examine this process. To advance our understanding, we conducted an empirical study, collecting raters' eye-movement and retrospection data in a computerised interpreting assessment in which three groups of raters (n = 35)…

Descriptors: Foreign Countries, College Students, College Graduates, Interrater Reliability

The Value of Expanding Perspectives on Assessment

Peer reviewed

Janice Kinghorn; Katherine McGuire; Bethany L. Miller; Aaron Zimmerman – Assessment Update, 2024

In this article, the authors share their reflections on how different experiences and paradigms have broadened their understanding of the work of assessment in higher education. As they collaborated to create a panel for the 2024 International Conference on Assessing Quality in Higher Education, they recognized that they, as assessment…

Descriptors: Higher Education, Assessment Literacy, Evaluation Criteria, Evaluation Methods

Fairness in Oral Language Assessment: Training Raters and Considering Examinees' Expectations

Peer reviewed

Doosti, Mehdi; Ahmadi Safa, Mohammad – International Journal of Language Testing, 2021

This study examined the effect of rater training on promoting inter-rater reliability in oral language assessment. It also investigated whether rater training and the consideration of the examinees' expectations by the examiners have any effect on test-takers' perceptions of being fairly evaluated. To this end, four raters scored 31 Iranian…

Descriptors: Oral Language, Language Tests, Interrater Reliability, Training

Appraising the Scoring Performance of Automated Essay Scoring Systems--Some Additional Considerations: Which Essays? Which Human Raters? Which Scores?

Peer reviewed

Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018

The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…

Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators

Evaluating Students' Performance in Responding to Art: The Development and Validation of an Art Criticism Assessment Rubric

Peer reviewed

Tam, Cheung On – International Journal of Art & Design Education, 2018

This article reports on the development and validation of a rubric for assessing students' written responses to artworks. Since the implementation of the Hong Kong New Senior Secondary Curriculum in 2009, art educators have seen responding to artworks as increasingly important. In this context, the Art Criticism Assessment Rubric (ACAR) was…

Descriptors: Foreign Countries, Art Education, Art Appreciation, Student Evaluation

Autism at a Glance: A Pilot Study Optimizing Thin-Slice Observations

Peer reviewed

Hampton, Lauren H.; Curtis, Philip R.; Roberts, Megan Y. – Autism: The International Journal of Research and Practice, 2019

Borrowing from a clinical psychology observational methodology, thin-slice observations were used to assess autism characteristics in toddlers. Thin-slices are short observations taken from a longer behavior stream which are assigned ratings by multiple raters using a 5-point scale. The raters' observations are averaged together to assign a…

Descriptors: Autism, Pervasive Developmental Disorders, Observation, Toddlers

Assessing Individual and Group Oral Exams: Scoring Criteria and Rater Interaction

Peer reviewed

Yalçin-Çolakoglu, Özlem; Selçuk, Merve – Advances in Language and Literary Studies, 2019

Criterion referenced tests of second language speaking performance are administered in different institutions using different procedures. The present study reports raters' practices of second language speaking tests, in particular the correspondence between test-takers' grades when assessed individually and in groups. Data derived from…

Descriptors: Oral Language, Language Tests, Test Validity, Inferences

Native and Non-Native Raters of L2 Speaking Performance: Accent Familiarity and Cognitive Processes

Bogorevich, Valeriia – ProQuest LLC, 2018

Rater variation in performance assessment can impact test-takers' scores and compromise assessments' fairness and validity (Crooks, Kane, & Cohen, 1996); therefore, it is important to investigate raters' scoring patterns in order to inform rater training. Substantial work has…

Descriptors: Pronunciation, Familiarity, English (Second Language), Second Language Learning

Marking as Judgment

Peer reviewed

Brooks, Val – Research Papers in Education, 2012

An aspect of assessment which has received little attention compared with perennial concerns, such as standards or reliability, is the role of judgment in marking. This paper explores marking as an act of judgment, paying particular attention to the nature of judgment and the processes involved. It brings together studies which have explored…

Descriptors: Educational Assessment, Test Reliability, Test Validity, Value Judgment

Face Validity Revisited.

Peer reviewed

Nevo, Baruch – Journal of Educational Measurement, 1985

A literature review and a proposed means of measuring face validity, a test's appearance of being valid, are presented. Empirical evidence from examinees' perceptions of a college entrance examination supports the reliability of measuring face validity. (GDC)

Descriptors: College Entrance Examinations, Evaluation Methods, Evaluators, Foreign Countries

A Comparison of Parent and Teacher Ratings of Children's Behaviors

Peer reviewed

Firmin, Michael W.; Proemmel, Elizabeth; Hwang, Chi-en – Educational Research Quarterly, 2005

Previous studies have compared the accuracy of parent, teacher, and clinician ratings of children's behavior, especially in diagnostic analysis. However, many have questioned the validity of the tests and the value of each rater. While some research has found differences among raters, few studies have looked at samples of non-referred children. We wanted to…

Descriptors: Parent Attitudes, Teacher Attitudes, Comparative Analysis, Child Behavior

The Behavioral Assessment of Parents and Coaches at Youth Sports: Validity and Reliability

Peer reviewed

Apache, R. R. – Physical Educator, 2006

A behavioral assessment system for scoring the behaviors of parents and coaches at youth sports games is described within this paper. The Youth Sports Behavior Assessment System (YSBAS) contains nine behavioral categories describing behaviors commonly seen during youth sports. The developmental process of YSBAS and the observer-training program…

Descriptors: Evaluators, Training, Scoring, Parent Education

Toward an Understanding of the Role of Speech Recognition in Nonnative Speech Assessment. TOEFL iBT Research Report. TOEFL iBT-02. ETS RR-07-02

Peer reviewed

Zechner, Klaus; Bejar, Isaac I.; Hemat, Ramin – ETS Research Report Series, 2007

The increasing availability and performance of computer-based testing has prompted more research on the automatic assessment of language and speaking proficiency. In this investigation, we evaluated the feasibility of using an off-the-shelf speech-recognition system for scoring speaking prompts from the LanguEdge field test of 2002. We first…

Descriptors: Role, Computer Assisted Testing, Language Proficiency, Oral Language

Context Bias in the Test of English as a Foreign Language.

Angoff, William H. – 1989

This study was undertaken to test the hypothesis that items of the Test of English as a Foreign Language (TOEFL) containing reference to American people, places, customs, etc., tend to favor examinees who have spent some time living in the United States. Two samples of examinees were drawn from the March 1987 TOEFL administration, one tested in…

Descriptors: Context Effect, English (Second Language), Evaluators, Foreign Nationals

Developing and Validating Sets of Algebra Word Problems.

Nasser, Ramzi; Carifio, James – 1993

The validation of key contextual features of algebra word problems was studied in two phases. In the first phase, five experts were asked to assess the appropriateness of the concepts in the problems and the adequacy of the assignment of the contextual features to the problems. In the second phase, construct validity was established by having 6…

Descriptors: Algebra, Analysis of Variance, Construct Validity, Context Effect
