ERIC - Search Results

Showing all 4 results

Evaluating Quadratic Weighted Kappa as the Standard Performance Metric for Automated Essay Scoring

Peer reviewed

Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023

Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In the existing research work on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…

Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy

Variations in Rating Scale Functioning in Assessing Speech Act Production in L2 Chinese

Peer reviewed

Li, Shuai; Taguchi, Naoko; Xiao, Feng – Language Assessment Quarterly, 2019

Adopting Linacre's guidelines for evaluating rating scale effectiveness, we examined whether and how a six-point rating scale functioned differently across raters, speech acts, and second language (L2) proficiency levels. We developed a 12-item Computerized Oral Discourse Completion Task (CODCT) for assessing the production of requests, refusals,…

Descriptors: Speech Acts, Rating Scales, Guidelines, Evaluators

Developing Analytic Rating Guides for "TOEFL iBT"® Integrated Speaking Tasks. "TOEFL iBT"® Research Report, TOEFL iBT-20. ETS Research Report. RR-13-13

Peer reviewed

Jamieson, Joan; Poonpon, Kornwipa – ETS Research Report Series, 2013

Research and development of a new type of scoring rubric for the integrated speaking tasks of "TOEFL iBT"® are described. These "analytic rating guides" could be helpful if tasks modeled after those in TOEFL iBT were used for formative assessment, a purpose which is different from TOEFL iBT's primary use for admission…

Descriptors: Oral Language, Language Proficiency, Scaling, Scores

Inventory of Work-Relevant Values: 2001 Revision. ACT Research Report Series, 2004-03

Bobek, Becky L.; Gore, Paul A. – American College Testing (ACT), Inc., 2004

This research report describes changes made to the Inventory of Work-Relevant Values when it was revised for online use as a part of the Internet version of DISCOVER. Users will see the following differences between the online and CD-ROM versions of the inventory: 22 items rather than 61, simplified presentation, and the contribution of all items…

Descriptors: Interrater Reliability, Field Tests, Internet, Test Construction