ERIC - Search Results

Showing 1 to 15 of 48 results

Exploring Difficult-to-Score Essays with a Hyperbolic Cosine Accuracy Model and Coh-Metrix Indices

Peer reviewed

Wang, Jue; Engelhard, George; Combs, Trenton – Journal of Experimental Education, 2023

Unfolding models are frequently used to develop scales for measuring attitudes. Recently, unfolding models have been applied to examine rater severity and accuracy within the context of rater-mediated assessments. One of the problems in applying unfolding models to rater-mediated assessments is that the substantive interpretations of the latent…

Descriptors: Writing Evaluation, Scoring, Accuracy, Computational Linguistics

Examining the Effect of Item Difficulty and Rater Leniency on Iranian Test Takers' Performance on WDCT and DSAT: A Comparative Study

Peer reviewed
PDF on ERIC

Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025

The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…

Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction

Detecting Rater Centrality Effects in Performance Assessments: A Model-Based Comparison of Centrality Indices

Peer reviewed

Jin, Kuan-Yu; Eckes, Thomas – Measurement: Interdisciplinary Research and Perspectives, 2022

Recent research on rater effects in performance assessments has increasingly focused on rater centrality, the tendency to assign scores clustering around the rating scale's middle categories. In the present paper, we adopted Jin and Wang's (2018) extended facets modeling approach and constructed a centrality continuum, ranging from raters…

Descriptors: Performance Based Assessment, Evaluators, Scoring, Sample Size

Scoring Difficulty in Summary Writing Assessment: Toward the Reconstruction of Analytic Rubric

Peer reviewed
PDF on ERIC

Makiko Kato – Journal of Education and Learning, 2025

This study aims to examine whether differences exist in the factors influencing the difficulty of scoring English summaries and determining scores based on the raters' attributes, and to collect candid opinions, considerations, and tentative suggestions for future improvements to the analytic rubric of summary writing for English learners. In this…

Descriptors: Writing Evaluation, Scoring, Writing Skills, English (Second Language)

Content and Item Response Theory Analysis of ChatGPT-4-Generated Multiple-Choice Items

Peer reviewed

Roger Young; Emily Courtney; Alexander Kah; Mariah Wilkerson; Yi-Hsin Chen – Teaching of Psychology, 2025

Background: Multiple-choice item (MCI) assessments are burdensome for instructors to develop. Artificial intelligence (AI, e.g., ChatGPT) can streamline the process without sacrificing quality. The quality of AI-generated MCIs is comparable to that of items written by human experts. However, whether the quality of AI-generated MCIs is equally good across various domain-…

Descriptors: Item Response Theory, Multiple Choice Tests, Psychology, Textbooks

Operationalizing the Reading-into-Writing Construct in Analytic Rating Scales: Effects of Different Approaches on Rating

Peer reviewed

Lestari, Santi B.; Brunfaut, Tineke – Language Testing, 2023

Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating…

Descriptors: Reading Writing Relationship, Writing Tests, Rating Scales, Writing Processes

Application of an Automated Essay Scoring Engine to English Writing Assessment Using Many-Facet Rasch Measurement

Peer reviewed

Chan, Kinnie Kin Yee; Bond, Trevor; Yan, Zi – Language Testing, 2023

We investigated the relationship between the scores assigned by an Automated Essay Scoring (AES) system, the Intelligent Essay Assessor (IEA), and grades allocated by trained, professional human raters to English essay writing by instigating two procedures novel to written-language assessment: the logistic transformation of AES raw scores into…

Descriptors: Computer Assisted Testing, Essays, Scoring, Scores

Measurement Properties of a Standardized Elicited Imitation Test: An Integrative Data Analysis

Peer reviewed

Isbell, Daniel R.; Son, Young-A – Studies in Second Language Acquisition, 2022

Elicited Imitation Tests (EITs) are commonly used in second language acquisition (SLA)/bilingualism research contexts to assess the general oral proficiency of study participants. While previous studies have provided valuable EIT construct-related validity evidence, some key gaps remain. This study uses an integrative data analysis to further…

Descriptors: Bilingualism, Imitation, Language Tests, Second Language Learning

Examining the Precision of Cut Scores within a Generalizability Theory Framework: A Closer Look at the Item Effect

Peer reviewed

Clauser, Brian E.; Kane, Michael; Clauser, Jerome C. – Journal of Educational Measurement, 2020

An Angoff standard setting study generally yields judgments on a number of items by a number of judges (who may or may not be nested in panels). Variability associated with judges (and possibly panels) contributes error to the resulting cut score. The variability associated with items plays a more complicated role. To the extent that the mean item…

Descriptors: Cutting Scores, Generalization, Decision Making, Standard Setting

The PhD System under Pressure: An Examiner's Viewpoint

Peer reviewed

Alexander, David E.; Davis, Ian R. – Quality Assurance in Education: An International Perspective, 2019

Purpose: The purpose of this paper is to review the issues and challenges associated with examining PhD theses in the modern, rapidly changing academic world. The PhD degree has been described as the "pinnacle of academic qualifications", but it is under threat in terms of the quality of supervision and the outcome of examinations. By…

Descriptors: Doctoral Dissertations, Doctoral Degrees, Educational Quality, Difficulty Level

Construct Exploration of Teacher Readiness as an Assessor of Vocational High School Competency Test

Peer reviewed
PDF on ERIC

Cahyono, Sulistio Mukti; Kartowagiran, Badrun; Mahmudah, Fitri Nur – European Journal of Educational Research, 2021

Teachers who can adapt and be ready for all changes will also be able to provide a balance to increase the competence of vocational high school students. This is also not denied when teachers become assessors in student competency tests. The objectives of this study were to produce an instrument for the readiness of teachers as assessors; to…

Descriptors: Readiness, Vocational Education Teachers, Vocational High Schools, High School Students

An Empirical Study for the Statistical Adjustment of Rater Bias

Peer reviewed
PDF on ERIC

Ilhan, Mustafa – International Journal of Assessment Tools in Education, 2019

This study investigated the effectiveness of statistical adjustments applied to rater bias in many-facet Rasch analysis. A dataset that contained no "rater × examinee" bias was first modified to introduce "rater × examinee" bias. Later, bias adjustment was applied to rater bias included in the data file,…

Descriptors: Statistical Analysis, Item Response Theory, Evaluators, Bias

Developing a Comprehensive Decoding Instruction Observation Protocol for Special Education Teachers

Peer reviewed

Moylan, Laura A.; Johnson, Evelyn S.; Zheng, Yuzhu – Reading & Writing Quarterly, 2022

This study describes the development of a special education teacher observation protocol detailing the elements of effective decoding instruction. The psychometric properties of the protocol were investigated through many-facet Rasch measurement (MFRM). Video observations of classroom decoding instruction from 20 special education teachers across…

Descriptors: Decoding (Reading), Special Education Teachers, Psychometrics, Video Technology

Low Inter-Rater Reliability of a High Stakes Performance Assessment of Teacher Candidates

Peer reviewed
PDF on ERIC

Lyness, Scott A.; Peterson, Kent; Yates, Kenneth – Education Sciences, 2021

The Performance Assessment for California Teachers (PACT) is a high stakes summative assessment that was designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen's weighted kappa, the overall IRR estimate was 0.17 (poor strength of…

Descriptors: High Stakes Tests, Performance Based Assessment, Teacher Effectiveness, Academic Language

Linking the International English Language Competency Assessment Suite of Examinations to the Common European Framework of Reference

Peer reviewed

Hidri, Sahbi – Language Testing in Asia, 2021

The study investigated the alignment process of the International English Language Competency Assessment (IELCA) suite examinations' four levels, B1, B2, C1 and C2, onto the Common European Framework of Reference (CEFR) by explaining and discussing the five linking stages (Council of Europe (CoE), 2009). Unlike previous studies, this study used the…

Descriptors: Literacy, Second Language Learning, Second Language Instruction, English (Second Language)
