Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 8
Since 2006 (last 20 years): 18
Descriptor
Scoring: 21
Test Items: 21
Comparative Analysis: 8
Item Response Theory: 8
Computer Assisted Testing: 6
Item Analysis: 6
Models: 6
Psychometrics: 6
Statistical Analysis: 6
College Entrance Examinations: 5
Correlation: 5
Source
ETS Research Report Series: 21
Author
Yamamoto, Kentaro: 3
Zhang, Mo: 3
Ali, Usama S.: 2
Guo, Hongwen: 2
Lee, Chong Min: 2
Wang, Xinhao: 2
Yoon, Su-Youn: 2
Zechner, Klaus: 2
von Davier, Matthias: 2
Attali, Yigal: 1
Breyer, F. Jay: 1
Publication Type
Journal Articles: 21
Reports - Research: 21
Numerical/Quantitative Data: 2
Tests/Questionnaires: 2
Education Level
Higher Education: 5
Postsecondary Education: 4
Secondary Education: 4
Elementary Education: 2
Grade 8: 2
Junior High Schools: 2
Middle Schools: 2
Grade 4: 1
Grade 7: 1
Intermediate Grades: 1
Location
Australia: 1
China: 1
France: 1
Germany: 1
Japan: 1
Netherlands: 1
South Korea: 1
Assessments and Surveys
Graduate Record Examinations: 3
Program for International…: 2
SAT (College Admission Test): 2
ACT Assessment: 1
National Assessment of…: 1
Carol Eckerly; Yue Jia; Paul Jewsbury – ETS Research Report Series, 2022
Testing programs have explored the use of technology-enhanced items alongside traditional item types (e.g., multiple-choice and constructed-response items) as measurement evidence of latent constructs modeled with item response theory (IRT). In this report, we discuss considerations in applying IRT models to a particular type of adaptive testlet…
Descriptors: Computer Assisted Testing, Test Items, Item Response Theory, Scoring
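The abstract above refers to scoring items with item response theory (IRT). As background only (the report's own models and parameterization are not given here), a minimal sketch of the two-parameter logistic (2PL) model commonly used for dichotomous items:

```python
import math

def irt_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL IRT model.

    theta: examinee ability
    a:     item discrimination
    b:     item difficulty
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# An examinee whose ability equals the item difficulty answers
# correctly with probability 0.5, regardless of discrimination.
p = irt_2pl(theta=0.0, a=1.2, b=0.0)
```

The discrimination parameter `a` controls how sharply the response probability rises around the difficulty point `b`.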
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Carlson, James E. – ETS Research Report Series, 2017
In this paper, I consider a set of test items that are located in a multidimensional space, S_M, but lie along a curved line in S_M and can be scaled unidimensionally. Furthermore, I demonstrate a case in which the test items are administered across 6 levels, such as occurs in K-12 assessment across 6 grade…
Descriptors: Test Items, Item Response Theory, Difficulty Level, Scoring
van Rijn, Peter W.; Ali, Usama S. – ETS Research Report Series, 2018
A computer program was developed to estimate speed-accuracy response models for dichotomous items. This report describes how the models are estimated and how to specify data and input files. An example using data from a listening section of an international language test is described to illustrate the modeling approach and features of the computer…
Descriptors: Computer Software, Computation, Reaction Time, Timed Tests
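The report's software and model specification are not reproduced here. As orientation, a common building block in speed-accuracy response modeling is a lognormal model for response times (van der Linden's parameterization); a minimal sketch of its log-density, with illustrative parameter names:

```python
import math

def lognormal_rt_logpdf(t: float, tau: float, alpha: float, beta: float) -> float:
    """Log-density of response time t under a lognormal response-time model.

    tau:   examinee speed (higher -> faster responses)
    alpha: item time-discrimination
    beta:  item time-intensity
    """
    z = alpha * (math.log(t) - (beta - tau))
    return math.log(alpha) - math.log(t) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z
```

In a joint speed-accuracy model this density for times is combined with an IRT model for the dichotomous responses, linked through correlated person parameters.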
Loukina, Anastassia; Zechner, Klaus; Yoon, Su-Youn; Zhang, Mo; Tao, Jidong; Wang, Xinhao; Lee, Chong Min; Mulholland, Matthew – ETS Research Report Series, 2017
This report presents an overview of the "SpeechRater"™ automated scoring engine model building and evaluation process for several item types with a focus on a low-English-proficiency test-taker population. We discuss each stage of speech scoring, including automatic speech recognition, filtering models for nonscorable responses, and…
Descriptors: Automation, Scoring, Speech Tests, Test Items
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick; Schmitt, Neal – ETS Research Report Series, 2016
In this report, systematic applications of statistical and psychometric methods are used to develop and evaluate scoring rules in terms of test reliability. Data collected from a situational judgment test are used to facilitate the comparison. For a well-developed item with appropriate keys (i.e., the correct answers), agreement among various…
Descriptors: Scoring, Test Reliability, Statistical Analysis, Psychometrics
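The abstract above evaluates scoring rules in terms of test reliability. A standard internal-consistency estimate used for such comparisons is coefficient alpha; a plain-Python sketch (the toy score matrix is invented for illustration):

```python
def cronbach_alpha(scores):
    """Coefficient alpha for a score matrix (rows = examinees, columns = items),
    using population (n-denominator) variances."""
    n_items = len(scores[0])

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[j] for row in scores]) for j in range(n_items)]
    total_var = var([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Perfectly parallel items yield alpha == 1.0.
alpha = cronbach_alpha([[0, 0], [1, 1], [2, 2]])
```

Comparing alpha across candidate scoring keys is one simple way to operationalize the reliability comparison the abstract describes.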
Yamamoto, Kentaro; He, Qiwei; Shin, Hyo Jeong; von Davier, Matthias – ETS Research Report Series, 2017

Approximately a third of the Programme for International Student Assessment (PISA) items in the core domains (math, reading, and science) are constructed-response items and require human coding (scoring). This process is time-consuming, expensive, and prone to error as often (a) humans code inconsistently, and (b) coding reliability in…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…
Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment
Zechner, Klaus; Chen, Lei; Davis, Larry; Evanini, Keelan; Lee, Chong Min; Leong, Chee Wee; Wang, Xinhao; Yoon, Su-Youn – ETS Research Report Series, 2015
This research report presents a summary of research and development efforts devoted to creating scoring models for automatically scoring spoken item responses of a pilot administration of the Test of English-for-Teaching ("TEFT"™) within the "ELTeach"™ framework. The test consists of items for all four language modalities:…
Descriptors: Scoring, Scoring Formulas, Speech Communication, Task Analysis
Chen, Haiwen H.; von Davier, Matthias; Yamamoto, Kentaro; Kong, Nan – ETS Research Report Series, 2015
One major issue with large-scale assessments is that the respondents might give no responses to many items, resulting in less accurate estimations of both assessed abilities and item parameters. This report studies how the types of items affect the item-level nonresponse rates and how different methods of treating item-level nonresponses have an…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
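The abstract above compares methods of treating item-level nonresponse. A toy sketch of two common treatments (scoring omits as wrong versus ignoring them), on an invented response matrix, shows how the choice moves the resulting estimates:

```python
def proportion_correct(responses, omit_as_wrong):
    """Mean per-examinee proportion correct.

    responses: rows of 1 (correct), 0 (wrong), or None (omitted).
    omit_as_wrong: if True, omitted items count as wrong; otherwise
    they are dropped from the examinee's denominator.
    """
    totals = []
    for row in responses:
        if omit_as_wrong:
            totals.append(sum(r or 0 for r in row) / len(row))
        else:
            answered = [r for r in row if r is not None]
            totals.append(sum(answered) / len(answered) if answered else 0.0)
    return sum(totals) / len(totals)

data = [[1, None, 1], [0, 1, None]]
strict = proportion_correct(data, omit_as_wrong=True)
lenient = proportion_correct(data, omit_as_wrong=False)
```

Scoring omits as wrong necessarily gives estimates no higher than ignoring them, which is why large-scale assessments study the treatment's effect on ability and item-parameter estimates.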
Ali, Usama S.; Chang, Hua-Hua – ETS Research Report Series, 2014
Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may also offer similar advantages, and verification of such a hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…
Descriptors: Adaptive Testing, Simulation, Pretests Posttests, Test Items
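Adaptive testing gains its efficiency by selecting, at each step, the item most informative at the current ability estimate. A minimal sketch of maximum-information selection under the 2PL model (the item pool is invented; the report's suitability index is a different, pretest-specific criterion not reproduced here):

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_item(theta, pool):
    """Return the index of the most informative item in the pool."""
    return max(range(len(pool)), key=lambda i: item_information(theta, *pool[i]))

# Hypothetical pool of (discrimination, difficulty) pairs.
pool = [(1.0, -2.0), (1.0, 0.0), (1.0, 2.0)]
best = select_item(0.0, pool)  # the item whose difficulty matches theta
```

With equal discriminations, information peaks where difficulty equals ability, which is what makes adaptive administration converge on well-targeted items.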
Fife, James H. – ETS Research Report Series, 2013
The m-rater scoring engine has been used successfully for the past several years to score "CBAL"™ mathematics tasks, for the most part without the need for human scoring. During this time, various improvements to m-rater and its scoring keys have been implemented in response to specific CBAL needs. In 2012, with the general move toward…
Descriptors: Mathematics, Scoring, Educational Assessment, Academic Standards
Fu, Jianbin; Wise, Maxwell – ETS Research Report Series, 2012
In the Cognitively Based Assessment of, for, and as Learning ("CBAL"™) research initiative, innovative K-12 prototype tests based on cognitive competency models are developed. This report presents the statistical results of the 2 CBAL Grade 8 writing tests and 2 Grade 7 reading tests administered to students in 20 states in spring 2011.…
Descriptors: Cognitive Ability, Grade 8, Writing Tests, Grade 7
Davey, Tim; Lee, Yi-Hsuan – ETS Research Report Series, 2011
Both theoretical and practical considerations have led the revised Graduate Record Examinations® (GRE®) General Test, here called the rGRE, to adopt a multistage adaptive design that will be continuously or nearly continuously administered and that can provide immediate score reporting. These circumstances sharply constrain the…
Descriptors: Context Effect, Scoring, Equated Scores, College Entrance Examinations
Sukkarieh, Jane Z.; von Davier, Matthias; Yamamoto, Kentaro – ETS Research Report Series, 2012
This document describes a solution to a problem in the automatic content scoring of the multilingual character-by-character highlighting item type. The solution is language independent and represents a significant enhancement: it not only facilitates automatic scoring but also plays an important role in clustering students' responses;…
Descriptors: Scoring, Multilingualism, Test Items, Role