Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 8
Since 2006 (last 20 years): 18
Descriptor
Scoring: 21
Test Items: 21
Comparative Analysis: 8
Item Response Theory: 8
Computer Assisted Testing: 6
Item Analysis: 6
Models: 6
Psychometrics: 6
Statistical Analysis: 6
College Entrance Examinations: 5
Correlation: 5
Source
ETS Research Report Series: 21
Author
Yamamoto, Kentaro: 3
Zhang, Mo: 3
Ali, Usama S.: 2
Guo, Hongwen: 2
Lee, Chong Min: 2
Wang, Xinhao: 2
Yoon, Su-Youn: 2
Zechner, Klaus: 2
von Davier, Matthias: 2
Attali, Yigal: 1
Breyer, F. Jay: 1
Publication Type
Journal Articles: 21
Reports - Research: 21
Numerical/Quantitative Data: 2
Tests/Questionnaires: 2
Education Level
Higher Education: 5
Postsecondary Education: 4
Secondary Education: 4
Elementary Education: 2
Grade 8: 2
Junior High Schools: 2
Middle Schools: 2
Grade 4: 1
Grade 7: 1
Intermediate Grades: 1
Location
Australia: 1
China: 1
France: 1
Germany: 1
Japan: 1
Netherlands: 1
South Korea: 1
Assessments and Surveys
Graduate Record Examinations: 3
Program for International…: 2
SAT (College Admission Test): 2
ACT Assessment: 1
National Assessment of…: 1
Carol Eckerly; Yue Jia; Paul Jewsbury – ETS Research Report Series, 2022
Testing programs have explored the use of technology-enhanced items alongside traditional item types (e.g., multiple-choice and constructed-response items) as measurement evidence of latent constructs modeled with item response theory (IRT). In this report, we discuss considerations in applying IRT models to a particular type of adaptive testlet…
Descriptors: Computer Assisted Testing, Test Items, Item Response Theory, Scoring
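The abstract above refers to scoring items with item response theory (IRT). As background only (the report's own models and parameterization are not given here), a minimal sketch of the two-parameter logistic (2PL) model commonly used for dichotomous items:

```python
import math

def irt_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL IRT model.

    theta: examinee ability
    a:     item discrimination
    b:     item difficulty
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# An examinee whose ability equals the item difficulty answers
# correctly with probability 0.5, regardless of discrimination.
p = irt_2pl(theta=0.0, a=1.2, b=0.0)
```

The discrimination parameter `a` controls how sharply the response probability rises around the difficulty point `b`.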
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Carlson, James E. – ETS Research Report Series, 2017
In this paper, I consider a set of test items that are located in a multidimensional space, S_M, but lie along a curved line in S_M and can be scaled unidimensionally. Furthermore, I demonstrate a case in which the test items are administered across 6 levels, such as occurs in K-12 assessment across 6 grade…
Descriptors: Test Items, Item Response Theory, Difficulty Level, Scoring
van Rijn, Peter W.; Ali, Usama S. – ETS Research Report Series, 2018
A computer program was developed to estimate speed-accuracy response models for dichotomous items. This report describes how the models are estimated and how to specify data and input files. An example using data from a listening section of an international language test is described to illustrate the modeling approach and features of the computer…
Descriptors: Computer Software, Computation, Reaction Time, Timed Tests
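The report's software and model specification are not reproduced here. As orientation, a common building block in speed-accuracy response modeling is a lognormal model for response times (van der Linden's parameterization); a minimal sketch of its log-density, with illustrative parameter names:

```python
import math

def lognormal_rt_logpdf(t: float, tau: float, alpha: float, beta: float) -> float:
    """Log-density of response time t under a lognormal response-time model.

    tau:   examinee speed (higher -> faster responses)
    alpha: item time-discrimination
    beta:  item time-intensity
    """
    z = alpha * (math.log(t) - (beta - tau))
    return math.log(alpha) - math.log(t) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z
```

In a joint speed-accuracy model this density for times is combined with an IRT model for the dichotomous responses, linked through correlated person parameters.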
Loukina, Anastassia; Zechner, Klaus; Yoon, Su-Youn; Zhang, Mo; Tao, Jidong; Wang, Xinhao; Lee, Chong Min; Mulholland, Matthew – ETS Research Report Series, 2017
This report presents an overview of the "SpeechRater"™ automated scoring engine model building and evaluation process for several item types with a focus on a low-English-proficiency test-taker population. We discuss each stage of speech scoring, including automatic speech recognition, filtering models for nonscorable responses, and…
Descriptors: Automation, Scoring, Speech Tests, Test Items
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick; Schmitt, Neal – ETS Research Report Series, 2016
In this report, systematic applications of statistical and psychometric methods are used to develop and evaluate scoring rules in terms of test reliability. Data collected from a situational judgment test are used to facilitate the comparison. For a well-developed item with appropriate keys (i.e., the correct answers), agreement among various…
Descriptors: Scoring, Test Reliability, Statistical Analysis, Psychometrics
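The abstract above evaluates scoring rules in terms of test reliability. A standard internal-consistency estimate used for such comparisons is coefficient alpha; a plain-Python sketch (the toy score matrix is invented for illustration):

```python
def cronbach_alpha(scores):
    """Coefficient alpha for a score matrix (rows = examinees, columns = items),
    using population (n-denominator) variances."""
    n_items = len(scores[0])

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[j] for row in scores]) for j in range(n_items)]
    total_var = var([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Perfectly parallel items yield alpha == 1.0.
alpha = cronbach_alpha([[0, 0], [1, 1], [2, 2]])
```

Comparing alpha across candidate scoring keys is one simple way to operationalize the reliability comparison the abstract describes.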
Yamamoto, Kentaro; He, Qiwei; Shin, Hyo Jeong; von Davier, Matthias – ETS Research Report Series, 2017

Approximately a third of the Programme for International Student Assessment (PISA) items in the core domains (math, reading, and science) are constructed-response items and require human coding (scoring). This process is time-consuming, expensive, and prone to error as often (a) humans code inconsistently, and (b) coding reliability in…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…
Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment
Zechner, Klaus; Chen, Lei; Davis, Larry; Evanini, Keelan; Lee, Chong Min; Leong, Chee Wee; Wang, Xinhao; Yoon, Su-Youn – ETS Research Report Series, 2015
This research report presents a summary of research and development efforts devoted to creating scoring models for automatically scoring spoken item responses of a pilot administration of the Test of English-for-Teaching ("TEFT"™) within the "ELTeach"™ framework. The test consists of items for all four language modalities:…
Descriptors: Scoring, Scoring Formulas, Speech Communication, Task Analysis
Chen, Haiwen H.; von Davier, Matthias; Yamamoto, Kentaro; Kong, Nan – ETS Research Report Series, 2015
One major issue with large-scale assessments is that the respondents might give no responses to many items, resulting in less accurate estimations of both assessed abilities and item parameters. This report studies how the types of items affect the item-level nonresponse rates and how different methods of treating item-level nonresponses have an…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
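The abstract above compares methods of treating item-level nonresponse. A toy sketch of two common treatments (scoring omits as wrong versus ignoring them), on an invented response matrix, shows how the choice moves the resulting estimates:

```python
def proportion_correct(responses, omit_as_wrong):
    """Mean per-examinee proportion correct.

    responses: rows of 1 (correct), 0 (wrong), or None (omitted).
    omit_as_wrong: if True, omitted items count as wrong; otherwise
    they are dropped from the examinee's denominator.
    """
    totals = []
    for row in responses:
        if omit_as_wrong:
            totals.append(sum(r or 0 for r in row) / len(row))
        else:
            answered = [r for r in row if r is not None]
            totals.append(sum(answered) / len(answered) if answered else 0.0)
    return sum(totals) / len(totals)

data = [[1, None, 1], [0, 1, None]]
strict = proportion_correct(data, omit_as_wrong=True)
lenient = proportion_correct(data, omit_as_wrong=False)
```

Scoring omits as wrong necessarily gives estimates no higher than ignoring them, which is why large-scale assessments study the treatment's effect on ability and item-parameter estimates.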
Ali, Usama S.; Chang, Hua-Hua – ETS Research Report Series, 2014
Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may also offer similar advantages, and verification of such a hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…
Descriptors: Adaptive Testing, Simulation, Pretests Posttests, Test Items
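Adaptive testing gains its efficiency by selecting, at each step, the item most informative at the current ability estimate. A minimal sketch of maximum-information selection under the 2PL model (the item pool is invented; the report's suitability index is a different, pretest-specific criterion not reproduced here):

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_item(theta, pool):
    """Return the index of the most informative item in the pool."""
    return max(range(len(pool)), key=lambda i: item_information(theta, *pool[i]))

# Hypothetical pool of (discrimination, difficulty) pairs.
pool = [(1.0, -2.0), (1.0, 0.0), (1.0, 2.0)]
best = select_item(0.0, pool)  # the item whose difficulty matches theta
```

With equal discriminations, information peaks where difficulty equals ability, which is what makes adaptive administration converge on well-targeted items.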
Fife, James H. – ETS Research Report Series, 2013
The m-rater scoring engine has been used successfully for the past several years to score "CBAL"™ mathematics tasks, for the most part without the need for human scoring. During this time, various improvements to m-rater and its scoring keys have been implemented in response to specific CBAL needs. In 2012, with the general move toward…
Descriptors: Mathematics, Scoring, Educational Assessment, Academic Standards
Fu, Jianbin; Wise, Maxwell – ETS Research Report Series, 2012
In the Cognitively Based Assessment of, for, and as Learning ("CBAL"™) research initiative, innovative K-12 prototype tests based on cognitive competency models are developed. This report presents the statistical results of the 2 CBAL Grade 8 writing tests and 2 Grade 7 reading tests administered to students in 20 states in spring 2011.…
Descriptors: Cognitive Ability, Grade 8, Writing Tests, Grade 7
Davey, Tim; Lee, Yi-Hsuan – ETS Research Report Series, 2011
Both theoretical and practical considerations have led the revised Graduate Record Examinations® (GRE®) General Test, here called the rGRE, to adopt a multistage adaptive design that will be continuously or nearly continuously administered and that can provide immediate score reporting. These circumstances sharply constrain the…
Descriptors: Context Effect, Scoring, Equated Scores, College Entrance Examinations
Sukkarieh, Jane Z.; von Davier, Matthias; Yamamoto, Kentaro – ETS Research Report Series, 2012
This document describes a solution to a problem in the automatic content scoring of the multilingual character-by-character highlighting item type. The solution is language independent and represents a significant enhancement: it not only facilitates automatic scoring but also plays an important role in clustering students' responses;…
Descriptors: Scoring, Multilingualism, Test Items, Role