Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 9 |
Since 2006 (last 20 years) | 26 |
Descriptor
Multiple Choice Tests | 31 |
Test Items | 18 |
Item Response Theory | 13 |
Scores | 11 |
Comparative Analysis | 10 |
Equated Scores | 10 |
Statistical Analysis | 10 |
Responses | 8 |
Language Tests | 7 |
Second Language Learning | 7 |
Test Format | 7 |
Source
ETS Research Report Series | 31 |
Author
Kim, Sooyeon | 4 |
Walker, Michael E. | 4 |
Deng, Weiling | 2 |
Guo, Hongwen | 2 |
Haberman, Shelby J. | 2 |
Liu, Jinghua | 2 |
McHale, Frederick | 2 |
Puhan, Gautam | 2 |
Sinharay, Sandip | 2 |
Zu, Jiyun | 2 |
von Davier, Alina A. | 2 |
Publication Type
Journal Articles | 31 |
Reports - Research | 31 |
Tests/Questionnaires | 3 |
Numerical/Quantitative Data | 2 |
Education Level
Secondary Education | 6 |
Higher Education | 5 |
Postsecondary Education | 5 |
Elementary Education | 3 |
Grade 8 | 2 |
High Schools | 2 |
Junior High Schools | 2 |
Middle Schools | 2 |
Elementary Secondary Education | 1 |
Grade 7 | 1 |
Location
Arizona | 2 |
Georgia | 2 |
Indiana | 2 |
Nevada | 2 |
Alabama | 1 |
Arkansas | 1 |
Armenia | 1 |
California | 1 |
Connecticut | 1 |
Idaho | 1 |
Illinois | 1 |
Assessments and Surveys
Test of English as a Foreign Language | 6 |
Praxis Series | 3 |
SAT (College Admission Test) | 2 |
Advanced Placement Examinations | 1 |
National Merit Scholarship Qualifying Test | 1 |
Preliminary Scholastic Aptitude Test | 1 |
Program for International Student Assessment | 1 |
Test of English for International Communication | 1 |
Kim, Sooyeon; Walker, Michael E. – ETS Research Report Series, 2021
Equating the scores from different forms of a test requires collecting data that link the forms. Problems arise when the test forms to be linked are given to groups that are not equivalent and the forms share no common items by which to measure or adjust for this group nonequivalence. We compared three approaches to adjusting for group…
Descriptors: Equated Scores, Weighted Scores, Sampling, Multiple Choice Tests
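The snippet above truncates before naming the three adjustment approaches, so the sketch below illustrates only the shared core of such designs: an equipercentile link between two forms, with optional examinee weights as a hypothetical stand-in for whatever group-nonequivalence adjustment (e.g., poststratification) a study might apply. All data are simulated.

    import numpy as np

    def weighted_quantile(values, q, weights):
        """Quantile(s) of `values` under examinee weights (empirical CDF)."""
        order = np.argsort(values)
        v, w = values[order], weights[order]
        cdf = np.cumsum(w) / w.sum()
        return np.interp(q, cdf, v)

    def equipercentile_link(x_scores, y_scores, x_weights=None, y_weights=None):
        """Map each form-X score point to the form-Y score at the same
        weighted percentile rank. Uniform weights give the unadjusted link;
        non-uniform weights are a placeholder for a nonequivalence adjustment."""
        x_weights = np.ones(len(x_scores)) if x_weights is None else x_weights
        y_weights = np.ones(len(y_scores)) if y_weights is None else y_weights
        wx = x_weights / x_weights.sum()
        points = np.arange(x_scores.max() + 1)
        # Midpoint-convention percentile rank of each X score point.
        pr = np.array([wx[x_scores < s].sum() + wx[x_scores == s].sum() / 2
                       for s in points])
        return points, weighted_quantile(y_scores, pr, y_weights)

    rng = np.random.default_rng(0)
    form_x = rng.binomial(40, 0.55, 2000)  # group taking form X
    form_y = rng.binomial(40, 0.60, 2000)  # nonequivalent group taking form Y
    points, linked = equipercentile_link(form_x, form_y)
    print(dict(zip(points[20:24], np.round(linked[20:24], 2))))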
Choi, Ikkyu; Zu, Jiyun – ETS Research Report Series, 2022
Synthetically generated speech (SGS) has become an integral part of our oral communication in a wide variety of contexts. It can be generated instantly at a low cost and allows precise control over multiple aspects of output, all of which can be highly appealing to second language (L2) assessment developers who have traditionally relied upon human…
Descriptors: Test Wiseness, Multiple Choice Tests, Test Items, Difficulty Level
Wang, Lin – ETS Research Report Series, 2019
Rearranging response options in different versions of a test of multiple-choice items can be an effective strategy against cheating on the test. This study investigated whether rearranging response options would affect item performance and test score comparability. A study test was assembled as the base version from which 3 variant versions were…
Descriptors: Multiple Choice Tests, Test Items, Test Format, Scores
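As a minimal illustration of the comparison such a study implies, the sketch below computes classical item statistics (proportion correct and corrected item-total correlation) for a base version and a variant. The response matrices are random placeholders, not the study's data; the point is the comparison machinery.

    import numpy as np

    def item_stats(responses):
        """Classical item statistics for a 0/1 response matrix
        (rows = examinees, columns = items): proportion correct and
        corrected item-total correlation."""
        p = responses.mean(axis=0)
        total = responses.sum(axis=1)
        r_it = np.empty(responses.shape[1])
        for j in range(responses.shape[1]):
            rest = total - responses[:, j]  # total score excluding item j
            r_it[j] = np.corrcoef(responses[:, j], rest)[0, 1]
        return p, r_it

    rng = np.random.default_rng(1)
    base = rng.integers(0, 2, size=(500, 30))     # base version (placeholder)
    variant = rng.integers(0, 2, size=(500, 30))  # rearranged-option variant
    for name, data in (("base", base), ("variant", variant)):
        p, r = item_stats(data)
        print(f"{name}: mean p = {p.mean():.3f}, mean r_it = {r.mean():.3f}")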
Madyarov, Irshat; Movsisyan, Vahe; Madoyan, Habet; Galikyan, Irena; Gasparyan, Rubina – ETS Research Report Series, 2021
The "TOEFL Junior"® Standard test is a tool for measuring the English language skills of students ages 11+ who learn English as an additional language. It is a paper-based multiple-choice test and measures proficiency in three sections: listening, form and meaning, and reading. To date, empirical evidence provides some support for the…
Descriptors: English (Second Language), Second Language Learning, Language Tests, Standardized Tests
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick – ETS Research Report Series, 2018
For a multiple-choice test under development or redesign, it is important to choose the optimal number of options per item so that the test possesses the desired psychometric properties. On the basis of available data for a multiple-choice assessment with 8 options, we evaluated the effects of changing the number of options on test properties…
Descriptors: Multiple Choice Tests, Test Items, Simulation, Test Construction
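One mechanism behind the number-of-options question is guessing: fewer options make blind guessing more successful, which tends to depress score reliability. The sketch below simulates that mechanism only, under an assumed model where an examinee either knows the answer or guesses with probability 1/m; the actual report worked from real 8-option response data.

    import numpy as np

    def kr20(scored):
        """KR-20 internal-consistency reliability for 0/1 item scores."""
        k = scored.shape[1]
        p = scored.mean(axis=0)
        total_var = scored.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - (p * (1 - p)).sum() / total_var)

    def simulate_scores(n, k, n_options, rng):
        """0/1 scores where an examinee either knows the answer (2PL-style)
        or guesses blindly among n_options alternatives."""
        theta = rng.normal(size=(n, 1))
        b = rng.normal(size=(1, k))
        p_know = 1 / (1 + np.exp(-(theta - b)))
        p_correct = p_know + (1 - p_know) / n_options
        return (rng.random((n, k)) < p_correct).astype(int)

    rng = np.random.default_rng(2)
    for m in (8, 5, 3, 2):
        print(f"{m} options: KR-20 = {kr20(simulate_scores(2000, 40, m, rng)):.3f}")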
Haberman, Shelby J.; Liu, Yang; Lee, Yi-Hsuan – ETS Research Report Series, 2019
Distractor analyses are routinely conducted in educational assessments with multiple-choice items. In this research report, we focus on three item response models for distractors: (a) the traditional nominal response (NR) model, (b) a combination of a two-parameter logistic model for item scores and an NR model for selections of incorrect…
Descriptors: Multiple Choice Tests, Scores, Test Reliability, High Stakes Tests
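The traditional NR model mentioned in (a) gives each response option its own slope and intercept and converts them into choice probabilities via a softmax. A minimal sketch, with illustrative parameter values:

    import numpy as np

    def nominal_response_probs(theta, a, c):
        """Bock's nominal response model: P(option k | theta) is a softmax
        over a[k] * theta + c[k] across the K response options."""
        z = np.outer(theta, a) + c           # (n examinees, K options) logits
        z -= z.max(axis=1, keepdims=True)    # stabilize the softmax
        ez = np.exp(z)
        return ez / ez.sum(axis=1, keepdims=True)

    theta = np.array([-2.0, 0.0, 2.0])
    a = np.array([1.2, 0.3, -0.4, -1.1])     # key first, then distractors
    c = np.array([0.5, 0.2, 0.0, -0.7])      # illustrative values
    print(nominal_response_probs(theta, a, c).round(3))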
Deng, Weiling; Monfils, Lora – ETS Research Report Series, 2017
Using simulated data, this study examined the impact of different levels of stringency of the valid case inclusion criterion on item response theory (IRT)-based true score equating over 5 years in the context of K-12 assessment when growth in student achievement is expected. Findings indicate that the use of the most stringent inclusion criterion…
Descriptors: Item Response Theory, Equated Scores, True Scores, Educational Assessment
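For context, IRT true score equating maps a raw score on one form to the ability at which that form's test characteristic curve equals the score, then evaluates the other form's curve at the same ability. A minimal 2PL sketch with hypothetical item parameters (real applications first place both forms on a common scale):

    import numpy as np
    from scipy.optimize import brentq

    def tcc(theta, a, b):
        """Test characteristic curve: expected raw score under the 2PL."""
        return (1 / (1 + np.exp(-a * (theta - b)))).sum()

    rng = np.random.default_rng(3)
    k = 30
    a_x, b_x = rng.uniform(0.8, 1.6, k), rng.normal(0.0, 1.0, k)  # form X
    a_y, b_y = rng.uniform(0.8, 1.6, k), rng.normal(0.2, 1.0, k)  # form Y

    def equate_true_score(x):
        """Find theta with tcc_X(theta) = x, then read off tcc_Y(theta)."""
        theta = brentq(lambda t: tcc(t, a_x, b_x) - x, -8.0, 8.0)
        return tcc(theta, a_y, b_y)

    for x in (10, 15, 20, 25):
        print(f"form X true score {x} -> form Y true score {equate_true_score(x):.2f}")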
Rahman, Taslima; Mislevy, Robert J. – ETS Research Report Series, 2017
To demonstrate how methodologies for assessing reading comprehension can grow out of views of the construct suggested in the reading research literature, we constructed tasks and carried out psychometric analyses that were framed in accordance with 2 leading reading models. In estimating item difficulty and, subsequently, examinee proficiency, an…
Descriptors: Reading Tests, Reading Comprehension, Psychometrics, Test Items
Cao, Yi; Lu, Ru; Tao, Wei – ETS Research Report Series, 2014
The local item independence assumption underlying traditional item response theory (IRT) models is often not met for tests composed of testlets. There are 3 major approaches to addressing this issue: (a) ignore the violation and use a dichotomous IRT model (e.g., the 2-parameter logistic [2PL] model), (b) combine the interdependent items to form a…
Descriptors: Item Response Theory, Equated Scores, Test Items, Simulation
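A compact way to see what approach (a) ignores: in a testlet formulation, a person-by-testlet effect enters the logit and induces dependence among items sharing a stimulus. The sketch below simulates such data; the parameters and the specific formulation are illustrative, not the report's.

    import numpy as np

    def p_testlet(theta, a, b, gamma):
        """2PL with a person-by-testlet effect gamma in the logit; setting
        gamma = 0 recovers the standard 2PL of approach (a)."""
        return 1 / (1 + np.exp(-a * (theta - b + gamma)))

    rng = np.random.default_rng(4)
    n, n_testlets, per_testlet = 1000, 4, 5
    theta = rng.normal(size=n)
    gamma = rng.normal(0.0, 0.8, size=(n, n_testlets))  # testlet effects
    a = rng.uniform(0.8, 1.5, size=(n_testlets, per_testlet))
    b = rng.normal(size=(n_testlets, per_testlet))

    # Items within a testlet share one gamma draw per person, which creates
    # the within-testlet dependence that a plain 2PL ignores.
    resp = np.empty((n, n_testlets, per_testlet), dtype=int)
    for t in range(n_testlets):
        p = p_testlet(theta[:, None], a[t], b[t], gamma[:, [t]])
        resp[:, t] = (rng.random(p.shape) < p).astype(int)

    print("mean raw score:", resp.sum(axis=(1, 2)).mean().round(2))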
Sparks, Jesse R.; Katz, Irvin R.; Beile, Penny M. – ETS Research Report Series, 2016
Digital information literacy (DIL), generally defined as the ability to obtain, understand, evaluate, and use information in a variety of digital technology contexts, is a critically important skill deemed necessary for success in higher education as well as in the global networked economy. To determine whether college graduates possess the…
Descriptors: Technological Literacy, Information Literacy, Higher Education, Definitions
Chen, Haiwen H.; von Davier, Matthias; Yamamoto, Kentaro; Kong, Nan – ETS Research Report Series, 2015
One major issue with large-scale assessments is that the respondents might give no responses to many items, resulting in less accurate estimations of both assessed abilities and item parameters. This report studies how the types of items affect the item-level nonresponse rates and how different methods of treating item-level nonresponses have an…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
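A small sketch of why the treatment of nonresponse matters: scoring missing responses as incorrect versus excluding them yields different classical difficulty estimates whenever missingness is related to ability. The data and missingness mechanism below are hypothetical.

    import numpy as np

    rng = np.random.default_rng(5)
    n, k = 2000, 20
    theta = rng.normal(size=(n, 1))
    b = rng.normal(size=(1, k))
    correct = (rng.random((n, k)) < 1 / (1 + np.exp(-(theta - b)))).astype(float)

    # Hypothetical mechanism: low-ability examinees skip items more often,
    # so the data are not missing at random.
    missing = rng.random((n, k)) < 1 / (1 + np.exp(theta + 1.5))
    resp = np.where(missing, np.nan, correct)

    p_as_wrong = np.nan_to_num(resp, nan=0.0).mean(axis=0)  # missing scored 0
    p_ignored = np.nanmean(resp, axis=0)                    # missing excluded
    print("missing scored as wrong: mean p =", p_as_wrong.mean().round(3))
    print("missing excluded:        mean p =", p_ignored.mean().round(3))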
Wang, Zhen; Yao, Lihua – ETS Research Report Series, 2013
The current study used simulated data to investigate the properties of a newly proposed method (Yao's rater model) for modeling rater severity and its distribution under different conditions. Our study examined the effects of rater severity, distributions of rater severity, the difference between item response theory (IRT) models with rater effect…
Descriptors: Test Format, Test Items, Responses, Computation
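The report's model is not spelled out in the snippet, so the sketch below uses a generic facets-style formulation (not Yao's model specifically) in which a rater severity parameter shifts the log-odds of awarding the higher rating:

    import numpy as np

    def p_high_rating(theta, b_item, severity):
        """Facets-style model (illustrative only, not Yao's formulation):
        rater severity lowers the log-odds of the higher rating."""
        return 1 / (1 + np.exp(-(theta - b_item - severity)))

    rng = np.random.default_rng(6)
    theta = rng.normal(size=5000)
    for severity in (-0.5, 0.0, 0.5):  # lenient, neutral, severe raters
        ratings = rng.random(5000) < p_high_rating(theta, 0.0, severity)
        print(f"severity {severity:+.1f}: mean rating = {ratings.mean():.3f}")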
Deane, Paul; Lawless, René R.; Li, Chen; Sabatini, John; Bejar, Isaac I.; O'Reilly, Tenaha – ETS Research Report Series, 2014
We expect that word knowledge accumulates gradually. This article draws on earlier approaches to assessing depth, but focuses on one dimension: richness of semantic knowledge. We present results from a study in which three distinct item types were developed at three levels of depth: knowledge of common usage patterns, knowledge of broad topical…
Descriptors: Vocabulary, Test Items, Language Tests, Semantics
Moses, Tim; Liu, Jinghua; Tan, Adele; Deng, Weiling; Dorans, Neil J. – ETS Research Report Series, 2013
In this study, differential item functioning (DIF) methods utilizing 14 different matching variables were applied to assess DIF in the constructed-response (CR) items from 6 forms of 3 mixed-format tests. Results suggested that the methods might produce distinct patterns of DIF results for different tests and testing programs, in that the DIF…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Item Analysis
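For dichotomous items, the workhorse DIF statistic is the Mantel-Haenszel common odds ratio computed within strata of a matching variable; CR items typically require polytomous extensions, so treat this as a sketch of the matching idea only. The data are simulated with mild DIF built in.

    import numpy as np

    def mantel_haenszel_odds_ratio(item, group, matching):
        """Mantel-Haenszel common odds ratio for one dichotomous item,
        stratified on an integer matching variable (e.g., total score)."""
        num = den = 0.0
        for s in np.unique(matching):
            m = matching == s
            a = np.sum(m & (group == 0) & (item == 1))  # reference, correct
            b = np.sum(m & (group == 0) & (item == 0))  # reference, incorrect
            c = np.sum(m & (group == 1) & (item == 1))  # focal, correct
            d = np.sum(m & (group == 1) & (item == 0))  # focal, incorrect
            total = a + b + c + d
            if total:
                num += a * d / total
                den += b * c / total
        return num / den if den else np.nan

    rng = np.random.default_rng(7)
    n = 4000
    group = rng.integers(0, 2, n)
    total = rng.integers(0, 41, n)           # stand-in matching score
    logit = (total - 20) / 5 + 0.3 * group   # mild DIF favoring the focal group
    item = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)
    alpha = mantel_haenszel_odds_ratio(item, group, total)
    print(f"MH odds ratio = {alpha:.3f}, MH D-DIF = {-2.35 * np.log(alpha):.3f}")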
Guo, Hongwen; Liu, Jinghua; Curley, Edward; Dorans, Neil – ETS Research Report Series, 2012
This study examines the stability of the SAT Reasoning Test™ score scales from 2005 to 2010. A 2005 old form (OF) was administered along with a 2010 new form (NF). A new conversion for OF was derived through direct equipercentile equating. A comparison of the newly derived and the original OF conversions showed that Critical Reading…
Descriptors: Aptitude Tests, Cognitive Tests, Thinking Skills, Equated Scores
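The scale-drift check itself is simple arithmetic once both conversions exist: difference the re-derived and original raw-to-scale tables and flag score points exceeding a tolerance. Everything below is a hypothetical stand-in for the actual SAT conversions.

    import numpy as np

    raw = np.arange(41)
    original = 200 + 15 * raw                             # stand-in 2005 table
    rederived = original + np.round(3 * np.sin(raw / 6))  # stand-in 2010 table

    drift = rederived - original
    print("max |drift|:", np.abs(drift).max(), "scale points")
    print("raw scores drifting more than 1 point:", raw[np.abs(drift) > 1])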