Papageorgiou, Spiros; Xu, Xiaoqiu; Timpe-Laughlin, Veronika; Dugdale, Deborah M. – Educational Testing Service, 2020
The purpose of this study is to examine the appropriateness of using the "TOEFL Primary®" tests to evaluate the language abilities of students learning English as a foreign language (EFL) through an online-delivered curriculum, the VIPKid Major Course (MC). Data include student test scores on the TOEFL Primary Listening and Reading tests…
Descriptors: Alignment (Education), Language Tests, English (Second Language), Second Language Learning
Sheehan, Kathleen M.; Kostin, Irene; Futagi, Yoko; Flor, Michael – Educational Testing Service, 2010
The Common Core Standards call for students to be exposed to a much greater level of text complexity than has been the norm in schools for the past 40 years. Textbook publishers, teachers, and assessment developers are being asked to refocus materials and methods to ensure that students are challenged to read texts at steadily increasing…
Descriptors: Automation, Content Analysis, Difficulty Level, Readability Formulas
Haberman, Shelby J.; Sinharay, Sandip; Lee, Yi-Hsuan – Educational Testing Service, 2011
Providing information to test takers and test score users about the abilities of test takers at different score levels has been a persistent problem in educational and psychological measurement (Carroll, 1993). Scale anchoring (Beaton & Allen, 1992), a technique that describes what students at different points on a score scale know and can do,…
Descriptors: Statistical Analysis, Scores, Regression (Statistics), Item Response Theory
Middleton, Kyndra; Dorans, Neil J. – Educational Testing Service, 2011
Extreme linkings are performed in settings in which neither equivalent groups nor anchor material is available to link scores on two assessments. Examples of extreme linkages include links between scores on tests administered in different languages or between scores on tests administered across disability groups. The strength of interpretation…
Descriptors: Equated Scores, Testing, Difficulty Level, Test Reliability
Dorans, Neil J. – Educational Testing Service, 2010
Santelices and Wilson (2010) claimed to have addressed technical criticisms of Freedle (2003) presented in Dorans (2004a) and elsewhere. Santelices and Wilson's abstract claimed that their study confirmed that SAT® verbal items do function differently for African American and White subgroups. In this commentary, I demonstrate that the…
Descriptors: College Entrance Examinations, Verbal Tests, Test Bias, Test Items
Stone, Elizabeth; Davey, Tim – Educational Testing Service, 2011
There has been an increased interest in developing computer-adaptive testing (CAT) and multistage assessments for K-12 accountability assessments. The move to adaptive testing has been met with some resistance by those in the field of special education who express concern about routing of students with divergent profiles (e.g., some students with…
Descriptors: Disabilities, Adaptive Testing, Accountability, Computer Assisted Testing
Buzick, Heather M.; Laitusis, Cara Cahalan – Educational Testing Service, 2010
Recently, growth-based approaches to accountability have received considerable attention because they have the potential to reward schools and teachers for improving student performance over time by measuring the progress of students at all levels of the performance spectrum (including those who have not yet reached proficiency on state…
Descriptors: Disabilities, Student Improvement, Accountability, Models
Liu, Jinghua; Sinharay, Sandip; Holland, Paul W.; Feigenbaum, Miriam; Curley, Edward – Educational Testing Service, 2009
This study explores the use of a different type of anchor, a "midi anchor," that has a smaller spread of item difficulties than the tests to be equated, and then contrasts its use with the use of a "mini anchor." The impact of different anchors on observed score equating was evaluated and compared with respect to systematic…
Descriptors: Equated Scores, Test Items, Difficulty Level, Error of Measurement
Sinharay, Sandip; Holland, Paul W. – Educational Testing Service, 2008
The nonequivalent groups with anchor test (NEAT) design involves missing data that are missing by design. Three popular equating methods that can be used with a NEAT design are the poststratification equating method, the chain equipercentile equating method, and the item-response-theory observed-score-equating method. These three methods each…
Descriptors: Equated Scores, Test Items, Item Response Theory, Data
Kostin, Irene – Educational Testing Service, 2004
The purpose of this study is to explore the relationship between a set of item characteristics and the difficulty of TOEFL® dialogue items. Identifying characteristics that are related to item difficulty has the potential to improve the efficiency of the item-writing process. The study employed 365 TOEFL dialogue items, which were coded on 49…
Descriptors: Statistical Analysis, Difficulty Level, Language Tests, English (Second Language)