Showing all 13 results
Deane, Paul; Quinlan, Thomas; Kostin, Irene – Educational Testing Service, 2011
ETS has recently instituted the Cognitively Based Assessments of, for, and as Learning (CBAL) research initiative to create a new generation of assessment designed from the ground up to enhance learning. It is intended as a general approach, covering multiple subject areas including reading, writing, and math. This paper is concerned with the…
Descriptors: Automation, Scoring, Educational Assessment, Writing Tests
Attali, Yigal – Educational Testing Service, 2011
The e-rater[R] automated essay scoring system is used operationally in the scoring of TOEFL iBT[R] independent essays. Previous research has found support for a 3-factor structure of the e-rater features. This 3-factor structure has an attractive hierarchical linguistic interpretation with a word choice factor, a grammatical convention within a…
Descriptors: Essay Tests, Language Tests, Test Scoring Machines, Automation
Attali, Yigal – Educational Testing Service, 2011
This paper proposes an alternative content measure for essay scoring, based on the "difference" in the relative frequency of a word in high-scored versus low-scored essays. The "differential word use" (DWU) measure is the average of these differences across all words in the essay. A positive value indicates the essay is using…
Descriptors: Scoring, Essay Tests, Word Frequency, Content Analysis
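The Attali (2011) entry above defines the differential word use (DWU) measure only informally. Below is a minimal sketch of one plausible reading, in which relative word frequencies estimated from high- and low-scored training essays are differenced and averaged over an essay's tokens; the function names, tokenization, and frequency normalization are assumptions for illustration, not ETS's implementation.

```python
from collections import Counter

def relative_freqs(essays):
    """Relative frequency of each word across a set of essays (assumed normalization)."""
    counts = Counter(w for essay in essays for w in essay.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def dwu(essay, high_freqs, low_freqs):
    """Average, over the essay's tokens, of (frequency among high-scored essays
    minus frequency among low-scored essays); positive values suggest vocabulary
    more typical of high-scored essays."""
    tokens = essay.lower().split()
    if not tokens:
        return 0.0
    diffs = [high_freqs.get(w, 0.0) - low_freqs.get(w, 0.0) for w in tokens]
    return sum(diffs) / len(diffs)

# Hypothetical usage with toy training essays
high = relative_freqs(["the evidence supports a nuanced and coherent argument",
                       "the author develops a persuasive claim with examples"])
low = relative_freqs(["i think it is good", "it is very good and i like it"])
print(dwu("the author supports a coherent claim", high, low))
```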
Dorans, Neil J.; Liang, Longjuan; Puhan, Gautam – Educational Testing Service, 2010
Scores are the most visible and widely used products of a testing program. The choice of score scale has implications for test specifications, equating, and test reliability and validity, as well as for test interpretation. At the same time, the score scale should be viewed as infrastructure likely to require repair at some point. In this report…
Descriptors: Testing Programs, Standard Setting (Scoring), Test Interpretation, Certification
Ricker-Pedley, Kathryn L. – Educational Testing Service, 2011
A pseudo-experimental study was conducted to examine the link between rater accuracy calibration performances and subsequent accuracy during operational scoring. The study asked 45 raters to score a 75-response calibration set and then a 100-response (operational) set of responses from a retired Graduate Record Examinations[R] (GRE[R]) writing…
Descriptors: Scoring, Accuracy, College Entrance Examinations, Writing Tests
Tan, Xuan; Ricker, Kathryn L.; Puhan, Gautam – Educational Testing Service, 2010
This study examines the differences in equating outcomes between two trend score equating designs resulting from two different scoring strategies for trend scoring when operational constructed-response (CR) items are double-scored--the single group (SG) design, where each trend CR item is double-scored, and the nonequivalent groups with anchor…
Descriptors: Equated Scores, Scoring, Responses, Test Items
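As background for the single-group (SG) design mentioned in the Tan, Ricker, and Puhan (2010) entry, the sketch below shows generic linear equating when the same examinees take both forms; it matches means and standard deviations to express a form-X score on the form-Y scale. The synthetic scores and the linear method are assumptions for illustration, not the study's trend-scoring procedure.

```python
import numpy as np

rng = np.random.default_rng(3)
# Single-group design: the same examinees take old form X and new form Y (synthetic scores).
x = rng.normal(50, 10, 1000)
y = 0.9 * x + 8 + rng.normal(0, 3, 1000)

def linear_equate(score_x, x, y):
    """Map a form-X score onto the form-Y scale by matching means and standard deviations."""
    return y.mean() + (y.std() / x.std()) * (score_x - x.mean())

print(linear_equate(55.0, x, y))   # a form-X score of 55 expressed on the form-Y scale
```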
DeCarlo, Lawrence T. – Educational Testing Service, 2010
A basic consideration in large-scale assessments that use constructed response (CR) items, such as essays, is how to allocate the essays to the raters that score them. Designs that are used in practice are incomplete, in that each essay is scored by only a subset of the raters, and also unbalanced, in that the number of essays scored by each rater…
Descriptors: Test Items, Responses, Essay Tests, Scoring
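The DeCarlo (2010) entry above describes rater allocation designs that are incomplete (each essay is scored by only a subset of raters) and unbalanced (raters end up with unequal numbers of essays). A minimal sketch of what such an essay-by-rater incidence matrix can look like follows; the sizes and the random assignment rule are assumptions, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_essays, n_raters, raters_per_essay = 12, 5, 2

# Incomplete design: each essay is scored by only a small random subset of raters.
design = np.zeros((n_essays, n_raters), dtype=int)
for e in range(n_essays):
    design[e, rng.choice(n_raters, size=raters_per_essay, replace=False)] = 1

print(design)                                    # 1 = this rater scored this essay
print("essays per rater:", design.sum(axis=0))   # typically unequal, i.e. unbalanced
```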
Bennett, Randy Elliot – Educational Testing Service, 2011
CBAL, an acronym for Cognitively Based Assessment of, for, and as Learning, is a research initiative intended to create a model for an innovative K-12 assessment system that provides summative information for policy makers, as well as formative information for classroom instructional purposes. This paper summarizes empirical results from 16 CBAL…
Descriptors: Educational Assessment, Elementary Secondary Education, Summative Evaluation, Formative Evaluation
Santelices, Maria Veronica; Ugarte, Juan Jose; Flotts, Paulina; Radovic, Darinka; Kyllonen, Patrick – Educational Testing Service, 2011
This paper presents the development and initial validation of new measures of critical thinking and noncognitive attributes that were designed to supplement existing standardized tests used in the admissions system for higher education in Chile. The importance of various facets of this process, including the establishment of technical rigor and…
Descriptors: Foreign Countries, College Entrance Examinations, Test Construction, Test Validity
Deane, Paul – Educational Testing Service, 2011
This paper presents a socio-cognitive framework for connecting writing pedagogy and writing assessment with modern social and cognitive theories of writing. It focuses on providing a general framework that highlights the connections between writing competency and other literacy skills; identifies key connections between literacy instruction,…
Descriptors: Writing (Composition), Writing Evaluation, Writing Tests, Cognitive Ability
Dorans, Neil J.; Liu, Jinghua – Educational Testing Service, 2009
The equating process links scores from different editions of the same test. For testing programs that build nearly parallel forms to the same explicit content and statistical specifications and administer forms under the same conditions, the linkings between the forms are expected to be equatings. Score equity assessment (SEA) provides a useful…
Descriptors: Testing Programs, Mathematics Tests, Quality Control, Psychometrics
Chodorow, Martin; Burstein, Jill – Educational Testing Service, 2004
This study examines the relation between essay length and holistic scores assigned to Test of English as a Foreign Language[TM] (TOEFL[R]) essays by e-rater[R], the automated essay scoring system developed by ETS. Results show that an early version of the system, e-rater99, accounted for little variance in human reader scores beyond that which…
Descriptors: Essays, Test Scoring Machines, English (Second Language), Student Evaluation
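The Chodorow and Burstein (2004) entry above turns on how much variance in human scores an automated feature explains beyond essay length. The sketch below illustrates that kind of incremental R-squared comparison with two nested linear regressions on synthetic data; the variable names, data-generating choices, and use of scikit-learn are assumptions, not the study's method.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
length = rng.normal(300, 80, n)            # essay length in words (synthetic)
quality = rng.normal(0, 1, n)              # latent writing quality (synthetic)
human_score = 0.01 * length + 0.5 * quality + rng.normal(0, 0.5, n)
auto_feature = 0.008 * length + 0.1 * quality + rng.normal(0, 0.3, n)

# R^2 from essay length alone
X_len = length.reshape(-1, 1)
r2_length = LinearRegression().fit(X_len, human_score).score(X_len, human_score)

# R^2 when the automated feature is added to length
X_both = np.column_stack([length, auto_feature])
r2_both = LinearRegression().fit(X_both, human_score).score(X_both, human_score)

print(f"R^2 (length only): {r2_length:.3f}")
print(f"Incremental R^2 from the automated feature: {r2_both - r2_length:.3f}")
```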
Rizavi, Saba; Way, Walter D.; Davey, Tim; Herbert, Erin – Educational Testing Service, 2004
Item parameter estimates vary for a variety of reasons, including estimation error, characteristics of the examinee samples, and context effects (e.g., item location effects, section location effects, etc.). Although we expect variation based on theory, there is reason to believe that observed variation in item parameter estimates exceeds what…
Descriptors: Adaptive Testing, Test Items, Computation, Context Effect
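The Rizavi, Way, Davey, and Herbert (2004) entry above distinguishes the variation in item parameter estimates we expect from estimation error alone from the excess variation attributed to sample characteristics and context effects. The simulation below shows only the estimation-error component for a single Rasch item, re-estimated across repeated examinee samples with a deliberately crude difficulty index; the model, sample sizes, and estimator are assumptions for illustration, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(2)
true_b = 0.5                        # true Rasch item difficulty (assumed)
n_examinees, n_replications = 2000, 200

estimates = []
for _ in range(n_replications):
    theta = rng.normal(0, 1, n_examinees)          # examinee abilities
    p = 1 / (1 + np.exp(-(theta - true_b)))        # Rasch response probabilities
    responses = rng.random(n_examinees) < p
    p_hat = responses.mean()
    # Crude difficulty index: negative logit of the observed item p-value.
    estimates.append(-np.log(p_hat / (1 - p_hat)))

print("SD of estimates across samples (estimation/sampling error only):",
      round(float(np.std(estimates)), 3))
```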