Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 1 |
| Since 2007 (last 20 years) | 16 |
Author
| Akour, Mutasem | 1 |
| Barford, Sean W. | 1 |
| Baxter, Gail P. | 1 |
| Bond, Lloyd | 1 |
| Breyer, F. Jay | 1 |
| Brown, Gavin T. L. | 1 |
| Chakwera, Elias | 1 |
| Chiero, Robin | 1 |
| Childs, Ruth A. | 1 |
| Crocker, Linda | 1 |
| Deng, Hui | 1 |
Publication Type
| Journal Articles | 46 |
| Reports - Research | 17 |
| Reports - Evaluative | 16 |
| Reports - Descriptive | 12 |
| Opinion Papers | 2 |
| Speeches/Meeting Papers | 2 |
| Tests/Questionnaires | 2 |
| Book/Product Reviews | 1 |
Education Level
| Higher Education | 6 |
| Elementary Secondary Education | 5 |
| Postsecondary Education | 4 |
| Elementary Education | 3 |
| Grade 5 | 1 |
| Grade 6 | 1 |
| Grade 7 | 1 |
| Grade 8 | 1 |
| High Schools | 1 |
| Middle Schools | 1 |
| Secondary Education | 1 |
Audience
| Practitioners | 3 |
| Teachers | 3 |
| Policymakers | 1 |
Location
| United States | 3 |
| Australia | 2 |
| Canada | 2 |
| California | 1 |
| Denmark | 1 |
| France | 1 |
| Kentucky | 1 |
| Malawi | 1 |
| New Zealand | 1 |
| Pennsylvania | 1 |
| Slovakia | 1 |
Laws, Policies, & Programs
| Americans with Disabilities… | 1 |
| Individuals with Disabilities… | 1 |
| No Child Left Behind Act 2001 | 1 |
| Rehabilitation Act 1973… | 1 |
Haberman, Shelby J. – ETS Research Report Series, 2020
Best linear prediction (BLP) and penalized best linear prediction (PBLP) are techniques for combining sources of information to produce task scores, section scores, and composite test scores. The report examines issues to consider in operational implementation of BLP and PBLP in testing programs administered by ETS [Educational Testing Service].
Descriptors: Prediction, Scores, Tests, Testing Programs
Akour, Mutasem; Sabah, Saed; Hammouri, Hind – Journal of Psychoeducational Assessment, 2015
The purpose of this study was to apply two types of Differential Item Functioning (DIF), net and global DIF, as well as the framework of Differential Step Functioning (DSF) to real testing data to investigate measurement invariance related to test language. Data from the Program for International Student Assessment (PISA)-2006 polytomously scored…
Descriptors: Test Bias, Science Tests, Test Items, Scoring
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2015
An equating procedure for a testing program with evolving distribution of examinee profiles is developed. No anchor is available because the original scoring scheme was based on expert judgment of the item difficulties. Pairs of examinees from two administrations are formed by matching on coarsened propensity scores derived from a set of…
Descriptors: Equated Scores, Testing Programs, College Entrance Examinations, Scoring
Royal, Kenneth D.; Gilliland, Kurt O.; Kernick, Edward T. – Anatomical Sciences Education, 2014
Any examination that involves moderate to high stakes implications for examinees should be psychometrically sound and legally defensible. Currently, there are two broad and competing families of test theories that are used to score examination data. The majority of instructors outside the high-stakes testing arena rely on classical test theory…
Descriptors: Item Response Theory, Scoring, Evaluation Methods, Anatomy
Ujifusa, Andrew – Education Week, 2012
Results from new state tests in Kentucky--the first in the nation explicitly tied to the Common Core State Standards--show that the share of students scoring "proficient" or better in reading and math dropped by roughly a third or more in both elementary and middle school the first year the tests were given. Kentucky in 2010 was the…
Descriptors: Academic Achievement, State Standards, Scoring, Testing Programs
Mrazik, Martin; Janzen, Troy M.; Dombrowski, Stefan C.; Barford, Sean W.; Krawchuk, Lindsey L. – Canadian Journal of School Psychology, 2012
A total of 19 graduate students enrolled in a graduate course conducted 6 consecutive administrations of the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV, Canadian version). Test protocols were examined to obtain data describing the frequency of examiner errors, including administration and scoring errors. Results identified 511…
Descriptors: Intelligence Tests, Intelligence, Statistical Analysis, Scoring
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
McCurry, Doug – English in Australia, 2010
This article provides an introduction to the kind of computer software that is used to score student writing in some high stakes testing programs, and that is being promoted as a teaching and learning tool to schools. It sketches the state of play with machines for the scoring of writing, and describes how these machines work and what they do.…
Descriptors: Testing Programs, High Stakes Tests, Computer Software, Scoring
Rock, Donald A. – ETS Research Report Series, 2012
This paper provides a history of ETS's role in developing assessment instruments and psychometric procedures for measuring change in large-scale national assessments funded by the Longitudinal Studies branch of the National Center for Education Statistics. It documents the innovations developed during more than 30 years of working with…
Descriptors: Models, Educational Change, Longitudinal Studies, Educational Development
Puhan, Gautam – Applied Measurement in Education, 2009
The purpose of this study is to determine the extent of scale drift on a test that employs cut scores. It was essential to examine scale drift for this testing program because new forms in this testing program are often put on scale through a series of intermediate equatings (known as equating chains). This process may cause equating error to…
Descriptors: Testing Programs, Testing, Measurement Techniques, Item Response Theory
Peterson, Shelley Stagg; McClay, Jill; Main, Kristin – Alberta Journal of Educational Research, 2011
This paper reports on an analysis of large-scale assessments of Grades 5-8 students' writing across 10 provinces and 2 territories in Canada. Theory, classroom practice, and the contributions and constraints of large-scale writing assessment are brought together with a focus on Grades 5-8 writing in order to provide both a broad view of…
Descriptors: Foreign Countries, Writing Evaluation, Writing Tests, Measures (Individuals)
Robelen, Erik W. – Education Week, 2008
Nearly four years after a front-page story in "The New York Times" sparked a fierce debate by suggesting that charter school students nationally were lagging academically behind their peers in regular public schools, the national testing program that informed the controversy has generated far more data for researchers and advocates to scrutinize.…
Descriptors: Charter Schools, Testing Programs, Sample Size, Reading Instruction
Childs, Ruth A.; Jaciw, Andrew P.; Saunders, Kelsey – International Journal of Testing, 2007
Many approaches to standard-setting use item calibration and student score estimation results to structure panelists' tasks. However, this requires collecting standard-setting judgments after the item analysis results are available. The Scoring Guide Alignment approach collects standard-setting judgments during the scoring sessions from teachers…
Descriptors: Testing Programs, Scoring, Item Analysis, Test Items
Guaglianone, Curtis L.; Payne, Maggie; Kinsey, Gary W.; Chiero, Robin – Issues in Teacher Education, 2009
This article is based on the perceptions of California State University administrators and provides a comparative study of the challenges and benefits resulting from the implementation of the teaching performance assessment requirement of SB 2042 standards 19-21 on the California State University (CSU) campuses. With 23 campuses and almost 450,000…
Descriptors: Preservice Teacher Education, Performance Based Assessment, Comparative Analysis, State Universities
Xi, Xiaoming; Higgins, Derrick; Zechner, Klaus; Williamson, David M. – ETS Research Report Series, 2008
This report presents the results of a research and development effort for SpeechRater℠ Version 1.0 (v1.0), an automated scoring system for the spontaneous speech of English language learners used operationally in the Test of English as a Foreign Language™ (TOEFL®) Practice Online assessment (TPO). The report includes a summary of the validity…
Descriptors: Speech, Scoring, Scoring Rubrics, Scoring Formulas

