ERIC - Search Results

Showing 1 to 15 of 113 results

A Comparison of IRT Linking Approaches under the Nonequivalent Groups Anchor Test Design

Jiajing Huang – ProQuest LLC, 2022

The nonequivalent-groups anchor-test (NEAT) data-collection design is commonly used in large-scale assessments. Under this design, different test groups take different test forms. Each test form has its own unique items and all test forms share a set of common items. If item response theory (IRT) models are applied to analyze the test data, the…

Descriptors: Item Response Theory, Test Format, Test Items, Test Construction

Validation Methods for Aggregate-Level Test Scale Linking: A Case Study Mapping School District Test Score Distributions to a Common Scale

Peer reviewed

Reardon, Sean F.; Kalogrides, Demetra; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2021

Linking score scales across different tests is considered speculative and fraught, even at the aggregate level. We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that aggregate linkages can be validated both…

Descriptors: Equated Scores, Validity, Methods, School Districts

Does Testing Date Impact Student Scores on the ACT? Technical Brief

Camara, Wayne J.; Allen, Jeff – ACT, Inc., 2017

Students must choose when to take the ACT for the first time and if and when to retest. States and districts that administer the ACT test to all students must also choose when to administer the test. A key consideration in making these decisions is the impact on scores. Because the ACT is a curriculum-based test of academic achievement, students…

Descriptors: Scores, Time Perspective, Scheduling, Testing

Adapting Accountability Systems to the Limitations of Educational Measurement

Peer reviewed

Kane, Michael – Measurement: Interdisciplinary Research and Perspectives, 2015

Michael Kane writes in this article that he is in more or less complete agreement with Professor Koretz's characterization of the problem outlined in the paper published in this issue of "Measurement." Kane agrees that current testing practices are not adequate for test-based accountability (TBA) systems, but he writes that he is far…

Descriptors: Educational Testing, Accountability, Standardized Tests, Equated Scores

Psychometric Consequences of Subpopulation Item Parameter Drift

Peer reviewed

Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017

This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…

Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing

New York Charter Schools Outperform Traditional Selective Public Schools: More Evidence That Cream-Skimming Is Not Driving Charters' Success. Report 33

Winters, Marcus A. – Manhattan Institute for Policy Research, 2017

Critics of charter schools in New York City, America's largest school district, often allege that charters score better on standardized tests, on average, than traditional public schools because charters "cream-skim" (i.e., attract) the brightest, most motivated, students. Yet this accusation neglects the fact that not all traditional…

Descriptors: Charter Schools, Public Schools, School Effectiveness, Success

Observed Score and True Score Equating Procedures for Multidimensional Item Response Theory

Brossman, Bradley Grant – ProQuest LLC, 2010

The purpose of this research was to develop observed score and true score equating procedures to be used in conjunction with the Multidimensional Item Response Theory (MIRT) framework. Currently, MIRT scale linking procedures exist to place item parameter estimates and ability estimates on the same scale after separate calibrations are conducted.…

Descriptors: Item Response Theory, Equated Scores, True Scores, Standardized Tests

Observed-Score Equating with a Heterogeneous Target Population

Peer reviewed

Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012

Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…

Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis

Relationship between Air Traffic Selection and Training (AT-SAT) Battery Test Scores and Composite Scores in the Initial en Route Air Traffic Control Qualification Training Course at the Federal Aviation Administration (FAA) Academy

Kelley, Ronald Scott – ProQuest LLC, 2012

Scope and Method of Study: This study focused on the development and use of the AT-SAT test battery and the Initial En Route Qualification training course for the selection, training, and evaluation of air traffic controller candidates. The Pearson product moment correlation coefficient was used to measure the linear relationship between the…

Descriptors: Traffic Safety, Scores, Equated Scores, Multiple Regression Analysis

Confusion in the Ranks: How Good Are England's Schools?

Smithers, Alan – Sutton Trust, 2013

Understanding how well English education performs compared with other countries is a valuable exercise, particularly because the information can help England and other countries learn from successful systems. The most recent international league tables of pupil performance differ considerably. England languishes well down the list in PISA 2009,…

Descriptors: Foreign Countries, School Effectiveness, National Competency Tests, Classification

Population Invariance of Vertical Scaling Results

Powers, Sonya; Turhan, Ahmet; Binici, Salih – Pearson, 2012

The population sensitivity of vertical scaling results was evaluated for a state reading assessment spanning grades 3-10 and a state mathematics test spanning grades 3-8. Subpopulations considered included males and females. The 3-parameter logistic model was used to calibrate math and reading items and a common item design was used to construct…

Descriptors: Scaling, Equated Scores, Standardized Tests, Reading Tests

Stability of Rasch Scales over Time

Peer reviewed

Taylor, Catherine S.; Lee, Yoonsun – Applied Measurement in Education, 2010

Item response theory (IRT) methods are generally used to create score scales for large-scale tests. Research has shown that IRT scales are stable across groups and over time. Most studies have focused on items that are dichotomously scored. Now Rasch and other IRT models are used to create scales for tests that include polytomously scored items.…

Descriptors: Measures (Individuals), Item Response Theory, Robustness (Statistics), Item Analysis

The Impact of Item Position Change on Item Parameters and Common Equating Results under the 3PL Model

Meyers, Jason L.; Murphy, Stephen; Goodman, Joshua; Turhan, Ahmet – Pearson, 2012

Operational testing programs employing item response theory (IRT) applications benefit from the property of item parameter invariance whereby item parameter estimates obtained from one sample can be applied to other samples (when the underlying assumptions are satisfied). In theory, this feature allows for applications such as computer-adaptive…

Descriptors: Equated Scores, Test Items, Test Format, Item Response Theory

Performance of a Generic Approach in Automated Essay Scoring

Peer reviewed

Attali, Yigal; Bridgeman, Brent; Trapani, Catherine – Journal of Technology, Learning, and Assessment, 2010

A generic approach in automated essay scoring produces scores that have the same meaning across all prompts, existing or new, of a writing assessment. This is accomplished by using a single set of linguistic indicators (or features), a consistent way of combining and weighting these features into essay scores, and a focus on features that are not…

Descriptors: Writing Evaluation, Writing Tests, Scoring, Test Scoring Machines

Anchor Test Type and Population Invariance: An Exploration across Subpopulations and Test Administrations

Peer reviewed

Dorans, Neil J.; Liu, Jinghua; Hammond, Shelby – Applied Psychological Measurement, 2008

This exploratory study was built on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population…

Descriptors: Test Format, Equated Scores, Sampling, Evaluation Methods

