ERIC - Search Results

Showing all 9 results

State-Funded PreK Policies on External Classroom Observations: Issues and Status. Policy Information Report

Ackerman, Debra J. – Educational Testing Service, 2014

Early education programs are increasingly being promoted by states and the federal government as an integral part of their efforts to ensure that all children enter school ready to learn. As these programs and their enrollments have grown in recent years, so too have efforts to monitor their quality and performance. A common focus is on…

Descriptors: Preschool Education, State Policy, Observation, Validity

An Examination of the Link between Rater Calibration Performance and Subsequent Scoring Accuracy in Graduate Record Examinations® (GRE®) Writing. Research Report. ETS RR-11-03

Ricker-Pedley, Kathryn L. – Educational Testing Service, 2011

A pseudo-experimental study was conducted to examine the link between rater accuracy calibration performances and subsequent accuracy during operational scoring. The study asked 45 raters to score a 75-response calibration set and then a 100-response (operational) set of responses from a retired Graduate Record Examinations® (GRE®) writing…

Descriptors: Scoring, Accuracy, College Entrance Examinations, Writing Tests

Reliability and Validity of Inferences about Teachers Based on Student Scores. William H. Angoff Memorial Lecture Series

Haertel, Edward H. – Educational Testing Service, 2013

Policymakers and school administrators have embraced value-added models of teacher effectiveness as tools for educational improvement. Teacher value-added estimates may be viewed as complicated scores of a certain kind. This suggests using a test validation model to examine their reliability and validity. Validation begins with an interpretive…

Descriptors: Reliability, Validity, Inferences, Teacher Effectiveness

Sources of Score Scale Inconsistency. Research Report. ETS RR-11-10

Haberman, Shelby J.; Dorans, Neil J. – Educational Testing Service, 2011

For testing programs that administer multiple forms within a year and across years, score equating is used to ensure that scores can be used interchangeably. In an ideal world, sample sizes are large and representative of populations that hardly change over time, and very reliable alternate test forms are built with nearly identical psychometric…

Descriptors: Scores, Reliability, Equated Scores, Test Construction

When Can Subscores Be Expected to Have Added Value? Results from Operational and Simulated Data. Research Report. ETS RR-10-16

Sinharay, Sandip – Educational Testing Service, 2010

Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman (2008) suggested a method based on classical test theory to determine whether subscores have added value over total scores. This paper provides a literature review and reports when subscores were found to have added value for…

Descriptors: Scores, Correlation, Reliability, Item Response Theory

How Does the Knowledge of Subgroup Membership of Examinees Affect the Prediction of True Subscores? Research Report. ETS RR-11-43

Haberman, Shelby J.; Sinharay, Sandip – Educational Testing Service, 2011

Subscores are reported for several operational assessments. Haberman (2008) suggested a method based on classical test theory to determine if the true subscore is predicted better by the corresponding subscore or the total score. Researchers are often interested in learning how different subgroups perform on subtests. Stricker (1993) and…

Descriptors: True Scores, Test Theory, Prediction, Group Membership

Modeling Nonignorable Missing Data with Item Response Theory (IRT). Research Report. ETS RR-10-11

Rose, Norman; von Davier, Matthias; Xu, Xueli – Educational Testing Service, 2010

Large-scale educational surveys are low-stakes assessments of educational outcomes conducted using nationally representative samples. In these surveys, students do not receive individual scores, and the outcome of the assessment is inconsequential for respondents. The low-stakes nature of these surveys, as well as variations in average performance…

Descriptors: Item Response Theory, Educational Assessment, Data Analysis, Case Studies

Errors of Measurement, Theory, and Public Policy. William H. Angoff Memorial Lecture Series

Kane, Michael – Educational Testing Service, 2010

The 12th annual William H. Angoff Memorial Lecture was presented by Dr. Michael T. Kane, ETS's (Educational Testing Service) Samuel J. Messick Chair in Test Validity and the former Director of Research at the National Conference of Bar Examiners. Dr. Kane argues that it is important for policymakers to recognize the impact of errors of measurement…

Descriptors: Error of Measurement, Scores, Public Policy, Test Theory

Score Comparability for Language Minority Students on the Content Assessments Used by Two States. Research Report. ETS RR-11-27

Young, John W.; Holtzman, Steven; Steinberg, Jonathan – Educational Testing Service, 2011

In this research investigation of score comparability for language minority students (English language learners [ELLs] and former English language learners), we examined 3 indicators of score comparability (reliability, internal test structure, and differential item functioning) for 4th and 8th grade students who took the NCLB-mandated content…

Descriptors: Language Minorities, Second Language Learning, Grade 8, Minority Group Students