ERIC - Search Results

Showing 1 to 15 of 39 results

Digital Module 09: Sociocognitive Assessment for Diverse Populations

Peer reviewed

Mislevy, Robert J.; Oliveri, Maria Elena – Educational Measurement: Issues and Practice, 2019

In this digital ITEMS module, Dr. Robert [Bob] Mislevy and Dr. Maria Elena Oliveri introduce and illustrate a sociocognitive perspective on educational measurement, which focuses on a variety of design and implementation considerations for creating fair and valid assessments for learners from diverse populations with diverse sociocultural…

Descriptors: Educational Testing, Reliability, Test Validity, Test Reliability

Do Adjusted Subscores Lack Validity? Don't Blame the Messenger

Peer reviewed

Sinharay, Sandip; Haberman, Shelby J.; Wainer, Howard – Educational and Psychological Measurement, 2011

There are several techniques that increase the precision of subscores by borrowing information from other parts of the test. These techniques have been criticized on validity grounds in several of the recent publications. In this note, the authors question the argument used in these publications and suggest both inherent limits to the validity…

Descriptors: Scores, Methods, Validity, Reliability

Value of Value-Added Models Based on Student Outcomes to Evaluate Teaching

Peer reviewed

Berk, Ronald A. – Journal of Faculty Development, 2016

Recently, student outcomes have bubbled to the top of debates about how to evaluate teaching in community and liberal arts colleges, universities, and professional schools, but even more international attention has been riveted on how outcomes are being used to evaluate teachers and administrators K-12 (Harris, 2012; Rowen & Raudenbush, 2016;…

Descriptors: Value Added Models, Academic Achievement, Outcomes of Education, Teacher Evaluation

Large-Scale Academic Achievement Testing of Deaf and Hard-of-Hearing Students: Past, Present, and Future

Peer reviewed

Qi, Sen; Mitchell, Ross E. – Journal of Deaf Studies and Deaf Education, 2012

The first large-scale, nationwide academic achievement testing program using Stanford Achievement Test (Stanford) for deaf and hard-of-hearing children in the United States started in 1969. Over the past three decades, the Stanford has served as a benchmark in the field of deaf education for assessing student academic achievement. However, the…

Descriptors: Testing Programs, Educational Testing, Deafness, Academic Achievement

Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

Peer reviewed

Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Multivariate Behavioral Research, 2010

Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…

Descriptors: Educational Testing, Scores, Reports, Psychometrics

Measuring Teaching Using Value-Added Modeling: The Imperfect Panacea

Peer reviewed

Scherrer, Jimmy – NASSP Bulletin, 2011

The use of value-added modeling (VAM) in school accountability is expanding. However, trying to decide how to embrace VAM can be rather nettlesome. Some experts claim it is "too unreliable," causes "more harm than good," and has "a big margin for error," while other experts assert VAM is "imperfect, but…

Descriptors: Teacher Effectiveness, Accountability, Inferences, Validity

Online or Face-to-Face? An Experimental Study of Examiner Training

Peer reviewed

Chamberlain, Suzanne; Taylor, Rachel – British Journal of Educational Technology, 2011

Thousands of examiners are employed to mark candidate scripts from the suite of public examinations offered to students during the compulsory and post-compulsory schooling phases in England, Northern Ireland and Wales. All examiners undergo training to ensure that they interpret correctly, and apply consistently, the mark scheme for their…

Descriptors: Foreign Countries, Examiners, Training Methods, Educational Testing

Is It Just a Bad Class? Assessing the Stability of Measured Teacher Performance. CEDR Working Paper No. 2010-3.0

Goldhaber, Dan; Hansen, Michael – Center for Education Data & Research, 2010

Economic theory commonly models unobserved worker quality as a given parameter that is fixed over time, but empirical evidence supporting this assumption is sparse. In this paper we report on work estimating the stability of value-added estimates of teacher effects, an important area of investigation given that new workforce policies implicitly…

Descriptors: Teacher Effectiveness, Reliability, Evidence, Teacher Evaluation

Secondary School Students' Attitudes toward Fitness Testing

Mercier, Kevin John – ProQuest LLC, 2011

The purpose of this investigation was to develop an instrument that has scores that are valid and reliable for measuring students' attitudes toward fitness testing. A second purpose of the study was to determine the attitudes of secondary students toward fitness testing. A review of literature, an elicitation study, and a pilot study were…

Descriptors: Student Attitudes, Females, Testing, Reliability

Matching Time of Day and Preference for Adolescent Achievement

Parker, Leisha Moree – ProQuest LLC, 2009

Research shows that adolescents enter a circadian-phase delay as they approach and enter high school. At about age 14, teens become less suited to morning learning due to biological factors. Researchers have determined consequences to the adolescent's circadian shift as related to learning; therefore, morning time may have a negative influence on the…

Descriptors: High Schools, Secondary School Teachers, Adolescents, Academic Achievement

Score Comparability for Language Minority Students on the Content Assessments Used by Two States. Research Report. ETS RR-11-27

Young, John W.; Holtzman, Steven; Steinberg, Jonathan – Educational Testing Service, 2011

In this research investigation of score comparability for language minority students (English language learners [ELLs] and former English language learners), we examined 3 indicators of score comparability (reliability, internal test structure, and differential item functioning) for 4th and 8th grade students who took the NCLB-mandated content…

Descriptors: Language Minorities, Second Language Learning, Grade 8, Minority Group Students

Exploring Teacher Effectiveness Using Hierarchical Linear Models: Student- and Classroom-Level Predictors and Cross-Year Stability in Elementary School Reading

Peer reviewed

Munoz, Marco A.; Prather, Joseph R.; Stronge, James H. – Planning and Changing, 2011

Teacher effectiveness and evaluation using student growth measures is a popular reform strategy in education. Teachers can make a difference in student academic growth, but a question that begs an answer is how to go about measuring this impact. This study examines models of teacher effectiveness and the development of hierarchical linear models…

Descriptors: Reading Instruction, Elementary Education, Urban Schools, Teacher Effectiveness

Reliability and Validity of Information about Student Achievement: Comparing Large-Scale and Classroom Testing Contexts

Peer reviewed

Cizek, Gregory J. – Theory Into Practice, 2009

Reliability and validity are two characteristics that must be considered whenever information about student achievement is collected. However, those characteristics--and the methods for evaluating them--differ in large-scale testing and classroom testing contexts. This article presents the distinctions between reliability and validity in the two…

Descriptors: Academic Achievement, Validity, Measures (Individuals), Reliability

An Essay on the History and Future of Reliability from the Perspective of Replications

Peer reviewed

Brennan, Robert L. – Journal of Educational Measurement, 2001

Reviews important milestones in the history of reliability, current issues related to reliability, and likely prospects for reliability from the perspective of what constitutes a replication of a measurement procedure. Pays special attention to the fixed/random aspects of facets that characterize replications. (SLD)

Descriptors: Educational Testing, Measurement Techniques, Reliability

How Much Can We Reliably Know about What Examinees Know?

Peer reviewed

Sinharay, Sandip; Haberman, Shelby J. – Measurement: Interdisciplinary Research and Perspectives, 2009

In this commentary, the authors discuss some of the issues regarding the use of diagnostic classification models that practitioners should keep in mind. In the authors' experience, these issues are not as well known as they should be. The authors then provide recommendations on diagnostic scoring.

Descriptors: Scoring, Reliability, Validity, Classification
