ERIC - Search Results

Showing 1 to 15 of 29 results

Measurement Error Correction Formula for Cluster-Level Group Differences in Cluster Randomized and Observational Studies

Peer reviewed

Cho, Sun-Joo; Preacher, Kristopher J. – Educational and Psychological Measurement, 2016

Multilevel modeling (MLM) is frequently used to detect cluster-level group differences in cluster randomized trial and observational studies. Group differences on the outcomes (posttest scores) are detected by controlling for the covariate (pretest scores) as a proxy variable for unobserved factors that predict future attributes. The pretest and…

Descriptors: Error of Measurement, Error Correction, Multivariate Analysis, Hierarchical Linear Modeling

The Reliability and Precision of Total Scores and IRT Estimates as a Function of Polytomous IRT Parameters and Latent Trait Distribution

Peer reviewed

Culpepper, Steven Andrew – Applied Psychological Measurement, 2013

A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…

Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement

How Often Do Subscores Have Added Value? Results from Operational and Simulated Data

Peer reviewed

Sinharay, Sandip – Journal of Educational Measurement, 2010

Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman suggested a method based on classical test theory to determine whether subscores have added value over total scores. In this article I first provide a rich collection of results regarding when subscores were found to have added…

Descriptors: Scores, Test Theory, Simulation, Reliability

Development and Validation of the Star Properties Concept Inventory

Peer reviewed

Bailey, Janelle M.; Johnson, Bruce; Prather, Edward E.; Slater, Timothy F. – International Journal of Science Education, 2012

Concept inventories (CIs)--typically multiple-choice instruments that focus on a single or small subset of closely related topics--have been used in science education for more than a decade. This paper describes the development and validation of a new CI for astronomy, the "Star Properties Concept Inventory" (SPCI). Questions cover the areas of…

Descriptors: Educational Strategies, Validity, Testing, Astronomy

Measurement of Classroom Teaching Quality with Item Response Theory

Peer reviewed

Kelcey, Ben; McGinn, Daniel; Hill, Heather – Society for Research on Educational Effectiveness, 2013

Recent policy has charged schools and districts with maintaining highly qualified teachers and differentiating among teachers in terms of their effectiveness (U.S. Department of Education, 2009). This emphasis has driven the development and implementation of teacher quality measures which are increasingly being used to evaluate teachers with…

Descriptors: Teacher Effectiveness, Measures (Individuals), Observation, Teacher Evaluation

Florida Center for Reading Research (FCRR) Reading Assessment (FRA): Kindergarten to Grade 2. Technical Manual

Foorman, Barbara R.; Petscher, Yaacov; Schatschneider, Chris – Florida Center for Reading Research, 2015

The grades K-2 Florida Center for Reading Research (FCRR) Reading Assessment (FRA) consists of computer-adaptive alphabetic and oral language screening tasks that provide a Probability of Literacy Success (PLS) linked to grade-level performance (i.e., the 40th percentile) on the word reading (in kindergarten) or reading comprehension (in grades…

Descriptors: Reading Instruction, Reading Tests, Kindergarten, Grade 1

When Can Subscores Be Expected to Have Added Value? Results from Operational and Simulated Data. Research Report. ETS RR-10-16

Sinharay, Sandip – Educational Testing Service, 2010

Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman (2008) suggested a method based on classical test theory to determine whether subscores have added value over total scores. This paper provides a literature review and reports when subscores were found to have added value for…

Descriptors: Scores, Correlation, Reliability, Item Response Theory

Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

Peer reviewed

Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Multivariate Behavioral Research, 2010

Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…

Descriptors: Educational Testing, Scores, Reports, Psychometrics

Development of the Enzyme-Substrate Interactions Concept Inventory

Peer reviewed

Bretz, Stacey Lowery; Linenberger, Kimberly J. – Biochemistry and Molecular Biology Education, 2012

Enzyme function is central to student understanding of multiple topics within the biochemistry curriculum. In particular, students must understand how enzymes and substrates interact with one another. This manuscript describes the development of a 15-item Enzyme-Substrate Interactions Concept Inventory (ESICI) that measures student understanding…

Descriptors: Biochemistry, Science Education, Science Instruction, Scientific Concepts

Teaching Introductory Measurement: Suggestions for What to Include and How to Motivate Students

Peer reviewed

Bandalos, Deborah L.; Kopp, Jason P. – Educational Measurement: Issues and Practice, 2012

In this article, we discuss the importance of measurement literacy and some issues encountered in teaching introductory measurement courses. We present results from a survey of introductory measurement instructors, including information about the topics included in such courses and the amount of time spent on each. Topics that were included by the…

Descriptors: Class Activities, Motivation Techniques, Item Analysis, Test Theory

The Utility of Augmented Subscores in a Licensure Exam: An Evaluation of Methods Using Empirical Data

Peer reviewed

Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010

Will subscores provide additional information than what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…

Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods

A Study of General Education Astronomy Students' Understandings of Cosmology. Part II. Evaluating Four Conceptual Cosmology Surveys: A Classical Test Theory Approach

Peer reviewed

Wallace, Colin S.; Prather, Edward E.; Duncan, Douglas K. – Astronomy Education Review, 2011

This is the second of five papers detailing our national study of general education astronomy students' conceptual and reasoning difficulties with cosmology. This article begins our quantitative investigation of the data. We describe how we scored students' responses to four conceptual cosmology surveys, and we present evidence for the inter-rater…

Descriptors: Astronomy, Scientific Concepts, College Students, Introductory Courses

Defensible Progress Monitoring Data for Medium- and High-Stakes Decisions

Peer reviewed

Parker, Richard I.; Vannest, Kimberly J.; Davis, John L.; Clemens, Nathan H. – Journal of Special Education, 2012

Within a response to intervention model, educators increasingly use progress monitoring (PM) to support medium- to high-stakes decisions for individual students. For PM to serve these more demanding decisions requires more careful consideration of measurement error. That error should be calculated within a fixed linear regression model rather than…

Descriptors: Measurement, Computation, Response to Intervention, Regression (Statistics)

Errors of Measurement, Theory, and Public Policy. William H. Angoff Memorial Lecture Series

Kane, Michael – Educational Testing Service, 2010

The 12th annual William H. Angoff Memorial Lecture was presented by Dr. Michael T. Kane, ETS's (Educational Testing Service) Samuel J. Messick Chair in Test Validity and the former Director of Research at the National Conference of Bar Examiners. Dr. Kane argues that it is important for policymakers to recognize the impact of errors of measurement…

Descriptors: Error of Measurement, Scores, Public Policy, Test Theory

Incomplete Psychometric Equivalence of Scores Obtained on the Manual and the Computer Version of the Wisconsin Card Sorting Test?

Peer reviewed

Steinmetz, Jean-Paul; Brunner, Martin; Loarer, Even; Houssemand, Claude – Psychological Assessment, 2010

The Wisconsin Card Sorting Test (WCST) assesses executive and frontal lobe function and can be administered manually or by computer. Despite the widespread application of the 2 versions, the psychometric equivalence of their scores has rarely been evaluated and only a limited set of criteria has been considered. The present experimental study (N =…

Descriptors: Computer Assisted Testing, Psychometrics, Test Theory, Scores
