Cho, Sun-Joo; Preacher, Kristopher J. – Educational and Psychological Measurement, 2016
Multilevel modeling (MLM) is frequently used to detect cluster-level group differences in cluster randomized trial and observational studies. Group differences on the outcomes (posttest scores) are detected by controlling for the covariate (pretest scores) as a proxy variable for unobserved factors that predict future attributes. The pretest and…
Descriptors: Error of Measurement, Error Correction, Multivariate Analysis, Hierarchical Linear Modeling
Culpepper, Steven Andrew – Applied Psychological Measurement, 2013
A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…
Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement
Sinharay, Sandip – Journal of Educational Measurement, 2010
Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman suggested a method based on classical test theory to determine whether subscores have added value over total scores. In this article I first provide a rich collection of results regarding when subscores were found to have added…
Descriptors: Scores, Test Theory, Simulation, Reliability
Bailey, Janelle M.; Johnson, Bruce; Prather, Edward E.; Slater, Timothy F. – International Journal of Science Education, 2012
Concept inventories (CIs)--typically multiple-choice instruments that focus on a single or small subset of closely related topics--have been used in science education for more than a decade. This paper describes the development and validation of a new CI for astronomy, the "Star Properties Concept Inventory" (SPCI). Questions cover the areas of…
Descriptors: Educational Strategies, Validity, Testing, Astronomy
Kelcey, Ben; McGinn, Daniel; Hill, Heather – Society for Research on Educational Effectiveness, 2013
Recent policy has charged schools and districts with maintaining highly qualified teachers and differentiating among teachers in terms of their effectiveness (U.S. Department of Education, 2009). This emphasis has driven the development and implementation of teacher quality measures which are increasingly being used to evaluate teachers with…
Descriptors: Teacher Effectiveness, Measures (Individuals), Observation, Teacher Evaluation
Foorman, Barbara R.; Petscher, Yaacov; Schatschneider, Chris – Florida Center for Reading Research, 2015
The grades K-2 Florida Center for Reading Research (FCRR) Reading Assessment (FRA) consists of computer-adaptive alphabetic and oral language screening tasks that provide a Probability of Literacy Success (PLS) linked to grade-level performance (i.e., the 40th percentile) on the word reading (in kindergarten) or reading comprehension (in grades…
Descriptors: Reading Instruction, Reading Tests, Kindergarten, Grade 1
Sinharay, Sandip – Educational Testing Service, 2010
Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman (2008) suggested a method based on classical test theory to determine whether subscores have added value over total scores. This paper provides a literature review and reports when subscores were found to have added value for…
Descriptors: Scores, Correlation, Reliability, Item Response Theory
Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Multivariate Behavioral Research, 2010
Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…
Descriptors: Educational Testing, Scores, Reports, Psychometrics
Bretz, Stacey Lowery; Linenberger, Kimberly J. – Biochemistry and Molecular Biology Education, 2012
Enzyme function is central to student understanding of multiple topics within the biochemistry curriculum. In particular, students must understand how enzymes and substrates interact with one another. This manuscript describes the development of a 15-item Enzyme-Substrate Interactions Concept Inventory (ESICI) that measures student understanding…
Descriptors: Biochemistry, Science Education, Science Instruction, Scientific Concepts
Bandalos, Deborah L.; Kopp, Jason P. – Educational Measurement: Issues and Practice, 2012
In this article, we discuss the importance of measurement literacy and some issues encountered in teaching introductory measurement courses. We present results from a survey of introductory measurement instructors, including information about the topics included in such courses and the amount of time spent on each. Topics that were included by the…
Descriptors: Class Activities, Motivation Techniques, Item Analysis, Test Theory
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Will subscores provide more information than the total score does? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Wallace, Colin S.; Prather, Edward E.; Duncan, Douglas K. – Astronomy Education Review, 2011
This is the second of five papers detailing our national study of general education astronomy students' conceptual and reasoning difficulties with cosmology. This article begins our quantitative investigation of the data. We describe how we scored students' responses to four conceptual cosmology surveys, and we present evidence for the inter-rater…
Descriptors: Astronomy, Scientific Concepts, College Students, Introductory Courses
Parker, Richard I.; Vannest, Kimberly J.; Davis, John L.; Clemens, Nathan H. – Journal of Special Education, 2012
Within a response to intervention model, educators increasingly use progress monitoring (PM) to support medium- to high-stakes decisions for individual students. For PM to serve these more demanding decisions requires more careful consideration of measurement error. That error should be calculated within a fixed linear regression model rather than…
Descriptors: Measurement, Computation, Response to Intervention, Regression (Statistics)
Kane, Michael – Educational Testing Service, 2010
The 12th annual William H. Angoff Memorial Lecture was presented by Dr. Michael T. Kane, ETS's (Educational Testing Service) Samuel J. Messick Chair in Test Validity and the former Director of Research at the National Conference of Bar Examiners. Dr. Kane argues that it is important for policymakers to recognize the impact of errors of measurement…
Descriptors: Error of Measurement, Scores, Public Policy, Test Theory
Steinmetz, Jean-Paul; Brunner, Martin; Loarer, Even; Houssemand, Claude – Psychological Assessment, 2010
The Wisconsin Card Sorting Test (WCST) assesses executive and frontal lobe function and can be administered manually or by computer. Despite the widespread application of the 2 versions, the psychometric equivalence of their scores has rarely been evaluated and only a limited set of criteria has been considered. The present experimental study (N =…
Descriptors: Computer Assisted Testing, Psychometrics, Test Theory, Scores