Showing 1 to 15 of 38 results
Peer reviewed
Oliver Lüdtke; Alexander Robitzsch – Journal of Experimental Education, 2025
There is a longstanding debate on whether the analysis of covariance (ANCOVA) or the change score approach is more appropriate when analyzing non-experimental longitudinal data. In this article, we use a structural modeling perspective to clarify that the ANCOVA approach is based on the assumption that all relevant covariates are measured (i.e.,…
Descriptors: Statistical Analysis, Longitudinal Studies, Error of Measurement, Hierarchical Linear Modeling
Domingue, Benjamin W.; Trejo, Sam; Armstrong-Carter, Emma; Tucker-Drob, Elliot M. – Grantee Submission, 2020
Interest in the study of gene-environment interaction has recently grown due to the sudden availability of molecular genetic data--in particular, polygenic scores--in many long-running longitudinal studies. Identifying and estimating statistical interactions comes with several analytic and inferential challenges; these challenges are heightened…
Descriptors: Genetics, Environmental Influences, Scores, Interaction
Peer reviewed
Nicewander, W. Alan – Educational and Psychological Measurement, 2019
This inquiry is focused on three indicators of the precision of measurement, conditional on fixed values of θ, the latent variable of item response theory (IRT). The indicators that are compared are (1) the traditional conditional standard errors, s(e_X|θ) = CSEM; (2) the IRT-based conditional standard errors, s_irt(e_X|θ) = C…
Descriptors: Measurement, Accuracy, Scores, Error of Measurement
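The IRT-based conditional standard error the abstract compares has a simple closed form under the Rasch model: at fixed θ, the number-correct score has variance Σ P_i(θ)(1 − P_i(θ)). A minimal sketch, with made-up item difficulties:

```python
import math

def csem_number_correct(theta, difficulties):
    """IRT-based conditional SEM of the number-correct score at fixed theta.

    Under the Rasch model, P(correct) = 1 / (1 + exp(-(theta - b))) for an
    item of difficulty b, and Var(X | theta) is the sum of P * (1 - P).
    """
    ps = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
    return math.sqrt(sum(p * (1.0 - p) for p in ps))
```

For four items of difficulty 0, `csem_number_correct(0.0, [0.0] * 4)` is exactly 1.0 (each item contributes variance 0.25), and the CSEM shrinks as θ moves away from the item difficulties.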
Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019
The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) is constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…
Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation
Peer reviewed
PDF on ERIC
Regional Educational Laboratory Mid-Atlantic, 2023
This Snapshot highlights key findings from a study that used Bayesian stabilization to improve the reliability (long-term stability) of subgroup proficiency measures that the Pennsylvania Department of Education (PDE) uses to identify schools for Targeted Support and Improvement (TSI) or Additional Targeted Support and Improvement (ATSI). The…
Descriptors: At Risk Students, Low Achievement, Error of Measurement, Measurement Techniques
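Bayesian stabilization of the kind the study describes amounts to shrinking each subgroup's proficiency rate toward a reference rate, with a weight that grows with subgroup size. A bare-bones sketch; the function name and `prior_strength` parameter are hypothetical and not PDE's actual procedure:

```python
def stabilize(observed_rate, reference_rate, n, prior_strength):
    """Shrink a subgroup proficiency rate toward a reference rate.

    The weight on the observed rate grows with subgroup size n, while
    prior_strength acts as the prior's effective sample size, so small,
    noisy subgroups are pulled strongly toward the reference rate.
    """
    w = n / (n + prior_strength)
    return w * observed_rate + (1.0 - w) * reference_rate
```

With `prior_strength` 40, a subgroup of 10 students observed at 0.20 against a reference of 0.50 is stabilized to 0.44, while a subgroup of 1,000 barely moves; damping small-sample noise this way is what makes year-to-year subgroup measures more stable for TSI/ATSI identification.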
Peer reviewed
Perry, Thomas – Research Papers in Education, 2019
A compositional effect is when pupil attainment is associated with the characteristics of their peers, over and above their own individual characteristics. Pupils at academically selective schools, for example, tend to out-perform similar-ability pupils who are educated with mixed-ability peers. Previous methodological studies however have shown…
Descriptors: Value Added Models, Correlation, Individual Characteristics, Peer Influence
Peer reviewed
Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
Peer reviewed
Lane, David; Oswald, Frederick L. – Educational Measurement: Issues and Practice, 2016
The educational literature, the popular press, and educated laypeople have all echoed a conclusion from the book "Academically Adrift" by Richard Arum and Josipa Roksa (which has now become received wisdom), namely, that 45% of college students showed no significant gains in critical thinking skills. Similar results were reported by…
Descriptors: College Students, Critical Thinking, Thinking Skills, Statistical Analysis
Peer reviewed
Balkin, Richard S. – Measurement and Evaluation in Counseling and Development, 2017
This article presents an overview of standards for demonstrating evidence of relationships with criteria as they pertain to instrument development, along with heuristic examples. Additional measures and a comprehensive design are necessary to establish evidence related to the use and interpretation of test scores for the validation of a…
Descriptors: Evidence, Academic Standards, Test Construction, Evaluation Criteria
Peer reviewed
Chiu, Ting-Wei; Camilli, Gregory – Applied Psychological Measurement, 2013
Guessing behavior is an issue discussed widely with regard to multiple choice tests. Its primary effect is on number-correct scores for examinees at lower levels of proficiency. This is a systematic error or bias, which increases observed test scores. Guessing also can inflate random error variance. Correction or adjustment for guessing formulas…
Descriptors: Item Response Theory, Guessing (Tests), Multiple Choice Tests, Error of Measurement
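The classical correction-for-guessing formula the abstract alludes to subtracts an estimate of lucky guesses from the number-right score: S = R − W/(k − 1) for k-option items. A minimal sketch (the function name is ours):

```python
def formula_score(num_right, num_wrong, num_options):
    """Classical correction for guessing on multiple-choice items.

    Assumes each wrong answer reflects a blind guess among num_options
    choices, so every (num_options - 1) wrong answers imply one lucky
    guess to subtract; omitted items are not penalized.
    """
    return num_right - num_wrong / (num_options - 1)
```

An examinee with 30 right and 10 wrong on five-option items scores 30 − 10/4 = 27.5. Note the abstract's caveat: such corrections address the systematic upward bias in expectation, not the extra random error variance that guessing adds.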
Peer reviewed
Culpepper, Steven Andrew – Applied Psychological Measurement, 2013
A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…
Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement
Powers, Sonya; Li, Dongmei; Suh, Hongwook; Harris, Deborah J. – ACT, Inc., 2016
ACT reporting categories and ACT Readiness Ranges are new features added to the ACT score reports starting in fall 2016. For each reporting category, the number correct score, the maximum points possible, the percent correct, and the ACT Readiness Range, along with an indicator of whether the reporting category score falls within the Readiness…
Descriptors: Scores, Classification, College Entrance Examinations, Error of Measurement
Peer reviewed
Kane, Michael – Journal of Educational Measurement, 2011
Errors don't exist in our data, but they serve a vital function. Reality is complicated, but our models need to be simple in order to be manageable. We assume that attributes are invariant over some conditions of observation, and once we do that we need some way of accounting for the variability in observed scores over these conditions of…
Descriptors: Error of Measurement, Scores, Test Interpretation, Testing
Haberman, Shelby J.; Dorans, Neil J. – Educational Testing Service, 2011
For testing programs that administer multiple forms within a year and across years, score equating is used to ensure that scores can be used interchangeably. In an ideal world, sample sizes are large and representative of populations that hardly change over time, and very reliable alternate test forms are built with nearly identical psychometric…
Descriptors: Scores, Reliability, Equated Scores, Test Construction
Peer reviewed
Solano-Flores, Guillermo; Li, Min – Educational Research and Evaluation, 2013
We discuss generalizability (G) theory and the fair and valid assessment of linguistic minorities, especially emergent bilinguals. G theory allows examination of the relationship between score variation and language variation (e.g., variation of proficiency across languages, language modes, and social contexts). Studies examining score variation…
Descriptors: Measurement, Testing, Language Proficiency, Test Construction