Showing all 7 results
Peer reviewed
Culpepper, Steven Andrew – Applied Psychological Measurement, 2013
A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…
Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement
Peer reviewed
Bailey, Janelle M.; Johnson, Bruce; Prather, Edward E.; Slater, Timothy F. – International Journal of Science Education, 2012
Concept inventories (CIs)--typically multiple-choice instruments that focus on a single or small subset of closely related topics--have been used in science education for more than a decade. This paper describes the development and validation of a new CI for astronomy, the "Star Properties Concept Inventory" (SPCI). Questions cover the areas of…
Descriptors: Educational Strategies, Validity, Testing, Astronomy
Peer reviewed
Bretz, Stacey Lowery; Linenberger, Kimberly J. – Biochemistry and Molecular Biology Education, 2012
Enzyme function is central to student understanding of multiple topics within the biochemistry curriculum. In particular, students must understand how enzymes and substrates interact with one another. This manuscript describes the development of a 15-item Enzyme-Substrate Interactions Concept Inventory (ESICI) that measures student understanding…
Descriptors: Biochemistry, Science Education, Science Instruction, Scientific Concepts
Peer reviewed
Almehrizi, Rashid S. – Applied Psychological Measurement, 2013
The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…
Descriptors: Raw Scores, Scaling, Reliability, Computation
Peer reviewed
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Will subscores provide information beyond what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Deng, Nina – ProQuest LLC, 2011
Three decision consistency and accuracy (DC/DA) methods, the Livingston and Lewis (LL) method, LEE method, and the Hambleton and Han (HH) method, were evaluated. The purposes of the study were: (1) to evaluate the accuracy and robustness of these methods, especially when their assumptions were not well satisfied, (2) to investigate the "true"…
Descriptors: Item Response Theory, Test Theory, Computation, Classification
Haladyna, Tom; Roid, Gale – 1980
The problems associated with misclassifying students when pass-fail decisions are based on test scores are discussed. One protection against misclassification is to set a confidence interval around the cutting score. Those whose scores fall above the interval are passed; those whose scores fall below the interval are failed; and those whose scores…
Descriptors: Bayesian Statistics, Classification, Comparative Analysis, Criterion Referenced Tests