NotesFAQContact Us
Collection
Advanced
Search Tips
Publication Date
In 20250
Since 20240
Since 2021 (last 5 years)0
Since 2016 (last 10 years)2
Since 2006 (last 20 years)21
Laws, Policies, & Programs
Elementary and Secondary…1
What Works Clearinghouse Rating
Showing 1 to 15 of 42 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Alqarni, Abdulelah Mohammed – Journal on Educational Psychology, 2019
This study compares the psychometric properties of reliability in Classical Test Theory (CTT), item information in Item Response Theory (IRT), and validation from the perspective of modern validity theory for the purpose of bringing attention to potential issues that might exist when testing organizations use both test theories in the same testing…
Descriptors: Test Theory, Item Response Theory, Test Construction, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Twing, Jon S. – Assessment in Education: Principles, Policy & Practice, 2016
This special issue of "Assessment in Education" contains the type of debate needed about what Cizek (2015) calls a "… lingering flaw in the concept of validity…." Some practitioners might not agree that the current theory of validation is flawed. Specifically, the debate Jon Twing is referencing concerns the role of the…
Descriptors: Test Validity, Misconceptions, Evidence, Scores
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2014
Brennan (Brennan, R. L., 2012) noted that users of test scores often want (indeed, demand) that subscores be reported, along with total test scores, for diagnostic purposes. Haberman (Haberman, S. J., 2008) suggested a method based on classical test theory (CTT) to determine if subscores have added value over the total score. According to this…
Descriptors: Scores, Test Theory, Test Interpretation
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip – Journal of Educational Measurement, 2014
Brennan noted that users of test scores often want (indeed, demand) that subscores be reported, along with total test scores, for diagnostic purposes. Haberman suggested a method based on classical test theory (CTT) to determine if subscores have added value over the total score. One way to interpret the method is that a subscore has added value…
Descriptors: Scores, Test Theory, Classification, Cutting Scores
Woodruff, David; Traynor, Anne; Cui, Zhongmin; Fang, Yu – ACT, Inc., 2013
Professional standards for educational testing recommend that both the overall standard error of measurement and the conditional standard error of measurement (CSEM) be computed on the score scale used to report scores to examinees. Several methods have been developed to compute scale score CSEMs. This paper compares three methods, based on…
Descriptors: Comparative Analysis, Error of Measurement, Scores, Scaling
Foorman, Barbara R.; Petscher, Yaacov; Schatschneider, Chris – Florida Center for Reading Research, 2015
The grades K-2 Florida Center for Reading Research (FCRR) Reading Assessment (FRA) consists of computer-adaptive alphabetic and oral language screening tasks that provide a Probability of Literacy Success (PLS) linked to grade-level performance (i.e., the 40th percentile) on the word reading (in kindergarten) or reading comprehension (in grades…
Descriptors: Reading Instruction, Reading Tests, Kindergarten, Grade 1
Peer reviewed Peer reviewed
Direct linkDirect link
He, Qingping; Hayes, Malcolm; Wiliam, Dylan – Research Papers in Education, 2013
The accuracy of the results of the national tests in English, mathematics and science taken by 11-year olds in England has been a matter of much debate since their introduction in 1994, with estimates of the proportion of students incorrectly classified varying from 10 to 30%. Using live data from the 2009 and 2010 administration of the national…
Descriptors: Foreign Countries, National Curriculum, Accuracy, Classification
Peer reviewed Peer reviewed
Direct linkDirect link
Wallace, Colin S.; Prather, Edward E.; Duncan, Douglas K. – Astronomy Education Review, 2011
This is the first in a series of five articles describing a national study of general education astronomy students' conceptual and reasoning difficulties with cosmology. In this paper, we describe the process by which we designed four new surveys to assess general education astronomy students' conceptual cosmology knowledge. These surveys focused…
Descriptors: General Education, Astronomy, Surveys, Evolution
Peer reviewed Peer reviewed
Direct linkDirect link
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Will subscores provide additional information than what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Peer reviewed Peer reviewed
Direct linkDirect link
Wallace, Colin S.; Prather, Edward E.; Duncan, Douglas K. – Astronomy Education Review, 2011
This is the second of five papers detailing our national study of general education astronomy students' conceptual and reasoning difficulties with cosmology. This article begins our quantitative investigation of the data. We describe how we scored students' responses to four conceptual cosmology surveys, and we present evidence for the inter-rater…
Descriptors: Astronomy, Scientific Concepts, College Students, Introductory Courses
Peer reviewed Peer reviewed
Direct linkDirect link
Bristow, M.; Erkorkmaz, K.; Huissoon, J. P.; Jeon, Soo; Owen, W. S.; Waslander, S. L.; Stubley, G. D. – IEEE Transactions on Education, 2012
Any meaningful initiative to improve the teaching and learning in introductory control systems courses needs a clear test of student conceptual understanding to determine the effectiveness of proposed methods and activities. The authors propose a control systems concept inventory. Development of the inventory was collaborative and iterative. The…
Descriptors: Diagnostic Tests, Concept Formation, Undergraduate Students, Engineering Education
Peer reviewed Peer reviewed
Direct linkDirect link
Wicherts, Jelte M.; Scholten, Annemarie Zand – Intelligence, 2010
The validity of cognitive ability tests is often interpreted solely as a function of the cognitive abilities that these tests are supposed to measure, but other factors may be at play. The effects of test anxiety on the criterion related validity (CRV) of tests was the topic of a recent study by Reeve, Heggestad, and Lievens (2009) (Reeve, C. L.,…
Descriptors: Familiarity, Test Validity, Cognitive Tests, Factor Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Kettler, Ryan J. – Review of Research in Education, 2015
This chapter introduces theory that undergirds the role of testing adaptations in assessment, provides examples of item modifications and testing accommodations, reviews research relevant to each, and introduces a new paradigm that incorporates opportunity to learn (OTL), academic enablers, testing adaptations, and inferences that can be made from…
Descriptors: Meta Analysis, Literature Reviews, Testing, Testing Accommodations
Peer reviewed Peer reviewed
Direct linkDirect link
Hubley, Anita M.; Zumbo, Bruno D. – Social Indicators Research, 2011
The vast majority of measures have, at their core, a purpose of personal and social change. If test developers and users want measures to have personal and social consequences and impact, then it is critical to consider the consequences and side effects of measurement in the validation process itself. The consequential basis of test interpretation…
Descriptors: Construct Validity, Social Change, Measurement, Test Interpretation
Peer reviewed Peer reviewed
Direct linkDirect link
Steinmetz, Jean-Paul; Brunner, Martin; Loarer, Even; Houssemand, Claude – Psychological Assessment, 2010
The Wisconsin Card Sorting Test (WCST) assesses executive and frontal lobe function and can be administered manually or by computer. Despite the widespread application of the 2 versions, the psychometric equivalence of their scores has rarely been evaluated and only a limited set of criteria has been considered. The present experimental study (N =…
Descriptors: Computer Assisted Testing, Psychometrics, Test Theory, Scores
Previous Page | Next Page »
Pages: 1  |  2  |  3