Showing 196 to 210 of 1,161 results
Peer reviewed
Zenisky, April L.; Crotts, Katrina M. – International Journal of Testing, 2010
The "International Journal of Testing" (IJT) is the journal of the International Test Commission. It is intended to support the dissemination of scholarly research on tests and test use worldwide. The purpose of this article is to reflect on what has been published in IJT over its nine volumes to date, with a focus on the extent to which…
Descriptors: Test Use, Testing, Evaluation, Tests
Peer reviewed
Cress, Cynthia J.; Lambert, Matthew C.; Epstein, Michael H. – Journal of Early Intervention, 2014
The Preschool Behavioral and Emotional Rating Scale (PreBERS) is an assessment of emotional and behavioral strengths in preschoolers with well-established reliability and validity for educational and clinical application in children with and without disabilities. The present study provides further evidence of psychometric rigor for items and…
Descriptors: Preschool Children, Rating Scales, Child Behavior, Behavior Problems
Foorman, Barbara R.; Petscher, Yaacov; Schatschneider, Chris – Florida Center for Reading Research, 2015
The grades K-2 Florida Center for Reading Research (FCRR) Reading Assessment (FRA) consists of computer-adaptive alphabetic and oral language screening tasks that provide a Probability of Literacy Success (PLS) linked to grade-level performance (i.e., the 40th percentile) on the word reading (in kindergarten) or reading comprehension (in grades…
Descriptors: Reading Instruction, Reading Tests, Kindergarten, Grade 1
Peer reviewed
Aslanides, J. S.; Savage, C. M. – Physical Review Special Topics - Physics Education Research, 2013
We report on a concept inventory for special relativity: the development process, data analysis methods, and results from an introductory relativity class. The Relativity Concept Inventory tests understanding of relativistic concepts. An unusual feature is confidence testing for each question. This can provide additional information; for example,…
Descriptors: Physics, Science Tests, Scientific Concepts, Confidence Testing
Williamson, Kathryn Elizabeth – ProQuest LLC, 2013
The topic of Newtonian gravity offers a unique vantage point from which to investigate and encourage conceptual change because it is something with which everyone has daily experience, and because it is taught in two courses that reach a wide variety of students--introductory-level college astronomy ("Astro 101") and physics ("Phys…
Descriptors: Scientific Concepts, Science Tests, College Science, Astronomy
Hixson, Nate; Rhudy, Vaughn – West Virginia Department of Education, 2013
Student responses to the West Virginia Educational Standards Test (WESTEST) 2 Online Writing Assessment are scored by a computer-scoring engine. The scoring method is not widely understood among educators, and there exists a misperception that it is not comparable to hand scoring. To address these issues, the West Virginia Department of Education…
Descriptors: Scoring Formulas, Scoring Rubrics, Interrater Reliability, Test Scoring Machines
Sinharay, Sandip – Educational Testing Service, 2010
Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman (2008) suggested a method based on classical test theory to determine whether subscores have added value over total scores. This paper provides a literature review and reports when subscores were found to have added value for…
Descriptors: Scores, Correlation, Reliability, Item Response Theory
Peer reviewed
Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Multivariate Behavioral Research, 2010
Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…
Descriptors: Educational Testing, Scores, Reports, Psychometrics
Li, Tiandong – ProQuest LLC, 2012
In large-scale assessments, such as the National Assessment of Educational Progress (NAEP), plausible values based on Multiple Imputations (MI) have been used to estimate population characteristics for latent constructs under complex sample designs. Mislevy (1991) derived a closed-form analytic solution for a fixed-effect model in creating…
Descriptors: National Competency Tests, Statistical Analysis, Educational Assessment, Test Theory
Peer reviewed
He, Qingping; Hayes, Malcolm; Wiliam, Dylan – Research Papers in Education, 2013
The accuracy of the results of the national tests in English, mathematics and science taken by 11-year-olds in England has been a matter of much debate since their introduction in 1994, with estimates of the proportion of students incorrectly classified varying from 10 to 30%. Using live data from the 2009 and 2010 administration of the national…
Descriptors: Foreign Countries, National Curriculum, Accuracy, Classification
Peer reviewed
Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie – Measurement and Evaluation in Counseling and Development, 2013
Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…
Descriptors: Item Response Theory, Test Theory, Measures (Individuals), Racial Identification
Peer reviewed
Revelle, William; Zinbarg, Richard E. – Psychometrika, 2009
There are three fundamental problems in Sijtsma ("Psychometrika," 2008): (1) contrary to the name, the glb is not the greatest lower bound of reliability but rather is systematically less than omega[subscript t] (McDonald, "Test theory: A unified treatment," Erlbaum, Hillsdale, 1999), (2) we agree with Sijtsma that when considering how well a test…
Descriptors: Test Theory, Computer Software, Reliability
Peer reviewed
Xu, Ting; Stone, Clement A. – Educational and Psychological Measurement, 2012
It has been argued that item response theory trait estimates should be used in analyses rather than number right (NR) or summated scale (SS) scores. Thissen and Orlando postulated that IRT scaling tends to produce trait estimates that are linearly related to the underlying trait being measured. Therefore, IRT trait estimates can be more useful…
Descriptors: Educational Research, Monte Carlo Methods, Measures (Individuals), Item Response Theory
Peer reviewed
Calmettes, Guillaume; Drummond, Gordon B.; Vowler, Sarah L. – Advances in Physiology Education, 2012
A jack knife is a pocket knife that is put to many tasks, because it's ready to hand. Often there could be a better tool for the job, such as a screwdriver, a scraper, or a can-opener, but these are not usually pocket items. In statistical terms, the expression implies making do with what's available. Another simile, of an extreme situation, is…
Descriptors: Statistical Analysis, Computation, Population Distribution, Evaluation Methods
Peer reviewed
Schlingman, Wayne M.; Prather, Edward E.; Wallace, Colin S.; Brissenden, Gina; Rudolph, Alexander L. – Astronomy Education Review, 2012
This paper is the first in a series of investigations into the data from the recent national study using the Light and Spectroscopy Concept Inventory (LSCI). In this paper, we use classical test theory to form a framework of results that will be used to evaluate individual item difficulties, item discriminations, and the overall reliability of the…
Descriptors: Item Response Theory, Spectroscopy, Investigations, Light