Peer reviewed
Lai, Kevin; Cabrera, Julio; Vitale, Jonathan M.; Madhok, Jacquie; Tinker, Robert; Linn, Marcia C. – Journal of Science Education and Technology, 2016
Interpreting and creating graphs plays a critical role in scientific practice. The K-12 Next Generation Science Standards call for students to use graphs for scientific modeling, reasoning, and communication. To measure progress on this dimension, we need valid and reliable measures of graph understanding in science. In this research, we designed…
Descriptors: Middle School Students, Secondary School Science, Science Instruction, Graphs
Peer reviewed
Maydeu-Olivares, Alberto – Measurement: Interdisciplinary Research and Perspectives, 2013
In this rejoinder, Maydeu-Olivares states that, in item response theory (IRT) measurement applications, the application of goodness-of-fit (GOF) methods informs researchers of the discrepancy between the model and the data being fitted (the room for improvement). By routinely reporting the GOF of IRT models, together with the substantive results…
Descriptors: Goodness of Fit, Models, Evaluation Methods, Item Response Theory
Peer reviewed
Lambert, Matthew C.; Hurley, Kristin Duppong; Tomlinson, M. Michele Athay; Stevens, Amy L. – Child & Youth Care Forum, 2013
Background: A client's motivation to receive services is significantly related to seeking services, remaining in services, and improved outcomes. The Motivation for Youth Treatment Scale (MYTS) is one of the few brief measures used to assess motivation for mental health treatment. Objective: To investigate if the psychometric properties of the…
Descriptors: Motivation, Mental Health, Health Services, Access to Health Care
Peer reviewed
Bramley, Tom; Dhawan, Vikas – Research Papers in Education, 2013
This paper discusses the issues involved in calculating indices of composite reliability for "modular" or "unitised" assessments of the kind used in GCSEs, AS and A level examinations in England. The increasingly widespread use of on-screen marking has meant that the item-level data required for calculating indices of…
Descriptors: Foreign Countries, Exit Examinations, Secondary Education, Test Reliability
Rice, Stephen; Geels, Kasha; Trafimow, David; Hackett, Holly – Online Submission, 2011
Test scores are used to assess one's general knowledge of a specific area. Although strategies to improve test performance have been previously identified, the consistency with which one uses these strategies has not been analyzed in such a way that allows assessment of how much consistency affects overall performance. Participants completed one…
Descriptors: Performance, Test Theory, Reliability, Knowledge Level
Peer reviewed
Grigg, Kaine; Manderson, Lenore – Australian Educational and Developmental Psychologist, 2015
Existing Australian measures of racist attitudes focus on single groups or have not been validated across the lifespan. To redress this, the present research aimed to develop and validate a measure of racial, ethnic, cultural and religious acceptance--the Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES)--for use with…
Descriptors: Racial Bias, Racial Attitudes, Foreign Countries, Ethnocentrism
Gunn, Cathy; Steel, Caroline – Research in Learning Technology, 2012
We present a case to reposition theory so that it plays a pivotal role in learning technology research and helps to build an ecology of learning. To support the case, we present a critique of current practice based on a review of articles published in two leading international journals from 2005 to 2010. Our study reveals that theory features only…
Descriptors: Outcomes of Education, Educational Change, Educational Technology, Theory Practice Relationship
Peer reviewed
Abedalaziz, Nabeel; Leng, Chin Hai – Malaysian Online Journal of Educational Sciences, 2013
Most of the tests and inventories used by counseling psychologists have been developed using CTT; IRT derives from what is called latent trait theory. A number of important differences exist between CTT- versus IRT-based approaches to both test development and evaluation, as well as the process of scoring the response profiles of individual…
Descriptors: Test Theory, Item Response Theory, Difficulty Level, Models
Peer reviewed
Williamson, Kathryn E.; Willoughby, Shannon; Prather, Edward E. – Astronomy Education Review, 2013
We introduce the Newtonian Gravity Concept Inventory (NGCI), a 26-item multiple-choice instrument to assess introductory general education college astronomy ("Astro 101") student understanding of Newtonian gravity. This paper describes the development of the NGCI through four phases: Planning, Construction, Quantitative Analysis, and…
Descriptors: Science Instruction, Scientific Concepts, Astronomy, College Science
Peer reviewed
Beauducel, Andre – Applied Psychological Measurement, 2013
The problem of factor score indeterminacy implies that the factor and the error scores cannot be completely disentangled in the factor model. It is therefore proposed to compute Harman's factor score predictor that contains an additive combination of factor and error variance. This additive combination is discussed in the framework of classical…
Descriptors: Factor Analysis, Predictor Variables, Reliability, Error of Measurement
Engelhard, George, Jr.; Wind, Stefanie A. – College Board, 2013
The major purpose of this study is to examine the quality of ratings assigned to CR (constructed-response) questions in large-scale assessments from the perspective of Rasch Measurement Theory. Rasch Measurement Theory provides a framework for the examination of rating scale category structure that can yield useful information for interpreting the…
Descriptors: Measurement Techniques, Rating Scales, Test Theory, Scores
Peer reviewed
Herman, Geoffrey L.; Zilles, Craig; Loui, Michael C. – Computer Science Education, 2014
Concept inventories hold tremendous promise for promoting the rigorous evaluation of teaching methods that might remedy common student misconceptions and promote deep learning. The measurements from concept inventories can be trusted only if the concept inventories are evaluated both by expert feedback and statistical scrutiny (psychometric…
Descriptors: Psychometrics, Concept Formation, Measures (Individuals), Teaching Methods
Peer reviewed
Herbst, Patricio; Dimmel, Justin; Erickson, Ander; Ko, Inah; Kosko, Karl W. – North American Chapter of the International Group for the Psychology of Mathematics Education, 2014
We describe the conceptualization, development, and piloting of two instruments--a survey and a scenario-based assessment--designed to assess teachers' recognition of an obligation to the discipline of mathematics and the extent to which teachers justify actions that deviate from what is normative on account of this obligation. We show how we…
Descriptors: Mathematics Teachers, Test Construction, Test Theory, Test Items
Peer reviewed
Murphy, Daniel L.; Beretvas, S. Natasha – Applied Measurement in Education, 2015
This study examines the use of cross-classified random effects models (CCrem) and cross-classified multiple membership random effects models (CCMMrem) to model rater bias and estimate teacher effectiveness. Effect estimates are compared using CTT versus item response theory (IRT) scaling methods and three models (i.e., conventional multilevel…
Descriptors: Teacher Effectiveness, Comparative Analysis, Hierarchical Linear Modeling, Test Theory
Peer reviewed
Sinharay, Sandip – Journal of Educational Measurement, 2010
Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman suggested a method based on classical test theory to determine whether subscores have added value over total scores. In this article I first provide a rich collection of results regarding when subscores were found to have added…
Descriptors: Scores, Test Theory, Simulation, Reliability