Showing 166 to 180 of 1,166 results
Peer reviewed
Sandbank, Micheal; Yoder, Paul – Topics in Early Childhood Special Education, 2014
Generalizability and decision studies provide a mathematical framework for quantifying the stability of a given number of measurements. This approach is especially relevant to the task of obtaining a representative measure of communicative behavior in young children and supports an alternative to the debate regarding which type of assessment…
Descriptors: Developmental Delays, Toddlers, Intervention, Vocabulary Development
Badjadi, Nour El Imane – Online Submission, 2013
The current paper on writing assessment surveys the literature on the reliability and validity of essay tests. The paper aims to examine the two concepts in relationship with essay testing as well as to provide a snapshot of the current understandings of the reliability and validity of essay tests as drawn in recent research studies. Bearing in…
Descriptors: Essay Tests, Writing Evaluation, Test Validity, Test Reliability
Peer reviewed
Zimmerman, Donald W. – Journal of Educational and Behavioral Statistics, 2011
Many well-known equations in classical test theory are mathematical identities in populations of individuals but not in random samples from those populations. First, test scores are subject to the same sampling error that is familiar in statistical estimation and hypothesis testing. Second, the assumptions made in derivation of formulas in test…
Descriptors: Test Theory, Equations (Mathematics), Scores, Sampling
Peer reviewed
Brousselle, Astrid; Champagne, Francois – Evaluation and Program Planning, 2011
Program theory evaluation, which has grown in use over the past 10 years, assesses whether a program is designed in such a way that it can achieve its intended outcomes. This article describes a particular type of program theory evaluation--logic analysis--that allows us to test the plausibility of a program's theory using scientific knowledge.…
Descriptors: Evaluators, Program Evaluation, Logical Thinking, Validity
Peer reviewed
Lai, Kevin; Cabrera, Julio; Vitale, Jonathan M.; Madhok, Jacquie; Tinker, Robert; Linn, Marcia C. – Journal of Science Education and Technology, 2016
Interpreting and creating graphs plays a critical role in scientific practice. The K-12 Next Generation Science Standards call for students to use graphs for scientific modeling, reasoning, and communication. To measure progress on this dimension, we need valid and reliable measures of graph understanding in science. In this research, we designed…
Descriptors: Middle School Students, Secondary School Science, Science Instruction, Graphs
Peer reviewed
van Ravenzwaaij, Don; van der Maas, Han L. J.; Wagenmakers, Eric-Jan – Psychological Review, 2012
In their influential "Psychological Review" article, Bogacz, Brown, Moehlis, Holmes, and Cohen (2006) discussed optimal decision making as accomplished by the drift diffusion model (DDM). The authors showed that neural inhibition models, such as the leaky competing accumulator model (LCA) and the feedforward inhibition model (FFI), can mimic the…
Descriptors: Intelligent Tutoring Systems, Inhibition, Bayesian Statistics, Decision Making
Peer reviewed
Maydeu-Olivares, Alberto – Measurement: Interdisciplinary Research and Perspectives, 2013
In this rejoinder, Maydeu-Olivares states that, in item response theory (IRT) measurement applications, the application of goodness-of-fit (GOF) methods informs researchers of the discrepancy between the model and the data being fitted (the room for improvement). By routinely reporting the GOF of IRT models, together with the substantive results…
Descriptors: Goodness of Fit, Models, Evaluation Methods, Item Response Theory
Peer reviewed
Lambert, Matthew C.; Hurley, Kristin Duppong; Tomlinson, M. Michele Athay; Stevens, Amy L. – Child & Youth Care Forum, 2013
Background: A client's motivation to receive services is significantly related to seeking services, remaining in services, and improved outcomes. The Motivation for Youth Treatment Scale (MYTS) is one of the few brief measures used to assess motivation for mental health treatment. Objective: To investigate if the psychometric properties of the…
Descriptors: Motivation, Mental Health, Health Services, Access to Health Care
Peer reviewed
Bramley, Tom; Dhawan, Vikas – Research Papers in Education, 2013
This paper discusses the issues involved in calculating indices of composite reliability for "modular" or "unitised" assessments of the kind used in GCSEs, AS and A level examinations in England. The increasingly widespread use of on-screen marking has meant that the item-level data required for calculating indices of…
Descriptors: Foreign Countries, Exit Examinations, Secondary Education, Test Reliability
Peer reviewed
Grigg, Kaine; Manderson, Lenore – Australian Educational and Developmental Psychologist, 2015
Existing Australian measures of racist attitudes focus on single groups or have not been validated across the lifespan. To redress this, the present research aimed to develop and validate a measure of racial, ethnic, cultural and religious acceptance--the Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES)--for use with…
Descriptors: Racial Bias, Racial Attitudes, Foreign Countries, Ethnocentrism
Rice, Stephen; Geels, Kasha; Trafimow, David; Hackett, Holly – Online Submission, 2011
Test scores are used to assess one's general knowledge of a specific area. Although strategies to improve test performance have been previously identified, the consistency with which one uses these strategies has not been analyzed in such a way that allows assessment of how much consistency affects overall performance. Participants completed one…
Descriptors: Performance, Test Theory, Reliability, Knowledge Level
Peer reviewed
Abedalaziz, Nabeel; Leng, Chin Hai – Malaysian Online Journal of Educational Sciences, 2013
Most of the tests and inventories used by counseling psychologists have been developed using CTT; IRT derives from what is called latent trait theory. A number of important differences exist between CTT- versus IRT-based approaches to both test development and evaluation, as well as the process of scoring the response profiles of individual…
Descriptors: Test Theory, Item Response Theory, Difficulty Level, Models
Peer reviewed
Williamson, Kathryn E.; Willoughby, Shannon; Prather, Edward E. – Astronomy Education Review, 2013
We introduce the Newtonian Gravity Concept Inventory (NGCI), a 26-item multiple-choice instrument to assess introductory general education college astronomy ("Astro 101") student understanding of Newtonian gravity. This paper describes the development of the NGCI through four phases: Planning, Construction, Quantitative Analysis, and…
Descriptors: Science Instruction, Scientific Concepts, Astronomy, College Science
Peer reviewed
Beauducel, Andre – Applied Psychological Measurement, 2013
The problem of factor score indeterminacy implies that the factor and the error scores cannot be completely disentangled in the factor model. It is therefore proposed to compute Harman's factor score predictor that contains an additive combination of factor and error variance. This additive combination is discussed in the framework of classical…
Descriptors: Factor Analysis, Predictor Variables, Reliability, Error of Measurement
Engelhard, George, Jr.; Wind, Stefanie A. – College Board, 2013
The major purpose of this study is to examine the quality of ratings assigned to CR (constructed-response) questions in large-scale assessments from the perspective of Rasch Measurement Theory. Rasch Measurement Theory provides a framework for the examination of rating scale category structure that can yield useful information for interpreting the…
Descriptors: Measurement Techniques, Rating Scales, Test Theory, Scores