Showing 1,966 to 1,980 of 3,296 results
Peer reviewed
PDF on ERIC Download full text
Moses, Tim; Kim, Sooyeon – ETS Research Report Series, 2007
This study evaluated the impact of unequal reliability on test equating methods in the nonequivalent groups with anchor test (NEAT) design. Classical true score-based models were compared in terms of their assumptions about how reliability impacts test scores. These models were related to treatment of population ability differences by different…
Descriptors: Reliability, Equated Scores, Test Items, Statistical Analysis
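For context on the classical true score framework the abstract invokes, a standard textbook statement (not drawn from the report itself) decomposes an observed score into a true score plus error, with reliability defined as the share of observed-score variance attributable to true scores:

X = T + E, \qquad \rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_X} = 1 - \frac{\sigma^2_E}{\sigma^2_X}

On this view, unequal reliabilities for two forms mean unequal error variances, which is the sense in which reliability differences can distort an equating in the NEAT design.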
Peer reviewed
Direct link
Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Psychological Methods, 2007
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level,…
Descriptors: Psychiatry, Patients, Error of Measurement, Test Length
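A worked illustration of the measurement-error point (standard classical test theory, not taken from the article): the standard error of measurement grows as reliability falls, so the confidence band around an observed score near a cut point widens for short tests.

\mathrm{SEM} = \sigma_X\sqrt{1 - \rho_{XX'}}, \qquad X \pm z_{1-\alpha/2}\,\mathrm{SEM}

For example, with hypothetical values \sigma_X = 5 and \rho_{XX'} = .70, SEM ≈ 2.7, so a 95% band spans roughly ±5.4 score points around the observed score.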
Peer reviewed
Direct link
Jenson, William R.; Clark, Elaine; Kircher, John C.; Kristjansson, Sean D. – Psychology in the Schools, 2007
Evidence-based practice approaches to intervention have come of age and promise to provide a new standard of excellence for school psychologists. This article describes several definitions of evidence-based practice and the problems associated with traditional statistical analyses that rely on rejection of the null hypothesis for the…
Descriptors: School Psychologists, Statistical Analysis, Hypothesis Testing, Intervention
Peer reviewed
Direct link
Barkaoui, Khaled – Canadian Modern Language Review, 2007
Essay tests are widely used to assess ESL/EFL learners' writing abilities for instructional, administrative, and research purposes. Relevant literature was searched to identify 70 empirical studies on ESL/EFL essay tests. The majority of these studies examined task, essay, and rater effects on essay rating and scores. Less attention has been given…
Descriptors: Essay Tests, Language Tests, English (Second Language), Second Language Learning
Peer reviewed
Direct link
George, James D.; Bradshaw, Danielle I.; Hyde, Annette; Vehrs, Pat R.; Hager, Ronald L.; Yanowitz, Frank G. – Measurement in Physical Education and Exercise Science, 2007
The purpose of this study was to develop an age-generalized regression model to predict maximal oxygen uptake (VO₂max) based on a maximal treadmill graded exercise test (GXT; George, 1996). Participants (N = 100), ages 18-65 years, reached a maximal level of exertion (mean ± standard deviation [SD]; maximal heart rate [HR sub…
Descriptors: Metabolism, Body Composition, Multiple Regression Analysis, Error of Measurement
Peer reviewed
Direct link
Liu, Yan; Zumbo, Bruno D. – Educational and Psychological Measurement, 2007
The impact of outliers on Cronbach's coefficient α has not been documented in the psychometric or statistical literature. This is an important gap because coefficient α is the most widely used measurement statistic in all of the social, educational, and health sciences. The impact of outliers on coefficient α is investigated for…
Descriptors: Psychometrics, Computation, Reliability, Monte Carlo Methods
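For reference, the statistic in question is the familiar coefficient α for a k-item scale (the standard formula, not anything specific to this article); outliers enter through the item variances \sigma^2_i and the total-score variance \sigma^2_X:

\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^2_i}{\sigma^2_X}\right)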
Peer reviewed
PDF on ERIC Download full text
Mapuranga, Raymond; Dorans, Neil J.; Middleton, Kyndra – ETS Research Report Series, 2008
In many practical settings, essentially the same differential item functioning (DIF) procedures have been in use since the late 1980s. Since then, examinee populations have become more heterogeneous, and tests have included more polytomously scored items. This paper summarizes and classifies new DIF methods and procedures that have appeared since…
Descriptors: Test Bias, Educational Development, Evaluation Methods, Statistical Analysis
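As background, the workhorse DIF procedure of the late 1980s is the Mantel-Haenszel method; its usual reporting scale at ETS, the delta metric, rescales the estimated common odds ratio \hat{\alpha}_{MH} as follows (standard formulation, offered here only for orientation):

\mathrm{MH\ D\text{-}DIF} = -2.35\,\ln\hat{\alpha}_{MH}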
Peer reviewed
PDF on ERIC Download full text
von Davier, Alina A.; Holland, Paul W.; Livingston, Samuel A.; Casabianca, Jodi; Grant, Mary C.; Martin, Kathleen – ETS Research Report Series, 2006
This study examines how closely the kernel equating (KE) method (von Davier, Holland, & Thayer, 2004a) approximates the results of other observed-score equating methods--equipercentile and linear equatings. The study used pseudotests constructed of item responses from a real test to simulate three equating designs: an equivalent groups (EG)…
Descriptors: Equated Scores, Statistical Analysis, Simulation, Tests
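For orientation, the equipercentile benchmark maps a form-X score x to the form-Y score with the same percentile rank; kernel equating applies the same function after continuizing the discrete score distributions with (typically Gaussian) kernels. A standard statement of the equipercentile function, not quoted from the report:

e_Y(x) = F_Y^{-1}\big(F_X(x)\big)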
Zhang, Yanwei; Breithaupt, Krista; Tessema, Aster; Chuah, David – Online Submission, 2006
Two IRT-based procedures to estimate test reliability for a certification exam that used both adaptive (via an MST model) and non-adaptive designs were considered in this study. Both procedures rely on calibrated item parameters to estimate error variance. In terms of score variance, one procedure (Method 1) uses the empirical ability distribution…
Descriptors: Individual Testing, Test Reliability, Programming, Error of Measurement
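One generic way calibrated IRT item parameters yield a reliability-like index (a textbook marginal-reliability expression, not necessarily either of the paper's two procedures) contrasts ability variance with the average conditional error variance implied by the test information function:

\bar{\rho} = \frac{\sigma^2_\theta}{\sigma^2_\theta + \overline{\sigma^2_E}}, \qquad \sigma^2_E(\theta) = \frac{1}{I(\theta)}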
Peer reviewed
Direct link
Gonzalez-Roma, Vicente; Hernandez, Ana; Gomez-Benito, Juana – Multivariate Behavioral Research, 2006
In this simulation study, we investigate the power and Type I error rate of a procedure based on the mean and covariance structure analysis (MACS) model in detecting differential item functioning (DIF) of graded response items with five response categories. The following factors were manipulated: type of DIF (uniform and non-uniform), DIF…
Descriptors: Multivariate Analysis, Item Response Theory, Test Bias, Sample Size
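In broad strokes (a generic MACS formulation, not the authors' exact specification), the response of person j to item i is modeled in each group g through an item intercept and a factor loading on the latent trait; uniform DIF shows up as a between-group difference in the intercept, nonuniform DIF as a difference in the loading:

x_{ij}^{(g)} = \tau_i^{(g)} + \lambda_i^{(g)}\xi_j + \delta_{ij}, \qquad \text{uniform DIF: } \tau_i^{(1)} \ne \tau_i^{(2)}, \quad \text{nonuniform DIF: } \lambda_i^{(1)} \ne \lambda_i^{(2)}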
Peer reviewed
Direct link
Sass, Daniel A.; Smith, Philip L. – Structural Equation Modeling: A Multidisciplinary Journal, 2006
Structural equation modeling allows several methods of estimating the disattenuated association between 2 or more latent variables (i.e., the measurement model). In one common approach, measurement models are specified using item parcels as indicators of latent constructs. Item parcels versus original items are often used as indicators in these…
Descriptors: Structural Equation Models, Item Analysis, Error of Measurement, Measures (Individuals)
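The "disattenuated association" at issue is, in classical terms, a correlation corrected for unreliability of the measures; the familiar correction formula (standard, not taken from the article) is

\hat{\rho}_{T_X T_Y} = \frac{r_{XY}}{\sqrt{r_{XX'}\,r_{YY'}}}

Latent-variable models accomplish the same correction implicitly by separating error variance from the indicators, and parceling changes what those indicators are (sums or means of item subsets) and hence the estimated error structure.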
Peer reviewed
Direct link
Aguinis, Herman; Pierce, Charles A. – Applied Psychological Measurement, 2006
The computation and reporting of effect size estimates are becoming the norm in many journals in psychology and related disciplines. Despite the increased importance of effect sizes, researchers may not report them or may report inaccurate values because of a lack of appropriate computational tools. For instance, Pierce, Block, and Aguinis (2004)…
Descriptors: Effect Size, Multiple Regression Analysis, Predictor Variables, Error of Measurement
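A common effect size in the moderated multiple regression setting the authors address is Cohen's f² for the increment in explained variance when the interaction (product) term is added; this is the standard definition, not necessarily the exact quantity their computational tool reports:

f^2 = \frac{R^2_{\text{full}} - R^2_{\text{reduced}}}{1 - R^2_{\text{full}}}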
Peer reviewed
Direct link
Meyers, Jason L.; Beretvas, S. Natasha – Multivariate Behavioral Research, 2006
Cross-classified random effects modeling (CCREM) is used to model multilevel data from nonhierarchical contexts. These models are widely discussed but infrequently used in social science research. Because little research exists assessing when it is necessary to use CCREM, 2 studies were conducted. A real data set with a cross-classified structure…
Descriptors: Social Science Research, Computation, Models, Data Analysis
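For readers unfamiliar with the acronym, a minimal two-way cross-classified random effects model (a generic form, not the specific models fitted in these studies) lets the outcome for observation i depend on two non-nested random classifications j and k, such as students cross-classified by neighborhood and school:

Y_{i(j,k)} = \gamma_0 + u_j + v_k + e_{i(j,k)}, \qquad u_j \sim N(0,\sigma^2_u),\ v_k \sim N(0,\sigma^2_v),\ e_{i(j,k)} \sim N(0,\sigma^2_e)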
Kieffer, Kevin M. – 1998
This paper discusses the benefits of using generalizability theory in lieu of classical test theory. Generalizability theory subsumes and extends the precepts of classical test theory by estimating the magnitude of multiple sources of measurement error and their interactions simultaneously in a single analysis. Since classical test theory examines…
Descriptors: Error of Measurement, Generalizability Theory, Heuristics, Interaction
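By way of illustration (a standard single-facet persons-by-items decomposition, not the paper's own example), generalizability theory splits observed-score variance into several sources at once and summarizes dependability with a generalizability coefficient, whereas classical test theory folds everything but persons into one undifferentiated error term:

\sigma^2_{X_{pi}} = \sigma^2_p + \sigma^2_i + \sigma^2_{pi,e}, \qquad E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pi,e}/n_i}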
Woodruff, David – 1989
Previous methods for estimating the conditional standard error of measurement (CSEM) at specific score or ability levels are critically discussed, and a brief summary of prior empirical results is given. A new method is developed which avoids theoretical problems inherent in some prior methods, is easy to implement, and estimates not only a…
Descriptors: Error of Measurement, Estimation (Mathematics), Mathematical Models, Predictive Measurement
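One of the earlier CSEM estimators this line of work responds to is Lord's binomial-error formula for a number-correct score x on an n-item test (cited here as an example of a prior method, not as Woodruff's proposal):

\widehat{\mathrm{SEM}}(x) = \sqrt{\frac{x\,(n - x)}{n - 1}}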