Showing 1 to 15 of 20 results
Peer reviewed
Deke, John; Finucane, Mariel; Thal, Daniel – National Center for Education Evaluation and Regional Assistance, 2022
BASIE is a framework for interpreting impact estimates from evaluations. It is an alternative to null hypothesis significance testing. This guide walks researchers through the key steps of applying BASIE, including selecting prior evidence, reporting impact estimates, interpreting impact estimates, and conducting sensitivity analyses. The guide…
Descriptors: Bayesian Statistics, Educational Research, Data Interpretation, Hypothesis Testing
Peer reviewed
Ludtke, Oliver; Marsh, Herbert W.; Robitzsch, Alexander; Trautwein, Ulrich; Asparouhov, Tihomir; Muthen, Bengt – Psychological Methods, 2008
In multilevel modeling (MLM), group-level (L2) characteristics are often measured by aggregating individual-level (L1) characteristics within each group so as to assess contextual effects (e.g., group-average effects of socioeconomic status, achievement, climate). Most previous applications have used a multilevel manifest covariate (MMC) approach,…
Descriptors: Statistical Analysis, Sampling, Context Effect, Simulation
Peer reviewed
Dekle, Dawn J.; Leung, Denis H. Y.; Zhu, Min – Psychological Methods, 2008
Across many areas of psychology, concordance is commonly used to measure the (intragroup) agreement in ranking a number of items by a group of judges. Sometimes, however, the judges come from multiple groups, and in those situations, the interest is to measure the concordance between groups, under the assumption that there is some within-group…
Descriptors: Item Response Theory, Statistical Analysis, Psychological Studies, Evaluators
Peer reviewed
Eid, Michael; Nussbeck, Fridtjof W.; Geiser, Christian; Cole, David A.; Gollwitzer, Mario; Lischetzke, Tanja – Psychological Methods, 2008
The question as to which structural equation model should be selected when multitrait-multimethod (MTMM) data are analyzed is of interest to many researchers. In the past, attempts to find a well-fitting model have often been data-driven and highly arbitrary. In the present article, the authors argue that the measurement design (type of methods…
Descriptors: Structural Equation Models, Multitrait Multimethod Techniques, Statistical Analysis, Error of Measurement
Peer reviewed
Briggs, Derek C. – Applied Measurement in Education, 2008
This article illustrates the use of an explanatory item response modeling (EIRM) approach in the context of measuring group differences in science achievement. The distinction between item response models and EIRMs, recently elaborated by De Boeck and Wilson (2004), is presented within the statistical framework of generalized linear mixed models.…
Descriptors: Science Achievement, Science Tests, Measurement, Error of Measurement
Fairbank, Benjamin A., Jr. – 1985
The effectiveness of 19 methods of smoothing was investigated as those methods apply to the equipercentile method of test equating. Seven methods involved smoothing the score distribution before the tests were equated (presmoothing). Seven involved smoothing the resultant points after the equating (postsmoothing). Five methods involved combining…
Descriptors: Adults, Equated Scores, Equations (Mathematics), Error of Measurement
Lunneborg, Clifford E. – 1983
The wide availability of large amounts of inexpensive computing power has encouraged statisticians to explore many approaches to a basis for inference. This paper presents one such "computer-intensive" approach: the bootstrap of Bradley Efron. This methodology fits between the cases where it is assumed that the form of the distribution…
Descriptors: Analysis of Variance, Error of Measurement, Estimation (Mathematics), Hypothesis Testing
Peer reviewed
Stuart, Andrew; And Others – Journal of Speech and Hearing Research, 1990
Variability of aided sound field thresholds (ASFTs) was examined in 30 hearing-impaired children comprising 2 age groups (5-9 and 10-14 years). Findings showed that 2 ASFTs would have to differ by more than 10 decibels across signal test frequencies to attain statistical significance. (Author/DB)
Descriptors: Age Differences, Audiology, Auditory Evaluation, Children
Phillips, Gary W. – 1985
This paper provides empirical data on two approaches to statistically equate scores derived from the direct assessment of writing. These methods are linear equating and equating based on the general polychotomous form of the Rasch model. Data from the Maryland Functional Writing Test are used to equate scores obtained from two prompts given in…
Descriptors: Elementary Secondary Education, Equated Scores, Equations (Mathematics), Error of Measurement
Pike, Gary R. – 1991
Because change is fundamental to education and the measurement of change assesses the quality and effectiveness of postsecondary education, this study examined three methods of measuring change: (1) gain scores; (2) residual scores; and (3) repeated measures. Data for the study were obtained from transcripts of 722 graduating seniors at the…
Descriptors: Academic Achievement, College Seniors, Error of Measurement, Higher Education
Peer reviewed
Yen, Wendy M. – Journal of Educational Measurement, 1984
A procedure for obtaining maximum likelihood trait estimates from number-correct (NC) scores for the three-parameter logistic model is presented. It produces an NC score to trait estimate conversion table. Analyses in the estimated true score metric confirm the conclusions made in the trait metric. (Author/DWH)
Descriptors: Achievement Tests, Error of Measurement, Estimation (Mathematics), Latent Trait Theory
Skaggs, Gary; Lissitz, Robert W. – 1985
This study examined how four commonly used test equating procedures (linear, equipercentile, Rasch model, and three-parameter) would respond to situations in which the properties of the two tests being equated were different. Data for two tests plus an external anchor test were generated from a three-parameter model in which mean test differences…
Descriptors: Computer Simulation, Equated Scores, Error of Measurement, Goodness of Fit
Olejnik, Stephen F.; Algina, James – 1986
Sampling distributions for ten tests for comparing population variances in a two group design were generated for several combinations of equal and unequal sample sizes, population means, and group variances when distributional forms differed. The ten procedures included: (1) O'Brien's (OB); (2) O'Brien's with adjusted degrees of freedom; (3)…
Descriptors: Error of Measurement, Evaluation Methods, Measurement Techniques, Nonparametric Statistics
Cope, Ronald T. – 1987
This study used generalizability theory and other statistical concepts to assess the application of the Angoff method to setting cutoff scores on two professional certification tests. A panel of ten judges gave pre- and post-feedback Angoff probability ratings of items of two forms of a professional certification test, and another panel of nine…
Descriptors: Certification, Correlation, Cutting Scores, Error of Measurement
Livingston, Samuel A. – 1986
This paper deals with test fairness regarding a test consisting of two parts: (1) a "common" section, taken by all students; and (2) a "variable" section, in which some students may answer a different set of questions from other students. For example, a test taken by several thousand students each year contains a common multiple-choice portion and…
Descriptors: Difficulty Level, Error of Measurement, Essay Tests, Mathematical Models