Showing all 11 results
Peer reviewed
Yang, Ji Seung; Cai, Li – Journal of Educational and Behavioral Statistics, 2014
The main purpose of this study is to improve estimation efficiency in obtaining maximum marginal likelihood estimates of contextual effects within the framework of a nonlinear multilevel latent variable model by adopting the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm. Results indicate that the MH-RM algorithm can produce estimates and standard…
Descriptors: Computation, Hierarchical Linear Modeling, Mathematics, Context Effect
Peer reviewed
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge, unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Peer reviewed
Wu, Margaret – Educational Measurement: Issues and Practice, 2010
In large-scale assessments, such as state-wide testing programs, national sample-based assessments, and international comparative studies, there are many steps involved in the measurement and reporting of student achievement. Each of these steps has sources of inaccuracy. It is of interest to identify the source and magnitude of…
Descriptors: Testing Programs, Educational Assessment, Measures (Individuals), Program Effectiveness
Peer reviewed
Brennan, Robert L. – Applied Psychological Measurement, 2008
The discussion here covers five articles that are linked in the sense that they all treat population invariance. This discussion of population invariance is a somewhat broader treatment of the subject than simply a discussion of these five articles. In particular, occasional reference is made to publications other than those in this issue. The…
Descriptors: Advanced Placement, Law Schools, Science Achievement, Achievement Tests
Peer reviewed
Huitzing, Hiddo A.; Veldkamp, Bernard P.; Verschoor, Angela J. – Journal of Educational Measurement, 2005
Several techniques exist to automatically put together a test meeting a number of specifications. In an item bank, the items are stored with their characteristics. A test is constructed by selecting a set of items that fulfills the specifications set by the test assembler. Test assembly problems are often formulated in terms of a model consisting…
Descriptors: Testing Programs, Programming, Mathematics, Item Sampling
Peer reviewed
Petersen, Nancy S. – Applied Psychological Measurement, 2008
This article discusses the five studies included in this issue. Each article addresses the same topic, population invariance of equating. They all use data from major standardized testing programs, and they all use essentially the same statistics to evaluate their results, namely, the root mean square difference and root expected mean square…
Descriptors: Testing Programs, Standardized Tests, Equated Scores, Evaluation Methods
Peer reviewed
Shoemaker, David M.; Shoemaker, Judith Sauls – Evaluation and Program Planning: An International Journal, 1981
When evaluating the effectiveness of an educational program, multiple matrix sampling is particularly effective and efficient when the goal of the evaluation is estimating group (as opposed to individual) performance. The technique is described in some detail, with its advantages and disadvantages, and examples of its application are given.…
Descriptors: Educational Assessment, Evaluation Methods, Item Sampling, Program Effectiveness
Peer reviewed
Gohmann, Stephen F. – Journal of Educational Measurement, 1988
A method for correcting selection bias when comparing Scholastic Aptitude Test (SAT) scores among states is presented; it is a modification of J. J. Heckman's selection bias correction (1976, 1979). Empirical results suggest that sample selection bias is present in SAT score regressions. (SLD)
Descriptors: Regression (Statistics), Sampling, Scoring, Selection
Peer reviewed
Gao, Xiaohong; And Others – Applied Measurement in Education, 1994
This study provides empirical evidence about the sampling variability and generalizability (reliability) of a statewide performance assessment for grade six. Results for 600 students at individual and school levels indicate that task-sampling variability was the major source of measurement error. Rater-sampling variability was negligible. (SLD)
Descriptors: Achievement Tests, Educational Assessment, Elementary School Students, Error of Measurement
Peer reviewed
Shepard, Lorrie – Studies in Educational Evaluation, 1979
Assessment generally refers to large-scale, system-wide measurement programs for pupil diagnosis; pupil certification; program evaluation; research; accountability; resource allocations; or teacher evaluation. The purpose of assessment should determine the test content, construction, administration, and examinees sampled. Assessment methods for…
Descriptors: Accountability, Diagnostic Tests, Educational Assessment, Educational Research
Peer reviewed
Marascuilo, Leonard A. – Journal of Experimental Education, 1979
The utility of the biomedical model of adjusted statistics is demonstrated. The model is recommended for use by educational researchers to randomize subjects for a more accurate estimate of school programs' success or failure when compared across classrooms or other units. (Author/MH)
Descriptors: Academic Achievement, Analysis of Variance, Comparative Analysis, Criterion Referenced Tests