Peer reviewed
Yen, Wendy M. – Journal of Educational Measurement, 1984
A procedure for obtaining maximum likelihood trait estimates from number-correct (NC) scores for the three-parameter logistic model is presented. It produces an NC score to trait estimate conversion table. Analyses in the estimated true score metric confirm the conclusions made in the trait metric. (Author/DWH)
Descriptors: Achievement Tests, Error of Measurement, Estimation (Mathematics), Latent Trait Theory
Basol-Gocmen, Gulsah; Kanyongo, Gibbs Y.; Blankson, Lydia – Online Submission, 2002
The purpose of this paper is to evaluate the use of the MC2G program to teach certain topics in statistics education. MC2G is a program written in Pascal Delphi by Gordon Brooks of Ohio University based on Monte Carlo studies. MC2G provides students an opportunity to practice important topics in an introductory statistics course, such as power, Type I…
Descriptors: Student Attitudes, Monte Carlo Methods, Computer Software, Effect Size
Karkee, Thakur B.; Wright, Karen R. – Online Submission, 2004
Different item response theory (IRT) models may be employed for item calibration. Change of testing vendors, for example, may result in the adoption of a different model than that previously used with a testing program. To provide scale continuity and preserve cut score integrity, item parameter estimates from the new model must be linked to the…
Descriptors: Measures (Individuals), Evaluation Criteria, Testing, Integrity
Ross, J. Michael – 1996
This paper presents a number of arguments for the increased importance of within-state district-level data in systematic assessments in the organizational structure of schools as educational institutions. The major question is whether the Schools and Staffing Survey (SASS) should shift its focus toward more macro-institutional district level…
Descriptors: Adult Education, Elementary Secondary Education, Error of Measurement, Evaluation Methods
Zwick, Rebecca; And Others – 1994
A simulation study of methods of assessing differential item functioning (DIF) in computer-adaptive tests (CATs) was conducted by Zwick, Thayer, and Wingersky (in press, 1993). Results showed that modified versions of the Mantel-Haenszel and standardization methods work well with CAT data. DIF methods were also investigated for nonadaptive…
Descriptors: Adaptive Testing, Computer Assisted Testing, Error of Measurement, Estimation (Mathematics)
Safarik, John Gerald – California Journal of Educational Research, 1972
Study analyzed the results of a college rule regulating the maximum number of units students were allowed to carry and its effect on academic performance. (Author/RK)
Descriptors: Academic Achievement, Academic Failure, College Students, Credits
Peer reviewed
Emrick, John A. – Journal of Educational Measurement, 1971
Descriptors: Criterion Referenced Tests, Error of Measurement, Evaluation Methods, Item Analysis
Peer reviewed
Livingston, Samuel A. – Journal of Educational Measurement, 1972
A reliability coefficient for criterion-referenced tests is developed from the assumptions of classical test theory. The coefficient is based on deviations of scores from the criterion score, rather than from the mean. (Author/CK)
Descriptors: Criterion Referenced Tests, Error of Measurement, Mathematical Applications, Norm Referenced Tests
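As an illustration of the kind of coefficient the Livingston (1972) abstract describes, the sketch below uses the form in which it is commonly cited: the classical reliability is adjusted by the squared distance between the score mean and the criterion (cut) score. The code is not taken from the record itself; the function name and parameter names are hypothetical, and the formula is stated here as an assumption about the standard presentation.

```python
def livingston_k2(reliability: float, variance: float,
                  mean: float, cut_score: float) -> float:
    """Criterion-referenced reliability in its commonly cited form.

    K^2 = (r * s2 + (m - C)^2) / (s2 + (m - C)^2)

    r  : classical (norm-referenced) reliability of the scores
    s2 : observed-score variance
    m  : observed-score mean
    C  : criterion (cut) score
    All names are illustrative, not from the source record.
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    # Squared deviation of the mean from the criterion score.
    shift = (mean - cut_score) ** 2
    return (reliability * variance + shift) / (variance + shift)


if __name__ == "__main__":
    # When the cut score equals the mean, the coefficient reduces
    # to the classical reliability.
    print(livingston_k2(0.8, 100.0, 70.0, 70.0))
    # Moving the cut score away from the mean raises the coefficient.
    print(livingston_k2(0.8, 100.0, 70.0, 60.0))
```

Because deviations are taken from the criterion score rather than the mean, the coefficient equals the classical reliability only when the two coincide, and it grows as the cut score moves away from the mean.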
Peer reviewed
Harris, Chester W. – Journal of Educational Measurement, 1972
An alternative interpretation of Livingston's reliability coefficient (see TM 500 487) is based on the notion of the relation of the size of the reliability coefficient to the range of talent. (Author/CK)
Descriptors: Criterion Referenced Tests, Error of Measurement, Mathematical Applications, Norm Referenced Tests
Peer reviewed
McGaw, Barry; And Others – American Educational Research Journal, 1972
The generalizability theory approach to the estimation of reliability is outlined, and a design is developed in which systematic variations in behavior over differing situations are separated from random fluctuation. Three coefficients of reliability are proposed. (CK)
Descriptors: Analysis of Variance, Behavior Change, Classroom Observation Techniques, Classroom Research
Peer reviewed
Wilcox, Rand R. – Educational and Psychological Measurement, 1980
Using three sets of real data, four discrete discriminant analysis procedures are compared on actual versus optimal error rate. The kernel method gives the most accurate results in all three cases. (Author/RL)
Descriptors: Achievement Tests, Comparative Analysis, Discriminant Analysis, Error of Measurement
Peer reviewed
Solano-Flores, Guillermo; Trumbull, Elise – Educational Researcher, 2003
Suggests the importance of new paradigms in research and practice on testing English language learners (ELLs) to effectively address the complexities of language and culture, identifying three key issues: test review, test development, and treatment of language as a source of measurement error. Research examples illustrate the importance and…
Descriptors: Cultural Differences, Educational Research, Elementary Secondary Education, English (Second Language)
Peer reviewed
Willms, J. Douglas; Raudenbush, Stephen W. – Journal of Educational Measurement, 1989
A general longitudinal model is presented for estimating school effects and their stability. The model, capable of separating true changes from sampling and measurement error, controls statistically for effects of factors exogenous to the school system. The model is illustrated with data from large cohorts of students in Scotland. (SLD)
Descriptors: Elementary Secondary Education, Equations (Mathematics), Error of Measurement, Estimation (Mathematics)
Peer reviewed
De Ayala, R. J.; And Others – Journal of Educational Measurement, 1990
F. M. Lord's flexilevel, computerized adaptive testing (CAT) procedure was compared to an item-response theory-based CAT procedure that uses Bayesian ability estimation with various standard errors of estimates used for terminating the test. Ability estimates of flexilevel CATs were as accurate as were those of Bayesian CATs. (TJH)
Descriptors: Ability Identification, Adaptive Testing, Bayesian Statistics, Comparative Analysis
Peer reviewed
Hanushek, Eric A.; Taylor, Lori L. – Journal of Human Resources, 1990
Commonly employed measures of school quality can lead to very misleading results. Especially at the state level, nonrepresentative data such as aggregate Scholastic Aptitude Test scores provide very biased measures of school performance. Direct estimates of achievement growth are far superior. (SK)
Descriptors: Academic Achievement, Alternative Assessment, Educational Assessment, Educational Quality