Showing 1 to 15 of 19 results
Peer reviewed
Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2018
The Rasch facets model was developed to account for facet data, such as student essays graded by raters, but it accounts for only one kind of rater effect (severity). In practice, raters may exhibit various tendencies such as using middle or extreme scores in their ratings, which is referred to as the rater centrality/extremity response style. To…
Descriptors: Scoring, Models, Interrater Reliability, Computation
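For orientation, the many-facet Rasch model discussed in this entry is typically written with a single severity parameter per rater; a sketch in standard notation (the notation is assumed here, not quoted from the article):

\[
\log\frac{P_{nijk}}{P_{nij(k-1)}} = \theta_n - \beta_i - \lambda_j - \tau_k
\]

where \(\theta_n\) is examinee ability, \(\beta_i\) item difficulty, \(\lambda_j\) the severity of rater \(j\), and \(\tau_k\) the threshold for category \(k\). Because \(\lambda_j\) only shifts a rater's ratings uniformly up or down, a rater who compresses ratings toward the middle categories or pushes them to the extremes is not distinguished by this term, which is the gap the article addresses.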
Peer reviewed
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
Peer reviewed
Sinharay, Sandip – Journal of Educational Measurement, 2018
The value-added method of Haberman is arguably one of the most popular methods to evaluate the quality of subscores. The method is based on the classical test theory and deems a subscore to be of added value if the subscore predicts the corresponding true subscore better than does the total score. Sinharay provided an interpretation of the added…
Descriptors: Scores, Value Added Models, Raw Scores, Item Response Theory
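As a reading aid, Haberman's criterion is usually expressed through the proportional reduction in mean squared error (PRMSE); a sketch of the standard formulation (notation assumed, not quoted from the article):

\[
\mathrm{PRMSE}_s = \rho^2(s, \tau_s), \qquad \mathrm{PRMSE}_x = \rho^2(x, \tau_s),
\]

and the subscore \(s\) is deemed to have added value when \(\mathrm{PRMSE}_s > \mathrm{PRMSE}_x\), that is, when the observed subscore predicts its own true subscore \(\tau_s\) better than the total score \(x\) does.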
Peer reviewed
Wesolowski, Brian C. – Journal of Educational Measurement, 2019
The purpose of this study was to build a Random Forest supervised machine learning model in order to predict musical rater-type classifications based upon a Rasch analysis of raters' differential severity/leniency related to item use. Raw scores (N = 1,704) from 142 raters across nine high school solo and ensemble festivals (grades 9-12) were…
Descriptors: Item Response Theory, Prediction, Classification, Artificial Intelligence
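By way of illustration only, a minimal Random Forest classification of raters from Rasch-based severity indices might look like the following sketch; the feature matrix, labels, and train/test split are hypothetical placeholders, not the study's data or code.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Hypothetical inputs: one row per rater, columns are Rasch-derived indices
    # (e.g., item-level severity/leniency estimates); y holds rater-type labels.
    rng = np.random.default_rng(0)
    X = rng.random((142, 10))        # placeholder rater-by-index matrix
    y = rng.integers(0, 3, 142)      # placeholder rater-type classes

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = RandomForestClassifier(n_estimators=500, random_state=0)
    clf.fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))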
Peer reviewed
von Davier, Matthias; González B., Jorge; von Davier, Alina A. – Journal of Educational Measurement, 2013
Local equating (LE) is based on Lord's criterion of equity. It defines a family of true transformations that aim at the ideal of equitable equating. van der Linden (this issue) offers a detailed discussion of common issues in observed-score equating relative to this local approach. By assuming an underlying item response theory model, one of…
Descriptors: Equated Scores, Transformations (Mathematics), Item Response Theory, Raw Scores
Peer reviewed
Puhan, Gautam; Moses, Timothy P.; Yu, Lei; Dorans, Neil J. – Journal of Educational Measurement, 2009
This study examined the extent to which log-linear smoothing could improve the accuracy of differential item functioning (DIF) estimates in small samples of examinees. Examinee responses from a certification test were analyzed using White examinees in the reference group and African American examinees in the focal group. Using a simulation…
Descriptors: Test Items, Reference Groups, Testing Programs, Raw Scores
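For readers unfamiliar with the technique, log-linear smoothing of a raw-score frequency distribution is commonly implemented as a polynomial Poisson log-linear model; a minimal sketch with made-up frequencies (a generic illustration, not the authors' procedure):

    import numpy as np
    import statsmodels.api as sm

    # Hypothetical raw-score frequencies for a small focal group (scores 0..20).
    scores = np.arange(21)
    freq = np.array([1, 2, 3, 5, 8, 11, 14, 16, 17, 16,
                     14, 12, 9, 7, 5, 4, 3, 2, 1, 1, 0])

    # Log-linear model: log E[freq] = b0 + b1*score + b2*score^2 + b3*score^3,
    # which preserves the first three moments of the observed distribution.
    X = sm.add_constant(np.column_stack([scores**p for p in (1, 2, 3)]))
    fit = sm.GLM(freq, X, family=sm.families.Poisson()).fit()
    print(np.round(fit.fittedvalues, 2))   # smoothed score frequencies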
Peer reviewed
Sabers, Darrell L.; Klausmeier, Richard D. – Journal of Educational Measurement, 1971
Descriptors: Measurement Techniques, Raw Scores, Sampling, Statistical Analysis
Peer reviewed
de Gruijter, Dato N. M. – Journal of Educational Measurement, 1997
K. May and W. A. Nicewander recently concluded (1994) that percentile ranks are inferior to raw scores as indicators of latent ability. It is argued that their conclusion is incorrect, and an error in their derivation is identified. The incorrect equation leads to the incorrect conclusion, as work by F. M. Lord (1980) also indicates.…
Descriptors: Equations (Mathematics), Estimation (Mathematics), Raw Scores, Statistical Distributions
Peer reviewed
May, Kim O.; Nicewander, W. Alan – Journal of Educational Measurement, 1997
Dato de Gruijter is correct in his recent conclusion that one equation derived by the present authors should be changed to reflect that it is an approximation, but the argument still holds that percentile ranks for difficult tests can have substantially lower reliability and information than the corresponding number-correct scores. (SLD)
Descriptors: Equations (Mathematics), Estimation (Mathematics), Raw Scores, Reliability
Peer reviewed
Wilcox, Rand R.; Harris, Chester W. – Journal of Educational Measurement, 1977
Emrick's proposed method for determining a mastery level cut-off score is questioned. Emrick's method is shown to be useful only in limited situations. (JKS)
Descriptors: Correlation, Cutting Scores, Mastery Tests, Mathematical Models
Peer reviewed
Baskin, David – Journal of Educational Measurement, 1975
Traditional test scoring does not allow the examination of differences among subjects obtaining identical raw scores on the same test. A configuration scoring paradigm for identical raw scores, which provides for such comparisons, is developed and illustrated. (Author)
Descriptors: Elementary Secondary Education, Individual Differences, Mathematical Models, Multiple Choice Tests
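To make the idea concrete, here is a tiny hypothetical illustration (not taken from the article) of two examinees whose raw scores are identical but whose response configurations differ, which is the kind of contrast configuration scoring is meant to capture.

    # Hypothetical response vectors (1 = correct) on the same five items.
    a = [1, 1, 1, 0, 0]   # correct on the three easiest items
    b = [0, 0, 1, 1, 1]   # correct on the three hardest items
    assert sum(a) == sum(b) == 3            # identical raw scores
    print("configurations differ:", a != b)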
Peer reviewed
Baglin, Roger F. – Journal of Educational Measurement, 1986
Norm-referenced standardized achievement tests are designed for obtaining group scores, which can vary widely depending not only on the measure of central tendency used but also on the type of derived score employed. This situation is hypothesized to be the result of using inappropriate statistical procedures to develop publishers' scaled scores.…
Descriptors: Achievement Tests, Elementary Secondary Education, Latent Trait Theory, Norm Referenced Tests
Peer reviewed
Prediger, Dale; Hanson, Gary – Journal of Educational Measurement, 1977
Raw-score reports of vocational interest, personality traits, and other psychological constructs are coming into common use. In an analysis of college seniors' scores on the American College Test Interest Inventory, the criterion-related validity of standard scores based on same-sex and combined-sex norms was equal to or greater than that of raw scores.…
Descriptors: Higher Education, Interest Inventories, Majors (Students), Norms
Peer reviewed
Lamb, Richard R.; Prediger, Dale J. – Journal of Educational Measurement, 1980
The construct validity of vocational interest test scores was examined by finding the interest profiles of 15,447 college students in 51 majors. Standard scores based on same-sex norms were found to be more valid on this criterion than were raw scores. (CTM)
Descriptors: Higher Education, Interest Inventories, Raw Scores, Sex Bias
Peer reviewed
Thissen, David M. – Journal of Educational Measurement, 1976
Where estimation of abilities in the lower half of the ability distribution for the Raven Progressive Matrices is important, or an increase in the accuracy of ability estimation is needed, multiple-category latent trait estimation provides a rational procedure for realizing gains in accuracy from the use of information in wrong responses.…
Descriptors: Intelligence Tests, Item Analysis, Junior High Schools, Mathematical Models