Showing 1 to 15 of 19 results
Peer reviewed
Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2018
The Rasch facets model was developed to account for facet data, such as student essays graded by raters, but it accounts for only one kind of rater effect (severity). In practice, raters may exhibit various tendencies such as using middle or extreme scores in their ratings, which is referred to as the rater centrality/extremity response style. To…
Descriptors: Scoring, Models, Interrater Reliability, Computation
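For orientation, the many-facet Rasch model discussed in this entry is typically written with a single severity parameter per rater; a sketch in standard notation (the notation is assumed here, not quoted from the article):

\[
\log\frac{P_{nijk}}{P_{nij(k-1)}} = \theta_n - \beta_i - \lambda_j - \tau_k
\]

where \(\theta_n\) is examinee ability, \(\beta_i\) item difficulty, \(\lambda_j\) the severity of rater \(j\), and \(\tau_k\) the threshold for category \(k\). Because \(\lambda_j\) only shifts a rater's ratings uniformly up or down, a rater who compresses ratings toward the middle categories or pushes them to the extremes is not distinguished by this term, which is the gap the article addresses.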
Peer reviewed
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
Peer reviewed
Sinharay, Sandip – Journal of Educational Measurement, 2018
The value-added method of Haberman is arguably one of the most popular methods to evaluate the quality of subscores. The method is based on the classical test theory and deems a subscore to be of added value if the subscore predicts the corresponding true subscore better than does the total score. Sinharay provided an interpretation of the added…
Descriptors: Scores, Value Added Models, Raw Scores, Item Response Theory
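As a reading aid, Haberman's criterion is usually expressed through the proportional reduction in mean squared error (PRMSE); a sketch of the standard formulation (notation assumed, not quoted from the article):

\[
\mathrm{PRMSE}_s = \rho^2(s, \tau_s), \qquad \mathrm{PRMSE}_x = \rho^2(x, \tau_s),
\]

and the subscore \(s\) is deemed to have added value when \(\mathrm{PRMSE}_s > \mathrm{PRMSE}_x\), that is, when the observed subscore predicts its own true subscore \(\tau_s\) better than the total score \(x\) does.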
Peer reviewed
Wesolowski, Brian C. – Journal of Educational Measurement, 2019
The purpose of this study was to build a Random Forest supervised machine learning model in order to predict musical rater-type classifications based upon a Rasch analysis of raters' differential severity/leniency related to item use. Raw scores (N = 1,704) from 142 raters across nine high school solo and ensemble festivals (grades 9-12) were…
Descriptors: Item Response Theory, Prediction, Classification, Artificial Intelligence
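By way of illustration only, a minimal Random Forest classification of raters from Rasch-based severity indices might look like the following sketch; the feature matrix, labels, and train/test split are hypothetical placeholders, not the study's data or code.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Hypothetical inputs: one row per rater, columns are Rasch-derived indices
    # (e.g., item-level severity/leniency estimates); y holds rater-type labels.
    rng = np.random.default_rng(0)
    X = rng.random((142, 10))        # placeholder rater-by-index matrix
    y = rng.integers(0, 3, 142)      # placeholder rater-type classes

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = RandomForestClassifier(n_estimators=500, random_state=0)
    clf.fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))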
Peer reviewed
von Davier, Matthias; González B., Jorge; von Davier, Alina A. – Journal of Educational Measurement, 2013
Local equating (LE) is based on Lord's criterion of equity. It defines a family of true transformations that aim at the ideal of equitable equating. van der Linden (this issue) offers a detailed discussion of common issues in observed-score equating relative to this local approach. By assuming an underlying item response theory model, one of…
Descriptors: Equated Scores, Transformations (Mathematics), Item Response Theory, Raw Scores
Peer reviewed
Puhan, Gautam; Moses, Timothy P.; Yu, Lei; Dorans, Neil J. – Journal of Educational Measurement, 2009
This study examined the extent to which log-linear smoothing could improve the accuracy of differential item functioning (DIF) estimates in small samples of examinees. Examinee responses from a certification test were analyzed using White examinees in the reference group and African American examinees in the focal group. Using a simulation…
Descriptors: Test Items, Reference Groups, Testing Programs, Raw Scores
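For readers unfamiliar with the technique, log-linear smoothing of a raw-score frequency distribution is commonly implemented as a polynomial Poisson log-linear model; a minimal sketch with made-up frequencies (a generic illustration, not the authors' procedure):

    import numpy as np
    import statsmodels.api as sm

    # Hypothetical raw-score frequencies for a small focal group (scores 0..20).
    scores = np.arange(21)
    freq = np.array([1, 2, 3, 5, 8, 11, 14, 16, 17, 16,
                     14, 12, 9, 7, 5, 4, 3, 2, 1, 1, 0])

    # Log-linear model: log E[freq] = b0 + b1*score + b2*score^2 + b3*score^3,
    # which preserves the first three moments of the observed distribution.
    X = sm.add_constant(np.column_stack([scores**p for p in (1, 2, 3)]))
    fit = sm.GLM(freq, X, family=sm.families.Poisson()).fit()
    print(np.round(fit.fittedvalues, 2))   # smoothed score frequencies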
Peer reviewed
Sabers, Darrell L.; Klausmeier, Richard D. – Journal of Educational Measurement, 1971
Descriptors: Measurement Techniques, Raw Scores, Sampling, Statistical Analysis
Peer reviewed
de Gruijter, Dato N. M. – Journal of Educational Measurement, 1997
K. May and W. A. Nicewander recently concluded (1994) that percentile ranks are inferior to raw scores as indicators of latent ability. It is argued that their conclusion is incorrect, and an error in their derivation is identified. The incorrect equation leads to the incorrect conclusion, as work by F. M. Lord (1980) also indicates.…
Descriptors: Equations (Mathematics), Estimation (Mathematics), Raw Scores, Statistical Distributions
Peer reviewed
May, Kim O.; Nicewander, W. Alan – Journal of Educational Measurement, 1997
Dato de Gruijter is correct in his recent conclusion that one equation derived by the present authors should be changed to reflect that it is an approximation, but the argument still holds that percentile ranks for difficult tests can have substantially lower reliability and information than the corresponding number-correct scores. (SLD)
Descriptors: Equations (Mathematics), Estimation (Mathematics), Raw Scores, Reliability
Peer reviewed
Wilcox, Rand R.; Harris, Chester W. – Journal of Educational Measurement, 1977
Emrick's proposed method for determining a mastery level cut-off score is questioned. Emrick's method is shown to be useful only in limited situations. (JKS)
Descriptors: Correlation, Cutting Scores, Mastery Tests, Mathematical Models
Peer reviewed
Baskin, David – Journal of Educational Measurement, 1975
Traditional test scoring does not allow the examination of differences among subjects obtaining identical raw scores on the same test. A configuration scoring paradigm for identical raw scores, which provides for such comparisons, is developed and illustrated. (Author)
Descriptors: Elementary Secondary Education, Individual Differences, Mathematical Models, Multiple Choice Tests
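To make the idea concrete, here is a tiny hypothetical illustration (not taken from the article) of two examinees whose raw scores are identical but whose response configurations differ, which is the kind of contrast configuration scoring is meant to capture.

    # Hypothetical response vectors (1 = correct) on the same five items.
    a = [1, 1, 1, 0, 0]   # correct on the three easiest items
    b = [0, 0, 1, 1, 1]   # correct on the three hardest items
    assert sum(a) == sum(b) == 3            # identical raw scores
    print("configurations differ:", a != b)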
Peer reviewed
Baglin, Roger F. – Journal of Educational Measurement, 1986
Norm-referenced standardized achievement tests are designed for obtaining group scores, which can vary widely depending not only on the measure of central tendency used but also on the type of derived score employed. This situation is hypothesized to be the result of using inappropriate statistical procedures to develop publishers' scaled scores.…
Descriptors: Achievement Tests, Elementary Secondary Education, Latent Trait Theory, Norm Referenced Tests
Peer reviewed
Prediger, Dale; Hanson, Gary – Journal of Educational Measurement, 1977
Raw-score reports of vocational interest, personality traits, and other psychological constructs are coming into common use. In an analysis of college seniors' scores on the American College Test Interest Inventory, the criterion-related validity of standard scores based on same-sex and combined-sex norms was equal to or greater than that of raw scores.…
Descriptors: Higher Education, Interest Inventories, Majors (Students), Norms
Peer reviewed
Lamb, Richard R.; Prediger, Dale J. – Journal of Educational Measurement, 1980
The construct validity of vocational interest test scores was examined by finding the interest profiles of 15,447 college students in 51 majors. Standard scores based on same-sex norms were found to be more valid on this criterion than were raw scores. (CTM)
Descriptors: Higher Education, Interest Inventories, Raw Scores, Sex Bias
Peer reviewed
Thissen, David M. – Journal of Educational Measurement, 1976
Where estimation of abilities in the lower half of the ability distribution for the Raven Progressive Matrices is important, or an increase in the accuracy of ability estimation is needed, multiple-category latent trait estimation provides a rational procedure for realizing gains in accuracy from the use of information in wrong responses.…
Descriptors: Intelligence Tests, Item Analysis, Junior High Schools, Mathematical Models