Zapata-Rivera, Diego; Zwick, Rebecca; Vezzu, Margaret – Educational Assessment, 2016
The goal of this study was to explore the effectiveness of a short web-based tutorial in helping teachers to better understand the portrayal of measurement error in test score reports. The short video tutorial included both verbal and graphical representations of measurement error. Results showed a significant difference in comprehension scores…
Descriptors: Error of Measurement, Tutorial Programs, Instructional Effectiveness, Web Based Instruction
Zwick, Rebecca; Zapata-Rivera, Diego; Hegarty, Mary – Educational Assessment, 2014
Research has shown that many educators do not understand the terminology or displays used in test score reports and that measurement error is a particularly challenging concept. We investigated graphical and verbal methods of representing measurement error associated with individual student scores. We created four alternative score reports, each…
Descriptors: Error of Measurement, Scores, Reports, Comparative Analysis
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
Zwick, Rebecca; Himelfarb, Igor – Journal of Educational Measurement, 2011
Research has often found that, when high school grades and SAT scores are used to predict first-year college grade-point average (FGPA) via regression analysis, African-American and Latino students are, on average, predicted to earn higher FGPAs than they actually do. Under various plausible models, this phenomenon can be explained in terms of…
Descriptors: Socioeconomic Status, Grades (Scholastic), Error of Measurement, White Students

Zwick, Rebecca; Thayer, Dorothy T. – Journal of Educational and Behavioral Statistics, 1996
Two possible standard error formulas for the polytomous differential item functioning index proposed by N. J. Dorans and A. P. Schmitt (1991) were derived. These standard errors, and associated hypothesis-testing procedures, were evaluated through simulated data. The standard error that performed better is based on N. Mantel's (1963)…
Descriptors: Error of Measurement, Evaluation Methods, Hypothesis Testing, Item Bias
Zwick, Rebecca – 1986
Most currently used measures of inter-rater agreement for the nominal case incorporate a correction for "chance agreement." The definition of chance agreement is not the same for all coefficients, however. Three chance-corrected coefficients are Cohen's Kappa; Scott's Pi; and the S index of Bennett, Goldstein, and Alpert, which has…
Descriptors: Error of Measurement, Interrater Reliability, Mathematical Models, Measurement Techniques
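The three coefficients named in this abstract share the form (observed agreement - chance agreement) / (1 - chance agreement) and differ only in how chance agreement is defined. The following is a minimal sketch of that difference, not code from the report. The function name `agreement_coefficients` is hypothetical, and the S index is computed here under the assumption that the number of available categories equals the number of categories actually observed.

```python
from collections import Counter


def agreement_coefficients(rater_a, rater_b):
    """Return (kappa, pi, s) for two raters' nominal labels.

    Illustrative sketch only: assumes the category set is the set of
    labels observed in the data (this matters for the S index).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(rater_a)
    categories = sorted(set(rater_a) | set(rater_b))
    if len(categories) < 2:
        raise ValueError("need at least two categories")

    # Observed proportion of items on which the raters agree.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    count_a = Counter(rater_a)
    count_b = Counter(rater_b)

    # Cohen's kappa: chance agreement from each rater's own marginals.
    pe_kappa = sum((count_a[c] / n) * (count_b[c] / n) for c in categories)

    # Scott's pi: chance agreement from the pooled (averaged) marginals.
    pe_pi = sum(((count_a[c] + count_b[c]) / (2 * n)) ** 2 for c in categories)

    # S index: chance agreement is 1 / (number of categories).
    pe_s = 1 / len(categories)

    def corrected(p_chance):
        return (p_obs - p_chance) / (1 - p_chance)

    return corrected(pe_kappa), corrected(pe_pi), corrected(pe_s)


kappa, pi, s = agreement_coefficients(["x", "x", "y", "y"], ["x", "y", "y", "y"])
print(kappa, pi, s)
```

Because the raters in this toy example have different marginal distributions, kappa and pi disagree even though the observed agreement is the same for both.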

Zwick, Rebecca – Journal of Educational Statistics, 1990
Use of the Mantel-Haenszel procedure as a test for differential item functioning under the Rasch model of item-response theory is examined. Results of the procedure cannot be generalized to the class of items for which item-response functions are monotonic and local independence holds. (TJH)
Descriptors: Demography, Equations (Mathematics), Error of Measurement, Item Bias
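Several entries in this list rely on the Mantel-Haenszel procedure. As a rough illustration of the statistic involved, and not the article's own code, the sketch below computes the Mantel-Haenszel common odds ratio from one 2x2 table per matched score level. The function names and the tuple layout are assumptions made for this example. The delta-scale conversion (-2.35 times the natural log of the odds ratio) is the convention usually reported as MH D-DIF and is not stated in the abstract above.

```python
import math


def mh_common_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across matched score levels.

    Each stratum is a tuple (ref_right, ref_wrong, focal_right, focal_wrong)
    of examinee counts at one level of the matching variable.
    This tuple layout is an assumption of the sketch.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        total = ref_right + ref_wrong + focal_right + focal_wrong
        if total == 0:
            continue  # an empty score level contributes nothing
        numerator += ref_right * focal_wrong / total
        denominator += ref_wrong * focal_right / total
    if denominator == 0:
        raise ValueError("odds ratio undefined: denominator is zero")
    return numerator / denominator


def mh_d_dif(strata):
    """Odds ratio re-expressed on the delta scale (MH D-DIF convention)."""
    return -2.35 * math.log(mh_common_odds_ratio(strata))


example = [(20, 10, 10, 20), (30, 10, 20, 20)]
print(mh_common_odds_ratio(example), mh_d_dif(example))
```

An odds ratio above 1 (a negative delta value) indicates that, among examinees matched on score, the reference group had higher odds of answering the item correctly.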
Zwick, Rebecca; Thayer, Dorothy T. – 1994
Several recent studies have investigated the application of statistical inference procedures to the analysis of differential item functioning (DIF) in test items that are scored on an ordinal scale. Mantel's extension of the Mantel-Haenszel test is a possible hypothesis-testing method for this purpose. The development of descriptive statistics for…
Descriptors: Error of Measurement, Evaluation Methods, Hypothesis Testing, Item Bias

Zwick, Rebecca; And Others – Applied Psychological Measurement, 1994
Simulated data were used to investigate the performance of modified versions of the Mantel-Haenszel method of differential item functioning (DIF) analysis in computerized adaptive tests (CAT). Results indicate that CAT-based DIF procedures perform well and support the use of item response theory-based matching variables in DIF analysis. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Simulation, Error of Measurement
Zwick, Rebecca; And Others – 1993
Simulated data were used to investigate the performance of modified versions of the Mantel-Haenszel and standardization methods of differential item functioning (DIF) analysis in computer-adaptive tests (CATs). Each "examinee" received 25 items out of a 75-item pool. A three-parameter logistic item response model was assumed, and…
Descriptors: Adaptive Testing, Computer Assisted Testing, Correlation, Error of Measurement
Zwick, Rebecca; And Others – 1994
A simulation study of methods of assessing differential item functioning (DIF) in computer-adaptive tests (CATs) was conducted by Zwick, Thayer, and Wingersky (in press, 1993). Results showed that modified versions of the Mantel-Haenszel and standardization methods work well with CAT data. DIF methods were also investigated for nonadaptive…
Descriptors: Adaptive Testing, Computer Assisted Testing, Error of Measurement, Estimation (Mathematics)
Zwick, Rebecca; Sklar, Jeffrey C. – Journal of Educational and Behavioral Statistics, 2005
Cox (1972) proposed a discrete-time survival model that is somewhat analogous to the proportional hazards model for continuous time. Efron (1988) showed that this model can be estimated using ordinary logistic regression software, and Singer and Willett (1993) provided a detailed illustration of a particularly flexible form of the model that…
Descriptors: Error of Measurement, Regression (Statistics), Computer Software, Predictor Variables
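The reason ordinary logistic regression software can fit a discrete-time survival model is that the data can be restructured into one row per subject per period at risk, with a binary outcome marking whether the event occurred in that period. The sketch below illustrates only that restructuring step and is not taken from the article. The function names and the `(subject_id, duration, event)` record layout are assumptions. With no covariates, the fitted hazard for each period reduces to events divided by subjects at risk, which is what `empirical_hazard` computes.

```python
def person_period(records):
    """Expand (subject_id, duration, event) records into person-period rows.

    duration: number of discrete periods the subject was observed (>= 1).
    event: True if the event occurred in the final observed period,
           False if the subject was censored after it.
    Returns a list of (subject_id, period, outcome) rows, outcome in {0, 1}.
    """
    rows = []
    for subject_id, duration, event in records:
        for period in range(1, duration + 1):
            outcome = 1 if (event and period == duration) else 0
            rows.append((subject_id, period, outcome))
    return rows


def empirical_hazard(rows):
    """Per-period hazard: events in the period / subjects at risk in it."""
    at_risk = {}
    events = {}
    for _, period, outcome in rows:
        at_risk[period] = at_risk.get(period, 0) + 1
        events[period] = events.get(period, 0) + outcome
    return {period: events[period] / at_risk[period] for period in sorted(at_risk)}


records = [("a", 2, True), ("b", 3, False), ("c", 1, True)]
rows = person_period(records)
print(rows)
print(empirical_hazard(rows))
```

In practice the outcome column of these rows would be regressed on period indicators and any predictors using a logistic regression routine.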