Showing 1 to 15 of 46 results
Peer reviewed
Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2023
This study explores the usefulness of covariates on equating test scores from nonequivalent test groups. The covariates are captured by an estimated propensity score, which is used as a proxy for latent ability to balance the test groups. The objective is to assess the sensitivity of the equated scores to various misspecifications in the…
Descriptors: Models, Error of Measurement, Robustness (Statistics), Equated Scores
Peer reviewed
Arthurs, Noah; Stenhaug, Ben; Karayev, Sergey; Piech, Chris – International Educational Data Mining Society, 2019
Understanding exam score distributions has implications for item response theory (IRT), grade curving, and downstream modeling tasks such as peer grading. Historically, grades have been assumed to be normally distributed, and to this day the normal is the ubiquitous choice for modeling exam scores. While this is a good assumption for tests…
Descriptors: Grades (Scholastic), Scores, Statistical Distributions, Models
Peer reviewed
van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2019
Lord's (1980) equity theorem claims observed-score equating to be possible only when two test forms are perfectly reliable or strictly parallel. An analysis of its proof reveals use of an incorrect statistical assumption. The assumption does not invalidate the theorem itself though, which can be shown to follow directly from the discrete nature of…
Descriptors: Equated Scores, Testing Problems, Item Response Theory, Evaluation Methods
Peer reviewed
Shu, Lianghua; Schwarz, Richard D. – Journal of Educational Measurement, 2014
As a global measure of precision, item response theory (IRT) estimated reliability is derived for four coefficients (Cronbach's α, Feldt-Raju, stratified α, and marginal reliability). Models with different underlying assumptions concerning test-part similarity are discussed. A detailed computational example is presented for the targeted…
Descriptors: Item Response Theory, Reliability, Models, Computation
Peer reviewed
von Davier, Matthias; González B., Jorge; von Davier, Alina A. – Journal of Educational Measurement, 2013
Local equating (LE) is based on Lord's criterion of equity. It defines a family of true transformations that aim at the ideal of equitable equating. van der Linden (this issue) offers a detailed discussion of common issues in observed-score equating relative to this local approach. By assuming an underlying item response theory model, one of…
Descriptors: Equated Scores, Transformations (Mathematics), Item Response Theory, Raw Scores
Cai, Li; Monroe, Scott – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014
We propose a new limited-information goodness of fit test statistic C₂ for ordinal IRT models. The construction of the new statistic lies formally between the M₂ statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M*₂ statistic of Cai and Hansen…
Descriptors: Item Response Theory, Models, Goodness of Fit, Probability
Hirose, Hideo – Online Submission, 2011
Teachers often ask whether lecture questionnaires are necessary. In this paper, we first present a recent statistical analysis of the official unsigned questionnaire evaluation results collected in our faculty. We have found that: (1) the evaluation scores of lectures by students have been rising year by year, which…
Descriptors: Item Response Theory, Questionnaires, Statistical Analysis, Course Evaluation
Kang, Taehoon; Petersen, Nancy S. – ACT, Inc., 2009
This paper compares three methods of item calibration--concurrent calibration, separate calibration with linking, and fixed item parameter calibration--that are frequently used for linking item parameters to a base scale. Concurrent and separate calibrations were implemented using BILOG-MG. The Stocking and Lord (1983) characteristic curve method…
Descriptors: Standards, Testing Programs, Test Items, Statistical Distributions
Peer reviewed
van der Linden, Wim J. – Psychometrika, 1998
Dichotomous item response theory (IRT) models can be viewed as families of stochastically ordered distributions of responses to test items. This paper explores several properties of such distributions, especially those related to transfer to other distributions. Results are formulated as a series of theorems and corollaries that apply to…
Descriptors: Item Response Theory, Responses, Statistical Distributions, Test Items
Peer reviewed
Nering, Michael L. – Applied Psychological Measurement, 1995
A person-fit method that allows researchers to identify nonfitting response vectors is the l(z) statistic. Simulation results show that l(z) may not perform as expected when estimated person parameters are used rather than true person parameters. Other considerations in using true and estimated person parameters are discussed. (SLD)
Descriptors: Estimation (Mathematics), Item Response Theory, Research Methodology, Responses
Buras, Avery – 1996
The logic and uses of test equating are discussed, including three methods of test equating. The focus is on the conceptual underpinnings of each test equating method, rather than on the mathematics of the procedures. Additional consideration is given to the assumptions of each method and its respective strengths and weaknesses. A commonly…
Descriptors: Equated Scores, Item Response Theory, Models, Raw Scores
Garner, Mary; Engelhard, George, Jr. – 1997
This paper considers the following questions: (1) what is the relationship between the method of paired comparisons and Rasch measurement theory? (2) what is the relationship between the method of paired comparisons and graph theory? and (3) what can graph theory contribute to the understanding of Rasch measurement theory? It is specifically shown…
Descriptors: Comparative Analysis, Estimation (Mathematics), Graphs, Item Response Theory
Peer reviewed
Baker, Frank B. – Applied Psychological Measurement, 1996
Using the characteristic curve method for dichotomously scored test items, the sampling distributions of equating coefficients were examined. Simulations indicate that for the equating conditions studied, the sampling distributions of the equating coefficients appear to have acceptable characteristics, suggesting confidence in the values obtained…
Descriptors: Equated Scores, Item Response Theory, Sampling, Statistical Distributions
Peer reviewed
Bedrick, Edward J. – Psychometrika, 1997
A simple approximation to the conditional distribution of goodness-of-fit statistics for the Rasch model is presented that is used when item difficulties are known. The approximation, which is easily programmed, gives relatively accurate assessments of conditional p-values for tests of 10 or more items. (Author/SLD)
Descriptors: Difficulty Level, Goodness of Fit, Item Response Theory, Statistical Distributions
Peer reviewed
Seol, Hyunsoo – Journal of Outcome Measurement, 1999
Examined five Rasch-model-based item-fit indices in terms of their distributional properties and the power of detecting item bias or differential item functioning. Results indicate that, although these five standardized item-fit indices did not depart significantly from a normal distribution, the Type I error rates were not reasonable. (Author/SLD)
Descriptors: Goodness of Fit, Item Bias, Item Response Theory, Statistical Distributions