Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020
Although the root-mean squared deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered…
Descriptors: Test Items, Goodness of Fit, Probability, Accuracy
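The dependence on the proficiency distribution that this abstract describes can be illustrated with a minimal sketch. It assumes a commonly used form of the RMSD, namely the square root of the squared gap between a group-specific and a model-implied item characteristic curve, weighted by the group's proficiency density. That form, the two-parameter logistic curves, and all parameter values below are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def icc_2pl(theta, a, b):
    """Two-parameter logistic item characteristic curve (assumed model)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))


def normal_density(theta, mean, sd=1.0):
    """Unnormalized normal density evaluated on a proficiency grid."""
    return np.exp(-0.5 * ((theta - mean) / sd) ** 2)


def rmsd(p_group, p_model, density):
    """Density-weighted root-mean-squared deviation between two curves."""
    weights = density / density.sum()  # grid weights normalized to sum to 1
    return float(np.sqrt(np.sum(weights * (p_group - p_model) ** 2)))


theta = np.linspace(-6.0, 6.0, 601)

# Hypothetical item: the same misfit (item one unit harder) in both groups.
p_model = icc_2pl(theta, a=1.0, b=0.0)  # curve implied by common parameters
p_group = icc_2pl(theta, a=1.0, b=1.0)  # group-specific curve

# Group whose proficiency is centered where the two curves differ most.
rmsd_center = rmsd(p_group, p_model, normal_density(theta, mean=0.5))
# Group whose proficiency lies far below the item's difficulty.
rmsd_far = rmsd(p_group, p_model, normal_density(theta, mean=-3.0))

print(f"RMSD, proficiency centered at 0.5: {rmsd_center:.3f}")
print(f"RMSD, proficiency centered at -3.0: {rmsd_far:.3f}")
```

Under these assumptions the identical misfit produces a smaller RMSD for the low-proficiency group, because little of that group's density falls where the two curves diverge.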
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2015
Person-fit assessment may help the researcher to obtain additional information regarding the answering behavior of persons. Although several researchers examined person fit, there is a lack of research on person-fit assessment for mixed-format tests. In this article, the lz statistic and the χ² statistic, both of which have been used for tests…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Bayesian Statistics
Lee, HwaYoung; Beretvas, S. Natasha – Educational and Psychological Measurement, 2014
Conventional differential item functioning (DIF) detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable. True sources of DIF may include unobserved, latent variables, such as…
Descriptors: Item Analysis, Factor Structure, Bayesian Statistics, Goodness of Fit
Pelanek, Radek – Journal of Educational Data Mining, 2015
Researchers use many different metrics for evaluation of performance of student models. The aim of this paper is to provide an overview of commonly used metrics, to discuss properties, advantages, and disadvantages of different metrics, to summarize current practice in educational data mining, and to provide guidance for evaluation of student…
Descriptors: Models, Data Analysis, Data Processing, Evaluation Criteria
Lin, Johnny Cheng-Han – ProQuest LLC, 2013
Many methods exist for imputing missing data but fewer methods have been proposed to test the missing data mechanism. Little (1988) introduced a multivariate chi-square test for the missing completely at random data mechanism (MCAR) that compares observed means for each pattern with expectation-maximization (EM) estimated means. As an alternative,…
Descriptors: Data Analysis, Statistical Inference, Error of Measurement, Probability
Cai, Li; Monroe, Scott – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014
We propose a new limited-information goodness of fit test statistic C[subscript 2] for ordinal IRT models. The construction of the new statistic lies formally between the M[subscript 2] statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M*[subscript 2] statistic of Cai and Hansen…
Descriptors: Item Response Theory, Models, Goodness of Fit, Probability
Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability
Leite, Walter L.; Sandbach, Robert; Jin, Rong; MacInnes, Jann W.; Jackman, M. Grace-Anne – Structural Equation Modeling: A Multidisciplinary Journal, 2012
Because random assignment is not possible in observational studies, estimates of treatment effects might be biased due to selection on observable and unobservable variables. To strengthen causal inference in longitudinal observational studies of multiple treatments, we present 4 latent growth models for propensity score matched groups, and…
Descriptors: Structural Equation Models, Probability, Computation, Observation
Chan, Y. C.; Lam, Gladys L. T.; Chun, P. K. R.; So, Moon Tong Ernest – Child Abuse & Neglect: The International Journal, 2006
Objectives: To evaluate whether or not the original six-factor structure of the Child Abuse Potential (CAP) Inventory suggested by [Milner, J. S. (1986). "The Child Abuse Potential Inventory: Manual" (2nd ed.). DeKalb, IL: Psytec. Inc.] can be confirmed with data from a group of Chinese mothers in Hong Kong. Method: Eight hundred and…
Descriptors: Measures (Individuals), Factor Structure, Child Abuse, Mothers
DeMars, Christine E. – Applied Psychological Measurement, 2004
Type I error rates were examined for several fit indices available in GGUM2000: extensions of Infit, Outfit, Andrich's X(2), and the log-likelihood ratio X(2). Infit and Outfit had Type I error rates much lower than nominal alpha. Andrich's X(2) had Type I error rates much higher than nominal alpha, particularly for shorter tests or larger sample…
Descriptors: Likert Scales, Error of Measurement, Goodness of Fit, Psychological Studies
Statistical Comparisons Among Hierarchies Based on Latent Structure Models. Research Monograph 77-1.
Macready, George B.; Dayton, C. Mitchell – 1977
A probabilistic hypothesis testing procedure to assess the fit of hypothesized hierarchical structures for test item data is discussed. Statistical procedures are presented which are useful for evaluating the fit of data to a certain class of probabilistic models. These models apply to sets of dichotomous (0,1) responses for which there are…
Descriptors: Error of Measurement, Goodness of Fit, Hypothesis Testing, Mathematical Models
Wang, Tianyou; And Others – 1996
M. J. Kolen, B. A. Hanson, and R. L. Brennan (1992) presented a procedure for assessing the conditional standard error of measurement (CSEM) of scale scores using a strong true-score model. They also investigated the ways of using nonlinear transformation from number-correct raw score to scale score to equalize the conditional standard error along…
Descriptors: Ability, Classification, Error of Measurement, Goodness of Fit
Rentz, R. Robert; Bashaw, W. L. – 1975
In order to determine if Rasch Model procedures have any utility for equating pre-existing tests, this study reanalyzed the data from the equating phase of the Anchor Test Study which used a variety of equipercentile and linear model methods. The tests involved included seven reading test batteries, each having from one to three levels and two…
Descriptors: Comparative Analysis, Elementary Education, Equated Scores, Error of Measurement
Rentz, R. Robert; Bashaw, W. L. – 1975
This volume contains tables of item analysis results obtained by following procedures associated with the Rasch Model for those reading tests used in the Anchor Test Study. Appendix I gives the test names and their corresponding analysis code numbers. Section I (Basic Item Analyses) presents data for the item analysis of each test in a two part…
Descriptors: Comparative Analysis, Elementary Education, Equated Scores, Error of Measurement