ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	1
Since 2006 (last 20 years)	9

Descriptor

Error of Measurement	14
Goodness of Fit	14
Probability	14
Statistical Analysis	7
Item Response Theory	5
Data Analysis	4
Evaluation Methods	4
Item Analysis	4
Computation	3
Mathematical Models	3
Measurement Techniques	3
Models	3
Simulation	3
Statistical Distributions	3
Bayesian Statistics	2
Comparative Analysis	2
Elementary Education	2
Equated Scores	2
Factor Structure	2
Foreign Countries	2
Hypothesis Testing	2
Longitudinal Studies	2
Raw Scores	2
Reading Comprehension	2
Reading Tests	2

Source

Educational and Psychological…	2
Applied Psychological…	1
Child Abuse & Neglect: The…	1
Journal of Educational Data…	1
Journal of Educational…	1
Journal of Educational and…	1
National Center for Research…	1
ProQuest LLC	1
Structural Equation Modeling:…	1

Publication Type

Journal Articles	8
Reports - Research	8
Reports - Evaluative	3
Dissertations/Theses -…	1
Numerical/Quantitative Data	1
Reports - Descriptive	1
Speeches/Meeting Papers	1

Education Level

Elementary Education	1
Grade 1	1
Grade 2	1
Grade 3	1
Grade 4	1
Grade 5	1
Secondary Education	1

Location

Hong Kong

Assessments and Surveys

Child Abuse Potential…	1
Early Childhood Longitudinal…	1
Program for International…	1
Work Keys (ACT)	1

Showing all 14 results

Sensitivity of the RMSD for Detecting Item-Level Misfit in Low-Performing Countries

Peer reviewed

Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020

Although the root-mean squared deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered…

Descriptors: Test Items, Goodness of Fit, Probability, Accuracy

Assessment of Person Fit for Mixed-Format Tests

Peer reviewed

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2015

Person-fit assessment may help the researcher to obtain additional information regarding the answering behavior of persons. Although several researchers examined person fit, there is a lack of research on person-fit assessment for mixed-format tests. In this article, the l_z statistic and the χ² statistic, both of which have been used for tests…

Descriptors: Test Format, Goodness of Fit, Item Response Theory, Bayesian Statistics

Evaluation of Two Types of Differential Item Functioning in Factor Mixture Models with Binary Outcomes

Peer reviewed

Lee, HwaYoung; Beretvas, S. Natasha – Educational and Psychological Measurement, 2014

Conventional differential item functioning (DIF) detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable. True sources of DIF may include unobserved, latent variables, such as…

Descriptors: Item Analysis, Factor Structure, Bayesian Statistics, Goodness of Fit

Metrics for Evaluation of Student Models

Peer reviewed

Pelanek, Radek – Journal of Educational Data Mining, 2015

Researchers use many different metrics for evaluation of performance of student models. The aim of this paper is to provide an overview of commonly used metrics, to discuss properties, advantages, and disadvantages of different metrics, to summarize current practice in educational data mining, and to provide guidance for evaluation of student…

Descriptors: Models, Data Analysis, Data Processing, Evaluation Criteria

A Probability Based Framework for Testing the Missing Data Mechanism

Lin, Johnny Cheng-Han – ProQuest LLC, 2013

Many methods exist for imputing missing data but fewer methods have been proposed to test the missing data mechanism. Little (1988) introduced a multivariate chi-square test for the missing completely at random data mechanism (MCAR) that compares observed means for each pattern with expectation-maximization (EM) estimated means. As an alternative,…

Descriptors: Data Analysis, Statistical Inference, Error of Measurement, Probability

A New Statistic for Evaluating Item Response Theory Models for Ordinal Data. CRESST Report 839

Cai, Li; Monroe, Scott – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014

We propose a new limited-information goodness of fit test statistic C₂ for ordinal IRT models. The construction of the new statistic lies formally between the M₂ statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M*₂ statistic of Cai and Hansen…

Descriptors: Item Response Theory, Models, Goodness of Fit, Probability

Assessing Goodness of Fit in Item Response Theory with Nonparametric Models: A Comparison of Posterior Probabilities and Kernel-Smoothing Approaches

Peer reviewed

Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011

The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…

Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability

An Evaluation of Latent Growth Models for Propensity Score Matched Groups

Peer reviewed

Leite, Walter L.; Sandbach, Robert; Jin, Rong; MacInnes, Jann W.; Jackman, M. Grace-Anne – Structural Equation Modeling: A Multidisciplinary Journal, 2012

Because random assignment is not possible in observational studies, estimates of treatment effects might be biased due to selection on observable and unobservable variables. To strengthen causal inference in longitudinal observational studies of multiple treatments, we present 4 latent growth models for propensity score matched groups, and…

Descriptors: Structural Equation Models, Probability, Computation, Observation

Confirmatory Factor Analysis of the Child Abuse Potential Inventory: Results Based on a Sample of Chinese Mothers in Hong Kong

Peer reviewed

Chan, Y. C.; Lam, Gladys L. T.; Chun, P. K. R.; So, Moon Tong Ernest – Child Abuse & Neglect: The International Journal, 2006

Objectives: To evaluate whether or not the original six-factor structure of the Child Abuse Potential (CAP) Inventory suggested by [Milner, J. S. (1986). "The Child Abuse Potential Inventory: Manual" (2nd ed.). DeKalb, IL: Psytec. Inc.] can be confirmed with data from a group of Chinese mothers in Hong Kong. Method: Eight hundred and…

Descriptors: Measures (Individuals), Factor Structure, Child Abuse, Mothers

Type I Error Rates for Generalized Graded Unfolding Model Fit Indices

Peer reviewed

DeMars, Christine E. – Applied Psychological Measurement, 2004

Type I error rates were examined for several fit indices available in GGUM2000: extensions of Infit, Outfit, Andrich's X², and the log-likelihood ratio X². Infit and Outfit had Type I error rates much lower than nominal alpha. Andrich's X² had Type I error rates much higher than nominal alpha, particularly for shorter tests or larger sample…

Descriptors: Likert Scales, Error of Measurement, Goodness of Fit, Psychological Studies

Statistical Comparisons Among Hierarchies Based on Latent Structure Models. Research Monograph 77-1.

Macready, George B.; Dayton, C. Mitchell – 1977

A probabilistic hypothesis testing procedure to assess the fit of hypothesized hierarchical structures for test item data is discussed. Statistical procedures are presented which are useful for evaluating the fit of data of a certain class of probabilistic models. These models apply to sets of dichotomous (0,1) responses for which there are…

Descriptors: Error of Measurement, Goodness of Fit, Hypothesis Testing, Mathematical Models

Conditional Standard Errors, Reliability and Decision Consistency of Performance Levels Using Polytomous IRT.

Wang, Tianyou; And Others – 1996

M. J. Kolen, B. A. Hanson, and R. L. Brennan (1992) presented a procedure for assessing the conditional standard error of measurement (CSEM) of scale scores using a strong true-score model. They also investigated the ways of using nonlinear transformation from number-correct raw score to scale score to equalize the conditional standard error along…

Descriptors: Ability, Classification, Error of Measurement, Goodness of Fit

Equating Reading Tests With the Rasch Model. Volume I, Final Report.

Rentz, R. Robert; Bashaw, W. L. – 1975

In order to determine if Rasch Model procedures have any utility for equating pre-existing tests, this study reanalyzed the data from the equating phase of the Anchor Test Study which used a variety of equipercentile and linear model methods. The tests involved included seven reading test batteries, each having from one to three levels and two…

Descriptors: Comparative Analysis, Elementary Education, Equated Scores, Error of Measurement

Equating Reading Tests With the Rasch Model. Volume II, Technical Reference Tables.

Rentz, R. Robert; Bashaw, W. L. – 1975

This volume contains tables of item analysis results obtained by following procedures associated with the Rasch Model for those reading tests used in the Anchor Test Study. Appendix I gives the test names and their corresponding analysis code numbers. Section I (Basic Item Analyses) presents data for the item analysis of each test in a two part…

Descriptors: Comparative Analysis, Elementary Education, Equated Scores, Error of Measurement

Author

Bashaw, W. L.	2
Rentz, R. Robert	2
Abad, Francisco J.	1
Beretvas, S. Natasha	1
Bolsinova, Maria	1
Cai, Li	1
Chan, Y. C.	1
Chun, P. K. R.	1
Dayton, C. Mitchell	1
DeMars, Christine E.	1
Jackman, M. Grace-Anne	1
Jin, Rong	1
Lam, Gladys L. T.	1
Lee, HwaYoung	1
Leite, Walter L.	1
Liaw, Yuan-Ling	1
Lin, Johnny Cheng-Han	1
MacInnes, Jann W.	1
Macready, George B.	1
Monroe, Scott	1
Pelanek, Radek	1
Rutkowski, David	1
Rutkowski, Leslie	1
Sandbach, Robert	1