van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2022
The current literature on test equating generally defines it as the process necessary to obtain score comparability between different test forms. This definition contrasts with Lord's foundational paper, which viewed equating as the process required to obtain comparability of measurement scale between forms. The distinction between the notions…
Descriptors: Equated Scores, Test Items, Scores, Probability
Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2023
This study explores the usefulness of covariates on equating test scores from nonequivalent test groups. The covariates are captured by an estimated propensity score, which is used as a proxy for latent ability to balance the test groups. The objective is to assess the sensitivity of the equated scores to various misspecifications in the…
Descriptors: Models, Error of Measurement, Robustness (Statistics), Equated Scores
The Reliability of the Posterior Probability of Skill Attainment in Diagnostic Classification Models
Johnson, Matthew S.; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2020
One common score reported from diagnostic classification assessments is the vector of posterior means of the skill mastery indicators. As with any assessment, it is important to derive and report estimates of the reliability of the reported scores. After reviewing a reliability measure suggested by Templin and Bradshaw, this article suggests three…
Descriptors: Reliability, Probability, Skill Development, Classification
Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2019
When equating two test forms, the equated scores will be biased if the test groups differ in ability. To adjust for the ability imbalance between nonequivalent groups, a set of common items is often used. When no common items are available, it has been suggested to use covariates correlated with the test scores instead. In this article, we reduce…
Descriptors: Equated Scores, Test Items, Probability, College Entrance Examinations
Monroe, Scott – Journal of Educational and Behavioral Statistics, 2019
In item response theory (IRT) modeling, the Fisher information matrix is used for numerous inferential procedures such as estimating parameter standard errors, constructing test statistics, and facilitating test scoring. In principle, these procedures may be carried out using either the expected information or the observed information. However, in…
Descriptors: Item Response Theory, Error of Measurement, Scoring, Inferences
Keller, Bryan; Tipton, Elizabeth – Journal of Educational and Behavioral Statistics, 2016
In this article, we review four software packages for implementing propensity score analysis in R: "Matching," "MatchIt," "PSAgraphics," and "twang." After briefly discussing essential elements for propensity score analysis, we apply each package to a data set from the Early Childhood Longitudinal Study in order to estimate the…
Descriptors: Computer Software, Probability, Statistical Analysis, Longitudinal Studies
Garcia-Perez, Miguel A. – Journal of Educational and Behavioral Statistics, 2010
A recent comparative analysis of alternative interval estimation approaches and procedures has shown that confidence intervals (CIs) for true raw scores determined with the Score method--which uses the normal approximation to the binomial distribution--have actual coverage probabilities that are closest to their nominal level. It has also recently…
Descriptors: Computation, Statistical Analysis, True Scores, Raw Scores
Ho, Andrew Dean – Journal of Educational and Behavioral Statistics, 2009
Problems of scale typically arise when comparing test score trends, gaps, and gap trends across different tests. To overcome some of these difficulties, test score distributions on the same score scale can be represented by nonparametric graphs or statistics that are invariant under monotone scale transformations. This article motivates and then…
Descriptors: Nonparametric Statistics, Comparative Analysis, Trend Analysis, Scores
Moses, Tim – Journal of Educational and Behavioral Statistics, 2008
Equating functions are supposed to be population invariant, meaning that the choice of subpopulation used to compute the equating function should not matter. The extent to which equating functions are population invariant is typically assessed in terms of practical difference criteria that do not account for equating functions' sampling…
Descriptors: Equated Scores, Error of Measurement, Sampling, Evaluation Methods

Lecoutre, Bruno; Charron, Camilo – Journal of Educational and Behavioral Statistics, 2000
Illustrates procedures for prediction analysis in 2 X 2 contingency tables through the analyses of solutions of six types of problems associated with the acquisition of fractions. Reviews and extends confidence interval procedures previously proposed for an index of predictive efficiency of implication hypotheses. Compares frequentist coverage…
Descriptors: Bayesian Statistics, Hypothesis Testing, Prediction, Probability

Meulders, Michel; De Boeck, Paul; Van Mechelen, Iven; Gelman, Andrew; Maris, Eric – Journal of Educational and Behavioral Statistics, 2001
Presents a fully Bayesian analysis for the Probability Matrix Decomposition (PMD) model using the Gibbs sampler. Identifies the advantages of this approach and illustrates the approach by applying the PMD model to opinions of respondents from different countries concerning the possibility of contracting AIDS in a specific situation. (SLD)
Descriptors: Bayesian Statistics, Matrices, Probability, Psychometrics

Bradlow, Eric T.; Weiss, Robert E. – Journal of Educational and Behavioral Statistics, 2001
Compares four methods that map outlier statistics to a familiar probability scale (a "P" value). Explores these methods in the context of computerized adaptive test data from a 1995 nationally administered computerized examination for professionals in the medical industry. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Probability, Test Construction
van der Linden, Wim J.; Veldkamp, Bernard P. – Journal of Educational and Behavioral Statistics, 2004
Item-exposure control in computerized adaptive testing is implemented by imposing item-ineligibility constraints on the assembly process of the shadow tests. The method resembles Sympson and Hetter's (1985) method of item-exposure control in that the decisions to impose the constraints are probabilistic. The method does not, however, require…
Descriptors: Probability, Law Schools, Admission (School), Adaptive Testing
Sinharay, Sandip; Johnson, Matthew S.; Williamson, David M. – Journal of Educational and Behavioral Statistics, 2003
Item families, which are groups of related items, are becoming increasingly popular in complex educational assessments. For example, in automatic item generation (AIG) systems, a test may consist of multiple items generated from each of a number of item models. Item calibration or scoring for such an assessment requires fitting models that can…
Descriptors: Test Items, Markov Processes, Educational Testing, Probability
Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – Journal of Educational and Behavioral Statistics, 2006
Assuming errors of measurement are distributed binomially, this article reviews various procedures for constructing an interval for an individual's true number-correct score; presents two general interval estimation procedures for an individual's true scale score (i.e., normal approximation and endpoints conversion methods); compares various…
Descriptors: Probability, Intervals, Guidelines, Computer Simulation