Clauser, Brian E.; Kane, Michael; Clauser, Jerome C. – Journal of Educational Measurement, 2020
An Angoff standard setting study generally yields judgments on a number of items by a number of judges (who may or may not be nested in panels). Variability associated with judges (and possibly panels) contributes error to the resulting cut score. The variability associated with items plays a more complicated role. To the extent that the mean item…
Descriptors: Cutting Scores, Generalization, Decision Making, Standard Setting
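The abstract above describes how judge (or panel) variability contributes error to an Angoff cut score. A minimal sketch of that idea, under the simplifying assumption that the cut score is the mean of the judges' mean item judgments and that only judge sampling contributes error (the ratings below are hypothetical):

```python
from statistics import mean, stdev

def angoff_cut_score(ratings):
    """ratings[j][i]: judge j's estimated probability that a minimally
    competent examinee answers item i correctly (Angoff method)."""
    judge_means = [mean(r) for r in ratings]   # each judge's implied cut score
    cut = mean(judge_means)                    # panel cut score
    # Standard error attributable to sampling judges.
    se_judges = stdev(judge_means) / len(judge_means) ** 0.5
    return cut, se_judges

# Hypothetical ratings: 3 judges x 4 items.
ratings = [
    [0.6, 0.7, 0.5, 0.8],
    [0.5, 0.6, 0.4, 0.7],
    [0.7, 0.8, 0.6, 0.9],
]
cut, se = angoff_cut_score(ratings)
```

A fuller generalizability analysis would also partition item variance, which, as the abstract notes, plays a more complicated role.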
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2016
The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…
Descriptors: Test Theory, Item Response Theory, Models, Correlation
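One concrete point of contact between the two frameworks discussed above (not the article's own derivation) is that an IRT model implies a classical true score: the expected observed score is the sum of the item response probabilities. A sketch with the 2PL model and a hypothetical three-item test:

```python
import math

def p_2pl(theta, a, b):
    """2PL item response function: P(correct | ability theta),
    with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def true_score(theta, items):
    """CTT true score implied by the IRT model: the expected observed
    score, i.e. the test characteristic curve evaluated at theta."""
    return sum(p_2pl(theta, a, b) for a, b in items)

# Hypothetical 3-item test: (a, b) pairs.
items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
t = true_score(0.0, items)  # expected score for an average examinee
```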
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
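A recurring misconception the article addresses is which quantities the regression assumptions apply to: the residuals, not the raw variables. A minimal sketch with hypothetical data, fitting simple OLS by hand and recovering the residuals one would then inspect:

```python
from statistics import mean

def ols(x, y):
    """Simple least-squares fit y = b0 + b1*x, returning residuals too."""
    mx, my = mean(x), mean(y)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
         sum((xi - mx) ** 2 for xi in x)
    b0 = my - b1 * mx
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return b0, b1, resid

# Hypothetical data; the distributional checks (normality,
# homoscedasticity) would be made on `resid`, not on x or y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, resid = ols(x, y)
```

By construction the residuals sum to zero and are uncorrelated with the predictor, which is why the assumption checks are graphical or distributional rather than mean-based.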
Guo, Hongwen; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2011
Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…
Descriptors: Testing Programs, Measurement, Item Analysis, Error of Measurement
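The estimator at issue above can be sketched as Nadaraya-Watson kernel regression of item responses on total score; using the error-prone observed score as the regressor is exactly what biases it. A minimal version with a Gaussian kernel and hypothetical data:

```python
import math

def kernel_irc(scores, responses, x, h=2.0):
    """Nadaraya-Watson estimate of an item response curve:
    P(item correct | total score = x), Gaussian kernel, bandwidth h.
    The observed scores used as the regressor contain measurement
    error, which is the source of bias discussed in the article."""
    weights = [math.exp(-((s - x) / h) ** 2 / 2) for s in scores]
    return sum(w * r for w, r in zip(weights, responses)) / sum(weights)

# Hypothetical data: examinee total scores and 0/1 responses to one item.
scores    = [2, 4, 5, 6, 8, 9, 11, 12]
responses = [0, 0, 1, 0, 1, 1, 1, 1]
p_hat = kernel_irc(scores, responses, x=7.0)
```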
Wu, Pei-Chen – Journal of Psychoeducational Assessment, 2010
This study examined measurement invariance (i.e., configural invariance, metric invariance, scalar invariance) of the Chinese version of Beck Depression Inventory II (BDI-II-C) across college males and females and compared gender differences on depression at the latent factor mean level. Two samples composed of 402 male college students and 595…
Descriptors: College Students, Females, Negative Attitudes, Construct Validity
Hamilton, Patti; Johnson, Robert; Poudrier, Chelsey – Teaching in Higher Education, 2010
In this paper, we argue that, as indicators of the educational quality of graduate degree programs, student theses and dissertations are best used in specific contexts. High-quality theses and dissertations, that is, may be the result of factors such as verbal skills students already possessed at admission or of complex interactions between…
Descriptors: Educational Quality, Doctoral Dissertations, Theses, Change Strategies
McKenzie, Robert G. – Learning Disability Quarterly, 2009
The assessment procedures within Response to Intervention (RTI) models have begun to supplant the use of traditional, discrepancy-based frameworks for identifying students with specific learning disabilities (SLD). Many RTI proponents applaud this shift because of perceived shortcomings in utilizing discrepancy as an indicator of SLD. However,…
Descriptors: Intervention, Learning Disabilities, Error of Measurement, Psychometrics
Bandalos, Deborah L. – Structural Equation Modeling: A Multidisciplinary Journal, 2008
This study examined the efficacy of 4 different parceling methods for modeling categorical data with 2, 3, and 4 categories and with normal, moderately nonnormal, and severely nonnormal distributions. The parceling methods investigated were isolated parceling in which items were parceled with other items sharing the same source of variance, and…
Descriptors: Structural Equation Models, Computation, Goodness of Fit, Classification
Setzer, J. Carl; He, Yi – GED Testing Service, 2009
Reliability Analysis for the Internationally Administered 2002 Series GED (General Educational Development) Tests. Reliability refers to the consistency, or stability, of test scores when the measurement procedure is administered repeatedly to groups of examinees (American Educational Research Association [AERA], American Psychological…
Descriptors: Educational Research, Error of Measurement, Scores, Test Reliability
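The reliability analyses referenced above typically report an internal-consistency coefficient. One common choice (not necessarily the one used in this report) is coefficient alpha, sketched here on a hypothetical 3-item, 4-examinee dataset:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Coefficient alpha from a matrix item_scores[item][person]:
    k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(item_scores)
    item_vars = sum(pvariance(item) for item in item_scores)
    totals = [sum(person) for person in zip(*item_scores)]  # per-examinee totals
    return k / (k - 1) * (1 - item_vars / pvariance(totals))

# Hypothetical 3-item test, 4 examinees (rows are items, 0/1 scoring).
items = [
    [1, 0, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
]
alpha = cronbach_alpha(items)
```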
French, Brian F.; Maller, Susan J. – Educational and Psychological Measurement, 2007
Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…
Descriptors: Effect Size, Test Bias, Guidelines, Error of Measurement
Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Psychological Methods, 2007
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level,…
Descriptors: Psychiatry, Patients, Error of Measurement, Test Length
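The classification question above can be sketched with classical test theory: the standard error of measurement grows as reliability falls, so short (less reliable) tests leave more doubt about which side of a cutoff an examinee's true score lies on. A simplification of the article's framework, treating measurement error as normal:

```python
import math

def sem(sd, reliability):
    """Standard error of measurement under classical test theory."""
    return sd * math.sqrt(1 - reliability)

def p_correct_classification(observed, cutoff, sd, reliability):
    """Rough probability that the examinee's true score falls on the
    same side of the cutoff as the observed score, assuming normally
    distributed error with standard deviation SEM (a simplification)."""
    z = abs(observed - cutoff) / sem(sd, reliability)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF

# Hypothetical short test: sd = 4 points, reliability .70, cutoff 20.
p = p_correct_classification(observed=23, cutoff=20, sd=4.0, reliability=0.70)
```

Raising reliability shrinks the SEM and raises the classification certainty, which is the trade-off the authors quantify for short tests.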
Linacre, John Michael – 1988
Simulations were performed to verify the accuracy with which the Mantel-Haenszel (MH) and Rasch PROX procedures recover simulated item bias. Several standard error estimators for the MH procedure were evaluated. Item bias is recovered satisfactorily by both techniques under all simulated conditions. The proposed MH standard error estimators have…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Analysis, Statistical Analysis
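The Mantel-Haenszel procedure evaluated above pools 2x2 (group x correct/incorrect) tables across ability strata into a common odds ratio. A minimal sketch with hypothetical tables (the standard error estimators compared in the paper are not reproduced here):

```python
import math

def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio over score strata.
    Each stratum is (A, B, C, D):
      A = reference correct, B = reference incorrect,
      C = focal correct,     D = focal incorrect."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical 2x2 tables for one item at three ability strata.
strata = [(30, 10, 25, 15), (20, 20, 15, 25), (10, 30, 5, 35)]
alpha_mh = mh_odds_ratio(strata)
# ETS delta metric; values near 0 indicate negligible DIF.
delta_mh = -2.35 * math.log(alpha_mh)
```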
Hartig, Johannes; Holzel, Britta; Moosbrugger, Helfried – Multivariate Behavioral Research, 2007
Numerous studies have shown increasing item reliabilities as an effect of the item position in personality scales. Traditionally, these context effects are analyzed based on item-total correlations. This approach neglects that trends in item reliabilities can be caused either by an increase in true score variance or by a decrease in error…
Descriptors: True Scores, Error of Measurement, Structural Equation Models, Simulation
Guhn, Martin; Gadermann, Anne; Zumbo, Bruno D. – Early Education and Development, 2007
The present study investigates whether the Early Development Instrument (Offord & Janus, 1999) measures school readiness similarly across different groups of children. We employ ordinal logistic regression to investigate differential item functioning, a method of examining measurement bias. For 40,000 children, our analysis compares groups…
Descriptors: School Readiness, Kindergarten, Child Development, Program Validation
Niemi, David; Wang, Jia; Wang, Haiwen; Vallone, Julia; Griffin, Noelle – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2007
There are usually many testing activities going on in a school, with different tests serving different purposes, so organization and planning are key to creating an efficient system for assessing the most important educational objectives. In the ideal case, an assessment system will be able to inform on student learning, instruction and…
Descriptors: School Administration, Educational Objectives, Administration, Public Schools