NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 12 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Gorney, Kylie; Wollack, James A.; Sinharay, Sandip; Eckerly, Carol – Journal of Educational and Behavioral Statistics, 2023
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item…
Descriptors: Scores, Test Validity, Test Items, Prior Learning
Peer reviewed Peer reviewed
Direct linkDirect link
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
Peer reviewed Peer reviewed
Direct linkDirect link
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Hou, Likun – ProQuest LLC, 2013
Analyzing examinees' responses using cognitive diagnostic models (CDMs) have the advantages of providing richer diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this dissertation, the model-based DIF detection method, Wald-CDM procedure is…
Descriptors: Test Bias, Models, Cognitive Processes, Diagnostic Tests
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…
Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias
Baker, Eva L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2010
This report provides an overview of what was known about alternative assessment at the time that the article was written in 1991. Topics include beliefs about assessment reform, overview of alternative assessment including research knowledge, evidence of assessment impact, and critical features of alternative assessment. The author notes that in…
Descriptors: Alternative Assessment, Evaluation Methods, Evaluation Research, Performance Based Assessment
Holland, John L. – 1974
This paper provides a general perspective for evaluating interest inventories and simulations and outlines some activities to stimulate the development of more useful inventories. Previous evaluations have been primarily instrument-specific; have relied generally on opinion rather than evidence; and have focused only on possible sex, age, race, or…
Descriptors: Career Guidance, Evaluation, Improvement, Interest Inventories
Peer reviewed Peer reviewed
Direct linkDirect link
Etherton, Joseph L.; Bianchini, Kevin J.; Ciota, Megan A.; Greve, Kevin W. – Assessment, 2005
Reliable Digit Span (RDS) is an indicator used to assess the validity of cognitive test performance. Scores of 7 or lower suggest poor effort or negative response bias. The possibility that RDS scores are also affected by pain has not been addressed thus potentially threatening RDS specificity. The current study used cold pressor-induced pain to…
Descriptors: Response Style (Tests), Simulation, Intelligence Tests, Pain
Pine, Steven M.; Weiss, David J. – 1978
This report examines how selection fairness is influenced by the characteristics of a selection instrument in terms of its distribution of item difficulties, level of item discrimination, degree of item bias, and testing strategy. Computer simulation was used in the administration of either a conventional or Bayesian adaptive ability test to a…
Descriptors: Adaptive Testing, Bayesian Statistics, Comparative Testing, Computer Assisted Testing
Merz, William R.; Grossen, Neal E. – 1978
Six approaches to assessing test item bias were examined: transformed item difficulty, point biserial correlations, chi-square, factor analysis, one parameter item characteristic curve, and three parameter item characteristic curve. Data sets for analysis were generated by a Monte Carlo technique based on the three parameter model; thus, four…
Descriptors: Difficulty Level, Evaluation Methods, Factor Analysis, Item Analysis
Pine, Steven M.; Weiss, David J. – 1976
This report examines how selection fairness is influenced by the item characteristics of a selection instrument in terms of its distribution of item difficulties, level of item discrimination, and degree of item bias. Computer simulation was used in the administration of conventional ability tests to a hypothetical target population consisting of…
Descriptors: Aptitude Tests, Bias, Computer Programs, Culture Fair Tests