Gorney, Kylie; Wollack, James A.; Sinharay, Sandip; Eckerly, Carol – Journal of Educational and Behavioral Statistics, 2023
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item…
Descriptors: Scores, Test Validity, Test Items, Prior Learning
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
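Under the Rasch model referenced in this abstract, the probability of a correct response is a logistic function of the difference between examinee ability and item difficulty. The sketch below simulates responses from a sample of 25 examinees (the small-sample size the abstract mentions) and forms a crude difficulty estimate; it is a minimal illustration, not the equating procedure the authors studied, and all parameter values are made up.

```python
import math
import random

def rasch_prob(theta, b):
    """Rasch model: P(correct) = 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

random.seed(1)
true_b = 0.5                                        # illustrative item difficulty
thetas = [random.gauss(0, 1) for _ in range(25)]    # 25 simulated examinees
responses = [random.random() < rasch_prob(t, true_b) for t in thetas]

# Crude difficulty estimate from the observed proportion correct,
# ignoring the spread of abilities -- a rough illustration only.
p = sum(responses) / len(responses)
b_hat = -math.log(p / (1 - p))
print(round(b_hat, 2))
```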
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
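Longford's method turns on declaring an explicit level of acceptable DIF and classifying items against it. A common concrete instance of this idea is the Mantel-Haenszel statistic on the ETS delta scale with A/B/C flagging; the sketch below shows that generic scheme only, not the loss-function machinery the paper develops, and the counts and thresholds are invented for illustration.

```python
import math

def mh_delta(strata):
    """Mantel-Haenszel DIF statistic from 2x2 tables stratified by total score.
    Each stratum is (ref_correct, ref_wrong, focal_correct, focal_wrong)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    alpha = num / den              # common odds ratio across strata
    return -2.35 * math.log(alpha)  # ETS delta scale

def classify(delta, accept=1.0, flag=1.5):
    """Three-way decision with an explicitly declared acceptable DIF level."""
    if abs(delta) < accept:
        return "A"  # negligible DIF
    if abs(delta) > flag:
        return "C"  # large DIF
    return "B"      # intermediate

strata = [(40, 10, 30, 20), (35, 15, 25, 25)]  # made-up counts
d = mh_delta(strata)
print(classify(d))  # prints "C"
```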
Hou, Likun – ProQuest LLC, 2013
Analyzing examinees' responses using cognitive diagnostic models (CDMs) has the advantage of providing richer diagnostic information. To ensure the validity of results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this dissertation, a model-based DIF detection method, the Wald-CDM procedure, is…
Descriptors: Test Bias, Models, Cognitive Processes, Diagnostic Tests
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…
Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias
Baker, Eva L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2010
This report provides an overview of what was known about alternative assessment at the time the article was written in 1991. Topics include beliefs about assessment reform, an overview of alternative assessment including research knowledge, evidence of assessment impact, and critical features of alternative assessment. The author notes that in…
Descriptors: Alternative Assessment, Evaluation Methods, Evaluation Research, Performance Based Assessment
Holland, John L. – 1974
This paper provides a general perspective for evaluating interest inventories and simulations and outlines some activities to stimulate the development of more useful inventories. Previous evaluations have been primarily instrument-specific; have relied generally on opinion rather than evidence; and have focused only on possible sex, age, race, or…
Descriptors: Career Guidance, Evaluation, Improvement, Interest Inventories
Etherton, Joseph L.; Bianchini, Kevin J.; Ciota, Megan A.; Greve, Kevin W. – Assessment, 2005
Reliable Digit Span (RDS) is an indicator used to assess the validity of cognitive test performance. Scores of 7 or lower suggest poor effort or negative response bias. The possibility that RDS scores are also affected by pain has not been addressed, potentially threatening RDS specificity. The current study used cold pressor-induced pain to…
Descriptors: Response Style (Tests), Simulation, Intelligence Tests, Pain
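The RDS score itself is simple to compute: the longest digit sequence repeated correctly on both trials of a span length, summed across the forward and backward conditions. The sketch below assumes a hypothetical data structure mapping span length to a pair of pass/fail trial results; it is not the study's scoring code.

```python
def reliable_digit_span(forward_trials, backward_trials):
    """Reliable Digit Span: longest span length passed on BOTH trials,
    forward plus backward. Each argument maps span length -> (trial1_ok, trial2_ok)."""
    def longest_reliable(trials):
        spans = [n for n, (t1, t2) in trials.items() if t1 and t2]
        return max(spans, default=0)
    return longest_reliable(forward_trials) + longest_reliable(backward_trials)

# Hypothetical examinee: reliable forward span of 4, reliable backward span of 2.
fwd = {3: (True, True), 4: (True, True), 5: (True, False)}
bwd = {2: (True, True), 3: (False, True)}
rds = reliable_digit_span(fwd, bwd)
print(rds)  # prints 6 -- at or below the cutoff of 7 noted in the abstract
```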
Pine, Steven M.; Weiss, David J. – 1978
This report examines how selection fairness is influenced by the characteristics of a selection instrument in terms of its distribution of item difficulties, level of item discrimination, degree of item bias, and testing strategy. Computer simulation was used in the administration of either a conventional or Bayesian adaptive ability test to a…
Descriptors: Adaptive Testing, Bayesian Statistics, Comparative Testing, Computer Assisted Testing
Merz, William R.; Grossen, Neal E. – 1978
Six approaches to assessing test item bias were examined: transformed item difficulty, point biserial correlations, chi-square, factor analysis, one parameter item characteristic curve, and three parameter item characteristic curve. Data sets for analysis were generated by a Monte Carlo technique based on the three parameter model; thus, four…
Descriptors: Difficulty Level, Evaluation Methods, Factor Analysis, Item Analysis
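The data-generation step described above — Monte Carlo simulation under the three-parameter logistic (3PL) model — can be sketched as follows. Item parameters, the scaling constant 1.7, and the sample size are illustrative choices, not the values Merz and Grossen used.

```python
import math
import random

def p_3pl(theta, a, b, c):
    """Three-parameter logistic item characteristic curve:
    c + (1 - c) / (1 + exp(-1.7 * a * (theta - b)))."""
    return c + (1 - c) / (1 + math.exp(-1.7 * a * (theta - b)))

random.seed(0)
items = [(1.2, 0.0, 0.20), (0.8, -0.5, 0.25)]       # (a, b, c), illustrative
examinees = [random.gauss(0, 1) for _ in range(100)]  # abilities ~ N(0, 1)

# One 0/1 response per examinee per item, drawn from the 3PL probabilities.
data = [[int(random.random() < p_3pl(th, *itm)) for itm in items]
        for th in examinees]
```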
Pine, Steven M.; Weiss, David J. – 1976
This report examines how selection fairness is influenced by the item characteristics of a selection instrument in terms of its distribution of item difficulties, level of item discrimination, and degree of item bias. Computer simulation was used in the administration of conventional ability tests to a hypothetical target population consisting of…
Descriptors: Aptitude Tests, Bias, Computer Programs, Culture Fair Tests