Showing all 10 results
Parry, James R. – Online Submission, 2020
This paper presents research and provides a method to ensure that parallel assessments generated from a large test-item database maintain equitable difficulty and content coverage each time the assessment is presented. To maintain fairness and validity, it is important that all instances of an assessment that is intended to test the…
Descriptors: Culture Fair Tests, Difficulty Level, Test Items, Test Validity
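The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible approach: draw each parallel form by sampling within content strata, and accept the draw only if its mean difficulty lands in a tolerance band around a target. All names and numbers here are hypothetical, not the paper's actual method.

import random

# Hypothetical item bank: (item_id, content_area, difficulty in [0, 1])
bank = [(i, random.choice(["algebra", "geometry"]), random.uniform(0.2, 0.8))
        for i in range(200)]

def draw_form(bank, blueprint, target_mean, tol=0.02, max_tries=1000):
    """Sample blueprint[area] items per content area; accept the form
    only if its mean difficulty is within tol of target_mean (an
    assumed acceptance rule)."""
    for _ in range(max_tries):
        form = []
        for area, n in blueprint.items():
            pool = [item for item in bank if item[1] == area]
            form.extend(random.sample(pool, n))
        if abs(sum(it[2] for it in form) / len(form) - target_mean) <= tol:
            return form
    raise RuntimeError("no acceptable form found within max_tries")

form = draw_form(bank, {"algebra": 10, "geometry": 10}, target_mean=0.5)
print(len(form), sum(it[2] for it in form) / len(form))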
Peer reviewed
Direct link
Federer, Meghan Rector; Nehm, Ross H.; Pearl, Dennis K. – CBE - Life Sciences Education, 2016
Understanding sources of performance bias in science assessment provides important insights into whether science curricula and/or assessments are valid representations of student abilities. Research investigating assessment bias due to factors such as instrument structure, participant characteristics, and item types is well documented across a…
Descriptors: Gender Differences, Biology, Science Instruction, Case Studies
Peer reviewed
PDF on ERIC: Download full text
Colwell, Nicole Makas – Journal of Education and Training Studies, 2013
This paper highlights current findings and issues regarding the role of computer-adaptive testing in test anxiety. The computer-adaptive test (CAT) proposed by one of the Common Core consortia brings these issues to the forefront. Research has long indicated that test anxiety impairs student performance. More recent research indicates that…
Descriptors: Test Anxiety, Computer Assisted Testing, Evaluation Methods, Standardized Tests
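For readers unfamiliar with the mechanism under discussion, a bare-bones adaptive loop looks roughly like this; operational CATs use IRT-based scoring and exposure controls, so this sketch is illustrative only.

def run_cat(items, answer, n_items=5):
    """items: list of item difficulties; answer(difficulty) -> True/False.
    Pick the unused item closest to the current ability estimate, then
    move the estimate up after a correct response and down after an error."""
    theta, step = 0.0, 1.0
    used = set()
    for _ in range(n_items):
        i = min((j for j in range(len(items)) if j not in used),
                key=lambda j: abs(items[j] - theta))
        used.add(i)
        theta += step if answer(items[i]) else -step
        step /= 2  # shrink steps so the estimate settles
    return theta

# Simulated examinee of true ability 0.7 (hypothetical data)
est = run_cat([-2, -1, 0, 1, 2, 0.5, 1.5], lambda b: 0.7 > b)
print(round(est, 2))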
Peer reviewed
Direct link
Sandilands, Debra; Oliveri, Maria Elena; Zumbo, Bruno D.; Ercikan, Kadriye – International Journal of Testing, 2013
International large-scale assessments of achievement often have a large degree of differential item functioning (DIF) between countries, which can threaten score equivalence and reduce the validity of inferences based on comparisons of group performances. It is important to understand potential sources of DIF to improve the validity of future…
Descriptors: Validity, Measures (Individuals), International Studies, Foreign Countries
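A common screen for DIF of the kind studied here is the Mantel-Haenszel common odds ratio, computed across total-score strata. The sketch below is illustrative and is not the authors' analysis; the toy data are made up.

def mantel_haenszel_dif(data):
    """data: list of (total_score, group, correct) with group in
    {'ref', 'focal'} and correct in {0, 1}. Returns the common odds
    ratio; values far from 1 suggest DIF on the item."""
    strata = {}
    for score, group, correct in data:
        cell = strata.setdefault(score, {"ref": [0, 0], "focal": [0, 0]})
        cell[group][correct] += 1  # index 0 = wrong, 1 = right
    num = den = 0.0
    for cell in strata.values():
        n = sum(cell["ref"]) + sum(cell["focal"])
        if n == 0:
            continue
        num += cell["ref"][1] * cell["focal"][0] / n
        den += cell["ref"][0] * cell["focal"][1] / n
    return num / den if den else float("inf")

toy = [(1, "ref", 1), (1, "focal", 0), (1, "ref", 0), (1, "focal", 1),
       (2, "ref", 1), (2, "focal", 0), (2, "ref", 1), (2, "focal", 1)]
print(mantel_haenszel_dif(toy))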
Stone, Elizabeth; Davey, Tim – Educational Testing Service, 2011
There has been increased interest in developing computer-adaptive testing (CAT) and multistage assessments for K-12 accountability assessments. The move to adaptive testing has been met with some resistance from those in the field of special education who express concern about the routing of students with divergent profiles (e.g., some students with…
Descriptors: Disabilities, Adaptive Testing, Accountability, Computer Assisted Testing
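The routing at issue can be pictured with a two-stage sketch in which a stage-1 number-correct score sends the student to an easier, on-level, or harder stage-2 module. The cutoffs below are assumptions for illustration, not any operational design.

def route(stage1_correct, n_stage1=10):
    """Route to a stage-2 module based on the stage-1 number-correct
    score (real multistage designs use IRT scoring, not raw cutoffs)."""
    frac = stage1_correct / n_stage1
    if frac < 0.4:
        return "easy module"
    if frac < 0.7:
        return "on-level module"
    return "hard module"

for score in (3, 6, 9):
    print(score, "->", route(score))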
Peer reviewed
Direct link
Camilli, Gregory; Prowker, Adam; Dossey, John A.; Lindquist, Mary M.; Chiu, Ting-Wei; Vargas, Sadako; de la Torre, Jimmy – Journal of Educational Measurement, 2008
A new method for analyzing differential item functioning is proposed to investigate the relative strengths and weaknesses of multiple groups of examinees. Accordingly, the notion of a conditional measure of difference between two groups (Reference and Focal) is generalized to a conditional variance. The objective of this article is to present and…
Descriptors: Test Bias, National Competency Tests, Grade 4, Difficulty Level
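The abstract's central idea, generalizing a two-group conditional difference to a conditional variance over many groups, can be sketched as follows: at each total score, take the variance of the groups' proportions correct on the item, then average the strata weighted by their sizes. This is one reading of the abstract, not the authors' exact estimator.

def conditional_variance(data):
    """data: list of (total_score, group, correct) with correct in {0, 1}."""
    strata = {}
    for score, group, correct in data:
        strata.setdefault(score, {}).setdefault(group, []).append(correct)
    total_n, acc = 0, 0.0
    for cell in strata.values():
        props = [sum(v) / len(v) for v in cell.values()]  # per-group p-values
        mean = sum(props) / len(props)
        var = sum((p - mean) ** 2 for p in props) / len(props)
        n = sum(len(v) for v in cell.values())
        acc += n * var  # weight each stratum by its size
        total_n += n
    return acc / total_n

toy = [(1, "A", 1), (1, "B", 0), (1, "C", 1),
       (2, "A", 1), (2, "B", 1), (2, "C", 0)]
print(round(conditional_variance(toy), 3))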
Peer reviewed
PDF on ERIC: Download full text
Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006
It is a widely held belief that anchor tests should be miniature versions (i.e., minitests) of the tests being equated with respect to content and statistical characteristics. This paper examines the foundations for this belief. It examines the requirement of statistical representativeness of anchor tests that are content representative. The…
Descriptors: Test Items, Equated Scores, Evaluation Methods, Difficulty Level
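For context on what an anchor test does in equating, here is a textbook-style chained linear link, not the report's analysis: form X scores are mapped onto the anchor scale and then onto form Y. All data are made up.

from statistics import mean, stdev

def linear_link(x_scores, y_scores):
    """Return f with f(x) matched to the mean and SD of y_scores."""
    mx, sx = mean(x_scores), stdev(x_scores)
    my, sy = mean(y_scores), stdev(y_scores)
    return lambda x: my + sy * (x - mx) / sx

# Hypothetical data: each group takes its own form plus the common anchor
x_form, x_anchor = [10, 12, 15, 18, 20], [5, 6, 7, 9, 10]
y_form, y_anchor = [22, 25, 27, 30, 33], [4, 6, 7, 8, 11]

x_to_anchor = linear_link(x_form, x_anchor)
anchor_to_y = linear_link(y_anchor, y_form)
equated = lambda x: anchor_to_y(x_to_anchor(x))  # chained X -> anchor -> Y
print(round(equated(15), 2))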
Peer reviewed
Stricker, Lawrence J. – Educational and Psychological Measurement, 1984
The stability of a partial correlation index, comparisons of item characteristic curves, and comparisons of item difficulties was evaluated in assessing race and sex differences in performance on verbal items of the Graduate Record Examination Aptitude Test. All three indexes exhibited consistency in identifying the same items in different…
Descriptors: College Entrance Examinations, Comparative Analysis, Correlation, Difficulty Level
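One of the three indexes, comparison of item difficulties, is conventionally done on ETS's delta scale. The transformation itself is standard (13 plus 4 times the inverse-normal of the proportion wrong); the per-group data below are hypothetical.

from statistics import NormalDist

def delta(p_correct):
    """ETS delta: harder items (lower p_correct) get larger deltas."""
    return 13 + 4 * NormalDist().inv_cdf(1 - p_correct)

# Hypothetical per-group proportions correct on three items
group1 = [0.80, 0.60, 0.35]
group2 = [0.75, 0.50, 0.30]
for p1, p2 in zip(group1, group2):
    print(round(delta(p1), 2), round(delta(p2), 2))

In a delta plot, items falling well off the trend line of the two groups' deltas are flagged as potentially biased.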
Merz, William R.; Grossen, Neal E. – 1978
Six approaches to assessing test item bias were examined: transformed item difficulty, point-biserial correlations, chi-square, factor analysis, the one-parameter item characteristic curve, and the three-parameter item characteristic curve. Data sets for analysis were generated by a Monte Carlo technique based on the three-parameter model; thus, four…
Descriptors: Difficulty Level, Evaluation Methods, Factor Analysis, Item Analysis
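The Monte Carlo step mentioned here amounts to drawing 0/1 responses from a three-parameter logistic model. The formula is the standard 3PL; the item parameters below are hypothetical.

import math, random

def p_correct(theta, a, b, c):
    """Three-parameter logistic model: guessing floor c, then a logistic
    rise with discrimination a around difficulty b."""
    return c + (1 - c) / (1 + math.exp(-1.7 * a * (theta - b)))

def simulate(theta, items):
    """One examinee's 0/1 responses to a list of (a, b, c) items."""
    return [int(random.random() < p_correct(theta, *it)) for it in items]

random.seed(0)
items = [(1.0, -0.5, 0.2), (1.2, 0.0, 0.25), (0.8, 1.0, 0.2)]
print(simulate(theta=0.5, items=items))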
Merz, William R.; Rudner, Lawrence M. – 1978
Terms related to test bias and test fairness have been used in a variety of ways, but in this document the "fair use of tests" is defined as equitable selection procedures by means of intact tests, and "test item bias" refers to the study of separate items with respect to the tests of which they are a part. Seven…
Descriptors: Analysis of Covariance, Analysis of Variance, Difficulty Level, Evaluation Criteria