Showing all 10 results
Peer reviewed
Andrew P. Jaciw – American Journal of Evaluation, 2025
By design, randomized experiments (XPs) rule out bias from confounded selection of participants into conditions. Quasi-experiments (QEs) are often considered second-best because they do not share this benefit. However, when results from XPs are used to generalize causal impacts, the benefit from unconfounded selection into conditions may be offset…
Descriptors: Elementary School Students, Elementary School Teachers, Generalization, Test Bias
Peer reviewed
Xuelan Qiu; Jimmy de la Torre; You-Gan Wang; Jinran Wu – Educational Measurement: Issues and Practice, 2024
Multidimensional forced-choice (MFC) items have been found to be useful to reduce response biases in personality assessments. However, conventional scoring methods for the MFC items result in ipsative data, hindering the wider applications of the MFC format. In the last decade, a number of item response theory (IRT) models have been developed,…
Descriptors: Item Response Theory, Personality Traits, Personality Measures, Personality Assessment
Peer reviewed
Wiliam, Dylan – Assessment in Education: Principles, Policy & Practice, 2008
While international comparisons such as those provided by PISA may be meaningful in terms of overall judgements about the performance of educational systems, caution is needed in terms of more fine-grained judgements. In particular, it is argued that using the results of PISA to draw conclusions about the quality of instruction in different systems is…
Descriptors: Test Bias, Test Construction, Comparative Testing, Evaluation
Peer reviewed
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
Pine, Steven M.; Weiss, David J. – 1978
This report examines how selection fairness is influenced by the characteristics of a selection instrument in terms of its distribution of item difficulties, level of item discrimination, degree of item bias, and testing strategy. Computer simulation was used in the administration of either a conventional or Bayesian adaptive ability test to a…
Descriptors: Adaptive Testing, Bayesian Statistics, Comparative Testing, Computer Assisted Testing
McManus, Barbara Luger – 1992
This paper discusses whether revisions of the Scholastic Aptitude Test (SAT) and the American College Test (ACT) have created such significant differences between the two tests that a student could conceivably score significantly higher on one than on the other. The SAT has been revised to meet the needs of an increasingly diverse student…
Descriptors: Ability, Achievement Tests, Aptitude Tests, College Entrance Examinations
Steele, D. Joyce – 1985
This paper contains a comparison of descriptive information based on analyses of pilot and live administrations of the Alabama High School Graduation Examination (AHSGE). The test is composed of three subject tests: Reading, Mathematics, and Language. The study was intended to validate the test development procedure by comparing difficulty levels…
Descriptors: Achievement Tests, Comparative Testing, Difficulty Level, Graduation Requirements
Peer reviewed
Armstrong, Anne-Marie – Educational Measurement: Issues and Practice, 1993
The effects on test performance of differentially written multiple-choice tests and of test takers' cognitive style were studied for 47 graduate students and 35 public school and college teachers. Adhering to item-writing guidelines resulted in mean scores that were essentially the same for two groups of differing cognitive style. (SLD)
Descriptors: Cognitive Style, College Faculty, Comparative Testing, Graduate Students
Macpherson, Colin R.; Rowley, Glenn L. – 1986
Teacher-made mastery tests were administered to a classroom-sized sample to study their decision consistency. Decision consistency of criterion-referenced tests is usually defined as the proportion of examinees who are classified in the same way after two test administrations. Single-administration estimates of decision consistency were…
Descriptors: Classroom Research, Comparative Testing, Criterion Referenced Tests, Cutting Scores
Coffman, William E. – 1978
The Iowa Tests of Basic Skills were administered to over 600 black and white students in grades six through nine to determine whether the test showed bias against minorities. Outliers were identified from test results. Outliers are items that differ from the central core of test items because they fall outside the range expected from a random…
Descriptors: Achievement Tests, Basic Skills, Black Students, Comparative Testing