ERIC - Search Results

Showing all 10 results

Hold the Bets! Should Quasi-Experiments Be Preferred to True Experiments When Causal Generalization Is the Goal?

Peer reviewed

Andrew P. Jaciw – American Journal of Evaluation, 2025

By design, randomized experiments (XPs) rule out bias from confounded selection of participants into conditions. Quasi-experiments (QEs) are often considered second-best because they do not share this benefit. However, when results from XPs are used to generalize causal impacts, the benefit from unconfounded selection into conditions may be offset…

Descriptors: Elementary School Students, Elementary School Teachers, Generalization, Test Bias

Item Response Theory Models for Polytomous Multidimensional Forced-Choice Items to Measure Construct Differentiation

Peer reviewed

Xuelan Qiu; Jimmy de la Torre; You-Gan Wang; Jinran Wu – Educational Measurement: Issues and Practice, 2024

Multidimensional forced-choice (MFC) items have been found to be useful to reduce response biases in personality assessments. However, conventional scoring methods for the MFC items result in ipsative data, hindering the wider applications of the MFC format. In the last decade, a number of item response theory (IRT) models have been developed,…

Descriptors: Item Response Theory, Personality Traits, Personality Measures, Personality Assessment

International Comparisons and Sensitivity to Instruction

Peer reviewed

Wiliam, Dylan – Assessment in Education: Principles, Policy & Practice, 2008

While international comparisons such as those provided by PISA may be meaningful in terms of overall judgements about the performance of educational systems, caution is needed in terms of more fine-grained judgements. In particular, it is argued that using the results of PISA to draw conclusions about the quality of instruction in different systems is…

Descriptors: Test Bias, Test Construction, Comparative Testing, Evaluation

Comparisons among Designs for Equating Mixed-Format Tests in Large-Scale Assessments

Peer reviewed

Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010

In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…

Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias

A Comparison of the Fairness of Adaptive and Conventional Testing Strategies. Research Report 78-1.

Pine, Steven M.; Weiss, David J. – 1978

This report examines how selection fairness is influenced by the characteristics of a selection instrument in terms of its distribution of item difficulties, level of item discrimination, degree of item bias, and testing strategy. Computer simulation was used in the administration of either a conventional or Bayesian adaptive ability test to a…

Descriptors: Adaptive Testing, Bayesian Statistics, Comparative Testing, Computer Assisted Testing

The Revised SAT's and the ACT's--Are They Really Different?

McManus, Barbara Luger – 1992

This paper discusses whether or not revisions of the Scholastic Aptitude Test (SAT) and the American College Test (ACT) have created such significant differences between the two tests that a student could conceivably score significantly higher on one than the other. The SAT has been revised to meet the needs of an increasingly diverse student…

Descriptors: Ability, Achievement Tests, Aptitude Tests, College Entrance Examinations

A Descriptive Comparison of Test Items Utilized in Pilot and Live Administrations of the Alabama High School Graduation Examination.

Steele, D. Joyce – 1985

This paper contains a comparison of descriptive information based on analyses of pilot and live administrations of the Alabama High School Graduation Examination (AHSGE). The test is composed of three subject tests: Reading, Mathematics, and Language. The study was intended to validate the test development procedure by comparing difficulty levels…

Descriptors: Achievement Tests, Comparative Testing, Difficulty Level, Graduation Requirements

Cognitive-Style Differences in Testing Situations.

Peer reviewed

Armstrong, Anne-Marie – Educational Measurement: Issues and Practice, 1993

The effects on test performance of differentially written multiple-choice tests and test takers' cognitive style were studied for 47 graduate students and 35 public school and college teachers. Adhering to item-writing guidelines resulted in mean scores that were basically the same for two groups of differing cognitive style. (SLD)

Descriptors: Cognitive Style, College Faculty, Comparative Testing, Graduate Students

An Empirical Study of the Properties of Two Estimates of Decision-Consistency Used with Two Types of Teacher-Constructed Classroom Tests.

Macpherson, Colin R.; Rowley, Glenn L. – 1986

Teacher-made mastery tests were administered to a classroom-sized sample to study their decision consistency. Decision consistency of criterion-referenced tests is usually defined in terms of the proportion of examinees who are classified in the same way after two test administrations. Single-administration estimates of decision consistency were…

Descriptors: Classroom Research, Comparative Testing, Criterion Referenced Tests, Cutting Scores

An Exploratory Study of Group Differences in the Performance of Pupils in Grades 6, 7, 8, and 9 on the Items in the Iowa Tests of Basic Skills.

Coffman, William E. – 1978

The Iowa Tests of Basic Skills were administered to over 600 black and white students in grades six through nine, to determine if the test showed bias against minorities. Outliers were identified from test results. Outliers are items which differ from the central core of test items because they fall outside the range expected from a random…

Descriptors: Achievement Tests, Basic Skills, Black Students, Comparative Testing