ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	3
Since 2006 (last 20 years)	5

Descriptor

Item Analysis	8
Test Items	7
Psychometrics	4
Test Construction	4
Foreign Countries	3
Item Response Theory	3
Multiple Choice Tests	3
Achievement Tests	2
Correlation	2
Difficulty Level	2
Evaluation Methods	2
Mathematics Tests	2
Models	2
Reading Tests	2
Accuracy	1
Algebra	1
Cluster Grouping	1
Cognitive Processes	1
Cognitive Structures	1
College Entrance Examinations	1
College Graduates	1
Comparative Analysis	1
Computer Assisted Testing	1
Diseases	1
Educational Assessment	1

Source

Applied Measurement in…	2
Alberta Journal of…	1
College Board	1
Educational Measurement:…	1
Journal of Educational…	1
Large-scale Assessments in…	1
Review of Educational Research	1

Author

Gierl, Mark J.	8
Leighton, Jacqueline P.	3
Bulut, Okan	2
Boulais, André-Philippe	1
De Champlain, André	1
Ercikan, Kadriye	1
Gokiert, Rebecca	1
Guo, Qi	1
Hunka, Stephen M.	1
Klinger, Don A.	1
Koh, Kim	1
Lai, Hollis	1
McCreith, Tanya	1
Pugh, Debra	1
Puhan, Gautam	1
Quo, Qi	1
Rogers, W. Todd	1
Tan, Adele	1
Tan, Xuan	1
Touchie, Claire	1
Wang, Changjiang	1
Zhang, Xinxin	1
Zhou, Jiawen	1

Publication Type

Journal Articles	7
Reports - Research	6
Guides - Classroom - Learner	1
Information Analyses	1
Numerical/Quantitative Data	1
Reports - Evaluative	1

Education Level

Higher Education	2
Postsecondary Education	2
Elementary Secondary Education	1
High Schools	1
Secondary Education	1

Location

Canada	2
New York	1

Assessments and Surveys

SAT (College Admission Test)

Showing all 8 results

Developing, Analyzing, and Using Distractors for Multiple-Choice Tests in Education: A Comprehensive Review

Peer reviewed

Gierl, Mark J.; Bulut, Okan; Guo, Qi; Zhang, Xinxin – Review of Educational Research, 2017

Multiple-choice testing is considered one of the most effective and enduring forms of educational assessment that remains in practice today. This study presents a comprehensive review of the literature on multiple-choice testing in education focused, specifically, on the development, analysis, and use of the incorrect options, which are also…

Descriptors: Multiple Choice Tests, Difficulty Level, Accuracy, Error Patterns

A Structural Equation Modeling Approach for Examining Position Effects in Large-Scale Assessments

Peer reviewed

Bulut, Okan; Guo, Qi; Gierl, Mark J. – Large-scale Assessments in Education, 2017

Position effects may occur in both paper-pencil tests and computerized assessments when examinees respond to the same test items located in different positions on the test. To examine position effects in large-scale assessments, previous studies often used multilevel item response models within the generalized linear mixed modeling framework.…

Descriptors: Structural Equation Models, Educational Assessment, Measurement, Test Items

Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items

Peer reviewed

Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André – Applied Measurement in Education, 2016

Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…

Descriptors: Psychometrics, Multiple Choice Tests, Test Items, Item Analysis

Validating Cognitive Models of Task Performance in Algebra on the SAT®. Research Report No. 2009-3

Gierl, Mark J.; Leighton, Jacqueline P.; Wang, Changjiang; Zhou, Jiawen; Gokiert, Rebecca; Tan, Adele – College Board, 2009

The purpose of the study is to present research focused on validating the four algebra cognitive models in Gierl, Wang, et al., using student response data collected with protocol analysis methods to evaluate the knowledge structures and processing skills used by a sample of SAT test takers.

Descriptors: Algebra, Mathematics Tests, College Entrance Examinations, Student Attitudes

Exploring the Logic of Tatsuoka's Rule-Space Model for Test Development and Analysis. An NCME Instructional Module.

Peer reviewed

Gierl, Mark J.; Leighton, Jacqueline P.; Hunka, Stephen M. – Educational Measurement: Issues and Practice, 2000

Discusses the logic of the rule-space model (K. Tatsuoka, 1983) as it applies to test development and analysis. The rule-space model is a statistical method for classifying examinees' test item responses into a set of attribute-mastery patterns associated with different cognitive skills. Directs readers to a tutorial that may be downloaded. (SLD)

Descriptors: Item Analysis, Item Response Theory, Test Construction, Test Items

Using Statistical and Judgmental Reviews To Identify and Interpret Translation Differential Item Functioning.

Peer reviewed

Gierl, Mark J.; Rogers, W. Todd; Klinger, Don A. – Alberta Journal of Educational Research, 1999

Evaluates the equivalence of translated achievement tests administered to 4,400 English- and French-speaking sixth-graders. Items displaying differential item functioning were flagged using three statistical methods; results were relatively consistent across methods, but not identical. Substantive review of French items via back-translation to…

Descriptors: Achievement Tests, Evaluation Methods, Evaluation Research, Foreign Countries

Evaluating DETECT Classification Accuracy and Consistency when Data Display Complex Structure

Peer reviewed

Gierl, Mark J.; Leighton, Jacqueline P.; Tan, Xuan – Journal of Educational Measurement, 2006

DETECT, the acronym for Dimensionality Evaluation To Enumerate Contributing Traits, is an innovative and relatively new nonparametric dimensionality assessment procedure used to identify mutually exclusive, dimensionally homogeneous clusters of items using a genetic algorithm (Zhang & Stout, 1999). Because the clusters of items are mutually…

Descriptors: Program Evaluation, Cluster Grouping, Evaluation Methods, Multivariate Analysis

Comparability of Bilingual Versions of Assessments: Sources of Incomparability of English and French Versions of Canada's National Achievement Tests

Peer reviewed

Ercikan, Kadriye; Gierl, Mark J.; McCreith, Tanya; Puhan, Gautam; Koh, Kim – Applied Measurement in Education, 2004

This research examined the degree of comparability and sources of incomparability of English and French versions of reading, mathematics, and science tests that were administered as part of a survey of achievement in Canada. The results point to substantial psychometric differences between the 2 language versions. Approximately 18% to 36% of the…

Descriptors: Foreign Countries, Psychometrics, Science Tests, French