| Publication Date | Results |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 3 |
| Since 2007 (last 20 years) | 29 |

| Author | Results |
| --- | --- |
| Wise, Steven L. | 3 |
| Badger, Elizabeth | 2 |
| Bridgeman, Brent | 2 |
| Clarke, S. C. T. | 2 |
| Clauser, Brian E. | 2 |
| De Ayala, R. J. | 2 |
| Ellis, Barbara B. | 2 |
| Hughes, Carolyn | 2 |
| Lissitz, Robert W. | 2 |
| Little, Todd D. | 2 |
| Nandakumar, Ratna | 2 |

| Education Level | Results |
| --- | --- |
| Higher Education | 11 |
| Elementary Secondary Education | 10 |
| Postsecondary Education | 7 |
| Elementary Education | 6 |
| Grade 8 | 5 |
| Grade 4 | 4 |
| Secondary Education | 4 |
| Grade 3 | 3 |
| Early Childhood Education | 2 |
| Grade 5 | 1 |
| Grade 7 | 1 |

| Audience | Results |
| --- | --- |
| Researchers | 6 |
| Practitioners | 1 |
| Teachers | 1 |

| Location | Results |
| --- | --- |
| United States | 7 |
| Canada | 6 |
| Germany | 3 |
| Israel | 3 |
| Australia | 2 |
| China | 2 |
| South Africa | 2 |
| United Kingdom (England) | 2 |
| Alabama | 1 |
| Canada (Edmonton) | 1 |
| France | 1 |

| Laws, Policies, & Programs | Results |
| --- | --- |
| No Child Left Behind Act 2001 | 1 |

National Center for Education Statistics, 2013
The 2011 NAEP-TIMSS linking study conducted by the National Center for Education Statistics (NCES) was designed to predict Trends in International Mathematics and Science Study (TIMSS) scores for the U.S. states that participated in the 2011 National Assessment of Educational Progress (NAEP) mathematics and science assessment of eighth-grade students.…
Descriptors: Grade 8, Research Methodology, Research Design, Trend Analysis
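
The snippet does not say which linking method NCES used, but statistical projection is one standard approach: fit a regression of TIMSS on NAEP for jurisdictions with both scores, then predict TIMSS for NAEP-only states. A minimal sketch with invented numbers:

```python
import numpy as np

# Toy illustration of score projection: fit TIMSS ~ NAEP on states that
# took both assessments, then predict TIMSS for NAEP-only states.
# All numbers below are invented for illustration.
naep_both = np.array([278.0, 285.0, 291.0, 270.0, 299.0])   # states with both scores
timss_both = np.array([505.0, 520.0, 531.0, 492.0, 545.0])

slope, intercept = np.polyfit(naep_both, timss_both, 1)      # least-squares line

naep_only = np.array([282.0, 294.0])                         # states with NAEP only
predicted_timss = intercept + slope * naep_only
print(predicted_timss)
```
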
Foy, Pierre, Ed.; Arora, Alka, Ed.; Stanco, Gabrielle M., Ed. – International Association for the Evaluation of Educational Achievement, 2013
The TIMSS 2011 International Database includes data for all questionnaires administered as part of the TIMSS 2011 assessment. This supplement contains the international version of the TIMSS 2011 background questionnaires and curriculum questionnaires in the following 10 sections: (1) Fourth Grade Student Questionnaire; (2) Fourth Grade Home…
Descriptors: Background, Questionnaires, Test Items, Grade 4
Sparfeldt, Jörn R.; Kimmel, Rumena; Löwenkamp, Lena; Steingräber, Antje; Rost, Detlef H. – Educational Assessment, 2012
Multiple-choice (MC) reading comprehension test items comprise three components: text passage, questions about the text, and MC answers. The construct validity of this format has been repeatedly criticized. In three between-subjects experiments, fourth graders (N₁ = 230, N₂ = 340, N₃ = 194) worked on three…
Descriptors: Test Items, Reading Comprehension, Construct Validity, Grade 4
Sinharay, Sandip; Holland, Paul W. – Journal of Educational Measurement, 2007
It is a widely held belief that anchor tests should be miniature versions (i.e., "minitests"), with respect to content and statistical characteristics, of the tests being equated. This article examines the foundations for this belief regarding statistical characteristics. It examines the requirement of statistical representativeness of…
Descriptors: Test Items, Comparative Testing
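
A quick way to check the "minitest" property the abstract questions is to compare the anchor's classical difficulty profile against the full test's. A sketch assuming a 0/1 scored response matrix and a hypothetical anchor:

```python
import numpy as np

# Compare the difficulty profile of an anchor to the full test using
# classical p-values (proportion correct). Invented 0/1 response data.
rng = np.random.default_rng(0)
responses = rng.integers(0, 2, size=(200, 40))   # 200 examinees x 40 items
anchor_items = range(0, 40, 4)                   # hypothetical 10-item anchor

p_all = responses.mean(axis=0)
p_anchor = p_all[list(anchor_items)]

# A "minitest" anchor should roughly match both moments of the full test.
print(f"full test: mean p = {p_all.mean():.3f}, sd = {p_all.std():.3f}")
print(f"anchor:    mean p = {p_anchor.mean():.3f}, sd = {p_anchor.std():.3f}")
```
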
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
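
For context, one textbook method for the nonequivalent-groups anchor test (NEAT) design mentioned here is chained linear equating; the abstract does not say whether it is among the variations the authors compared. A sketch with invented summary statistics:

```python
def chained_linear_equate(x, mx_x, sx_x, mx_v1, sx_v1, my_v2, sy_v2, my_y, sy_y):
    """Chained linear equating for a NEAT design.

    Link form X to the anchor V in population 1, then the anchor to
    form Y in population 2. All means/SDs below are hypothetical.
    """
    v = mx_v1 + (sx_v1 / sx_x) * (x - mx_x)      # X score -> anchor scale (pop 1)
    return my_y + (sy_y / sy_v2) * (v - my_v2)   # anchor scale -> Y score (pop 2)

# Invented summary statistics for illustration.
print(chained_linear_equate(x=30, mx_x=28, sx_x=6, mx_v1=14, sx_v1=3,
                            my_v2=13, sy_v2=3.2, my_y=27, sy_y=6.5))
```
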
Wang, Jianjun – School Science and Mathematics, 2011
As the largest international study ever conducted, the Trends in International Mathematics and Science Study (TIMSS) has been held up as a benchmark for measuring U.S. student performance in a global context. In-depth analyses of the TIMSS project are conducted in this study to examine key issues of the comparative investigation: (1) item flaws in mathematics…
Descriptors: Test Items, Figurative Language, Item Response Theory, Benchmarking
Lissitz, Robert W.; Hou, Xiaodong; Slater, Sharon Cadman – Journal of Applied Testing Technology, 2012
This article investigates several questions regarding the impact of different item formats on measurement characteristics. Constructed response (CR) items and multiple choice (MC) items obviously differ in their formats and in the resources needed to score them. As such, they have been the subject of considerable discussion regarding the impact of…
Descriptors: Computer Assisted Testing, Scoring, Evaluation Problems, Psychometrics
Schulz, Wolfram; Fraillon, Julian – Educational Research and Evaluation, 2011
When comparing data derived from tests or questionnaires in cross-national studies, researchers commonly assume measurement invariance in their underlying scaling models. However, different cultural contexts, languages, and curricula can have powerful effects on how students respond in different countries. This article illustrates how the…
Descriptors: Citizenship Education, International Studies, Item Response Theory, International Education
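
As a crude stand-in for the article's IRT-based analysis, one can screen for non-invariant items by removing overall country and item effects from a country-by-item difficulty matrix and flagging large residuals. Purely illustrative, with invented data and an arbitrary threshold:

```python
import numpy as np

# Flag potentially non-invariant items: after removing each country's
# overall level, an item whose country-specific difficulty departs far
# from the pooled item difficulty is a candidate for country-level DIF.
# p[c, i] = proportion correct for country c on item i (invented data).
rng = np.random.default_rng(1)
p = rng.uniform(0.3, 0.9, size=(6, 20))

country_effect = p.mean(axis=1, keepdims=True)   # each country's mean level
item_effect = p.mean(axis=0, keepdims=True)      # pooled item difficulty
residual = p - country_effect - item_effect + p.mean()

flags = np.argwhere(np.abs(residual) > 0.15)     # arbitrary screening threshold
for c, i in flags:
    print(f"country {c}, item {i}: residual {residual[c, i]:+.2f}")
```
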
Raymond, Mark R.; Neustel, Sandra; Anderson, Dan – Educational Measurement: Issues and Practice, 2009
Examinees who take high-stakes assessments are usually given an opportunity to repeat the test if they are unsuccessful on their initial attempt. To prevent examinees from obtaining unfair score increases by memorizing the content of specific test items, testing agencies usually assign a different test form to repeat examinees. The use of multiple…
Descriptors: Test Results, Test Items, Testing, Aptitude Tests
Kato, Kentaro; Moen, Ross E.; Thurlow, Martha L. – Educational Measurement: Issues and Practice, 2009
Large data sets from a state reading assessment for third and fifth graders were analyzed to examine differential item functioning (DIF), differential distractor functioning (DDF), and differential omission frequency (DOF) between students with particular categories of disabilities (speech/language impairments, learning disabilities, and emotional…
Descriptors: Learning Disabilities, Language Impairments, Behavior Disorders, Affective Behavior
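
Of the three indices, differential omission frequency (DOF) is the most direct to compute: compare per-item omission rates between a focal disability group and a reference group. A sketch with invented responses, using NaN to mark omitted items:

```python
import numpy as np

# Differential omission frequency (DOF), sketched: per-item omission
# rates for a focal group vs. a reference group. Responses are coded
# np.nan when omitted; all data invented.
rng = np.random.default_rng(2)
ref = rng.choice([0.0, 1.0, np.nan], p=[0.35, 0.60, 0.05], size=(500, 10))
foc = rng.choice([0.0, 1.0, np.nan], p=[0.35, 0.55, 0.10], size=(120, 10))

omit_ref = np.isnan(ref).mean(axis=0)
omit_foc = np.isnan(foc).mean(axis=0)
dof = omit_foc - omit_ref                 # positive -> focal group omits more
print(np.round(dof, 3))
```
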
Coe, Robert – Oxford Review of Education, 2008
The comparability of examinations in different subjects has been a controversial topic for many years and a number of criticisms have been made of statistical approaches to estimating the "difficulties" of achieving particular grades in different subjects. This paper argues that if comparability is understood in terms of a linking…
Descriptors: Test Items, Grades (Scholastic), Foreign Countries, Test Bias
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2007
In an Angoff standard setting procedure, judges estimate the probability that a hypothetical randomly selected minimally competent candidate will answer correctly each item in the test. In many cases, these item performance estimates are made twice, with information shared with the panelists between estimates. Especially for long tests, this…
Descriptors: Test Items, Probability, Item Analysis, Standard Setting (Scoring)
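
In an Angoff procedure like the one described, the recommended raw cut score is conventionally the sum, across items, of the judges' mean probability estimates; a second round lets panelists revise after shared feedback. A sketch with invented ratings:

```python
import numpy as np

# Angoff standard setting, sketched: each judge estimates, per item, the
# probability that a minimally competent candidate answers correctly.
# The raw cut score is the sum of item means. Ratings are invented.
round1 = np.array([                 # judges x items
    [0.6, 0.7, 0.4, 0.8, 0.5],
    [0.5, 0.8, 0.5, 0.7, 0.6],
    [0.7, 0.6, 0.3, 0.9, 0.5],
])
round2 = round1 + 0.02              # pretend estimates shifted after feedback

for label, ratings in (("round 1", round1), ("round 2", round2)):
    cut = ratings.mean(axis=0).sum()
    print(f"{label}: recommended raw cut score = {cut:.2f}")
```
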
Frisbie, David A. – 1981
The relative difficulty ratio (RDR) is used as a method of representing test difficulty. The RDR is the ratio of a test mean to the ideal mean, the point midway between the perfect score and the mean chance score for the test. The RDR transformation is a linear scale conversion method but not a linear equating method in the classical sense. The…
Descriptors: Comparative Testing, Difficulty Level, Evaluation Methods, Raw Scores
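
The abstract gives the RDR definition outright: the test mean divided by the ideal mean, where the ideal mean is halfway between the perfect score and the mean chance score. A direct transcription:

```python
def relative_difficulty_ratio(mean_score, perfect_score, n_items, n_options):
    """Relative difficulty ratio (RDR) as described in the abstract:
    the test mean divided by the 'ideal mean', the point midway between
    the perfect score and the mean chance score."""
    chance_mean = n_items / n_options           # expected score from guessing
    ideal_mean = (perfect_score + chance_mean) / 2
    return mean_score / ideal_mean

# 40 four-option MC items: chance mean = 10, so the ideal mean is 25.
print(relative_difficulty_ratio(mean_score=28.0, perfect_score=40,
                                n_items=40, n_options=4))
```
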
Melnick, Steven A.; Gable, Robert K. – Educational Research Quarterly, 1990
By administering an attitude survey to 3,328 parents of elementary school students, the use of positive and negative Likert item stems was analyzed. Respondents who consistently answered positive/negative item pairs that were parallel in meaning were compared with those who answered inconsistently. Implications for construction of affective measures…
Descriptors: Affective Measures, Comparative Testing, Elementary Education, Likert Scales
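
One way to run the consistency comparison described here is to reverse-code the negatively worded member of each parallel pair and flag respondents whose paired answers disagree by more than some tolerance. A sketch with invented 5-point responses and an arbitrary 1-point tolerance:

```python
import numpy as np

# Screen for inconsistent responding on positively/negatively worded
# Likert pairs: reverse-code the negative stem on a 1-5 scale, then
# flag respondents whose paired answers differ by more than 1 point.
rng = np.random.default_rng(3)
pos = rng.integers(1, 6, size=(8, 3))      # responses to positive stems
neg = rng.integers(1, 6, size=(8, 3))      # responses to matched negative stems

neg_reversed = 6 - neg                     # 1<->5, 2<->4 on a 5-point scale
inconsistent = np.abs(pos - neg_reversed) > 1
print("flagged respondents:", np.where(inconsistent.any(axis=1))[0])
```
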
Clauser, Brian E.; And Others – 1991
Item bias has been a major concern for test developers during recent years. The Mantel-Haenszel statistic has been among the preferred methods for identifying biased items. The statistic's performance in identifying uniform bias in simulated data modeled by producing various levels of difference in the (item difficulty) b-parameter for reference…
Descriptors: Comparative Testing, Difficulty Level, Item Bias, Item Response Theory
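
The Mantel-Haenszel procedure referenced here aggregates 2x2 (group by correct/incorrect) tables across matched total-score strata; its common odds ratio is the usual index of uniform DIF. A sketch with invented strata:

```python
def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio across score strata.

    Each table is (a, b, c, d): reference right/wrong, focal right/wrong.
    Values near 1.0 suggest little uniform DIF on the studied item.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Invented strata (grouped by matched total score) for one item.
strata = [(30, 10, 25, 15), (40, 20, 35, 25), (50, 30, 45, 35)]
print(f"MH odds ratio: {mantel_haenszel_odds_ratio(strata):.2f}")
```
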