Publication Date

| Date range | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
Audience

| Audience | Records |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location

| Location | Records |
| --- | --- |
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating

| Rating | Records |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Xu, Xueli; von Davier, Matthias – ETS Research Report Series, 2008
Xu and von Davier (2006) demonstrated the feasibility of using the general diagnostic model (GDM) to analyze National Assessment of Educational Progress (NAEP) proficiency data. Their work showed that the GDM analysis not only led to conclusions for gender and race groups similar to those published in the NAEP Report Card, but also allowed…
Descriptors: National Competency Tests, Models, Data Analysis, Reading Tests
Elosua, Paula; Lopez-Jauregui, Alicia – Journal of Experimental Education, 2008
The comparison of scores from linguistically different tests is a twofold matter: the adaptation of tests and the comparison of scores. These 2 aspects of measurement invariance intersect at the need to guarantee the psychometric equivalence between the original and adapted versions. In this study, the authors examined comparability in 2 stages.…
Descriptors: Psychometrics, Item Response Theory, Equated Scores, Comparative Analysis
Leighton, Jacqueline P.; Gokiert, Rebecca J. – Educational Assessment, 2008
The purpose of the present investigation was to identify the relationship among different indicators of uncertainty that lead to potential item misalignment. The item-based indicators included ratings of ambiguity and cognitive complexity. The student-based indicators included (a) frequency of cognitive monitoring per item, (b) levels of…
Descriptors: Test Items, Cognitive Processes, Item Analysis, Self Concept
Kwak, Nohoon; And Others – 1997
This paper introduces a new method for detecting differential item functioning (DIF), the unsigned Mantel-Haenszel (UMH) statistic, and compares this method with two other chi-square methods, the Mantel-Haenszel (MH) and the absolute mean deviation (AMD) statistics, in terms of power and agreement between expected and actual false positive rates.…
Descriptors: Chi Square, Identification, Item Bias, Test Items
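For readers unfamiliar with the baseline statistic being compared here: the Mantel-Haenszel procedure forms a 2 x 2 table (group by correct/incorrect) at every matched score level and pools the evidence across levels. Below is a minimal sketch of the standard signed MH chi-square, assuming binary item responses and total score as the matching variable; it is illustrative only, not the paper's implementation. The UMH variant the paper introduces presumably accumulates the level-wise deviations unsigned so that DIF in opposite directions does not cancel, but that is an inference from the name, not a detail given in the abstract.

```python
import numpy as np

def mantel_haenszel_dif(correct, group, matched_score):
    """Mantel-Haenszel chi-square (1 df) for DIF on one item.

    correct: 0/1 responses to the studied item
    group: 0 (reference) / 1 (focal)
    matched_score: matching variable, e.g. total test score
    """
    correct, group, matched_score = map(np.asarray,
                                        (correct, group, matched_score))
    obs_a, exp_a, var = 0.0, 0.0, 0.0
    for k in np.unique(matched_score):
        m = matched_score == k
        a = np.sum((group[m] == 0) & (correct[m] == 1))  # ref correct
        b = np.sum((group[m] == 0) & (correct[m] == 0))  # ref incorrect
        c = np.sum((group[m] == 1) & (correct[m] == 1))  # focal correct
        d = np.sum((group[m] == 1) & (correct[m] == 0))  # focal incorrect
        n = a + b + c + d
        if n < 2:
            continue
        obs_a += a
        exp_a += (a + b) * (a + c) / n
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n**2 * (n - 1))
    # Continuity-corrected MH chi-square
    return (abs(obs_a - exp_a) - 0.5) ** 2 / var
```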
Roberts, James S.; Wedell, Douglas H.; Laughlin, James E. – 1998
The Likert rating scale procedure is often used in conjunction with a graded disagree-agree response scale to measure attitudes. Item characteristic curves associated with graded disagree-agree responses are generally single-peaked, nonmonotonic functions of true attitude. These characteristics are, thus, more generally consistent with an…
Descriptors: Attitudes, Likert Scales, Sampling, Test Items
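The single-peaked, nonmonotonic item characteristic curves described here are the signature of unfolding models: agreement is most likely when a respondent's true attitude lies near the item's location and falls off in both directions. A toy sketch of such a curve, using a simplified squared-distance form chosen purely for illustration (the authors' actual model is not specified in this excerpt):

```python
import numpy as np

def unfolding_icc(theta, delta, tau=1.0):
    """Single-peaked agreement curve: agreement is highest when the
    attitude theta is near the item location delta and decreases in
    both directions, unlike a monotone Likert-style ICC."""
    return np.exp(-((theta - delta) ** 2) / (2 * tau**2))

theta = np.linspace(-3, 3, 7)
print(unfolding_icc(theta, delta=0.5))  # peaks nearest theta = 0.5
```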
Holland, Paul W. – ETS Research Report Series, 2005
There are test-equating situations in which it may be appropriate to fit a loglinear or other type of probability model to the joint distribution of a total score on a test and a score on part of that test. For anchor test designs, this situation arises for internal anchor tests, which are embedded within the total test. Similarly, a part-whole…
Descriptors: Test Items, Equated Scores, Probability, Statistical Analysis
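One concrete version of the modeling task described: treat the bivariate frequencies of total score X and part score V as Poisson counts and fit a polynomial loglinear model, the standard device in loglinear presmoothing for equating. A hedged sketch with numpy and statsmodels follows; the report's exact parameterization is not given in this excerpt, and for an internal anchor the structurally impossible cells (V > X) would in practice be excluded from the fit.

```python
import numpy as np
import statsmodels.api as sm

def fit_bivariate_loglinear(freq, deg_x=3, deg_v=3, deg_xv=1):
    """Fit a polynomial loglinear model to a table of counts freq[x, v]
    of examinees with total score x and part score v, via Poisson GLM.
    Returns the smoothed (fitted) counts."""
    X, V = np.meshgrid(np.arange(freq.shape[0]),
                       np.arange(freq.shape[1]), indexing="ij")
    x, v, y = X.ravel(), V.ravel(), freq.ravel().astype(float)
    # Design matrix: powers of x, powers of v, and cross moments.
    cols = [np.ones_like(x, dtype=float)]
    cols += [x.astype(float) ** p for p in range(1, deg_x + 1)]
    cols += [v.astype(float) ** p for p in range(1, deg_v + 1)]
    cols += [(x * v).astype(float) ** p for p in range(1, deg_xv + 1)]
    design = np.column_stack(cols)
    fit = sm.GLM(y, design, family=sm.families.Poisson()).fit()
    return fit.fittedvalues.reshape(freq.shape)
```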
Boyd, Aimee M.; Dodd, Barbara G.; Fitzpatrick, Steven J. – 2003
This study compared several item exposure control procedures for computerized adaptive test (CAT) systems based on a three-parameter logistic testlet response theory model (X. Wang, E. Bradlow, and H. Wainer, 2002) and G. Masters' (1982) partial credit model using real data from the Verbal Reasoning section of the Medical College Admission Test.…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items
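As background for the exposure-control comparison: unconstrained CAT selection picks the single most informative item at every step, so exposure-control procedures deliberately inject randomness or impose caps. The sketch below shows one simple, well-known procedure, randomesque selection among the top-k informative items, using a 2PL information function as a stand-in; the study itself used testlet response theory and partial credit models, and its specific procedures may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def info_2pl(theta, a, b):
    """Fisher information of 2PL items at ability theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1 - p)

def randomesque_select(theta, a, b, administered, k=5):
    """Pick randomly among the k most informative unused items rather
    than always the single best, which spreads item exposure."""
    info = info_2pl(theta, a, b)
    info[list(administered)] = -np.inf    # mask items already given
    top_k = np.argsort(info)[-k:]
    return int(rng.choice(top_k))

a = rng.lognormal(0.0, 0.3, 100)          # hypothetical item pool
b = rng.normal(0.0, 1.0, 100)
print(randomesque_select(theta=0.0, a=a, b=b, administered=set()))
```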
Bierschenk, Inger – 2001
Two scientific ideas have been discerned in 20th century thinking: the structuralism common in Europe and the functionalism apparent in the United States. This paper presents two experiments in text analysis. One discusses the behaviorist writing style of Ernest Hemingway. It hypothesizes that since he is a behaviorist in practice, he should be a…
Descriptors: Reader Text Relationship, Test Items, Text Structure
De Ayala, R. J.; Kim, Seock-Ho; Stapleton, Laura M.; Dayton, C. Mitchell – 1999
Differential item functioning (DIF) may be defined as an item that displays different statistical properties for different groups after the groups are matched on an ability measure. For instance, with binary data, DIF exists when there is a difference in the conditional probabilities of a correct response for two manifest groups. This paper…
Descriptors: Item Bias, Monte Carlo Methods, Test Items
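The definition quoted above can be made concrete in a few lines: match examinees on an observed ability proxy (for example, total score) and compare the proportion answering the studied item correctly in each manifest group at each matched level. A descriptive sketch, with hypothetical variable names:

```python
import numpy as np

def conditional_p_correct(correct, group, total_score):
    """Proportion answering the studied item correctly in each manifest
    group (0 = reference, 1 = focal), conditional on matched total
    score. Persistent level-wise gaps are the descriptive signature
    of DIF under the definition above."""
    correct, group, total_score = map(np.asarray,
                                      (correct, group, total_score))
    table = {}
    for k in np.unique(total_score):
        row = []
        for g in (0, 1):
            m = (total_score == k) & (group == g)
            row.append(correct[m].mean() if m.any() else float("nan"))
        table[int(k)] = tuple(row)
    return table
```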
Leung, Chi-Keung; Chang, Hua-Hua; Hau, Kit-Tai – 2001
It is widely believed that item selection methods using the maximum information approach (MI) can maintain high efficiency in trait estimation by repeatedly choosing highly discriminating (high-alpha) items. However, the consequence is that they lead to an extremely skewed item exposure distribution in which items with high alpha values become overly…
Descriptors: Item Banks, Selection, Test Construction, Test Items
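The mechanism behind the skew is easy to demonstrate: 2PL/3PL item information scales with the square of the discrimination alpha, so maximum-information selection keeps returning to the same high-alpha items. A small simulation sketch, assuming a 2PL pool and a deliberately crude ability update as a stand-in for real scoring:

```python
import numpy as np

def info_2pl(theta, a, b):
    """Fisher information of 2PL items at ability theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1 - p)

rng = np.random.default_rng(1)
n_items, test_len, n_examinees = 200, 20, 500
a = rng.lognormal(0.0, 0.3, n_items)      # discriminations (alpha)
b = rng.normal(0.0, 1.0, n_items)         # difficulties

exposure = np.zeros(n_items)
for _ in range(n_examinees):
    theta_true = rng.normal()
    theta_hat, used = 0.0, set()
    for step in range(1, test_len + 1):
        info = info_2pl(theta_hat, a, b)
        info[list(used)] = -np.inf
        j = int(np.argmax(info))          # pure maximum-information pick
        used.add(j)
        exposure[j] += 1
        p_true = 1.0 / (1.0 + np.exp(-a[j] * (theta_true - b[j])))
        u = rng.random() < p_true         # simulated response
        p_hat = 1.0 / (1.0 + np.exp(-a[j] * (theta_hat - b[j])))
        theta_hat += 2.0 / step * (u - p_hat)  # crude update, not real ML scoring

rates = np.sort(exposure / n_examinees)[::-1]
print(rates[:10])   # a handful of high-alpha items absorb most exposure
```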
Witt, Elizabeth A.; Stahl, John A.; Bergstrom, Betty A.; Muckle, Tim – 2003
The focus of this simulation study was to investigate the effects of item difficulty drift on the stability of test taker ability estimates and pass/fail status under the Rasch model. Real, non-normal distributions of test taker abilities and item difficulties were used to represent true parameters. Test taker responses for 18 conditions of item…
Descriptors: Item Response Theory, Statistical Distributions, Test Items
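To see why difficulty drift threatens ability estimates under the Rasch model: examinees are scored with the calibrated item difficulties, so if operational difficulties have drifted away from calibration, the stale values bias theta. A minimal sketch of this mechanism (an illustration only, not the study's 18-condition simulation design):

```python
import numpy as np

def rasch_p(theta, b):
    """Rasch model probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

rng = np.random.default_rng(2)
n_items = 40
b_cal = rng.normal(0.0, 1.0, n_items)     # difficulties at calibration
drift = np.zeros(n_items)
drift[:8] = 0.3                           # 8 items drift harder by 0.3 logits
b_true = b_cal + drift                    # operational difficulties

theta_true = 0.2
resp = (rng.random(n_items) < rasch_p(theta_true, b_true)).astype(float)

# Score with the stale calibrated difficulties: Newton-Raphson ML estimate.
theta_hat = 0.0
for _ in range(25):
    p = rasch_p(theta_hat, b_cal)
    theta_hat -= np.sum(resp - p) / -np.sum(p * (1 - p))
print(theta_hat)  # tends to sit below theta_true because of the drift
```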
Roberts, James S. – 2003
Stone and colleagues (C. Stone, R. Ankenman, S. Lane, and M. Liu, 1993; C. Stone, R. Mislevy and J. Mazzeo, 1994; C. Stone, 2000) have proposed a fit index that explicitly accounts for the measurement error inherent in an estimated theta value, here called χ²_i*. The elements of this statistic are natural…
Descriptors: Chi Square, Goodness of Fit, Test Items
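The key idea of the Stone-type fit index, as described, is to avoid conditioning on a theta point estimate: each examinee is spread over quadrature points according to posterior weight, yielding pseudo-observed counts that are compared with model expectations. A heavily hedged sketch of that pseudocount construction follows; the exact elements of χ²_i* are truncated in this excerpt, so the details here are assumptions.

```python
import numpy as np

def stone_pseudocounts(resp_item, posteriors, p_model):
    """Pseudo-observed item-fit table that reflects uncertainty in
    theta: each examinee contributes to every quadrature point q in
    proportion to his or her posterior weight, instead of being placed
    at a single point estimate.

    resp_item: (N,) 0/1 responses to the studied item
    posteriors: (N, Q) posterior weights over quadrature points
                (rows sum to 1)
    p_model: (Q,) model-implied probability correct at each point
    """
    n_q = posteriors.sum(axis=0)                         # pseudo N per point
    r_q = (posteriors * resp_item[:, None]).sum(axis=0)  # pseudo correct
    e_q = n_q * p_model
    # Pearson-type statistic; its reference distribution is nonstandard
    # and is part of what the paper addresses.
    return np.sum((r_q - e_q) ** 2 / (e_q * (1 - p_model)))
```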
Rudner, Lawrence M. – 2000
Testing programs that report a single score based on multiple choice and performance components must face the issue of how to derive the component scores. This paper identifies and logically evaluates alternative component weighting methods. It then examines composite reliability and validity as a function of weights, component reliability,…
Descriptors: Reliability, Scores, Test Construction, Test Items
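The classical-test-theory result underlying this kind of analysis: a weighted composite's reliability equals one minus the ratio of weighted error variance to total composite variance, so it depends jointly on the weights, the component reliabilities, the component standard deviations, and the component intercorrelations. A short sketch of the standard formula (the example numbers are hypothetical):

```python
import numpy as np

def composite_reliability(w, sd, rel, corr):
    """Reliability of a weighted composite under classical test theory:
    1 - (weighted error variance) / (composite variance), assuming
    uncorrelated errors across components.

    w: component weights; sd: component SDs; rel: component
    reliabilities; corr: component intercorrelation matrix.
    """
    w, sd, rel = map(np.asarray, (w, sd, rel))
    cov = np.outer(sd, sd) * np.asarray(corr)   # observed covariance
    var_c = w @ cov @ w                         # composite variance
    err = np.sum(w**2 * sd**2 * (1 - rel))      # weighted error variance
    return 1 - err / var_c

# e.g., a multiple-choice and a performance component
print(composite_reliability(w=[0.6, 0.4], sd=[10, 8],
                            rel=[0.90, 0.75], corr=[[1, .6], [.6, 1]]))
```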
Veldkamp, Bernard P. – 2000
Two mathematical programming approaches are presented for the assembly of ability tests from item pools calibrated under a multidimensional item response theory model. Item selection is based on Fisher's information matrix. Several criteria can be used to optimize this matrix. In this paper, the A-criterion and the D-criterion are applied. In a…
Descriptors: Ability, Item Banks, Test Construction, Test Items
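For orientation: in multidimensional IRT each item contributes a rank-one matrix P(1 - P)aa' to the test information matrix, the A-criterion minimizes the trace of its inverse (the summed asymptotic variances of the ability estimates), and the D-criterion maximizes its determinant. The greedy heuristic below only illustrates the two criteria; the paper itself uses formal mathematical programming for assembly, not a greedy search.

```python
import numpy as np

def item_info_m2pl(theta, a, d):
    """Fisher information matrix of one multidimensional 2PL item:
    P(1 - P) * a a', where P is the response probability at theta."""
    p = 1.0 / (1.0 + np.exp(-(a @ theta + d)))
    return p * (1 - p) * np.outer(a, a)

def greedy_select(theta, A, d, test_len, criterion="D"):
    """Greedily add the item that most improves the chosen optimality
    criterion of the accumulated information matrix."""
    n_items, n_dim = A.shape
    chosen, total = [], 1e-6 * np.eye(n_dim)   # ridge keeps it invertible
    for _ in range(test_len):
        best_j, best_val = None, -np.inf
        for j in range(n_items):
            if j in chosen:
                continue
            cand = total + item_info_m2pl(theta, A[j], d[j])
            val = (np.linalg.det(cand) if criterion == "D"
                   else -np.trace(np.linalg.inv(cand)))  # A: minimize trace
            if val > best_val:
                best_j, best_val = j, val
        chosen.append(best_j)
        total += item_info_m2pl(theta, A[best_j], d[best_j])
    return chosen

rng = np.random.default_rng(3)
theta = np.zeros(2)                            # hypothetical 2-dimensional point
A = rng.uniform(0.5, 1.5, (50, 2))
d = rng.normal(0.0, 1.0, 50)
print(greedy_select(theta, A, d, test_len=10, criterion="A"))
```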
Lin, Chuan-Ju; Spray, Judith – 2000
This paper presents comparisons among three item-selection criteria for the sequential probability ratio test. The criteria were compared in terms of their efficiency in selecting items, as indicated by average test length and the percentage of correct decisions. The item-selection criteria applied in this study were the Fisher information…
Descriptors: Classification, Criteria, Cutting Scores, Selection
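Background on the classifier these selection criteria serve: the sequential probability ratio test compares the likelihood of the responses so far at a point just above the cut score with the likelihood at a point just below, stopping as soon as the log ratio crosses Wald's thresholds. A minimal sketch assuming a 2PL response model (illustrative; the study's item-selection criteria sit on top of this decision rule):

```python
import numpy as np

def sprt_decision(resp, a, b, theta0, theta1, alpha=0.05, beta=0.05):
    """Wald sequential probability ratio test for pass/fail
    classification.

    resp, a, b: responses and 2PL parameters of the items given so far
    theta0 / theta1: points just below / above the cut score
    Returns 'fail', 'pass', or 'continue'.
    """
    resp, a, b = map(np.asarray, (resp, a, b))

    def loglik(theta):
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        return np.sum(resp * np.log(p) + (1 - resp) * np.log(1 - p))

    llr = loglik(theta1) - loglik(theta0)
    upper = np.log((1 - beta) / alpha)   # accept theta1: classify as pass
    lower = np.log(beta / (1 - alpha))   # accept theta0: classify as fail
    if llr >= upper:
        return "pass"
    if llr <= lower:
        return "fail"
    return "continue"

rng = np.random.default_rng(4)
resp = rng.integers(0, 2, 12)
a, b = np.ones(12), np.zeros(12)
print(sprt_decision(resp, a, b, theta0=-0.2, theta1=0.2))
```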

