Holland, Paul W. – 1989
A simple technique, developed by A. Phillips (1987), is used to approximate the covariance between the Mantel-Haenszel log-odds-ratio estimator for a 2 x 2 x k table and the sample marginal proportions. These results are then applied to obtain an approximate variance estimate of an adjusted risk difference based on the Mantel-Haenszel odds-ratio…
Descriptors: Difficulty Level, Estimation (Mathematics), Item Bias, Risk
Shen, Linjun – 1997
Three aspects of the usual approach to assessing local item dependency, Yen's "Q" (H. Huynh, H. Michaels, and S. Ferrara, 1995), deserve further investigation. First, Pearson correlation coefficients are not normally distributed when the coefficients are large, and thus cannot quantify the dependency well. Second, the accuracy of…
Descriptors: Ability, Estimation (Mathematics), Item Response Theory, Reliability
Meijer, Rob R.; Sijtsma, Klaas – 1994
Methods for detecting item score patterns that are unlikely (aberrant) given that a parametric item response theory (IRT) model gives an adequate description of the data or given the responses of the other persons in the group are discussed. The emphasis here is on the latter group of statistics. These statistics can be applied when a…
Descriptors: Foreign Countries, Identification, Item Response Theory, Nonparametric Statistics
Spray, Judith; Miller, Tim – 1994
Computer simulations under three conditions of polytomous differential item functioning (DIF) compared the ability of three different statistical procedures to detect nonuniform DIF. The procedures were a nominal and an ordinal extension of the Mantel-Haenszel statistic, and logistic discriminant function analysis. Results showed that only the…
Descriptors: Computer Simulation, Identification, Item Bias, Sample Size
Bergstrom, Betty A.; Lunz, Mary E. – 1998
This paper addresses questions of whether positively- and negatively-worded items measure the same construct and whether the rating scale categories "strongly agree" to "strongly disagree" are used in the same way for both types of items. Item response theory (IRT), specifically the Andrich Rating Scale Model (B. Wright and G.…
Descriptors: Adults, Item Response Theory, Rating Scales, Research Methodology
Sykes, Robert C.; Ito, Kyoko – 1998
A common procedure for obtaining multiple readings (ratings) for a constructed response item, especially in high-stakes tests, is to have two readers read the papers independently, with a third reading if the results differ by more than one point. This necessitates a scoring rule that specifies how the ratings will be aggregated into a single item…
Descriptors: Ability, Constructed Response, High Stakes Tests, Judges
Lee, Guemin; Frisbie, David A. – 1997
Previous studies have indicated that the reliability of test scores composed of testlets might be overestimated by conventional item-based reliability estimation methods (R. Thorndike, 1953; A. Anastasi, 1988; S. Sireci, D. Thissen, and H. Wainer, 1991; H. Wainer and D. Thissen, 1996). This study used generalizability theory to investigate the…
Descriptors: Estimation (Mathematics), Generalizability Theory, Reliability, Scores
Martinez, Michael E.; Simpson, R. Scott – 1999
Item-level statistics from ability and achievement tests have been underutilized as sources of data for building models of cognitive development. How item data can be used to build a cognitive-developmental map of proportional reasoning is demonstrated. The product of the analysis is a cognitive hierarchy with levels corresponding to categories of…
Descriptors: Ability, Achievement Tests, Cognitive Development, Cognitive Tests
Mazor, Kathleen M.; And Others – 1991
The Mantel-Haenszel (MH) procedure has become one of the most popular procedures for detecting differential item functioning. Valid results with relatively small numbers of examinees represent one of the advantages typically attributed to this procedure. In this study, examinee item responses were simulated to contain differentially functioning…
Descriptors: Difficulty Level, Item Bias, Item Response Theory, Sample Size
Attali, Yigal – ETS Research Report Series, 2004
Contrary to common belief, reliability estimates of number-right multiple-choice tests are not inflated by speededness. Because examinees guess on questions when they run out of time, the responses to these questions show less consistency with the responses to other questions, and the reliability of the test is decreased. The surprising…
Descriptors: Multiple Choice Tests, Timed Tests, Test Reliability, Guessing (Tests)
Haberman, Shelby J. – ETS Research Report Series, 2004
The usefulness of joint and conditional maximum-likelihood is considered for the Rasch model under realistic testing conditions in which the number of examinees is very large and the number of items is relatively large. Conditions for consistency and asymptotic normality are explored, effects of model error are investigated, measures of prediction…
Descriptors: Maximum Likelihood Statistics, Computation, Item Response Theory, Testing
Whitney, Douglas R.; And Others – 1985
This preview of the Tests of General Educational Development (GED) to be introduced in 1988 begins with a brief background of the review process that will result in the GED Test. An overview of committee recommendations then highlights five themes of Test Specifications Committee panel reports: the tests should (1) demand more highly developed…
Descriptors: Adult Education, High School Equivalency Programs, Test Format, Test Items
Larsen, Gary Y. – 1984
The paper describes the reasons for developing a new instrument to measure adaptive behavior of mentally retarded residents at Glenwood State Hospital-School and recounts the processes involved in constructing the new scale. Among complaints about the American Association on Mental Deficiency Adaptive Behavior Scale (ABS) are its inappropriateness…
Descriptors: Adaptive Behavior (of Disabled), Factor Analysis, Mental Retardation, Test Construction
Scheuneman, Janice Dowd – 1982
The connection between item bias and test scores was investigated using a simulation approach. Two samples of hypothetical examinees were simulated using an item response theory model. The two samples were identical, except that the mean theta value of one sample was 5 less than the other. The simulated tests consisted of 50 items with characteristics…
Descriptors: Latent Trait Theory, Research Methodology, Research Problems, Simulation
Washington, William N.; Godfrey, R. Richard – Journal of Educational Measurement, 1974 (peer reviewed)
Item statistics for illustrated and written items drawn from the same content areas were compared using F ratios. The results indicated that illustrated items performed slightly better than matched written items, and that the best-performing category of illustrated items was tables. (Author/BB)
Descriptors: Achievement Tests, Illustrations, Test Construction, Test Items


