Publication Date
| Date range | Results |
| In 2026 | 0 |
| Since 2025 | 215 |
| Since 2022 (last 5 years) | 1084 |
| Since 2017 (last 10 years) | 2594 |
| Since 2007 (last 20 years) | 4955 |
Audience
| Audience | Results |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Location | Results |
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating
| Rating | Results |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Holland, Paul W. – ETS Research Report Series, 2005
There are test-equating situations in which it may be appropriate to fit a loglinear or other type of probability model to the joint distribution of a total score on a test and a score on part of that test. For anchor test designs, this situation arises for internal anchor tests, which are embedded within the total test. Similarly, a part-whole…
Descriptors: Test Items, Equated Scores, Probability, Statistical Analysis
Boyd, Aimee M.; Dodd, Barbara G.; Fitzpatrick, Steven J. – 2003
This study compared several item exposure control procedures for computerized adaptive test (CAT) systems based on a three-parameter logistic testlet response theory model (X. Wang, E. Bradlow, and H. Wainer, 2002) and G. Masters' (1982) partial credit model using real data from the Verbal Reasoning section of the Medical College Admission Test.…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items
Bierschenk, Inger – 2001
Two scientific ideas have been discerned in 20th century thinking: the structuralism common in Europe and the functionalism apparent in the United States. This paper presents two experiments in text analysis. One discusses the behaviorist writing style of Ernest Hemingway. It hypothesizes that since he is a behaviorist in practice, he should be a…
Descriptors: Reader Text Relationship, Test Items, Text Structure
De Ayala, R. J.; Kim, Seock-Ho; Stapleton, Laura M.; Dayton, C. Mitchell – 1999
Differential item functioning (DIF) occurs when an item displays different statistical properties for different groups after the groups are matched on an ability measure. For instance, with binary data, DIF exists when there is a difference in the conditional probabilities of a correct response for two manifest groups. This paper…
Descriptors: Item Bias, Monte Carlo Methods, Test Items
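As a concrete illustration of the conditional-probability definition of DIF above, the sketch below tabulates the proportion of correct responses to a studied item for two manifest groups at each level of a matching score. It is only a toy example, not the Monte Carlo design of the paper; the group labels, score range, and response-generating model are all hypothetical.

```python
import numpy as np

def conditional_p_correct(item_resp, match_score, group):
    """P(correct | matched score) for each manifest group.

    item_resp   : 0/1 responses to the studied item
    match_score : matching variable (e.g., rest score on the other items)
    group       : manifest group labels (e.g., "ref" / "focal")
    Returns {score: {group: proportion correct}}.
    """
    table = {}
    for s in np.unique(match_score):
        at_s = match_score == s
        table[int(s)] = {
            g: float(item_resp[at_s & (group == g)].mean())
            for g in np.unique(group)
            if np.any(at_s & (group == g))
        }
    return table

# Hypothetical toy data: the focal group is 10 points less likely (on the
# probability scale) to answer correctly at every matched score, so the two
# conditional proportions separate -- the binary-data DIF pattern described above.
rng = np.random.default_rng(0)
n = 2000
group = rng.choice(["ref", "focal"], size=n)
match_score = rng.integers(0, 11, size=n)                 # matching score, 0-10
p = 0.2 + 0.06 * match_score - 0.10 * (group == "focal")
item_resp = rng.binomial(1, np.clip(p, 0.01, 0.99))
print(conditional_p_correct(item_resp, match_score, group))
```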
Leung, Chi-Keung; Chang, Hua-Hua; Hau, Kit-Tai – 2001
It is widely believed that item selection methods using the maximum information approach (MI) can maintain high efficiency in trait estimation by repeatedly choosing highly discriminating (alpha) items. However, the consequence is that they lead to an extremely skewed item exposure distribution in which items with high alpha values become overly…
Descriptors: Item Banks, Selection, Test Construction, Test Items
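The maximum information (MI) selection rule the abstract refers to can be sketched in a few lines: evaluate each item's Fisher information at the current ability estimate and pick the unadministered item with the largest value. The 3PL information formula and the 500-item pool below are assumptions for illustration only, not the authors' simulation setup.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def info_3pl(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = p_3pl(theta, a, b, c)
    return a**2 * ((p - c) / (1.0 - c))**2 * ((1.0 - p) / p)

def select_max_info(theta_hat, a, b, c, administered):
    """Index of the unadministered item with maximum information at theta_hat."""
    info = info_3pl(theta_hat, a, b, c)
    info[administered] = -np.inf          # already-used items are ineligible
    return int(np.argmax(info))

# Hypothetical 500-item pool: because information grows with a^2, MI keeps
# returning the high-discrimination items, which is the skewed exposure
# pattern the abstract describes.
rng = np.random.default_rng(1)
a = rng.lognormal(0.0, 0.3, 500)          # discriminations
b = rng.normal(0.0, 1.0, 500)             # difficulties
c = np.full(500, 0.2)                     # guessing parameters
administered = np.zeros(500, dtype=bool)
print(select_max_info(0.0, a, b, c, administered))
```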
Witt, Elizabeth A.; Stahl, John A.; Bergstrom, Betty A.; Muckle, Tim – 2003
The focus of this simulation study was to investigate the effects of item difficulty drift on the stability of test taker ability estimates and pass/fail status under the Rasch model. Real, non-normal distributions of test taker abilities and item difficulties were used to represent true parameters. Test taker responses for 18 conditions of item…
Descriptors: Item Response Theory, Statistical Distributions, Test Items
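A minimal way to see the effect the abstract describes is to generate Rasch responses under drifted item difficulties and then estimate ability with the original, stale difficulties; the estimate shifts relative to its true value. The sketch below does exactly that with hypothetical parameter values and a uniform 0.3-logit drift; it is not the paper's 18-condition simulation.

```python
import numpy as np
from scipy.optimize import brentq

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def mle_theta(responses, b):
    """Rasch maximum-likelihood ability: solve sum(x_i - P_i(theta)) = 0.

    Assumes a non-extreme raw score so the root lies inside (-6, 6).
    """
    return brentq(lambda t: np.sum(responses - rasch_p(t, b)), -6.0, 6.0)

# Hypothetical drift scenario: responses are generated under difficulties that
# have drifted 0.3 logits harder, but ability is estimated with the original
# (stale) difficulties, so the estimate is pulled below theta_true.
rng = np.random.default_rng(3)
b_true = rng.normal(0.0, 1.0, 40)
b_drifted = b_true + 0.3
theta_true = 0.5
responses = rng.binomial(1, rasch_p(theta_true, b_drifted))
print(theta_true, mle_theta(responses, b_true))
```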
Roberts, James S. – 2003
Stone and colleagues (C. Stone, R. Ankenman, S. Lane, and M. Liu, 1993; C. Stone, R. Mislevy and J. Mazzeo, 1994; C. Stone, 2000) have proposed a fit index that explicitly accounts for the measurement error inherent in an estimated theta value, here called χ²_i*. The elements of this statistic are natural…
Descriptors: Chi Square, Goodness of Fit, Test Items
Rudner, Lawrence M. – 2000
Testing programs that report a single score based on multiple choice and performance components must face the issue of how to derive the component scores. This paper identifies and logically evaluates alternative component weighting methods. It then examines composite reliability and validity as a function of weights, component reliability,…
Descriptors: Reliability, Scores, Test Construction, Test Items
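One standard classical-test-theory expression for the reliability of a weighted composite is rho_c = 1 - sum(w_i² s_i² (1 - rho_i)) / var(composite), assuming uncorrelated component errors. The sketch below implements that formula with hypothetical weights, standard deviations, reliabilities, and an intercorrelation; it illustrates how composite reliability moves with the weights, not the specific evaluation carried out in the paper.

```python
import numpy as np

def composite_reliability(weights, sds, reliabilities, corr):
    """Reliability of a weighted composite under classical test theory.

    weights       : component weights w_i
    sds           : component standard deviations s_i
    reliabilities : component reliabilities rho_i
    corr          : component intercorrelation matrix
    Assumes uncorrelated component errors:
    rho_c = 1 - sum(w_i^2 s_i^2 (1 - rho_i)) / var(composite).
    """
    w = np.asarray(weights, float)
    s = np.asarray(sds, float)
    r = np.asarray(reliabilities, float)
    cov = np.asarray(corr, float) * np.outer(s, s)   # observed-score covariances
    var_composite = w @ cov @ w
    error_var = np.sum(w**2 * s**2 * (1.0 - r))
    return 1.0 - error_var / var_composite

# Hypothetical two-component test: a multiple-choice section weighted 0.7 and a
# less reliable performance section weighted 0.3. Changing the weights changes
# the composite reliability, which is the trade-off the abstract examines.
print(composite_reliability(weights=[0.7, 0.3],
                            sds=[10.0, 4.0],
                            reliabilities=[0.90, 0.75],
                            corr=[[1.0, 0.6], [0.6, 1.0]]))
```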
Veldkamp, Bernard P. – 2000
Two mathematical programming approaches are presented for the assembly of ability tests from item pools calibrated under a multidimensional item response theory model. Item selection is based on Fisher's information matrix. Several criteria can be used to optimize this matrix. In this paper, the A-criterion and the D-criterion are applied. In a…
Descriptors: Ability, Item Banks, Test Construction, Test Items
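For a compensatory multidimensional 2PL item, the information matrix at ability theta is P(1-P) a a', and a candidate test can be scored by the determinant of the summed matrix (D-criterion) or the trace of its inverse (A-criterion). The sketch below shows only that scoring step with a hypothetical two-dimensional pool; the mathematical programming used in the paper to assemble the test is not reproduced.

```python
import numpy as np

def item_info_matrix(theta, a_vec, d):
    """Information matrix of one compensatory M2PL item at theta: P(1-P) a a'."""
    p = 1.0 / (1.0 + np.exp(-(a_vec @ theta + d)))
    return p * (1.0 - p) * np.outer(a_vec, a_vec)

def d_criterion(test_info):
    """D-criterion: determinant of the test information matrix (maximize)."""
    return np.linalg.det(test_info)

def a_criterion(test_info):
    """A-criterion: trace of the inverse information matrix (minimize)."""
    return np.trace(np.linalg.inv(test_info))

# Hypothetical two-dimensional pool of 30 items; sum the item matrices for a
# candidate 10-item test and score the assembly under either criterion.
rng = np.random.default_rng(2)
theta = np.zeros(2)                          # ability point at which to optimize
A = rng.uniform(0.5, 2.0, (30, 2))           # discrimination vectors
d = rng.normal(0.0, 1.0, 30)                 # intercepts
selected = rng.choice(30, size=10, replace=False)
test_info = sum(item_info_matrix(theta, A[i], d[i]) for i in selected)
print(d_criterion(test_info), a_criterion(test_info))
```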
Lin, Chuan-Ju; Spray, Judith – 2000
This paper presents comparisons among three item-selection criteria for the sequential probability ratio test. The criteria were compared in terms of their efficiency in selecting items, as indicated by average test length and the percentage of correct decisions. The item-selection criteria applied in this study were the Fisher information…
Descriptors: Classification, Criteria, Cutting Scores, Selection
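The sequential probability ratio test referenced above compares the likelihood of the response string under two ability values bracketing the cut score and stops once Wald's bounds are crossed. The sketch below shows the decision step only, with hypothetical 3PL item parameters and nominal error rates of .05; the item-selection criteria compared in the paper are not implemented here.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def sprt_decision(responses, a, b, c, theta0, theta1, alpha=0.05, beta=0.05):
    """Wald SPRT around a cut score: return "pass", "fail", or "continue".

    theta0 and theta1 bracket the cut score; responses are the 0/1 answers to
    the items administered so far, with a, b, c aligned item by item.
    """
    p0 = p_3pl(theta0, a, b, c)
    p1 = p_3pl(theta1, a, b, c)
    llr = np.sum(responses * np.log(p1 / p0)
                 + (1 - responses) * np.log((1 - p1) / (1 - p0)))
    if llr >= np.log((1.0 - beta) / alpha):    # strong evidence ability >= theta1
        return "pass"
    if llr <= np.log(beta / (1.0 - alpha)):    # strong evidence ability <= theta0
        return "fail"
    return "continue"                          # administer another item

# Hypothetical run: five 3PL items answered, four correctly.
responses = np.array([1, 1, 0, 1, 1])
a = np.full(5, 1.2); b = np.zeros(5); c = np.full(5, 0.2)
print(sprt_decision(responses, a, b, c, theta0=-0.3, theta1=0.3))
```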
Peer reviewed: DeMars, Christine E. – Applied Psychological Measurement, 2003
Varied the number of items and categories per item to explore the effects on estimation of item parameters in the nominal response model. Simulation results show that increasing the number of items had little effect on item parameter recovery, but increasing the number of categories increased the error variance of the parameter estimates. (SLD)
Descriptors: Estimation (Mathematics), Sample Size, Simulation, Test Items
Peer reviewed: Keller, Lisa A.; Swaminathan, Hariharan; Sireci, Stephen G. – Applied Measurement in Education, 2003
Evaluated two strategies for scoring context-dependent test items: ignoring the dependence and scoring dichotomously, or modeling the dependence through polytomous scoring. Results for data from 38,965 examinees taking a professional examination show that dichotomous scoring may overestimate test information, but polytomous scoring may underestimate…
Descriptors: Adults, Licensing Examinations (Professions), Scoring, Test Items
Peer reviewed: Chang, Shun-Wen; Ansley, Timothy N. – Journal of Educational Measurement, 2003
Compared the properties of five methods of item exposure control in the context of estimating examinees' abilities in a computerized adaptive testing situation. Findings show advantages to the Stocking and Lewis conditional multinomial procedure (M. Stocking and C. Lewis, 1995) and, to a lesser degree, the Davey and Parshall method (T. Davey and C.…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items
Peer reviewed: Holman, Rebecca; Berger, Martijn P. F. – Journal of Educational and Behavioral Statistics, 2001
Studied calibration designs that maximize the determinants of Fisher's information matrix on the item parameters for sets of polytomously scored items. Analyzed these items using a number of item response theory models. Results show that for the data and models used, a D-optimal calibration design for an answer or set of answers can reduce the…
Descriptors: Item Response Theory, Research Design, Test Items
Peer reviewed: Walker, Cindy M. – International Journal of Testing, 2001
Provides a tutorial on differential item functioning (DIF) and reviews DIFPACK, a new software package that is specifically designed to test for the presence of DIF. DIFPACK allows the user to test for standard unidirectional DIF, DIF in dichotomous items, DIF in polytomous items, and disordinal, or crossing, DIF. (SLD)
Descriptors: Computer Software, Identification, Item Bias, Test Items


