Peer reviewed: Engelhard, George, Jr.; Davis, Melodee; Hansche, Linda – Applied Measurement in Education, 1999
Examined whether reviewers on item-review committees can accurately identify test items that exhibit a variety of flaws. Results from 39 reviewers of a 75-item test show fairly high accuracy rates overall, with statistically significant differences in judgmental accuracy among reviewers. (SLD)
Descriptors: Decision Making, Judges, Review (Reexamination), Test Construction
Peer reviewed: Lee, Guemin; Frisbie, David A. – Applied Measurement in Education, 1999
Studied the appropriateness and implications of using a generalizability theory approach to estimating the reliability of scores from tests composed of testlets. Analyses of data from two national standardization samples suggest that manipulating the number of passages is a more productive way to obtain efficient measurement than manipulating the…
Descriptors: Generalizability Theory, Models, National Surveys, Reliability
Peer reviewed: Nering, Michael L.; Meijer, Rob R. – Applied Psychological Measurement, 1998
Compared the person-response function (PRF) method for identifying examinees who respond to test items in a manner divergent from the underlying test model to the "l(z)" index of Drasgow and others (1985). Although performance of the "l(z)" index was superior in most cases, the PRF was useful in some conditions. (SLD)
Descriptors: Comparative Analysis, Item Response Theory, Models, Responses
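The "l(z)" index used as the benchmark above has a closed form: the log-likelihood of an examinee's response pattern, standardized by its model-implied mean and variance. A minimal sketch, where the response patterns and probabilities below are illustrative toy values rather than data from the study:

```python
import math

def lz_index(responses, probs):
    """Standardized log-likelihood person-fit index l(z)
    (Drasgow et al., 1985).

    responses: 0/1 item scores for one examinee
    probs:     model-implied probabilities of a correct
               response on each item for that examinee
    """
    # Observed log-likelihood of the response pattern
    l0 = sum(u * math.log(p) + (1 - u) * math.log(1 - p)
             for u, p in zip(responses, probs))
    # Expected value and variance of l0 under the model
    e = sum(p * math.log(p) + (1 - p) * math.log(1 - p) for p in probs)
    v = sum(p * (1 - p) * math.log(p / (1 - p)) ** 2 for p in probs)
    return (l0 - e) / math.sqrt(v)

# A pattern consistent with the model gives l(z) near or above zero;
# large negative values flag divergent (aberrant) response patterns.
fitting = lz_index([1, 1, 1, 0, 0], [0.9, 0.8, 0.7, 0.3, 0.2])
aberrant = lz_index([0, 0, 0, 1, 1], [0.9, 0.8, 0.7, 0.3, 0.2])
```

The aberrant pattern (missing easy items while passing hard ones) produces a strongly negative index, which is the signal both person-fit methods in the article try to detect.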
Peer reviewed: van der Linden, Wim J.; Adema, Jos J. – Journal of Educational Measurement, 1998
Proposes an algorithm for the assembly of multiple test forms in which the multiple-form problem is reduced to a series of computationally less intensive two-form problems. Illustrates how the method can be implemented using 0-1 linear programming and gives two examples. (SLD)
Descriptors: Algorithms, Linear Programming, Test Construction, Test Format
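The 0-1 formulation above assigns each item a binary decision variable per form. On a hypothetical six-item pool the same idea can be sketched by exhaustive search instead of a linear-programming solver; the item pool, information values, and matching objective below are invented for illustration and are not the paper's model:

```python
from itertools import product

# Hypothetical mini item pool: (item id, difficulty, information at theta = 0)
pool = [("i1", -1.0, 0.30), ("i2", -0.5, 0.45), ("i3", 0.0, 0.55),
        ("i4", 0.2, 0.50), ("i5", 0.6, 0.40), ("i6", 1.1, 0.25)]

FORM_LEN = 3  # each of the two forms gets three items

def assemble_two_forms(pool):
    """Exhaustive 0-1 assignment: each item goes to form 0, form 1, or
    stays unused; keep the assignment that minimizes the gap in total
    information between the two forms (a toy stand-in for the
    linear-programming objective)."""
    best, best_gap = None, float("inf")
    for assign in product((0, 1, None), repeat=len(pool)):
        forms = {0: [], 1: []}
        for item, f in zip(pool, assign):
            if f is not None:
                forms[f].append(item)
        # Enforce the form-length constraint
        if len(forms[0]) != FORM_LEN or len(forms[1]) != FORM_LEN:
            continue
        gap = abs(sum(it[2] for it in forms[0]) -
                  sum(it[2] for it in forms[1]))
        if gap < best_gap:
            best, best_gap = forms, gap
    return best, best_gap

forms, gap = assemble_two_forms(pool)
```

Exhaustive search scales as 3^n and becomes infeasible quickly, which is exactly why the article reduces the multiple-form problem to a series of smaller two-form 0-1 programs.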
Peer reviewed: Camilli, Gregory; Congdon, Peter – Journal of Educational and Behavioral Statistics, 1999
Demonstrates a method for studying differential item functioning (DIF) that can be used with dichotomous or polytomous items and that is valid for data that follow a partial credit Item Response Theory model. A simulation study shows that positively biased Type I error rates are in accord with results from previous studies. (SLD)
Descriptors: Estimation (Mathematics), Item Bias, Item Response Theory, Test Items
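The method above is IRT-based; purely for orientation, the classical Mantel-Haenszel common odds ratio, a different contingency-table DIF statistic for dichotomous items, can be sketched. The strata counts below are invented:

```python
def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across ability strata.
    Each stratum is a tuple:
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Balanced strata: both groups have the same odds of success -> ratio of 1
no_dif = mantel_haenszel_odds_ratio([(30, 10, 15, 5), (20, 20, 10, 10)])
# Reference group succeeds more often at the same ability level -> ratio > 1
dif = mantel_haenszel_odds_ratio([(30, 10, 10, 10), (20, 20, 5, 15)])
```

A ratio far from 1 flags an item that behaves differently for matched examinees from the two groups, the same substantive question the article's IRT-based index addresses for partial-credit data.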
Peer reviewed: Carlstedt, Berit; Gustafsson, Jan-Eric; Ullstadius, Eva – Intelligence, 2000
Studied whether a change of test item sequencing, intended to increase test complexity, would cause increased involvement of general intelligence using a sample of Swedish military recruits who received heterogeneous (n=1,778) or homogeneous (n=363) tests. Items presented homogeneously showed higher general intelligence ("G") loadings.…
Descriptors: Foreign Countries, Intelligence, Military Personnel, Test Construction
Peer reviewed: Armstrong, Ronald D.; Jones, Douglas H.; Wang, Zhaobo – Journal of Educational and Behavioral Statistics, 1998
Generating a test from an item bank using a criterion based on classical test theory parameters poses considerable problems. A mathematical model is formulated that maximizes the reliability coefficient alpha, subject to logical constraints on the choice of items. Theorems ensuring appropriate application of the Lagrangian relaxation techniques are…
Descriptors: Item Banks, Mathematical Models, Reliability, Test Construction
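The objective the model maximizes, coefficient alpha, is itself straightforward to compute: the number of items scales one minus the ratio of summed item variances to total-score variance. A minimal sketch on a hypothetical 4-person, 3-item score matrix (this does not reproduce the paper's optimization machinery):

```python
def cronbach_alpha(scores):
    """Coefficient alpha from a persons-by-items score matrix."""
    k = len(scores[0])  # number of items

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[j] for row in scores]) for j in range(k)]
    total_var = var([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 0/1 scores: 4 persons x 3 items, roughly Guttman-ordered
scores = [[1, 1, 1],
          [1, 1, 0],
          [0, 1, 0],
          [0, 0, 0]]
alpha = cronbach_alpha(scores)  # 0.75 for this toy matrix
```

Selecting the item subset that maximizes this quantity under length and content constraints is a combinatorial problem, which motivates the Lagrangian treatment in the article.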
Peer reviewed: Linacre, John Michael – Journal of Outcome Measurement, 1998
Simulation studies indicate that, for responses to complete tests, construction of Rasch measures from observational data, followed by principal components factor analysis of Rasch residuals, provides an effective means of identifying multidimensionality. The most diagnostically useful residual form was found to be the standardized residual. (SLD)
Descriptors: Factor Analysis, Identification, Item Response Theory, Simulation
Peer reviewed: Douglas, Jeff; Kim, Hae Rim; Habing, Brian; Gao, Furong – Journal of Educational and Behavioral Statistics, 1998
The local dependence of item pairs is investigated through a conditional covariance function estimation procedure. The conditioning variable used is obtained by a monotonic transformation of total score on the remaining items. Conditional covariance functions are estimated by using kernel smoothing. Several models of local dependence are…
Descriptors: Analysis of Covariance, Estimation (Mathematics), Models, Scores
Peer reviewed: Smith, Richard M.; Schumacker, Randall E.; Bush, M. Joan – Journal of Outcome Measurement, 1998
Using item mean squares to evaluate fit to the Rasch model was studied, also considering the transformed version of the item fit statistics. Simulations demonstrate that the critical value for the mean square used to detect misfit is affected by the type of mean square and the number of persons in the calibration. (SLD)
Descriptors: Goodness of Fit, Item Response Theory, Simulation, Test Items
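The mean-square fit statistics under study have a standard form: squared standardized residuals from the Rasch model, averaged either unweighted (outfit) or weighted by response variance (infit). A sketch for a single examinee, with hypothetical ability and item difficulties:

```python
import math

def rasch_fit_mean_squares(responses, theta, difficulties):
    """Unweighted (outfit) and information-weighted (infit) mean
    squares for one examinee under the dichotomous Rasch model."""
    # Model-implied probability of success on each item
    p = [1 / (1 + math.exp(-(theta - b))) for b in difficulties]
    r2 = [(u - pi) ** 2 for u, pi in zip(responses, p)]   # squared residuals
    w = [pi * (1 - pi) for pi in p]                       # response variances
    outfit = sum(ri / wi for ri, wi in zip(r2, w)) / len(p)
    infit = sum(r2) / sum(w)
    return outfit, infit

# A Guttman-consistent pattern (easy item right, hard item wrong)
consistent = rasch_fit_mean_squares([1, 1, 0], 0.0, [-1.0, 0.0, 1.0])
# A reversed, misfitting pattern
aberrant = rasch_fit_mean_squares([0, 0, 1], 0.0, [-1.0, 0.0, 1.0])
```

Values near 1 indicate fit; the simulation finding above, that the critical value depends on the type of mean square and the calibration sample size, is about how far from 1 these statistics drift by chance.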
Peer reviewed: Kamata, Akihito – Journal of Educational Measurement, 2001
Presents the hierarchical generalized linear model (HGLM) as an explicit two-level formulation of a multilevel item response model. Shows that the HGLM is equivalent to the Rasch model, and that a characteristic of the HGLM is that person ability can be expressed as a latent regression model with person-characteristic variables. Shows that the…
Descriptors: Item Analysis, Item Response Theory, Regression (Statistics), Test Items
Peer reviewed: Bolt, Daniel M. – Journal of Educational Measurement, 2000
Reviewed aspects of the SIBTEST procedure through three studies. Study 1 examined the effects of item format using 40 mathematics items from the Scholastic Assessment Test. Study 2 considered the effects of a problem type factor and its interaction with item format for eight items, and study 3 evaluated the degree to which factors varied in the…
Descriptors: Computer Software, Hypothesis Testing, Item Bias, Mathematics
Peer reviewed: Gierl, Mark J.; Leighton, Jacqueline P.; Hunka, Stephen M. – Educational Measurement: Issues and Practice, 2000
Discusses the logic of the rule-space model (K. Tatsuoka, 1983) as it applies to test development and analysis. The rule-space model is a statistical method for classifying examinees' test item responses into a set of attribute-mastery patterns associated with different cognitive skills. Directs readers to a tutorial that may be downloaded. (SLD)
Descriptors: Item Analysis, Item Response Theory, Test Construction, Test Items
Peer reviewed: Jackson, Stacy L.; And Others – Journal of Career Assessment, 1996
Factor analysis of 1,030 adults' responses on the Myers Briggs Type Indicator (MBTI) was used to test 4 alternative models. Results support a four-factor structure similar to the original Jungian structure. Elimination of 12 MBTI items was recommended. (SK)
Descriptors: Construct Validity, Factor Analysis, Models, Personality Measures
Peer reviewed: Wainer, Howard – Journal of Educational and Behavioral Statistics, 2000
Suggests that because of the nonlinear relationship between item usage and item security, the problems of test security posed by continuous administration of standardized tests cannot be resolved merely by increasing the size of the item pool. Offers alternative strategies to overcome these problems, distributing test items so as to avoid the…
Descriptors: Computer Assisted Testing, Standardized Tests, Test Items, Testing Problems