Publication Date
| Date range | Results |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
Audience
| Audience | Results |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Location | Results |
| --- | --- |
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating
| Rating | Results |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Chen, Shu-Ying; Ankenmann, Robert D.; Spray, Judith A. – 1999
This paper presents a derivation of an average between-test overlap index as a function of the item exposure index, for fixed-length computerized adaptive tests (CAT). This relationship is used to investigate the simultaneous control of item exposure at both the item and test levels. Implications for practice as well as future research are also…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Test Items
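For fixed-length CATs, one commonly cited form of the relationship the abstract refers to expresses the average between-test overlap in terms of the pool size n, the test length L, and the variance of the item exposure rates. The sketch below is an illustration under the simplifying assumption that administrations are independent across examinees; it is not presented as the authors' exact derivation.

```python
import numpy as np

def average_test_overlap(exposure_rates, test_length):
    """Average between-test overlap implied by item exposure rates.

    Assumes a fixed-length CAT drawing from a pool of n items, with
    exposure rates r_i summing (approximately) to the test length L.
    Under independence across examinees, the expected number of items
    shared by two examinees is sum(r_i**2), so the average overlap rate
    is sum(r_i**2) / L, which equals (n/L) * var(r) + L/n.
    """
    r = np.asarray(exposure_rates, dtype=float)
    n, L = r.size, test_length
    overlap_direct = np.sum(r ** 2) / L
    overlap_from_variance = (n / L) * r.var() + L / n
    return overlap_direct, overlap_from_variance

# Example: a 400-item pool, 30-item tests, uneven exposure rates.
rng = np.random.default_rng(0)
rates = rng.dirichlet(np.ones(400)) * 30      # exposure rates summing to L = 30
print(average_test_overlap(rates, test_length=30))
```

The two returned values agree (up to rounding), which is the point of the identity: more uneven exposure (larger variance of the rates) directly raises the expected overlap between any two examinees' tests.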
Michaelides, Michalis P.; Haertel, Edward H. – Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2004
There is variability in the estimation of an equating transformation because common-item parameters are obtained from responses of samples of examinees. The most commonly used standard error of equating quantifies this source of sampling error, which decreases as the sample size of examinees used to derive the transformation increases. In a…
Descriptors: Test Items, Testing, Error Patterns, Interrater Reliability
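The sampling variability described in this abstract can be pictured with a resampling sketch: redraw examinee samples, re-estimate a simple mean-mean equating from the common items each time, and take the spread of the resulting constants as the standard error. The code below is a generic bootstrap illustration under assumed inputs (crude logit "difficulties" stand in for a real calibration); it is not the estimator developed in the report.

```python
import numpy as np

def mean_mean_constant(diff_new, diff_old):
    """Mean-mean equating constant: the shift that places the new form
    on the old form's scale via the common items' difficulties."""
    return np.mean(diff_old) - np.mean(diff_new)

def bootstrap_se(responses_new, responses_old, n_boot=2000, seed=1):
    """Bootstrap the equating constant over examinee samples.

    responses_new / responses_old: 0/1 numpy arrays (examinees x common
    items) from the two administrations.  Item 'difficulty' here is the
    logit of the proportion incorrect -- a crude stand-in used only to
    show where the examinee-sampling error enters the transformation.
    """
    rng = np.random.default_rng(seed)

    def difficulties(resp):
        p = resp.mean(axis=0).clip(0.01, 0.99)
        return np.log((1 - p) / p)

    constants = []
    for _ in range(n_boot):
        bn = responses_new[rng.integers(0, len(responses_new), len(responses_new))]
        bo = responses_old[rng.integers(0, len(responses_old), len(responses_old))]
        constants.append(mean_mean_constant(difficulties(bn), difficulties(bo)))
    return np.std(constants, ddof=1)
```

As the abstract notes, increasing the number of examinees used to estimate the common-item parameters shrinks this standard error.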
Oosterhof, Albert C. – Journal of Educational Measurement, 1976 (peer reviewed)
The purpose of this study was to investigate the degree to which various selected test item discrimination indices reflect a common factor. The indices used include the point-biserial, biserial, phi and tetrachoric coefficients, Flanagan's approximation of the product-moment correlation, Gulliksen's item reliability index, and Findley's difference…
Descriptors: Comparative Analysis, Correlation, Factor Analysis, Mathematical Formulas
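Two of the indices named in the abstract, the point-biserial and the phi coefficient, can be computed directly from a scored response matrix. The sketch below covers only those two as an illustration, not the full set compared in the study; using the rest-score as the criterion is a choice made here to avoid item-total overlap.

```python
import numpy as np

def point_biserial(item, total):
    """Point-biserial discrimination: correlation between a 0/1 item
    score and the rest-score (total minus the item itself)."""
    item = np.asarray(item, float)
    rest = np.asarray(total, float) - item        # avoid item-total overlap
    p = item.mean()
    m1, m0 = rest[item == 1].mean(), rest[item == 0].mean()
    return (m1 - m0) / rest.std() * np.sqrt(p * (1 - p))

def phi(item_a, item_b):
    """Phi coefficient between two dichotomous variables (2x2 table)."""
    a = np.mean((item_a == 1) & (item_b == 1))
    b = np.mean((item_a == 1) & (item_b == 0))
    c = np.mean((item_a == 0) & (item_b == 1))
    d = np.mean((item_a == 0) & (item_b == 0))
    denom = np.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom > 0 else 0.0
```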
Hicks, Marilyn Maginley – Multivariate Behavioral Research, 1981 (peer reviewed)
An empirical investigation of the statistical procedure entitled nonlinear principal components analysis was conducted on a known equation and on measurement data in order to demonstrate the procedure and examine its potential usefulness. This method was suggested by R. Gnanadesikan and based on an early paper of Karl Pearson. (Author/AL)
Descriptors: Correlation, Factor Analysis, Mastery Tests, Measurement Techniques
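One common reading of the Gnanadesikan procedure the abstract points to is to augment the observed variables with polynomial terms and then inspect the smallest-eigenvalue principal component, whose near-zero variance traces out the nonlinear relation. The sketch below follows that reading on an assumed known equation (y = x²); it may differ in detail from the procedure Hicks demonstrates.

```python
import numpy as np

# Known quadratic relation: y = x**2 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 500)
y = x ** 2 + rng.normal(0, 0.01, 500)

# Augment with polynomial terms, standardize, and run an ordinary PCA.
Z = np.column_stack([x, y, x ** 2, y ** 2, x * y])
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))

# The component with the smallest eigenvalue is nearly constant, and its
# loadings recover the quadratic constraint linking x and y.
print("smallest eigenvalue:", eigvals[0])
print("loadings:", np.round(eigvecs[:, 0], 2))
```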
Vidler, Derek; Hansen, Richard – Journal of Experimental Education, 1980 (peer reviewed)
Relationships among patterns of answer changing and item characteristics on multiple-choice tests are discussed. Results obtained were similar to those found in previous studies but pointed to further relationships among these variables. (Author/GK)
Descriptors: College Students, Difficulty Level, Higher Education, Multiple Choice Tests
Berk, Ronald A. – Educational Research Quarterly, 1978 (peer reviewed)
Guttman's mapping sentence technique is examined as a mechanism for defining domains of cognitive behavior and for generating test items to measure achievement in those domains. The utility of the mechanism as compared to alternatives is discussed. (Author/JKS)
Descriptors: Achievement Tests, Cognitive Objectives, Semantics, Technical Reports
Vegelius, Jan – Educational and Psychological Measurement, 1979 (peer reviewed)
The G index is a measure of similarity between pairs of dichotomized items. The G index is generalized here to the case where items are trichotomized. (JKS)
Descriptors: Correlation, Item Analysis, Nonparametric Statistics, Technical Reports
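For dichotomized items, the G index is usually given as the proportion of respondents on whom the two items agree minus the proportion on whom they disagree. The sketch below shows only that dichotomous base case; the article's contribution is the extension to trichotomized items, which is not reproduced here.

```python
import numpy as np

def g_index(item_a, item_b):
    """G index for two dichotomized items: proportion of respondents on
    whom the items agree minus the proportion on whom they disagree."""
    a = np.asarray(item_a)
    b = np.asarray(item_b)
    agree = np.mean(a == b)
    return 2 * agree - 1          # agree - disagree = 2 * agree - 1

# Example: two items answered by the same five respondents.
print(g_index([1, 0, 1, 1, 0], [1, 0, 0, 1, 0]))   # 0.6
```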
Camilli, Gregory; Penfield, Douglas A. – Journal of Educational Measurement, 1997 (peer reviewed)
For the simultaneous assessment of differential item functioning (DIF) across a collection of test items, an index that measures the variance of DIF on a test, as an indicator of the degree to which different items show DIF in different directions, is proposed and evaluated through simulations. (SLD)
Descriptors: Ability, Estimation (Mathematics), Item Bias, Item Response Theory
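A simple way to picture a "variance of DIF" index is to take a per-item DIF estimate, such as a Mantel-Haenszel log-odds ratio, and ask how much those estimates vary across items beyond what their sampling errors explain. The method-of-moments sketch below illustrates that idea under assumed inputs; it is not presented as the specific estimator proposed by Camilli and Penfield.

```python
import numpy as np

def dif_variance(dif_estimates, sampling_variances):
    """Method-of-moments estimate of between-item DIF variance.

    dif_estimates: per-item DIF effects (e.g., Mantel-Haenszel log-odds ratios).
    sampling_variances: estimation error variance of each effect.
    Observed spread minus average sampling noise approximates how much
    true DIF differs in size and direction across items.
    """
    d = np.asarray(dif_estimates, float)
    v = np.asarray(sampling_variances, float)
    return max(0.0, d.var(ddof=1) - v.mean())
```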
Hansen, James D.; Dexter, Lee – Journal of Education for Business, 1997 (peer reviewed)
Analysis of test item banks in 10 auditing textbooks found that 75% of questions violated one or more guidelines for multiple-choice items. In comparison, 70% of a certified public accounting exam bank had no violations. (SK)
Descriptors: Accounting, Guidelines, Item Banks, Multiple Choice Tests
Linacre, John M.; Wright, Benjamin D. – Journal of Applied Measurement, 2002 (peer reviewed)
Describes an extension to the Rasch model for fundamental measurement in which there is parameterization not only for examinee ability and item difficulty but also for judge severity. Discusses variants of this model and judging plans, and explains its use in an empirical testing situation. (SLD)
Descriptors: Ability, Difficulty Level, Evaluators, Item Response Theory
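One standard statement of the extension described here (often called the many-facet Rasch model) writes the log-odds of an examinee moving from category k-1 to category k of item i, as rated by judge j, as a sum of facet parameters; the form below is the usual textbook formulation and may differ in notation from the article.

```latex
\log\frac{P_{nijk}}{P_{nij(k-1)}} = B_n - D_i - C_j - F_k
```

Here B_n is the ability of examinee n, D_i the difficulty of item i, C_j the severity of judge j, and F_k the step (category threshold) parameter.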
Miller, G. Edward; Beretvas, S. Natasha – Journal of Applied Measurement, 2002 (peer reviewed)
Presents empirically based item selection guidelines for moving the cut score on equated tests consisting of "n" dichotomous items calibrated assuming the Rasch model. Derivations of lemmas that underlie the guidelines are provided as well as a simulated example. (SLD)
Descriptors: Cutting Scores, Equated Scores, Item Response Theory, Selection
Rudas, Tamas; Zwick, Rebecca – Journal of Educational and Behavioral Statistics, 1997 (peer reviewed)
The mixture index of fit (Rudas et al., 1994) is used to estimate the fraction of a population for which differential item functioning (DIF) occurs, and this approach is compared to the Mantel-Haenszel test of DIF. The proposed noniterative procedure provides information about the portions of the data contributing to DIF. (SLD)
Descriptors: Comparative Analysis, Estimation (Mathematics), Item Bias, Maximum Likelihood Statistics
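The mixture index of fit can be stated compactly: the population distribution P is written as a two-point mixture of a part that satisfies the model (here, a no-DIF model) and an arbitrary remainder, and the index is the smallest mixing weight that makes this decomposition possible. The statement below follows the usual formulation attributed to Rudas, Clogg, and Lindsay (1994); the details of its application to DIF in this article are not reproduced.

```latex
P = (1-\pi)\,\Phi + \pi\,\Psi, \qquad
\pi^{*} = \min\{\pi : \Phi \text{ satisfies the model},\ \Psi \text{ arbitrary}\}
```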
Veldkamp, Bernard P. – Applied Psychological Measurement, 2002 (peer reviewed)
Presents two mathematical programming approaches for the assembly of ability tests from item pools calibrated under a multidimensional item response theory model. Item selection is based on the Fisher information matrix. Illustrates the method through empirical examples for a two-dimensional mathematics item pool. (SLD)
Descriptors: Ability, Item Banks, Item Response Theory, Selection
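For a compensatory two-dimensional 2PL item, the Fisher information matrix at ability theta is P(1-P) a aᵀ, where a is the item's discrimination vector. The article formulates assembly as a mathematical programming problem; the sketch below substitutes a much simpler greedy rule (add the item that most increases the determinant of the accumulated information matrix) purely to show how the information matrix enters item selection. The data and the greedy rule are assumptions for illustration, not the article's method.

```python
import numpy as np

def m2pl_info(a, d, theta):
    """Fisher information matrix of a compensatory 2PL item at theta.
    a: discrimination vector, d: intercept, theta: ability vector."""
    p = 1.0 / (1.0 + np.exp(-(a @ theta + d)))
    return p * (1 - p) * np.outer(a, a)

def greedy_assemble(a_mat, d_vec, theta, test_length):
    """Greedy D-optimal selection: repeatedly add the item that most
    increases det of the summed information matrix at a target theta."""
    pool = list(range(len(d_vec)))
    chosen, info = [], 1e-6 * np.eye(len(theta))   # small ridge so det > 0
    for _ in range(test_length):
        gains = [np.linalg.det(info + m2pl_info(a_mat[i], d_vec[i], theta))
                 for i in pool]
        best = pool.pop(int(np.argmax(gains)))
        chosen.append(best)
        info += m2pl_info(a_mat[best], d_vec[best], theta)
    return chosen

# Example: a small random two-dimensional pool and a 10-item test at theta = (0, 0).
rng = np.random.default_rng(0)
a_mat = rng.uniform(0.5, 1.5, size=(100, 2))
d_vec = rng.normal(0, 1, size=100)
print(greedy_assemble(a_mat, d_vec, np.zeros(2), test_length=10))
```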
Davis, Laurie Laughlin; Pastor, Dena A.; Dodd, Barbara G.; Chiang, Claire; Fitzpatrick, Steven J. – Journal of Applied Measurement, 2003 (peer reviewed)
Examined the effectiveness of the Sympson-Hetter technique and rotated content balancing relative to no exposure control and no content rotation conditions in a computerized adaptive testing system based on the partial credit model. Simulation results show the Sympson-Hetter technique can be used with minimal impact on measurement precision,…
Descriptors: Adaptive Testing, Computer Assisted Testing, Selection, Simulation
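The Sympson-Hetter technique controls exposure by giving each item an administration probability K_i: the item the selection rule prefers is administered only with probability K_i, otherwise the next-best candidate is tried, and the K_i are tuned over repeated simulations so that no item's exposure rate exceeds a target r_max. The sketch below shows only the probabilistic filter and a typical between-simulation adjustment step, with names assumed for illustration; the content-rotation part of the study is not shown.

```python
import random

def administer_with_sh(ranked_candidates, K, rng=random):
    """Sympson-Hetter filter: walk down the ranked candidate list and
    administer item i with probability K[i]; fall back to the last
    candidate if every probabilistic check fails."""
    for item in ranked_candidates:
        if rng.random() <= K[item]:
            return item
    return ranked_candidates[-1]

def adjust_exposure_parameters(K, selected_rate, r_max):
    """Typical between-simulation update: shrink K_i for items whose
    probability of being *selected* would push exposure above r_max."""
    for item, p_sel in selected_rate.items():
        K[item] = 1.0 if p_sel <= r_max else min(1.0, r_max / p_sel)
    return K
```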
Enright, Mary K.; Morley, Mary; Sheehan, Kathleen M. – Applied Measurement in Education, 2002 (peer reviewed)
Studied the impact of systematic item feature variation on item statistical characteristics and the degree to which such information could be used as collateral information to supplement examinee performance data and reduce pretest sample size by generating 2 families of 48 word problem variants for the Graduate Record Examinations. Results with…
Descriptors: College Entrance Examinations, Sample Size, Statistical Analysis, Test Construction


