Peer reviewed: Eggen, T. J. H. M. – Applied Psychological Measurement, 1999
Evaluates a method for item selection in adaptive testing that is based on Kullback-Leibler information (KLI) (T. Cover and J. Thomas, 1991). Simulation study results show that testing algorithms using KLI-based item selection perform better than or as well as those using Fisher information item selection. (SLD)
Descriptors: Adaptive Testing, Algorithms, Computer Assisted Testing, Selection
Peer reviewed: Allen, Nancy L.; Donoghue, John R. – Journal of Educational Measurement, 1996
Examined the effect of complex sampling of items on the measurement of differential item functioning (DIF) using the Mantel-Haenszel procedure through a Monte Carlo study. Suggests the superiority of the pooled booklet method when items are selected for examinees according to a balanced incomplete block design. Discusses implications for other DIF…
Descriptors: Item Bias, Monte Carlo Methods, Research Design, Sampling
Peer reviewed: De Ayala, R. J.; Sava-Bolesta, Monica – Applied Psychological Measurement, 1999
Investigated the relationship between sample size, latent trait distribution, and item parameter estimation with the nominal response model through simulation. Results suggest guidelines for reasonable item parameter estimation. (SLD)
Descriptors: Estimation (Mathematics), Item Response Theory, Sample Size, Simulation
Peer reviewed: Gierl, Mark J.; Henderson, Diane; Jodoin, Michael; Klinger, Don – Journal of Experimental Education, 2001
Examined the influence of item parameter estimation errors across three item selection methods using the two- and three-parameter logistic item response theory (IRT) model. Tests created with the maximum no target and maximum target item selection procedures consistently overestimated the test information function. Tests created using the theta…
Descriptors: Estimation (Mathematics), Item Response Theory, Selection, Test Construction
Peer reviewed: Penfield, Randall D. – Applied Measurement in Education, 2001
Compared the performance of three methods of assessing differential item functioning (DIF) across demographic groups, using: (1) the Mantel-Haenszel chi-square statistic with no adjustment to the alpha level; (2) the Mantel-Haenszel statistic with a Bonferroni adjusted alpha level; and (3) the generalized Mantel-Haenszel statistic. Simulation…
Descriptors: Chi Square, Demography, Item Bias, Power (Statistics)
Hayashi, Kentaro; Kamata, Akihito – Psychometrika, 2005
The asymptotic standard deviation (SD) of the alpha coefficient with standardized variables is derived under normality. The research shows that the SD of the standardized alpha coefficient becomes smaller as the number of examinees and/or items increases. Furthermore, this research shows that the degree of the dependence of the SD on the number of…
Descriptors: Correlation, Statistical Analysis, Measurement Techniques, Simulation
Van Onna, Marieke J. H. – Applied Psychological Measurement, 2004
Coefficient "H" is used as an index of scalability in nonparametric item response theory (NIRT). It indicates the degree to which a set of items rank orders examinees. Theoretical sampling distributions, however, have only been derived asymptotically and only under restrictive conditions. Bootstrap methods offer an alternative possibility to…
Descriptors: Sampling, Item Response Theory, Scaling, Comparative Analysis
Bock, R. Darrell; Brennan, Robert L.; Muraki, Eiji – Applied Psychological Measurement, 2002
In assessment programs where scores are reported for individual examinees, it is desirable to have responses to performance exercises graded by more than one rater. If more than one item on each test form is so graded, it is also desirable that different raters grade the responses of any one examinee. This gives rise to sampling designs in which…
Descriptors: Generalizability Theory, Test Items, Item Response Theory, Error of Measurement
Brusco, Michael J. – Journal of Problem Solving, 2007
The study of human performance on discrete optimization problems has a considerable history that spans various disciplines. The two most widely studied problems are the Euclidean traveling salesperson problem and the quadratic assignment problem. The purpose of this paper is to outline a program of study for the measurement of human performance on…
Descriptors: Problem Solving, Performance, Measurement, Criticism
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2007
In an Angoff standard-setting procedure, judges estimate the probability that a hypothetical, randomly selected, minimally competent candidate will answer each item in the test correctly. In many cases, these item performance estimates are made twice, with information shared with the panelists between estimates. Especially for long tests, this…
Descriptors: Test Items, Probability, Item Analysis, Standard Setting (Scoring)
Turner, Haley; Williams, Robert L. – Journal of College Reading and Learning, 2007
Scores on a vocabulary test given at the beginning of two semesters in a large entry-level course predicted performance on multiple-choice exams more strongly than pre-course knowledge and critical thinking. Words on the vocabulary instrument were derived from multiple-choice exam items in the course. Although commonly used in the course, these…
Descriptors: Vocabulary Development, Multiple Choice Tests, Scores, Introductory Courses
Rowan, Noell; Wulff, Dan – Qualitative Report, 2007
This article describes the process by which one study used qualitative methods to create items for a multidimensional scale to measure twelve-step program affiliation. The process included interviewing fourteen addicted persons while in twelve-step-focused treatment about specific pros (things they like or would miss out on by not being…
Descriptors: Qualitative Research, Measures (Individuals), Test Items, Test Construction
Nylund, Karen L.; Asparouhov, Tihomir; Muthen, Bengt O. – Structural Equation Modeling: A Multidisciplinary Journal, 2007
Mixture modeling is a widely applied data analysis technique used to identify unobserved heterogeneity in a population. Despite mixture models' usefulness in practice, one unresolved issue in the application of mixture models is that there is not one commonly accepted statistical indicator for deciding on the number of classes in a study…
Descriptors: Test Items, Monte Carlo Methods, Program Effectiveness, Data Analysis
Wilhelm, Jennifer – International Journal of Science Education, 2009
This paper examines gender differences in the understanding of lunar phases among 123 students (70 females and 53 males). Middle-level students interacted with the Moon through observations, sketching, journalling, two-dimensional and three-dimensional modelling, and classroom discussions. These lunar lessons were adapted from the Realistic…
Descriptors: Test Results, Test Items, Females, Astronomy
Xu, Xueli; von Davier, Matthias – ETS Research Report Series, 2008
Three strategies for linking two consecutive assessments are investigated and compared by analyzing reading data for the National Assessment of Educational Progress (NAEP) using the general diagnostic model. These strategies are compared in terms of marginal and joint expectations of skills, joint probabilities of skill patterns, and item…
Descriptors: National Competency Tests, Probability, Reading Achievement, Test Items

