Logsdon, David M. – 1981
This study examined the effectiveness of two different domain definition strategies in achieving homogeneity of criterion-referenced test items. The argument was tested regarding the extent to which item writers following the Instructional Objectives Exchange (IOX) domain definition strategy for a cognitive skill generate items that are more…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Methods, Test Construction
Traub, Ross E.; Fisher, Charles W.
Two sets of mathematical reasoning and two sets of verbal comprehension items were cast into each of three formats--constructed response, standard multiple-choice, and Coombs multiple-choice--in order to assess whether tests with identical content but different formats measure the same attribute, except for possible differences in error variance…
Descriptors: Memory, Multiple Choice Tests, Recall (Psychology), Responses
Peer reviewed: Krus, David J.; Ney, Robert G. – Educational and Psychological Measurement, 1978
An algorithm for item analysis in which item discrimination indices have been defined for the distractors as well as the correct answer is presented. Also, the concept of convergent and discriminant validity is applied to items instead of tests, and is discussed as an aid to item analysis. (Author/JKS)
Descriptors: Algorithms, Item Analysis, Multiple Choice Tests, Test Items
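The idea of computing a discrimination index for every response option, not only the keyed answer, can be sketched as follows. This is an illustrative sketch, not the algorithm from the article: it assumes a point-biserial correlation between choosing an option and the total test score as the index, and the function name `option_discrimination` and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def option_discrimination(choices, totals, options):
    """Point-biserial correlation between picking each option and total score.

    choices: the option each examinee selected on one item
    totals:  each examinee's total test score (same order as choices)
    options: all response options for the item, e.g. ["A", "B", "C"]
    Assumption: point-biserial is used as the index; other indices exist.
    """
    overall_mean = mean(totals)
    overall_sd = pstdev(totals)  # population SD, as in the usual formula
    result = {}
    for opt in options:
        picked_totals = [t for c, t in zip(choices, totals) if c == opt]
        p = len(picked_totals) / len(choices)  # proportion choosing opt
        if overall_sd == 0 or p in (0.0, 1.0):
            result[opt] = None  # index is undefined without variation
            continue
        # r_pb = (M_picked - M_all) / SD * sqrt(p / (1 - p))
        r_pb = (mean(picked_totals) - overall_mean) / overall_sd
        result[opt] = r_pb * (p / (1 - p)) ** 0.5
    return result


# Hypothetical data: "A" is the keyed answer.
choices = ["A", "A", "B", "C", "A", "B"]
totals = [9, 8, 3, 2, 7, 4]
indices = option_discrimination(choices, totals, ["A", "B", "C"])
```

Under this index, a well-functioning item would show a positive value for the keyed answer and negative values for the distractors.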
Peer reviewed: Pine, Steven M.; Wattawa, Scott – Educational and Psychological Measurement, 1978
A computer program for a comparative evaluation of the extent of item bias between two subgroups in a test population is described. The program calculates an index of bias based on Angoff's elliptical distance measure, and provides statistics for determining the similarity of intergroup item parameters. (Author/JKS)
Descriptors: Comparative Analysis, Computer Programs, Item Analysis, Test Bias
Peer reviewed: Green, Samual B.; And Others – Educational and Psychological Measurement, 1977
Confusion in the literature between the concepts of internal consistency and homogeneity has led to a misuse of coefficient alpha as an index of item homogeneity. This misuse is discussed and several indices of item homogeneity derived from the model of common factor analysis are offered as alternatives. (Author/JKS)
Descriptors: Factor Analysis, Item Analysis, Test Interpretation, Test Items
Peer reviewed: Wilcox, Rand R. – Journal of Experimental Education, 1985
A new method of measuring item bias based on the latent class model proposed by the author is suggested. A test for item bias is also suggested that is based on standard asymptotic results. (Author/DWH)
Descriptors: Mathematical Models, Measurement Techniques, Statistical Analysis, Test Bias
Peer reviewed: Joshi, Bhairav D. – Journal of Chemical Education, 1986
Provides a question (with the acceptable answer) designed to test students' ability to apply, and extend, the concept of thermodynamic work discussed in the classroom. The question was originally designed as a part of a take-home examination. (JN)
Descriptors: Chemistry, College Science, Higher Education, Science Education
Peer reviewed: Evans, William – Journal of Experimental Education, 1984
The capacity of examinees to develop cue-using strategies was examined, and the results suggest that students profit from knowledge of a particular test constructor's idiosyncrasies. The findings also lend weight to the argument that performance on test wiseness items is cue-specific. (Author/BW)
Descriptors: Adults, Cues, Test Construction, Test Items
Wang, Jianjun – Online Submission, 2004
Primary school data from the Third International Mathematics and Science Study (TIMSS) are analyzed in this article to examine performance difference between 3rd and 4th grades. Score comparisons are determined across all TIMSS items in each of the participating countries, using computer technology and programming to complete the thousands of…
Descriptors: Articulation (Education), Test Results, Computers, Test Items
Childs, Ruth A.; Jaciw, Andrew P. – 2003
Matrix sampling of test items, the division of a set of items into different versions of a test form, is used by several large-scale testing programs. This Digest discusses nine categories of costs associated with matrix sampling. These categories are: (1) development costs; (2) materials costs; (3) administration costs; (4) educational costs; (5)…
Descriptors: Costs, Matrices, Reliability, Sampling
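The basic operation described here, dividing one item set across several test forms, might look like the sketch below. It assumes the simplest possible design, a round-robin split into non-overlapping forms; operational programs often use overlapping or balanced designs that this does not cover, and the name `matrix_sample` is hypothetical.

```python
def matrix_sample(items, n_forms):
    """Split items into n_forms non-overlapping forms, round-robin.

    Assumption: forms share no items. Each item appears on exactly
    one form, and form lengths differ by at most one item.
    """
    return [items[i::n_forms] for i in range(n_forms)]


# Hypothetical pool of seven item IDs spread over three forms.
forms = matrix_sample(list(range(7)), 3)
```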
Weems, Gail H.; Onwuegbuzie, Anthony J.; Lustig, Daniel – 2002
Many instruments, especially Likert-type scales, contain both positively and negatively worded items within the same scale (i.e., mixed item format). A major reason for this practice appears to be to discourage response sets from emerging. Using this format also helps the analyst detect response sets that occur in data sets, and thus eliminate…
Descriptors: Individual Characteristics, Likert Scales, Profiles, Rating Scales
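Scales with a mixed item format are typically scored by reverse-coding the negatively worded items before summing. A minimal sketch, assuming a 1 to 5 response scale and zero-based item positions (the function name, parameters, and data are hypothetical):

```python
def reverse_score(responses, negative_items, low=1, high=5):
    """Reverse-code negatively worded items on a Likert-type scale.

    responses:      one respondent's raw answers, in item order
    negative_items: zero-based positions of negatively worded items
    low, high:      scale endpoints (assumed 1 and 5 here)
    """
    return [
        (low + high - r) if i in negative_items else r
        for i, r in enumerate(responses)
    ]


# Hypothetical respondent; items at positions 1 and 3 are negatively worded.
scored = reverse_score([1, 5, 3, 2], {1, 3})
```

A respondent who gives the same raw answer to every item regardless of wording will show large differences after reverse-coding, which is one way a response set becomes visible in the data.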
Williamson, David M.; Johnson, Matthew S.; Sinharay, Sandip; Bejar, Isaac I. – 2002
This study explored the application of hierarchical model calibration as a means of reducing, if not eliminating, the need for pretesting of automatically generated items from a common item model prior to operational use. Ultimately the successful development of automatic item generation (AIG) systems capable of producing items with highly similar…
Descriptors: Junior High School Students, Junior High Schools, Models, Test Items
Anzaldua, Ric M. – 2002
This paper discusses item banks calibrated to indicate levels of difficulty to assist in test development. The item bank topics discussed are: (1) purpose; (2) development issues; (3) advantages and disadvantages; and (4) practical issues. The most common issues are content validity, reliability, concerns with software purchase and programming,…
Descriptors: Item Banks, Item Response Theory, Test Construction, Test Items
van der Linden, Wim J.; Chang, Hua-Hua – 2001
The methods of alpha-stratified adaptive testing and constrained adaptive testing with shadow tests are combined in this study. The advantages are twofold. First, application of the shadow test allows the researcher to implement any type of constraint on item selection in alpha-stratified adaptive testing. Second, the result yields a simple set of…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Selection
van der Linden, Wim J.; Vos, Hans J.; Chang, Lei – 2000
In judgmental standard setting experiments, it may be difficult to specify subjective probabilities that adequately take the properties of the items into account. As a result, these probabilities are not consistent with each other in the sense that they do not refer to the same borderline level of performance. Methods to check standard setting…
Descriptors: Interrater Reliability, Judges, Probability, Standard Setting


