| Publication Date | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
| Audience | Records |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
| Location | Records |
| --- | --- |
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
| What Works Clearinghouse Rating | Records |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Mizokawa, Donald T.; Hamlin, Michael D. – Educational Technology, 1984
Suggestions for software design in computer-managed testing (CMT) cover instructions to testees (physical format, provision of practice items, and time-limit information); test item presentation (physical format, discussion of task demands, review capabilities, and rate of presentation); pedagogically helpful utilities; typefonts; vocabulary;…
Descriptors: Computer Assisted Testing, Decision Making, Guidelines, Test Construction
Sharpley, Christopher F.; Rogers, H. Jane – Journal of Clinical Psychology, 1985 (peer reviewed)
Compared items written by psychologically naive and psychologically sophisticated item writers with items from a standardized test (N=552). Results showed that nonpsychologists with no formal definition of the construct they were to measure were able to write items as valid as those elicited from psychologists. (BH)
Descriptors: Anxiety, Foreign Countries, Lay People, Measurement Techniques
Van Der Flier, Henk; And Others – Journal of Educational Measurement, 1984 (peer reviewed)
Two strategies for assessing item bias are discussed: methods comparing item difficulties unconditional on ability and methods comparing probabilities of response conditional on ability. Results suggest that the iterative logit method is an improvement on the noniterative one and is efficient in detecting biased and unbiased items. (Author/DWH)
Descriptors: Algorithms, Evaluation Methods, Item Analysis, Scores
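The two strategies above can be stated side by side; the notation here ($p_j^{(g)}$ for a group's proportion correct, $\theta$ for ability, $U_j$ for the item response) is generic, not the authors' own:

```latex
% Unconditional: compare raw item difficulties across groups,
% ignoring any difference in the groups' ability distributions.
\text{Flag item } j \text{ if } p_j^{(R)} \neq p_j^{(F)}.

% Conditional: compare response probabilities at fixed ability, so a
% group difference in the theta distribution is not mistaken for bias.
\text{Flag item } j \text{ if } P(U_j = 1 \mid \theta, R) \neq P(U_j = 1 \mid \theta, F)
\text{ for some } \theta .
```

The iterative logit method belongs to the conditional family; the iteration presumably purifies the matching criterion by refitting after flagged items are set aside.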
Conger, Anthony J. – Educational and Psychological Measurement, 1983 (peer reviewed)
A paradoxical phenomenon of decreases in reliability as the number of elements averaged over increases is shown to be possible in multifacet reliability procedures (intraclass correlations or generalizability coefficients). Conditions governing this phenomenon are presented along with implications and cautions. (Author)
Descriptors: Generalizability Theory, Test Construction, Test Items, Test Length
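A one-line reminder of why such a decrease is paradoxical: with a single facet, the Spearman-Brown relation makes reliability strictly increasing in the number of elements averaged, so the phenomenon can only arise in the multifacet designs Conger analyzes. The standard single-facet statement (not Conger's notation):

```latex
\rho_n = \frac{n\,\rho_1}{1 + (n-1)\,\rho_1},
\qquad 0 < \rho_1 < 1 \implies \rho_n \text{ increases monotonically in } n .
```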
Howe, Roger; Scheaffer, Richard; Lindquist, Mary; Philip, Frank; Halbrook, Arthur – US Department of Education, 2004
This document contains the framework and a set of recommendations for the 2005 NAEP mathematics assessment. It includes descriptions of the mathematical content of the test, the types of test questions, and recommendations for administration of the test. In broad terms, this framework attempts to answer the question: What mathematics should be…
Descriptors: National Competency Tests, Student Evaluation, Mathematics Achievement, Test Items
Kim, Seock-Ho – 2002
Continuation ratio logits are used to model the probabilities of obtaining ordered categories in a polytomously scored item. This model is an alternative to other models for ordered-category items, such as the graded response model and the generalized partial credit model. The discussion includes a theoretical development of the model, a…
Descriptors: Ability, Classification, Item Response Theory, Mathematical Models
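One common IRT parameterization of the continuation ratio logit for an item with ordered categories $0, 1, \dots, m$; the discrimination and location parameters $a_k, b_k$ are a generic choice and may differ from Kim's formulation:

```latex
\log \frac{P(X > k \mid X \ge k,\ \theta)}{P(X = k \mid X \ge k,\ \theta)}
= a_k(\theta - b_k), \qquad k = 0, 1, \dots, m - 1 ,
```

so higher ability raises the odds of moving past each category in turn, unlike the cumulative logits of the graded response model or the adjacent-category logits of the generalized partial credit model.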
Childs, Ruth A.; Jaciw, Andrew P. – 2003
This Digest describes matrix sampling of test items as an approach to achieving broad coverage while minimizing testing time per student. Matrix sampling involves developing a complete set of items judged to cover the curriculum, then dividing the items into subsets and administering one subset to each student. Matrix sampling, by limiting the…
Descriptors: Item Banks, Matrices, Sampling, Test Construction
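A minimal sketch of the procedure the Digest describes: divide a pool judged to cover the curriculum into subsets (forms) and administer one subset to each student. The spiraling assignment, pool size, and function name are illustrative assumptions:

```python
import random

def matrix_sample(items, n_forms, students, seed=0):
    """Split an item pool into n_forms subsets and give each student
    one subset, spiraling so forms are administered about equally often."""
    rng = random.Random(seed)
    pool = items[:]
    rng.shuffle(pool)                          # randomize before splitting
    forms = [pool[i::n_forms] for i in range(n_forms)]
    # Student k gets form k mod n_forms: the whole pool is covered by the
    # group even though no single student sees every item.
    return {s: forms[k % n_forms] for k, s in enumerate(students)}

# Hypothetical 12-item pool, 3 forms, 6 students.
assignment = matrix_sample([f"item{i}" for i in range(12)], 3,
                           [f"student{j}" for j in range(6)])
for student, form in assignment.items():
    print(student, sorted(form))
```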
Perkins, Kyle; Pohlmann, John T. – 2002
The purpose of this study was to determine whether the response patterns of English as a Second Language (ESL) students on a reading comprehension test would change over time as the subjects' ESL reading comprehension competence was restructured with gains in their overall ESL proficiency. In this context, restructuring refers to…
Descriptors: Adults, Change, English (Second Language), Reading Comprehension
Hendrickson, Amy B.; Kolen, Michael J. – 2001
This study compared various equating models and procedures for a sample of data from the Medical College Admission Test (MCAT), considering how item response theory (IRT) equating results compare with classical equipercentile results and how the results based on use of various IRT models, observed score versus true score, direct versus linked…
Descriptors: Equated Scores, Higher Education, Item Response Theory, Models
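For readers unfamiliar with the classical side of the comparison: equipercentile equating maps each form-X score to the form-Y score holding the same percentile rank. A rough sketch with simulated number-correct scores; the mid-percentile convention and the data are illustrative, not MCAT procedures:

```python
import numpy as np

def percentile_ranks(scores, max_score):
    """Mid-percentile ranks for integer scores 0..max_score."""
    freqs = np.bincount(scores, minlength=max_score + 1)
    below = np.cumsum(freqs) - freqs          # counts strictly below each score
    return (below + freqs / 2) / len(scores)  # mid-point convention

def equipercentile(x_scores, y_scores, max_score):
    """Map each form-X score to the form-Y score with the same
    percentile rank, interpolating between integer Y scores."""
    pr_x = percentile_ranks(x_scores, max_score)
    pr_y = percentile_ranks(y_scores, max_score)
    return np.interp(pr_x, pr_y, np.arange(max_score + 1))

rng = np.random.default_rng(1)
form_x = rng.binomial(40, 0.55, size=2000)    # simulated number-correct scores
form_y = rng.binomial(40, 0.60, size=2000)    # a slightly easier form
print(equipercentile(form_x, form_y, 40)[:10])  # Y-equivalents of X = 0..9
```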
Reese, Lynda M. – 1999
This study represented a first attempt to evaluate the impact of local item dependence (LID) on Item Response Theory (IRT) scoring in computerized adaptive testing (CAT). The most basic CAT design and a simplified design for simulating CAT item pools with varying degrees of LID were applied. A data generation method that allows the LID among…
Descriptors: College Entrance Examinations, Item Response Theory, Law Schools, Scoring
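The abstract does not spell out the data generation method, so this is only one standard way to induce a controllable degree of LID in simulated responses: add a person-by-testlet random effect to the Rasch logit. The model choice, pool dimensions, and sigma_gamma are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n_examinees, n_testlets, items_per = 1000, 10, 5
theta = rng.normal(0, 1, n_examinees)                   # abilities
b = rng.normal(0, 1, (n_testlets, items_per))           # item difficulties

# sigma_gamma tunes the degree of LID: 0 recovers local independence;
# larger values make items within a testlet more strongly dependent.
sigma_gamma = 0.8
gamma = rng.normal(0, sigma_gamma, (n_examinees, n_testlets))

# Within a testlet all items share gamma, inducing dependence beyond theta.
logits = theta[:, None, None] + gamma[:, :, None] - b[None, :, :]
responses = rng.binomial(1, 1 / (1 + np.exp(-logits)))  # (1000, 10, 5)
print(responses.shape)
```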
Bond, Lloyd – Carnegie Foundation for the Advancement of Teaching, 2004
The writer comments on the issue of high-stakes testing and the pressures on teachers to "teach to the test." Although many view teaching to the test as an all-or-none issue, in practice it is actually a continuum. At one end, some teachers examine the achievement objectives as described in their curriculum and then design instructional activities…
Descriptors: Testing, Standardized Tests, High Stakes Tests, Academic Achievement
Zwick, Rebecca – 1994
The Mantel-Haenszel (MH; Mantel & Haenszel, 1959) approach of Holland and Thayer (1988) is a well-established method for assessing differential item functioning (DIF). The formula for the variance of the MH DIF statistic is based on work by Phillips and Holland (1987) and Robins, Breslow, and Greenland (1986). Recent simulation studies showed that the MH variances…
Descriptors: Adaptive Testing, Evaluation Methods, Item Bias, Measurement Techniques
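The MH DIF statistic is built from the Mantel-Haenszel common odds ratio pooled over matched score strata; the ETS delta-scale index is $-2.35 \ln \hat{\alpha}_{MH}$. A sketch on simulated data, where the crude stratification on rounded ability stands in for a real total-score match:

```python
import numpy as np

def mh_odds_ratio(correct, group, stratum):
    """MH common odds ratio for one item: group 0 = reference, 1 = focal;
    stratum = matching score level. alpha near 1 suggests no DIF."""
    num = den = 0.0
    for s in np.unique(stratum):
        m = stratum == s
        a = np.sum((group[m] == 0) & (correct[m] == 1))  # reference right
        b = np.sum((group[m] == 0) & (correct[m] == 0))  # reference wrong
        c = np.sum((group[m] == 1) & (correct[m] == 1))  # focal right
        d = np.sum((group[m] == 1) & (correct[m] == 0))  # focal wrong
        t = a + b + c + d
        if t > 0:
            num += a * d / t
            den += b * c / t
    return num / den

rng = np.random.default_rng(0)
n = 4000
group = rng.integers(0, 2, n)
theta = rng.normal(0, 1, n)
p = 1 / (1 + np.exp(-(theta - 0.2 * group)))   # mild DIF against focal group
correct = rng.binomial(1, p)
stratum = (theta.round().astype(int) + 3).clip(0, 6)
alpha = mh_odds_ratio(correct, group, stratum)
print(alpha, -2.35 * np.log(alpha))  # alpha > 1; negative index flags DIF against focal
```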
Campbell, Todd C. – 1995
This paper discusses alternatives to R-technique factor analysis that are applicable to counseling and psychotherapy. The traditional R-technique involves correlating columns of a data matrix. O, P, Q, S, and T techniques are discussed with particular emphasis on Q-technique. In Q-technique, people are factored across items or variables with the…
Descriptors: Counseling, Factor Analysis, Q Methodology, Research Methodology
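A compact illustration of the contrast drawn above, using random data: R-technique correlates the columns (variables) of a persons-by-items matrix, while Q-technique correlates the rows (persons) and factors people. The principal-axis shortcut below is illustrative, not a full Q-methodology workflow:

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(size=(30, 20))     # 30 respondents x 20 statements

# R-technique: correlate the 20 item columns (items are the variables).
r_corr = np.corrcoef(data, rowvar=False)   # 20 x 20

# Q-technique: correlate the 30 persons across items, then factor the
# persons to find clusters sharing a common response pattern.
q_corr = np.corrcoef(data)                 # 30 x 30

# Principal-axis shortcut: leading eigenvectors of the person correlation
# matrix define the person factors.
eigvals, eigvecs = np.linalg.eigh(q_corr)          # ascending order
loadings = eigvecs[:, ::-1][:, :2] * np.sqrt(eigvals[::-1][:2])
print(r_corr.shape, q_corr.shape, loadings.shape)  # (20,20) (30,30) (30,2)
```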
Hendrickson, Amy B. – 2001
The purpose of the study was to compare reliability estimates for a test composed of stimulus-dependent testlets, as derived from item scores and from testlet scores under univariate and multivariate generalizability theory designs, and to determine the influence of the number of testlets and the number of items per…
Descriptors: Comparative Analysis, Reliability, Scores, Standardized Tests
Hombo, Catherine M.; Pashley, Katharine; Jenkins, Frank – 2001
The use of grid-in formats, such as those requiring students to solve problems and fill in bubbles, is common on large-scale standardized assessments, but little is known about the use of this format with student populations more general than the high school students taking college entrance examinations, including those attending public schools…
Descriptors: Responses, Secondary Education, Secondary School Students, Standardized Tests


