Publication Date
| Period | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
Audience
| Audience | Count |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Location | Count |
| --- | --- |
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating
| Rating | Count |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Sinharay, Sandip; Johnson, Matthew S.; Williamson, David M. – Journal of Educational and Behavioral Statistics, 2003
Item families, which are groups of related items, are becoming increasingly popular in complex educational assessments. For example, in automatic item generation (AIG) systems, a test may consist of multiple items generated from each of a number of item models. Item calibration or scoring for such an assessment requires fitting models that can…
Descriptors: Test Items, Markov Processes, Educational Testing, Probability
van der Linden, Wim J.; Sotaridona, Leonardo – Journal of Educational and Behavioral Statistics, 2006
A statistical test for detecting answer copying on multiple-choice items is presented. The test is based on the exact null distribution of the number of random matches between two test takers under the assumption that the response process follows a known response model. The null distribution can easily be generalized to the family of distributions…
Descriptors: Test Items, Multiple Choice Tests, Cheating, Responses
Moses, Tim; Kim, Sooyeon – ETS Research Report Series, 2007
This study evaluated the impact of unequal reliability on test equating methods in the nonequivalent groups with anchor test (NEAT) design. Classical true score-based models were compared in terms of their assumptions about how reliability impacts test scores. These models were related to treatment of population ability differences by different…
Descriptors: Reliability, Equated Scores, Test Items, Statistical Analysis
Johnstone, Christopher; Liu, Kristi; Altman, Jason; Thurlow, Martha – National Center on Educational Outcomes, University of Minnesota, 2007
This document reports on research related to large-scale assessments for students with learning disabilities in the area of reading. As part of a process of making assessments more universally designed, the authors examined the role of "readable and comprehensible" test items (Thompson, Johnstone, & Thurlow, 2002). In this research, they used think…
Descriptors: Test Items, Readability, Learning Disabilities, Protocol Analysis
Wright, Anthony A. – Journal of the Experimental Analysis of Behavior, 2007
Rhesus monkeys were trained and tested in visual and auditory list-memory tasks with sequences of four travel pictures or four natural/environmental sounds followed by single test items. Acquisitions of the visual list-memory task are presented. Visual recency (last item) memory diminished with retention delay, and primacy (first item) memory…
Descriptors: Memory, Test Items, Familiarity, Inhibition
Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Psychological Methods, 2007
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level,…
Descriptors: Psychiatry, Patients, Error of Measurement, Test Length
Reid, Christine A.; Kolakowsky-Hayner, Stephanie A.; Lewis, Allen N.; Armstrong, Amy J. – Rehabilitation Counseling Bulletin, 2007
Item response theory (IRT) methodology is introduced as a tool for improving assessment instruments used with people who have disabilities. Need for this approach in rehabilitation is emphasized; differences between IRT and classical test theory are clarified. Concepts essential to understanding IRT are defined, necessary data assumptions are…
Descriptors: Psychometrics, Methods, Item Response Theory, Aptitude Tests
Ketterlin-Geller, Leanne R.; Yovanoff, Paul; Tindal, Gerald – Exceptional Children, 2007
Accommodations influence the measurement of student proficiency. However, with discrepant research findings, it is difficult to evaluate the effects of these practices on the measurement of performance of students with special needs. In this article, we present results from an experimental study investigating the effects of item characteristics…
Descriptors: Student Characteristics, Special Needs Students, Mathematics Tests, Testing Accommodations
van der Linden, Wim J.; Breithaupt, Krista; Chuah, Siang Chee; Zhang, Yanwei – Journal of Educational Measurement, 2007
A potential undesirable effect of multistage testing is differential speededness, which happens if some of the test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed…
Descriptors: Adaptive Testing, Evaluation Methods, Test Items, Reaction Time
Draaijer, S.; Hartog, R. J. M. – E-Journal of Instructional Science and Technology, 2007
A set of design patterns for digital item types has been developed in response to challenges identified in various projects by teachers in higher education. The goal of the projects in question was to design and develop formative and summative tests, and to develop interactive learning material in the form of quizzes. The subject domains involved…
Descriptors: Higher Education, Instructional Design, Test Format, Biological Sciences
Zabaleta, Francisco – CALICO Journal, 2007
Placing students of a foreign language within a basic language program constitutes an ongoing problem, particularly for large university departments when they have many incoming freshmen and transfer students. This article outlines the author's experience designing and piloting a language placement test for a university level Spanish program. The…
Descriptors: Test Items, Student Placement, Spanish, Transfer Students
Gvozdenko, Eugene; Chambers, Dianne – Australasian Journal of Educational Technology, 2007
This paper investigates how monitoring the time spent on a question in a test of basic mathematics skills can provide insights into learning processes, the quality of test takers' knowledge, and the cognitive demands and performance of test items that otherwise would remain undiscovered if the usual accuracy-only test outcome format…
Descriptors: Reaction Time, Computer Assisted Testing, Mathematics Tests, Test Items
Mariano, Louis T.; Junker, Brian W. – Journal of Educational and Behavioral Statistics, 2007
When constructed response test items are scored by more than one rater, the repeated ratings allow for the consideration of individual rater bias and variability in estimating student proficiency. Several hierarchical models based on item response theory have been introduced to model such effects. In this article, the authors demonstrate how these…
Descriptors: Test Items, Item Response Theory, Rating Scales, Scoring
Elosua, Paula; Lopez-Jauregui, Alicia – International Journal of Testing, 2007
This report shows a classification of differential item functioning (DIF) sources that have an effect on the adaptation of tests. This classification is based on linguistic and cultural criteria. Four general DIF sources are distinguished: cultural relevance, translation problems, morphosyntactic differences, and semantic differences. The…
Descriptors: Semantics, Cultural Relevance, Classification, Test Bias
Liao, Chi-Wen; Livingston, Samuel A. – ETS Research Report Series, 2008
Randomly equivalent forms (REF) of tests in listening and reading for nonnative speakers of English were created by stratified random assignment of items to forms, stratifying on item content and predicted difficulty. The study included 50 replications of the procedure for each test. Each replication generated 2 REFs. The equivalence of those 2…
Descriptors: Equated Scores, Item Analysis, Test Items, Difficulty Level