Publication Date
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Wagner, Teresa A.; Harvey, Robert J. – Psychological Assessment, 2006
The authors describe the initial development of the Wagner Assessment Test (WAT), an instrument designed to assess critical thinking, using the 5-faceted view popularized by the Watson-Glaser Critical Thinking Appraisal (WGCTA; G. B. Watson & E. M. Glaser, 1980). The WAT was designed to reduce the degree of successful guessing relative to the…
Descriptors: Critical Thinking, Item Response Theory, Test Items, Scores
Weems, Gail H.; Onwuegbuzie, Anthony J.; Collins, Kathleen M. T. – Evaluation and Research in Education, 2006
Should instruments, such as Likert-type scales, contain both positively worded and negatively worded items within the same scale (i.e. mixed format)? Recent evidence suggests that the use of scales with a mixed format can adversely affect the psychometric properties of scales. In particular, the mean item response to the positively worded items…
Descriptors: Likert Scales, Reading Comprehension, Test Items, Psychometrics
Elliott, Robert; Fox, Christine M.; Beltyukova, Svetlana A.; Stone, Gregory E.; Gunderson, Jennifer; Zhang, Xi – Psychological Assessment, 2006
Rasch analysis was used to illustrate the usefulness of item-level analyses for evaluating a common therapy outcome measure of general clinical distress, the Symptom Checklist-90-Revised (SCL-90-R; Derogatis, 1994). Using complementary therapy research samples, the instrument's 5-point rating scale was found to exceed clients' ability to make…
Descriptors: Therapy, Rating Scales, Item Response Theory, Test Items
Swinkels, Sophie H. N.; Dietz, Claudine; van Daalen, Emma; Kerkhof, Ine H. G. M.; van Engeland, Herman; Buitelaar, Jan K. – Journal of Autism and Developmental Disorders, 2006
This article describes the development of a screening instrument for young children. Screening items were tested first in a non-selected population of children aged 8-20 months (n = 478). Then, parents of children with clinically diagnosed ASD (n = 153, average age 87 months) or ADHD (n = 76, average age 112 months) were asked to score the items…
Descriptors: Pervasive Developmental Disorders, Autism, Questionnaires, Testing
Niemi, David; Vallone, Julia; Wang, Jia; Griffin, Noelle – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2007
Many districts and schools across the U.S. have begun to develop and administer assessments to complement state testing systems and provide additional information to monitor curriculum, instruction and schools. In advance of this trend, the Jackson Public Schools (JPS) district has had a district benchmark testing system in place for many years.…
Descriptors: Public Schools, Testing Programs, Educational Testing, Item Analysis
Wilhelm, Pascal; Pieters, Jules M. – Assessment & Evaluation in Higher Education, 2007
In a course on biological psychology and neuropsychology, study questions were provided that also appeared as test questions in the course exam. This method was introduced to support students in active processing and reproduction of the study texts, and study planning. Data were gathered to test the hypothesis that study question use would be…
Descriptors: Neuropsychology, Academic Achievement, Biology, Psychology
Geranpayeh, Ardeshir; Kunnan, Antony John – Language Assessment Quarterly, 2007
When standardized English-language tests are administered to test takers worldwide, the test-taking population could be varied on a number of personal and educational characteristics such as age, gender, first language, and academic discipline. As test tasks and test items may not always be prepared keeping this diversity of characteristics in…
Descriptors: Test Bias, Test Items, Language Tests, Intellectual Disciplines
Kong, Xiaojing J.; Wise, Steven L.; Bhola, Dennison S. – Educational and Psychological Measurement, 2007
This study compared four methods for setting item response time thresholds to differentiate rapid-guessing behavior from solution behavior. Thresholds were either (a) common for all test items, (b) based on item surface features such as the amount of reading required, (c) based on visually inspecting response time frequency distributions, or (d)…
Descriptors: Test Items, Reaction Time, Timed Tests, Item Response Theory
Dempster, Edith R.; Reddy, Vijay – Science Education, 2007
This study investigated the relationship between readability of 73 text-only multiple-choice questions from Trends in International Mathematics and Science Study (TIMSS) 2003 and performance of two groups of South African learners: those with limited English-language proficiency (learners attending African schools) and those with better…
Descriptors: Instructional Effectiveness, Foreign Countries, Disadvantaged Youth, Sentences
Marks, Anthony M.; Cronje, Johannes C. – Educational Technology & Society, 2008
Computer-based assessments are becoming more commonplace, perhaps as a necessity for faculty to cope with large class sizes. These tests often occur in large computer testing venues in which test security may be compromised. In an attempt to limit the likelihood of cheating in such venues, randomised presentation of items is automatically…
Descriptors: Educational Assessment, Educational Testing, Research Needs, Test Items
Ferrara, Steven; And Others – 1995
A study was conducted to begin a process of validating hypothesized causes of local item dependence (LID) in large-scale performance assessments. Data for the study are item level scores from 26 science tasks from the 1993 edition of the Maryland School Performance Assessment Program. Causes of high LID were hypothesized from studies by Ferrara et…
Descriptors: Educational Assessment, Hands on Science, Performance Based Assessment, Prediction
Bejar, Isaac I. – 1991
Response generative modeling (RGM) is an approach to psychological measurement that involves a "grammar" capable of assigning a psychometric description to every item in a universe of items and is capable of generating all the items in that universe. The article discusses the rationale behind RGM and its roots, explores how it relates to…
Descriptors: Educational Assessment, Item Response Theory, Measurement Techniques, Models
Freedle, Roy; Kostin, Irene – 1992
This study examines the predictability of Graduate Record Examinations (GRE) reading item difficulty (equated delta) for the three major reading item types: main idea, inference, and explicit statement items. Each item type is analyzed separately, using 110 GRE reading passages and their associated 244 reading items; selective analyses of 285…
Descriptors: College Entrance Examinations, Correlation, Difficulty Level, Higher Education
Freedle, Roy; Kostin, Irene – 1993
Prediction of the difficulty (equated delta) of a large sample (n=213) of reading comprehension items from the Test of English as a Foreign Language (TOEFL) was studied using main idea, inference, and supporting statement items. A related purpose was to examine whether text and text-related variables play a significant role in predicting item…
Descriptors: Construct Validity, Difficulty Level, Multiple Choice Tests, Prediction
Carlson, Sybil B.; Ward, William C. – 1988
Issues concerning the cost and feasibility of using Formulating Hypotheses (FH) test item types for the Graduate Record Examinations have slowed research into their use. This project focused on two major issues that need to be addressed in considering FH items for operational use: the costs of scoring and the assignment of scores along a range of…
Descriptors: Adaptive Testing, Computer Assisted Testing, Costs, Pilot Projects

