Peer reviewed: Berrenberg, Joy L. – Teaching of Psychology, 1990
Reports that a goal and item analysis of eight history and systems of psychology textbooks and their accompanying test item files showed that the majority of the essay test items are too narrow in scope to measure the commonly stated course goals. Presents some integrative and goal-relevant essay questions to rectify this shortcoming. Includes a…
Descriptors: Content Analysis, Essay Tests, Evaluation Research, Higher Education
Peng, Lim Ho – IRAL, 1990
A pilot experiment examined ambiguity in English-as-a-Second-Language (ESL) learning by graduate and undergraduate students. The findings revealed that most ESL speakers have the greatest difficulty in understanding sentences with derived-structure ambiguity. Underlying-structure ambiguity was the next most difficult to understand, followed by lexical…
Descriptors: Ambiguity, College Students, Difficulty Level, English (Second Language)
Peer reviewed: Harasym, P. H.; And Others – Evaluation and the Health Professions, 1993
Results of a study involving approximately 220 student nurses indicate that the use of negation (e.g., not, except) should be avoided in stems of multiple-choice test items and that the single-response negatively worded item should often be converted to a multiple-response, positively worded item. (SLD)
Descriptors: Ability, Estimation (Mathematics), Multiple Choice Tests, Negative Forms (Language)
Peer reviewed: Pomplun, Mark; Sundbye, Nita – Applied Measurement in Education, 1999
Gender differences in answers to constructed-response reading items from a state assessment program were studied with four raters rating approximately 500 papers at two grade levels. Results indicate that number of words written and number of unrelated responses show significant gender differences and are related to holistic scores. (SLD)
Descriptors: Constructed Response, Holistic Evaluation, Reading Tests, Secondary Education
Peer reviewed: Hamilton, Laura S. – Educational Evaluation and Policy Analysis, 1998
Gender differences on the National Education Longitudinal Study of 1988 science tests were explored through statistical analyses and interviews with 25 high school students. Results show the importance of studying the validity of the outcome measure and suggest that conclusions about group differences and correlates of achievement depend on the…
Descriptors: Achievement Tests, Correlation, High School Students, High Schools
Peer reviewed: Byrne, Barbara M.; Campbell, T. Leanne – Journal of Cross-Cultural Psychology, 1999
Demonstrates the extent to which item-score data can vary across cultures despite measurements from an instrument in which factorial structure is specified in each group. Scores from the Beck Depression Inventory for 658 Canadian, 1,096 Swedish, and 691 Bulgarian high school students illustrate the differences. Discusses implications for…
Descriptors: Comparative Analysis, Cross Cultural Studies, Depression (Psychology), Factor Structure
Peer reviewed: Spelberg, Henk C. Lutje; de Boer, Paulien; van den Bos, Kees P. – Language Testing, 2000
Compares two language tests with different item types. The tests are the Dutch Reynell test and the BELL test. Both tests were administered to 64 Dutch kindergarten children with an average age of 70.3 months. Regression analyses indicate that item type does not contribute significantly to prediction of item difficulty, but the linguistic…
Descriptors: Comparative Analysis, Dutch, Foreign Countries, Item Analysis
Peer reviewed: Ward, Annie W.; Murray-Ward, Mildred – Educational Measurement: Issues and Practice, 1994
This instructional module presented by the National Council on Measurement in Education (NCME) provides guidelines for teachers and other test developers to help them construct test item banks. Setting up an item bank and using it are described, with a consideration of software that can be used. (SLD)
Descriptors: Annotated Bibliographies, Computer Software, Educational Assessment, Elementary Secondary Education
Peer reviewed: Rocklin, Thomas R. – Applied Measurement in Education, 1994
Reviews the effects of self-adapted testing (SAT), in which examinees choose the difficulty of items themselves, on ability estimates, precision, and efficiency; the mechanisms behind those effects; and examinee reactions to SAT. SAT is less efficient than computer-adapted testing but more efficient than fixed-item testing. (SLD)
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Difficulty Level
Peer reviewed: Meijer, Rob R.; And Others – Applied Measurement in Education, 1996
Several existing group-based statistics to detect improbable item score patterns are discussed, along with the cut scores proposed in the literature to classify an item score pattern as aberrant. A simulation study and an empirical study are used to compare the statistics and their use and to investigate the practical use of cut scores. (SLD)
Descriptors: Achievement Tests, Classification, Cutting Scores, Identification
Peer reviewed: Brown, James Dean – Language Learning & Technology, 1997
Explores recent developments in the use of computers in language testing in four areas: (1) item banking; (2) computer-assisted language testing; (3) computerized-adaptive language testing; and (4) research on the effectiveness of computers in language testing. Examines educational measurement literature in an attempt to forecast the directions…
Descriptors: Computer Assisted Instruction, Computer Assisted Testing, Language Research, Language Tests
Peer reviewed: Walsh, Margaret; Hickey, Crystal; Duffy, Jim – Sex Roles: A Journal of Research, 1999
Studied the effects of item content and stereotype threat on gender differences in mathematical problem solving through two experiments, one with 63 seventh and eighth graders and the other with 174 college students. Results suggest that gender-stereotype threat may be a substantial factor in gender differences in mathematical problem solving. (SLD)
Descriptors: College Students, Higher Education, Junior High School Students, Junior High Schools
Burton, Richard F. – Assessment and Evaluation in Higher Education, 2005
Examiners seeking guidance on multiple-choice and true/false tests are likely to encounter various faulty or questionable ideas. Twelve of these are discussed in detail, having to do mainly with the effects on test reliability of test length, guessing and scoring method (i.e. number-right scoring or negative marking). Some misunderstandings could…
Descriptors: Guessing (Tests), Multiple Choice Tests, Objective Tests, Test Reliability
Olsen, Rolf Vegar – Scandinavian Journal of Educational Research, 2004
In the Programme for International Student Assessment (PISA) the items are organised in small clusters relating to the same stimulus material (called 'units'). Homogeneity analysis (HA) is used to develop a detailed description of the relationship between all the items in one unit, using the categorical information available in the PISA data. The…
Descriptors: Thinking Skills, Knowledge Level, Student Evaluation, Foreign Countries
Kjaernsli, Marit; Lie, Svein – Scandinavian Journal of Educational Research, 2004
In this paper we have set out to search for similarities and differences between the Nordic countries concerning patterns of competencies defined as scientific literacy in the Programme for International Student Assessment (PISA) study. The first part focuses on gender differences concerning the two types of competencies, understanding of…
Descriptors: Foreign Countries, Scientific Literacy, Thinking Skills, Gender Differences

