Publication Date
| In 2026 | 0 |
| Since 2025 | 81 |
| Since 2022 (last 5 years) | 449 |
| Since 2017 (last 10 years) | 1237 |
| Since 2007 (last 20 years) | 2511 |
Audience
| Practitioners | 122 |
| Teachers | 105 |
| Researchers | 64 |
| Students | 46 |
| Administrators | 14 |
| Policymakers | 7 |
| Counselors | 3 |
| Parents | 3 |
Location
| Canada | 134 |
| Turkey | 130 |
| Australia | 123 |
| Iran | 66 |
| Indonesia | 61 |
| United Kingdom | 51 |
| Germany | 50 |
| Taiwan | 46 |
| United States | 43 |
| China | 39 |
| California | 34 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 5 |
| Does not meet standards | 6 |
DeMauro, Gerald E. – 1995
Studies of the Angoff method of standard setting suggest that judges agree in their estimates of the relative difficulties of test questions for minimally competent examinees and that each judge's estimates correlate well with the observed item difficulties for examinees whose total test scores are near the judge's personal standard (G. E.…
Descriptors: Ability, Competence, Construct Validity, Difficulty Level
Fisher, Gwen Laura – 1996
There has been concern over the validity of the Algebra Diagnostic Test (ADT) used to determine the actual level of student preparation for the first quarter of calculus as taught at the University of California, Santa Barbara. It has been hypothesized that performance-based questions, along with the more traditional multiple choice questions,…
Descriptors: Algebra, Calculus, Chemistry, College Freshmen
DeMars, Christine – 1998
Using data from a pilot test of science and math from students in 30 high schools, item difficulties were estimated with a one-parameter model (partial-credit model for the multi-point items). Some items were multiple-choice items, and others were constructed-response items (open-ended). Four sets of estimates were obtained: estimates for males…
Descriptors: Constructed Response, Difficulty Level, Estimation (Mathematics), Goodness of Fit
Puncochar, Judith; And Others – 1994
Individual and group assessments of quiz accuracy and students' discrimination of what they know and what they do not know regarding course material were examined using confidence ratings from 22 graduate students, 47 undergraduates, and their 23 heterogeneous learning groups over 6 quizzes. Students first took each multiple choice quiz as…
Descriptors: Confidence Testing, Decision Making, Graduate Students, Group Dynamics
Wang, Chen-Shih; Ackerman, Terry – 1994
Passages used in the Illinois Goal Assessment Program (IGAP) reading test are intact pieces of literature, stories, and essays that match classroom reading assignments and typical student reading experiences. There are 15 testlets, each containing 5 items, associated with each passage. Each testlet requires students to demonstrate various levels…
Descriptors: Analysis of Covariance, Elementary Education, Elementary School Students, Grade 3
National Commission on Testing and Public Policy. – 1990
Findings of the National Commission on Testing and Public Policy concerning problems in testing are reported. Recommendations are proposed for restructuring educational and employment testing to help people develop their talents and become more productive, and to help institutions become more productive, accountable, and just. Over a 3-year…
Descriptors: Accountability, Educational Assessment, Educational Change, Educational Improvement
Melancon, Janet G.; Thompson, Bruce – 1989
Classical measurement theory was used to investigate the measurement (psychometric) characteristics of both parts of the Finding Embedded Figures Test (FEFT) administered in either a "no guessing" supply format or a multiple-choice selection format to undergraduate college students or to middle school students. Three issues were…
Descriptors: Comparative Testing, Construct Validity, Higher Education, Junior High School Students
Djiwandono, M. Soenardi – 1990
In Indonesia, Bahasa Indonesia (BI) is the designated national and official language. However, deficiencies in Indonesian proficiency are found in a wide range of individuals. A test battery to measure proficiency level was developed, consisting of a writing test, a grammar test, and a cloze test. The writing test was an essay, in which five…
Descriptors: Cloze Procedure, College Faculty, Comparative Analysis, Foreign Countries
Linacre, John M. – 1987
This paper describes a computer program in Microsoft BASIC that selects and administers test items from a small item bank. The difficulty level of the selected item depends on the test taker's previous response. This adaptive system is based on the Rasch model. The Rasch model uses a unit of measurement based on the logarithm of the…
Descriptors: Adaptive Testing, Computer Assisted Testing, Difficulty Level, Individual Testing
Treagust, David F.; Haslam, Filocha – 1986
Based on the premise that multiple choice tests can be used as diagnostic tools for teachers in identifying and remedying student misconceptions, this study focused on the development of an instrument for diagnosing secondary students' understanding of photosynthesis and respiration. Information is presented on: (1) procedures of development of…
Descriptors: Biology, Cognitive Processes, Concept Teaching, Diagnostic Tests
Tanner, David E. – 1986
A multiple choice achievement test was constructed in which both cognitive level and degree of abstractness were controlled. Subjects were 75 students from a major university in the Southwest. A group of 13 judges, also university students, classified the concepts for degree of abstractness. Results indicated that both cognitive level and degree…
Descriptors: Abstract Reasoning, Achievement Tests, Analysis of Variance, Cognitive Processes
Lenel, Julia C.; Gilmer, Jerry S. – 1986
In some testing programs an early item analysis is performed before final scoring in order to validate the intended keys. As a result, some items which are flawed and do not discriminate well may be keyed so as to give credit to examinees no matter which answer was chosen. This is referred to as allkeying. This research examined how varying the…
Descriptors: Equated Scores, Item Analysis, Latent Trait Theory, Licensing Examinations (Professions)
Alberta Dept. of Education, Edmonton. – 1987
Intended for students taking Grade 12 Diploma Examinations in English 33 in Alberta, Canada, this reading test contains 70 multiple choice test items related to the 9 selections in the reading booklet. The questions examine students' skills in (1) understanding meanings, (2) understanding and interpreting the relationships between form and…
Descriptors: Achievement Tests, Educational Assessment, English Instruction, Foreign Countries
Samejima, Fumiko – 1986
Item analysis data fitting the normal ogive model were simulated in order to investigate the problems encountered when applying the three-parameter logistic model. Binary item tests containing 10 and 35 items were created, and Monte Carlo methods simulated the responses of 2,000 and 500 examinees. Item parameters were obtained using Logist 5.…
Descriptors: Computer Simulation, Difficulty Level, Guessing (Tests), Item Analysis
Alberta Dept. of Education, Edmonton. – 1986
This proficiency exam for ninth-year French comprises a composition test and a reading comprehension test. The composition test consists of general instructions, the description of a subject about which the student is to write a narrative, and space for organizing, drafting, and writing a final copy of the composition. The reading…
Descriptors: Advanced Courses, French, Integrated Activities, Language Proficiency


