Publication Date

| Period | Records |
|---|---|
| In 2026 | 0 |
| Since 2025 | 197 |
| Since 2022 (last 5 years) | 1067 |
| Since 2017 (last 10 years) | 2577 |
| Since 2007 (last 20 years) | 4938 |
Audience

| Audience | Records |
|---|---|
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location

| Location | Records |
|---|---|
| Turkey | 225 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 65 |
What Works Clearinghouse Rating

| Rating | Records |
|---|---|
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Peer reviewed: Stone, Clement A. – Educational Measurement: Issues and Practice, 1989
MicroCAT version 3.0--an integrated test development, administration, and analysis system--is reviewed in this first article of a series on testing software. A framework for comparing testing software is presented. The strength of this package lies in the development, banking, and administration of items composed of text and graphics. (SLD)
Descriptors: Computer Assisted Testing, Computer Software, Computer Software Reviews, Data Analysis
Peer reviewed: Balch, William R. – Teaching of Psychology, 1989
Studies the effect of item order on test scores and completion time. Students scored slightly higher when test items were ordered sequentially (following the order of the text and lectures) than when items were grouped by text chapter but ordered randomly within groups, or when items were ordered entirely at random. No differences in completion time were found. (Author/LS)
Descriptors: Educational Research, Higher Education, Performance, Psychology
Peer reviewed: Henning, Grant – Language Testing, 1988
Violations of item unidimensionality on language tests produced distorted estimates of person ability, and violations of person unidimensionality produced distorted estimates of item difficulty. The Bejar Method was sensitive to such distortions. (Author)
Descriptors: Construct Validity, Content Validity, Difficulty Level, Item Analysis
Peer reviewed: Samejima, Fumiko – Psychometrika, 1994
Using the constant information model, constant amounts of test information, and a finite interval of ability, simulated data were produced for 8 ability levels and 20 test lengths. Analyses suggest that test information functions may need modification when they are used to measure accuracy in ability estimation. (SLD)
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Computer Simulation
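For readers skimming this and the next several entries, the test information function they refer to is a standard IRT quantity. The general definition below is a refresher in standard notation, not Samejima's constant information model itself:

$$
I(\theta) = \sum_{i=1}^{n} I_i(\theta), \qquad
I_i(\theta) = \frac{\left[P_i'(\theta)\right]^2}{P_i(\theta)\,\left[1 - P_i(\theta)\right]}, \qquad
\mathrm{SE}(\hat{\theta}) \approx \frac{1}{\sqrt{I(\theta)}},
$$

where $P_i(\theta)$ is the probability of a correct response to item $i$ at ability $\theta$; under the two-parameter logistic model this reduces to $I_i(\theta) = a_i^2\,P_i(\theta)\,[1 - P_i(\theta)]$.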
Peer reviewed: Hambleton, Ronald K.; Jones, Russell W. – Applied Measurement in Education, 1994
The impact of capitalizing on chance in item selection on the accuracy of test information functions was studied through simulation, focusing on examinee sample size in item calibration and the ratio of item bank size to test length. (SLD)
Descriptors: Computer Simulation, Estimation (Mathematics), Item Banks, Item Response Theory
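The phenomenon studied here is easy to reproduce. The Python sketch below is an illustrative simulation, not the authors' design; the bank size, test length, and calibration-error values are arbitrary. Selecting items by their estimated discriminations makes the apparent information of the selected test overstate its true information:

```python
import numpy as np

rng = np.random.default_rng(42)

BANK_SIZE = 400      # items in the bank (arbitrary)
TEST_LENGTH = 40     # items selected for the test (arbitrary)
CALIB_ERROR = 0.25   # SD of estimation error in a-parameters (arbitrary)

# True discrimination parameters for the bank.
a_true = rng.lognormal(mean=0.0, sigma=0.3, size=BANK_SIZE)

# Calibration on a finite examinee sample yields noisy estimates.
a_est = a_true + rng.normal(0.0, CALIB_ERROR, size=BANK_SIZE)

# Select the items with the highest *estimated* discriminations.
selected = np.argsort(a_est)[-TEST_LENGTH:]

# At theta = b, a 2PL item's information is a^2 * 0.25
# (logistic metric, no D = 1.7 scaling).
info_apparent = np.sum(a_est[selected] ** 2) * 0.25
info_actual = np.sum(a_true[selected] ** 2) * 0.25

print(f"apparent information: {info_apparent:.2f}")
print(f"actual information:   {info_actual:.2f}")
# The apparent value overstates the actual value because selecting on
# noisy estimates preferentially picks items whose errors are positive.
```

Smaller calibration samples (larger CALIB_ERROR) and larger bank-to-test ratios widen the gap, which is the dependence the article investigates.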
Peer reviewed: Kim, Seock-Ho; And Others – Journal of Educational Measurement, 1995
A method is presented for detection of differential item functioning in multiple groups. This method is closely related to F. M. Lord's chi square for comparing vectors of item parameters estimated in two groups. An example is provided using data from 600 college students taking a mathematics test with and without calculators. (SLD)
Descriptors: Chi Square, College Students, Comparative Analysis, Estimation (Mathematics)
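As a refresher on the statistic this method extends, Lord's chi-square for two groups is a Wald-type comparison of the item parameter vectors estimated separately in a reference and a focal group (standard notation, not reproduced from this article):

$$
\chi^2_i = \left(\hat{\boldsymbol{\xi}}_{i1} - \hat{\boldsymbol{\xi}}_{i2}\right)'
\left(\boldsymbol{\Sigma}_{i1} + \boldsymbol{\Sigma}_{i2}\right)^{-1}
\left(\hat{\boldsymbol{\xi}}_{i1} - \hat{\boldsymbol{\xi}}_{i2}\right),
$$

where $\hat{\boldsymbol{\xi}}_{ig}$ is the vector of estimated parameters for item $i$ in group $g$ and $\boldsymbol{\Sigma}_{ig}$ its estimated covariance matrix; a large value flags the item as functioning differently across groups.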
Peer reviewed: Bruno, James E.; Dirkzwager, A. – Educational and Psychological Measurement, 1995
Determining the optimal number of choices on a multiple-choice test is explored analytically from an information theory perspective. The analysis revealed that, in general, three choices seem optimal. This finding is in agreement with previous statistical and psychometric research. (SLD)
Descriptors: Distractors (Tests), Information Theory, Multiple Choice Tests, Psychometrics
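A simplified illustration of the information-theoretic flavor of this result (not necessarily the authors' derivation): fix the total number of answer options an examinee must read at $N$. With $k$ options per item, the test has $n = N/k$ items and distinguishes $k^{N/k}$ response patterns, carrying

$$
\log_2\!\left(k^{N/k}\right) = \frac{N \log_2 k}{k}
$$

bits. As a function of $k$, $(\log_2 k)/k$ peaks at $k = e \approx 2.72$; among integers, $\log_2 3 / 3 \approx 0.528$ exceeds both $\log_2 2 / 2 = 0.5$ and $\log_2 4 / 4 = 0.5$, so three choices come out optimal.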
Peer reviewed: Glaser, Robert – Educational Measurement: Issues and Practice, 1994
Beginning discussions and exploratory work on criterion-referenced measurement are reviewed in this commentary on the author's 1963 address to the American Educational Research Association on issues of measurement and the development of educational technology. Many problems foreseen at that time remain current. (SLD)
Descriptors: Criterion Referenced Tests, Educational History, Educational Research, Educational Technology
Peer reviewed: May, Kim; Nicewander, W. Alan – Journal of Educational Measurement, 1994
Reliabilities and information functions for percentile ranks and number-right scores were compared using item response theory, modeling standardized achievement tests. Results demonstrate that situations exist in which the percentage of items known by examinees can be accurately estimated, but the percentage of persons falling below a given score…
Descriptors: Achievement Tests, Difficulty Level, Equations (Mathematics), Estimation (Mathematics)
Peer reviewed: Homan, Susan; And Others – Journal of Educational Measurement, 1994
A study was conducted with 782 elementary school students to determine whether the Homan-Hewitt Readability Formula could identify the readability of a single-sentence test item. Results indicate that a relationship exists between students' reading grade levels and responses to test items written at higher readability levels. (SLD)
Descriptors: Difficulty Level, Elementary Education, Elementary School Students, Identification
Peer reviewed: Cizek, Gregory J.; And Others – Evaluation and the Health Professions, 1995
Results from 627 examinees taking a health sciences certification examination suggest that test items written purposefully to assess higher order cognitive skills do not provide evidence of assessing different levels of cognitive processing. Results do not support continuing use of a hierarchical cognitive classification dimension for the test…
Descriptors: Classification, Cognitive Processes, Educational Assessment, Evaluation Methods
Peer reviewed: Mellenbergh, Gideon J. – Multivariate Behavioral Research, 1994
A general linear latent trait model for continuous item responses is described. The special unidimensional case for continuous item response is the model of K. G. Joreskog (1971) of congeneric item response. The correspondence between models for continuous and dichotomous item responses is shown to be closer than usually supposed. (SLD)
Descriptors: Attitude Measures, Item Bias, Item Response Theory, Personality Measures
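For reference, one common parameterization of Jöreskog's congeneric model, the unidimensional special case named here, is

$$
X_j = \mu_j + \lambda_j\,\eta + \varepsilon_j,
$$

where $X_j$ is the continuous response to item $j$, $\eta$ the single latent trait, $\lambda_j$ the item's loading, and $\varepsilon_j$ a unique error term. The model is "congeneric" because all items measure the one trait but may differ in both intercept $\mu_j$ and loading $\lambda_j$.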
Peer reviewed: Clark, Lee Anna; Watson, David – Psychological Assessment, 1995
Basic principles that should be followed by anyone developing a scale are reviewed, with emphasis on verbally mediated measures. The essential first step is a clear conceptualization of the target construct. Item development and ensuring unidimensionality follow. Factor analysis can be crucial in establishing unidimensionality and discriminant…
Descriptors: Concept Formation, Construct Validity, Factor Analysis, Item Banks
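One common rough check of the unidimensionality Clark and Watson emphasize is to inspect the eigenvalues of the inter-item correlation matrix: a dominant first eigenvalue is consistent with a single factor. The NumPy sketch below is a heuristic illustration on simulated data, not a procedure taken from the article; the sample size, scale length, and loading range are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

N_RESPONDENTS = 500  # arbitrary sample size
N_ITEMS = 10         # arbitrary scale length

# Simulate roughly unidimensional data: each item = latent trait + noise.
trait = rng.normal(size=(N_RESPONDENTS, 1))
loadings = rng.uniform(0.5, 0.9, size=(1, N_ITEMS))
responses = trait @ loadings + rng.normal(scale=0.7, size=(N_RESPONDENTS, N_ITEMS))

# Eigenvalues of the inter-item correlation matrix, largest first.
corr = np.corrcoef(responses, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]

print("eigenvalues:", np.round(eigvals, 2))
print("first/second ratio:", round(eigvals[0] / eigvals[1], 2))
# A first eigenvalue several times the size of the second suggests one
# dominant factor; a formal factor analysis should still follow.
```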
Peer reviewed: Stumpf, Steven H. – Evaluation and the Health Professions, 1994
A five-year curriculum evaluation project is described that treated students' course ratings, examination reliability coefficients, and item-discrimination data as a battery of data points for determining annual revision efforts. Histograms were constructed to make valid demonstrations of successful efforts immediately comprehensible to faculty.…
Descriptors: College Faculty, Comprehension, Curriculum Evaluation, Longitudinal Studies
Peer reviewed: Downing, Steven M.; And Others – Applied Measurement in Education, 1995
The criterion-related validity evidence and other psychometric characteristics of multiple-choice and multiple true-false (MTF) items in medical specialty certification examinations were compared using results from 21,346 candidates. Advantages of MTF items and implications for test construction are discussed. (SLD)
Descriptors: Cognitive Ability, Licensing Examinations (Professions), Medical Education, Objective Tests


