Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 4 |
Since 2006 (last 20 years) | 6 |
Descriptor
Test Items | 28 |
Test Validity | 28 |
Test Construction | 14 |
Item Analysis | 10 |
Multiple Choice Tests | 7 |
Higher Education | 6 |
Test Reliability | 6 |
Scores | 5 |
Comparative Analysis | 4 |
Computer Simulation | 4 |
Item Response Theory | 4 |
Source
Journal of Educational… | 28 |
Author
Clauser, Brian E. | 2 |
Wainer, Howard | 2 |
Ackerman, Terry A. | 1 |
Baldwin, Peter | 1 |
Bennett, Randy Elliot | 1 |
Benson, Jeri | 1 |
Beuchert, A. Kent | 1 |
Brandenburg, Dale C. | 1 |
Cohen, Allan S. | 1 |
Ebel, Robert L. | 1 |
Enright, Mary K. | 1 |
Publication Type
Journal Articles | 26 |
Reports - Research | 21 |
Reports - Evaluative | 5 |
Speeches/Meeting Papers | 2 |
Education Level
Higher Education | 1 |
Postsecondary Education | 1 |
Secondary Education | 1 |
Audience
Researchers | 1 |
Location
Canada | 1 |
Assessments and Surveys
Graduate Record Examinations | 1 |
Iowa Tests of Basic Skills | 1 |
Program for International… | 1 |
Stanford Achievement Tests | 1 |
Yaneva, Victoria; Clauser, Brian E.; Morales, Amy; Paniagua, Miguel – Journal of Educational Measurement, 2021
Eye-tracking technology can create a record of the location and duration of visual fixations as a test-taker reads test questions. Although the cognitive process the test-taker is using cannot be directly observed, eye-tracking data can support inferences about these unobserved cognitive processes. This type of information has the potential to…
Descriptors: Eye Movements, Test Validity, Multiple Choice Tests, Cognitive Processes
Shear, Benjamin R. – Journal of Educational Measurement, 2023
Large-scale standardized tests are regularly used to measure student achievement overall and for student subgroups. These uses assume tests provide comparable measures of outcomes across student subgroups, but prior research suggests score comparisons across gender groups may be complicated by the type of test items used. This paper presents…
Descriptors: Gender Bias, Item Analysis, Test Items, Achievement Tests
Peabody, Michael R.; Wind, Stefanie A. – Journal of Educational Measurement, 2019
Setting performance standards is a judgmental process involving human opinions and values as well as technical and empirical considerations. Although all cut score decisions are by nature somewhat arbitrary, they should not be capricious. Judges selected for standard-setting panels should have the proper qualifications to make the judgments asked…
Descriptors: Standard Setting, Decision Making, Performance Based Assessment, Evaluators
Clauser, Brian E.; Baldwin, Peter; Margolis, Melissa J.; Mee, Janet; Winward, Marcia – Journal of Educational Measurement, 2017
Validating performance standards is challenging and complex. Because of the difficulties associated with collecting evidence related to external criteria, validity arguments rely heavily on evidence related to internal criteria--especially evidence that expert judgments are internally consistent. Given its importance, it is somewhat surprising…
Descriptors: Evaluation Methods, Standard Setting, Cutting Scores, Expertise
Kahraman, Nilufer; Thompson, Tony – Journal of Educational Measurement, 2011
A practical concern for many existing tests is that subscore test lengths are too short to provide reliable and meaningful measurement. A possible method of improving the subscale reliability and validity would be to make use of collateral information provided by items from other subscales of the same test. To this end, the purpose of this article…
Descriptors: Test Length, Test Items, Alignment (Education), Models
Pommerich, Mary – Journal of Educational Measurement, 2006
Domain scores have been proposed as a user-friendly way of providing instructional feedback about examinees' skills. Domain performance typically cannot be measured directly; instead, scores must be estimated using available information. Simulation studies suggest that IRT-based methods yield accurate group domain score estimates. Because…
Descriptors: Test Validity, Scores, Simulation, Evaluation Methods

Washington, William N.; Godfrey, R. Richard – Journal of Educational Measurement, 1974
Item statistics for illustrated and written items drawn from the same content areas were compared using F ratios. The results indicated that illustrated items performed slightly better than matched written items, and that the best-performing category of illustrated items was tables. (Author/BB)
Descriptors: Achievement Tests, Illustrations, Test Construction, Test Items

Hartke, Alan R. – Journal of Educational Measurement, 1978
Latent partition analysis is shown to be useful in determining the conceptual homogeneity of an item population. Such item populations are useful for mastery testing. Applications of latent partition analysis in assessing content validity are suggested. (Author/JKS)
Descriptors: Higher Education, Item Analysis, Item Sampling, Mastery Tests

Beuchert, A. Kent; Mendoza, Jorge L. – Journal of Educational Measurement, 1979
Ten item discrimination indices, across a variety of item analysis situations, were compared, based on the validities of tests constructed by using each of the indices to select 40 items from a 100-item pool. Item score data were generated by a computer program and included a simulation of guessing. (Author/CTM)
Descriptors: Item Analysis, Simulation, Statistical Analysis, Test Construction

Garg, Rashmi; And Others – Journal of Educational Measurement, 1986
For the purpose of obtaining data to use in test development, multiple matrix sampling plans were compared to examinee sampling plans. Data were simulated for examinees, sampled from a population with a normal distribution of ability, responding to items selected from an item universe. (Author/LMO)
Descriptors: Difficulty Level, Monte Carlo Methods, Sampling, Statistical Studies

Ebel, Robert L. – Journal of Educational Measurement, 1982
Reasonable and practical solutions to two major problems confronting the developer of any test of educational achievement (what to measure and how to measure it) are proposed, defended, and defined. (Author/PN)
Descriptors: Measurement Techniques, Objective Tests, Test Construction, Test Items

Mehrens, William A.; Phillips, S. E. – Journal of Educational Measurement, 1987
A taxonomic matrix classification was used to assess the curricular validity of the Stanford Achievement Tests for the mathematics textbooks used in a school district's fifth and sixth grades. Rasch item difficulty was also examined. Results indicated only small differences between textbooks. (GDC)
Descriptors: Difficulty Level, Elementary School Mathematics, Intermediate Grades, Item Analysis

Irvin, Larry K.; And Others – Journal of Educational Measurement, 1980
The relative efficacy of content-appropriate, orally administered true/false and multiple-choice testing was examined with retarded adolescents. Both approaches demonstrated utility and psychometric adequacy. Implications regarding test development for retarded students are briefly discussed. (Author)
Descriptors: High Schools, Mild Mental Retardation, Multiple Choice Tests, Objective Tests

Frisbie, David A.; Brandenburg, Dale C. – Journal of Educational Measurement, 1979
Content-parallel questionnaire items in which response schemes varied in one of two ways--scale alternatives were all defined or only endpoints were defined, and alternatives were numbered or lettered--were investigated on a large sample of college freshmen. (Author/JKS)
Descriptors: Higher Education, Item Analysis, Questionnaires, Rating Scales

Shepard, Lorrie A.; And Others – Journal of Educational Measurement, 1985
The purpose of this research was to recommend an item bias procedure when the number of minority examinees is too small to use preferred three-parameter item response theory (IRT) methods. The chi-square, Angoff delta-plot, and pseudo-IRT indices were compared with both real and simulated data. (Author/DWH)
Descriptors: Estimation (Mathematics), Item Analysis, Latent Trait Theory, Minority Groups