Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Yaneva, Victoria; Clauser, Brian E.; Morales, Amy; Paniagua, Miguel – Journal of Educational Measurement, 2021
Eye-tracking technology can create a record of the location and duration of visual fixations as a test-taker reads test questions. Although the cognitive process the test-taker is using cannot be directly observed, eye-tracking data can support inferences about these unobserved cognitive processes. This type of information has the potential to…
Descriptors: Eye Movements, Test Validity, Multiple Choice Tests, Cognitive Processes
Liu, Bowen; Kennedy, Patrick C.; Seipel, Ben; Carlson, Sarah E.; Biancarosa, Gina; Davison, Mark L. – Journal of Educational Measurement, 2019
This article describes an ongoing project to develop a formative, inferential reading comprehension assessment of causal story comprehension. It has three features to enhance classroom use: equated scale scores for progress monitoring within and across grades, a scale score to distinguish among low-scoring students based on patterns of mistakes,…
Descriptors: Formative Evaluation, Reading Comprehension, Story Reading, Test Construction
Mislevy, Robert J. – Journal of Educational Measurement, 2016
Validity is the sine qua non of properties of educational assessment. While a theory of validity and a practical framework for validation have emerged over the past decades, most of the discussion has addressed familiar forms of assessment and psychological framings. Advances in digital technologies and in cognitive and social psychology have…
Descriptors: Test Validity, Technology, Cognitive Psychology, Social Psychology
Wang, Shiyu; Lin, Haiyan; Chang, Hua-Hua; Douglas, Jeff – Journal of Educational Measurement, 2016
Computerized adaptive testing (CAT) and multistage testing (MST) have become two of the most popular modes in large-scale computer-based sequential testing. Though most designs of CAT and MST exhibit strengths and weaknesses in recent large-scale implementations, there is no simple answer to the question of which design is better because different…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Format, Sequential Approach

Millman, Jason; Popham, W. James – Journal of Educational Measurement, 1974
The use of the regression equation derived from the Anglo-American sample to predict grades of Mexican-American students resulted in overprediction. An examination of the standardized regression weights revealed a significant difference in the weight given to the Scholastic Aptitude Test Mathematics Score. (Author/BB)
Descriptors: Criterion Referenced Tests, Item Analysis, Predictive Validity, Scores

Woodson, M. I. Chas. E. – Journal of Educational Measurement, 1974
Descriptors: Criterion Referenced Tests, Item Analysis, Test Construction, Test Reliability

Ackerman, Terry A. – Journal of Educational Measurement, 1992
The difference between item bias and item impact and the way they relate to item validity are discussed from a multidimensional item response theory perspective. The Mantel-Haenszel procedure and the Simultaneous Item Bias strategy are used in a Monte Carlo study to illustrate detection of item bias. (SLD)
Descriptors: Causal Models, Computer Simulation, Construct Validity, Equations (Mathematics)

Washington, William N.; Godfrey, R. Richard – Journal of Educational Measurement, 1974
Item statistics for illustrated and written items drawn from the same content areas were compared using F ratios. The results indicated that illustrated items performed slightly better than matched written items, and that the best-performing category of illustrated items was tables. (Author/BB)
Descriptors: Achievement Tests, Illustrations, Test Construction, Test Items

Grier, J. Brown – Journal of Educational Measurement, 1975
The expected reliability of a multiple-choice test is maximized by the use of three-alternative items. (Author)
Descriptors: Achievement Tests, Multiple Choice Tests, Test Construction, Test Reliability

Embretson, Susan; Gorin, Joanna – Journal of Educational Measurement, 2001
Examines testing practices in: (1) the past, in which the traditional paradigm left little room for cognitive psychology principles; (2) the present, in which testing research is enhanced by principles of cognitive psychology; and (3) the future, in which the potential of cognitive psychology should be fully realized through item design…
Descriptors: Cognitive Psychology, Construct Validity, Educational Research, Educational Testing

Haladyna, Thomas Michael – Journal of Educational Measurement, 1974
Classical test construction and analysis procedures are applicable and appropriate for use with criterion referenced tests when samples of both mastery and nonmastery examinees are employed. (Author/BB)
Descriptors: Criterion Referenced Tests, Item Analysis, Mastery Tests, Test Construction

Woodson, M. I. Charles E. – Journal of Educational Measurement, 1974
The basis for selection of the calibration sample determines the kind of scale that will be developed. A random sample from a population of individuals leads to a norm-referenced scale, and a sample representative of a range of abilities or characteristics leads to a criterion-referenced scale. (Author/BB)
Descriptors: Criterion Referenced Tests, Discriminant Analysis, Item Analysis, Test Construction

Darlington, Richard B. – Journal of Educational Measurement, 1971
Four definitions of "cultural fairness" are critically examined. Suggestions for dealing with conflicts between the two goals of maximizing a test's validity and minimizing its culture-group discrimination are presented. The terms in which this judgment should be made, and methods of using its results, are described. (LR)
Descriptors: Cultural Background, Cultural Differences, Culture Fair Tests, Test Bias

Worthen, Blaine R.; Clark, Philip M. – Journal of Educational Measurement, 1971
Descriptors: Association Measures, College Students, Creativity, Creativity Tests