Gerasimova, Daria – Journal of Educational Measurement, 2024
I propose two practical advances to the argument-based approach to validity: developing a living document and incorporating preregistration. First, I present a potential structure for the living document that includes an up-to-date summary of the validity argument. As the validation process may span across multiple studies, the living document…
Descriptors: Validity, Documentation, Methods, Research Reports
Shermis, Mark D. – Journal of Educational Measurement, 2022
One of the challenges of discussing validity arguments for machine scoring of essays centers on the absence of a commonly held definition and theory of good writing. At best, the algorithms attempt to measure select attributes of writing and calibrate them against human ratings with the goal of accurate prediction of scores for new essays.…
Descriptors: Scoring, Essays, Validity, Writing Evaluation
Lane, Suzanne – Journal of Educational Measurement, 2019
Rater-mediated assessments require the evaluation of the accuracy and consistency of the inferences made by the raters to ensure the validity of score interpretations and uses. Modeling rater response processes allows for a better understanding of how raters map their representations of the examinee performance to their representation of the…
Descriptors: Responses, Accuracy, Validity, Interrater Reliability
de la Torre, Jimmy – Journal of Educational Measurement, 2008
Most model fit analyses in cognitive diagnosis assume that a Q matrix is correct after it has been constructed, without verifying its appropriateness. Consequently, any model misfit attributable to the Q matrix cannot be addressed and remedied. To address this concern, this paper proposes an empirically based method of validating a Q matrix used…
Descriptors: Matrices, Validity, Models, Evaluation Methods
Roussos, Louis A.; Templin, Jonathan L.; Henson, Robert A. – Journal of Educational Measurement, 2007
This article describes a latent trait approach to skills diagnosis based on a particular variety of latent class models that employ item response functions (IRFs) as in typical item response theory (IRT) models. To enable and encourage comparisons with other approaches, this description is provided in terms of the main components of any…
Descriptors: Validity, Identification, Psychometrics, Item Response Theory

Kane, Michael T. – Journal of Educational Measurement, 2001
Provides a brief historical review of construct validity and discusses the current state of validity theory, emphasizing the role of arguments in validation. Examines the application of an argument-based approach with regard to the distinction between performance-based and theory-based interpretations and the role of consequences in validation.…
Descriptors: Construct Validity, Educational Testing, Performance Based Assessment, Theories

Embretson, Susan; Gorin, Joanna – Journal of Educational Measurement, 2001
Examines testing practices in: (1) the past, in which the traditional paradigm left little room for cognitive psychology principles; (2) the present, in which testing research is enhanced by principles of cognitive psychology; and (3) the future, in which the potential of cognitive psychology should be fully realized through item design.…
Descriptors: Cognitive Psychology, Construct Validity, Educational Research, Educational Testing

Simon, Alan J.; Joiner, Lee M. – Journal of Educational Measurement, 1976
The purpose of this study was to determine whether a Mexican version of the Peabody Picture Vocabulary Test could be improved by directly translating both forms of the American test, then using decision procedures to select the better item of each pair. The reliability of the simple translations suffered. (Author/BW)
Descriptors: Early Childhood Education, Spanish, Test Construction, Test Format

Leinhardt, Gaea; Seewald, Andrea Mar – Journal of Educational Measurement, 1981
The Student-Level Observation of Beginning Reading (SOBR) was designed to focus on the content of instructional activities in reading at the individual student level, and is based on a time sample of time spent on specific activities. (Author/BW)
Descriptors: Beginning Reading, Classroom Observation Techniques, Elementary Education, Learning Activities

Mullis, Ina V. S. – Journal of Educational Measurement, 1992
An overview is given of the consensus process for development of the frameworks underlying the National Assessment of Educational Progress (NAEP) assessments, with emphasis on those for the 1990 and 1992 mathematics assessments, the 1992 reading assessment, and the 1994 science assessments. Innovative techniques for 1992 are described. (SLD)
Descriptors: Academic Standards, Content Validity, Educational Assessment, Elementary Secondary Education