Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 3 |
| Since 2017 (last 10 years) | 3 |
| Since 2007 (last 20 years) | 6 |
Descriptor
| Evaluation Methods | 62 |
| Test Interpretation | 62 |
| Test Reliability | 49 |
| Test Validity | 36 |
| Test Construction | 21 |
| Student Evaluation | 17 |
| Evaluation Criteria | 14 |
| Measurement Techniques | 12 |
| Elementary Secondary Education | 11 |
| Test Use | 11 |
| Reliability | 9 |
Author
| Fleming, Dan B. | 2 |
| Anderson, James A. | 1 |
| Allen, R. R. | 1 |
| Anderson, Colette | 1 |
| Archer, Robert P. | 1 |
| Arreola, Raoul A. | 1 |
| Athelstan, Gary T. | 1 |
| Benavidez, Charlotte | 1 |
| Binghan Zheng | 1 |
| Bowman, Michael L. | 1 |
| Campbell, Mary | 1 |
Education Level
| Higher Education | 3 |
| Elementary Education | 1 |
| Elementary Secondary Education | 1 |
| Grade 7 | 1 |
| Junior High Schools | 1 |
| Middle Schools | 1 |
| Postsecondary Education | 1 |
| Secondary Education | 1 |
Audience
| Practitioners | 13 |
| Teachers | 7 |
| Administrators | 4 |
| Researchers | 1 |
| Students | 1 |
Location
| Australia | 1 |
| China | 1 |
| Greece | 1 |
| Michigan | 1 |
| South Africa | 1 |
Laws, Policies, & Programs
| Elementary and Secondary… | 1 |
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Chao Han; Binghan Zheng; Mingqing Xie; Shirong Chen – Interpreter and Translator Trainer, 2024
Human raters' assessment of interpreting is a complex process. Previous researchers have mainly relied on verbal reports to examine this process. To advance our understanding, we conducted an empirical study, collecting raters' eye-movement and retrospection data in a computerised interpreting assessment in which three groups of raters (n = 35)…
Descriptors: Foreign Countries, College Students, College Graduates, Interrater Reliability
Eirini M. Mitropoulou; Leonidas A. Zampetakis; Ioannis Tsaousis – Evaluation Review, 2024
Unfolding item response theory (IRT) models are important alternatives to dominance IRT models in describing the response processes on self-report tests. Their usage is common in personality measures, since they indicate potential differentiations in test score interpretation. This paper aims to gain a better insight into the structure of trait…
Descriptors: Foreign Countries, Adults, Item Response Theory, Personality Traits
Plucker, Jonathan A.; Qian, Meihua; Schmalensee, Stephanie L. – Creativity Research Journal, 2014
In recent years, the social sciences have seen a resurgence in the study of divergent thinking (DT) measures. However, many of these recent advances have focused on abstract, decontextualized DT tasks (e.g., list as many things as you can think of that have wheels). This study provides a new perspective by exploring the reliability and validity…
Descriptors: Creative Thinking, Creativity Tests, Scoring Formulas, Evaluation Methods
Geisinger, Kurt F. – International Journal of Testing, 2012
This article sets the stage for the description of a variety of approaches to test reviewing worldwide. It describes the importance of test reviewing as a protection of the public and of society and also the benefits of this activity for test users, who must choose measures to use in particular situations with particular clients at a particular…
Descriptors: Test Reviews, Evaluation Methods, Evaluation Criteria, Global Approach
Gray, B. Thomas – 1997
Validity is a critically important issue with far-reaching implications for testing. The history of conceptualizations of validity over the past 50 years is reviewed, and 3 important areas of controversy are examined. First, the question of whether the three traditionally recognized types of validity should be integrated as a unitary entity of…
Descriptors: Educational Testing, Evaluation Methods, Reliability, Scores
Peer reviewed: McKee, Lynne M.; Levinson, Edward M. – Career Development Quarterly, 1990
Discusses general issues and concerns relative to the adaptation of paper-pencil assessment instruments to computerized formats. Describes and evaluates Self-Directed Search computerized version (SDS-CV). Presents strengths and weaknesses of the SDS-CV and makes recommendations for its use. (Author/ABL)
Descriptors: Career Counseling, Computer Oriented Programs, Evaluation Methods, Reliability
Sullivan, Francis J. – 1986
A study examined how pragmatic form influences evaluation of student essays in university placement testing. Specifically, the study documented how patterns in students' use of information (assumed to be either old, inferable, or new for readers) affected the holistic scores for quality given to the essays. Subjects, 99 randomly selected entering…
Descriptors: College Freshmen, Essay Tests, Evaluation Criteria, Evaluation Methods
Peer reviewed: Uduehi, Joseph – Visual Arts Research, 1998
Centers on the development of an art test for determining subjects' preferences for simple and complex design, particularly the Uduehi preference test. Finds that this test truly represents elements of design from least to most complex; subjects' preference for designs increases with complexity; and the test is suitable for cross-cultural studies.…
Descriptors: Cross Cultural Studies, Design Preferences, Evaluation Methods, Test Construction
Peer reviewed: Nichols, Paul D.; Smith, Philip L. – Educational Measurement: Issues and Practice, 1998
This essay argues that reliability should be reconceptualized in a way that reflects the importance of the theoretical expectations of the test specialist and the learning and problem solving of the test takers. It is time to characterize clearly the substantive theoretical framework supporting reliability studies and the technical evaluation of…
Descriptors: Data Analysis, Educational Research, Educational Theories, Evaluation Methods
Fortna, Richard O. – 1981
Measurement terms used in Title I evaluation are contained in this glossary. Several types of measurement techniques are identified and defined. Other measurement terms which are defined include those relating to validity, reliability, statistical analysis, test interpretation, and program effectiveness. (DWH)
Descriptors: Educational Testing, Evaluation Methods, Glossaries, Program Evaluation
Peer reviewed: Fleming, Dan B. – Peabody Journal of Education, 1977
Descriptors: Accountability, Evaluation Methods, Social Studies, Standardized Tests
Simpson, J. D. – Audio-Visual Language Journal, 1974
Some basic statistical concepts relevant to the teacher--mean scores, standard deviation, normal and skewed distributions, z scores, item analysis, standard error of measurement, reliability--and their use by the teacher are explained. (RM)
Descriptors: Error of Measurement, Evaluation Methods, Norm Referenced Tests, Scoring
Campbell, Mary – Engl Quart, 1970
Critically discusses existing objective tests of students' language abilities. (RD)
Descriptors: College Bound Students, Evaluation Methods, Language Skills, Objective Tests
Peer reviewed: Yarger, Carmel Collum – Volta Review, 1996
Discusses the use of the Test of Written Language (3rd ed.) in assessing the writing ability of students with significant hearing losses. Reviews the test's format, administration, scoring, standardization and norming. Evaluates the test's reliability and validity as a screening tool for determining student strengths and weaknesses. (CR)
Descriptors: Elementary Secondary Education, Evaluation Methods, Hearing Impairments, Student Evaluation

