Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Sanders, Sara – National Technical Assistance Center for the Education of Neglected or Delinquent Children and Youth (NDTAC), 2019
This guide is designed to assist States, agencies, and/or facilities who work with youth who are neglected, delinquent, or at-risk (N or D). The information in the guide will benefit those who are (a) interested in implementing pre-posttests, (b) in the process of identifying an appropriate pre-posttest, or (c) ready to evaluate current testing…
Descriptors: At Risk Students, Delinquency, Pretests Posttests, Testing
Klesch, Heather S. – ProQuest LLC, 2010
The reporting of scores on educational tests is at times misunderstood, misinterpreted, and potentially confusing to examinees and other stakeholders who may need to interpret test scores. In reporting test results to examinees, there is a need for clarity in the message communicated. As pressure rises for students to demonstrate performance at a…
Descriptors: Feedback (Response), Test Results, Focus Groups, Educational Testing
Sireci, Stephen G.; Han, Kyung T.; Wells, Craig S. – Educational Assessment, 2008
In the United States, when English language learners (ELLs) are tested, they are usually tested in English and their limited English proficiency is a potential cause of construct-irrelevant variance. When such irrelevancies affect test scores, inaccurate interpretations of ELLs' knowledge, skills, and abilities may occur. In this article, we…
Descriptors: Test Use, Educational Assessment, Psychological Testing, Validity
Daniels, Roberta R. – G/C/T, 1986
Results of gifted fourth-sixth graders on the Structure of the Intellect Learning Abilities Test are analyzed in terms of the test's measure of five operations (evaluation, convergent production, divergent production, memory, and cognition). (CL)
Descriptors: Evaluation Methods, Gifted, Intermediate Grades, Scores

Schmoker, Mike – Educational Leadership, 2000
Despite their shortcomings, standardized tests provide numerical, intelligible data on how a child, school, or district is performing and vital information about patterns of strengths and weaknesses among students in a classroom, school, or district. A sensible compromise in the testing debate would be helpful. (MLH)
Descriptors: Elementary Secondary Education, Evaluation Methods, Scores, Standardized Tests
Humphries-Wadsworth, Terresa M. – 1998
The American Psychological Association, in the late 1940s, began work to establish a code of ethics to include and address the needs of members in scientific and applied fields. Out of the ethics work emerged a set of standards for evaluating psychological tests. Four categories, or types of validity, were identified: content, predictive,…
Descriptors: Codes of Ethics, Definitions, Evaluation Methods, Psychological Testing

MacKay, Gilbert; Lundie, Jennifer – International Journal of Disability, Development and Education, 1998
Recognizes the attraction of Goal Attainment Scaling (GAS), a technique that uses a scale to measure a client's achievement, but suggests that there are concerns about the calculation of its standard scores. Examples show how GAS may be used in service development, whether or not numerical values are attached. (Author/CR)
Descriptors: Achievement Gains, Achievement Rating, Adults, Children
Goodman, Dean P.; Hambleton, Ronald K. – Applied Measurement in Education, 2004
A critical, but often neglected, component of any large-scale assessment program is the reporting of test results. In the past decade, a body of evidence has been compiled that raises concerns over the ways in which these results are reported to and understood by their intended audiences. In this study, current approaches for reporting…
Descriptors: Test Results, Student Evaluation, Scores, Testing Programs

Kehoe, Jerard F.; Tenopyr, Mary L. – Psychological Assessment, 1994
Methods of adjusting group differences in assessment and test scores are described, classified, and evaluated. Investigation of the relationship between intended use of test scores and the appropriate meaning of scores is essential for fair treatment in assessment. (SLD)
Descriptors: Classification, Educational Assessment, Equal Education, Evaluation Methods
Kaiser, Paul D. – 1995
This paper describes a process used to assess the technical competency of non-direct line candidates for managerial positions in computer specialties. The Training and Experience (T&E) qualifying test, a multi-method approach combining task-based, computer scorable evaluations of training and experience with multiple-choice batteries, was used as…
Descriptors: Administrators, Competence, Evaluation Methods, Experience

Linn, Robert L. – Educational Measurement: Issues and Practice, 1997
It is argued that consequential validity is a concept worth considering. The solution to defining "validity" is not to narrow the concept, but to allow for the differential prediction provided by tests in different circumstances. Consequences of the uses and interpretations of test scores are central to their evaluation. (SLD)
Descriptors: Educational Assessment, Educational Testing, Elementary Secondary Education, Evaluation Methods

Healy, Charles C.; And Others – Measurement and Evaluation in Counseling and Development, 1990
Community college students' (n=662) scores on My Vocational Situation related modestly to some concurrent signs of career progress but did not predict grades or enrollment decisions, thereby calling into question the instrument's usefulness for prescription or evaluation, unless it is revised. (Author)
Descriptors: Achievement Tests, Career Development, Community Colleges, Evaluation Methods
Thompson, Bruce; Snyder, Patricia A. – 1997
The mission of the "Journal of Counseling and Development" (JCD) includes the attempt to serve as a "scholarly record of the counseling profession" and as part of the "conscience of the profession." This responsibility requires the willingness to engage in self-study. This study investigated two aspects of research…
Descriptors: Counseling, Educational Research, Effect Size, Evaluation Methods
Arter, Judith A. – 1982
Specific recommendations are made concerning the circumstances under which the benefits of out-of-level testing outweigh the problems associated with it. Topics explored are: various methods for deciding when a set of test scores is invalid and the utility of these methods for local evaluators, the accuracy of vertical scaling, and the usefulness…
Descriptors: Equated Scores, Evaluation Methods, Local Norms, Scores