Publication Date
In 2025 | 1 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 3 |
Since 2006 (last 20 years) | 5 |
Descriptor
Evaluation Methods | 60 |
Test Reliability | 60 |
Test Use | 60 |
Test Validity | 44 |
Test Construction | 26 |
Student Evaluation | 19 |
Educational Assessment | 15 |
Elementary Secondary Education | 11 |
Higher Education | 9 |
Test Interpretation | 9 |
Psychometrics | 8 |
Author
Aiken, Lewis R. | 1 |
Amery D. Wu | 1 |
Axelrod, Bradley N. | 1 |
Bailey, Earletta | 1 |
Baxter, Gail P. | 1 |
Blake, Jennifer M. | 1 |
Blakemore, Thomas | 1 |
Bouwens, M. R. J. | 1 |
Boyle, J. David | 1 |
Bradley, Sandra | 1 |
Bricker, Diane | 1 |
Education Level
Elementary Education | 2 |
Elementary Secondary Education | 2 |
Adult Basic Education | 1 |
Adult Education | 1 |
Early Childhood Education | 1 |
Higher Education | 1 |
Postsecondary Education | 1 |
Preschool Education | 1 |
Audience
Practitioners | 7 |
Teachers | 4 |
Researchers | 2 |
Students | 1 |
Laws, Policies, & Programs
Every Student Succeeds Act… | 1 |
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Sanders, Sara – National Technical Assistance Center for the Education of Neglected or Delinquent Children and Youth (NDTAC), 2019
This guide is designed to assist States, agencies, and/or facilities who work with youth who are neglected, delinquent, or at-risk (N or D). The information in the guide will benefit those who are (a) interested in implementing pre-posttests, (b) in the process of identifying an appropriate pre-posttest, or (c) ready to evaluate current testing…
Descriptors: At Risk Students, Delinquency, Pretests Posttests, Testing
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Caselman, Tonia D.; Self, Patricia A. – Children & Schools, 2008
Early identification of social-emotional behavioral problems in infants and preschoolers is critical. Nine parent-report and caregiver/teacher-report instruments measuring preschool social-emotional behavioral problems and strengths are reviewed. Advantages to the use of parent-report and caregiver/teacher-report instruments are that they are easy…
Descriptors: Identification, Psychometrics, Evaluation Methods, Child Caregivers

Porter, Kathleen F.; Bradley, Sandra – American Annals of the Deaf, 1985
The speech intelligibility of 21 hearing-impaired adolescents was measured by the National Technical Institute for the Deaf (NTID) rating scale, the Speech Intelligibility Evaluation, and the Speech Intelligibility Test for Deaf Children. The practical advantages and disadvantages of each procedure are discussed and recommendations made for their…
Descriptors: Adolescents, Deafness, Evaluation Methods, Speech Skills
Falk, Beverly; Ort, Suzanne Wichterle; Moirs, Katie – Educational Assessment, 2007
This article describes the findings of studies conducted on a large-scale, classroom-based performance assessment of literacy for the early grades designed to provide information that is useful for reporting, as well as teaching. Technical studies found the assessment to be a promising instrument that is reliable and valid. Follow-up studies of…
Descriptors: Program Effectiveness, Performance Based Assessment, Student Evaluation, Evaluation Research

Axelrod, Bradley N.; And Others – Psychological Assessment, 1996
The underlying structure of the Postconcussion Syndrome Questionnaire (PCS) was evaluated in a large sample of 1,116 medical and psychiatric patients. Balancing internal consistency, confirmatory factor analysis, and parsimony results in endorsement of the four-factor solution for the PCS for this sample. (SLD)
Descriptors: Adults, Evaluation Methods, Factor Structure, Head Injuries

MacKay, Gilbert; Lundie, Jennifer – International Journal of Disability, Development and Education, 1998
Recognizes the attraction of Goal Attainment Scaling (GAS), a technique that uses a scale to measure a client's achievement, but suggests that there are concerns about the calculation of its standard scores. Examples show how GAS may be used in service development, whether or not numerical values are attached. (Author/CR)
Descriptors: Achievement Gains, Achievement Rating, Adults, Children
Peck, Curtiss S. – 1995
The relevance of assessing attention or concentration skills for personnel selection is discussed, and how a person's interpersonal characteristics are influenced by and influence attentional skills is explored. Scales in the Test of Attentional and Interpersonal Style (TAIS) inventory developed by Robert Nideffer are described. The interaction of…
Descriptors: Attention, Evaluation Methods, Interpersonal Relationship, Personnel Selection
Bricker, Diane; Bailey, Earletta – 1983
The study examined psychometric properties of the Comprehensive Early Evaluation and Programming System (CEEPS), a criterion-referenced instrument designed for handicapped children from birth to 3 years old. The instrument was intended to provide specific information to develop program objectives across a range of developmental areas and to assess…
Descriptors: Criterion Referenced Tests, Disabilities, Early Childhood Education, Evaluation Methods
Morris, Lynn Lyons; Fitz-Gibbon, Carol Taylor; Lindheim, Elaine – 1987
The "CSE Program Evaluation Kit" is a series of nine books intended to assist people conducting program evaluations. This volume, the seventh in the kit, provides an overview of a variety of approaches to measuring performance outcomes. It presents considerations in deciding what to measure and in selecting or developing instruments best suited to…
Descriptors: Evaluation Methods, Evaluation Utilization, Performance Tests, Program Evaluation
Wolfe, Edward W.; Kao, Chi-Wen – 1996
This paper reports the results of an analysis of the relationship between scorer behaviors and score variability. Thirty-six essay scorers were interviewed and asked to perform a think-aloud task as they scored 24 essays. Each comment made by a scorer was coded according to its content focus (i.e., appearance, assignment, mechanics, communication,…
Descriptors: Content Analysis, Educational Assessment, Essays, Evaluation Methods
A Comparison of the Kansas Marital Satisfaction Scale and the Locke-Wallace Marital Adjustment Test.
White, Mark B.; And Others – 1990
Past research has suggested that the Kansas Marital Satisfaction Scale (KMS) is a brief, reliable, and valid measure of marital satisfaction. This study was conducted to: (1) examine responses on the KMS from a national sample of couples; (2) assess the construct validity of the KMS through a comparison with the Locke-Wallace Marital Adjustment…
Descriptors: Adjustment (to Environment), Construct Validity, Evaluation Methods, Marital Satisfaction

Lange, Bob – Reading Teacher, 1982
Examines the arguments against the indiscriminate use of readability formulas. (FL)
Descriptors: Elementary Education, Evaluation Methods, Readability Formulas, Reading Diagnosis
Yu, Dickie; And Others – Canadian Journal for Exceptional Children, 1985
A study involving 95 retarded persons (16-66 years old) revealed that the Objective Behavioral Assessment (OBA), a behavioral system for assessing self-care, social, sheltered domestic, prevocational, and sheltered work performance skills, has a high degree of objectivity and acceptable standards of inter-rater and test-retest reliability. (CL)
Descriptors: Behavior Rating Scales, Evaluation Methods, Mental Retardation, Moderate Mental Retardation