Publication Date
In 2025 | 1 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 3 |
Since 2006 (last 20 years) | 8 |
Descriptor
Evaluation Methods | 93 |
Test Use | 93 |
Test Validity | 93 |
Test Reliability | 44 |
Test Construction | 40 |
Student Evaluation | 27 |
Educational Assessment | 24 |
Elementary Secondary Education | 22 |
Test Interpretation | 19 |
Testing Problems | 17 |
Psychometrics | 13 |
Author
Linn, Robert L. | 3 |
Clark, John L. D. | 2 |
Johnson, Bil | 2 |
Aiken, Lewis R. | 1 |
Amery D. Wu | 1 |
Arter, Judith A. | 1 |
Bailey, Earletta | 1 |
Baird, Jo-Anne | 1 |
Baker, Eva L. | 1 |
Baxter, Gail P. | 1 |
Bishop, Laurence A. | 1 |
Education Level
Elementary Secondary Education | 4 |
Elementary Education | 2 |
Adult Basic Education | 1 |
Adult Education | 1 |
Early Childhood Education | 1 |
Higher Education | 1 |
Postsecondary Education | 1 |
Preschool Education | 1 |
Audience
Practitioners | 10 |
Teachers | 6 |
Researchers | 3 |
Community | 1 |
Parents | 1 |
Students | 1 |
Location
Canada | 4 |
United Kingdom (England) | 3 |
Australia | 1 |
Netherlands | 1 |
North Carolina | 1 |
United Kingdom | 1 |
United Kingdom (Wales) | 1 |
United States | 1 |
Virginia | 1 |
Laws, Policies, & Programs
Every Student Succeeds Act… | 1 |
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Sanders, Sara – National Technical Assistance Center for the Education of Neglected or Delinquent Children and Youth (NDTAC), 2019
This guide is designed to assist States, agencies, and/or facilities who work with youth who are neglected, delinquent, or at-risk (N or D). The information in the guide will benefit those who are (a) interested in implementing pre-posttests, (b) in the process of identifying an appropriate pre-posttest, or (c) ready to evaluate current testing…
Descriptors: At Risk Students, Delinquency, Pretests Posttests, Testing
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
O'Sullivan, Barry – Modern Language Journal, 2012
While Grosse and Voght (1991) set out a well-considered overview of LSP and identified areas in need of development, they limited their observations on the topic of assessment to a short section devoted to what they called the "proficiency movement." While it is true that they really did not have a lot to report on at the time they wrote their…
Descriptors: Theory Practice Relationship, Work Environment, Languages for Special Purposes, Language Tests
Walker, Michael E. – Measurement: Interdisciplinary Research and Perspectives, 2010
"Linking" is a term given to a general class of procedures by which one represents scores X on one test or measure in terms of scores Y on another test or measure. A recent taxonomy by Holland and Dorans (2006; Holland, 2007) organizes the various types of links into three broad categories: prediction, scale aligning, and equating. In…
Descriptors: Foreign Countries, Test Construction, Test Validity, Measurement Techniques
Baird, Jo-Anne – Measurement: Interdisciplinary Research and Perspectives, 2010
Newton's article (2010) makes three main contributions to the literature. First, it is transatlantic, bringing together literatures that have been dealing with similar problems, using sometimes different methods and certainly with distinctive educational, cultural perspectives. He points out that neither of these literatures has all of the…
Descriptors: Foreign Countries, Predictive Validity, Standards, Ethics
Caselman, Tonia D.; Self, Patricia A. – Children & Schools, 2008
Early identification of social-emotional behavioral problems in infants and preschoolers is critical. Nine parent-report and caregiver/teacher-report instruments measuring preschool social-emotional behavioral problems and strengths are reviewed. Advantages to the use of parent-report and caregiver/teacher-report instruments are that they are easy…
Descriptors: Identification, Psychometrics, Evaluation Methods, Child Caregivers
Falk, Beverly; Ort, Suzanne Wichterle; Moirs, Katie – Educational Assessment, 2007
This article describes the findings of studies conducted on a large-scale, classroom-based performance assessment of literacy for the early grades designed to provide information that is useful for reporting, as well as teaching. Technical studies found the assessment to be a promising instrument that is reliable and valid. Follow-up studies of…
Descriptors: Program Effectiveness, Performance Based Assessment, Student Evaluation, Evaluation Research

Brown, Elissa J.; And Others – Psychological Assessment, 1997
The psychometric adequacy of the Social Interaction Anxiety Scale and the Social Phobia Scale (both by R. P. Mattick and J. C. Clarke, 1989) was studied with 165 patients with anxiety disorders and 21 people without anxiety. Results support the usefulness of the scales for screening and treatment design and evaluation. (SLD)
Descriptors: Anxiety, Evaluation Methods, Mental Disorders, Patients
Peck, Curtiss S. – 1995
The relevance of assessing attention or concentration skills for personnel selection is discussed, and how a person's interpersonal characteristics are influenced by and influence attentional skills is explored. Scales in the Test of Attentional and Interpersonal Style (TAIS) inventory developed by Robert Nideffer are described. The interaction of…
Descriptors: Attention, Evaluation Methods, Interpersonal Relationship, Personnel Selection

Milner, Joel S. – Early Child Development and Care, 1989
Describes the Child Abuse Potential Inventory and associated psychometric data. Discusses uses and misuses of the inventory, general test limitations, and screening instruments available to professionals concerned with child maltreatment. (RJC)
Descriptors: Child Abuse, Evaluation Methods, Measurement Techniques, Test Interpretation
Bricker, Diane; Bailey, Earletta – 1983
The study examined psychometric properties of the Comprehensive Early Evaluation and Programming System (CEEPS), a criterion-referenced instrument designed for handicapped children birth to 3 years old. The instrument was intended to provide specific information to develop program objectives across a range of developmental areas and to assess…
Descriptors: Criterion Referenced Tests, Disabilities, Early Childhood Education, Evaluation Methods
Morris, Lynn Lyons; Fitz-Gibbon, Carol Taylor; Lindheim, Elaine – 1987
The "CSE Program Evaluation Kit" is a series of nine books intended to assist people conducting program evaluations. This volume, the seventh in the kit, provides an overview of a variety of approaches to measuring performance outcomes. It presents considerations in deciding what to measure and in selecting or developing instruments best suited to…
Descriptors: Evaluation Methods, Evaluation Utilization, Performance Tests, Program Evaluation
The Constant Danger of Sacrificing Validity to Reliability: Making Writing Assessment Serve Writers.
Wiggins, Grant – Assessing Writing, 1994
Suggests that assessment must be built into the curriculum and focused upon the kinds of skills students need. Considers much educational testing in writing to be reductionist, unrealistic, and detrimental to learning. Critiques writing assessment's trust and reliance on a single or small sample of student work collected and scored outside of a…
Descriptors: Elementary Secondary Education, Evaluation Methods, Reliability, Student Evaluation

Szapocznik, Jose; And Others – Journal of Marital and Family Therapy, 1991
Describes theoretically based structural family assessment procedure designed for use in evaluating therapy outcome. Explains how procedure involves evaluating family's interactional patterns on three tasks along six dimensions (structure, flexibility, resonance, developmental stage, identified patienthood, and conflict resolution). Presents…
Descriptors: Counseling Effectiveness, Evaluation Methods, Family Counseling, Outcomes of Treatment