Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Andres De Los Reyes; Mo Wang; Matthew D. Lerner; Bridget A. Makol; Olivia M. Fitzpatrick; John R. Weisz – Grantee Submission, 2022
Researchers strategically assess youth mental health by soliciting reports from multiple informants. Typically, these informants (e.g., parents, teachers, youth themselves) vary in the social contexts where they observe youth. Decades of research reveal that the most common data conditions produced with this approach consist of discrepancies…
Descriptors: Mental Health, Measurement Techniques, Evaluation Methods, Research
Park, Siwon – Journal of Pan-Pacific Association of Applied Linguistics, 2017
This paper examines how different test methods may tap different aspects of second language knowledge. It employs multiple-choice (MC) and constructed response (CR) items which yield distinct or convergent information in the computer delivered testing of English in its presentation of this factor. In order to examine the effects of test method, a…
Descriptors: Evaluation Methods, Second Language Learning, English (Second Language), Computer Assisted Testing
Ngo, Federick; Kwon, William W. – Research in Higher Education, 2015
Community college students are often placed in developmental math courses based on the results of a single placement test. However, concerns about accurate placement have recently led states and colleges across the country to consider using other measures to inform placement decisions. While the relationships between college outcomes and such…
Descriptors: Access to Education, Success, Community Colleges, Mathematics Education
Rojahn, Johannes; Schroeder, Stephen R.; Mayo-Ortega, Liliana; Oyama-Ganiko, Rosao; LeBlanc, Judith; Marquis, Janet; Berke, Elizabeth – Research in Developmental Disabilities: A Multidisciplinary Journal, 2013
Reliable and valid assessment of aberrant behaviors is essential in empirically verifying prevention and intervention for individuals with intellectual or developmental disabilities (IDD). Few instruments exist which assess behavior problems in infants. The current longitudinal study examined the performance of three behavior-rating scales for…
Descriptors: Rating Scales, Behavior Problems, Developmental Disabilities, Infants
Marsh, Herbert W.; Nagengast, Benjamin; Morin, Alexandre J. S.; Parada, Roberto H.; Craven, Rhonda G.; Hamilton, Linda R. – Journal of Educational Psychology, 2011
Existing research posits multiple dimensions of bullying and victimization but has not identified well-differentiated facets of these constructs that meet standards of good measurement: goodness of fit, measurement invariance, lack of differential item functioning, and well-differentiated factors that are not so highly correlated as to detract…
Descriptors: Locus of Control, Test Bias, Bullying, Structural Equation Models
Goldschmidt, Pete; Martinez, Jose Felipe; Niemi, David; Baker, Eva L. – Educational Assessment, 2007
In this article we examine empirical evidence on the criterion, predictive, transfer, and fairness aspects of validity of a large-scale language arts performance assessment, referred to as the Performance Assignment (PA). We use multilevel models to avoid biased inferences that might result from the naturally nested data. Specifically, we examine…
Descriptors: Language Arts, Performance Based Assessment, Academic Achievement, Performance Tests
Van Velsor, Ellen; Leslie, Jean Brittain; Fleenor, John W. – 1997
This book presents a nontechnical, step-by-step process that shows how to evaluate any 360-degree-feedback instrument intended for management or leadership development. The 360-degree-feedback instruments collect information from different sources about a target manager's performance, and they offer multiple perspectives. The 16 steps in…
Descriptors: Administrator Characteristics, Evaluation Methods, Feedback, Interrater Reliability

Howard, George S.; And Others – Journal of Educational Psychology, 1985
The accuracy of various evaluation methods for assessing teacher effectiveness was investigated. College instructors (n=43) were rated by students, colleagues, trained classroom raters, former students, and themselves. Results indicate these methods to be more valid than prior research would suggest. (BS)
Descriptors: College Faculty, Evaluation Methods, Higher Education, Interrater Reliability
Widaman, Keith F.; And Others – American Journal on Mental Retardation, 1993
Measures of 4 traits (cognitive competence, social competence, social maladaption, and personal maladaption) were obtained on 157 persons with mental retardation, using 3 measurements: standardized assessment instrument, day shift staff ratings, and evening shift staff ratings. The multitrait-multimethod matrix procedure demonstrated strong…
Descriptors: Adaptive Behavior (of Disabled), Behavior Rating Scales, Cognitive Ability, Construct Validity
Byrne, Barbara M. – 1989
The construct validity of a multidimensional adolescent self-concept was estimated separately for males and females, and the results were compared across gender. The application of four multitrait-multimethod (MTMM) approaches to estimating construct validity was demonstrated, and the scope and consistency of the findings derived from each were…
Descriptors: Adolescents, Analysis of Variance, Comparative Analysis, Construct Validity