ERIC - Search Results

Showing all 11 results

Using Multilabel Neural Network to Score High-Dimensional Assessments for Different Use Foci: An Example with College Major Preference Assessment

Peer reviewed

Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025

Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…

Descriptors: Tests, Testing, Scores, Test Construction

The Operations Triad Model and Youth Mental Health Assessments: Catalyzing a Paradigm Shift in Measurement Validation

Peer reviewed

Andres De Los Reyes; Mo Wang; Matthew D. Lerner; Bridget A. Makol; Olivia M. Fitzpatrick; John R. Weisz – Grantee Submission, 2022

Researchers strategically assess youth mental health by soliciting reports from multiple informants. Typically, these informants (e.g., parents, teachers, youth themselves) vary in the social contexts where they observe youth. Decades of research reveal that the most common data conditions produced with this approach consist of discrepancies…

Descriptors: Mental Health, Measurement Techniques, Evaluation Methods, Research

The Effects of Test Method on L2 Reading and Listening Performance

Peer reviewed

Park, Siwon – Journal of Pan-Pacific Association of Applied Linguistics, 2017

This paper examines how different test methods may tap different aspects of second language knowledge. It employs multiple-choice (MC) and constructed response (CR) items which yield distinct or convergent information in the computer delivered testing of English in its presentation of this factor. In order to examine the effects of test method, a…

Descriptors: Evaluation Methods, Second Language Learning, English (Second Language), Computer Assisted Testing

Using Multiple Measures to Make Math Placement Decisions: Implications for Access and Success in Community Colleges

Peer reviewed

Ngo, Federick; Kwon, William W. – Research in Higher Education, 2015

Community college students are often placed in developmental math courses based on the results of a single placement test. However, concerns about accurate placement have recently led states and colleges across the country to consider using other measures to inform placement decisions. While the relationships between college outcomes and such…

Descriptors: Access to Education, Success, Community Colleges, Mathematics Education

Validity and Reliability of the "Behavior Problems Inventory," the "Aberrant Behavior Checklist," and the "Repetitive Behavior Scale--Revised" among Infants and Toddlers at Risk for Intellectual or Developmental Disabilities: A Multi-Method Assessment Approach

Peer reviewed

Rojahn, Johannes; Schroeder, Stephen R.; Mayo-Ortega, Liliana; Oyama-Ganiko, Rosao; LeBlanc, Judith; Marquis, Janet; Berke, Elizabeth – Research in Developmental Disabilities: A Multidisciplinary Journal, 2013

Reliable and valid assessment of aberrant behaviors is essential in empirically verifying prevention and intervention for individuals with intellectual or developmental disabilities (IDD). Few instruments exist which assess behavior problems in infants. The current longitudinal study examined the performance of three behavior-rating scales for…

Descriptors: Rating Scales, Behavior Problems, Developmental Disabilities, Infants

Construct Validity of the Multidimensional Structure of Bullying and Victimization: An Application of Exploratory Structural Equation Modeling

Peer reviewed

Marsh, Herbert W.; Nagengast, Benjamin; Morin, Alexandre J. S.; Parada, Roberto H.; Craven, Rhonda G.; Hamilton, Linda R. – Journal of Educational Psychology, 2011

Existing research posits multiple dimensions of bullying and victimization but has not identified well-differentiated facets of these constructs that meet standards of good measurement: goodness of fit, measurement invariance, lack of differential item functioning, and well-differentiated factors that are not so highly correlated as to detract…

Descriptors: Locus of Control, Test Bias, Bullying, Structural Equation Models

Relationships among Measures as Empirical Evidence of Validity: Incorporating Multiple Indicators of Achievement and School Context

Peer reviewed

Goldschmidt, Pete; Martinez, Jose Felipe; Niemi, David; Baker, Eva L. – Educational Assessment, 2007

In this article we examine empirical evidence on the criterion, predictive, transfer, and fairness aspects of validity of a large-scale language arts performance assessment, referred to as the Performance Assignment (PA). We use multilevel models to avoid biased inferences that might result from the naturally nested data. Specifically, we examine…

Descriptors: Language Arts, Performance Based Assessment, Academic Achievement, Performance Tests

Choosing 360. A Guide to Evaluating Multi-Rater Feedback Instruments for Management Development.

Van Velsor, Ellen; Leslie, Jean Brittain; Fleenor, John W. – 1997

This book presents a nontechnical, step-by-step process that shows how to evaluate any 360-degree-feedback instrument intended for management or leadership development. The 360-degree-feedback instruments collect information from different sources about a target manager's performance, and they offer multiple perspectives. The 16 steps in…

Descriptors: Administrator Characteristics, Evaluation Methods, Feedback, Interrater Reliability

Construct Validity of Measures of College Teaching Effectiveness.

Peer reviewed

Howard, George S.; And Others – Journal of Educational Psychology, 1985

The accuracy of various evaluation methods for assessing teacher effectiveness was investigated. College instructors (n=43) were rated by students, colleagues, trained classroom raters, former students, and themselves. Results indicate these methods to be more valid than prior research would suggest. (BS)

Descriptors: College Faculty, Evaluation Methods, Higher Education, Interrater Reliability

Construct Validity of Dimensions of Adaptive Behavior: A Multitrait-Multimethod Evaluation.

Widaman, Keith F.; And Others – American Journal on Mental Retardation, 1993

Measures of 4 traits (cognitive competence, social competence, social maladaption, and personal maladaption) were obtained on 157 persons with mental retardation, using 3 measurements: standardized assessment instrument, day shift staff ratings, and evening shift staff ratings. The multitrait-multimethod matrix procedure demonstrated strong…

Descriptors: Adaptive Behavior (of Disabled), Behavior Rating Scales, Cognitive Ability, Construct Validity

Four Multitrait-Multimethod Approaches to Examining the Construct Validity of Adolescent Self-Concept within and across Gender.

Byrne, Barbara M. – 1989

The construct validity of a multidimensional adolescent self-concept was estimated separately for males and females, and the results were compared across gender. The application of four multitrait-multimethod (MTMM) approaches to estimating construct validity was demonstrated, and the scope and consistency of the findings derived from each were…

Descriptors: Adolescents, Analysis of Variance, Comparative Analysis, Construct Validity
