ERIC - Search Results

Showing 1 to 15 of 17 results

Using Multilabel Neural Network to Score High-Dimensional Assessments for Different Use Foci: An Example with College Major Preference Assessment

Peer reviewed

Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025

Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…

Descriptors: Tests, Testing, Scores, Test Construction

Validation of the Child Observation Record Advantage 1.5 Assessment Tool for Preschool Children: A Multilevel Bifactor Modeling Approach

Peer reviewed

Akaeze, Hope O.; Wu, Jamie Heng-Chieh; Lawrence, Frank R.; Weber, Everett P. – Journal of Psychoeducational Assessment, 2023

This paper reports an investigation into the psychometric properties of the COR-Advantage1.5 (COR-Adv1.5) assessment tool, a criterion-referenced observation-based instrument designed to assess the developmental abilities of children from birth through kindergarten. Using data from 8534 children participating in a state-funded preschool program…

Descriptors: Criterion Referenced Tests, Evaluation Methods, Measures (Individuals), Measurement Techniques

Validation Methods for Aggregate-Level Test Scale Linking: A Case Study Mapping School District Test Score Distributions to a Common Scale. CEPA Working Paper No. 16-09

Reardon, Sean F.; Ho, Andrew D.; Kalogrides, Demetra – Stanford Center for Education Policy Analysis, 2019

Linking score scales across different tests is considered speculative and fraught, even at the aggregate level (Feuer et al., 1999; Thissen, 2007). We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that…

Descriptors: Test Validity, Evaluation Methods, School Districts, Scores

Test Assembly Implications for Providing Reliable and Valid Subscores

Peer reviewed

Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J. – Educational Assessment, 2017

This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…

Descriptors: Scores, Test Construction, Test Reliability, Test Validity

In Search of Validity Evidence in Support of the Interpretation and Use of Assessments of Complex Constructs: Discussion of Research on Assessing 21st Century Skills

Peer reviewed

Ercikan, Kadriye; Oliveri, María Elena – Applied Measurement in Education, 2016

Assessing complex constructs such as those discussed under the umbrella of 21st century constructs highlights the need for a principled assessment design and validation approach. In our discussion, we made a case for three considerations: (a) taking construct complexity into account across various stages of assessment development such as the…

Descriptors: Evaluation Methods, Test Construction, Design, Scaling

Improving Comprehension Assessment for Middle and High School Students: Challenges and Opportunities

Peer reviewed

Sabatini, John; Petscher, Yaacov; O'Reilly, Tenaha; Truckenmiller, Adrea – Grantee Submission, 2015

For decades, standardized reading comprehension tests have consisted of a series of passages and associated multiple-choice questions. Although widely used in and out of the classroom, there continues to be considerable disagreement regarding how or whether such tests have net value in the service of advancing educational progress in reading. This…

Descriptors: Middle School Students, High School Students, Reading Comprehension, Reading Tests

The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating

Peer reviewed

Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016

The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…

Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores

Linking through Improved Design, Not Redefinition: Commentary on Newton

Peer reviewed

Walker, Michael E. – Measurement: Interdisciplinary Research and Perspectives, 2010

"Linking" is a term given to a general class of procedures by which one represents scores X on one test or measure in terms of scores Y on another test or measure. A recent taxonomy by Holland and Dorans (2006; Holland, 2007) organizes the various types of links into three broad categories: prediction, scale aligning, and equating. In…

Descriptors: Foreign Countries, Test Construction, Test Validity, Measurement Techniques

Traditional versus Rasch Scaling of Aggregate Data in the Multitrait-Multimethod Matrix.

Turner, Carol J.; Smith, Jeffrey K. – Measurement and Evaluation in Guidance, 1982

Used aggregate ratings of teacher behavior as data for a multitrait-multimethod validity analysis. Scaled ratings using Rasch latent trait scaling model and traditional scaling techniques. Compared Rasch-scaled multitrait-multimethod matrix to the traditionally scaled multitrait-multimethod matrix. Results showed Rasch scaling resulted in higher…

Descriptors: Children, Comparative Testing, Data Analysis, Elementary Education

Psychometric Characteristics of the Adolescent Coping Scale with Academically Gifted Adolescents.

Peer reviewed

Plucker, Jonathan A. – Journal of Secondary Gifted Education, 1997

This study used a sample (n=967) of academically gifted adolescent students attending summer enrichment programs and participating in urban school districts' gifted programs to evaluate the reliability and validity of the Adolescent Coping Scale. Results suggest the instrument is sufficiently reliable for group administration and research purposes…

Descriptors: Academically Gifted, Adolescents, Coping, Elementary Secondary Education

Assessing the Validity of the National Assessment of Educational Progress: NAEP Technical Review Panel White Paper.

Linn, Robert L.; Baker, Eva L. – 1996

During the past 6 years, under a contract from the National Center for Education Statistics, a Technical Review Panel has overseen and conducted a series of research studies addressing a range of validity questions relevant to the various uses and interpretations of the National Assessment of Educational Progress (NAEP). Study topics included: (1)…

Descriptors: Achievement Tests, Comparative Analysis, Data Analysis, Educational Policy

Handbook on Measurement, Assessment, and Evaluation in Higher Education

Secolsky, Charles, Ed.; Denison, D. Brian, Ed. – Routledge, Taylor & Francis Group, 2011

Increased demands for colleges and universities to engage in outcomes assessment for accountability purposes have accelerated the need to bridge the gap between higher education practice and the fields of measurement, assessment, and evaluation. The "Handbook on Measurement, Assessment, and Evaluation in Higher Education" provides higher…

Descriptors: Generalizability Theory, Higher Education, Institutional Advancement, Teacher Effectiveness

A Look at Behavioristic Measurement of English Composition in United States Public Schools, 1901-1941.

Younglove, William A. – 1983

In the early twentieth century, behaviorist Edward L. Thorndike began the development and use of measurement scales to replace personal judgment to evaluate student compositions in U.S. public schools. In 1912, utilizing the Fullerton and Cattell equal difference theorem, Milo B. Hillegas released the first scientifically designed scale to measure…

Descriptors: Behavior Theories, Educational History, Elementary Secondary Education, Evaluation Methods

Assessing the Construct Validity of a Criterion-Referenced Test: A Nomological Network Approach.

Shoemaker, Sharon H.; Johnson, Richard T. – 1981

The applicability of a recognized construct validation procedure, the nomological network, to a sixth-grade criterion-referenced mathematics test developed in a metropolitan school system was investigated. Two samples were tested. The first sample was used to investigate two hypotheses: items within each objective will constitute a Guttman scale,…

Descriptors: Criterion Referenced Tests, Educational Objectives, Elementary School Mathematics, Evaluation Methods

The Multidimensional Character of Teaching Effectiveness: A Comparative Analysis of Student Evaluation Responses of Full and Part-Time Faculty.

Obiekwe, Jerry C. – 1999

This study compared college students' responses on their evaluations of the effectiveness of full- and part-time college faculty. A group of 1,101 students completed evaluation instruments for all courses taught by full-time faculty, and 2,067 students completed evaluations for all courses taught by part-time faculty in spring 1998. In fall 1998,…

Descriptors: College Faculty, College Students, Evaluation Methods, Full Time Faculty
