Showing 1 to 15 of 276 results
Peer reviewed
Wendy Chan – Asia Pacific Education Review, 2024
As evidence from evaluation and experimental studies continues to influence decision making and policymaking, applied researchers and practitioners require tools to derive valid and credible inferences. Over the past several decades, research in causal inference has progressed with the development and application of propensity scores. Since their…
Descriptors: Probability, Scores, Causal Models, Statistical Inference
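The propensity-score workflow referenced in the abstract above generally has two steps: model the probability of treatment from observed covariates, then use the estimated probabilities for matching or weighting. As a minimal sketch of that idea (not code from the cited article), the following assumes a logistic-regression propensity model and simple inverse-probability weighting on simulated data; all variable names are hypothetical.

```python
# Minimal propensity-score sketch (illustrative only; not from the cited article).
# Assumes a binary treatment indicator and a covariate matrix; names are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                                # observed covariates
treated = (X[:, 0] + rng.normal(size=500) > 0).astype(int)   # binary treatment
y = 2.0 * treated + X[:, 1] + rng.normal(size=500)           # outcome

# 1. Estimate propensity scores: P(treated = 1 | X).
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. Inverse-probability weights for an average-treatment-effect estimand.
w = np.where(treated == 1, 1.0 / ps, 1.0 / (1.0 - ps))

# 3. Weighted difference in mean outcomes as a simple IPW effect estimate.
ate = (np.average(y[treated == 1], weights=w[treated == 1])
       - np.average(y[treated == 0], weights=w[treated == 0]))
print(f"IPW estimate of the average treatment effect: {ate:.2f}")
```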
Peer reviewed
Van Meenen, Florence; Coertjens, Liesje; Van Nes, Marie-Claire; Verschuren, Franck – Advances in Health Sciences Education, 2022
The present study explores two rating methods for peer assessment (analytical rating using criteria and comparative judgement) in light of concurrent validity, reliability and insufficient diagnosticity (i.e. the degree to which substandard work is recognised by the peer raters). During a second-year undergraduate course, students wrote a one-page…
Descriptors: Evaluation Methods, Peer Evaluation, Accuracy, Evaluation Criteria
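Comparative judgement, one of the two rating methods compared above, typically aggregates many pairwise "which piece of work is better?" decisions into quality estimates, often with a Bradley-Terry-type model. The sketch below illustrates that general idea with a simple minorization-maximization update on toy data; it is an assumption about the usual approach, not the procedure used in the cited study.

```python
# Illustrative Bradley-Terry scaling of pairwise comparative judgements
# (a common approach in comparative judgement; not taken from the cited study).
import numpy as np

n_items = 4
# wins[i, j] = number of times item i was judged better than item j (toy data).
wins = np.array([[0, 3, 4, 5],
                 [1, 0, 3, 4],
                 [0, 1, 0, 3],
                 [0, 0, 1, 0]], dtype=float)

strength = np.ones(n_items)                # initial quality estimates
for _ in range(200):                       # simple MM iterations
    new = np.empty(n_items)
    for i in range(n_items):
        comparisons = wins[i] + wins[:, i]                       # times i met each opponent
        denom = np.sum(comparisons / (strength[i] + strength))   # the j = i term is zero
        new[i] = wins[i].sum() / denom
    strength = new / new.sum()             # normalize to keep the scale fixed
print(np.round(strength, 3))               # estimated relative quality of each item
```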
Peer reviewed
Hyemin Yoon; HyunJin Kim; Sangjin Kim – Measurement: Interdisciplinary Research and Perspectives, 2024
Customer grade systems, which are applied to customers with excellent performance through customer segmentation, have been maintained for years. Currently, financial institutions that operate a customer grade system provide similar services based on score calculation criteria, but those criteria vary from the financial…
Descriptors: Classification, Artificial Intelligence, Prediction, Decision Making
Peer reviewed
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although an extensive body of research exists on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
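Categorical subscores, the focus of the entry above, are commonly formed by mapping numeric subscores onto a small set of reporting categories via cut scores. A minimal sketch of that mapping follows; the cut points and category labels are hypothetical.

```python
# Hypothetical mapping of numeric subscores to reporting categories
# (illustrates the general idea of categorical subscores, not the cited paper's method).
import numpy as np

subscores = np.array([4, 7, 11, 14, 9])   # raw subscores on a 0-15 subscale
cuts = [6, 10, 13]                         # hypothetical cut scores
labels = ["Below Basic", "Basic", "Proficient", "Advanced"]

categories = [labels[np.searchsorted(cuts, s, side="right")] for s in subscores]
print(list(zip(subscores.tolist(), categories)))
# [(4, 'Below Basic'), (7, 'Basic'), (11, 'Proficient'), (14, 'Advanced'), (9, 'Basic')]
```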
Peer reviewed
Cleary, Timothy J.; Slemp, Jackie; Reddy, Linda A.; Alperin, Alexander; Lui, Angela; Austin, Amanda; Cedar, Tori – School Psychology Review, 2023
The primary purpose of this study was to systematically review the literature regarding the characteristics, use, and implementation of an emerging assessment methodology, "SRL microanalysis." Forty-two studies across diverse samples, contexts, and research methodologies met inclusion criteria. The majority of studies used microanalysis…
Descriptors: School Psychology, School Psychologists, Evaluation Methods, Metacognition
Peer reviewed
Manuel T. Rein; Jeroen K. Vermunt; Kim De Roover; Leonie V. D. E. Vogelsmeier – Structural Equation Modeling: A Multidisciplinary Journal, 2025
Researchers often study dynamic processes of latent variables in everyday life, such as the interplay of positive and negative affect over time. An intuitive approach is to first estimate the measurement model of the latent variables, then compute factor scores, and finally use these factor scores as observed scores in vector autoregressive…
Descriptors: Measurement Techniques, Factor Analysis, Scores, Validity
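The entry above describes a common two-step workflow: estimate the measurement model, compute factor scores, then treat those scores as observed data in a vector autoregressive (VAR) model. The sketch below illustrates that pipeline on simulated data with scikit-learn's FactorAnalysis and statsmodels' VAR; it shows the generic two-step approach the abstract refers to, not the cited authors' estimator.

```python
# Two-step "factor scores then VAR" sketch on simulated data
# (illustrates the workflow named in the abstract; not the cited authors' method).
import numpy as np
from sklearn.decomposition import FactorAnalysis
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(1)
T = 300
# Simulate two latent time series with some autoregressive structure.
latent = np.zeros((T, 2))
for t in range(1, T):
    latent[t] = 0.5 * latent[t - 1] + rng.normal(scale=0.5, size=2)
# Six observed indicators, three loading on each latent variable, plus noise.
loadings = np.array([[1.0, 0.8, 0.9, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0, 0.7, 0.9]])
observed = latent @ loadings + rng.normal(scale=0.3, size=(T, 6))

# Step 1: measurement model and factor scores.
scores = FactorAnalysis(n_components=2, random_state=0).fit_transform(observed)

# Step 2: treat the factor scores as observed data in a VAR(1).
var_fit = VAR(scores).fit(1)
print(var_fit.coefs[0].round(2))   # estimated lag-1 dynamics between the factors
```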
Peer reviewed
Jamaal L. Moore; Zhihui Yi; Jessica M. Hinman; Becky F. Barron; Mark R. Dixon – Journal of Developmental and Physical Disabilities, 2021
The current study examined the convergent validity between the standardized PEAK Comprehensive Assessment (PCA) and the semi-standardized PEAK Pre-assessment (PEAK-PA). Twenty-two participants were administered each tool, and an item by item analysis was conducted to evaluate correlations between tests. The results suggested a strong positive…
Descriptors: Validity, Evaluation Methods, Standardized Tests, Correlation
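Convergent validity of the kind examined above is usually quantified by correlating scores from the two instruments administered to the same participants. A minimal sketch with simulated scores (not the PCA/PEAK-PA data):

```python
# Hypothetical convergent-validity check: correlate scores from two instruments
# given to the same participants (toy data, not the cited study's results).
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
instrument_a = rng.normal(50, 10, size=22)                     # e.g., standardized tool
instrument_b = 0.8 * instrument_a + rng.normal(0, 5, size=22)  # e.g., semi-standardized tool

r, p = pearsonr(instrument_a, instrument_b)
print(f"Pearson r = {r:.2f}, p = {p:.4f}")   # a strong positive r supports convergent validity
```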
Peer reviewed
Lai, Jennifer W. M.; Bower, Matt; De Nobile, John; Breyer, Yvonne – Journal of Computer Assisted Learning, 2022
Background: There is a lack of critical or empirical work interrogating the nature and purpose of evaluating technology use in education. Objectives: In this study, we examine the values underpinning the evaluation of technology use in education through field specialist perceptions. The study also poses critical reflections about the rigour of…
Descriptors: Technology Uses in Education, Educational Technology, Program Evaluation, Content Validity
Peer reviewed
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
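A multilabel neural network of the kind described above scores a response on many traits at once by treating each trait as a separate binary output of a single network. As a rough sketch of the idea (not the authors' MNN), scikit-learn's MLPClassifier accepts a binary indicator matrix as the target and predicts all labels jointly; the data below are simulated.

```python
# Rough multilabel neural-network scoring sketch (illustrative; not the cited MNN).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, recall_score

rng = np.random.default_rng(7)
n_resp, n_items, n_traits = 400, 30, 16
X = rng.integers(0, 2, size=(n_resp, n_items)).astype(float)   # scored item responses
latent = X @ rng.normal(size=(n_items, n_traits))
Y = (latent > np.median(latent, axis=0)).astype(int)           # 16 binary trait classifications

# One network with 16 binary outputs: MLPClassifier handles a multilabel target directly.
mnn = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
mnn.fit(X[:300], Y[:300])
pred = mnn.predict(X[300:])

print("subset accuracy:", accuracy_score(Y[300:], pred))                # exact-match rate
print("macro recall:   ", recall_score(Y[300:], pred, average="macro"))
```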
Karen Blackburn Hoeve – ProQuest LLC, 2021
High stakes test-based accountability systems primarily rely on aggregates and derivatives of scores from tests that were originally developed to measure individual student mastery of content specifications. Current validity models do not explicitly address this use of aggregate scores to measure the performance of teachers, administrators, and…
Descriptors: Accountability, Test Validity, High Stakes Tests, Hierarchical Linear Modeling
Peer reviewed
Roduta Roberts, Mary; Gotch, Chad M.; Cook, Megan; Werther, Karin; Chao, Iris C. I. – Measurement: Interdisciplinary Research and Perspectives, 2022
Performance-based assessment is a common approach to assess the development and acquisition of practice competencies among health professions students. Judgments related to the quality of performance are typically operationalized as ratings against success criteria specified within a rubric. The extent to which the rubric is understood,…
Descriptors: Protocol Analysis, Scoring Rubrics, Interviews, Performance Based Assessment
Peer reviewed
PDF on ERIC
Kelsey Nason; Christine E. DeMars – Research & Practice in Assessment, 2023
Universities administer assessments for accountability and program improvement, but student effort during these assessments is often low because students perceive minimal consequences, and the effects of low effort are compounded by the assessment context. This project investigates validity concerns caused by minimal effort and exacerbated by contextual factors. Systematic…
Descriptors: Test Validity, COVID-19, Pandemics, Environmental Influences
Peer reviewed
Stephen M. Leach; Jason C. Immekus; Jeffrey C. Valentine; Prathiba Batley; Dena Dossett; Tamara Lewis; Thomas Reece – Assessment for Effective Intervention, 2025
Educators commonly use school climate survey scores to inform and evaluate interventions for equitably improving learning and reducing educational disparities. Unfortunately, validity evidence to support these (and other) score uses often falls short. In response, Whitehouse et al. proposed a collaborative, two-part validity testing framework for…
Descriptors: School Surveys, Measurement, Hierarchical Linear Modeling, Educational Environment
Peer reviewed
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
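Among the standard technical features listed above, the standard error of measurement is a simple function of score reliability and the score standard deviation. A small worked example with hypothetical values:

```python
# Standard error of measurement (SEM) from reliability and score SD
# (hypothetical values; illustrates one routine statistic named in the abstract).
import math

score_sd = 12.0       # scale-score standard deviation
reliability = 0.91    # e.g., an internal-consistency or IRT-based reliability estimate

sem = score_sd * math.sqrt(1.0 - reliability)
print(f"SEM = {sem:.2f} scale-score points")   # 12 * sqrt(0.09) = 3.6
# An approximate 95% band around an observed score of 100:
print(f"95% band: {100 - 1.96 * sem:.1f} to {100 + 1.96 * sem:.1f}")
```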
Peer reviewed
PDF on ERIC
Emre Zengin; Yasemin Karal – International Journal of Assessment Tools in Education, 2024
This study was carried out to develop a test to assess algorithmic thinking skills. To this end, the twelve steps suggested by Downing (2006) were adopted. Throughout the test development, 24 middle school sixth-grade students and eight experts in different areas took part as needed in the tasks on the project. The test was given to 252 students…
Descriptors: Grade 6, Algorithms, Thinking Skills, Evaluation Methods
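Classical item analysis is a routine step in test development of the kind described above: item difficulty is the proportion of examinees answering correctly, and discrimination is often summarized as the corrected item-total correlation. A minimal sketch on simulated scored responses (not data from the cited test):

```python
# Classical item analysis sketch: difficulty and corrected item-total discrimination
# (simulated 0/1 responses; not data from the cited algorithmic-thinking test).
import numpy as np

rng = np.random.default_rng(3)
ability = rng.normal(size=252)                       # one row per examinee
n_items = 20
item_location = rng.normal(size=n_items)
# Simple response model: higher ability -> higher chance of a correct answer.
prob = 1 / (1 + np.exp(-(ability[:, None] - item_location)))
responses = (rng.random((252, n_items)) < prob).astype(int)

total = responses.sum(axis=1)
for i in range(n_items):
    p = responses[:, i].mean()                       # item difficulty (proportion correct)
    rest = total - responses[:, i]                   # total score excluding the item
    r_it = np.corrcoef(responses[:, i], rest)[0, 1]  # corrected item-total correlation
    print(f"item {i + 1:2d}: difficulty = {p:.2f}, discrimination = {r_it:.2f}")
```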