ERIC - Search Results

Showing 1 to 15 of 34 results

Addressing Threats to Validity in Supervised Machine Learning: A Framework and Best Practices for Education Researchers. EdWorkingPaper No. 25-1117

Kylie L. Anglin – Annenberg Institute for School Reform at Brown University, 2025

Since 2018, institutions of higher education have been aware of the "enrollment cliff" which refers to expected declines in future enrollment. This paper attempts to describe how prepared institutions in Ohio are for this future by looking at trends leading up to the anticipated decline. Using IPEDS data from 2012-2022, we analyze trends…

Descriptors: Validity, Artificial Intelligence, Models, Best Practices

Propensity Score Methods for Causal Inference and Generalization

Peer reviewed

Wendy Chan – Asia Pacific Education Review, 2024

As evidence from evaluation and experimental studies continues to influence decision and policymaking, applied researchers and practitioners require tools to derive valid and credible inferences. Over the past several decades, research in causal inference has progressed with the development and application of propensity scores. Since their…

Descriptors: Probability, Scores, Causal Models, Statistical Inference

Addressing Threats to Validity in Supervised Machine Learning: A Framework and Best Practices for Education Researchers

Peer reviewed

Kylie Anglin – AERA Open, 2024

Given the rapid adoption of machine learning methods by education researchers, and the growing acknowledgment of their inherent risks, there is an urgent need for tailored methodological guidance on how to improve and evaluate the validity of inferences drawn from these methods. Drawing on an integrative literature review and extending a…

Descriptors: Validity, Artificial Intelligence, Models, Best Practices

Examining the Robustness of the Graded Response and 2-Parameter Logistic Models to Violations of Construct Normality

Peer reviewed

Manapat, Patrick D.; Edwards, Michael C. – Educational and Psychological Measurement, 2022

When fitting unidimensional item response theory (IRT) models, the population distribution of the latent trait ([theta]) is often assumed to be normally distributed. However, some psychological theories would suggest a nonnormal [theta]. For example, some clinical traits (e.g., alcoholism, depression) are believed to follow a positively skewed…

Descriptors: Robustness (Statistics), Computational Linguistics, Item Response Theory, Psychological Patterns

Using Think-Aloud Interviews to Examine a Clinically Oriented Performance Assessment Rubric

Peer reviewed

Roduta Roberts, Mary; Gotch, Chad M.; Cook, Megan; Werther, Karin; Chao, Iris C. I. – Measurement: Interdisciplinary Research and Perspectives, 2022

Performance-based assessment is a common approach to assess the development and acquisition of practice competencies among health professions students. Judgments related to the quality of performance are typically operationalized as ratings against success criteria specified within a rubric. The extent to which the rubric is understood,…

Descriptors: Protocol Analysis, Scoring Rubrics, Interviews, Performance Based Assessment

Causal Inference and Bias in Learning Analytics: A Primer on Pitfalls Using Directed Acyclic Graphs

Peer reviewed

Weidlich, Joshua; Gašević, Dragan; Drachsler, Hendrik – Journal of Learning Analytics, 2022

As a research field geared toward understanding and improving learning, Learning Analytics (LA) must be able to provide empirical support for causal claims. However, as a highly applied field, tightly controlled randomized experiments are not always feasible nor desirable. Instead, researchers often rely on observational data, based on which they…

Descriptors: Causal Models, Inferences, Learning Analytics, Comparative Analysis

All Value-Added Models (VAMs) Are Wrong, but Sometimes They May Be Useful

Peer reviewed

Amrein-Beardsley, Audrey; Sloat, Edward; Holloway, Jessica – AASA Journal of Scholarship & Practice, 2020

In this study, researchers compared the concordance of teacher-level effectiveness ratings derived via six common generalized value-added model (VAM) approaches including a (1) student growth percentile (SGP) model, (2) value-added linear regression model (VALRM), (3) value-added hierarchical linear model (VAHLM), (4) simple difference (gain)…

Descriptors: Value Added Models, Teacher Effectiveness, Elementary School Teachers, Teacher Evaluation

Building an Initial Validity Argument for Binary and Analytic Rating Scales for an EFL Classroom Writing Assessment: Evidence from Many-Facets Rasch Measurement

Peer reviewed

Khamboonruang, Apichat – rEFLections, 2022

Although much research has compared the functioning between analytic and holistic rating scales, little research has compared the functioning of binary rating scales with other types of rating scales. This quantitative study set out to preliminarily and comparatively validate binary and analytic rating scales intended for use in formative…

Descriptors: Writing Evaluation, Evaluation Methods, Second Language Learning, Second Language Instruction

What Does Corpus Linguistics Have to Offer to Language Assessment?

Peer reviewed

Xi, Xiaoming – Language Testing, 2017

In recent years, continuing advances in technology have increased the capacity to automate the extraction of a range of linguistic features of texts and thus have provided the impetus for the substantial growth of corpus linguistics. While corpus linguistic tools and methods have been used extensively in second language learning research, they…

Descriptors: Computational Linguistics, Second Language Learning, Language Tests, Evaluation Methods

Regression Discontinuity and Beyond: Options for Studying External Validity in an Internally Valid Design

Peer reviewed

Wing, Coady; Bello-Gomez, Ricardo A. – American Journal of Evaluation, 2018

Treatment effect estimates from a "regression discontinuity design" (RDD) have high internal validity. However, the arguments that support the design apply to a subpopulation that is narrower and usually different from the population of substantive interest in evaluation research. The disconnect between RDD population and the…

Descriptors: Regression (Statistics), Research Design, Validity, Evaluation Methods

Using Corpus Linguistics Tools to Identify Instances of Low Linguistic Accessibility in Tests

Beauchamp, David; Constantinou, Filio – Research Matters, 2020

Assessment is a useful process as it provides various stakeholders (e.g., teachers, parents, government, employers) with information about students' competence in a particular subject area. However, for the information generated by assessment to be useful, it needs to support valid inferences. One factor that can undermine the validity of…

Descriptors: Computational Linguistics, Inferences, Validity, Language Usage

Quasi-Experimental Designs for Causal Inference

Peer reviewed

Kim, Yongnam; Steiner, Peter – Educational Psychologist, 2016

When randomized experiments are infeasible, quasi-experimental designs can be exploited to evaluate causal treatment effects. The strongest quasi-experimental designs for causal inference are regression discontinuity designs, instrumental variable designs, matching and propensity score designs, and comparative interrupted time series designs. This…

Descriptors: Quasiexperimental Design, Causal Models, Statistical Inference, Randomized Controlled Trials

Estimating the Causal Effect of Randomization versus Treatment Preference in a Doubly Randomized Preference Trial

Peer reviewed

Marcus, Sue M.; Stuart, Elizabeth A.; Wang, Pei; Shadish, William R.; Steiner, Peter M. – Psychological Methods, 2012

Although randomized studies have high internal validity, generalizability of the estimated causal effect from randomized clinical trials to real-world clinical or educational practice may be limited. We consider the implication of randomized assignment to treatment, as compared with choice of preferred treatment as it occurs in real-world…

Descriptors: Educational Practices, Program Effectiveness, Validity, Causal Models

Applying Methods to Evaluate Construct Validity in the Context of A Level Assessment

Peer reviewed

Crisp, Victoria; Shaw, Stuart – Educational Studies, 2012

Validity is a central principle of assessment relating to the appropriateness of the uses and interpretations of test results. Usually, one of the inferences that we wish to make is that the score reflects the extent of a student's learning in a given domain. Thus, it is important to establish that the assessment tasks elicit performances that…

Descriptors: Test Results, Evaluation Methods, Construct Validity, Validity

Investigating Sources of Differential Item Functioning in International Large-Scale Assessments Using a Confirmatory Approach

Peer reviewed

Sandilands, Debra; Oliveri, Maria Elena; Zumbo, Bruno D.; Ercikan, Kadriye – International Journal of Testing, 2013

International large-scale assessments of achievement often have a large degree of differential item functioning (DIF) between countries, which can threaten score equivalence and reduce the validity of inferences based on comparisons of group performances. It is important to understand potential sources of DIF to improve the validity of future…

Descriptors: Validity, Measures (Individuals), International Studies, Foreign Countries
