Showing 1 to 15 of 34 results
Kylie L. Anglin – Annenberg Institute for School Reform at Brown University, 2025
Since 2018, institutions of higher education have been aware of the "enrollment cliff" which refers to expected declines in future enrollment. This paper attempts to describe how prepared institutions in Ohio are for this future by looking at trends leading up to the anticipated decline. Using IPEDS data from 2012-2022, we analyze trends…
Descriptors: Validity, Artificial Intelligence, Models, Best Practices
Peer reviewed
Wendy Chan – Asia Pacific Education Review, 2024
As evidence from evaluation and experimental studies continues to influence decision and policymaking, applied researchers and practitioners require tools to derive valid and credible inferences. Over the past several decades, research in causal inference has progressed with the development and application of propensity scores. Since their…
Descriptors: Probability, Scores, Causal Models, Statistical Inference
Peer reviewed
PDF on ERIC Download full text
Kylie Anglin – AERA Open, 2024
Given the rapid adoption of machine learning methods by education researchers, and the growing acknowledgment of their inherent risks, there is an urgent need for tailored methodological guidance on how to improve and evaluate the validity of inferences drawn from these methods. Drawing on an integrative literature review and extending a…
Descriptors: Validity, Artificial Intelligence, Models, Best Practices
Peer reviewed
Manapat, Patrick D.; Edwards, Michael C. – Educational and Psychological Measurement, 2022
When fitting unidimensional item response theory (IRT) models, the population distribution of the latent trait ([theta]) is often assumed to be normally distributed. However, some psychological theories would suggest a nonnormal [theta]. For example, some clinical traits (e.g., alcoholism, depression) are believed to follow a positively skewed…
Descriptors: Robustness (Statistics), Computational Linguistics, Item Response Theory, Psychological Patterns
Peer reviewed
Roduta Roberts, Mary; Gotch, Chad M.; Cook, Megan; Werther, Karin; Chao, Iris C. I. – Measurement: Interdisciplinary Research and Perspectives, 2022
Performance-based assessment is a common approach to assess the development and acquisition of practice competencies among health professions students. Judgments related to the quality of performance are typically operationalized as ratings against success criteria specified within a rubric. The extent to which the rubric is understood,…
Descriptors: Protocol Analysis, Scoring Rubrics, Interviews, Performance Based Assessment
Peer reviewed
PDF on ERIC Download full text
Weidlich, Joshua; Gašević, Dragan; Drachsler, Hendrik – Journal of Learning Analytics, 2022
As a research field geared toward understanding and improving learning, Learning Analytics (LA) must be able to provide empirical support for causal claims. However, as a highly applied field, tightly controlled randomized experiments are not always feasible nor desirable. Instead, researchers often rely on observational data, based on which they…
Descriptors: Causal Models, Inferences, Learning Analytics, Comparative Analysis
Peer reviewed
Amrein-Beardsley, Audrey; Sloat, Edward; Holloway, Jessica – AASA Journal of Scholarship & Practice, 2020
In this study, researchers compared the concordance of teacher-level effectiveness ratings derived via six common generalized value-added model (VAM) approaches including a (1) student growth percentile (SGP) model, (2) value-added linear regression model (VALRM), (3) value-added hierarchical linear model (VAHLM), (4) simple difference (gain)…
Descriptors: Value Added Models, Teacher Effectiveness, Elementary School Teachers, Teacher Evaluation
Peer reviewed
PDF on ERIC Download full text
Khamboonruang, Apichat – rEFLections, 2022
Although much research has compared the functioning between analytic and holistic rating scales, little research has compared the functioning of binary rating scales with other types of rating scales. This quantitative study set out to preliminarily and comparatively validate binary and analytic rating scales intended for use in formative…
Descriptors: Writing Evaluation, Evaluation Methods, Second Language Learning, Second Language Instruction
Peer reviewed
Xi, Xiaoming – Language Testing, 2017
In recent years, continuing advances in technology have increased the capacity to automate the extraction of a range of linguistic features of texts and thus have provided the impetus for the substantial growth of corpus linguistics. While corpus linguistic tools and methods have been used extensively in second language learning research, they…
Descriptors: Computational Linguistics, Second Language Learning, Language Tests, Evaluation Methods
Peer reviewed
Wing, Coady; Bello-Gomez, Ricardo A. – American Journal of Evaluation, 2018
Treatment effect estimates from a "regression discontinuity design" (RDD) have high internal validity. However, the arguments that support the design apply to a subpopulation that is narrower and usually different from the population of substantive interest in evaluation research. The disconnect between RDD population and the…
Descriptors: Regression (Statistics), Research Design, Validity, Evaluation Methods
Beauchamp, David; Constantinou, Filio – Research Matters, 2020
Assessment is a useful process as it provides various stakeholders (e.g., teachers, parents, government, employers) with information about students' competence in a particular subject area. However, for the information generated by assessment to be useful, it needs to support valid inferences. One factor that can undermine the validity of…
Descriptors: Computational Linguistics, Inferences, Validity, Language Usage
Peer reviewed
Kim, Yongnam; Steiner, Peter – Educational Psychologist, 2016
When randomized experiments are infeasible, quasi-experimental designs can be exploited to evaluate causal treatment effects. The strongest quasi-experimental designs for causal inference are regression discontinuity designs, instrumental variable designs, matching and propensity score designs, and comparative interrupted time series designs. This…
Descriptors: Quasiexperimental Design, Causal Models, Statistical Inference, Randomized Controlled Trials
Peer reviewed
Marcus, Sue M.; Stuart, Elizabeth A.; Wang, Pei; Shadish, William R.; Steiner, Peter M. – Psychological Methods, 2012
Although randomized studies have high internal validity, generalizability of the estimated causal effect from randomized clinical trials to real-world clinical or educational practice may be limited. We consider the implication of randomized assignment to treatment, as compared with choice of preferred treatment as it occurs in real-world…
Descriptors: Educational Practices, Program Effectiveness, Validity, Causal Models
Peer reviewed
Crisp, Victoria; Shaw, Stuart – Educational Studies, 2012
Validity is a central principle of assessment relating to the appropriateness of the uses and interpretations of test results. Usually, one of the inferences that we wish to make is that the score reflects the extent of a student's learning in a given domain. Thus, it is important to establish that the assessment tasks elicit performances that…
Descriptors: Test Results, Evaluation Methods, Construct Validity, Validity
Peer reviewed
Sandilands, Debra; Oliveri, Maria Elena; Zumbo, Bruno D.; Ercikan, Kadriye – International Journal of Testing, 2013
International large-scale assessments of achievement often have a large degree of differential item functioning (DIF) between countries, which can threaten score equivalence and reduce the validity of inferences based on comparisons of group performances. It is important to understand potential sources of DIF to improve the validity of future…
Descriptors: Validity, Measures (Individuals), International Studies, Foreign Countries