ERIC - Search Results

Showing 1 to 15 of 50 results

Addressing Threats to Validity in Supervised Machine Learning: A Framework and Best Practices for Education Researchers. EdWorkingPaper No. 25-1117

Kylie L. Anglin – Annenberg Institute for School Reform at Brown University, 2025

Since 2018, institutions of higher education have been aware of the "enrollment cliff" which refers to expected declines in future enrollment. This paper attempts to describe how prepared institutions in Ohio are for this future by looking at trends leading up to the anticipated decline. Using IPEDS data from 2012-2022, we analyze trends…

Descriptors: Validity, Artificial Intelligence, Models, Best Practices

MSC-Trans: A Multi-Feature-Fusion Network with Encoding Structure for Student Engagement Detecting

Peer reviewed

Nan Xie; Zhengxu Li; Haipeng Lu; Wei Pang; Jiayin Song; Beier Lu – IEEE Transactions on Learning Technologies, 2025

Classroom engagement is a critical factor for evaluating students' learning outcomes and teachers' instructional strategies. Traditional methods for detecting classroom engagement, such as coding and questionnaires, are often limited by delays, subjectivity, and external interference. While some neural network models have been proposed to detect…

Descriptors: Learner Engagement, Artificial Intelligence, Technology Uses in Education, Educational Technology

Addressing Threats to Validity in Supervised Machine Learning: A Framework and Best Practices for Education Researchers

Peer reviewed

Kylie Anglin – AERA Open, 2024

Given the rapid adoption of machine learning methods by education researchers, and the growing acknowledgment of their inherent risks, there is an urgent need for tailored methodological guidance on how to improve and evaluate the validity of inferences drawn from these methods. Drawing on an integrative literature review and extending a…

Descriptors: Validity, Artificial Intelligence, Models, Best Practices

Examining the Robustness of the Graded Response and 2-Parameter Logistic Models to Violations of Construct Normality

Peer reviewed

Manapat, Patrick D.; Edwards, Michael C. – Educational and Psychological Measurement, 2022

When fitting unidimensional item response theory (IRT) models, the population distribution of the latent trait ([theta]) is often assumed to be normally distributed. However, some psychological theories would suggest a nonnormal [theta]. For example, some clinical traits (e.g., alcoholism, depression) are believed to follow a positively skewed…

Descriptors: Robustness (Statistics), Computational Linguistics, Item Response Theory, Psychological Patterns

Using Think-Aloud Interviews to Examine a Clinically Oriented Performance Assessment Rubric

Peer reviewed

Roduta Roberts, Mary; Gotch, Chad M.; Cook, Megan; Werther, Karin; Chao, Iris C. I. – Measurement: Interdisciplinary Research and Perspectives, 2022

Performance-based assessment is a common approach to assess the development and acquisition of practice competencies among health professions students. Judgments related to the quality of performance are typically operationalized as ratings against success criteria specified within a rubric. The extent to which the rubric is understood,…

Descriptors: Protocol Analysis, Scoring Rubrics, Interviews, Performance Based Assessment

Disrupted Data: Using Longitudinal Assessment Systems to Monitor Test Score Quality

Peer reviewed

An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022

Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…

Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies

Causal Inference and Bias in Learning Analytics: A Primer on Pitfalls Using Directed Acyclic Graphs

Peer reviewed

Weidlich, Joshua; Gašević, Dragan; Drachsler, Hendrik – Journal of Learning Analytics, 2022

As a research field geared toward understanding and improving learning, Learning Analytics (LA) must be able to provide empirical support for causal claims. However, as a highly applied field, tightly controlled randomized experiments are not always feasible nor desirable. Instead, researchers often rely on observational data, based on which they…

Descriptors: Causal Models, Inferences, Learning Analytics, Comparative Analysis

All Value-Added Models (VAMs) Are Wrong, but Sometimes They May Be Useful

Peer reviewed

Amrein-Beardsley, Audrey; Sloat, Edward; Holloway, Jessica – AASA Journal of Scholarship & Practice, 2020

In this study, researchers compared the concordance of teacher-level effectiveness ratings derived via six common generalized value-added model (VAM) approaches including a (1) student growth percentile (SGP) model, (2) value-added linear regression model (VALRM), (3) value-added hierarchical linear model (VAHLM), (4) simple difference (gain)…

Descriptors: Value Added Models, Teacher Effectiveness, Elementary School Teachers, Teacher Evaluation

A Cognitive Diagnostic Assessment Study of the Reading Comprehension Section of the Preliminary English Test (PET)

Peer reviewed

Mohammed, Aisha; Dawood, Abdul Kareem Shareef; Alghazali, Tawfeeq; Kadhim, Qasim Khlaif; Sabti, Ahmed Abdulateef; Sabit, Shaker Holh – International Journal of Language Testing, 2023

Cognitive diagnostic models (CDMs) have received much interest within the field of language testing over the last decade due to their great potential to provide diagnostic feedback to all stakeholders and ultimately improve language teaching and learning. A large number of studies have demonstrated the application of CDMs on advanced large-scale…

Descriptors: Reading Comprehension, Reading Tests, Language Tests, English (Second Language)

Building an Initial Validity Argument for Binary and Analytic Rating Scales for an EFL Classroom Writing Assessment: Evidence from Many-Facets Rasch Measurement

Peer reviewed

Khamboonruang, Apichat – rEFLections, 2022

Although much research has compared the functioning between analytic and holistic rating scales, little research has compared the functioning of binary rating scales with other types of rating scales. This quantitative study set out to preliminarily and comparatively validate binary and analytic rating scales intended for use in formative…

Descriptors: Writing Evaluation, Evaluation Methods, Second Language Learning, Second Language Instruction

Applying Kane's Validity Framework to a Simulation Based Assessment of Clinical Competence

Peer reviewed

Tavares, Walter; Brydges, Ryan; Myre, Paul; Prpic, Jason; Turner, Linda; Yelle, Richard; Huiskamp, Maud – Advances in Health Sciences Education, 2018

Assessment of clinical competence is complex and inference based. Trustworthy and defensible assessment processes must have favourable evidence of validity, particularly where decisions are considered high stakes. We aimed to organize, collect and interpret validity evidence for a high stakes simulation based assessment strategy for certifying…

Descriptors: Competence, Simulation, Allied Health Personnel, Certification

What Does Corpus Linguistics Have to Offer to Language Assessment?

Peer reviewed

Xi, Xiaoming – Language Testing, 2017

In recent years, continuing advances in technology have increased the capacity to automate the extraction of a range of linguistic features of texts and thus have provided the impetus for the substantial growth of corpus linguistics. While corpus linguistic tools and methods have been used extensively in second language learning research, they…

Descriptors: Computational Linguistics, Second Language Learning, Language Tests, Evaluation Methods

Regression Discontinuity and Beyond: Options for Studying External Validity in an Internally Valid Design

Peer reviewed

Wing, Coady; Bello-Gomez, Ricardo A. – American Journal of Evaluation, 2018

Treatment effect estimates from a "regression discontinuity design" (RDD) have high internal validity. However, the arguments that support the design apply to a subpopulation that is narrower and usually different from the population of substantive interest in evaluation research. The disconnect between RDD population and the…

Descriptors: Regression (Statistics), Research Design, Validity, Evaluation Methods

In Search of Validity Evidence in Support of the Interpretation and Use of Assessments of Complex Constructs: Discussion of Research on Assessing 21st Century Skills

Peer reviewed

Ercikan, Kadriye; Oliveri, María Elena – Applied Measurement in Education, 2016

Assessing complex constructs such as those discussed under the umbrella of 21st century constructs highlights the need for a principled assessment design and validation approach. In our discussion, we made a case for three considerations: (a) taking construct complexity into account across various stages of assessment development such as the…

Descriptors: Evaluation Methods, Test Construction, Design, Scaling

Using Corpus Linguistics Tools to Identify Instances of Low Linguistic Accessibility in Tests

Beauchamp, David; Constantinou, Filio – Research Matters, 2020

Assessment is a useful process as it provides various stakeholders (e.g., teachers, parents, government, employers) with information about students' competence in a particular subject area. However, for the information generated by assessment to be useful, it needs to support valid inferences. One factor that can undermine the validity of…

Descriptors: Computational Linguistics, Inferences, Validity, Language Usage
