Showing all 12 results
Jennifer Hill; George Perrett; Vincent Dorie – Grantee Submission, 2023
Estimation of causal effects requires making comparisons across groups of observations exposed and not exposed to a treatment or cause (intervention, program, drug, etc.). To interpret differences between groups causally, we need to ensure that they have been constructed in such a way that the comparisons are "fair." This can be…
Descriptors: Causal Models, Statistical Inference, Artificial Intelligence, Data Analysis
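A minimal sketch of the regression-adjusted comparison this line of work builds on, with simulated data and simple linear outcome models standing in for the flexible models such methods typically use (everything below is an illustrative assumption, not the authors' implementation):

```python
# Illustrative only: simulated data, linear outcome models (g-computation).
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)                      # confounder
z = rng.binomial(1, 1 / (1 + np.exp(-x)))   # treatment depends on x
y = 2.0 * z + 1.5 * x + rng.normal(size=n)  # true effect = 2.0

# Naive difference in means is biased because x drives both z and y.
naive = y[z == 1].mean() - y[z == 0].mean()

# Fit an outcome model within each group, then average the predicted
# difference over the whole sample so the comparison is "fair" in x.
b1 = np.polyfit(x[z == 1], y[z == 1], 1)
b0 = np.polyfit(x[z == 0], y[z == 0], 1)
ate = np.mean(np.polyval(b1, x) - np.polyval(b0, x))

print(f"naive: {naive:.2f}  adjusted: {ate:.2f}")  # adjusted is near 2.0
```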
Peer reviewed
Direct link
Sung, Kyung Hee; Noh, Eun Hee; Chon, Kyong Hee – Asia Pacific Education Review, 2017
With increased use of constructed response items in large-scale assessments, the cost of scoring has been a major consideration (Noh et al. in KICE Report RRE 2012-6, 2012; Wainer and Thissen in "Applied Measurement in Education" 6:103-118, 1993). In response to these scoring cost issues, various forms of automated systems for scoring…
Descriptors: Automation, Scoring, Social Studies, Test Items
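A minimal sketch of a common automated-scoring baseline (TF-IDF features plus ridge regression); the responses, scores, and model choice below are illustrative assumptions, not the system examined in the article:

```python
# Toy automated constructed-response scorer: TF-IDF + ridge regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

train_responses = [
    "the assembly makes laws and the court interprets them",
    "laws are made by the legislature",
    "i dont know",
    "courts decide if laws follow the constitution",
]
train_scores = [3, 2, 0, 3]  # human reference scores

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
model.fit(train_responses, train_scores)

print(model.predict(["the legislature makes the laws"]))
```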
Peer reviewed
Direct link
Tipton, Elizabeth; Fellers, Lauren; Caverly, Sarah; Vaden-Kiernan, Michael; Borman, Geoffrey; Sullivan, Kate; Ruiz de Castilla, Veronica – Journal of Research on Educational Effectiveness, 2016
Recently, statisticians have begun developing methods to improve the generalizability of results from large-scale experiments in education. This work has included the development of methods for improved site selection when random sampling is infeasible, including the use of stratification and targeted recruitment strategies. This article provides…
Descriptors: Generalizability Theory, Site Selection, Experiments, Comparative Analysis
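A minimal sketch of covariate-based stratification for site recruitment, assuming k-means strata and proportional allocation; the covariates, stratum count, and budget are invented for illustration:

```python
# Cluster the population of sites on covariates, then recruit
# proportionally from each stratum.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# 500 districts, 3 standardized covariates (e.g., poverty, size, urbanicity)
sites = rng.normal(size=(500, 3))

k = 5
strata = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(sites)

budget = 60  # total sites we can afford to recruit
for s in range(k):
    members = np.flatnonzero(strata == s)
    n_s = round(budget * len(members) / len(sites))  # proportional allocation
    recruited = rng.choice(members, size=n_s, replace=False)
    print(f"stratum {s}: recruit {n_s} of {len(members)} sites, e.g. {recruited[:3]}")
```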
Peer reviewed
Direct link
Zaidi, Nikki L.; Swoboda, Christopher M.; Kelcey, Benjamin M.; Manuel, R. Stephen – Advances in Health Sciences Education, 2017
The extant literature has largely ignored a potentially significant source of variance in multiple mini-interview (MMI) scores by "hiding" the variance attributable to the sample of attributes used on an evaluation form. This potential source of hidden variance can be defined as rating items, which typically comprise an MMI evaluation…
Descriptors: Interviews, Scores, Generalizability Theory, Monte Carlo Methods
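A minimal sketch of a persons-by-items G-study of the kind used to expose item-attributable variance: simulate scores with known components, then recover them from ANOVA mean squares (the component values are assumptions):

```python
# G-study for a crossed persons x items design via expected mean squares.
import numpy as np

rng = np.random.default_rng(2)
n_p, n_i = 200, 8                        # persons (candidates) x rating items
var_p, var_i, var_res = 1.0, 0.4, 0.6    # true variance components

person = rng.normal(0, np.sqrt(var_p), size=(n_p, 1))
item = rng.normal(0, np.sqrt(var_i), size=(1, n_i))
scores = person + item + rng.normal(0, np.sqrt(var_res), size=(n_p, n_i))

grand = scores.mean()
mp, mi = scores.mean(axis=1), scores.mean(axis=0)
ms_p = n_i * np.sum((mp - grand) ** 2) / (n_p - 1)
ms_i = n_p * np.sum((mi - grand) ** 2) / (n_i - 1)
resid = scores - mp[:, None] - mi[None, :] + grand
ms_res = np.sum(resid ** 2) / ((n_p - 1) * (n_i - 1))

print("sigma2_person:", (ms_p - ms_res) / n_i)   # ~1.0
print("sigma2_item:  ", (ms_i - ms_res) / n_p)   # ~0.4  <- "hidden" item variance
print("sigma2_resid: ", ms_res)                  # ~0.6
```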
Peer reviewed
Direct link
Volpe, Robert J.; Briesch, Amy M. – School Psychology Review, 2016
This study examines the dependability of two scaling approaches for using a five-item Direct Behavior Rating multi-item scale to assess student disruptive behavior. A series of generalizability theory studies were used to compare a traditional frequency-based scaling approach with an approach wherein the informant compares a target student's…
Descriptors: Scaling, Behavior Rating Scales, Behavior Problems, Student Behavior
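A minimal sketch of the companion D-study step: projecting a dependability (Phi) coefficient for different numbers of scale items, assuming variance components like those above:

```python
# Given G-study variance components (assumed values below), project
# dependability for absolute decisions as the number of items grows.
var_p, var_i, var_res = 1.0, 0.4, 0.6   # person, item, residual

def phi(n_items):
    # Phi coefficient for a persons x items design, absolute decisions.
    return var_p / (var_p + (var_i + var_res) / n_items)

for n in (1, 3, 5, 10):
    print(f"items={n:2d}  Phi={phi(n):.3f}")
```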
Peer reviewed
PDF on ERIC (download full text)
Dogan, C. Deha; Uluman, Müge – Educational Sciences: Theory and Practice, 2017
The aim of this study was to determine the extent to which graded-category rating scales and rubrics contribute to inter-rater reliability. The research was designed as a correlational study. The study group consisted of 82 sixth-grade students and three writing course teachers in a private elementary school. A performance task was…
Descriptors: Comparative Analysis, Scoring Rubrics, Rating Scales, Interrater Reliability
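A minimal sketch of an inter-rater consistency check for three raters; the scores are invented, and the article's own correlational analyses are not reproduced here:

```python
# Quick inter-rater consistency index: mean pairwise Pearson correlation.
import numpy as np

# rows = raters, columns = students scored on the same performance task
scores = np.array([
    [4, 3, 5, 2, 4, 3, 5, 1],   # rater A
    [4, 2, 5, 2, 3, 3, 4, 1],   # rater B
    [5, 3, 4, 2, 4, 2, 5, 2],   # rater C
])

r = np.corrcoef(scores)                      # 3 x 3 correlation matrix
pairs = r[np.triu_indices_from(r, k=1)]      # the three above-diagonal entries
print("pairwise r:", np.round(pairs, 2))
print("mean inter-rater r:", round(pairs.mean(), 2))
```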
Peer reviewed
Direct link
Uto, Masaki; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2016
As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. In peer assessment, however, a problem remains: reliability depends on rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. These models are expected to improve…
Descriptors: Item Response Theory, Peer Evaluation, Bayesian Statistics, Simulation
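A minimal sketch of a rating model with a rater-severity parameter, in the spirit of many-facet Rasch / rater-effect IRT; the models proposed in the article differ, and all parameter values here are assumptions:

```python
# Rating-scale-style model with a rater facet: the log-odds of each step
# depend on examinee ability minus rater severity. Illustrative values.
import numpy as np

def category_probs(theta, severity, thresholds):
    """P(score = k), k = 0..K, for K step thresholds (here 3 -> 4 categories)."""
    steps = theta - severity - np.asarray(thresholds)   # one term per step
    logits = np.concatenate(([0.0], np.cumsum(steps)))  # category 0 is baseline
    p = np.exp(logits - logits.max())                   # stable softmax
    return p / p.sum()

thresholds = [-1.0, 0.0, 1.0]
print(category_probs(0.5, severity=-0.5, thresholds=thresholds))  # lenient rater
print(category_probs(0.5, severity=+0.5, thresholds=thresholds))  # severe rater
```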
Peer reviewed
Direct link
Attali, Yigal – Language Testing, 2016
A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session, and 14 trainees who passed an…
Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators
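A minimal sketch of a certification check against reference scores on 20 essays; the agreement statistics and the pass rule are assumed for illustration, not taken from the study:

```python
# Compare a trainee's scores to reference ("correct") scores and apply
# an assumed pass criterion based on adjacent agreement.
import numpy as np

rng = np.random.default_rng(3)
true = rng.integers(1, 7, size=20)                          # reference scores, 1-6 scale
rater = np.clip(true + rng.integers(-1, 2, size=20), 1, 6)  # trainee's scores

exact = np.mean(rater == true)
adjacent = np.mean(np.abs(rater - true) <= 1)
print(f"exact agreement: {exact:.0%}, within one point: {adjacent:.0%}")
print("certified:", adjacent >= 0.90)   # assumed pass rule, not the study's
```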
Peer reviewed
PDF on ERIC (download full text)
Möller, Jens; Müller-Kalthoff, Hanno; Helm, Friederike; Nagy, Nicole; Marsh, Herb W. – Frontline Learning Research, 2016
The dimensional comparison theory (DCT) focuses on the effects of internal, dimensional comparisons (e.g., "How good am I in math compared to English?") on academic self-concepts with widespread consequences for students' self-evaluation, motivation, and behavioral choices. DCT is based on the internal/external frame of reference model…
Descriptors: Comparative Analysis, Comparative Testing, Self Concept, Self Concept Measures
Li, Dongmei; Yi, Qing; Harris, Deborah – ACT, Inc., 2017
In preparation for online administration of the ACT® test, ACT conducted studies to examine the comparability of scores between online and paper administrations, including a timing study in fall 2013, a mode comparability study in spring 2014, and a second mode comparability study in spring 2015. This report presents major findings from these…
Descriptors: College Entrance Examinations, Computer Assisted Testing, Comparative Analysis, Test Format
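A minimal sketch of a basic mode-comparability check, the standardized mean difference between online and paper scores; the simulated scores are assumptions, and the ACT studies' designs and analyses are far richer:

```python
# Standardized mean difference (Cohen's d) between administration modes.
import numpy as np

rng = np.random.default_rng(4)
paper = rng.normal(20.8, 5.0, size=1500)    # illustrative composite scores
online = rng.normal(20.6, 5.1, size=1500)

pooled_sd = np.sqrt((paper.var(ddof=1) + online.var(ddof=1)) / 2)
d = (online.mean() - paper.mean()) / pooled_sd
print(f"standardized mode difference d = {d:.3f}")  # near 0 -> comparable
```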
Peer reviewed
Direct link
Cramer, Nicholas; Asmar, Abdo; Gorman, Laurel; Gros, Bernard; Harris, David; Howard, Thomas; Hussain, Mujtaba; Salazar, Sergio; Kibble, Jonathan D. – Advances in Physiology Education, 2016
Multiple-choice questions are a gold-standard tool in medical school for the assessment of knowledge and are the mainstay of licensing examinations. However, multiple-choice items can be criticized for lacking the ability to test higher-order learning or integrative thinking across multiple disciplines. Our objective was to develop a novel…
Descriptors: Physiology, Pharmacology, Multiple Choice Tests, Cost Effectiveness
Peer reviewed
Direct link
Strietholt, Rolf; Scherer, Ronny – Scandinavian Journal of Educational Research, 2018
The present paper aims to discuss how data from international large-scale assessments (ILSAs) can be utilized and combined, even with other existing data sources, to monitor educational outcomes and study the effectiveness of educational systems. We consider different purposes of linking data, namely, extending outcome measures,…
Descriptors: International Assessment, Group Testing, Outcomes of Education, Outcome Measures