ERIC - Search Results

Showing all 8 results

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

Method-of-Moment Corrected Maximum Likelihood (ML) Structural-after-Measurement (SAM) Estimator for n-Level Structural Equation Models

Peer reviewed

Fangxing Bai; Ben Kelcey – Society for Research on Educational Effectiveness, 2024

Purpose and Background: Despite the flexibility of multilevel structural equation modeling (MLSEM), a practical limitation many researchers encounter is how to effectively estimate model parameters with typical sample sizes when there are many levels of (potentially disparate) nesting. We develop a method-of-moment corrected maximum likelihood…

Descriptors: Maximum Likelihood Statistics, Structural Equation Models, Sample Size, Faculty Development

Comparison of Inter-Rater Reliability Techniques in Performance-Based Assessment

Peer reviewed

Arslan Mancar, Sinem; Gulleroglu, H. Deniz – International Journal of Assessment Tools in Education, 2022

The aim of this study is to analyse the importance of the number of raters and compare the results obtained by techniques based on Classical Test Theory (CTT) and Generalizability (G) Theory. The Kappa and Krippendorff alpha techniques based on CTT were used to determine the inter-rater reliability. In this descriptive research data consists of…

Descriptors: Comparative Analysis, Interrater Reliability, Advanced Placement, Scoring Rubrics

Building an Initial Validity Argument for Binary and Analytic Rating Scales for an EFL Classroom Writing Assessment: Evidence from Many-Facets Rasch Measurement

Peer reviewed

Khamboonruang, Apichat – rEFLections, 2022

Although much research has compared the functioning between analytic and holistic rating scales, little research has compared the functioning of binary rating scales with other types of rating scales. This quantitative study set out to preliminarily and comparatively validate binary and analytic rating scales intended for use in formative…

Descriptors: Writing Evaluation, Evaluation Methods, Second Language Learning, Second Language Instruction

Towards a Method for Comparing Curricula

Greatorex, Jackie; Rushton, Nicky; Coleman, Tori; Darlington, Ellie; Elliott, Gill – Cambridge Assessment, 2019

A curriculum map is a visualisation of relationships within and between a curriculum or curricula. Curriculum mapping refers to the method for creating and using the curriculum map, however this term is used broadly and encompasses a variety of methodological approaches. Often, researchers in the field of curriculum studies conduct curriculum…

Descriptors: Comparative Analysis, Visualization, Curriculum, Maps

Using Readability Tests to Improve the Accuracy of Evaluation Documents Intended for Low-Literate Participants

Peer reviewed

Kouame, Julien B. – Journal of MultiDisciplinary Evaluation, 2010

Background: Readability tests are indicators that measure how easily a document can be read and understood. Simple, but very often ignored, readability statistics can not only provide information about the level of difficulty of the readability of particular documents but also can increase an evaluator's credibility. Purpose: The purpose of this…

Descriptors: Readability, Readability Formulas, Evaluation Methods, Literacy

Generalizability Theory and Many-Facet Rasch Measurement.

Linacre, John M. – 1993

Generalizability theory (G-theory) and many-facet Rasch measurement (Rasch) manage the variability inherent when raters rate examinees on test items. The purpose of G-theory is to estimate test reliability in a raw score metric. Unadjusted examinee raw scores are reported as measures. A variance component is estimated for the examinee…

Descriptors: Comparative Analysis, Equations (Mathematics), Estimation (Mathematics), Evaluators

Measuring Instructional Effectiveness: A Comparison of a Computer-Assisted Systematic Observation Instrument with Global Measures. Proceedings from Seminar on Teacher Development and Linguistic Diversity.

Flores, Kathryn Younger – 1995

This paper presents preliminary, but statistically significant, findings from a study that compares two methods of measuring instructional effectiveness: global evaluation by experts and systematic observation using the SCRIBE Ob.2 software developed at the University of Texas at Austin. Hierarchical instruction of a performance skill…

Descriptors: Classroom Observation Techniques, College Faculty, Comparative Analysis, Computer Assisted Testing