ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	1
Since 2006 (last 20 years)	19

Descriptor

Evaluation Methods	21
Evaluation Research	21
Testing Problems	21
Evaluation Problems	15
Educational Assessment	13
Measurement	13
Psychometrics	13
Test Validity	11
Measurement Techniques	9
Teacher Evaluation	9
Test Construction	9
Knowledge Base for Teaching	8
Mathematics Education	8
Mathematics Instruction	8
Pedagogical Content Knowledge	8
Educational Testing	6
Item Response Theory	4
Test Items	4
Test Reliability	4
Content Validity	3
Error of Measurement	3
Multiple Choice Tests	3
Scoring	3
Simulation	3
Student Evaluation	3

Source

Measurement:…	8
Journal of Educational…	4
Computers & Education	1
Educational Research and…	1
Educational Research for…	1
Educational Technology &…	1
Educational and Psychological…	1
International Association for…	1
International Journal of…	1
Language Teaching Research…	1
Science & Education	1

Publication Type

Journal Articles	20
Opinion Papers	8
Reports - Research	7
Reports - Evaluative	5
Information Analyses	1
Reports - Descriptive	1

Education Level

Elementary Secondary Education	12
Higher Education	3
Elementary Education	2
Postsecondary Education	2
Secondary Education	2
High Schools	1

Location

California	1
Maine	1
Michigan	1
New Hampshire	1
Oregon	1
South Africa	1
Thailand	1

Assessments and Surveys

Advanced Placement…

Showing 1 to 15 of 21 results

James Dean Brown's 50 Years of Work in Second Language Studies: A Systematic Review

Peer reviewed

James Dean Brown; Ali Panahi; Hassan Mohebbi – Language Teaching Research Quarterly, 2023

Panahi and Mohebbi review James Dean Brown's 50 years of research in language testing, curriculum development and research statistics with reference to an impressionistic framework for analysis containing two components with their subcomponents: Annotations (i.e., briefing and implications) and main concepts and themes (i.e., testing and teaching…

Descriptors: Second Language Learning, Second Language Instruction, Language Tests, Curriculum Development

Progress and Proficiency: Redesigning Grading for Competency Education. CompetencyWorks Issue Brief

Sturgis, Chris – International Association for K-12 Online Learning, 2014

This paper is part of a series investigating the implementation of competency education. The purpose of the paper is to explore how districts and schools can redesign grading systems to best help students to excel in academics and to gain the skills that are needed to be successful in college, the community, and the workplace. In order to make the…

Descriptors: Grading, Competency Based Education, Evaluation Methods, Evaluation Research

Ongoing Issues in Test Fairness

Peer reviewed

Camilli, Gregory – Educational Research and Evaluation, 2013

In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…

Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format

Comparison of Oral Examination and Electronic Examination Using Paired Multiple-Choice Questions

Peer reviewed

Ventouras, Errikos; Triantis, Dimos; Tsiakas, Panagiotis; Stergiopoulos, Charalampos – Computers & Education, 2011

The aim of the present research was to compare the use of multiple-choice questions (MCQs) as an examination method against the oral examination (OE) method. MCQs are widely used and their importance seems likely to grow, due to their inherent suitability for electronic assessment. However, MCQs are influenced by the tendency of examinees to guess…

Descriptors: Grades (Scholastic), Scoring, Multiple Choice Tests, Test Format

Impact of Diagnosticity on the Adequacy of Models for Cognitive Diagnosis under a Linear Attribute Structure: A Simulation Study

Peer reviewed

de La Torre, Jimmy; Karelitz, Tzur M. – Journal of Educational Measurement, 2009

Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM-based data with a linear attribute structure. The study utilizes a procedure…

Descriptors: Simulation, Item Response Theory, Psychometrics, Evaluation Methods

Impact of Missing Data on the Detection of Differential Item Functioning: The Case of Mantel-Haenszel and Logistic Regression Analysis

Peer reviewed

Robitzsch, Alexander; Rupp, Andre A. – Educational and Psychological Measurement, 2009

This article describes the results of a simulation study to investigate the impact of missing data on the detection of differential item functioning (DIF). Specifically, it investigates how four methods for dealing with missing data (listwise deletion, zero imputation, two-way imputation, response function imputation) interact with two methods of…

Descriptors: Test Bias, Simulation, Interaction, Effect Size

Monitoring Rater Performance over Time: A Framework for Detecting Differential Accuracy and Differential Scale Category Use

Peer reviewed

Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009

In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…

Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)

Judges' Use of Examinee Performance Data in an Angoff Standard-Setting Exercise for a Medical Licensing Examination: An Experimental Study

Peer reviewed

Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009

Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…

Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel

The Hierarchy Consistency Index: Evaluating Person Fit for Cognitive Diagnostic Assessment

Peer reviewed

Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009

In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…

Descriptors: Test Length, Simulation, Correlation, Research Methodology

Randomised Items in Computer-Based Tests: Russian Roulette in Assessment?

Peer reviewed

Marks, Anthony M.; Cronje, Johannes C. – Educational Technology & Society, 2008

Computer-based assessments are becoming more commonplace, perhaps as a necessity for faculty to cope with large class sizes. These tests often occur in large computer testing venues in which test security may be compromised. In an attempt to limit the likelihood of cheating in such venues, randomised presentation of items is automatically…

Descriptors: Educational Assessment, Educational Testing, Research Needs, Test Items

A Practical and Prescriptive Approach to Validity--Commentary

Peer reviewed

DiBello, Lou; Stout, William – Measurement: Interdisciplinary Research and Perspectives, 2007

In this article, the authors provide their critique of a set of papers that investigated Mathematics Knowledge for Teachers (MKT) assessment and the underlying theory and characteristics of the validity enterprise. Three types of assumptions and inferences--elemental, structural, and ecological--are discussed in these papers. These assumptions…

Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research

Our Field Needs a Framework to Guide Development of Validity Research Agendas and Identification of Validity Research Questions and Threats to Validity

Peer reviewed

Ferrara, Steve – Measurement: Interdisciplinary Research and Perspectives, 2007

In this issue of Measurement: Interdisciplinary Research and Perspectives, Schilling et al. are explicit about the centrality of assessment design and development and psychometric analysis in validation. Schilling and colleagues, Kane (2004, 2006), other contemporary validity theorists and practitioners, and their predecessors typically discuss…

Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research

Re-Conceptualizing Validity within the Context of a New Measure of Mathematical Knowledge for Teaching

Peer reviewed

Engelhard, George, Jr.; Sullivan, Rubye K. – Measurement: Interdisciplinary Research and Perspectives, 2007

In this journal issue, the authors of the focus articles have provided a suite of very stimulating and thoughtful articles. The overarching purpose of this research is to explore the application of principles derived from the view of validity proposed by Kane (2004) to their research on issues related to the measurement of mathematical knowledge…

Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research

The Complexities of Assessing Teacher Knowledge

Peer reviewed

Schoenfeld, Alan H. – Measurement: Interdisciplinary Research and Perspectives, 2007

The authors of this volume's stimulus papers have taken on the challenge of developing measures of teachers' mathematical knowledge for teaching (MKT). This task involves multiple decisions and considerations, including: (1) How does one specify the body of knowledge being assessed? What warrants are offered for those choices?; (2) How does one…

Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research

Mathematics Knowledge for Teaching: Questions about Constructs

Peer reviewed

Gearhart, Maryl – Measurement: Interdisciplinary Research and Perspectives, 2007

Teacher knowledge has been of theoretical and empirical interest for over two decades, and development of measures is overdue. The researchers represented in this volume have been breaking new ground by developing a measure of mathematical knowledge for teaching (MKT) without guiding precedents, and in the face of differing perspectives on teacher…

Descriptors: Learning Theories, Elementary School Mathematics, Teaching Methods, Construct Validity

Author

Ali Panahi	1
Baldwin, Su G.	1
Camilli, Gregory	1
Clauser, Brian E.	1
Cronje, Johannes C.	1
Cui, Ying	1
DiBello, Lou	1
Dillon, Gerard F.	1
Engelhard, George, Jr.	1
Fawkes, Don	1
Ferrara, Steve	1
Flage, Dan	1
Foster, Jeff L.	1
Gearhart, Maryl	1
Hassan Mohebbi	1
Hill, Heather C.	1
James Dean Brown	1
Karelitz, Tzur M.	1
Kulikowich, Jonna M.	1
Leighton, Jacqueline P.	1
Margolis, Melissa J.	1
Marks, Anthony M.	1
Mee, Janet	1
Meyer, Kevin D.	1
Myford, Carol M.	1