Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 19 |
Descriptor
Evaluation Methods | 21 |
Evaluation Research | 21 |
Testing Problems | 21 |
Evaluation Problems | 15 |
Educational Assessment | 13 |
Measurement | 13 |
Psychometrics | 13 |
Test Validity | 11 |
Measurement Techniques | 9 |
Teacher Evaluation | 9 |
Test Construction | 9 |
More ▼ |
Source
Author
Ali Panahi | 1 |
Baldwin, Su G. | 1 |
Camilli, Gregory | 1 |
Clauser, Brian E. | 1 |
Cronje, Johannes C. | 1 |
Cui, Ying | 1 |
DiBello, Lou | 1 |
Dillon, Gerard F. | 1 |
Engelhard, George, Jr. | 1 |
Fawkes, Don | 1 |
Ferrara, Steve | 1 |
More ▼ |
Publication Type
Journal Articles | 20 |
Opinion Papers | 8 |
Reports - Research | 7 |
Reports - Evaluative | 5 |
Information Analyses | 1 |
Reports - Descriptive | 1 |
Education Level
Elementary Secondary Education | 12 |
Higher Education | 3 |
Elementary Education | 2 |
Postsecondary Education | 2 |
Secondary Education | 2 |
High Schools | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
Advanced Placement… | 1 |
What Works Clearinghouse Rating
James Dean Brown; Ali Panahi; Hassan Mohebbi – Language Teaching Research Quarterly, 2023
Panahi and Mohebbi review James Dean Brown's 50-years of research in language testing, curriculum development and research statistics with reference to an impressionistic framework for analysis containing two components with their subcomponents: Annotations (i.e., briefing and implications) and main concepts and themes (i.e., testing and teaching…
Descriptors: Second Language Learning, Second Language Instruction, Language Tests, Curriculum Development
Sturgis, Chris – International Association for K-12 Online Learning, 2014
This paper is part of a series investigating the implementation of competency education. The purpose of the paper is to explore how districts and schools can redesign grading systems to best help students to excel in academics and to gain the skills that are needed to be successful in college, the community, and the workplace. In order to make the…
Descriptors: Grading, Competency Based Education, Evaluation Methods, Evaluation Research
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
Ventouras, Errikos; Triantis, Dimos; Tsiakas, Panagiotis; Stergiopoulos, Charalampos – Computers & Education, 2011
The aim of the present research was to compare the use of multiple-choice questions (MCQs) as an examination method against the oral examination (OE) method. MCQs are widely used and their importance seems likely to grow, due to their inherent suitability for electronic assessment. However, MCQs are influenced by the tendency of examinees to guess…
Descriptors: Grades (Scholastic), Scoring, Multiple Choice Tests, Test Format
de La Torre, Jimmy; Karelitz, Tzur M. – Journal of Educational Measurement, 2009
Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM-based data with a linear attribute structure. The study utilizes a procedure…
Descriptors: Simulation, Item Response Theory, Psychometrics, Evaluation Methods
Robitzsch, Alexander; Rupp, Andre A. – Educational and Psychological Measurement, 2009
This article describes the results of a simulation study to investigate the impact of missing data on the detection of differential item functioning (DIF). Specifically, it investigates how four methods for dealing with missing data (listwise deletion, zero imputation, two-way imputation, response function imputation) interact with two methods of…
Descriptors: Test Bias, Simulation, Interaction, Effect Size
Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009
In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…
Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)
Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology
Marks, Anthony M.; Cronje, Johannes C. – Educational Technology & Society, 2008
Computer-based assessments are becoming more commonplace, perhaps as a necessity for faculty to cope with large class sizes. These tests often occur in large computer testing venues in which test security may be compromised. In an attempt to limit the likelihood of cheating in such venues, randomised presentation of items is automatically…
Descriptors: Educational Assessment, Educational Testing, Research Needs, Test Items
DiBello, Lou; Stout, William – Measurement: Interdisciplinary Research and Perspectives, 2007
In this article, the authors provide their critique on a set of papers that investigated Mathematics Knowledge for Teachers (MKT) assessment and the underlying theory and characteristics of the validity enterprise. Three types of assumptions and inferences--elemental, structural, and ecological--are discussed in these papers. These assumptions…
Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research
Ferrara, Steve – Measurement: Interdisciplinary Research and Perspectives, 2007
In this issue of Measurement: Interdisciplinary Research and Perspectives, Schilling et al. are explicit about the centrality of assessment design and development and psychometric analysis in validation. Schilling and colleagues, Kane (2004, 2006), other contemporary validity theorists and practitioners, and their predecessors typically discuss…
Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research
Engelhard, George, Jr.; Sullivan, Rubye K. – Measurement: Interdisciplinary Research and Perspectives, 2007
In this journal issue, the authors of the focus articles have provided a suite of very stimulating and thoughtful articles. The overarching purpose of this research is to explore the application of principles derived from the view of validity proposed by Kane (2004) to their research on issues related to the measurement of mathematical knowledge…
Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research
Schoenfeld, Alan H. – Measurement: Interdisciplinary Research and Perspectives, 2007
The authors of this volume's stimulus papers have taken on the challenge of developing measures of teachers' mathematical knowledge for teaching (MKT). This task involves multiple decisions and considerations, including: (1) How does one specify the body of knowledge being assessed? What warrants are offered for those choices?; (2) How does one…
Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research
Gearhart, Maryl – Measurement: Interdisciplinary Research and Perspectives, 2007
Teacher knowledge has been of theoretical and empirical interest for over two decades, and development of measures is overdue. The researchers represented in this volume have been breaking new ground by developing a measure of mathematical knowledge for teaching (MKT) without guiding precedents, and in the face of differing perspectives on teacher…
Descriptors: Learning Theories, Elementary School Mathematics, Teaching Methods, Construct Validity
Previous Page | Next Page ยป
Pages: 1 | 2