ERIC - Search Results

Showing 1 to 15 of 33 results

Serious Games for Assessment: Welcome to the Jungle

Peer reviewed

Kato, Pamela M.; de Klerk, Sebastiaan – Journal of Applied Testing Technology, 2017

Serious games are increasingly being explored for use as assessment tools in broad domains. Drawing from research in these domains, we present important advantages and challenges that arise when using games for assessment. In light of this context and as an introduction to this special issue on Serious Games and Assessments, we introduce the…

Descriptors: Evaluation Methods, Formative Evaluation, Design, Educational Games

Norm Block Sample Sizes: A Review of 17 Individually Administered Intelligence Tests

Peer reviewed

Norfolk, Philip A.; Farmer, Ryan L.; Floyd, Randy G.; Woods, Isaac L.; Hawkins, Haley K.; Irby, Sarah M. – Journal of Psychoeducational Assessment, 2015

The representativeness, recency, and size of norm samples strongly influence the accuracy of inferences drawn from their scores. Inadequate norm samples may lead to inflated or deflated scores for individuals and poorer prediction of developmental and academic outcomes. The purpose of this study was to apply Kranzler and Floyd's method for…

Descriptors: Intelligence Tests, Psychometrics, Sample Size, Norm Referenced Tests

Challenges and Strategies for Assessing Specialised Knowledge for Teaching

Peer reviewed
PDF on ERIC

Orrill, Chandra Hawley; Kim, Ok-Kyeong; Peters, Susan A.; Lischka, Alyson E.; Jong, Cindy; Sanchez, Wendy B.; Eli, Jennifer A. – Mathematics Teacher Education and Development, 2015

Developing and writing assessment items that measure teachers' knowledge is an intricate and complex undertaking. In this paper, we begin with an overview of what is known about measuring teacher knowledge. We then highlight the challenges inherent in creating assessment items that focus specifically on measuring teachers' specialised knowledge…

Descriptors: Specialization, Knowledge Base for Teaching, Educational Strategies, Testing Problems

Impact of Diagnosticity on the Adequacy of Models for Cognitive Diagnosis under a Linear Attribute Structure: A Simulation Study

Peer reviewed

de la Torre, Jimmy; Karelitz, Tzur M. – Journal of Educational Measurement, 2009

Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM-based data with a linear attribute structure. The study utilizes a procedure…

Descriptors: Simulation, Item Response Theory, Psychometrics, Evaluation Methods

Defending the Quality of Links between Scores from Different Tests and Exams

Peer reviewed

Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010

Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Impact of Missing Data on the Detection of Differential Item Functioning: The Case of Mantel-Haenszel and Logistic Regression Analysis

Peer reviewed

Robitzsch, Alexander; Rupp, Andre A. – Educational and Psychological Measurement, 2009

This article describes the results of a simulation study to investigate the impact of missing data on the detection of differential item functioning (DIF). Specifically, it investigates how four methods for dealing with missing data (listwise deletion, zero imputation, two-way imputation, response function imputation) interact with two methods of…

Descriptors: Test Bias, Simulation, Interaction, Effect Size

Conceptualizing Comparability

Peer reviewed

Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2010

This article presents the author's rejoinder to thinking about linking from issue 8(1). Particularly within the more embracing linking frameworks, e.g., Holland & Dorans (2006) and Holland (2007), there appears to be a major disjunction between (1) classification discourse: the supposed basis for classification, that is, the underlying theory…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Linking through Improved Design, Not Redefinition: Commentary on Newton

Peer reviewed

Walker, Michael E. – Measurement: Interdisciplinary Research and Perspectives, 2010

"Linking" is a term given to a general class of procedures by which one represents scores X on one test or measure in terms of scores Y on another test or measure. A recent taxonomy by Holland and Dorans (2006; Holland, 2007) organizes the various types of links into three broad categories: prediction, scale aligning, and equating. In…

Descriptors: Foreign Countries, Test Construction, Test Validity, Measurement Techniques

What Constitutes Legitimate Causal Linking?

Peer reviewed

Baird, Jo-Anne – Measurement: Interdisciplinary Research and Perspectives, 2010

Newton's article (2010) makes three main contributions to the literature. First, it is transatlantic, bringing together literatures that have been dealing with similar problems, using sometimes different methods and certainly with distinctive educational, cultural perspectives. He points out that neither of these literatures has all of the…

Descriptors: Foreign Countries, Predictive Validity, Standards, Ethics

What Dictates the Meaning of Test Linking? A Reaction to "Thinking about Linking"

Peer reviewed

von Davier, Alina A. – Measurement: Interdisciplinary Research and Perspectives, 2010

The article "Thinking About Linking" by Newton (2010) presents a novel philosophical perspective on the way that educational assessments should be linked. Newton starts by describing the linking framework as it was characterized in various publications and identifies a cross-cultural dimension in the definitions and uses of test…

Descriptors: Foreign Countries, Educational Assessment, Student Evaluation, Evaluation Criteria

Judges' Use of Examinee Performance Data in an Angoff Standard-Setting Exercise for a Medical Licensing Examination: An Experimental Study

Peer reviewed

Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009

Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…

Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel

The Hierarchy Consistency Index: Evaluating Person Fit for Cognitive Diagnostic Assessment

Peer reviewed

Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009

In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…

Descriptors: Test Length, Simulation, Correlation, Research Methodology

Assessing the Functions of Aberrant Behaviors: A Review of Psychometric Instruments

Peer reviewed

Sturmey, Peter – Journal of Autism and Developmental Disorders, 1994

This paper reviews the psychometric properties, treatment utility, and conceptual basis of instruments used to identify the functions of aberrant behaviors in people with developmental disabilities. Instruments include the Motivational Assessment Scale, Motivation Analysis Rating Scale, Functional Analysis Interview Form, and Functional Analysis…

Descriptors: Behavior Problems, Developmental Disabilities, Evaluation Methods, Motivation

A Practical and Prescriptive Approach to Validity--Commentary

Peer reviewed

DiBello, Lou; Stout, William – Measurement: Interdisciplinary Research and Perspectives, 2007

In this article, the authors provide their critique on a set of papers that investigated Mathematics Knowledge for Teachers (MKT) assessment and the underlying theory and characteristics of the validity enterprise. Three types of assumptions and inferences--elemental, structural, and ecological--are discussed in these papers. These assumptions…

Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research

Our Field Needs a Framework to Guide Development of Validity Research Agendas and Identification of Validity Research Questions and Threats to Validity

Peer reviewed

Ferrara, Steve – Measurement: Interdisciplinary Research and Perspectives, 2007

In this issue of Measurement: Interdisciplinary Research and Perspectives, Schilling et al. are explicit about the centrality of assessment design and development and psychometric analysis in validation. Schilling and colleagues, Kane (2004, 2006), other contemporary validity theorists and practitioners, and their predecessors typically discuss…

Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research
