ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	3
Since 2016 (last 10 years)	6
Since 2006 (last 20 years)	50

Descriptor

Item Analysis	55
Robustness (Statistics)	55
Foreign Countries	15
Evaluation Research	12
Measurement Techniques	12
Evaluation Methods	11
Item Response Theory	11
Test Reliability	11
Test Validity	11
Models	10
Test Items	10
Program Validation	8
Error of Measurement	7
Measures (Individuals)	7
Psychometrics	7
Research Methodology	7
Statistical Studies	7
Comparative Analysis	6
Computation	6
Difficulty Level	6
Performance Factors	6
Replication (Evaluation)	6
Scoring Rubrics	6
Accountability	5
College Students	5

Publication Type

Journal Articles	45
Reports - Research	27
Reports - Evaluative	18
Reports - Descriptive	5
Dissertations/Theses -…	4
Information Analyses	2
Guides - Non-Classroom	1
Opinion Papers	1

Education Level

Higher Education	17
Elementary Secondary Education	8
Postsecondary Education	6
Adult Education	4
Early Childhood Education	2
Grade 7	2
High Schools	2
Secondary Education	2
Adult Basic Education	1
Elementary Education	1
Grade 10	1
Grade 4	1
Grade 5	1
Grade 8	1
Middle Schools	1

Audience

Teachers	2
Administrators	1
Counselors	1
Policymakers	1

Location

Canada	3
Australia	2
Belgium	1
Finland (Helsinki)	1
Italy	1
Japan	1
Kuwait	1
Malaysia	1
Michigan	1
Netherlands	1
Switzerland	1
Texas	1
Turkey	1
United Kingdom (England)	1
Washington	1

Laws, Policies, & Programs

No Child Left Behind Act 2001

Assessments and Surveys

Conners Teacher Rating Scale	1
General Educational…	1
Program for International…	1
Social Skills Rating System	1
Strong Interest Inventory	1
Teacher Rating Scale	1
Work Values Inventory	1

Showing 1 to 15 of 55 results

A Robust Method for Detecting Item Misfit in Large-Scale Assessments

Peer reviewed

von Davier, Matthias; Bezirhan, Ummugul – Educational and Psychological Measurement, 2023

Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population…

Descriptors: Robustness (Statistics), Test Items, Item Analysis, Goodness of Fit

Rethinking the Exploration of Dichotomous Data: Mokken Scale Analysis versus Factorial Analysis

Peer reviewed

Antino, Mirko; Alvarado, Jesús M.; Asún, Rodrigo A.; Bliese, Paul – Sociological Methods & Research, 2020

The need to determine the correct dimensionality of theoretical constructs and generate valid measurement instruments when underlying items are categorical has generated a significant volume of research in the social sciences. This article presents two studies contrasting different categorical exploratory techniques. The first study compares…

Descriptors: Nonparametric Statistics, Factor Analysis, Item Analysis, Robustness (Statistics)

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing

Peer reviewed

Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022

When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…

Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis

Assess Robustness of the Rasch Mixture Model to Detect Differential Item Functioning -- A Monte Carlo Study

Jinjin Huang – ProQuest LLC, 2020

Measurement invariance is crucial for an effective and valid measure of a construct. Invariance holds when the latent trait varies consistently across subgroups; in other words, the mean differences among subgroups are only due to true latent ability differences. Differential item functioning (DIF) occurs when measurement invariance is violated.…

Descriptors: Robustness (Statistics), Item Response Theory, Test Items, Item Analysis

Effective Computer-Aided Assessment of Mathematics; Principles, Practice and Results

Peer reviewed

Greenhow, Martin – Teaching Mathematics and Its Applications, 2015

This article outlines some key issues for writing effective computer-aided assessment (CAA) questions in subjects with substantial mathematical or statistical content, especially the importance of control of random parameters and the encoding of wrong methods of solution (mal-rules) commonly used by students. The pros and cons of using CAA and…

Descriptors: Mathematics Instruction, Computer Assisted Testing, Educational Principles, Educational Practices

Stability of Scores on Super's Work Values Inventory-Revised

Peer reviewed

Leuty, Melanie E. – Measurement and Evaluation in Counseling and Development, 2013

Test-retest data on Super's Work Values Inventory-Revised for a group of predominantly White (N = 995) women (mean age = 23.5 years, SD = 8.07) and men (mean age = 21.5 years, SD = 5.80) showed stability in mean-level scores over a period of 1 year for the sample as a whole. However, low raw score and rank order stability coefficients…

Descriptors: Robustness (Statistics), Scores, Individual Differences, Item Analysis

Replication and Robustness in Developmental Research

Peer reviewed

Duncan, Greg J.; Engel, Mimi; Claessens, Amy; Dowsett, Chantelle J. – Developmental Psychology, 2014

Replications and robustness checks are key elements of the scientific method and a staple in many disciplines. However, leading journals in developmental psychology rarely include explicit replications of prior research conducted by different investigators, and few require authors to establish in their articles or online appendices that their key…

Descriptors: Replication (Evaluation), Robustness (Statistics), Developmental Psychology, Educational Research

A Psychometric Assessment of the "Businessweek," "U.S. News & World Report," and "Financial Times" Rankings of Business Schools' MBA Programs

Peer reviewed

Iacobucci, Dawn – Journal of Marketing Education, 2013

This research investigates the reliability and validity of three major publications' rankings of MBA programs. Each set of rankings showed reasonable consistency over time, both at the level of the overall rankings and for most of the facets from which the rankings are derived. Each set of rankings also showed some levels of convergent and…

Descriptors: Psychometrics, Business Administration Education, Reliability, Validity

Development of the Statistical Reasoning in Biology Concept Inventory (SRBCI)

Peer reviewed

Deane, Thomas; Nomme, Kathy; Jeffery, Erica; Pollock, Carol; Birol, Gülnur – CBE - Life Sciences Education, 2016

We followed established best practices in concept inventory design and developed a 12-item inventory to assess student ability in statistical reasoning in biology (Statistical Reasoning in Biology Concept Inventory [SRBCI]). It is important to assess student thinking in this conceptual area, because it is a fundamental requirement of being…

Descriptors: Foreign Countries, Measures (Individuals), Test Construction, Statistics

Extending Item Response Theory to Online Homework

Peer reviewed

Kortemeyer, Gerd – Physical Review Special Topics - Physics Education Research, 2014

Item response theory (IRT) becomes an increasingly important tool when analyzing "big data" gathered from online educational venues. However, the mechanism was originally developed in traditional exam settings, and several of its assumptions are infringed upon when deployed in the online realm. For a large-enrollment physics course for…

Descriptors: Item Response Theory, Online Courses, Electronic Learning, Homework

The Number of Feedbacks Needed for Reliable Evaluation. A Multilevel Analysis of the Reliability, Stability and Generalisability of Students' Evaluation of Teaching

Peer reviewed

Rantanen, Pekka – Assessment & Evaluation in Higher Education, 2013

A multilevel analysis approach was used to analyse students' evaluation of teaching (SET). The low value of inter-rater reliability stresses that any solid conclusions on teaching cannot be made on the basis of single feedbacks. To assess a teacher's general teaching effectiveness, one needs to evaluate four randomly chosen course implementations.…

Descriptors: Test Reliability, Feedback (Response), Generalizability Theory, Student Evaluation of Teacher Performance

How Large Should a Statistical Sample Be?

Peer reviewed

Menil, Violeta C.; Ye, Ruili – MathAMATYC Educator, 2012

This study serves as a teaching aid for teachers of introductory statistics. The aim of this study was limited to determining various sample sizes when estimating population proportion. Tables on sample sizes were generated using a C++ program, which depends on population size, degree of precision or error level, and confidence…

Descriptors: Sample Size, Probability, Statistics, Sampling

The MCCI (Millon College Counseling Inventory) in an Ethnically Diverse Student Population

Dornheim, Liane; Ramnath, R.; Gomez, C.; von Harscher, H.; Pellegrini, A. – Online Submission, 2011

This study examined psychometric properties of the MCCI (Millon College Counseling Inventory) (T. Millon, Strack, C. Millon, & Grossman, 2006), as applied to students from ethnically and culturally diverse backgrounds. The sample (N = 209, Mean age = 23.81, 74% identified as ethnic minority) was derived from students presented for counseling…

Descriptors: Psychometrics, Item Analysis, Replication (Evaluation), Ethnic Diversity

Assessing the "Rothstein Falsification Test": Does It Really Show Teacher Value-Added Models Are Biased? CEDR Working Paper No. 2012 1.3

Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012

In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…

Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias

Source

ProQuest LLC	4
Applied Measurement in…	2
Assessment & Evaluation in…	2
Center for Education Data &…	2
Educational and Psychological…	2
Measurement and Evaluation in…	2
Online Submission	2
School Effectiveness and…	2
Social Indicators Research	2
Assessment for Effective…	1
Australian Journal of Career…	1
Bulletin of Science,…	1
CBE - Life Sciences Education	1
Canadian Journal of School…	1
Carnegie Corporation of New…	1
Consortium for Policy…	1
Developmental Psychology	1
ETS Research Report Series	1
Early Education and…	1
Education Policy Analysis…	1
Educational Measurement:…	1
Educational Research and…	1
European Early Childhood…	1
European Physical Education…	1
Evaluation and Program…	1

Author

Goldhaber, Dan	2
Abdulkadri, Abdullahi O.	1
Abulela, Mohammed A. A.	1
Al-Hamdan, Jasem M.	1
Al-Yacoub, Ali M.	1
Alvarado, Jesús M.	1
Antino, Mirko	1
Aoyama, Kazuhiro	1
Asún, Rodrigo A.	1
Aye, Lu	1
Beddow, Peter A., III	1
Bezirhan, Ummugul	1
Birol, Gülnur	1
Bliese, Paul	1
Bowling, Nathan A.	1
Brown, Pat	1
Camilli, Gregory	1
Chaplin, Duncan	1
Claessens, Amy	1
Cools, Wilfried	1
Cordes, Matthew, McLaughlin,…	1
D'Achiardi, Catalina	1
D'Entremont, Dylan	1
De Fraine, Bieke	1
Deane, Thomas	1