ERIC - Search Results

Showing all 14 results

The BASIE (BAyeSian Interpretation of Estimates) Framework for Interpreting Findings from Impact Evaluations: A Practical Guide for Education Researchers. Toolkit. NCEE 2022-005

Peer reviewed

Deke, John; Finucane, Mariel; Thal, Daniel – National Center for Education Evaluation and Regional Assistance, 2022

BASIE is a framework for interpreting impact estimates from evaluations. It is an alternative to null hypothesis significance testing. This guide walks researchers through the key steps of applying BASIE, including selecting prior evidence, reporting impact estimates, interpreting impact estimates, and conducting sensitivity analyses. The guide…

Descriptors: Bayesian Statistics, Educational Research, Data Interpretation, Hypothesis Testing

On the Relationships between Sum Score Based Estimation and Joint Maximum Likelihood Estimation

Peer reviewed

del Pino, Guido; San Martin, Ernesto; Gonzalez, Jorge; De Boeck, Paul – Psychometrika, 2008

This paper analyzes the sum score based (SSB) formulation of the Rasch model, where items and sum scores of persons are considered as factors in a logit model. After reviewing the evolution leading to the equality between their maximum likelihood estimates, the SSB model is then discussed from the point of view of pseudo-likelihood and of…

Descriptors: Computation, Models, Scores, Evaluation Methods

Testing Intergroup Concordance in Ranking Experiments with Two Groups of Judges

Peer reviewed

Dekle, Dawn J.; Leung, Denis H. Y.; Zhu, Min – Psychological Methods, 2008

Across many areas of psychology, concordance is commonly used to measure the (intragroup) agreement in ranking a number of items by a group of judges. Sometimes, however, the judges come from multiple groups, and in those situations, the interest is to measure the concordance between groups, under the assumption that there is some within-group…

Descriptors: Item Response Theory, Statistical Analysis, Psychological Studies, Evaluators

A Bootstrap Generalization of Modified Parallel Analysis for IRT Dimensionality Assessment

Peer reviewed

Finch, Holmes; Monahan, Patrick – Applied Measurement in Education, 2008

This article introduces a bootstrap generalization to the Modified Parallel Analysis (MPA) method of test dimensionality assessment using factor analysis. This methodology, based on the use of Marginal Maximum Likelihood nonlinear factor analysis, provides for the calculation of a test statistic based on a parametric bootstrap using the MPA…

Descriptors: Monte Carlo Methods, Factor Analysis, Generalization, Methods

Using Explanatory Item Response Models to Analyze Group Differences in Science Achievement

Peer reviewed

Briggs, Derek C. – Applied Measurement in Education, 2008

This article illustrates the use of an explanatory item response modeling (EIRM) approach in the context of measuring group differences in science achievement. The distinction between item response models and EIRMs, recently elaborated by De Boeck and Wilson (2004), is presented within the statistical framework of generalized linear mixed models.…

Descriptors: Science Achievement, Science Tests, Measurement, Error of Measurement

A Confirmatory Analysis of Item Reliability Trends (CAIRT): Differentiating True Score and Error Variance in the Analysis of Item Context Effects

Peer reviewed

Hartig, Johannes; Holzel, Britta; Moosbrugger, Helfried – Multivariate Behavioral Research, 2007

Numerous studies have shown increasing item reliabilities as an effect of the item position in personality scales. Traditionally, these context effects are analyzed based on item-total correlations. This approach neglects that trends in item reliabilities can be caused either by an increase in true score variance or by a decrease in error…

Descriptors: True Scores, Error of Measurement, Structural Equation Models, Simulation

Qualities of Judgmental Ratings by Four Rater Sources.

Tsui, Anne S. – 1983

The quality of performance data yielded by subjective judgment is of major concern to researchers in performance appraisal. However, some confusion exists in the analysis of the quality of ratings obtained from different rating scale formats and from different raters. To clarify this confusion, a study was conducted to assess the quality of judgmental…

Descriptors: Administrator Evaluation, Administrators, Error of Measurement, Evaluation Methods

Reliability and Agreement of Ratings of Ataxic Dysarthric Speech Samples with Varying Intelligibility.

Peer reviewed

Sheard, Christine; And Others – Journal of Speech and Hearing Research, 1991

The study calculated indices of interjudge reliability and interjudge and intrajudge agreement on ratings made by 15 experienced speech clinicians on 5 deviant speech dimensions of 15 speakers with ataxic dysarthria and a wide range of speech intelligibility. Judges were reliable in tracking imprecise consonants, excess and equal stress, and harsh…

Descriptors: Adults, Error of Measurement, Evaluation Methods, Interrater Reliability

Tests of Variance Equality When Distributions Differ in Form, Scale and Location.

Olejnik, Stephen F.; Algina, James – 1986

Sampling distributions for ten tests for comparing population variances in a two group design were generated for several combinations of equal and unequal sample sizes, population means, and group variances when distributional forms differed. The ten procedures included: (1) O'Brien's (OB); (2) O'Brien's with adjusted degrees of freedom; (3)…

Descriptors: Error of Measurement, Evaluation Methods, Measurement Techniques, Nonparametric Statistics

Framework for Designing Sampling Programs.

Peer reviewed

Maher, W. A.; And Others – Environmental Monitoring and Assessment, 1994

Presents a general framework for designing sampling programs that ensure cost effectiveness and keep errors within known and acceptable limits. (LZ)

Descriptors: Cost Effectiveness, Environmental Education, Environmental Research, Error of Measurement

A Comparison of Approaches for Setting Proficiency Standards via Monte Carlo Simulations. Research Evaluation Development Technical Report Series, No. 2.

Ziomek, Robert L.; Szymczuk, Mike – 1983

In order to evaluate standard setting procedures, apart from the more commonly applied approach of simply comparing the derived standards or failure rates across various techniques, this study investigated the errors of classification associated with the contrasting groups procedures. Monte Carlo simulations were employed to produce…

Descriptors: Classification, Computer Simulation, Error of Measurement, Evaluation Methods

Cook, Linda L.; Petersen, Nancy S. – 1986

This paper examines how various equating methods are affected by: (1) sampling error; (2) sample characteristics; and (3) characteristics of anchor test items. It reviews empirical studies that investigated the invariance of equating transformations, and it discusses empirical and simulation studies that focus on how the properties of anchor tests…

Descriptors: Educational Research, Equated Scores, Error of Measurement, Evaluation Methods

How To Sample in Surveys. The Survey Kit, Volume 6.

Fink, Arlene – 1995

The nine-volume Survey Kit is designed to help readers prepare and conduct surveys and become better users of survey results. All the books in the series contain instructional objectives, exercises and answers, examples of surveys in use, illustrations of survey questions, guidelines for action, checklists of "dos and don'ts," and…

Descriptors: Costs, Data Collection, Educational Research, Error of Measurement

The Use and Effect of Caution Indices in Detecting Aberrant Patterns of Standard-Setting Recommendations.

Jaeger, Richard M.; Busch, John Christian – 1986

This study explores the use of the modified caution index (MCI) for identifying judges whose patterns of recommendations suggest that their judgments might be based on incomplete information, flawed reasoning, or inattention to their standard-setting tasks. It also examines the effect on test standards and passing rates when the test standards of…

Descriptors: Criterion Referenced Tests, Error of Measurement, Evaluation Methods, High Schools
