Showing 1 to 15 of 66 results
Peer reviewed
Teck Kiang Tan – Practical Assessment, Research & Evaluation, 2024
Procedures for carrying out factorial invariance testing to validate a construct are well developed, ensuring that a construct is reliable and can be used across groups for comparison and analysis, yet they remain largely restricted to the frequentist approach. This motivates an update that incorporates the growing Bayesian approach for carrying out the Bayesian…
Descriptors: Bayesian Statistics, Factor Analysis, Programming Languages, Reliability
Jeff Allen; Ty Cruce – ACT Education Corp., 2025
This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…
Descriptors: College Entrance Examinations, Testing, Change, Scores
Peer reviewed
Fu, Yuanshu; Wen, Zhonglin; Wang, Yang – Educational and Psychological Measurement, 2022
Composite reliability, or coefficient omega, can be estimated using structural equation modeling. Composite reliability is usually estimated under the basic independent clusters model of confirmatory factor analysis (ICM-CFA). However, due to the existence of cross-loadings, the model fit of the exploratory structural equation model (ESEM) is…
Descriptors: Comparative Analysis, Structural Equation Models, Factor Analysis, Reliability
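For a single-factor model, the composite reliability (coefficient omega) discussed in the entry above can be written as ω = (Σλ)² / ((Σλ)² + Σθ), where λ are the factor loadings and θ the residual variances. A minimal sketch with hypothetical standardized loadings (not values from the article):

```python
def coefficient_omega(loadings, residual_variances):
    """Composite reliability (coefficient omega) for a single-factor model."""
    true_var = sum(loadings) ** 2          # (sum of loadings) squared
    error_var = sum(residual_variances)    # total residual variance
    return true_var / (true_var + error_var)

# Hypothetical standardized loadings; theta = 1 - lambda^2 under standardization
loadings = [0.7, 0.8, 0.6, 0.75]
residuals = [1 - lam ** 2 for lam in loadings]
print(round(coefficient_omega(loadings, residuals), 3))  # → 0.807
```

The ICM-CFA vs. ESEM distinction in the abstract concerns how the loadings λ are estimated (with or without cross-loadings), not the formula itself.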
Peer reviewed
Crompvoets, Elise A. V.; Béguin, Anton A.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2020
Pairwise comparison is becoming increasingly popular as a holistic measurement method in education. Unfortunately, many comparisons are required for reliable measurement. To reduce the number of required comparisons, we developed an adaptive selection algorithm (ASA) that selects the most informative comparisons while taking the uncertainty of the…
Descriptors: Comparative Analysis, Statistical Analysis, Mathematics, Measurement
Peer reviewed
Saluja, Ronak; Cheng, Sierra; delos Santos, Keemo Althea; Chan, Kelvin K. W. – Research Synthesis Methods, 2019
Objective: Various statistical methods have been developed to estimate hazard ratios (HRs) from published Kaplan-Meier (KM) curves for the purpose of performing meta-analyses. The objective of this study was to determine the reliability, accuracy, and precision of four commonly used methods by Guyot, Williamson, Parmar, and Hoyle and Henley.…
Descriptors: Meta Analysis, Reliability, Accuracy, Randomized Controlled Trials
Peer reviewed
De Raadt, Alexandra; Warrens, Matthijs J.; Bosker, Roel J.; Kiers, Henk A. L. – Educational and Psychological Measurement, 2019
Cohen's kappa coefficient is commonly used for assessing agreement between classifications of two raters on a nominal scale. Three variants of Cohen's kappa that can handle missing data are presented. Data are considered missing if one or both ratings of a unit are missing. We study how well the variants estimate the kappa value for complete data…
Descriptors: Interrater Reliability, Data, Statistical Analysis, Statistical Bias
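The entry above concerns Cohen's kappa, which corrects the observed agreement p_o between two raters for the agreement p_e expected by chance: κ = (p_o − p_e) / (1 − p_e). A minimal complete-data sketch (the missing-data variants studied in the article are not reproduced here):

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters classifying the same units on a nominal scale."""
    n = len(ratings_a)
    # Observed proportion of agreement
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal category frequencies
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

print(round(cohens_kappa(list("AABBB"), list("AABBA")), 3))  # → 0.615
```

A unit with one or both ratings missing would simply be dropped here (listwise deletion); the article's variants handle such units differently.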
Peer reviewed
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Peer reviewed
LaHuis, David M.; Bryant-Lees, Kinsey B.; Hakoyama, Shotaro; Barnes, Tyler; Wiemann, Andrea – Journal of Educational Measurement, 2018
Person reliability parameters (PRPs) model temporary changes in individuals' attribute level perceptions when responding to self-report items (higher levels of PRPs represent less fluctuation). PRPs could be useful in measuring careless responding and traitedness. However, it is unclear how well current procedures for estimating PRPs can recover…
Descriptors: Comparative Analysis, Reliability, Error of Measurement, Measurement Techniques
Peer reviewed
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
Peer reviewed
Sata, Mehmet; Karakaya, Ismail – International Journal of Assessment Tools in Education, 2022
In measuring and assessing high-level cognitive skills, the interference of rater errors in measurements is a constant concern and lowers objectivity. The main purpose of this study was to investigate the impact of rater training on rater errors in the process of assessing individual performance. The study was conducted with a…
Descriptors: Evaluators, Training, Comparative Analysis, Academic Language
Peer reviewed
Silber, Henning; Roßmann, Joss; Gummer, Tobias – International Journal of Social Research Methodology, 2018
In this article, we present the results of three question design experiments on inter-item correlations, which tested a grid design against a single-item design. The first and second experiments examined the inter-item correlations of a set with five and seven items, respectively, and the third experiment examined the impact of the question design…
Descriptors: Foreign Countries, Online Surveys, Experiments, Correlation
Peer reviewed
Schnoor, Birger; Hartig, Johannes; Klinger, Thorsten; Naumann, Alexander; Usanova, Irina – Language Testing, 2023
Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed. In such settings, tests for measuring language development must meet high standards of test quality such as validity, reliability, and objectivity, as…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Longitudinal Studies
Peer reviewed
O'Donnell, Shannon; Tavares, Francisco; McMaster, Daniel; Chambers, Samuel; Driller, Matthew – Measurement in Physical Education and Exercise Science, 2018
The current study aimed to assess the validity and test-retest reliability of a linear position transducer when compared to a force plate through a counter-movement jump in female participants. Twenty-seven female recreational athletes (19 ± 2 years) performed three counter-movement jumps simultaneously using the linear position transducer and…
Descriptors: Test Validity, Test Reliability, Females, Athletes
Peer reviewed
Schweig, Jonathan David – Applied Measurement in Education, 2014
Developing indicators that reflect important aspects of school and classroom environments has become central in a nationwide effort to develop comprehensive programs that measure teacher quality and effectiveness. Formulating teacher evaluation policy necessitates accurate and reliable methods for measuring these environmental variables. This…
Descriptors: Error of Measurement, Educational Environment, Classroom Environment, Surveys
Peer reviewed
Ravesloot, C. J.; Van der Schaaf, M. F.; Muijtjens, A. M. M.; Haaring, C.; Kruitwagen, C. L. J. J.; Beek, F. J. A.; Bakker, J.; Van Schaik, J.P.J.; Ten Cate, Th. J. – Advances in Health Sciences Education, 2015
Formula scoring (FS) is the use of a don't know option (DKO) with subtraction of points for wrong answers. Its effect on the construct validity and reliability of progress test scores is a subject of discussion. Choosing a DKO may not only be affected by knowledge level, but also by risk-taking tendency, and may thus introduce construct-irrelevant…
Descriptors: Scoring Formulas, Tests, Scores, Construct Validity
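Formula scoring as described in the entry above can be illustrated with the common convention of a 1/(k − 1) penalty per wrong answer on k-option items, chosen so that random guessing has zero expected gain; don't-know responses score zero. A small sketch with hypothetical counts (the specific penalty weight is an assumption, not taken from the article):

```python
def formula_score(num_right, num_wrong, num_options):
    """Formula score: rights minus a guessing penalty for wrongs.
    'Don't know' responses contribute zero, so they need not be passed in."""
    return num_right - num_wrong / (num_options - 1)

# Hypothetical: 40 right, 10 wrong, rest "don't know", on 4-option items
print(round(formula_score(40, 10, 4), 2))  # → 36.67
```

With this penalty, a pure guess on a 4-option item gains 1 point with probability 1/4 and loses 1/3 point with probability 3/4, for an expected change of zero — which is exactly why risk-taking tendency, as the abstract notes, influences whether an examinee chooses the DKO.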