ERIC - Search Results

Showing 1 to 15 of 82 results

New Tests of Rater Drift in Trend Scoring

Peer reviewed

Donoghue, John R.; Eckerly, Carol – Applied Measurement in Education, 2024

Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…

Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics

A Comparison of Manual versus Automated Quantitative Production Analysis of Connected Speech

Peer reviewed

Fromm, Davida; Katta, Saketh; Paccione, Mason; Hecht, Sophia; Greenhouse, Joel; MacWhinney, Brian; Schnur, Tatiana T. – Journal of Speech, Language, and Hearing Research, 2021

Purpose: Analysis of connected speech in the field of adult neurogenic communication disorders is essential for research and clinical purposes, yet time and expertise are often cited as limiting factors. The purpose of this project was to create and evaluate an automated program to score and compute the measures from the Quantitative Production…

Descriptors: Speech, Automation, Statistical Analysis, Adults

Reliability of Teams' Game-Related Statistics in Basketball: Number of Games Required and Minimal Detectable Change

Peer reviewed

Pérez-Ferreirós, Alexandra; Kalén, Anton; Gómez, Miguel-Ángel; Rey, Ezequiel – Research Quarterly for Exercise and Sport, 2019

In basketball, game-related statistics are the most common measure of performance. However, the literature assessing their reliability is scarce. Purpose: Analyze the number of games required to obtain a good relative and absolute reliability of teams' game-related statistics. Method: A total of 884 games from the 2015-2016 to 2017-2018 seasons of…

Descriptors: Team Sports, Statistics, Reliability, Foreign Countries

Evaluating the Effectiveness of the Expectation-Maximization (EM) Algorithm for Bayesian Network Calibration

Tingir, Seyfullah – ProQuest LLC, 2019

Educators use various statistical techniques to explain relationships between latent and observable variables. One way to model these relationships is to use Bayesian networks as a scoring model. However, adjusting the conditional probability tables (CPT-parameters) to fit a set of observations is still a challenge when using Bayesian networks. A…

Descriptors: Bayesian Statistics, Statistical Analysis, Scoring, Probability

Development and Validation of the Written Communication Assessment of the "HEIghten"® Outcomes Assessment Suite. Research Report. ETS RR-17-53

Peer reviewed
PDF on ERIC

Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017

Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…

Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment

Statistically Comparing the Performance of Multiple Automated Raters across Multiple Items

Peer reviewed

Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017

Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…

Descriptors: Automation, Scoring, Comparative Analysis, Test Items

Factor Structure, Stability, and Congruence in the Functional Movement Screen

Peer reviewed

Kelleher, Leila K.; Beach, Tyson A. C.; Frost, David M.; Johnson, Andrew M.; Dickey, James P. – Measurement in Physical Education and Exercise Science, 2018

The scoring scheme for the functional movement screen implicitly assumes that the factor structure is consistent, stable, and congruent across different populations. To determine if this is the case, we compared principal components analyses of three samples: a healthy, general population (n = 100), a group of varsity athletes (n = 101), and a…

Descriptors: Factor Structure, Test Reliability, Screening Tests, Motion

Validating Human and Automated Scoring of Essays against "True" Scores

Peer reviewed

Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018

In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…

Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing

Accuracy of a Classical Test Theory-Based Procedure for Estimating the Reliability of a Multistage Test. Research Report. ETS RR-17-02

Peer reviewed
PDF on ERIC

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017

The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…

Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing

Effects of Analytical and Holistic Scoring Patterns on Scorer Reliability in Biology Essay Tests

Peer reviewed
PDF on ERIC

Ebuoh, Casmir N. – World Journal of Education, 2018

The literature reveals that patterns/methods of scoring essay tests have been criticized as unreliable, and that this unreliability is likely to be greater in internal examinations than in external ones. The purpose of this study is to find out the effects of analytical and holistic scoring patterns on scorer reliability in…

Descriptors: Holistic Approach, Scoring, Essay Tests, Biology

The Impact of Rater Variability on Relationships among Different Effect-Size Indices for Inter-Rater Agreement between Human and Automated Essay Scoring

Yun, Jiyeo – ProQuest LLC, 2017

Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…

Descriptors: Interrater Reliability, Essays, Scoring, Evaluators

The Effect of Spoilers on the Enjoyment of Short Stories

Peer reviewed

Levine, William H.; Betzner, Michelle; Autry, Kevin S. – Discourse Processes: A multidisciplinary journal, 2016

Recent research has provided evidence that the information provided before a story--a spoiler--may increase the enjoyment of that story, perhaps by increasing the processing fluency experienced during reading. In one experiment, we tested the reliability of these findings by closely replicating existing methods and the generality of these findings…

Descriptors: Literary Genres, Reading Fluency, Reliability, Reading Processes

Evaluation of Different Scoring Rules for a Noncognitive Test in Development. Research Report. ETS RR-16-03

Peer reviewed
PDF on ERIC

Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick; Schmitt, Neal – ETS Research Report Series, 2016

In this report, systematic applications of statistical and psychometric methods are used to develop and evaluate scoring rules in terms of test reliability. Data collected from a situational judgment test are used to facilitate the comparison. For a well-developed item with appropriate keys (i.e., the correct answers), agreement among various…

Descriptors: Scoring, Test Reliability, Statistical Analysis, Psychometrics

Evaluating Comparative Judgment as an Approach to Essay Scoring

Peer reviewed

Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016

As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…

Descriptors: Essays, Scoring, Comparative Analysis, Evaluators

An Exploration of Alternative Scoring Methods Using Curriculum-Based Measurement in Early Writing

Peer reviewed

Allen, Abigail A.; Poch, Apryl L.; Lembke, Erica S. – Learning Disability Quarterly, 2018

This manuscript describes two empirical studies of alternative scoring procedures used with curriculum-based measurement in writing (CBM-W). Study 1 explored the technical adequacy of a trait-based rubric in first grade. Study 2 explored the technical adequacy of a trait-based rubric, production-dependent, and production-independent scores in…

Descriptors: Scoring, Alternative Assessment, Curriculum Based Assessment, Emergent Literacy
