Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 3 |
Since 2006 (last 20 years) | 3 |
Descriptor
Comparative Analysis | 6 |
Evaluators | 6 |
Classification | 3 |
Decision Making | 2 |
Graphs | 2 |
Item Analysis | 2 |
Item Response Theory | 2 |
Mathematical Models | 2 |
Scoring | 2 |
Test Items | 2 |
Test Validity | 2 |
Source
Educational and Psychological Measurement | 6 |
Author
Bachelor, Patricia A. | 1 |
Baldwin, Peter | 1 |
Clauser, Jerome C. | 1 |
Feingold, Marcia | 1 |
Hambleton, Ronald K. | 1 |
Khorramdel, Lale | 1 |
Lamprianou, Iasonas | 1 |
Tyack, Lillian | 1 |
Woehr, David J. | 1 |
von Davier, Matthias | 1 |
Publication Type
Journal Articles | 6 |
Reports - Research | 5 |
Reports - Evaluative | 1 |
Education Level
Elementary Secondary Education | 1 |
Assessments and Surveys
Trends in International Mathematics and Science Study | 1 |
United States Medical Licensing Examination | 1 |
von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023
Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We compare the classification accuracy of convolutional and feed-forward approaches. Our…
Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education
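
The entry above compares convolutional and feed-forward classifiers for scored image responses. As a minimal sketch of that kind of comparison only, assuming TensorFlow/Keras and small randomly generated grayscale arrays standing in for the TIMSS drawings (the actual item data, architectures, and hyperparameters are not described in the abstract):

# Hypothetical sketch: compare a small CNN and a feed-forward network
# on dummy grayscale "drawing" images labeled correct/incorrect.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.random((200, 28, 28, 1)).astype("float32")   # stand-in images
y = rng.integers(0, 2, size=200)                      # stand-in scores

cnn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

ffn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

for name, model in [("cnn", cnn), ("ffn", ffn)]:
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(X, y, epochs=2, verbose=0)
    _, acc = model.evaluate(X, y, verbose=0)
    print(name, "training-set accuracy:", round(float(acc), 3))

On real response images, the convolutional model can exploit spatial structure that the flattened feed-forward model discards, which is the contrast the study evaluates.
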
Lamprianou, Iasonas – Educational and Psychological Measurement, 2018
It is common practice for assessment programs to organize qualifying sessions during which the raters (often known as "markers" or "judges") demonstrate their consistency before operational rating commences. Because of the high-stakes nature of many rating activities, the research community tends to continuously explore new…
Descriptors: Social Networks, Network Analysis, Comparative Analysis, Innovation
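
Lamprianou's article applies network analysis to rater data; the truncated abstract does not give the method. Purely as an illustration of one common approach, the sketch below treats raters as nodes and pairwise percent agreement on commonly scored responses as edge weights, using networkx (the ratings matrix is invented):

# Illustrative only: pairwise rater agreement as a weighted network.
# Rows = responses, columns = raters; values are invented 0/1 scores.
import numpy as np
import networkx as nx

ratings = np.array([
    [1, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
])

G = nx.Graph()
n_raters = ratings.shape[1]
for i in range(n_raters):
    for j in range(i + 1, n_raters):
        agreement = float(np.mean(ratings[:, i] == ratings[:, j]))
        G.add_edge(f"rater_{i}", f"rater_{j}", weight=agreement)

# Raters with low average agreement might warrant follow-up before operational rating.
for rater in G.nodes:
    mean_agree = np.mean([d["weight"] for _, _, d in G.edges(rater, data=True)])
    print(rater, round(float(mean_agree), 2))
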
Clauser, Jerome C.; Hambleton, Ronald K.; Baldwin, Peter – Educational and Psychological Measurement, 2017
The Angoff standard setting method relies on content experts to review exam items and make judgments about the performance of the minimally proficient examinee. Unfortunately, at times content experts may have gaps in their understanding of specific exam content. These gaps are particularly likely to occur when the content domain is broad and/or…
Descriptors: Scores, Item Analysis, Classification, Decision Making
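
In the Angoff method referenced above, each panelist estimates, for every item, the probability that a minimally proficient examinee would answer correctly; a common way to derive the cut score is to average the ratings per item and sum across items. A minimal sketch of that arithmetic, with invented ratings:

# Toy Angoff cut-score calculation: rows = panelists, columns = items.
# Each entry is a judged probability that a minimally proficient
# examinee answers the item correctly (values invented).
import numpy as np

ratings = np.array([
    [0.60, 0.75, 0.40, 0.90],
    [0.55, 0.80, 0.35, 0.85],
    [0.65, 0.70, 0.50, 0.95],
])

item_means = ratings.mean(axis=0)   # average judgment per item
cut_score = item_means.sum()        # expected raw score of the borderline examinee
print("Per-item means:", np.round(item_means, 2))
print("Angoff cut score:", round(float(cut_score), 2))

Gaps in panelists' content knowledge, the focus of the article, would show up as unreliable per-item judgments feeding into this sum.
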

Feingold, Marcia – Educational and Psychological Measurement, 1992
A formula that is simpler to calculate than the Kappa statistic of J. Cohen is presented for the situation where each subject in an experiment is rated on a nominal scale by two or more judges. Equivalence with Pearson's chi-square statistic in this situation is demonstrated. (SLD)
Descriptors: Chi Square, Comparative Analysis, Data Analysis, Equations (Mathematics)
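
Feingold's simplified statistic is not reproduced in the truncated abstract. For reference, the Cohen's kappa it is positioned against corrects observed agreement for chance agreement; for two raters,

\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad p_e = \sum_{k} p_{1k}\, p_{2k}

where p_o is the observed proportion of agreement and p_{1k}, p_{2k} are the proportions of cases that raters 1 and 2 assign to category k.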

Bachelor, Patricia A. – Educational and Psychological Measurement, 1989
The study examined whether 12 university students could validly discriminate between the correctness and originality of responses by 150 elementary school children on 3 creativity tests, as an assessment of the tests' discriminant validity via the multitrait-multimethod procedure. There was compelling evidence for convergent and discriminant…
Descriptors: College Students, Comparative Analysis, Creativity Research, Creativity Tests
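
The multitrait-multimethod (MTMM) procedure named above inspects a correlation matrix of traits crossed with measurement methods: convergent validity is supported when different methods measuring the same trait correlate highly, and discriminant validity when different traits correlate comparatively weakly. A rough sketch with invented scores (two traits, correctness and originality, measured by two hypothetical methods):

# Toy MTMM check: same-trait/different-method correlations (convergent)
# versus different-trait correlations (discriminant). Data are invented.
import numpy as np

rng = np.random.default_rng(1)
n = 150
correctness_true = rng.normal(size=n)
originality_true = rng.normal(size=n)

# Two "methods" (e.g., different rater groups), each measuring both traits with noise.
scores = {
    ("correctness", "method_A"): correctness_true + 0.5 * rng.normal(size=n),
    ("correctness", "method_B"): correctness_true + 0.5 * rng.normal(size=n),
    ("originality", "method_A"): originality_true + 0.5 * rng.normal(size=n),
    ("originality", "method_B"): originality_true + 0.5 * rng.normal(size=n),
}

def r(a, b):
    return float(np.corrcoef(scores[a], scores[b])[0, 1])

convergent = r(("correctness", "method_A"), ("correctness", "method_B"))
discriminant = r(("correctness", "method_A"), ("originality", "method_B"))
print("Same trait, different methods (should be high):", round(convergent, 2))
print("Different traits, different methods (should be lower):", round(discriminant, 2))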

Woehr, David J.; And Others – Educational and Psychological Measurement, 1991
Methods for setting cutoff scores based on criterion performance, normative comparison, and absolute judgment were compared using scores on a multiple-choice psychology examination taken by 121 undergraduates, with 251 undergraduates as a comparison group. Cutoff scores from all methods fell within the standard error of measurement of one another. Implications of differences for decision…
Descriptors: Comparative Analysis, Concurrent Validity, Content Validity, Cutting Scores
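
Whether cut scores from different methods differ meaningfully is commonly judged against the test's standard error of measurement, SEM = SD * sqrt(1 - reliability). A small sketch of that comparison, with invented standard deviation, reliability, and cut scores:

# Toy comparison of cut scores against the standard error of measurement.
# SD and reliability (e.g., KR-20 or coefficient alpha) are invented values.
import math

sd_total = 8.0        # standard deviation of total test scores
reliability = 0.85    # internal-consistency reliability estimate
sem = sd_total * math.sqrt(1 - reliability)

cut_scores = {"criterion-based": 61.0, "normative": 63.5, "absolute judgment": 62.0}
reference = cut_scores["criterion-based"]

print("SEM:", round(sem, 2))
for method, cut in cut_scores.items():
    within = abs(cut - reference) <= sem
    print(f"{method}: {cut} (within one SEM of criterion-based: {within})")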