ERIC - Search Results

Showing all 10 results

Effects of Using Double Ratings as Item Scores on IRT Proficiency Estimation

Peer reviewed

Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022

This article presents the performance of item response theory (IRT) models when double ratings are used as item scores over single ratings when rater effects are present. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…

Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy

Critiquing the Rationales for Using Comparative Judgement: A Call for Clarity

Peer reviewed

Kelly, Kate Tremain; Richardson, Mary; Isaacs, Talia – Assessment in Education: Principles, Policy & Practice, 2022

Comparative judgment is gaining popularity as an assessment tool, including for high-stakes testing purposes, despite relatively little research on the use of the technique. Advocates claim two main rationales for its use: that comparative judgment is valid because humans are better at comparative than absolute judgment, and because it distils the…

Descriptors: Comparative Analysis, Evaluation Methods, Evaluative Thinking, High Stakes Tests

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

Graders of the Future: Comparing the Consistency and Accuracy of GPT4 and Pre-Service Teachers in Physics Essay Question Assessments

Peer reviewed

Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025

As the development and application of large language models (LLMs) in physics education progress, the well-known AI-based chatbot ChatGPT4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools in practical educational assessment carries profound significance. This study explored the comparative…

Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy

What Is an Evaluator?

McCabe, J. J. C. – British Journal of Teacher Education, 1980

Evaluation is discussed as a trade. An evaluator must be able to function as tradesman, conductor, and interpreter. A typical evaluation task is defined, skills required by the evaluator are listed, and key questions used in evaluation are discussed. (Author/JN)

Descriptors: Comparative Analysis, Content Analysis, Course Evaluation, Educational Assessment

Violations of Evaluation Standards: Frequency and Seriousness of Occurrence.

Peer reviewed

Newman, Dianna L.; Brown, Robert D. – Evaluation Review, 1992

Survey items representing 30 standards of the Joint Committee on Standards for Educational Evaluation were used to examine violations in program evaluations. Twenty-nine novice graduate students, 57 graduate students with moderate knowledge about evaluation, and 61 expert evaluators viewed propriety and accuracy standards as suffering the most…

Descriptors: Comparative Analysis, Educational Assessment, Ethics, Evaluation Criteria

Alternative Procedures for Integrating Multidimensional Evaluations of Schools: An Experimental Comparison.

Jaeger, Richard M.; Usher, Claire H. – 1991

This paper reports on a study of the foundation and application of two procedures used to specify appropriate weights to be applied to components in determining the overall quality of a school. These procedures are multiattribute utility technology (MAUT) and policy capturing, and the paper presents the results of applying them, using key…

Descriptors: Achievement Tests, Comparative Analysis, Curriculum Evaluation, Educational Assessment

Adjustments for Rater Effects in Performance Assessment.

Peer reviewed

Houston, Walter M.; And Others – Applied Psychological Measurement, 1991

The effectiveness of alternative procedures to correct for rater leniency/stringency effects was studied when true scores were known. Ordinary least squares, weighted least squares, and imputation of the missing data consistently outperformed averaging the observed ratings; and the imputation technique was superior to the least squares methods.…

Descriptors: Comparative Analysis, Computer Simulation, Educational Assessment, Equations (Mathematics)

Cross-State Comparability of Judgments of Student Writing: Reports from the New Standards Project.

Linn, Robert L.; And Others – 1991

The New Standards Project is a joint effort of the Learning Research and Development Center (LRDC) at the University of Pittsburgh (Pennsylvania) and the National Center on Education and the Economy toward creation of a national examination system based on performance assessments. This study explored the feasibility of comparing performance on…

Descriptors: Comparative Analysis, Correlation, Educational Assessment, Elementary Secondary Education

Educational Rankings of Higher Education: Fact or Fiction?

Hattendorf, Lynn C. – 1996

Since educational statistics, which are relatively easy to obtain, can only attempt to measure "quality," this paper asks how quality in higher education is assessed and how educational rankings, which are defined as benchmarks or attempts to measure, contribute to this process. The paper notes that while attempts to rank institutions of…

Descriptors: Citation Analysis, Comparative Analysis, Data Interpretation, Educational Assessment
