Showing all 10 results
Peer reviewed
Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022
This article presents the performance of item response theory (IRT) models when double ratings are used as item scores over single ratings when rater effects are present. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…
Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy
Peer reviewed
Kelly, Kate Tremain; Richardson, Mary; Isaacs, Talia – Assessment in Education: Principles, Policy & Practice, 2022
Comparative judgment is gaining popularity as an assessment tool, including for high-stakes testing purposes, despite relatively little research on the use of the technique. Advocates claim two main rationales for its use: that comparative judgment is valid because humans are better at comparative than absolute judgment, and because it distils the…
Descriptors: Comparative Analysis, Evaluation Methods, Evaluative Thinking, High Stakes Tests
Peer reviewed
Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024
Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability
Peer reviewed
PDF on ERIC
Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025
As the development and application of large language models (LLMs) in physics education progress, the well-known AI-based chatbot ChatGPT4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools in practical educational assessment carries profound significance. This study explored the comparative…
Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy
McCabe, J. J. C. – British Journal of Teacher Education, 1980
Evaluation is discussed as a trade. An evaluator must be able to function as tradesman, conductor, and interpreter. A typical evaluation task is defined, skills required by the evaluator are listed, and key questions used in evaluation are discussed. (Author/JN)
Descriptors: Comparative Analysis, Content Analysis, Course Evaluation, Educational Assessment
Peer reviewed
Newman, Dianna L.; Brown, Robert D. – Evaluation Review, 1992
Survey items representing 30 standards of the Joint Committee on Standards for Educational Evaluation were used to examine violations in program evaluations. Twenty-nine novice graduate students, 57 graduate students with moderate knowledge about evaluation, and 61 expert evaluators viewed propriety and accuracy standards as suffering the most…
Descriptors: Comparative Analysis, Educational Assessment, Ethics, Evaluation Criteria
PDF pending restoration
Jaeger, Richard M.; Usher, Claire H. – 1991
This paper reports on a study of the foundation and application of two procedures used to specify appropriate weights to be applied to components in determining the overall quality of a school. These procedures are multiattribute utility technology (MAUT) and policy capturing, and the paper presents the results of applying them, using key…
Descriptors: Achievement Tests, Comparative Analysis, Curriculum Evaluation, Educational Assessment
Peer reviewed
Houston, Walter M.; And Others – Applied Psychological Measurement, 1991
The effectiveness of alternative procedures to correct for rater leniency/stringency effects was studied when true scores were known. Ordinary least squares, weighted least squares, and imputation of the missing data consistently outperformed averaging the observed ratings; and the imputation technique was superior to the least squares methods.…
Descriptors: Comparative Analysis, Computer Simulation, Educational Assessment, Equations (Mathematics)
Linn, Robert L.; And Others – 1991
The New Standards Project is a joint effort of the Learning Research and Development Center (LRDC) at the University of Pittsburgh (Pennsylvania) and the National Center on Education and the Economy toward creation of a national examination system based on performance assessments. This study explored the feasibility of comparing performance on…
Descriptors: Comparative Analysis, Correlation, Educational Assessment, Elementary Secondary Education
Hattendorf, Lynn C. – 1996
Since educational statistics, which are relatively easy to obtain, can only attempt to measure "quality," this paper asks how quality in higher education is assessed and how educational rankings, which are defined as benchmarks or attempts to measure, contribute to this process. The paper notes that while attempts to rank institutions of…
Descriptors: Citation Analysis, Comparative Analysis, Data Interpretation, Educational Assessment