NotesFAQContact Us
Collection
Advanced
Search Tips
Laws, Policies, & Programs
Elementary and Secondary…1
Race to the Top1
What Works Clearinghouse Rating
Showing 1 to 15 of 90 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Susan K. Johnsen – Gifted Child Today, 2025
The author provides information about reliability and areas that educators should examine in determining if an assessment is consistent and trustworthy for use, and how it should be interpreted in making decisions about students. Reliability areas that are discussed in the column include internal consistency, test-retest or stability, inter-scorer…
Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Jonas Flodén – British Educational Research Journal, 2025
This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…
Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Ole J. Kemi – Advances in Physiology Education, 2025
Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…
Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022
The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…
Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Joyce M. W. Moonen-van Loon; Jeroen Donkers – Practical Assessment, Research & Evaluation, 2025
The reliability of assessment tools is critical for accurately monitoring student performance in various educational contexts. When multiple assessments are combined to form an overall evaluation, each assessment serves as a data point contributing to the student's performance within a broader educational framework. Determining composite…
Descriptors: Programming Languages, Reliability, Evaluation Methods, Student Evaluation
Peer reviewed Peer reviewed
Direct linkDirect link
Rosanna Cole – Sociological Methods & Research, 2024
The use of inter-rater reliability (IRR) methods may provide an opportunity to improve the transparency and consistency of qualitative case study data analysis in terms of the rigor of how codes and constructs have been developed from the raw data. Few articles on qualitative research methods in the literature conduct IRR assessments or neglect to…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Research Methodology
Wenjing Guo – ProQuest LLC, 2021
Constructed response (CR) items are widely used in large-scale testing programs, including the National Assessment of Educational Progress (NAEP) and many district and state-level assessments in the United States. One unique feature of CR items is that they depend on human raters to assess the quality of examinees' work. The judgment of human…
Descriptors: National Competency Tests, Responses, Interrater Reliability, Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Bang Quan Zheng; Peter M. Bentler – Structural Equation Modeling: A Multidisciplinary Journal, 2025
This paper aims to advocate for a balanced approach to model fit evaluation in structural equation modeling (SEM). The ongoing debate surrounding chi-square test statistics and fit indices has been characterized by ambiguity and controversy. Despite the acknowledged limitations of relying solely on the chi-square test, its careful application can…
Descriptors: Monte Carlo Methods, Structural Equation Models, Goodness of Fit, Robustness (Statistics)
Peer reviewed Peer reviewed
Direct linkDirect link
Amanda Timmerman; Vasiliki Totsika; Valerie Lye; Laura Crane; Audrey Linden; Elizabeth Pellicano – Autism: The International Journal of Research and Practice, 2025
Autistic people are more likely to have co-occurring mental health conditions compared to the general population, and mental health interventions have been identified as a top research priority by autistic people and the wider autism community. Autistic adults have also communicated that quality of life is the outcome that matters most to them in…
Descriptors: Adults, Autism Spectrum Disorders, Quality of Life, Randomized Controlled Trials
Peer reviewed Peer reviewed
Direct linkDirect link
Breanne J. Byiers; Alyssa M. Merbler; Chantel C. Burkitt; Frank J. Symons – American Journal on Intellectual and Developmental Disabilities, 2025
Sleep problems are common in Rett syndrome and other neurogenetic syndromes. Actigraphy is a cost-effective, objective method for measuring sleep. Current guidelines require caregiver-reported bed and wake times to facilitate actigraphy data scoring. The current study examined missingness and consistency of caregiver-reported bed and wake times…
Descriptors: Sleep, Neurodevelopmental Disorders, Psychomotor Skills, Genetic Disorders
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Regional Educational Laboratory Mid-Atlantic, 2024
These are the appendixes for the report, "Stabilizing School Performance Indicators in New Jersey to Reduce the Effect of Random Error." This study applied a stabilization model called Bayesian hierarchical modeling to group-­level data (with groups assigned according to demographic designations) within schools in New Jersey with the aim…
Descriptors: Institutional Evaluation, Elementary Secondary Education, Bayesian Statistics, Test Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Peer reviewed Peer reviewed
Direct linkDirect link
Pere J. Ferrando; David Navarro-González; Fabia Morales-Vives – Educational and Psychological Measurement, 2025
The problem of local item dependencies (LIDs) is very common in personality and attitude measures, particularly in those that measure narrow-bandwidth dimensions. At the structural level, these dependencies can be modeled by using extended factor analytic (FA) solutions that include correlated residuals. However, the effects that LIDs have on the…
Descriptors: Scores, Accuracy, Evaluation Methods, Factor Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Robert Meyer; Sara Hu; Michael Christian – Society for Research on Educational Effectiveness, 2023
Background: This paper develops a new method to estimate quasi-experimental evaluation models when it is necessary to control for measurement error in predictors and individual assignment to the treatment group is based on these same fallible variables. A major methodological finding of the study is that standard methods of estimating models that…
Descriptors: Error of Measurement, Measurement Techniques, Elementary Secondary Education, Report Cards
Peer reviewed Peer reviewed
Direct linkDirect link
Ke-Hai Yuan; Zhiyong Zhang; Lijuan Wang – Grantee Submission, 2024
Mediation analysis plays an important role in understanding causal processes in social and behavioral sciences. While path analysis with composite scores was criticized to yield biased parameter estimates when variables contain measurement errors, recent literature has pointed out that the population values of parameters of latent-variable models…
Descriptors: Structural Equation Models, Path Analysis, Weighted Scores, Comparative Testing
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5  |  6