Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
Kazuhiro Yamaguchi – Journal of Educational and Behavioral Statistics, 2025
This study proposes a Bayesian method for diagnostic classification models (DCMs) for a partially known Q-matrix setting between exploratory and confirmatory DCMs. This Q-matrix setting is practical and useful because test experts have pre-knowledge of the Q-matrix but cannot readily specify it completely. The proposed method employs priors for…
Descriptors: Models, Classification, Bayesian Statistics, Evaluation Methods
Walter M. Stroup; Anthony Petrosino; Corey Brady; Karen Duseau – North American Chapter of the International Group for the Psychology of Mathematics Education, 2023
Tests of statistical significance often play a decisive role in establishing the empirical warrant of evidence-based research in education. The results from pattern-based assessment items, as introduced in this paper, are categorical and multimodal and do not immediately support the use of measures of central tendency as typically related to…
Descriptors: Statistical Significance, Comparative Analysis, Research Methodology, Evaluation Methods
Klingbeil, David A.; Van Norman, Ethan R.; Osman, David J.; Berry-Corie, Kimberly; Carberry, Caroline K.; Kim, Jessica S. – Journal of Psychoeducational Assessment, 2023
Early identification of students needing additional support is a foundational component of Multi-Tiered Systems of Support (MTSS). Due to the resource-intensive nature of implementing MTSS, it is critical that universal screening procedures are maximally accurate and efficient. The purpose of this study was to compare the classification accuracy…
Descriptors: Comparative Analysis, Benchmarking, Evaluation Methods, Screening Tests
Carly Oddleifson; Stephen Kilgus; David A. Klingbeil; Alexander D. Latham; Jessica S. Kim; Ishan N. Vengurlekar – Grantee Submission, 2025
The purpose of this study was to conduct a conceptual replication of Pendergast et al.'s (2018) study that examined the diagnostic accuracy of a nomogram procedure, also known as a naive Bayesian approach. The specific naive Bayesian approach combined academic and social-emotional and behavioral (SEB) screening data to predict student performance…
Descriptors: Bayesian Statistics, Accuracy, Social Emotional Learning, Diagnostic Tests
Han, Lu – ProQuest LLC, 2022
This dissertation study explored the feasibility of using authenticated spoken texts to test L2 Chinese listening comprehension. The spoken texts used in the study were created using an "authenticating" technique, in which scripted spoken Chinese texts were infused with characteristics of real-world, unscripted spoken Chinese. In the…
Descriptors: Second Language Learning, Second Language Instruction, Listening Comprehension Tests, Chinese
Peña, Javier; Muthalib, Makii; Sampedro, Agurne; Cardoso-Botelho, Mafalda; Zabala, Oihana; Ibarretxe-Bilbao, Naroa; García-Guerrero, Acebo; Zubiaurre-Elorza, Leire; Ojeda, Natalia – Journal of Creative Behavior, 2023
Creativity is a fundamental human accomplishment from scientific advances to composing music. The left dorsolateral prefrontal cortex (DLPFC) and inferior frontal gyrus (IFG) are important metacontrol hubs in flexibility and persistence brain states, respectively. Those hubs are related to divergent thinking, insight problem-solving, and…
Descriptors: Creativity, Acoustics, Brain Hemisphere Functions, Comparative Analysis
Russell, Michael; Szendey, Olivia; Li, Zhushan – Educational Assessment, 2022
Recent research provides evidence that an intersectional approach to defining reference and focal groups results in a higher percentage of comparisons flagged for potential DIF. The study presented here examined the generalizability of this pattern across methods for examining DIF. While the level of DIF detection differed among the four methods…
Descriptors: Comparative Analysis, Item Analysis, Test Items, Test Construction
Katrina Fulcher-Rood; Anny Castilla-Earls – Language, Speech, and Hearing Services in Schools, 2023
Purpose: The purpose of this study was to compare child language assessment practices of speech-language pathologists (SLPs) working in school and non-school settings to determine if their place of employment impacts the diagnostic decision-making process. Method: School-based SLPs (e.g., direct service providers employed in preschool and/or K-12…
Descriptors: Child Language, Speech Language Pathology, Language Tests, Allied Health Personnel
Kolarec, Biserka; Nincevic, Marina – International Society for Technology, Education, and Science, 2022
The object of research is a statistics exam that contains problem tasks. One examiner performed two exam evaluation methods to repeatedly evaluate the exam. The goal was to compare the methods for objectivity. One of the two exam evaluation methods we call a serial evaluation method. The serial evaluation method assumes evaluation of all exam…
Descriptors: Statistics Education, Mathematics Tests, Evaluation Methods, Test Construction
Davis, Kirsten A.; Jesiek, Brent K.; Knight, David B. – Journal of Engineering Education, 2023
Background: Engineers operate in an increasingly global environment, making it important that engineering students develop global engineering competency to prepare them for success in the workplace. To understand this learning, we need assessment approaches that go beyond traditional self-report surveys. A previous study (Jesiek et al.,…
Descriptors: Vignettes, Engineering Education, Study Abroad, Foreign Countries
Williamson, Joanna – Research Matters, 2022
Providing evidence that can inform awarding is an important application of Comparative Judgement (CJ) methods in high-stakes qualifications. The process of marking scripts is not changed, but CJ methods can assist in the maintenance of standards from one series to another by informing decisions about where to place grade boundaries or cut scores.…
Descriptors: Standards, Grading, Decision Making, Comparative Analysis
Gill, Tim – Research Matters, 2022
In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…
Descriptors: Comparative Analysis, Decision Making, Scripts, Standards
Frits F. B. Pals; Jos L. J. Tolboom; Cor J. M. Suhre – Canadian Journal of Science, Mathematics and Technology Education, 2023
To be able to support students' competence development in solving physics problems over the course of a lesson series effectively, teachers need a proper appreciation of students' deficiencies. As teachers commonly assess students' competence by means of written tests, teachers are challenged to interpret students' work on these tests and to…
Descriptors: Formative Evaluation, Problem Solving, Science Instruction, Diagnostic Tests
Sayac, Nathalie; Veldhuis, Michiel – International Journal of Science and Mathematics Education, 2022
We investigated French primary school teachers' assessment practice in mathematics. Using an online questionnaire on teachers' background, teaching, and grading practice, we were able to determine assessment profiles of 604 primary school teachers. As evidenced by the teachers' scores on the latent factors Assessment purposes, Assessment…
Descriptors: Foreign Countries, Elementary School Teachers, Mathematics Instruction, Evaluation Methods