Vali, Yasaman; Yang, Bada; Olsen, Maria; Leeflang, Mariska M. G.; Bossuyt, Patrick M. M. – Research Synthesis Methods, 2021
Comparative accuracy studies evaluate the relative performance of two or more diagnostic tests. As any other form of research, such studies should be reported in an informative manner, to allow replication and to be useful for decision-making. In this study we aimed to assess whether and how components of test comparisons were reported in…
Descriptors: Comparative Analysis, Accuracy, Diagnostic Tests, Decision Making
Ute Knoch; Jason Fan – Language Testing, 2024
While several test concordance tables have been published, the research underpinning such tables has rarely been examined in detail. This study aimed to survey the publicly available studies or documentation underpinning the test concordance tables of the providers of four major international language tests, all accepted by the Australian…
Descriptors: Language Tests, English, Test Validity, Item Analysis
Braumoeller, Bear F. – Sociological Methods & Research, 2017
Fuzzy-set qualitative comparative analysis (fsQCA) has become one of the most prominent methods in the social sciences for capturing causal complexity, especially for scholars with small- and medium-"N" data sets. This research note explores two key assumptions in fsQCA's methodology for testing for necessary and sufficient…
Descriptors: Qualitative Research, Comparative Analysis, Social Science Research, Research Methodology
Yangqiuting Li; Chandralekha Singh – Physical Review Physics Education Research, 2025
Research-based multiple-choice questions implemented in class with peer instruction have been shown to be an effective tool for improving students' engagement and learning outcomes. Moreover, multiple-choice questions that are carefully sequenced to build on each other can be particularly helpful for students to develop a systematic understanding…
Descriptors: Physics, Science Instruction, Science Tests, Multiple Choice Tests
Wilkin, John P. – College & Research Libraries, 2017
The 1961 Copyright Office study on renewals, authored by Barbara Ringer, has cast an outsized influence on discussions of the U.S. 1923-1963 public domain. As more concrete data emerge from initiatives such as the large-scale determination process in the Copyright Review Management System (CRMS) project, questions are raised about the reliability…
Descriptors: Comparative Analysis, Copyrights, Misconceptions, Test Reliability
Wagemaker, Hans, Ed. – International Association for the Evaluation of Educational Achievement, 2020
Although International Association for the Evaluation of Educational Achievement-pioneered international large-scale assessment (ILSA) of education is now a well-established science, non-practitioners and many users often substantially misunderstand how large-scale assessments are conducted, what questions and challenges they are designed to…
Descriptors: International Assessment, Achievement Tests, Educational Assessment, Comparative Analysis
Duncan, Greg J.; Engel, Mimi; Claessens, Amy; Dowsett, Chantelle J. – Developmental Psychology, 2014
Replications and robustness checks are key elements of the scientific method and a staple in many disciplines. However, leading journals in developmental psychology rarely include explicit replications of prior research conducted by different investigators, and few require authors to establish in their articles or online appendices that their key…
Descriptors: Replication (Evaluation), Robustness (Statistics), Developmental Psychology, Educational Research
Elicited Imitation as a Measure of Second Language Proficiency: A Narrative Review and Meta-Analysis
Yan, Xun; Maeda, Yukiko; Lv, Jing; Ginther, April – Language Testing, 2016
Elicited imitation (EI) has been widely used to examine second language (L2) proficiency and development and was an especially popular method in the 1970s and early 1980s. However, as the field embraced more communicative approaches to both instruction and assessment, the use of EI diminished, and the construct-related validity of EI scores as a…
Descriptors: Second Language Learning, Language Proficiency, Meta Analysis, Effect Size
Bao, Lei; Koenig, Kathleen; Xiao, Yang; Fritchman, Joseph; Zhou, Shaona; Chen, Cheng – Physical Review Physics Education Research, 2022
Abilities in scientific thinking and reasoning have been emphasized as core areas of initiatives, such as the Next Generation Science Standards or the College Board Standards for College Success in Science, which focus on the skills the future will demand of today's students. Although there is rich literature on studies of how these abilities…
Descriptors: Physics, Science Instruction, Teaching Methods, Thinking Skills
St. Clair, Travis; Hallberg, Kelly; Cook, Thomas D. – Journal of Educational and Behavioral Statistics, 2016
We explore the conditions under which short, comparative interrupted time-series (CITS) designs represent valid alternatives to randomized experiments in educational evaluations. To do so, we conduct three within-study comparisons, each of which uses a unique data set to test the validity of the CITS design by comparing its causal estimates to…
Descriptors: Research Methodology, Randomized Controlled Trials, Comparative Analysis, Time
Ford, Jeremy W.; Conoyer, Sarah J.; Lembke, Erica S.; Smith, R. Alex; Hosp, John L. – Assessment for Effective Intervention, 2018
In the present study, two types of curriculum-based measurement (CBM) tools in science, Vocabulary Matching (VM) and Statement Verification for Science (SV-S), a modified Sentence Verification Technique, were compared. Specifically, this study aimed to determine whether the format of information presented (i.e., SV-S vs. VM) produces differences…
Descriptors: Curriculum Based Assessment, Evaluation Methods, Measurement Techniques, Comparative Analysis
Yavuz, Aysun – International Education Studies, 2012
In this paper, the writer discusses the philosophical underpinnings of the two dominant research methods in social sciences: the quantitative and qualitative paradigms. The natures of the two paradigms are quite different, which leads many researchers to discuss these issues in a comparative way. This paper tackles the knowledge and understanding of…
Descriptors: Teacher Educators, Social Science Research, Research Methodology, Comparative Analysis
Kimura, Daisuke; Mattson, Nikki; Amory, Michael – TESOL Journal, 2018
Whereas previous research revealed the interactional variability occurring in oral assessments, demonstrating how it could undermine test validity (e.g., A. Brown, 2003), little has been published regarding how language programs can mine and analyze video recordings of in-house oral placement tests for validation purposes. Addressing this need,…
Descriptors: Oral Language, Language Tests, Student Placement, English (Second Language)
Hays, Danica G.; Wood, Chris – Measurement and Evaluation in Counseling and Development, 2017
We present considerations for validity when a population outside of a normed sample is assessed and those data are interpreted. Using a career group counseling example exploring life satisfaction changes as evidenced by the Quality of Life Inventory (Frisch, 1994), we showcase qualitative and quantitative approaches to explore how normative data…
Descriptors: Data Interpretation, Scores, Quality of Life, Life Satisfaction
Wright, Paul M.; Irwin, Carol – Measurement in Physical Education and Exercise Science, 2018
National content standards in PE address responsibility; however, learning outcomes and teacher effectiveness in this area remain poorly defined. This study employed the Social and Emotional Learning framework and a teaching personal and social responsibility (TPSR) model fidelity instrument to address this gap. Our purpose was to examine the…
Descriptors: Observation, Teacher Effectiveness, Teacher Responsibility, Physical Education Teachers