Publication Date
In 2025: 0
Since 2024: 3
Since 2021 (last 5 years): 25
Since 2016 (last 10 years): 41
Since 2006 (last 20 years): 73
Descriptor
Comparative Analysis: 114
Item Analysis: 114
Test Validity: 57
Foreign Countries: 39
Test Items: 39
Validity: 36
Correlation: 27
Test Reliability: 25
Factor Analysis: 23
Scores: 23
Test Construction: 23
Author
Haladyna, Tom: 3
Roid, Gale: 2
Acar, Tülin: 1
Aesaert, Koen: 1
Afflerbach, Peter: 1
Agarwala, Rina: 1
Akhtar, Hanif: 1
Aleyna Altan: 1
Allgaier, Antje-Kathrin: 1
Argulewicz, Ed N.: 1
Aricak, O. Tolga: 1
Audience
Researchers: 1
Location
Germany: 5
Australia: 4
Turkey: 4
China: 3
United Kingdom (England): 3
Belgium: 2
Canada: 2
Spain: 2
Brazil: 1
Chile: 1
Europe: 1
David Bell; Vikki O'Neill; Vivienne Crawford – Practitioner Research in Higher Education, 2023
We compared the influence of an open-book, extended-duration format versus a closed-book, time-limited format on the reliability and validity of written assessments of pharmacology learning outcomes within our medical and dental courses. Our dental cohort undertakes a mid-year test (30 free-response short-answer questions, SAQs) and an end-of-year paper (4 SAQs,…
Descriptors: Undergraduate Students, Pharmacology, Pharmaceutical Education, Test Format
Ute Knoch; Jason Fan – Language Testing, 2024
While several test concordance tables have been published, the research underpinning such tables has rarely been examined in detail. This study aimed to survey the publicly available studies or documentation underpinning the test concordance tables of the providers of four major international language tests, all accepted by the Australian…
Descriptors: Language Tests, English, Test Validity, Item Analysis
Beisemann, Marie; Forthmann, Boris; Bürkner, Paul-Christian; Holling, Heinz – Journal of Creative Behavior, 2020
The Remote Associates Test (RAT; Mednick, 1962; Mednick & Mednick, 1967) is a commonly employed test of creative convergent thinking. The RAT is scored dichotomously, with correct answers scored as 1 and all other answers as 0. Based on recent research into the information processing underlying RAT performance, we argued that the…
Descriptors: Psychometrics, Scoring, Tests, Semantics
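Beisemann and colleagues' polytomous model itself is not reproduced in the abstract; the sketch below merely contrasts the traditional dichotomous scoring described there with a hypothetical partial-credit rule, using made-up responses and an invented set of "related" answers.

```python
# Dichotomous vs. partial-credit scoring of Remote Associates Test items.
# Minimal sketch with invented data; the partial-credit rule is illustrative,
# not the scoring model proposed by Beisemann et al. (2020).

responses = ["tape", "cream", "sled"]               # one participant's answers
correct   = ["tape", "cheese", "sled"]              # keyed solutions
related   = [{"glue"}, {"cream", "milk"}, {"ski"}]  # plausible near-misses (assumed)

# Traditional dichotomous scoring: 1 for the keyed answer, 0 otherwise.
dichotomous = [int(r == c) for r, c in zip(responses, correct)]

# Illustrative polytomous scoring: 2 = correct, 1 = semantically related, 0 = other.
polytomous = [2 if r == c else 1 if r in rel else 0
              for r, c, rel in zip(responses, correct, related)]

print(dichotomous)  # [1, 0, 1]
print(polytomous)   # [2, 1, 2]
```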
Seyda Aydin-Karaca; Mustafa Serdar Köksal; Bilkay Bi – Journal of Psychoeducational Assessment, 2024
This study aimed to develop a parent rating scale (PRSG) for screening children for giftedness ahead of the formal identification process. The participants were 255 parents of gifted and non-gifted students. The PRSG, consisting of 30 items, was created by consulting parents and reviewing existing instruments in the literature. As…
Descriptors: Rating Scales, Parent Attitudes, Scores, Comparative Analysis
Finch, Holmes – Applied Measurement in Education, 2022
Much research has been devoted to identification of differential item functioning (DIF), which occurs when the item responses for individuals from two groups differ after they are conditioned on the latent trait being measured by the scale. There has been less work examining differential step functioning (DSF), which is present for polytomous…
Descriptors: Comparative Analysis, Item Response Theory, Item Analysis, Simulation
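Since the abstract contrasts differential step functioning with the more familiar item-level DIF, a compact illustration of the classic Mantel-Haenszel DIF statistic for a dichotomous item may help; the simulated data and rest-score stratification below are assumptions for the sketch, not Finch's DSF procedure.

```python
# Mantel-Haenszel DIF check for one dichotomous item, stratifying on the
# rest score. Simulated data; illustrates classic item-level DIF, not the
# differential step functioning (DSF) method examined by Finch (2022).
import math, random

random.seed(1)

def p_correct(theta, b):
    return 1 / (1 + math.exp(-(theta - b)))

def simulate(n, group):
    # Item 0 is 0.5 logits harder for the focal group, i.e., it shows DIF.
    data = []
    for _ in range(n):
        theta = random.gauss(0, 1)
        row = []
        for j in range(5):
            b = 0.5 if (j == 0 and group == "focal") else 0.0
            row.append(int(random.random() < p_correct(theta, b)))
        data.append(row)
    return data

ref, foc = simulate(2000, "ref"), simulate(2000, "focal")

# Pool 2x2 tables (group x item score) across rest-score strata 0..4.
num = den = 0.0
for k in range(5):
    A = sum(1 for r in ref if sum(r[1:]) == k and r[0] == 1)
    B = sum(1 for r in ref if sum(r[1:]) == k and r[0] == 0)
    C = sum(1 for r in foc if sum(r[1:]) == k and r[0] == 1)
    D = sum(1 for r in foc if sum(r[1:]) == k and r[0] == 0)
    N = A + B + C + D
    if N:
        num += A * D / N
        den += B * C / N

alpha_mh = num / den                    # common odds ratio; 1.0 = no DIF
delta_mh = -2.35 * math.log(alpha_mh)   # ETS delta scale
print(round(alpha_mh, 2), round(delta_mh, 2))
```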
Wu, Tong – ProQuest LLC, 2023
This three-article dissertation aims to address three methodological challenges to ensure comparability in educational research, including scale linking, test equating, and propensity score (PS) weighting. The first study intends to improve test scale comparability by evaluating the effect of six missing data handling approaches, including…
Descriptors: Educational Research, Comparative Analysis, Equated Scores, Weighted Scores
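As a concrete illustration of the scale-linking step mentioned in the abstract, the sketch below applies mean-sigma linking to place one form's item difficulties on another form's scale; the b-parameters are invented and this is a generic textbook method, not the dissertation's own procedure.

```python
# Mean-sigma linking: place Form Y item difficulties on the Form X scale
# using common (anchor) items. Invented b-parameters, for illustration only.
from statistics import mean, pstdev

b_anchor_x = [-1.2, -0.4, 0.3, 0.9, 1.6]  # anchor difficulties, Form X scale
b_anchor_y = [-1.0, -0.1, 0.5, 1.2, 1.9]  # same items estimated on Form Y

# Linear transformation y -> A*y + B matching the anchors' mean and SD.
A = pstdev(b_anchor_x) / pstdev(b_anchor_y)
B = mean(b_anchor_x) - A * mean(b_anchor_y)

b_y_on_x = [A * b + B for b in b_anchor_y]
print(f"A={A:.3f}, B={B:.3f}")
print([round(b, 2) for b in b_y_on_x])
```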
Celen, Umit; Aybek, Eren Can – International Journal of Assessment Tools in Education, 2022
Item analysis is performed by developers as an integral part of the scale development process: based on it, items are excluded from the scale prior to factor analysis. Existing item discrimination indices are calculated from correlations, yet items with different response patterns are likely to have a similar item…
Descriptors: Likert Scales, Factor Analysis, Item Analysis, Correlation
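The correlation-based discrimination index the authors refer to is typically the corrected item-total (point-biserial) correlation; a small sketch with invented Likert responses shows the computation.

```python
# Corrected item-total correlation: correlate each item with the total score
# computed from the remaining items. Invented 5-point Likert data.
from statistics import mean

def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Five respondents x four items (1-5 responses).
data = [
    [5, 4, 4, 2],
    [4, 4, 3, 3],
    [3, 3, 3, 4],
    [2, 2, 1, 4],
    [1, 2, 2, 5],
]

for j in range(len(data[0])):
    item = [row[j] for row in data]
    rest = [sum(row) - row[j] for row in data]  # total with item j removed
    print(f"item {j + 1}: r = {pearson(item, rest):+.2f}")
```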
Katrin Klingbeil; Fabian Rösken; Bärbel Barzel; Florian Schacht; Kaye Stacey; Vicki Steinle; Daniel Thurm – ZDM: Mathematics Education, 2024
Assessing students' (mis)conceptions is a challenging task for teachers as well as for researchers. While individual assessment, for example through interviews, can provide deep insights into students' thinking, this is very time-consuming and therefore not feasible for whole classes or even larger settings. For those settings, automatically…
Descriptors: Multiple Choice Tests, Formative Evaluation, Mathematics Tests, Misconceptions
Yoo Jeong Jang – ProQuest LLC, 2022
Despite the increasing demand for diagnostic information, observed subscores have often been reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on classical test theory (CTT) and item response theory (IRT) frameworks have been proposed to improve the quality of subscores. More recently, DCM has…
Descriptors: Classification, Accuracy, Item Response Theory, Correlation
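As one example of the CTT reliability indices at issue, the sketch below computes Cronbach's alpha for a short subscale from invented dichotomous responses; it is generic CTT, not the diagnostic classification approach the dissertation develops.

```python
# Cronbach's alpha for one subscale: k/(k-1) * (1 - sum(item variances) /
# variance of the total). Invented 0/1 responses, for illustration only.
from statistics import pvariance

# Six respondents x three items forming one subscale.
sub = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]

k = len(sub[0])
item_vars = [pvariance([row[j] for row in sub]) for j in range(k)]
total_var = pvariance([sum(row) for row in sub])
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"alpha = {alpha:.2f}")
```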
Uminski, Crystal; Hubbard, Joanna K.; Couch, Brian A. – CBE - Life Sciences Education, 2023
Biology instructors use concept assessments in their courses to gauge student understanding of important disciplinary ideas. Instructors can choose to administer concept assessments based on participation (i.e., lower stakes) or the correctness of responses (i.e., higher stakes), and students can complete the assessment in an in-class or…
Descriptors: Biology, Science Tests, High Stakes Tests, Scores
Liu, Xiaowen; Jane Rogers, H. – Educational and Psychological Measurement, 2022
Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…
Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity
Robie, Chet; Meade, Adam W.; Risavy, Stephen D.; Rasheed, Sabah – Educational and Psychological Measurement, 2022
The effects of different response option orders on survey responses have been studied extensively. The typical research design involves examining the differences in response characteristics between conditions with the same item stems and response option orders that differ in valence--either incrementally arranged (e.g., strongly disagree to…
Descriptors: Likert Scales, Psychometrics, Surveys, Responses
Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…
Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations
Jurij Selan; Mira Metljak – Center for Educational Policy Studies Journal, 2023
Since research integrity is not external to research but an integral part of it, it should be integrated into research training. However, several obstacles to contemporary research integrity education exist. To address them, we have developed a competency profile for teaching and learning research integrity based on four assumptions: 1) to…
Descriptors: Profiles, Integrity, Content Validity, Questionnaires
Phusee-orn, Songsak; Pongteerawut, Sasipat – Journal of Educational Issues, 2022
The research aimed to study and compare the futuristic thinking of Grade 9 students attending schools of different sizes. The sample consisted of Grade 9 students in the second semester of the 2020 academic year in Sisaket Province, Thailand. A multi-stage random sampling technique was used to select 860 students from 12…
Descriptors: School Size, Futures (of Society), Correlation, Likert Scales