Publication Date
In 2025 | 1 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 7 |
Since 2016 (last 10 years) | 36 |
Since 2006 (last 20 years) | 111 |
Descriptor
Comparative Analysis | 164 |
Correlation | 164 |
Test Validity | 164 |
Test Reliability | 66 |
Foreign Countries | 45 |
Scores | 41 |
Psychometrics | 32 |
Statistical Analysis | 30 |
Predictive Validity | 22 |
Academic Achievement | 20 |
Measures (Individuals) | 20 |
Author
Daro, Phil | 2 |
McIntyre, Nancy | 2 |
McNeil, Malcolm R. | 2 |
Mundy, Peter | 2 |
Novotny, Stephanie | 2 |
Oswald, Tasha | 2 |
Peyton, Vicki | 2 |
Staples, Shelley | 2 |
Swain-Lerro, Lindsey | 2 |
Zajic, Matt | 2 |
Adams, Melanie M. | 1 |
Audience
Researchers | 3 |
Practitioners | 2 |
Teachers | 1 |
Location
China | 6 |
Canada | 5 |
Australia | 4 |
Texas | 4 |
Turkey | 4 |
Florida | 3 |
Hong Kong | 3 |
New York | 3 |
United States | 3 |
Europe | 2 |
Louisiana | 2 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Amin D. Lotfizadeh; Brendan Gard; Cynthia Rico; Alan Poling; Kristen R. Choi – Journal of Autism and Developmental Disorders, 2025
Behavior analysts frequently use the Verbal Behavior Milestones Assessment and Placement Program (VB-MAPP) to assess the language and social skills of children with autism in everyday practice and in research. Despite the widespread use of the VB-MAPP, its psychometric characteristics have not been extensively investigated. To provide information…
Descriptors: Adjustment (to Environment), Behavior Rating Scales, Autism Spectrum Disorders, Test Validity
R. Lanai Jennings; Megan Midkiff; Emily Nestor McCauley; Jeremy Lopuch; Sandra Stroebel; Rachel James; Mary Toler; Rebecca Wendell; Paula King; Mallory Frampton – Contemporary School Psychology, 2024
Reading comprehension is one of the most valuable academic skills taught in school. Selecting the appropriate assessment instrument to ensure early identification and intervention is important as there is an amalgam of cognitive abilities and academic skills involved in reading comprehension. The GORT-5 is the most recent edition of a test that…
Descriptors: Test Validity, Diagnostic Tests, Reading Comprehension, Early Intervention
Yoo Jeong Jang – ProQuest LLC, 2022
Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…
Descriptors: Classification, Accuracy, Item Response Theory, Correlation
David Bell; Vikki O'Neill; Vivienne Crawford – Practitioner Research in Higher Education, 2023
We compared the influence of open-book, extended-duration versus closed-book, time-limited formats on the reliability and validity of written assessments of pharmacology learning outcomes within our medical and dental courses. Our dental cohort undertake a mid-year test (30 free-response short-answer questions, SAQs) and an end-of-year paper (4 SAQs,…
Descriptors: Undergraduate Students, Pharmacology, Pharmaceutical Education, Test Format
Akhtar, Hanif – International Association for Development of the Information Society, 2022
When examinees perceive a test as low stakes, it is logical to assume that some of them will not put out their maximum effort. This condition makes the validity of the test results more complicated. Although many studies have investigated motivational fluctuation across tests during a testing session, only a small number of studies have…
Descriptors: Intelligence Tests, Student Motivation, Test Validity, Student Attitudes
Leda Lampropoulou – Language Education & Assessment, 2023
Extensive oral tasks or monologues of different types (e.g., presentations, storytelling) are often used as second language acquisition tasks in the fields of language learning and language testing. Pre-task planning time is a common provision to test-takers who may use different strategies to prepare their response. High-stakes tests, such as the…
Descriptors: Language Tests, Speech Communication, Test Validity, Culture Fair Tests
Lúcio, Patrícia Silva; Vandekerckhove, Joachim; Polanczyk, Guilherme V.; Cogo-Moreira, Hugo – Journal of Psychoeducational Assessment, 2021
The present study compares the fit of two- and three-parameter logistic (2PL and 3PL) models of item response theory in the performance of preschool children on the Raven's Colored Progressive Matrices. The test of Raven is widely used for evaluating nonverbal intelligence of factor g. Studies comparing models with real data are scarce on the…
Descriptors: Guessing (Tests), Item Response Theory, Test Validity, Preschool Children
Khabbazbashi, Nahal; Galaczi, Evelina D. – Language Testing, 2020
This mixed methods study examined holistic, analytic, and part marking models (MMs) in terms of their measurement properties and impact on candidate CEFR classifications in a semi-direct online speaking test. Speaking performances of 240 candidates were first marked holistically and by part (phase 1). On the basis of phase 1 findings--which…
Descriptors: Holistic Approach, Classification, Grading, Language Tests
McKie, Greg L.; Islam, Hashim; Townsend, Logan K.; Howe, Greg J.; Hazell, Tom J. – Measurement in Physical Education and Exercise Science, 2018
This study examined the validity and reliability of a 30-second running sprint test using two non-motorized treadmills compared to the established Wingate Anaerobic Test. Twenty-four participants completed three sessions in a randomized order on a: (1) manual mode treadmill (Woodway); (2) specialized interval training treadmill (HiTrainer); and…
Descriptors: Exercise, Physical Activities, Correlation, Exercise Physiology
Martin-Raugh, Michelle P.; Anguiano-Carrasco, Cristina; Jackson, Teresa; Brenneman, Meghan W.; Carney, Lauren; Barnwell, Patrick; Kochert, Jonathan – International Journal of Testing, 2018
Single-response situational judgment tests (SRSJTs) differ from multiple-response SJTs (MRSJTs) in that they present test takers with edited critical incidents and simply ask test takers to read over the action described and evaluate it according to its effectiveness. Research comparing the reliability and validity of SRSJTs and MRSJTs is thus far…
Descriptors: Test Format, Test Reliability, Test Validity, Predictive Validity
Bakhtiar, Mehdi; Wong, Min Ney; Tsui, Emily Ka Yin; McNeil, Malcolm R. – Journal of Speech, Language, and Hearing Research, 2020
Purpose: This study reports the psychometric development of the Cantonese versions of the English Computerized Revised Token Test (CRTT) for persons with aphasia (PWAs) and healthy controls (HCs). Method: The English CRTT was translated into standard Chinese for the Reading--Word Fade version (CRTT-R-[subscript WF]-Cantonese) and into formal…
Descriptors: Psychometrics, Sino Tibetan Languages, Computer Assisted Testing, Aphasia
St. Clair, Travis; Hallberg, Kelly; Cook, Thomas D. – Journal of Educational and Behavioral Statistics, 2016
We explore the conditions under which short, comparative interrupted time-series (CITS) designs represent valid alternatives to randomized experiments in educational evaluations. To do so, we conduct three within-study comparisons, each of which uses a unique data set to test the validity of the CITS design by comparing its causal estimates to…
Descriptors: Research Methodology, Randomized Controlled Trials, Comparative Analysis, Time
Horn, Aaron S.; Horner, Olena G.; Lee, Giljae – Studies in Higher Education, 2019
Researchers in higher education frequently evaluate institutional effectiveness as the difference between an actual and predicted graduation rate, but little is known about whether such a method is reliable or valid. This study examines the measurement properties of effectiveness scores derived from regression residuals for community colleges in…
Descriptors: Instructional Effectiveness, Two Year Colleges, Comparative Analysis, Raw Scores
O'Donnell, Shannon; Tavares, Francisco; McMaster, Daniel; Chambers, Samuel; Driller, Matthew – Measurement in Physical Education and Exercise Science, 2018
The current study aimed to assess the validity and test-retest reliability of a linear position transducer when compared to a force plate through a counter-movement jump in female participants. Twenty-seven female recreational athletes (19 ± 2 years) performed three counter-movement jumps simultaneously using the linear position transducer and…
Descriptors: Test Validity, Test Reliability, Females, Athletes
Longabach, Tanya; Peyton, Vicki – Language Testing, 2018
K-12 English language proficiency tests that assess multiple content domains (e.g., listening, speaking, reading, writing) often have subsections based on these content domains; scores assigned to these subsections are commonly known as subscores. Testing programs face increasing customer demands for the reporting of subscores in addition to the…
Descriptors: Comparative Analysis, Test Reliability, Second Language Learning, Language Proficiency