Publication Date
In 2025 | 11 |
Since 2024 | 35 |
Since 2021 (last 5 years) | 98 |
Since 2016 (last 10 years) | 267 |
Since 2006 (last 20 years) | 448 |
Descriptor
Cutting Scores | 780 |
Test Validity | 145 |
Standard Setting (Scoring) | 121 |
Foreign Countries | 118 |
Scores | 114 |
Test Reliability | 104 |
Comparative Analysis | 103 |
Classification | 99 |
Test Items | 96 |
Screening Tests | 93 |
Statistical Analysis | 91 |
Location
California | 17 |
Florida | 13 |
Canada | 12 |
United Kingdom | 11 |
Texas | 10 |
Tennessee | 9 |
Arizona | 8 |
North Carolina | 8 |
Turkey | 8 |
Kansas | 7 |
Netherlands | 7 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 15 |
Elementary and Secondary… | 3 |
Elementary and Secondary… | 2 |
Education Consolidation… | 1 |
Individuals with Disabilities… | 1 |
Stewart B McKinney Homeless… | 1 |
What Works Clearinghouse Rating
Does not meet standards | 3 |
Michael T. Kalkbrenner – Measurement and Evaluation in Counseling and Development, 2024
The purpose of this instructional piece was to provide a nontechnical synthesis of common internal consistency reliability estimates used in professional counseling and in related fields. The article begins with an overview of coefficients alpha, omega, omega hierarchical, and H, with guidelines for their selection. Next, I provide recommendations…
Descriptors: Reliability, Counseling, Cutting Scores, High Stakes Tests
Erin Johnson; Samantha Barstack; Yikai Xu; Hannah Wise; Bradley T. Erford; Catharina Chang; David Delmonico – Measurement and Evaluation in Counseling and Development, 2025
Problem Statement: Among individuals aged 12 years or older, 14.3% (40.0 million) reported using an illicit drug in the previous year. Given the prevalence of drug abuse, it is increasingly important to determine effective screening practices, treatment procedures, and best practices among various subpopulations to identify drug use-related…
Descriptors: Drug Abuse, Screening Tests, Psychometrics, Synthesis
Matt Homer – Advances in Health Sciences Education, 2024
Quantitative measures of systematic differences in OSCE scoring across examiners (often termed examiner stringency) can threaten the validity of examination outcomes. Such effects are usually conceptualised and operationalised based solely on checklist/domain scores in a station, and global grades are not often used in this type of analysis. In…
Descriptors: Examiners, Scoring, Validity, Cutting Scores
The Developmental Autism Early Screening (DAES): A Novel Test for Screening Autism Spectrum Disorder
Lara Cirnigliaro; Maria Stella Valle; Antonino Casabona; Martina Randazzo; Francesca La Bruna; Fabio Pettinato; Antonio Narzisi; Renata Rizzo; Rita Barone – Journal of Autism and Developmental Disorders, 2025
This study was undertaken to set a novel developmental screening test for autism spectrum disorder (ASD) using the Griffiths Scales of Child Development (Griffiths III) (Green et al., 2016; Stroud et al., 2016), in order to intercept the early atypical developmental patterns indicating ASD risk in the first 3 years of life. An observational and…
Descriptors: Autism Spectrum Disorders, Test Construction, Screening Tests, Educational Diagnosis
Ashley A. Edwards; Wilhelmina van Dijk; Christine M. White; Christopher Schatschneider – Grantee Submission, 2022
Given the recent push for universal screening, it is important to take into account how well a screener identifies children at risk for reading problems as well as how screener and sample information contribute to this classification. Picking the best cut-point for a particular sample and screening goal can be challenging given that test manuals…
Descriptors: Screening Tests, Measures (Individuals), Reading Difficulties, Cutting Scores
Ashley A. Edwards; Wilhelmina van Dijk; Christine M. White; Christopher Schatschneider – Annals of Dyslexia, 2022
Given the recent push for universal screening, it is important to take into account how well a screener identifies children at risk for reading problems as well as how screener and sample information contribute to this classification. Picking the best cut-point for a particular sample and screening goal can be challenging given that test manuals…
Descriptors: Screening Tests, Measures (Individuals), Reading Difficulties, Cutting Scores
Aimel Zafar; Manzoor Khan; Muhammad Yousaf – Measurement: Interdisciplinary Research and Perspectives, 2024
Subjects with initially extreme observations are found closer to the population mean upon remeasurement. This tendency of observations toward the mean is called regression to the mean (RTM) and can make natural variation in repeated data look like real change. Studies where subjects are selected on a baseline criterion should be guarded against…
Descriptors: Measurement, Regression (Statistics), Statistical Distributions, Intervention
Lientje Maas; Matthew J. Madison; Matthieu J. S. Brinkhuis – Grantee Submission, 2024
Diagnostic classification models (DCMs) are psychometric models that yield probabilistic classifications of respondents according to a set of discrete latent variables. The current study examines the recently introduced one-parameter log-linear cognitive diagnosis model (1-PLCDM), which has increased interpretability compared with general DCMs due…
Descriptors: Clinical Diagnosis, Classification, Models, Psychometrics
Daniel McNeish; Patrick D. Manapat – Structural Equation Modeling: A Multidisciplinary Journal, 2024
A recent review found that 11% of published factor models are hierarchical models with second-order factors. However, dedicated recommendations for evaluating hierarchical model fit have yet to emerge. Traditional benchmarks like RMSEA <0.06 or CFI >0.95 are often consulted, but they were never intended to generalize to hierarchical models.…
Descriptors: Factor Analysis, Goodness of Fit, Hierarchical Linear Modeling, Benchmarking
Bianchim, Mayara S.; McNarry, Melitta A.; Evans, Rachel; Thia, Lena; Barker, Alan R.; Williams, Craig A.; Denford, Sarah; Mackintosh, Kelly A. – Measurement in Physical Education and Exercise Science, 2023
Commonly used cut-points may misclassify physical activity (PA) in people with cystic fibrosis (CF). The aim of this study was to develop and cross-validate condition-specific cut-points in children and adolescents with CF. Thirty-five children and adolescents with CF (15 girls; 11.6 ± 2.8 years) and 28 controls (16 girls; 12.2 ± 2.7 years) had…
Descriptors: Genetic Disorders, Children, Early Adolescents, Physical Activity Level
Abdolvahab Khademi; Craig S. Wells; Maria Elena Oliveri; Ester Villalonga-Olives – SAGE Open, 2023
The most common effect sizes when using a multiple-group confirmatory factor analysis approach to measurement invariance are ΔCFI and ΔTLI, with a cutoff value of 0.01. However, this recommended cutoff value may not be ubiquitously appropriate and may be of limited application for some tests (e.g., measures using dichotomous items or…
Descriptors: Factor Analysis, Factor Structure, Error of Measurement, Test Items
Lee, Chansoon – Educational Measurement: Issues and Practice, 2022
Appropriate placement into courses at postsecondary institutions is critical for the success of students in terms of retention and graduation rates. To reduce the number of students who are misplaced, using multiple measures in placing students is encouraged. However, in practice most postsecondary schools utilize only a few measures to determine…
Descriptors: Classification, Models, Student Placement, College Students
Kathleen Lynne Lane; Katie Scarlett Lane Pelton; Nathan Allen Lane; Mark Matthew Buckman; Wendy Peia Oakes; Kandace Fleming; Rebecca E. Swinburne Romine; Emily D. Cantwell – Behavioral Disorders, 2025
We report findings of this replication study, examining the internalizing subscale (SRSS-I4) of the revised version of the Student Risk Screening Scale for Internalizing and Externalizing behavior (SRSS-IE 9) and the internalizing subscale of the Teacher Report Form (TRF). Using the sample from 13 elementary schools across three U.S. states with…
Descriptors: Data Analysis, Decision Making, Data Use, Measures (Individuals)
Jerin Kim; Kent McIntosh – Journal of Positive Behavior Interventions, 2025
We aimed to identify empirically valid cut scores on the positive behavioral interventions and supports (PBIS) Tiered Fidelity Inventory (TFI) through an expert panel process known as bookmarking. The TFI is a measurement tool to evaluate the fidelity of implementation of PBIS. In the bookmark method, experts reviewed all TFI items and item scores…
Descriptors: Positive Behavior Supports, Cutting Scores, Fidelity, Program Evaluation
Alireza Akbari; Mohsen Shahrokhi – Quality Assurance in Education: An International Perspective, 2024
Purpose: The purpose of this research is to address the need for a robust system to accurately determine a cutoff score using the Angoff method, leveraging the Rasch infit and outfit statistics of item response theory to detect and remove misfitting items in a test. Design/methodology/approach: Researchers in educational evaluation…
Descriptors: Grades (Scholastic), Grading, Evaluation Criteria, Cutting Scores