Susan K. Johnsen – Gifted Child Today, 2025
The author provides information about reliability and areas that educators should examine in determining if an assessment is consistent and trustworthy for use, and how it should be interpreted in making decisions about students. Reliability areas that are discussed in the column include internal consistency, test-retest or stability, inter-scorer…
Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement
Tenko Raykov; Bingsheng Zhang – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Multidimensional measuring instruments are often used in behavioral, social, educational, marketing, and biomedical research. For these scales, the paper discusses how to find the optimal score based on their components that is associated with the highest possible reliability. Within the framework of structural equation modeling, an approach to…
Descriptors: Multidimensional Scaling, Measurement Equipment, Measurement Techniques, Test Reliability
Denise Swanson; Gerald Tindal – Behavioral Research and Teaching, 2024
This technical report provides an authoritative bibliographic resource of all the studies conducted on "easyCBM"® and published on the main website for Behavioral Research and Teaching under Publications (https://brtprojects.org). The "easyCBM"® software is a direct descendant of "Curriculum-based Measurement" (CBM)…
Descriptors: Bibliographies, Computer Software, Test Construction, Test Reliability
National Institute for Excellence in Teaching, 2023
Aspiring teachers must develop an in-depth understanding of high-quality instructional practices. In order to prepare, instruct, and coach aspiring teachers, the National Institute for Excellence in Teaching (NIET) has developed the NIET Aspiring Teacher Rubric (ATR) based on principles of excellence in instruction. This research brief…
Descriptors: Scoring Rubrics, Preservice Teachers, Test Construction, Test Validity
Rice, C. E.; Carpenter, L. A.; Morrier, M. J.; Lord, C.; DiRienzo, M.; Boan, A.; Skowyra, C.; Fusco, A.; Baio, J.; Esler, A.; Zahorodny, W.; Hobson, N.; Mars, A.; Thurm, A.; Bishop, S.; Wiggins, L. D. – Journal of Autism and Developmental Disorders, 2022
This paper describes a process to define a comprehensive list of exemplars for seven core Diagnostic and Statistical Manual (DSM) diagnostic criteria for autism spectrum disorder (ASD), and reports on interrater reliability in applying these exemplars to determine ASD case classification. Clinicians completed an iterative process to map specific…
Descriptors: Autism Spectrum Disorders, Clinical Diagnosis, Test Reliability, Interrater Reliability
Scott J. Peters; Matthew C. Makel; Lindsay Ellis Lee; Tamra Stambaugh; Matthew T. McBee; D. Betsy McCoach; Kiana R. Johnson – Gifted Child Today, 2024
Universal screening is one of the most common topics and well-accepted best practices within the field of gifted and talented education. There appears to be little disagreement that universally screening all students as part of a gifted and talented identification process results in fewer missed students. But surprisingly, there is little guidance…
Descriptors: Academically Gifted, Talent Identification, Screening Tests, Test Validity
Caspar J. Van Lissa; Eli-Boaz Clapper; Rebecca Kuiper – Research Synthesis Methods, 2024
The product Bayes factor (PBF) synthesizes evidence for an informative hypothesis across heterogeneous replication studies. It can be used when fixed- or random-effects meta-analysis falls short, for example when effect sizes are incomparable and cannot be pooled, or when studies diverge significantly in the populations, study designs, and…
Descriptors: Hypothesis Testing, Evaluation Methods, Replication (Evaluation), Sample Size
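As a rough illustration of the idea named in the abstract above, the sketch below combines per-study Bayes factors by multiplication. It assumes, as the name suggests but the truncated abstract does not confirm, that the PBF is simply the product of independent study-level Bayes factors for the same informative hypothesis. The function name and input values are hypothetical and do not come from the article.

```python
import math


def product_bayes_factor(bayes_factors):
    """Combine per-study Bayes factors by multiplication.

    Assumption (not taken from the article): each study contributes one
    Bayes factor for the same informative hypothesis, and the studies
    are independent, so the evidence multiplies.
    """
    if not bayes_factors:
        raise ValueError("at least one Bayes factor is required")
    if any(bf <= 0 for bf in bayes_factors):
        raise ValueError("Bayes factors must be positive")
    # Sum logs instead of multiplying directly to limit overflow and
    # underflow when many studies are combined.
    log_pbf = math.fsum(math.log(bf) for bf in bayes_factors)
    return math.exp(log_pbf)


# Hypothetical Bayes factors from three replication studies.
pbf = product_bayes_factor([2.0, 3.0, 0.5])
print(f"PBF = {pbf:.2f}")
```

Working on the log scale is a numerical convenience only; it does not change the combined value.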
Raykov, Tenko; Marcoulides, George A. – Measurement: Interdisciplinary Research and Perspectives, 2023
This article outlines a readily applicable procedure for point and interval estimation of the population discrepancy between reliability and the popular Cronbach's coefficient alpha for unidimensional multi-component measuring instruments with uncorrelated errors, which are widely used in behavioral and social research. The method is developed…
Descriptors: Measurement, Test Reliability, Measurement Techniques, Error of Measurement
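For readers unfamiliar with the coefficient discussed in the abstract above, here is a minimal sketch of the standard sample formula for Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). It illustrates alpha only, not the article's procedure for estimating the gap between alpha and reliability. The function name and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score table.

    scores: list of rows, one per respondent, each holding k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent needs the same number of items")
    # Transpose rows (respondents) into columns (items).
    items = list(zip(*scores))
    item_variances = [variance(col) for col in items]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: three respondents, three perfectly consistent items.
example = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(round(cronbach_alpha(example), 3))
```

The sketch raises an error if the total scores have zero variance only indirectly (division by zero); real analyses would also need to handle missing responses.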
Parsons, Seth A.; Ives, Samantha T.; Fields, R. Stacy; Barksdale, Bonnie; Marine, Jonathan; Rogers, Paul – Reading Teacher, 2023
Students who are engaged writers are likely to produce better writing and to enjoy writing more than students who are disengaged writers. Yet, we are unaware of any existing tool that validly and reliably measures writing engagement. In this article, we describe what writing engagement is and why it is important. Then, we present the Writing…
Descriptors: Learner Engagement, Writing (Composition), Writing Attitudes, Measures (Individuals)
Mikkel Helding Vembye; James Eric Pustejovsky; Therese Deocampo Pigott – Research Synthesis Methods, 2024
Sample size and statistical power are important factors to consider when planning a research synthesis. Power analysis methods have been developed for fixed-effect or random-effects models, but until recently these methods were limited to simple data structures with a single, independent effect per study. Recent work has provided power…
Descriptors: Sample Size, Robustness (Statistics), Effect Size, Social Science Research
Krystal Thomas; Todd A. Grindal; Daisy Wise Rutstein; Gullnar Syed; Sarah Nixon Gerard; Shari Golan; Sheryl Cababa; Amanda Di Dio; Behnosh Najafi; Kat Ward – SRI Education, a Division of SRI International, 2023
Instructional coaching, informed by observation tools that measure teachers' practices, has been effective in improving teaching quality in early learning programs. However, existing measurement tools limit teachers' abilities to implement this type of instructional coaching at scale. To address this challenge, a team at SRI Education, along with…
Descriptors: Preschool Education, Kindergarten, Coaching (Performance), Observation
Christian X. Navarro-Cota; Ana I. Molina; Miguel A. Redondo; Carmen Lacave – IEEE Transactions on Education, 2024
Contribution: This article describes the process used to create a questionnaire to evaluate the usability of mobile learning applications (CECAM). The questionnaire includes specific questions to assess user interface usability and pedagogical usability. Background: Nowadays, mobile applications are expanding rapidly and are commonly used in…
Descriptors: Usability, Questionnaires, Electronic Learning, Computer Oriented Programs
Emily L. Coderre – College Teaching, 2024
Psychometrics is the field of designing tests and assessments to measure certain psychological concepts. It is chiefly concerned with two fundamental properties: reliability and validity. These properties are often influenced by confounding variables: other things that can influence performance but are not what you are trying to measure. Here, I…
Descriptors: Teaching Methods, Psychometrics, Test Construction, Test Reliability
Sanchez, Jafeth E. – Journal of Extension, 2023
One Extension Specialist implemented a STEM pilot robotics program across three middle school settings. A program evaluation to provide guidance and recommendations for future development, implementation, and continued evaluation was conducted as part of a larger study. This process led to the development of a condensed STEM survey that can be…
Descriptors: Extension Education, Extension Agents, STEM Education, Robotics
Anne Wicks; Robin Berkley – George W. Bush Institute, 2025
Assessments are one of the most important--and often misunderstood--elements of education. In most cases, tests are administered by the state as well as by districts and schools. Assessments at each of these levels have distinct purposes, yield different information, and are part of a powerful, coordinated approach to improving student outcomes.…
Descriptors: Student Evaluation, Testing, Tests, Standardized Tests