Publication Date
| In 2026 | 3 |
| Since 2025 | 666 |
| Since 2022 (last 5 years) | 3167 |
| Since 2017 (last 10 years) | 7408 |
| Since 2007 (last 20 years) | 15046 |
Descriptor
| Test Reliability | 15036 |
| Test Validity | 10272 |
| Reliability | 9759 |
| Foreign Countries | 7141 |
| Test Construction | 4823 |
| Validity | 4191 |
| Measures (Individuals) | 3877 |
| Factor Analysis | 3825 |
| Psychometrics | 3525 |
| Interrater Reliability | 3124 |
| Correlation | 3039 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1327 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 252 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Li, Yanmei; Li, Shuhong; Wang, Lin – Educational Testing Service, 2010
Many standardized educational tests include groups of items based on a common stimulus, known as "testlets". Standard unidimensional item response theory (IRT) models are commonly used to model examinees' responses to testlet items. However, it is known that local dependence among testlet items can lead to biased item parameter estimates…
Descriptors: English, Language Tests, Reading Tests, Item Response Theory
Lengh, Carolyn J. – ProQuest LLC, 2010
This study compares the dependability of four classroom assessment scoring methods. Generalizability theory (G) and alternative decision (D) are used to measure the results of students' classroom assessment scores and compare the results of the four scoring methods on variability of rater by person variance and the level of G and D coefficients…
Descriptors: Generalizability Theory, Scoring, Social Studies, Tests
Porter, Andrew C.; Polikoff, Morgan S.; Goldring, Ellen B.; Murphy, Joseph; Elliott, Stephen N.; May, Henry – Elementary School Journal, 2010
The Vanderbilt Assessment of Leadership in Education (VAL-ED) is a multirater assessment of principals' learning-centered leadership. The instrument was developed based on the Standards for Educational and Psychological Testing. In this article, we report on the validity and reliability evidence for the VAL-ED accumulated in a national field…
Descriptors: Psychological Testing, Test Validity, Leadership, Principals
Ward-King, Jessica; Cohen, Ira L.; Penning, Henderika; Holden, Jeanette J. A. – Journal of Autism and Developmental Disorders, 2010
The Autism Diagnostic Interview-Revised is one of the "gold standard" diagnostic tools for autism spectrum disorders. It is traditionally administered face-to-face. Cost and geographical concerns constrain the employment of the ADI-R for large-scale research projects. The telephone interview is a reasonable alternative, but has not yet been…
Descriptors: Autism, Telecommunications, Pervasive Developmental Disorders, Evaluation Methods
Andrews, Jac J. W.; Violato, Claudio – Canadian Journal of School Psychology, 2010
In this article we provide an overview of the nature and scope of multisource feedback (MSF) and provide empirical evidence of its reliability, validity, and feasibility in one of the health professions. The overall internal consistency reliability (Cronbach alpha) of MSF instruments is generally greater than 0.96 for self and informants such as…
Descriptors: School Psychologists, Measures (Individuals), Feedback (Response), Reliability
Cole, James S.; Gonyea, Robert M. – Research in Higher Education, 2010
Because it is often impractical or impossible to obtain school transcripts or records on subjects, many researchers rely on college students to accurately self-report their academic record as part of their data collection procedures. The purpose of this study is to investigate the validity and reliability of student self-reported academic…
Descriptors: Academic Records, College Students, Validity, Scores
Cheing, Gladys L. Y.; Lai, Amy K. M.; Vong, Sinfia K. S.; Chan, Fong H. – International Journal of Rehabilitation Research, 2010
The aim of this study was to report the preliminary validation results for the Pain Rehabilitation Expectations Scale (PRES). The PRES is a clinical tool developed to measure the expectations about rehabilitation treatment and outcome for people with back pain. Fifty people with chronic back pain were recruited from 11 physiotherapy outpatient…
Descriptors: Measures (Individuals), Factor Structure, Pain, Rehabilitation
Leff, Stephen S.; Cassano, Michael; MacEvoy, Julie Paquette; Costigan, Tracy – Journal of Abnormal Child Psychology, 2010
Over the past fifteen years many schools have utilized aggression prevention programs. Despite these apparent advances, many programs are not examined systematically to determine the areas in which they are most effective. One reason for this is that many programs, especially those in urban under-resourced areas, do not utilize outcome measures…
Descriptors: Measures (Individuals), Psychological Patterns, Social Cognition, Validity
Cooper, Lynn O.; Buchanan, Trey – International Journal of Listening, 2010
This study evaluates the psychometric properties of the 30-item "Organizational Listening Survey" ("OLS"), an instrument primarily used in corporate and organizational contexts, as a measure of listening behavior with a sample of undergraduate college students. The first study analyzed 1,475 students' self-reports of their listening behavior on campus,…
Descriptors: Undergraduate Students, Interrater Reliability, Psychometrics, Listening Skills
Bechger, Timo M.; Maris, Gunter; Hsiao, Ya Ping – Applied Psychological Measurement, 2010
The main purpose of this article is to demonstrate how halo effects may be detected and quantified using two independent ratings of the same person. A practical illustration is given to show how halo effects can be avoided. (Contains 2 tables, 7 figures, and 2 notes.)
Descriptors: Performance Based Assessment, Test Reliability, Test Length, Language Tests
Barker, David H.; Lloyd, Thad Q.; Stewart, Peter K.; Wells, M. Gawain – Journal of Child and Family Studies, 2010
Developing normed treatment outcome measures is important to research addressing treatment effectiveness and to improved clinical care. The Preschool Outcome Questionnaire (POQ) is a new measure designed for use with preschool children aged two to six. Designed in collaboration with parents and clinicians, the POQ is brief, easy to administer,…
Descriptors: Outcomes of Treatment, Predictive Validity, Preschool Children, Measures (Individuals)
Thaler, Nicholas S.; Kazemi, Ellie; Wood, Jeffrey J. – Child Psychiatry and Human Development, 2010
Youth with learning disabilities (LD) are at an increased risk for anxiety disorders and valid measures of anxiety are necessary for assessing this population. We investigated the psychometric properties of the Multidimensional Anxiety Scale for Children (MASC; March in Multidimensional anxiety scale for children. Multi-Health Systems, North…
Descriptors: Learning Disabilities, Construct Validity, Measures (Individuals), Parents
Yang, Yanyun; Green, Samuel B. – Structural Equation Modeling: A Multidisciplinary Journal, 2010
Reliability can be estimated using structural equation modeling (SEM). Two potential problems with this approach are that estimates may be unstable with small sample sizes and biased with misspecified models. A Monte Carlo study was conducted to investigate the quality of SEM estimates of reliability by themselves and relative to coefficient…
Descriptors: Monte Carlo Methods, Structural Equation Models, Reliability, Sample Size
Erford, Bradley T.; Duncan, Kelly; Savin-Murphy, Janet – Measurement and Evaluation in Counseling and Development, 2010
This study provides preliminary analysis of reliability and validity of scores on the Self-Efficacy Teacher Report Scale, which was designed to assess teacher perceptions of self-efficacy of students aged 8 to 17 years. (Contains 3 tables.)
Descriptors: Self Efficacy, Measures (Individuals), Psychometrics, Reliability
Century, Jeanne; Rudnick, Mollie; Freeman, Cassie – American Journal of Evaluation, 2010
There is a growing recognition of the value of measuring fidelity of implementation (FOI) as a necessary part of evaluating interventions. However, evaluators do not have a shared conceptual understanding of what FOI is and how to measure it. Thus, the creation of FOI measures is typically a secondary focus and based on specific contexts and…
Descriptors: Intervention, Program Implementation, Measurement Techniques, Evaluators

