Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
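As a rough illustration of the point in the abstract above, the sketch below shows how ignoring a design effect understates the standard error of a mean score. It assumes Kish's common approximation for equal-sized clusters, deff = 1 + (m - 1) * rho, which is not necessarily the estimator used in the article. The function names and all numeric values are hypothetical.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Kish's approximation for equal-sized clusters: 1 + (m - 1) * rho.

    cluster_size (m) and icc (rho, the intraclass correlation) are
    illustrative inputs, not values taken from the article.
    """
    return 1.0 + (cluster_size - 1.0) * icc


def standard_errors(sd: float, n: int, cluster_size: float, icc: float):
    """Return (naive SE assuming simple random sampling, design-adjusted SE)."""
    srs_se = sd / math.sqrt(n)
    deff = design_effect(cluster_size, icc)
    return srs_se, srs_se * math.sqrt(deff)


# Hypothetical example: 2,500 students sampled in classrooms of 25.
naive_se, adjusted_se = standard_errors(sd=100.0, n=2500, cluster_size=25, icc=0.2)
print(f"naive SE = {naive_se:.2f}, design-adjusted SE = {adjusted_se:.2f}")
```

Under these assumed values the design effect is 5.8, so the adjusted standard error is more than twice the naive one.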
Reddy, Linda A.; Dudek, Christopher M.; Rualo, Angelique J.; Fabiano, Gregory A. – Educational Assessment, 2016
The present study investigated the concurrent validity of the Classroom Strategies Scale-Teacher Form (CSS-T), a multidimensional teacher formative assessment of instructional and behavioral management practices. The CSS-T is compared with the Classroom Assessment Scoring System (CLASS), a well-known teacher assessment of overall classroom…
Descriptors: Teacher Evaluation, Formative Evaluation, Test Validity, Rating Scales
Naemi, Bobby; Seybert, Jacob; Robbins, Steven; Kyllonen, Patrick – ETS Research Report Series, 2014
This report introduces the "WorkFORCE"™ Assessment for Job Fit, a personality assessment utilizing the "FACETS"™ core capability, which is based on innovations in forced-choice assessment and computer adaptive testing. The instrument is derived from the five-factor model (FFM) of personality and encompasses a broad spectrum of…
Descriptors: Personality Assessment, Personality Traits, Personality Measures, Test Validity
Hedges, Larry V.; Bandeira de Mello, Victor – American Institutes for Research, 2013
In early 2001, to support an internal evaluation of the impact of changing exclusion rates on reports of statistically significant gains across states, the National Center for Education Statistics (NCES) sponsored research on imputation procedures of National Assessment of Educational Progress (NAEP) scores for the excluded students and provided…
Descriptors: National Competency Tests, Test Validity, Inclusion, Statistical Significance
Piper, Benjamin; Zuilkowski, Stephanie Simmons – International Review of Education, 2015
In recent years, the Education for All movement has focused more intensely on the quality of education, rather than simply provision. Many recent and current education quality interventions focus on literacy, which is the core skill required for further academic success. Despite this focus on the quality of literacy instruction in developing…
Descriptors: Foreign Countries, Reading Fluency, Reading Tests, Oral Reading
Anderson, Paul S.; Hyers, Albert D. – 1991
Three descriptive statistics (difficulty, discrimination, and reliability) of multiple-choice (MC) test items were compared to those of a new (1980s) format of machine-scored questions. The new method, answer-bank multi-digit testing (MDT), uses alphabetized lists of up to 1,000 alternatives and approximates the completion style of assessment…
Descriptors: College Students, Comparative Testing, Computer Assisted Testing, Correlation
Hubert, Lawrence J.; Baker, Frank B. – Multivariate Behavioral Research, 1978 (peer reviewed)
The strategy for investigating convergent and discriminant test validity, known as the multitrait-multimethod matrix, is investigated. A nonparametric significance testing procedure is suggested and demonstrated. (JKS)
Descriptors: Correlation, Hypothesis Testing, Mathematical Models, Matrices
The Invalidity of Partitioned-U Tests in Canonical Correlation and Multivariate Analysis of Variance
Harris, Richard J. – Multivariate Behavioral Research, 1976 (peer reviewed)
The partitioned-U procedure is outlined, a fundamental logical flaw in this procedure's avoidance of any direct test of the significance of the first discriminant function or largest coefficient of canonical correlation is pointed out, and two alternatives to the partitioned-U procedure are discussed. (Author/DEP)
Descriptors: Analysis of Variance, Correlation, Hypothesis Testing, Multivariate Analysis
Steinfatt, Thomas M. – 1974
The known interval scale, referred to as the 7.8 scale, has been criticized as an invalid measuring instrument in the form of an attitude scale. It is the purpose of this paper to demonstrate that this scale can produce spuriously inflated correlation coefficients, high reliability, and false significance on statistical tests. The case will be…
Descriptors: Attitude Measures, Predictive Validity, Statistical Bias, Statistical Significance
Hsu, Louis M. – Educational and Psychological Measurement, 1978 (peer reviewed)
The problem of determining the significance level which should be used in statistical tests of item validity in order to minimize type I errors is discussed. (Author/JKS)
Descriptors: Hypothesis Testing, Item Analysis, Power (Statistics), Statistical Significance
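To make the concern in the abstract above concrete, the sketch below shows how the chance of at least one type I error grows when many item-validity tests are run at the same per-test significance level. It assumes independent tests and uses the standard Bonferroni correction purely as an illustration; it is not the procedure proposed in the article. The function names and numbers are hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests tests.

    Assumes the tests are independent and every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(target_rate: float, n_tests: int) -> float:
    """Per-test significance level that keeps the familywise rate at or below target_rate."""
    return target_rate / n_tests


# Hypothetical example: validity tests on 50 items.
n_items = 50
uncorrected = familywise_error_rate(0.05, n_items)
per_item = bonferroni_alpha(0.05, n_items)
corrected = familywise_error_rate(per_item, n_items)
print(f"uncorrected: {uncorrected:.3f}, per-item alpha: {per_item:.4f}, corrected: {corrected:.3f}")
```

The trade-off is power: a stricter per-item level reduces false positives but makes it harder to detect items that are genuinely valid.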
Hertzog, Christopher; Rovine, Michael – Child Development, 1985 (peer reviewed)
Attempts to distill a growing technical literature on repeated-measures analysis of variance into a few simple principles for selecting an analytic technique. Argues that researchers ought not opt for a general analysis strategy when current computer technology makes it possible to select the optimal analysis technique for a given data set. (RH)
Descriptors: Analysis of Variance, Computer Software, Developmental Psychology, Hypothesis Testing
Shaver, James P. – 1992
A test of statistical significance is a procedure for determining how likely a result is assuming a null hypothesis to be true with randomization and a sample of size n (the given size in the study). Randomization, which refers to random sampling and random assignment, is important because it ensures the independence of observations, but it does…
Descriptors: Educational Research, Evaluation Problems, Hypothesis Testing, Probability
Rezmovic, Eva Lantos; Rezmovic, Victor – Educational and Psychological Measurement, 1981 (peer reviewed)
A multitrait-multimethod matrix containing two methods of measuring 12 personality traits was analyzed and confirmatory factor analysis was applied to the data. Although unexplained variance remained, method factors and a general personality factor significantly improved the fit of a model containing only trait factors. (Author/RL)
Descriptors: Factor Analysis, Goodness of Fit, Hypothesis Testing, Mathematical Models
Samph, Thomas; Sayles, Felton – 1974
The intent of this investigation was to perform a validation study to determine whether RACE (Racial Attitude and Cultural Expression test) differentiates between primary grade students identified as having negative and positive attitudes. Students were categorized by a combination of administrator, teacher and clinical assessment into a negative…
Descriptors: Attitude Measures, Black Youth, Elementary Education, Elementary School Students
Case, Susan M. – 1992
The predictive validity of scores on the National Board of Medical Examiners (NBME) Part I and Part II examinations for the selection of residents in orthopaedic surgery was investigated. Use of NBME scores has been criticized because of the time lag between taking Part I and entering residency and because Part I content is not directly linked to…
Descriptors: Admission Criteria, College Entrance Examinations, Comparative Testing, Graduate Medical Students
