Hintze, John M.; Christ, Theodore J. – School Psychology Review, 2004
This study examined the effects of controlling the level of difficulty on the sensitivity of repeated curriculum-based measurement (CBM). Participants included 99 students in Grades 2 through 5 who were administered CBM reading passage probes twice weekly over an 11-week period. Two sets of CBM reading progress monitoring materials were compared:…
Descriptors: Error of Measurement, Curriculum Based Assessment, Elementary Education, Elementary School Students
Wang, Wen-Chung; Su, Ya-Hui – Applied Psychological Measurement, 2004
Eight independent variables (differential item functioning [DIF] detection method, purification procedure, item response model, mean latent trait difference between groups, test length, DIF pattern, magnitude of DIF, and percentage of DIF items) were manipulated, and two dependent variables (Type I error and power) were assessed through…
Descriptors: Test Length, Test Bias, Simulation, Item Response Theory
Smith, Margaret H. – Journal of Statistics Education, 2004
Unless the sample encompasses a substantial portion of the population, the standard error of an estimator depends on the size of the sample, but not the size of the population. This is a crucial statistical insight that students find very counterintuitive. After trying several ways of convincing students of the validity of this principle, I have…
Descriptors: Sample Size, Error of Measurement, Mathematics Instruction, College Mathematics
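The principle stated in the Smith (2004) abstract can be checked numerically. The sketch below is not taken from the article; it assumes simple random sampling without replacement and uses the textbook finite population correction, with a hypothetical function name and illustrative values for the standard deviation and sample size.

```python
import math


def standard_error_of_mean(sigma, n, population_size=None):
    """Standard error of a sample mean (hypothetical helper for illustration).

    sigma: population standard deviation (assumed known here).
    n: sample size.
    population_size: if given, apply the finite population correction
        sqrt((N - n) / (N - 1)) for sampling without replacement.
    """
    se = sigma / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


# Same sample size, very different population sizes (illustrative numbers).
se_small_population = standard_error_of_mean(15.0, 100, population_size=10_000)
se_large_population = standard_error_of_mean(15.0, 100, population_size=10_000_000)

# A sample that covers half of the population: the correction now matters.
se_half_of_population = standard_error_of_mean(15.0, 5_000, population_size=10_000)
```

With a sample of 100, the two standard errors are about 1.49 and 1.50, so a thousandfold change in population size barely registers. Only when the sample covers a substantial share of the population, as in the last call, does the correction shrink the standard error noticeably.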
McDonald, Roderick P. – Alberta Journal of Educational Research, 2003
The concept of a behavior domain is a reasonable and essential foundation for psychometric work based on true score theory, the linear model of common factor analysis, and the nonlinear models of item response theory. Investigators applying these models to test data generally treat the true scores or factors or traits as abstractive psychological…
Descriptors: Factor Analysis, Error of Measurement, True Scores, Psychometrics
Sinex, Scott A.; Gage, Barbara A.; Beck, Peggy J. – AMATYC Review, 2007
A simple, guided-inquiry investigation using stacked sandwich cookies is employed to develop a simple linear mathematical model and to explore measurement error by incorporating errors as part of the investigation. Both random and systematic errors are presented. The model and errors are then investigated further by engaging with an interactive…
Descriptors: Mathematical Models, Measurement, Error of Measurement, Science Process Skills
Murphy, Sandra – Research in the Teaching of English, 2007
The persistent gap between the performance of mainstream students and racially and linguistically diverse students--for example, African Americans, Hispanic Americans, and Native Americans--on standardized tests may well signal problems with procedures for the development and use of standardized tests in general, and for their use with culturally…
Descriptors: Language Minorities, Standardized Tests, Test Validity, Prior Learning
Sterba, Sonya; Egger, Helen L.; Angold, Adrian – Journal of Child Psychology and Psychiatry, 2007
Background: The appropriateness of the "Diagnostic and Statistical Manual of Mental Disorders-Fourth Edition" (DSM-IV) nosology for classifying preschool mental health disturbances continues to be debated. To inform this debate, we investigate whether preschool psychopathology shows differentiation along diagnostically specific lines…
Descriptors: Check Lists, Hyperactivity, Construct Validity, Psychopathology
Flanagan, Kristin Denton; McPhee, Cameron – National Center for Education Statistics, 2009
Using data from the final two rounds of the Early Childhood Longitudinal Study, Birth Cohort (ECLS-B), a longitudinal study begun in 2001, this First Look provides a snapshot of the demographic characteristics, reading and mathematics knowledge, fine motor skills, school characteristics, and before- and after-school care arrangements of the cohort…
Descriptors: Child Development, Kindergarten, Longitudinal Studies, Cohort Analysis
Bridgeman, Brent; And Others – 1996
The various methods for computing the reliability of scores on Advanced Placement (AP) examinations are summarized. For the free response portion of the examinations, raters can contribute to score unreliability through both systematic severity errors (in which some raters consistently rate more severely than other raters) and through…
Descriptors: Advanced Placement, College Entrance Examinations, Error of Measurement, High School Students
Motika, Robert T. – 1997
Data from performance measures that were part of two foreign language teacher certification examinations were used in a generalizability study of the quality of their performance ratings. A total of 775 examinees from the Spanish K-12 and 192 examinees from the French K-12 subject area tests of the Florida Teacher Certification Examinations were…
Descriptors: Elementary Secondary Education, Error of Measurement, French, Generalizability Theory
Gaffney, Patrick V. – 1997
A reliability analysis was conducted of an abbreviated, 10-item version of the Pupil Control Ideology Form (PCI), using the Cronbach's alpha technique (L. J. Cronbach, 1951) and the computation of the standard error of measurement. The PCI measures a teacher's orientation toward pupil control. Subjects were 168 preservice teachers from one private…
Descriptors: Classroom Techniques, Discipline, Error of Measurement, Higher Education
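The two statistics named in the Gaffney (1997) abstract, Cronbach's alpha and the standard error of measurement, can be sketched as follows. This is a minimal illustration using the standard textbook formulas, not the author's computation; the function names and the tiny score matrix are hypothetical and are not PCI data.

```python
import math
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])  # number of items
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def standard_error_of_measurement(scores):
    """SEM = SD of total scores * sqrt(1 - reliability), with alpha as reliability."""
    totals = [sum(row) for row in scores]
    alpha = cronbach_alpha(scores)
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(variance(totals)) * math.sqrt(max(0.0, 1 - alpha))


# Hypothetical responses: 3 respondents x 2 items (illustrative only).
example_scores = [[1, 2], [2, 1], [3, 3]]
example_alpha = cronbach_alpha(example_scores)
example_sem = standard_error_of_measurement(example_scores)
```

Both functions use the sample variance (n - 1 denominator). A real analysis of a 10-item form would pass one row of ten item scores per respondent.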
Thompson, Bruce; Crowley, Susan – 1994
Most training programs in education and psychology focus on classical test theory techniques for assessing score dependability. This paper discusses generalizability theory and explores its concepts using a small heuristic data set. Generalizability theory subsumes and extends classical test score theory. It is able to estimate the magnitude of…
Descriptors: Analysis of Variance, Cutting Scores, Decision Making, Error of Measurement
Bergstrom, Betty A.; And Others – 1993
A problem that arises when a differential item functioning (DIF) study is done with samples of examinees differing in ability is examined. A test may function differently when the populations from which the items are calibrated are not of equal ability. Since the lower ability examinees get many difficult items incorrect, the spread (standard…
Descriptors: Ability, Error of Measurement, Grade 11, Grade 12
Chang, Yu-Wen; Davison, Mark L. – 1992
Standard errors and bias of unidimensional and multidimensional ability estimates were compared in a factorial, simulation design with two item response theory (IRT) approaches, two levels of test correlation (0.42 and 0.63), two sample sizes (500 and 1,000), and a hierarchical test content structure. Bias and standard errors of subtest scores…
Descriptors: Comparative Testing, Computer Simulation, Correlation, Error of Measurement
Levine, Michael V.; Drasgow, Fritz – 1980
Appropriateness measurement is a general approach to the problem caused by multiple choice tests failing to measure accurately the ability of atypical examinees. The conceptual framework of appropriateness measurement is presented, and several statistical indices of the appropriateness of a multiple choice test for an examinee are noted. A series…
Descriptors: Aptitude Tests, Cheating, Error of Measurement, Error Patterns

