Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013
Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…
Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement
Raymond, Mark R.; Clauser, Brian E.; Furman, Gail E. – Advances in Health Sciences Education, 2010
The use of standardized patients to assess communication skills is now an essential part of assessing a physician's readiness for practice. To improve the reliability of communication scores, it has become increasingly common in recent years to use statistical models to adjust ratings provided by standardized patients. This study employed ordinary…
Descriptors: Generalizability Theory, Physicians, Patients, Least Squares Statistics
Chon, Kyong Hee; Lee, Won-Chan; Dunbar, Stephen B. – Journal of Educational Measurement, 2010
In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G², Orlando and Thissen's S-X² and S-G², and Stone's χ²* and G²*. To investigate the…
Descriptors: Test Length, Goodness of Fit, Item Response Theory, Simulation
Claassen, Cynthia A.; Yip, Paul S.; Corcoran, Paul; Bossarte, Robert M.; Lawrence, Bruce A.; Currier, Glenn W. – Suicide and Life-Threatening Behavior, 2010
Durkheim's nineteenth-century analysis of national suicide rates dismissed prior concerns about mortality data fidelity. Over the intervening century, however, evidence documenting various types of error in suicide data has only mounted, and surprising levels of such error continue to be routinely uncovered. Yet the annual suicide rate remains the…
Descriptors: Suicide, Data Analysis, Data Interpretation, Models
Tsai, Yuping – Economics of Education Review, 2010
Studies examining the wage effect of overeducation have generated very consistent results. Their findings suggest that, for workers with similar educational attainment, workers who are overeducated for the job suffer from significant wage penalties. However, most studies use cross-sectional data, implicitly assuming that workers are randomly…
Descriptors: Wages, Individual Characteristics, Educational Attainment, Labor Market
Guo, Hongwen; Liu, Jinghua; Curley, Edward; Dorans, Neil – ETS Research Report Series, 2012
This study examines the stability of the "SAT Reasoning Test"™ score scales from 2005 to 2010. A 2005 old form (OF) was administered along with a 2010 new form (NF). A new conversion for OF was derived through direct equipercentile equating. A comparison of the newly derived and the original OF conversions showed that Critical Reading…
Descriptors: Aptitude Tests, Cognitive Tests, Thinking Skills, Equated Scores
Keaton, Patrick; Sable, Jennifer; Liu, Fei – National Center for Education Statistics, 2012
This revised data file includes corrections that were provided to NCES as a result of a special collection effort designed to address data quality issues found in the 1a release of this file. In May 2012, NCES became aware of data errors for key data items for several schools on the published version of the SY 2009-10 school file; in some cases…
Descriptors: School Statistics, Data Collection, Documentation, Error Patterns
Mercer, Sterett H.; Harpole, Lauren Lestremau; Mitchell, Rachel R.; McLemore, Chandler; Hardy, Christina – School Psychology Quarterly, 2012
The purpose of this study was to examine the impact of probe variability on the ability to replicate results in brief experimental analysis (BEA) of reading. In the first phase of the study, 41 first- and second-grade students completed 16 oral reading fluency probes. Calculations of probe difficulty were used to identify Low and High Variability…
Descriptors: Elementary School Students, Grade 1, Grade 2, Grade 3
Marsh, Herbert W.; Ludtke, Oliver; Nagengast, Benjamin; Trautwein, Ulrich; Morin, Alexandre J. S.; Abduljabbar, Adel S.; Koller, Olaf – Educational Psychologist, 2012
Classroom context and climate are inherently classroom-level (L2) constructs, but applied researchers sometimes--inappropriately--represent them by student-level (L1) responses in single-level models rather than more appropriate multilevel models. Here we focus on important conceptual issues (distinctions between climate and contextual variables;…
Descriptors: Foreign Countries, Classroom Environment, Educational Research, Research Design
Wang, Tianyou – Journal of Educational and Behavioral Statistics, 2009
Holland and colleagues derived a formula for analytical standard error of equating using the delta-method for the kernel equating method. Extending their derivation, this article derives an analytical standard error of equating procedure for the conventional percentile rank-based equipercentile equating with log-linear smoothing. This procedure is…
Descriptors: Error of Measurement, Equated Scores, Statistical Analysis, Statistical Inference
Rijmen, Frank; Manalo, Jonathan R.; von Davier, Alina A. – Applied Psychological Measurement, 2009
This article describes two methods for obtaining the standard errors of two commonly used population invariance measures of equating functions: the root mean square difference of the subpopulation equating functions from the overall equating function and the root expected mean square difference. The delta method relies on an analytical…
Descriptors: Error of Measurement, Sampling, Equated Scores, Statistical Analysis
Milanowski, Anthony T. – Online Submission, 2011
After decades of disinterest, evaluation of the performance of elementary and secondary teachers in the United States has become an important educational policy issue. As U.S. states and districts have tried to upgrade their evaluation processes, one of the models that has been increasingly used is the Framework for Teaching. This paper summarizes…
Descriptors: Evidence, Teacher Effectiveness, Teacher Evaluation, Observation
Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability
Culhane, Scott E.; Morera, Osvaldo F.; Watson, P. J.; Millsap, Roger E. – Assessment, 2011
The aim of this article is to assess the measurement invariance of the Bermond-Vorst Alexithymia Questionnaire (BVAQ) in U.S. Anglo (n = 490) and U.S. Hispanic (n = 379) samples of college students. The BVAQ items demonstrated invariance of the factor loadings, the latent item intercepts, and unique factor variances. However, Hispanics had higher…
Descriptors: Factor Structure, Measures (Individuals), Questionnaires, Hispanic Americans
Brown, Jane D. – Developmental Psychology, 2011
Steinberg and Monahan's (2011) reanalysis of the Teen Media longitudinal survey of adolescents does not meet prevailing standards for propensity score analysis and therefore does not undermine the original conclusions of the Brown, L'Engle, Pardun, Guo, Kenneavy, and Jackson (2006) analysis. The media do matter in the sexual socialization of…
Descriptors: Socialization, Adolescents, Scores, Sexuality