National Centre for Vocational Education Research (NCVER), 2010
The Longitudinal Surveys of Australian Youth (LSAY) is a research program that tracks young people as they move from school into further study, work and other destinations. This "User guide" has been developed for users of the LSAY data. The guide endeavours to consolidate existing technical documentation and other relevant information…
Descriptors: Longitudinal Studies, Youth, Foreign Countries, Guides
Kim, Sooyeon; Livingston, Samuel A.; Lewis, Charles – ETS Research Report Series, 2008
This paper describes an empirical evaluation of a Bayesian procedure for equating scores on test forms taken by small numbers of examinees, using collateral information from the equating of other test forms. In this procedure, a separate Bayesian estimate is derived for the equated score at each raw-score level, making it unnecessary to specify a…
Descriptors: Equated Scores, Statistical Analysis, Sample Size, Bayesian Statistics
Lee, Yi-Hsuan; Zhang, Jinming – ETS Research Report Series, 2008
The method of maximum-likelihood is typically applied to item response theory (IRT) models when the ability parameter is estimated while conditioning on the true item parameters. In practice, the item parameters are unknown and need to be estimated first from a calibration sample. Lewis (1985) and Zhang and Lu (2007) proposed the expected response…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Ability
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – ETS Research Report Series, 2008
Will reporting subscores provide any information beyond the total score? Is there a method that can be used to provide more trustworthy subscores than observed subscores? These 2 questions are addressed in this study. To answer the 2nd question, 2 subscore estimation methods (i.e., subscore estimated from the observed total score or…
Descriptors: Comparative Analysis, Scores, Tests, Certification
Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen – Applied Measurement in Education, 2008
In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, genetic algorithm (GA), to construct parallel…
Descriptors: Test Format, Measurement Techniques, Equations (Mathematics), Item Response Theory
Emons, Wilco H. M. – Applied Psychological Measurement, 2008
Person-fit methods are used to uncover atypical test performance as reflected in the pattern of scores on individual items in a test. Unlike parametric person-fit statistics, nonparametric person-fit statistics do not require fitting a parametric test theory model. This study investigates the effectiveness of generalizations of nonparametric…
Descriptors: Simulation, Nonparametric Statistics, Item Response Theory, Goodness of Fit
A Generally Robust Approach for Testing Hypotheses and Setting Confidence Intervals for Effect Sizes
Keselman, H. J.; Algina, James; Lix, Lisa M.; Wilcox, Rand R.; Deering, Kathleen N. – Psychological Methods, 2008
Standard least squares analysis of variance methods suffer from poor power under arbitrarily small departures from normality and fail to control the probability of a Type I error when standard assumptions are violated. This article describes a framework for robust estimation and testing that uses trimmed means with an approximate degrees of…
Descriptors: Intervals, Testing, Least Squares Statistics, Effect Size
Sykes, Robert C.; Ito, Kyoko; Wang, Zhen – Educational Measurement: Issues and Practice, 2008
Student responses to a large number of constructed response items in three Math and three Reading tests were scored on two occasions using three ways of assigning raters: single reader scoring, a different reader for each response (item-specific), and three readers each scoring a rater item block (RIB) containing approximately one-third of a…
Descriptors: Test Items, Mathematics Tests, Reading Tests, Scoring
Setzer, J. Carl; He, Yi – GED Testing Service, 2009
Reliability Analysis for the Internationally Administered 2002 Series GED (General Educational Development) Tests. Reliability refers to the consistency, or stability, of test scores when the authors administer the measurement procedure repeatedly to groups of examinees (American Educational Research Association [AERA], American Psychological…
Descriptors: Educational Research, Error of Measurement, Scores, Test Reliability
Smith, Emma – International Journal of Research & Method in Education, 2009
Secondary data analysis as a methodological approach is not without its critics. Indeed, three main objections to the use of secondary data analysis in social research stand out: first that because of the socially constructed nature of social data, the act of reducing it to a simple numeric form cannot fully encapsulate its complexity. Secondly,…
Descriptors: Expulsion, Foreign Countries, Data Analysis, Information Sources
Li, Deping; Oranje, Andreas; Jiang, Yanlin – ETS Research Report Series, 2007
The hierarchical latent regression model (HLRM) is a flexible framework for estimating group-level proficiency while taking into account the complex sample designs often found in large-scale educational surveys. For a complex assessment design in which information is collected at different levels (such as student, school, and district), the model also…
Descriptors: Hierarchical Linear Modeling, Regression (Statistics), Computation, Comparative Analysis
Isenberg, Eric; Hock, Heinrich – Mathematica Policy Research, Inc., 2012
In this report, the authors describe the value-added models used as part of teacher evaluation systems in the District of Columbia Public Schools (DCPS) and in eligible DC charter schools participating in Race to the Top. They estimated (1) teacher effectiveness in DCPS and eligible DC charter schools during the 2011-2012 school year; and (2)…
Descriptors: Value Added Models, Teacher Evaluation, Public Schools, Urban Schools
Peer reviewed
Silverstein, A. B. – Journal of Consulting and Clinical Psychology, 1984
Examined the standard error for short forms of Wechsler's scales with deviant subjects (N=2000). Demonstrated that the standard error of estimate of a short form for the standardization sample is an excellent approximation to the standard error of a predicted IQ for a new, even markedly deviant, subject. (LLL)
Descriptors: Error of Measurement, Intelligence Tests
Roberts, J. Kyle; Onwuegbuzie, Anthony J.; Eby, J. Robert – 2001
This paper suggests that although data from a homogenous sample might yield less reliable scores than did an inducted sample, these data should not be discarded until further examination of the data is conducted. The paper presents two statistics for monitoring data homogeneity and one statistic for correcting alpha when homogeneity is large. The…
Descriptors: Error of Measurement, Reliability, Scores
Davis, Brandon – 2001
This paper reviews the concept of experimentwise Type I error. While "testwise" alpha refers to the probability of making a Type I error for a single hypothesis test, "experimentwise" error refers to the probability of having made a Type I error anywhere within the study. Experimentwise error concerns are the basis for two…
Descriptors: Error of Measurement, Multivariate Analysis
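
The distinction in the Davis (2001) abstract above between "testwise" and "experimentwise" error can be made concrete with a small calculation. The sketch below assumes that all tests in a study are independent and share the same testwise alpha; that assumption, and the function name `experimentwise_alpha`, are illustrative and do not come from the paper. Under that assumption the experimentwise rate is 1 - (1 - alpha)^k for k tests.

```python
def experimentwise_alpha(testwise_alpha: float, n_tests: int) -> float:
    """Probability of at least one Type I error across n_tests tests.

    Assumes the tests are independent and each uses the same testwise
    alpha (an illustrative assumption, not taken from the source).
    """
    if not 0.0 <= testwise_alpha <= 1.0:
        raise ValueError("testwise_alpha must be between 0 and 1")
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    # P(no Type I error in any test) = (1 - alpha) ** k, so the
    # experimentwise rate is the complement of that probability.
    return 1.0 - (1.0 - testwise_alpha) ** n_tests


if __name__ == "__main__":
    for k in (1, 5, 10):
        print(k, round(experimentwise_alpha(0.05, k), 4))
```

With a testwise alpha of .05, a single test keeps the experimentwise rate at .05, while ten independent tests raise it to roughly .40.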
