Raykov, Tenko; Calvocoressi, Lisa; Schumacker, Randall E. – Measurement: Interdisciplinary Research and Perspectives, 2024
This paper is concerned with the process of selecting between the increasingly popular bi-factor model and the second-order factor model in measurement research. It is indicated that in certain settings widely used in empirical studies, the second-order model is nested in the bi-factor model and obtained from the latter after imposing appropriate…
Descriptors: Factor Analysis, Decision Making, Computer Software, Measurement Techniques
Raykov, Tenko; Anthony, James C.; Menold, Natalja – Educational and Psychological Measurement, 2023
The population relationship between coefficient alpha and scale reliability is studied in the widely used setting of unidimensional multicomponent measuring instruments. It is demonstrated that for any set of component loadings on the common factor, regardless of the extent of their inequality, the discrepancy between alpha and reliability can be…
Descriptors: Correlation, Evaluation Research, Reliability, Measurement Techniques
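As an illustration of the two quantities compared in the abstract above, the following is a minimal sketch, not the authors' derivation. It assumes a unidimensional (congeneric) model with unit factor variance and uncorrelated errors; the function names and the example loadings are hypothetical.

```python
def scale_reliability(loadings, error_variances):
    """Population reliability of the unit-weighted sum score.

    Assumes X_i = loading_i * factor + error_i, with Var(factor) = 1
    and mutually uncorrelated errors (an assumption of this sketch).
    """
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(error_variances))


def coefficient_alpha(loadings, error_variances):
    """Population coefficient alpha under the same model."""
    k = len(loadings)
    total_var = sum(loadings) ** 2 + sum(error_variances)
    # Each component's variance is loading^2 + error variance.
    item_var_sum = sum(l * l + e for l, e in zip(loadings, error_variances))
    return k / (k - 1) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    equal = ([0.7, 0.7, 0.7, 0.7], [0.51] * 4)
    unequal = ([0.4, 0.6, 0.8, 0.9], [0.5] * 4)
    for name, (lam, theta) in (("equal", equal), ("unequal", unequal)):
        print(name, coefficient_alpha(lam, theta), scale_reliability(lam, theta))
```

Under these assumptions, alpha equals reliability when the loadings are equal and falls below it when they differ.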
Richer, Amanda; Charmaraman, Linda; Ceder, Ineke – Afterschool Matters, 2018
Like instruments used in afterschool programs to assess children's social and emotional growth or to evaluate staff members' performance, instruments used to evaluate program quality should be free from bias. Practitioners and researchers alike want to know that assessment instruments, whatever their type or intent, treat all people fairly and do…
Descriptors: Cultural Differences, Social Bias, Interrater Reliability, Program Evaluation
Wiberg, Marie; van der Linden, Wim J.; von Davier, Alina A. – Journal of Educational Measurement, 2014
Three local observed-score kernel equating methods that integrate methods from the local equating and kernel equating frameworks are proposed. The new methods were compared with their earlier counterparts with respect to such measures as bias--as defined by Lord's criterion of equity--and percent relative error. The local kernel item response…
Descriptors: Measurement Techniques, Evaluation Methods, Item Response Theory, Equated Scores
Grissom, Jason A., Ed.; Youngs, Peter, Ed. – Teachers College Press, 2015
This is the first book to gather and address what we have learned about the impacts and challenges of data-intensive teacher evaluation systems--a defining characteristic of the current education policy landscape. Expert researchers and practitioners speak to what we know (and what remains to be known) about evaluation measures themselves, the…
Descriptors: Teacher Evaluation, Evaluation Methods, Evaluation Research, Test Validity
Wandersman, Abraham – American Journal of Evaluation, 2014
The Labin et al. logic model describes the why, how, what, and potential outcomes of evaluation capacity building (ECB). Getting To Outcomes offers a frame and empirical results for operationalizing the ECB logic model of Labin et al. and for deepening the science and practice of ECB.
Descriptors: Evaluation, Capacity Building, Methods, Accountability
Fives, Helenrose; Barnes, Nicole; Dacey, Charity; Gillis, Anna – Teacher Educator, 2016
We conducted a content analysis of 27 assessment textbooks to determine how assessment planning was framed in texts for preservice teachers. We identified eight assessment planning themes: alignment, assessment purpose and types, reliability and validity, writing goals and objectives, planning specific assessments, unpacking, overall assessment…
Descriptors: Student Evaluation, Lesson Plans, Knowledge Base for Teaching, Textbook Evaluation
Phillips, Shane Michael – ProQuest LLC, 2012
Propensity score matching is a relatively new technique used in observational studies to approximate data that have been randomly assigned to treatment. This technique assimilates the values of several covariates into a single propensity score that is used as a matching variable to create similar groups. This dissertation comprises two separate…
Descriptors: Statistical Analysis, Educational Research, Simulation, Observation
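The matching step described in the abstract above can be sketched as follows. This is a minimal illustration, not the dissertation's procedure: it assumes the propensity scores have already been estimated (for example, by a logistic regression of treatment on the covariates) and uses greedy 1:1 nearest-neighbor matching without replacement. The function name, the caliper value, and the unit IDs are hypothetical.

```python
def match_on_propensity(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on precomputed propensity scores.

    treated, control: dicts mapping a unit ID to its propensity score.
    caliper: maximum allowed score difference for a match (assumed value).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # copy, so the caller's dict is not mutated
    pairs = []
    # Process treated units in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control unit by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # matching without replacement
    return pairs


if __name__ == "__main__":
    treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
    control = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}
    print(match_on_propensity(treated, control))
```

Treated units with no control unit inside the caliper are left unmatched, which trades sample size for closer similarity between the groups.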
Guarino, Cassandra M. – Education Policy Center at Michigan State University, 2013
The push for accountability in public schooling has extended to the measurement of teacher performance, accelerated by federal efforts through Race to the Top. Currently, a large number of states and districts across the country are computing measures of teacher performance based on the standardized test scores of their students and using them in…
Descriptors: Teacher Evaluation, Teacher Effectiveness, Models, Program Descriptions
Fauskanger, Janne – Educational Studies in Mathematics, 2015
Mathematical knowledge for teaching (MKT) measures have been widely adopted by researchers. Critics have debated the value of such measures and questioned the type of knowledge that these access. This article reports on a study where the challenges in measuring teachers' knowledge were illuminated through investigating relationships between the…
Descriptors: Knowledge Base for Teaching, Teacher Competency Testing, Evaluation Problems, Multiple Choice Tests
Kim, Eun Sook; Yoon, Myeongsun; Lee, Taehun – Educational and Psychological Measurement, 2012
Multiple-indicators multiple-causes (MIMIC) modeling is often used to test a latent group mean difference while assuming the equivalence of factor loadings and intercepts over groups. However, this study demonstrated that MIMIC was insensitive to the presence of factor loading noninvariance, which implies that factor loading invariance should be…
Descriptors: Test Items, Simulation, Testing, Statistical Analysis
Chouinard, Jill Anne – American Journal of Evaluation, 2013
Evaluation occurs within a specific context and is influenced by the economic, political, historical, and social forces that shape that context. The culture of evaluation is thus very much embedded in the culture of accountability that currently prevails in public sector institutions, policies, and programs. As such, our understanding of the…
Descriptors: Accountability, Public Sector, Participatory Research, Context Effect
Elbeck, Matt; Bacon, Don – Journal of Education for Business, 2015
The absence of universally accepted definitions for direct and indirect assessment motivates the purpose of this article: to offer definitions that are literature-based and theoretically driven, meeting K. Lewin's (1945) dictum that, "There is nothing so practical as a good theory" (p. 129). The authors synthesize the literature to…
Descriptors: Definitions, Evaluation Methods, Global Approach, Evidence
Raykov, Tenko; Patelis, Thanos; Marcoulides, George A. – Educational and Psychological Measurement, 2011
A latent variable modeling approach that can be used to examine whether several psychometric tests are parallel is discussed. The method consists of sequentially testing the properties of parallel measures via a corresponding relaxation of parameter constraints in a saturated model or an appropriately constructed latent variable model. The…
Descriptors: Models, Psychometrics, Evaluation Methods, Evaluation Research
Franic, Duska M.; Bothe, Anne K.; Bramlett, Robin E. – Journal of Communication Disorders, 2012
Purpose: To assess the feasibility of using one or more of four standard economic preference measures to assess health-related quality of life in stuttering, by assessing respondents' views of the acceptability of those measures. Method and results: A graphic positioning scale approach was used with 80 adults to assess four variables previously…
Descriptors: Stuttering, Quality of Life, Rating Scales, Decision Making