Showing all 12 results
Peer reviewed
Baker, Eva L. – Teachers College Record, 2013
Background/Context: Education policy over the past 40 years has focused on the importance of accountability in school improvement. Although much of the scholarly discourse around testing and assessment is technical and statistical, understanding of validity by a non-specialist audience is essential as long as test results drive our educational…
Descriptors: Validity, Educational Assessment, Accountability, Educational Improvement
Peer reviewed
Baker, Eva L.; Chung, Gregory K. W. K.; Cai, Li – Review of Research in Education, 2016
This chapter addresses assessment (testing) with an emphasis on the 100-year period since the American Educational Research Association was formed. The authors start with definitions and explanations of contemporary tests. They then look backward into the 19th century to significant work by Horace Mann and Herbert Spencer, who engendered two…
Descriptors: Achievement Tests, Educational History, Testing, Educational Assessment
Buschang, Rebecca E.; Chung, Gregory K. W. K.; Delacruz, Girlie C.; Baker, Eva L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2012
The purpose of this study was to validate inferences about scores of one task designed to measure subject matter knowledge and three tasks designed to measure aspects of pedagogical content knowledge. Evidence for the validity of inferences was based on two expectations. First, if tasks were sensitive to expertise, we would find group differences.…
Descriptors: Validity, Measures (Individuals), Test Interpretation, Algebra
Peer reviewed
Buschang, Rebecca E.; Chung, Gregory K. W. K.; Delacruz, Girlie C.; Baker, Eva L. – Educational Assessment, 2012
The purpose of this study was to validate inferences about scores of one task designed to measure subject matter knowledge and three tasks designed to measure aspects of pedagogical content knowledge. Evidence for the validity of inferences was based on two expectations. First, if tasks were sensitive to expertise, we would find group differences.…
Descriptors: Algebra, Mathematics Teachers, Teacher Characteristics, Knowledge Base for Teaching
Delacruz, Girlie C.; Chung, Gregory K. W. K.; Baker, Eva L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2010
This study provides empirical evidence of a highly specific use of games in education--the assessment of the learner. Linear regressions were used to examine the predictive and convergent validity of a math game as assessment of mathematical understanding. Results indicate that prior knowledge significantly predicts game performance. Results also…
Descriptors: Educational Games, Validity, Prior Learning, Scores
Baker, Eva L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2010
This report provides an overview of what was known about alternative assessment at the time that the article was written in 1991. Topics include beliefs about assessment reform, overview of alternative assessment including research knowledge, evidence of assessment impact, and critical features of alternative assessment. The author notes that in…
Descriptors: Alternative Assessment, Evaluation Methods, Evaluation Research, Performance Based Assessment
Chung, Gregory K. W. K.; Nagashima, Sam O.; Delacruz, Girlie C.; Lee, John J.; Wainess, Richard; Baker, Eva L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2011
The UCLA National Center for Research on Evaluation, Standards, and Student Testing (CRESST) is under contract from the Naval Postgraduate School (NPS) to conduct research on assessment models and tools designed to support Marine Corps rifle marksmanship. In this deliverable, we first review the literature on known-distance rifle marksmanship…
Descriptors: Weapons, Psychomotor Skills, Computer Software, Military Personnel
Nagashima, Sam O.; Chung, Gregory K. W. K.; Espinosa, Paul D.; Berka, Chris; Baker, Eva L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2009
The goal of this report was to test the use of sensor-based skill measures in evaluating performance differences in rifle marksmanship. Ten shots were collected from 30 novices and 9 experts. Three measures for breath control and one for trigger control were used to predict skill classification. The data were fitted with a logistic regression…
Descriptors: Weapons, Classification, Lasers, Models
Baker, Eva L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2007
This paper will describe the relationships between research on learning and its application in assessment models and operational systems. These have been topics of research at the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) for more than 20 years and form a significant part of the intellectual foundation of…
Descriptors: Educational Testing, Inferences, Hypothesis Testing, Predictive Validity
Peer reviewed
Baker, Eva L. – Educational Assessment, 2007
This article describes the history, evidence warrants, and evolution of the Center for Research on Evaluation, Standards, and Student Testing's (CRESST) model-based assessments. It considers alternative interpretations of scientific or practical models and illustrates how model-based assessment addresses both definitions. The components of the…
Descriptors: Educational Testing, Computer Assisted Testing, Validity, Test Construction
Peer reviewed
Goldschmidt, Pete; Martinez, Jose Felipe; Niemi, David; Baker, Eva L. – Educational Assessment, 2007
In this article we examine empirical evidence on the criterion, predictive, transfer, and fairness aspects of validity of a large-scale language arts performance assessment, referred to as the Performance Assignment (PA). We use multilevel models to avoid biased inferences that might result from the naturally nested data. Specifically, we examine…
Descriptors: Language Arts, Performance Based Assessment, Academic Achievement, Performance Tests
Chung, Gregory K. W. K.; Baker, Eva L.; Brill, David G.; Sinha, Ravi; Saadat, Farzad; Bewley, William L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2006
A critical first step in developing training systems is gathering quality information about a trainee's competency in a skill or knowledge domain. Such information includes an estimate of what the trainee knows prior to training, how much has been learned from training, how well the trainee may perform in future task situations, and whether to…
Descriptors: Distance Education, Skill Analysis, Knowledge Level, Prior Learning