Showing 1 to 15 of 28 results
Peer reviewed
Atteberry, Allison; Mangan, Daniel – Educational Researcher, 2020
Papay (2011) noticed that teacher value-added measures (VAMs) from a statistical model using the most common pre/post testing timeframe--current-year spring relative to previous spring (SS)--are essentially unrelated to those same teachers' VAMs when instead using next-fall relative to current-fall (FF). This is concerning since this choice--made…
Descriptors: Correlation, Value Added Models, Pretests Posttests, Decision Making
Peer reviewed
Macqueen, Susy; Knoch, Ute; Wigglesworth, Gillian; Nordlinger, Rachel; Singer, Ruth; McNamara, Tim; Brickle, Rhianna – Language Testing, 2019
All educational testing is intended to have consequences, which are assumed to be beneficial, but tests may also have unintended, negative consequences (Messick, 1989). The issue is particularly important in the case of large-scale standardized tests, such as Australia's "National Assessment Program--Literacy and Numeracy" (NAPLAN), the…
Descriptors: Numeracy, Standardized Tests, National Curriculum, Testing Programs
Peer reviewed
Davis-Becker, Susan L.; Buckendahl, Chad W. – International Journal of Testing, 2013
A critical component of the standard setting process is collecting evidence to evaluate the recommended cut scores and their use for making decisions and classifying students based on test performance. Kane (1994, 2001) proposed a framework by which practitioners can identify and evaluate evidence of the results of the standard setting from (1)…
Descriptors: Standard Setting (Scoring), Evidence, Validity, Cutting Scores
Peer reviewed
Chow, Kui Foon; Kennedy, Kerry John – Educational Research and Evaluation, 2014
International large-scale assessments are now part of the educational landscape in many countries and often feed into major policy decisions. Yet, such assessments also provide data sets for secondary analysis that can address key issues of concern to educators and policymakers alike. Traditionally, such secondary analyses have been based on a…
Descriptors: Measurement, Data Analysis, Educational Assessment, Multivariate Analysis
Peer reviewed
Qi, Sen; Mitchell, Ross E. – Journal of Deaf Studies and Deaf Education, 2012
The first large-scale, nationwide academic achievement testing program using Stanford Achievement Test (Stanford) for deaf and hard-of-hearing children in the United States started in 1969. Over the past three decades, the Stanford has served as a benchmark in the field of deaf education for assessing student academic achievement. However, the…
Descriptors: Testing Programs, Educational Testing, Deafness, Academic Achievement
Peer reviewed
Lane, Suzanne – Measurement: Interdisciplinary Research and Perspectives, 2012
Considering consequences in the evaluation of validity is not new although it is still debated by Paul E. Newton and others. The argument-based approach to validity entails an interpretative argument that explicitly identifies the proposed interpretations and uses of test scores and a validity argument that provides a structure for evaluating the…
Descriptors: Educational Opportunities, Accountability, Validity, Inferences
Peer reviewed
Creagh, Sue – TESOL in Context, 2014
Teachers are now experiencing the age of quantitative test-driven assessment, in which there is little weight accorded to teacher-based judgement about student progress. In the Australian context, the NAPLaN test has become a driving force in school and teacher accountability. The language of NAPLaN is one of bands and numerical scores and…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Student Evaluation
Peer reviewed
Buckendahl, Chad W.; Plake, Barbara S.; Davis, Susan L. – Applied Measurement in Education, 2009
The National Assessment of Educational Progress (NAEP) program is a series of periodic assessments administered nationally to samples of students and designed to measure different content areas. This article describes a multi-year study that focused on the breadth of the development, administration, maintenance, and renewal of the assessments in…
Descriptors: National Competency Tests, Audits (Verification), Testing Programs, Program Evaluation
Peer reviewed
Wang, Shudong; Jiao, Hong – Educational and Psychological Measurement, 2009
In practice, vertical scales have been continually used to measure students' achievement progress across several grade levels and have been considered very challenging psychometric procedures. Recently, such practices have been drawing many criticisms. The major criticisms focus on dimensionality and construct equivalence of the latent trait or…
Descriptors: Reading Comprehension, Elementary Secondary Education, Measures (Individuals), Psychometrics
Peer reviewed
Jones, Terry; Cason, Carolyn L.; Mancini, Mary E. – Journal of Professional Nursing, 2002
Registered nurses (n=368) participated in a skills recredentialing program in which competencies were assessed by a knowledge test and performance test under simulated conditions and evaluator ratings in actual patient-care situations. No significant differences in results between the simulated and actual conditions support the validity of the…
Descriptors: Competence, Credentials, Interrater Reliability, Nurses
Peer reviewed
Guskey, Thomas R.; Kifer, Edward W. – Educational Measurement: Issues and Practice, 1990
How state educational authorities in Kentucky use statewide test data to rank the state's 178 school districts was studied, using data from the "Kentucky Essential Skills Test: Statewide Testing Results" (1987). The methods used, means of refining those methods, the fairness/accuracy/validity of resulting interpretations, and problems…
Descriptors: School Districts, School Effectiveness, State Programs, Test Results
Peer reviewed
Ryan, Katherine – Educational Measurement: Issues and Practice, 2002
Proposes a process approach to validity that addresses assessment validation in the context of high-stakes assessment. This approach includes a test evaluator or validator who considers the perspectives of five stakeholder groups at four different stages of assessment maturity in relation to six aspects of construct validity. Illustrates each…
Descriptors: Educational Assessment, Elementary Secondary Education, Evaluators, High Stakes Tests
Peer reviewed
Mehrens, William A. – Applied Measurement in Education, 2000
Presents conclusions of an independent measurement expert that the Texas Assessment of Academic Skills (TAAS) was constructed according to acceptable professional standards and tests curricular material considered by the Texas Board of Education important for graduates to have mastered. Also supports the validity and reliability of the TAAS and…
Descriptors: Curriculum, Psychometrics, Reliability, Standards
Peer reviewed
Crocker, Linda – Educational Measurement: Issues and Practice, 2002
Introduces the articles of this theme issue focusing on the involvement of key stakeholder groups in the validation of large-scale high-stakes assessments. Each makes a unique but complementary contribution to the understanding of the demands of a comprehensive validation effort. (SLD)
Descriptors: Elementary Secondary Education, High Stakes Tests, Performance Based Assessment, Stakeholders
Peer reviewed
Kane, Michael – Educational Measurement: Issues and Practice, 2002
Makes the point that the interpretations and use of high-stakes test scores rely on policy assumptions about what should be taught and the content standards and performance standards that should be applied. The assumptions built into an assessment need to be subjected to scrutiny and criticism if a strong case is to be made for the validity of the…
Descriptors: Educational Policy, Elementary Secondary Education, High Stakes Tests, Scores