Showing 1 to 15 of 48 results
Peer reviewed
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Peer reviewed
Stephen M. Leach; Jason C. Immekus; Jeffrey C. Valentine; Prathiba Batley; Dena Dossett; Tamara Lewis; Thomas Reece – Assessment for Effective Intervention, 2025
Educators commonly use school climate survey scores to inform and evaluate interventions for equitably improving learning and reducing educational disparities. Unfortunately, validity evidence to support these (and other) score uses often falls short. In response, Whitehouse et al. proposed a collaborative, two-part validity testing framework for…
Descriptors: School Surveys, Measurement, Hierarchical Linear Modeling, Educational Environment
Peer reviewed
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Peer reviewed
Dumas, Denis; McNeish, Daniel; Greene, Jeffrey A. – Educational Psychologist, 2020
Scholars have lamented that current methods of assessing student performance do not align with contemporary views of learning as situated within students, contexts, and time. Here, we introduce and describe one theoretical--psychometric paradigm--termed "dynamic measurement"--designed to provide a valid representation of the way students…
Descriptors: Alternative Assessment, Psychometrics, Educational Psychology, Student Evaluation
Peer reviewed
Fitzpatrick, Tess; Clenton, Jon – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2017
This article offers a solution to a significant problem for teachers and researchers of language learning that confounds their interpretations and expectations of test data: The apparent simplicity of tests of vocabulary knowledge masks the complexity of the constructs they claim to measure. The authors first scrutinise task elements in two widely…
Descriptors: Language Tests, Vocabulary Development, Difficulty Level, Performance Factors
Peer reviewed
Berliner, David C. – Teachers College Record, 2015
Trying to understand PISA is analogous to the parable of the blind men and the elephant. There are many facets of the PISA program, and thus many ways to both applaud and critique this ambitious international program of assessment that has gained enormous importance in the crafting of contemporary educational policy. One of the facets discussed in…
Descriptors: Achievement Tests, Standardized Tests, Educational Assessment, Educational Indicators
Choi, Ick Kyu – ProQuest LLC, 2013
At the University of California, Los Angeles, the Test of Oral Proficiency (TOP), an internally developed oral proficiency test, is administered to international teaching assistant (ITA) candidates to ensure an appropriate level of academic oral English proficiency. Test taker performances are rated live by two raters according to four subscales.…
Descriptors: Screening Tests, Profiles, Oral Language, English
Peer reviewed
PDF on ERIC
Yen, Wendy M.; Lall, Venessa F.; Monfils, Lora – ETS Research Report Series, 2012
Alternatives to vertical scales are compared for measuring longitudinal academic growth and for producing school-level growth measures. The alternatives examined were empirical cross-grade regression, ordinary least squares and logistic regression, and multilevel models. The student data used for the comparisons were Arabic Grades 4 to 10 in…
Descriptors: Foreign Countries, Scaling, Item Response Theory, Test Interpretation
Klesch, Heather S. – ProQuest LLC, 2010
The reporting of scores on educational tests is at times misunderstood, misinterpreted, and potentially confusing to examinees and other stakeholders who may need to interpret test scores. In reporting test results to examinees, there is a need for clarity in the message communicated. As pressure rises for students to demonstrate performance at a…
Descriptors: Feedback (Response), Test Results, Focus Groups, Educational Testing
Peer reviewed
Bramley, Tom; Gill, Tim – Research Papers in Education, 2010
The rank-ordering method for standard maintaining was designed for the purpose of mapping a known cut-score (e.g. a grade boundary mark) on one test to an equivalent point on the test score scale of another test, using holistic expert judgements about the quality of exemplars of examinees' work (scripts). It is a novel application of an old…
Descriptors: Scores, Psychometrics, Measurement Techniques, Foreign Countries
Gray, B. Thomas – 1997
Validity is a critically important issue with far-reaching implications for testing. The history of conceptualizations of validity over the past 50 years is reviewed, and 3 important areas of controversy are examined. First, the question of whether the three traditionally recognized types of validity should be integrated as a unitary entity of…
Descriptors: Educational Testing, Evaluation Methods, Reliability, Scores
Russell, Michael – 2000
This Digest introduces the advantages and disadvantages of three commonly used methods of reporting test score changes: (1) change in percentile rank; (2) scale or raw score change; and (3) percent change. The change in percentile rank method focuses on the increase or decrease of the mean percentile ranking for a group of students. This method…
Descriptors: Achievement Gains, Change, Evaluation Methods, Scores
Braun, Henry I.; Mislevy, Robert J. – US Department of Education, 2004
Psychologist Andrea diSessa coined the term "phenomenological primitives", or p-prims, to talk about nonexperts' reasoning about physical situations. P-prims are primitive in the sense that they stand without significant explanatory substructure or explanation. Examples are "Heavy objects fall faster than light objects" and "Continuing force is…
Descriptors: Test Theory, Testing, Evaluation Methods, Scores
Harris, Deborah J. – 2003
Tests and assessments are generally administered to gather data to aid in decision making, either at an individual student level or at an aggregated level. In order to incorporate assessment data into informed decision making, test users need to understand the test results. This chapter highlights the types of test scores and test score…
Descriptors: Decision Making, Educational Assessment, Educational Testing, Evaluation Methods
Russell, Michael – 2000
An earlier Digest described the shortcomings of three methods commonly used to summarize changes in test scores. This Digest describes two less commonly used approaches for examining changes in test scores, those of Standardized Growth Estimates and Effect Sizes. Aspects of these two approaches are combined and applied to the Iowa Test of Basic…
Descriptors: Achievement Gains, Change, Effect Size, Evaluation Methods