Showing 1 to 15 of 49 results
Peer reviewed
Leventhal, Brian C.; Grabovsky, Irina – Educational Measurement: Issues and Practice, 2020
Standard setting is arguably one of the most subjective techniques in test development and psychometrics. The decisions made when scores are compared to standards, however, are arguably the most consequential outcomes of testing. Providing licensure to practice in a profession has high-stakes consequences for the public. Denying graduation or forcing…
Descriptors: Standard Setting (Scoring), Weighted Scores, Test Construction, Psychometrics
Peer reviewed
Gotch, Chad M.; Roduta Roberts, Mary – Educational Measurement: Issues and Practice, 2018
As the primary interface between test developers and multiple educational stakeholders, score reports are a critical component to the success (or failure) of any assessment program. The purpose of this review is to document recent research on individual-level score reporting to advance the research and practice of score reporting. We conducted a…
Descriptors: Scores, Models, Correlation, Stakeholders
Peer reviewed
Russell, Mike; Ludlow, Larry; O'Dwyer, Laura – Educational Measurement: Issues and Practice, 2019
The field of educational measurement has evolved considerably since the first doctoral programs were established. In response, programs have typically tacked on courses that address newly developed theories, methods, tools, and techniques. As our review of current programs evidences, this approach produces artificial distinctions among topics and…
Descriptors: Educational Testing, Specialists, Doctoral Programs, Program Evaluation
Peer reviewed
Sijtsma, Klaas – Educational Measurement: Issues and Practice, 2015
I discuss the contribution by Davenport, Davison, Liou, & Love (2015) in which they relate reliability represented by coefficient α to formal definitions of internal consistency and unidimensionality, both proposed by Cronbach (1951). I argue that coefficient α is a lower bound to reliability and that concepts of internal consistency and…
Descriptors: Reliability, Mathematics, Validity, Test Construction
Peer reviewed
Newton, Paul E. – Educational Measurement: Issues and Practice, 2017
The dominant narrative for assessment design seems to reflect a strong, albeit largely implicit undercurrent of purpose purism, which idealizes the principle that assessment design should be driven by a single assessment purpose. With a particular focus on achievement assessments, the present article questions the tenability of purpose purism,…
Descriptors: Evaluation Methods, Test Construction, Instructional Design, Decision Making
Peer reviewed
Arffman, Inga – Educational Measurement: Issues and Practice, 2013
The article reviews research and findings on problems and issues faced when translating international academic achievement tests. The purpose is to draw attention to the problems, to help to develop the procedures followed when translating the tests, and to provide suggestions for further research. The problems concentrate on the following: the…
Descriptors: Achievement Tests, Translation, Testing Problems, Test Construction
Peer reviewed
Johnstone, Christopher J.; Thompson, Sandra J.; Bottsford-Miller, Nicole A.; Thurlow, Martha L. – Educational Measurement: Issues and Practice, 2008
Test items undergo multiple iterations of review before states and vendors deem them acceptable to be placed in a live statewide assessment. This article reviews three approaches that can add validity evidence to states' item review processes. The first process is a structured sensitivity review process that focuses on universal design…
Descriptors: Test Items, Disabilities, Test Construction, Testing Programs
Peer reviewed
Nichols, Paul D.; Meyers, Jason L.; Burling, Kelly S. – Educational Measurement: Issues and Practice, 2009
Assessments labeled as formative have been offered as a means to improve student achievement. But labels can be a powerful way to miscommunicate. For an assessment use to be appropriately labeled "formative," both empirical evidence and reasoned arguments must be offered to support the claim that improvements in student achievement can be linked…
Descriptors: Academic Achievement, Tutoring, Student Evaluation, Evaluation Methods
Peer reviewed
Nichols, Paul; Sugrue, Brenda – Educational Measurement: Issues and Practice, 1999
Uses data from the National Assessment of Educational Progress to demonstrate the frequent lack of fidelity between cognitively complex construct definitions and the simple cognitive assumptions embedded in common test-development practices. Describes alternative, construct-centered test development approaches for each stage of the test-development…
Descriptors: Cognitive Tests, Educational Practices, Test Construction
Peer reviewed
Reckase, Mark D. – Educational Measurement: Issues and Practice, 1998
Considers what a responsible test developer would do to gain information to support the consequential basis of validity for a test early in its development. How the consequential basis of validity of the program would be monitored and reported during the life of the program is examined. The validity of the ACT Assessment is considered as if it…
Descriptors: Evaluation Methods, Program Evaluation, Test Construction, Validity
Peer reviewed
Gorin, Joanna S. – Educational Measurement: Issues and Practice, 2006
One of the primary themes of the National Research Council's 2001 book "Knowing What Students Know" was the importance of cognition as a component of assessment design and measurement theory (NRC, 2001). One reaction to the book has been an increased use of sophisticated statistical methods to model cognitive information available in test data.…
Descriptors: Test Construction, Student Evaluation, Academic Ability, Evaluation Methods
Peer reviewed
Green, Donald Ross – Educational Measurement: Issues and Practice, 1998
Asserts that publishers of achievement tests are, for the most part, not in a position to obtain on their own any decent evidence about the consequences of uses made of their tests. Reasons why this is so are discussed, and what publishers can be expected to do is outlined. (SLD)
Descriptors: Achievement Tests, Elementary Secondary Education, Test Construction, Test Use
Peer reviewed
Bock, R. Darrell – Educational Measurement: Issues and Practice, 1997
This brief history traces the development of item response theory (IRT) from concepts originating in 19th-century mathematics and psychology to present-day principles drawn from statistical estimation theory. Connections to other fields and current trends in IRT are outlined. (SLD)
Descriptors: Estimation (Mathematics), History, Item Response Theory, Psychometrics
Peer reviewed
Diamond, Esther E.; Fremer, John – Educational Measurement: Issues and Practice, 1989
The Joint Committee on Testing Practices has completed the "Code of Fair Testing Practices in Education," which is meant for the public and focuses on the proper use of tests in education--admissions, educational assessment and diagnosis, and student placement. The Code separately addresses test developers' and users' roles. (SLD)
Descriptors: Educational Testing, Evaluation Utilization, Examiners, Scoring
Peer reviewed
Reckase, Mark D. – Educational Measurement: Issues and Practice, 1989
Requirements for adaptive testing are reviewed, and the reasons implementation has taken so long are explored. The adaptive test is illustrated through the Stanford-Binet Intelligence Scale of L. M. Terman and M. A. Merrill (1960). Current adaptive testing is tied to the development of item response theory. (SLD)
Descriptors: Adaptive Testing, Educational Development, Elementary Secondary Education, Latent Trait Theory