Publication Date
In 2025 | 9 |
Since 2024 | 31 |
Since 2021 (last 5 years) | 88 |
Since 2016 (last 10 years) | 197 |
Since 2006 (last 20 years) | 401 |
Descriptor
Test Interpretation | 3974 |
Test Validity | 958 |
Test Construction | 688 |
Elementary Secondary Education | 677 |
Scores | 650 |
Test Results | 623 |
Test Reliability | 622 |
Testing | 549 |
Achievement Tests | 510 |
Standardized Tests | 490 |
Testing Problems | 488 |
Audience
Practitioners | 274 |
Researchers | 122 |
Teachers | 102 |
Administrators | 63 |
Counselors | 28 |
Parents | 21 |
Policymakers | 21 |
Students | 15 |
Community | 8 |
Location
Canada | 44 |
California | 33 |
Australia | 32 |
United Kingdom | 23 |
United States | 19 |
Pennsylvania | 18 |
United Kingdom (England) | 16 |
New York | 15 |
Michigan | 14 |
Japan | 13 |
New Jersey | 12 |
Li, Xu; Ouyang, Fan; Liu, Jianwen; Wei, Chengkun; Chen, Wenzhi – Journal of Educational Computing Research, 2023
The computer-supported writing assessment (CSWA) has been widely used to reduce instructor workload and provide real-time feedback. Interpretability of CSWA draws extensive attention because it can benefit the validity, transparency, and knowledge-aware feedback of academic writing assessments. This study proposes a novel assessment tool,…
Descriptors: Computer Assisted Testing, Writing Evaluation, Feedback (Response), Natural Language Processing
Eirini M. Mitropoulou; Leonidas A. Zampetakis; Ioannis Tsaousis – Evaluation Review, 2024
Unfolding item response theory (IRT) models are important alternatives to dominance IRT models in describing the response processes on self-report tests. Their usage is common in personality measures, since they indicate potential differentiations in test score interpretation. This paper aims to gain a better insight into the structure of trait…
Descriptors: Foreign Countries, Adults, Item Response Theory, Personality Traits
Wind, Stefanie A. – Educational Measurement: Issues and Practice, 2020
Researchers have documented the impact of rater effects, or raters' tendencies to give different ratings than would be expected given examinee achievement levels, in performance assessments. However, the degree to which rater effects influence person fit, or the reasonableness of test-takers' achievement estimates given their response patterns,…
Descriptors: Performance Based Assessment, Evaluators, Achievement, Influences
Clark, Amy K.; Karvonen, Meagan – Educational Assessment, 2020
Alternate assessments based on alternate achievement standards (AA-AAS) have historically lacked broad validity evidence and an overall evaluation of the extent to which evidence supports intended uses of results. An expanding body of validation literature, the funding of two AA-AAS consortia, and advances in computer-based assessment have…
Descriptors: Alternative Assessment, Test Validity, Test Use, Students with Disabilities
Kannan, Priya; Zapata-Rivera, Diego; Bryant, Andrew D. – Practical Assessment, Research & Evaluation, 2021
Individual-student score reports sometimes include information about precision of scores (i.e., measurement error). In this study, we specifically investigated if parents understand this information when presented. We conducted an online experimental study where 196 parents of middle school children, from various parts of the country, were…
Descriptors: Comprehension, Parents, Error of Measurement, Test Interpretation
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Smith, Leann V.; Graves, Scott L. – Contemporary School Psychology, 2021
The purpose of this paper is to examine the factorial invariance of the Wechsler Intelligence Scale for Children--Fifth Edition (WISC-V) between genders in a sample of Black students in an urban, public school district. Few researchers test the validity of cognitive assessments on Black samples and even fewer do so utilizing samples other than…
Descriptors: Children, Intelligence Tests, African American Students, Urban Schools
Zengilowski, Allison; Schuetze, Brendan A.; Nash, Brady L.; Schallert, Diane L. – Educational Psychologist, 2021
Refutation texts, rhetorical tools designed to reduce misconceptions, have garnered attention across four decades and many studies. Yet, the ability of a refutation text to change a learner's mind on a topic needs to be qualified and modulated. In this critical review, we bring attention to sources of constraints often overlooked by refutation…
Descriptors: Misconceptions, Instructional Materials, Research Problems, Research Methodology
Dadey, Nathan; Keng, Leslie; Boyer, Michelle; Marion, Scott – National Center for the Improvement of Educational Assessment, 2021
State summative educational assessment is about to begin in earnest. Rightfully, many are raising questions about the quality, meaning, and appropriate use of the assessment results. This document was written to support state educational agencies (SEAs) and their assessment providers in devising effective and efficient analysis plans. This…
Descriptors: Educational Assessment, Summative Evaluation, Student Evaluation, Test Use
Papageorgiou, Spiros; Davis, Larry; Ohta, Renka; Gomez, Pablo Garcia – ETS Research Report Series, 2022
In this research report, we describe a study to map the scores of the "TOEFL® Essentials"™ test to the Canadian Language Benchmarks (CLB). The TOEFL Essentials test is a four-skills assessment of foundational English language skills and communication abilities in academic and general (daily life) contexts. At the time of writing this…
Descriptors: Foreign Countries, Language Tests, English (Second Language), Second Language Learning
Carney, Michele; Crawford, Angela; Siebert, Carl; Osguthorpe, Rich; Thiede, Keith – Applied Measurement in Education, 2019
The "Standards for Educational and Psychological Testing" recommend an argument-based approach to validation that involves a clear statement of the intended interpretation and use of test scores, the identification of the underlying assumptions and inferences in that statement--termed the interpretation/use argument, and gathering of…
Descriptors: Inquiry, Test Interpretation, Validity, Scores
Chen, Dandan – Online Submission, 2023
Technology-driven shifts have created opportunities to improve efficiency and quality of assessments. Meanwhile, they may have exacerbated underlying socioeconomic issues in relation to educational equity. The increased implementation of technology-based assessments during the COVID-19 pandemic compounds the concern about the digital divide, as…
Descriptors: Technology Uses in Education, Computer Assisted Testing, Alternative Assessment, Test Format
Beniermann, Anna; Moormann, Alexandra; Fiedler, Daniela – Journal of Research in Science Teaching, 2023
Over the past decades, a large body of research has examined students' magnitudes of evolution acceptance and related measurement issues resulting in questions concerning instruments' validity and operationalization. Until now, several studies investigated validity aspects of often-used evolution acceptance instruments and came to diverging…
Descriptors: Preservice Teachers, Science Teachers, Biology, Evolution
Lestari, Santi B.; Brunfaut, Tineke – Language Testing, 2023
Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating…
Descriptors: Reading Writing Relationship, Writing Tests, Rating Scales, Writing Processes
Ching-Ni Hsieh – ETS Research Report Series, 2023
Researchers suggest that claims about the meaningfulness of test score interpretations and consequences of test use should be backed by evidence that stakeholders understand the definition of the construct assessed (meaningfulness) and score reports (consequences). Evaluation of stakeholders' actual uses and interpretations of score reports in…
Descriptors: Reading Tests, Listening Comprehension, Foreign Countries, English (Second Language)