Publication Date
In 2025 | 3 |
Since 2024 | 4 |
Since 2021 (last 5 years) | 7 |
Since 2016 (last 10 years) | 28 |
Since 2006 (last 20 years) | 97 |
Descriptor
Scores | 152 |
Testing | 152 |
Test Validity | 100 |
Validity | 47 |
Test Reliability | 44 |
Test Construction | 37 |
Test Interpretation | 29 |
Language Tests | 25 |
Scoring | 23 |
Standardized Tests | 22 |
Academic Achievement | 21 |
Author
Kane, Michael | 4 |
Elliott, Stephen N. | 3 |
Davies, Alan | 2 |
Goldschmidt, Pete | 2 |
Heritage, Margaret | 2 |
Herman, Joan L. | 2 |
Jeff Allen | 2 |
Kapes, Jerome T. | 2 |
Kopriva, Rebecca J. | 2 |
Kratochwill, Thomas R. | 2 |
McKevitt, Brian C. | 2 |
Education Level
Higher Education | 19 |
Postsecondary Education | 11 |
Elementary Education | 9 |
Secondary Education | 9 |
High Schools | 8 |
Elementary Secondary Education | 7 |
Grade 4 | 6 |
Grade 5 | 6 |
Grade 8 | 6 |
Middle Schools | 6 |
Grade 3 | 5 |
Location
United Kingdom | 3 |
United States | 3 |
China | 2 |
United Kingdom (England) | 2 |
Australia | 1 |
Belgium | 1 |
Canada | 1 |
China (Beijing) | 1 |
Cyprus | 1 |
Georgia (Atlanta) | 1 |
Hawaii | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 2 |
Elementary and Secondary… | 1 |
Every Student Succeeds Act… | 1 |
Rehabilitation Act 1973… | 1 |
James Soland – Journal of Research on Educational Effectiveness, 2024
When randomized controlled trials are not possible, quasi-experimental methods often represent the gold standard. One quasi-experimental method is difference-in-difference (DiD), which compares changes in outcomes before and after treatment across groups to estimate a causal effect. DiD researchers often use fairly exhaustive robustness checks to…
Descriptors: Item Response Theory, Testing, Test Validity, Intervention
Jeff Allen; Jay Thomas; Stacy Dreyer; Scott Johanningmeier; Dana Murano; Ty Cruce; Xin Li; Edgar Sanchez – ACT Education Corp., 2025
This report describes the process of developing and validating the enhanced ACT. The report describes the changes made to the test content and the processes by which these design decisions were implemented. The authors describe how they shared the overall scope of the enhancements, including the initial blueprints, with external expert panels,…
Descriptors: College Entrance Examinations, Testing, Change, Test Construction
Jeff Allen; Ty Cruce – ACT Education Corp., 2025
This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…
Descriptors: College Entrance Examinations, Testing, Change, Scores
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Uminski, Crystal; Hubbard, Joanna K.; Couch, Brian A. – CBE - Life Sciences Education, 2023
Biology instructors use concept assessments in their courses to gauge student understanding of important disciplinary ideas. Instructors can choose to administer concept assessments based on participation (i.e., lower stakes) or the correctness of responses (i.e., higher stakes), and students can complete the assessment in an in-class or…
Descriptors: Biology, Science Tests, High Stakes Tests, Scores
Mattern, Krista; Radunzel, Justine – ACT, Inc., 2019
When applicants take the ACT® more than once, how do colleges and universities reconcile and make sense of the multiple scores? In terms of validity, fairness, and impact on subgroup differences, are certain score-use policies better than others? The focus of this issue brief is to summarize evidence on the validity and fairness of various…
Descriptors: Scoring, College Entrance Examinations, Test Validity, Evaluation Methods
Stefancik, Christopher D. – ProQuest LLC, 2019
There is an ongoing debate among instructional personnel, parents, legislators, and the community at large about the nature and purpose of testing in the educational system. State and district-based testing programs have been criticized as "over-testing" policies. The result of the criticism culminates in a reduction of assessment…
Descriptors: Testing, Standardized Tests, Progress Monitoring, Teacher Made Tests
Fitzgerald, Jill; Shanahan, Timothy E. – International Literacy Association, 2020
Reading scores exist for a continuum of purposes, from informal assessment to formal standardized tests. This brief aims to answer the question: What matters most for elementary-grade teachers when thinking about reading scores, and what could policymakers do to help teachers? Three positions worth pursuing in this regard are shared: (1) every…
Descriptors: Reading Achievement, Scores, Elementary School Students, Elementary School Teachers
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Isbell, Daniel R.; Kremmel, Benjamin – Language Testing, 2020
Administration of high-stakes language proficiency tests has been disrupted in many parts of the world as a result of the 2019 novel coronavirus pandemic. Institutions that rely on test scores have been forced to adapt, and in many cases this means using scores from a different test, or a new online version of an existing test, that can be taken…
Descriptors: Language Tests, High Stakes Tests, Language Proficiency, Second Language Learning
Hille, Kathryn; Cho, Yeonsuk – Language Testing, 2020
Accurate placement within levels of an ESL program is crucial for optimal teaching and learning. Commercially available tests are commonly used for placement, but their effectiveness has been found to vary. This study uses data from the Ohio Program of Intensive English (OPIE) at Ohio University to examine the value of two commercially available…
Descriptors: Student Placement, Testing, English (Second Language), Language Tests
Haertel, Edward H. – Educational Psychologist, 2018
In the service of educational accountability, student achievement tests are being used to measure constructs quite unlike those envisioned by test developers. Scores are compared to cut points to create classifications like "proficient"; scores are combined over time to measure growth; student scores are aggregated to measure the…
Descriptors: Achievement Tests, Scores, Test Validity, Test Interpretation
Raley, Sheida K.; Shogren, Karrie A.; Rifenbark, Graham G.; Anderson, Mark H.; Shaw, Leslie A. – Journal of Special Education Technology, 2020
The Self-Determination Inventory: Student Report (SDI: SR) was developed to measure the self-determination of adolescents and was recently validated for students aged 13-22 with and without disabilities across diverse racial/ethnic backgrounds. The SDI: SR is aligned with Causal Agency Theory and its theoretical conceptualizations of self-determined…
Descriptors: Testing, Self Determination, Scores, Students with Disabilities
Norris, John; Drackert, Anastasia – Language Testing, 2018
The Test of German as a Foreign Language (TestDaF) plays a critical role as a standardized test of German language proficiency. Developed and administered by the Society for Academic Study Preparation and Test Development (g.a.s.t.), TestDaF was launched in 2001 and has experienced persistent annual growth, with more than 44,000 test takers in…
Descriptors: German, Second Language Learning, Language Tests, Language Proficiency
DiBiase-Lubrano, Mary Jo – Unterrichtspraxis/Teaching German, 2018
Language testing is an integral part of teaching and learning, yet most language faculty do not receive adequate training for developing tests (Taylor, 2009). Most have advanced degrees in literary and cultural studies in the target language but often have insufficient training in pedagogy and assessment. This shortcoming is alarming…
Descriptors: German, Second Language Learning, Second Language Instruction, Language Tests