Publication Date
| In 2026 | 0 |
| Since 2025 | 2 |
| Since 2022 (last 5 years) | 3 |
| Since 2017 (last 10 years) | 7 |
| Since 2007 (last 20 years) | 25 |
Descriptor
| Evaluation Methods | 187 |
| Test Use | 187 |
| Elementary Secondary Education | 76 |
| Student Evaluation | 71 |
| Educational Assessment | 68 |
| Testing | 56 |
| Testing Problems | 56 |
| Educational Testing | 46 |
| Test Construction | 42 |
| Test Validity | 35 |
| Testing Programs | 32 |
Author
| Thurlow, Martha | 5 |
| Ysseldyke, Jim | 4 |
| Linn, Robert L. | 3 |
| Shepard, Lorrie A. | 3 |
| Bank, Adrianne | 2 |
| Bobbett, Gordon | 2 |
| Eignor, Daniel R. | 2 |
| Elmore, Patricia B. | 2 |
| Erickson, Ron | 2 |
| French, Russell L. | 2 |
| Gong, Brian | 2 |
Education Level
| Elementary Secondary Education | 20 |
| Elementary Education | 4 |
| Higher Education | 3 |
| Secondary Education | 3 |
| High Schools | 2 |
| Postsecondary Education | 2 |
| Adult Basic Education | 1 |
| Adult Education | 1 |
| Grade 4 | 1 |
| Grade 6 | 1 |
| Grade 9 | 1 |
Audience
| Practitioners | 17 |
| Teachers | 6 |
| Researchers | 4 |
| Administrators | 3 |
| Community | 1 |
| Counselors | 1 |
| Parents | 1 |
| Policymakers | 1 |
Location
| Canada | 5 |
| United States | 5 |
| Kentucky | 4 |
| United Kingdom (England) | 4 |
| Maryland | 3 |
| New York | 3 |
| Virginia | 3 |
| Australia | 2 |
| Texas | 2 |
| United Kingdom | 2 |
| United Kingdom (Wales) | 2 |
Laws, Policies, & Programs
| Education Consolidation… | 2 |
| Carl D Perkins Vocational… | 1 |
| Elementary and Secondary… | 1 |
| Every Student Succeeds Act… | 1 |
| Hawkins Stafford Act 1988 | 1 |
| Individuals with Disabilities… | 1 |
| No Child Left Behind Act 2001 | 1 |
Kylie Gorney; Mark D. Reckase – Journal of Educational Measurement, 2025
In computerized adaptive testing, item exposure control methods are often used to provide a more balanced usage of the item pool. Many of the most popular methods, including the restricted method (Revuelta and Ponsoda), use a single maximum exposure rate to limit the proportion of times that each item is administered. However, Barrada et al.…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Item Banks
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Gu, Lixiong; Ling, Guangming; Qu, Yanxuan – ETS Research Report Series, 2019
Research has found that the "a"-stratified item selection strategy (STR) for computerized adaptive tests (CATs) may lead to insufficient use of high a items at later stages of the tests and thus to reduced measurement precision. A refined approach, unequal item selection across strata (USTR), effectively improves test precision over the…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Use, Test Items
Ng, Zi Jia; Willner, Cynthia J.; Mannweiler, Morgan D.; Hoffmann, Jessica D.; Bailey, Craig S.; Cipriano, Christina – Educational Psychology Review, 2022
Many emotion regulation assessments have been developed for research purposes, but few are frequently used in schools despite the rapid growth of social and emotional learning programs with an explicit focus on emotion regulation in schools. This systematic review provides an overview of emotion regulation assessments that have been utilized with…
Descriptors: Emotional Response, Self Control, Elementary School Students, Secondary School Students
Sanders, Sara – National Technical Assistance Center for the Education of Neglected or Delinquent Children and Youth (NDTAC), 2019
This guide is designed to assist States, agencies, and/or facilities who work with youth who are neglected, delinquent, or at-risk (N or D). The information in the guide will benefit those who are (a) interested in implementing pre-posttests, (b) in the process of identifying an appropriate pre-posttest, or (c) ready to evaluate current testing…
Descriptors: At Risk Students, Delinquency, Pretests Posttests, Testing
American Educational Research Association (AERA), 2014
Developed jointly by the American Educational Research Association, American Psychological Association, and the National Council on Measurement in Education, "Standards for Educational and Psychological Testing" (Revised 2014) addresses professional and technical issues of test development and use in education, psychology, and…
Descriptors: Standards, Educational Testing, Psychological Testing, Test Construction
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Shepherd, Keegan J. – ProQuest LLC, 2017
Standardized testing is a defining feature of contemporary American society. It not only governs how people are channeled through their schooling; it amplifies existing social disparities. Nonetheless, standardized testing endures, namely because it has served as a vital tool for the post-1945 American state. The postwar state prioritized, on the…
Descriptors: Standardized Tests, Testing, Educational History, Government Role
Howie, Sarah – Assessment in Education: Principles, Policy & Practice, 2012
The Jomtien conference in 1990 on Education for All is seen by many as a turning point for the introduction of increased monitoring and evaluation of the quality of education systems around the world. Internationally, debates have arisen about the nature and frequency of assessment and its impact on education systems with its intended and…
Descriptors: Test Use, Testing, High Stakes Tests, Measures (Individuals)
Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010
Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…
Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis
Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2010
This article presents the author's rejoinder to thinking about linking from issue 8(1). Particularly within the more embracing linking frameworks, e.g., Holland & Dorans (2006) and Holland (2007), there appears to be a major disjunction between (1) classification discourse: the supposed basis for classification, that is, the underlying theory…
Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis
Walker, Michael E. – Measurement: Interdisciplinary Research and Perspectives, 2010
"Linking" is a term given to a general class of procedures by which one represents scores X on one test or measure in terms of scores Y on another test or measure. A recent taxonomy by Holland and Dorans (2006; Holland, 2007) organizes the various types of links into three broad categories: prediction, scale aligning, and equating. In…
Descriptors: Foreign Countries, Test Construction, Test Validity, Measurement Techniques
Cabrera, Nolan L.; Cabrera, George A. – Educational Horizons, 2011
Just like all the high-stakes tests that determine students' futures nowadays, The Chorizo Test is a standardized test rooted in the culture of the test makers. It was originally created to be used with students in teacher training programs to sensitize them to the pitfalls inherent in standardized pencil-and-paper tests, such as linguistic bias…
Descriptors: Test Use, Standardized Tests, Social Sciences, High Stakes Tests
Baird, Jo-Anne – Measurement: Interdisciplinary Research and Perspectives, 2010
Newton's article (2010) makes three main contributions to the literature. First, it is transatlantic, bringing together literatures that have been dealing with similar problems, using sometimes different methods and certainly with distinctive educational, cultural perspectives. He points out that neither of these literatures has all of the…
Descriptors: Foreign Countries, Predictive Validity, Standards, Ethics
von Davier, Alina A. – Measurement: Interdisciplinary Research and Perspectives, 2010
The article "Thinking About Linking" by Newton (2010) presents a novel philosophical perspective on the way that educational assessments should be linked. Newton starts by describing the linking framework as it was characterized in various publications and identifies a cross-cultural dimension in the definitions and uses of test…
Descriptors: Foreign Countries, Educational Assessment, Student Evaluation, Evaluation Criteria

