Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 5 |
| Since 2007 (last 20 years) | 11 |
Descriptor
| Scores | 70 |
| Testing Problems | 70 |
| Test Reliability | 60 |
| Test Validity | 25 |
| Test Interpretation | 20 |
| Standardized Tests | 18 |
| Elementary Secondary Education | 16 |
| Error of Measurement | 13 |
| Test Bias | 13 |
| Achievement Tests | 12 |
| Higher Education | 12 |
Author
| Airasian, Peter W. | 1 |
| Anderson, Paul S. | 1 |
| Attali, Yigal | 1 |
| Avery, Richard O. | 1 |
| Baig, Basim | 1 |
| Baker, Beverly A. | 1 |
| Barker, Pierce | 1 |
| Bergquist, Constance | 1 |
| Bormuth, John R. | 1 |
| Bridgeman, Brent | 1 |
| Brown, Jonathan R. | 1 |
Education Level
| Higher Education | 2 |
| Postsecondary Education | 2 |
| Early Childhood Education | 1 |
| Elementary Education | 1 |
| Elementary Secondary Education | 1 |
| Preschool Education | 1 |
| Secondary Education | 1 |
Audience
| Researchers | 5 |
| Practitioners | 2 |
| Parents | 1 |
Location
| Canada | 2 |
| China | 2 |
| Texas | 1 |
| United Kingdom | 1 |
| United States | 1 |
Laws, Policies, & Programs
| Elementary and Secondary… | 1 |
| Individuals with Disabilities… | 1 |
Jiayi Wang; Michael T. Kalkbrenner; Riley Schaner – Psychology in the Schools, 2025
Teaching is a stressful profession with a high turnover rate. Schools and related institutions need to take more action to support teachers and keep teacher stress at a manageable level. The continued research and practical effort require measures to examine teachers' stress in a briefer and accurate manner. The Teacher Stress Scale is a recently…
Descriptors: Elementary School Teachers, Secondary School Teachers, Preschool Teachers, Stress Variables
LaFlair, Geoffrey T.; Langenfeld, Thomas; Baig, Basim; Horie, André Kenji; Attali, Yigal; von Davier, Alina A. – Journal of Computer Assisted Learning, 2022
Background: Digital-first assessments leverage the affordances of technology in all elements of the assessment process--from design and development to score reporting and evaluation to create test taker-centric assessments. Objectives: The goal of this paper is to describe the engineering, machine learning, and psychometric processes and…
Descriptors: Computer Assisted Testing, Affordances, Scoring, Engineering
Fu, Jianbin; Qu, Yanxuan – ETS Research Report Series, 2018
Various subscore estimation methods that use auxiliary information to improve subscore accuracy and stability have been developed. This report provides a review of various subscore estimation methods described in the literature. The methodology of each method is described, then research studies on these subscore estimation methods are summarized.…
Descriptors: Scores, Evaluation Methods, Item Response Theory, Test Items
Isbell, Dan; Winke, Paula – Language Testing, 2019
The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…
Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning
Min, Shangchao; He, Lianzhen; Zhang, Jie – Language Teaching, 2020
This article reviews a selected sample of 70 empirical studies in journal articles and doctoral dissertations on language assessment in China between 2011 and 2018. Following a brief introduction to the history and current state of language assessment in China, the article presents a critical review of language assessment research on six themes…
Descriptors: Language Tests, Test Reliability, Test Validity, Journal Articles
Zhao, Hulin; Gu, Xiangdong – Language Testing, 2016
Test Purpose: The CATTI aims to measure competence in translation and interpreting (including simultaneous and consecutive interpreting) between Chinese and seven foreign languages: English, Japanese, French, Arabic, Russian, German, or Spanish. The test is intended to cover a wide range of domains including business, government, academia, and…
Descriptors: Accreditation (Institutions), Foreign Countries, Translation, Chinese
Feinberg, Richard A. – ProQuest LLC, 2012
Subscores, also known as domain scores, diagnostic scores, or trait scores, can help determine test-takers' relative strengths and weaknesses and appropriately focus remediation. However, subscores often have poor psychometric properties, particularly reliability and distinctiveness (Folske, Gessaroli, & Swanson, 1999; Monaghan, 2006;…
Descriptors: Simulation, Tests, Testing, Scores
Kettler, Ryan J. – Review of Research in Education, 2015
This chapter introduces theory that undergirds the role of testing adaptations in assessment, provides examples of item modifications and testing accommodations, reviews research relevant to each, and introduces a new paradigm that incorporates opportunity to learn (OTL), academic enablers, testing adaptations, and inferences that can be made from…
Descriptors: Meta Analysis, Literature Reviews, Testing, Testing Accommodations
Baker, Beverly A. – Assessing Writing, 2010
In high-stakes writing assessments, rater training in the use of a rating scale does not eliminate variability in grade attribution. This realisation has been accompanied by research that explores possible sources of rater variability, such as rater background or rating scale type. However, there has been little consideration thus far of…
Descriptors: Foreign Countries, Writing Evaluation, Writing Tests, Testing
Shale, Doug – 1986
This study is an attempt at a cohesive characterization of the concept of essay reliability. As such, it takes as a basic premise that previous and current practices in reporting reliability estimates for essay tests have certain shortcomings. The study provides an analysis of these shortcomings--partly to encourage a fuller understanding of the…
Descriptors: Analysis of Variance, Correlation, Error of Measurement, Essay Tests
Helms, LuAnn Sherbeck – 1999
This paper discusses the fact that reliability is about scores and not tests and how reliability limits effect sizes. The paper also explores the classical reliability coefficients of stability, equivalence, and internal consistency. Stability is concerned with how stable test scores will be over time, while equivalence addresses the relationship…
Descriptors: Effect Size, Meta Analysis, Reliability, Scores
Rudner, Lawrence M.; Schafer, William D. – 2001
This digest discusses sources of error in testing, several approaches to estimating reliability, and several ways to increase test reliability. Reliability has been defined in different ways by different authors, but the best way to look at reliability may be the extent to which measurements resulting from a test are characteristics of those being…
Descriptors: Educational Testing, Error of Measurement, Reliability, Scores
Peer reviewed: Feldt, Leonard S. – Applied Measurement in Education, 2002
Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…
Descriptors: Error of Measurement, Reliability, Scores, Test Construction
Tyson, LeaAnn; Silverman, Stephen – 1992
The purpose of this study was to investigate differences in Texas Teacher Appraisal System (TTAS) scores when considering the scores of the first four individual domains (Instructional Strategies, Management and Organization, Presentation of Subject Matter, and Learning Environment), the sum of the scores of Domains I through IV, and the overall…
Descriptors: Analysis of Variance, Career Ladders, Classroom Observation Techniques, Elementary School Teachers
Kaplan, Bruce A.; Johnson, Eugene G. – 1992
Across the field of educational assessment the case has been made for alternatives to the multiple-choice item type. Most of the alternative types of items require a subjective evaluation by a rater. The reliability of this subjective rating is a key component of these types of alternative items. In this paper, measures of reliability are…
Descriptors: Educational Assessment, Elementary Secondary Education, Estimation (Mathematics), Evaluators

