Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 7 |
| Since 2017 (last 10 years) | 21 |
| Since 2007 (last 20 years) | 36 |
Descriptor
| Test Reliability | 72 |
| Scoring | 55 |
| Test Validity | 41 |
| Test Construction | 36 |
| Language Tests | 17 |
| Test Items | 17 |
| Testing Programs | 17 |
| Testing | 15 |
| Item Response Theory | 12 |
| Scoring Rubrics | 12 |
| Test Interpretation | 12 |
Author
| White, Edward M. | 6 |
| Erford, Bradley T. | 2 |
| Koretz, Daniel | 2 |
| Stansfield, Charles W. | 2 |
| Ahmed, Ayesha | 1 |
| Allalouf, Avi | 1 |
| Angoff, William H. | 1 |
| Ault, Haley | 1 |
| Balkin, Richard S. | 1 |
| Bardar, Erin | 1 |
| Bardhoshi, Gerta | 1 |
Publication Type
| Reports - Descriptive | 72 |
| Journal Articles | 35 |
| Numerical/Quantitative Data | 14 |
| Reports - Research | 6 |
| Tests/Questionnaires | 6 |
| Guides - Non-Classroom | 2 |
| Book/Product Reviews | 1 |
| Books | 1 |
| Opinion Papers | 1 |
| Reports - Evaluative | 1 |
Education Level
| Secondary Education | 12 |
| Elementary Education | 11 |
| Junior High Schools | 10 |
| Middle Schools | 10 |
| Early Childhood Education | 8 |
| Grade 3 | 8 |
| Grade 4 | 8 |
| Grade 5 | 8 |
| Grade 6 | 8 |
| Grade 7 | 8 |
| Intermediate Grades | 8 |
Audience
| Practitioners | 5 |
| Policymakers | 2 |
| Teachers | 2 |
| Researchers | 1 |
Location
| California | 7 |
| New York | 6 |
| Nebraska | 4 |
| New Mexico | 2 |
| Vermont | 2 |
| Arkansas (Little Rock) | 1 |
| Colorado (Denver) | 1 |
| Europe | 1 |
| Florida | 1 |
| Greece | 1 |
| Netherlands | 1 |
Assessments and Surveys
| Test of English as a Foreign… | 2 |
| International English… | 1 |
| Measures of Academic Progress | 1 |
| New Jersey High School… | 1 |
| Program for International… | 1 |
Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction
Sophie Litschwartz – Society for Research on Educational Effectiveness, 2021
Background/Context: Pass/fail standardized exams frequently selectively rescore failing exams and retest failing examinees. This practice distorts the test score distribution and can confuse those who analyze these distributions. In 2011, the Wall Street Journal showed large discontinuities in the New York City Regents test score…
Descriptors: Standardized Tests, Pass Fail Grading, Scoring Rubrics, Scoring Formulas
National Institute for Excellence in Teaching, 2023
Aspiring teachers must develop an in-depth understanding of high-quality instructional practices. In order to prepare, instruct, and coach aspiring teachers, the National Institute for Excellence in Teaching (NIET) has developed the NIET Aspiring Teacher Rubric (ATR) based on principles of excellence in instruction. This research brief…
Descriptors: Scoring Rubrics, Preservice Teachers, Test Construction, Test Validity
Nebraska Department of Education, 2024
The Nebraska Student-Centered Assessment System (NSCAS) is a statewide assessment system that embodies Nebraska's holistic view of students and helps them prepare for success in postsecondary education, career, and civic life. It uses multiple measures throughout the year to provide educators and decision-makers at all levels with the insights…
Descriptors: Student Evaluation, Evaluation Methods, Elementary School Students, Middle School Students
Wagaman, John; Fletcher, Michael – Teaching Statistics: An International Journal for Teachers, 2018
This article considers how a handicapping system should be devised for squash. It looks at the American scoring system, and whether it is possible to have a fair system of handicapping. We consider "fair" from a perspective of expected number of rallies won and probability of winning.
Descriptors: Probability, Athletes, Athletics, Inhibition
Sickler, Jessica; Bardar, Erin; Kochevar, Randy – Journal of College Science Teaching, 2021
Data literacy, or students' abilities to understand, interpret, and think critically about data, is an increasing need in K-16 science education. Ocean Tracks College Edition (OTCE) sought to address this need by creating a set of learning modules that engage students in using large-scale, professionally collected animal migration and physical…
Descriptors: Information Literacy, Data Analysis, Undergraduate Students, Scoring Rubrics
Petscher, Y.; Pentimonti, J.; Stanley, C. – National Center on Improving Literacy, 2019
Reliability is the consistency of a set of scores that are designed to measure the same thing. Reliability is a statistical property of scores that must be demonstrated rather than assumed.
Descriptors: Scores, Measurement, Test Reliability, Error Patterns
Lenz, A. Stephen; Ault, Haley; Balkin, Richard S.; Barrio Minton, Casey; Erford, Bradley T.; Hays, Danica G.; Kim, Bryan S. K.; Li, Chi – Measurement and Evaluation in Counseling and Development, 2022
In April 2021, The Association for Assessment and Research in Counseling Executive Council commissioned a time-referenced task group to revise the Responsibilities of Users of Standardized Tests (RUST) Statement (3rd edition) published by the Association for Assessment in Counseling (AAC) in 2003. The task group developed a work plan to implement…
Descriptors: Responsibility, Standardized Tests, Counselor Training, Ethics
Heng Lu – PASAA: Journal of Language Teaching and Learning in Thailand, 2023
This test review examines the Duolingo English Test (DET), an alternative online English proficiency test with a machine-driven characteristic. The review covers essential information about the DET such as test purpose, usage, score-mapping with the CEFR scale, price, and publisher. Meanwhile, the test usefulness is discussed with a focus on reliability,…
Descriptors: Computer Software, Computer Assisted Instruction, Second Language Learning, Second Language Instruction
Kopacz, Dawn M.; Handlos, Zachary J. – International Journal for the Scholarship of Teaching and Learning, 2021
General education science courses strive to promote scientific literacy and the development of scientific process skills. However, research shows that many general education courses are still designed to stress content mastery. In this study, the number of topics in five semester-long introductory atmospheric science courses was reduced to…
Descriptors: Science Curriculum, Curriculum Design, Science Process Skills, Scientific Literacy
Kleijn, Suzanne; Pander Maat, Henk; Sanders, Ted – Language Testing, 2019
Although there are many methods available for assessing text comprehension, the cloze test is not widely acknowledged as one of them. Critiques on cloze testing center on its supposedly limited ability to measure comprehension beyond the sentence. However, these critiques do not hold for all types of cloze tests; the particular configuration of a…
Descriptors: Cloze Procedure, Language Tests, Semantics, Scoring
Cian, Heidi – Electronic Journal for Research in Science & Mathematics Education, 2020
Socioscientific issues, issues that center on the intersection between scientific and social problems in real-world contexts, are valuable tools to use in science instruction due to their association with gains in scientific literacy, argumentation skills, and content knowledge. However, due to their complex nature, crafting instruction using…
Descriptors: Science and Society, Social Problems, Science Instruction, Prior Learning
Nebraska Department of Education, 2022
In Winter 2021-2022, the Nebraska Student-Centered Assessment System (NSCAS) assessments are administered in English language arts (ELA) and mathematics in Grades 3-8. In Spring 2021-2022, the NSCAS assessments are administered in ELA and mathematics in Grades 3-8 and in science in Grades 5 and 8. The purposes of the NSCAS assessments are to…
Descriptors: English, Language Arts, Student Centered Learning, Mathematics Tests
Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
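Two of the internal-consistency estimates named in the abstract above, KR-20 and coefficient alpha, can be illustrated with a minimal sketch. This is not taken from the article: the function names `cronbach_alpha` and `kr20`, the row-per-examinee data layout, and the use of population variances are all assumptions made for illustration.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Coefficient alpha for a score matrix (hypothetical helper).

    scores: list of rows, one row per examinee, one column per item.
    Population variances are used throughout (an assumption of this sketch).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across examinees.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the total (summed) scores.
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """KR-20 for dichotomous (0/1) items (hypothetical helper)."""
    k = len(scores[0])
    n = len(scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Proportion of examinees answering each item correctly.
    p = [sum(row[i] for row in scores) / n for i in range(k)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Made-up 0/1 responses: 5 examinees, 4 items.
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
kr = kr20(responses)
```

For 0/1 items the population variance of an item equals p(1 - p), so under these assumptions the two functions return the same value on dichotomous data; alpha additionally accepts polytomous item scores.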
Talan, Teri N.; Bloom, Paula Jorde – Teachers College Press, 2018
The "Business Administration Scale for Family Child Care" (BAS) is the first valid and reliable tool for measuring and improving the overall quality of business and professional practices in family child care settings. It is applicable for multiple uses, including program self-improvement, technical assistance and monitoring, training,…
Descriptors: Business Administration, Child Care, Rating Scales, Qualifications

