Publication Date
In 2025 | 1 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 14 |
Since 2006 (last 20 years) | 27 |
Descriptor
Scores | 44 |
Test Reliability | 44 |
Test Validity | 25 |
Evaluation Methods | 11 |
Test Construction | 10 |
Student Evaluation | 9 |
Error of Measurement | 8 |
Psychometrics | 8 |
Elementary Secondary Education | 7 |
Test Items | 7 |
Testing | 7 |
Author
Erford, Bradley T. | 3 |
Hays, Danica G. | 2 |
Pentimonti, J. | 2 |
Petscher, Y. | 2 |
Stanley, C. | 2 |
Abedi, Jamal | 1 |
Ault, Haley | 1 |
Badger, Julia R. | 1 |
Baird, Jo-Anne | 1 |
Balkin, Richard S. | 1 |
Bardhoshi, Gerta | 1 |
Publication Type
Reports - Descriptive | 44 |
Journal Articles | 28 |
Numerical/Quantitative Data | 3 |
Guides - General | 1 |
Guides - Non-Classroom | 1 |
Information Analyses | 1 |
Reports - Evaluative | 1 |
Speeches/Meeting Papers | 1 |
Tests/Questionnaires | 1 |
Audience
Researchers | 5 |
Practitioners | 3 |
Administrators | 2 |
Teachers | 1 |
Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025
This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…
Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests
Petscher, Y.; Pentimonti, J.; Stanley, C. – National Center on Improving Literacy, 2019
Reliability is the consistency of a set of scores that are designed to measure the same thing. Reliability is a statistical property of scores that must be demonstrated rather than assumed.
Descriptors: Scores, Measurement, Test Reliability, Error Patterns
Lenz, A. Stephen; Ault, Haley; Balkin, Richard S.; Barrio Minton, Casey; Erford, Bradley T.; Hays, Danica G.; Kim, Bryan S. K.; Li, Chi – Measurement and Evaluation in Counseling and Development, 2022
In April 2021, The Association for Assessment and Research in Counseling Executive Council commissioned a time-referenced task group to revise the Responsibilities of Users of Standardized Tests (RUST) Statement (3rd edition) published by the Association for Assessment in Counseling (AAC) in 2003. The task group developed a work plan to implement…
Descriptors: Responsibility, Standardized Tests, Counselor Training, Ethics
Petscher, Y.; Pentimonti, J.; Stanley, C. – National Center on Improving Literacy, 2019
Validity is broadly defined as how well something measures what it's supposed to measure. The reliability and validity of scores from assessments are two concepts that are closely knit together and feed into each other.
Descriptors: Screening Tests, Scores, Test Validity, Test Reliability
Nicewander, W. Alan – Educational and Psychological Measurement, 2019
This inquiry is focused on three indicators of the precision of measurement--conditional on fixed values of θ, the latent variable of item response theory (IRT). The indicators that are compared are (1) the traditional, conditional standard errors, σ(eX|θ) = CSEM; (2) the IRT-based conditional standard errors, σ[subscript irt](eX|θ)=C[subscript…
Descriptors: Measurement, Accuracy, Scores, Error of Measurement
Center on Standards and Assessments Implementation, 2018
Reliability is a measure of consistency. It is the degree to which student results are the same when they take the same test on different occasions, when different scorers score the same item or task, and when different but equivalent tests are taken at the same time or at different times. Reliability is about making sure that different test forms…
Descriptors: Test Reliability, Test Validity, Student Evaluation, Test Bias
Fitzgerald, Jill; Shanahan, Timothy E. – International Literacy Association, 2020
Reading scores exist for a continuum of purposes, from informal assessment to formal standardized tests. This brief aims to answer the question: What matters most for elementary-grade teachers when thinking about reading scores, and what could policymakers do to help teachers? Three positions worth pursuing in this regard are shared: (1) every…
Descriptors: Reading Achievement, Scores, Elementary School Students, Elementary School Teachers
Badger, Julia R.; Mellanby, Jane – British Journal of Educational Psychology, 2018
Background: School attainment tests and Cognitive Abilities Tests are used in the United Kingdom to set targets for educational outcome. Whilst these are good predictors, they depend not only on basic ability but also on learnt knowledge and skills, such as reading. Method and Aims: VESPARCH is an online group test of verbal and spatial reasoning,…
Descriptors: Foreign Countries, Intelligence Tests, Verbal Ability, Spatial Ability
Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
Hays, Danica G.; Wood, Chris – Measurement and Evaluation in Counseling and Development, 2017
We present considerations for validity when a population outside of a normed sample is assessed and those data are interpreted. Using a career group counseling example exploring life satisfaction changes as evidenced by the Quality of Life Inventory (Frisch, 1994), we showcase qualitative and quantitative approaches to explore how normative data…
Descriptors: Data Interpretation, Scores, Quality of Life, Life Satisfaction
Kebble, Paul Graham – The EUROCALL Review, 2016
The C-Test as a tool for assessing language competence has been in existence for nearly 40 years, having been designed by Professors Klein-Braley and Raatz for implementation in German and English. Much research has been conducted over the ensuing years, particularly in regards to reliability and construct validity, for which it is reported to…
Descriptors: Language Tests, Computer Software, Test Construction, Test Reliability
Powers, Sonya; Li, Dongmei; Suh, Hongwook; Harris, Deborah J. – ACT, Inc., 2016
ACT reporting categories and ACT Readiness Ranges are new features added to the ACT score reports starting in fall 2016. For each reporting category, the number correct score, the maximum points possible, the percent correct, and the ACT Readiness Range, along with an indicator of whether the reporting category score falls within the Readiness…
Descriptors: Scores, Classification, College Entrance Examinations, Error of Measurement
New Meridian Corporation, 2020
The purpose of this report is to describe the technical qualities of the 2018-2019 operational administration of the English language arts/literacy (ELA/L) and mathematics summative assessments in grades 3 through 8 and high school. The ELA/L assessments focus on reading and comprehending a range of sufficiently complex texts independently and…
Descriptors: Language Arts, Literacy Education, Mathematics Education, Summative Evaluation
New Meridian Corporation, 2020
The purpose of this report is to describe the technical qualities of the 2018-2019 operational administration of the English language arts/literacy (ELA/L) and mathematics assessments in grades 3 through 8 and high school. New Meridian, in coordination with multiple states and vendors, developed an alternate form of the summative assessment to…
Descriptors: Language Arts, Literacy Education, Mathematics Education, Summative Evaluation
Baird, Jo-Anne; Black, Paul – Research Papers in Education, 2013
Much has already been written on the controversies surrounding the use of different test theories in educational assessment. Other authors have noted the prevalence of classical test theory over item response theory in practice. This Special Issue draws together articles based upon work conducted on the Reliability Programme for England's…
Descriptors: Test Theory, Foreign Countries, Test Reliability, Item Response Theory