Publication Date
In 2025 | 7 |
Since 2024 | 24 |
Since 2021 (last 5 years) | 73 |
Since 2016 (last 10 years) | 163 |
Since 2006 (last 20 years) | 412 |
Descriptor
Test Validity | 1009 |
Test Reliability | 457 |
Test Construction | 369 |
Evaluation Methods | 188 |
Elementary Secondary Education | 163 |
Student Evaluation | 156 |
Foreign Countries | 131 |
Higher Education | 127 |
Standardized Tests | 125 |
Testing | 125 |
Language Tests | 117 |
Author
Stansfield, Charles W. | 11 |
Kenyon, Dorry Mann | 4 |
Popham, W. James | 4 |
Sireci, Stephen G. | 4 |
Abedi, Jamal | 3 |
Brown, James Dean | 3 |
Clarke, Ben | 3 |
Halle, Tamara | 3 |
Ketterlin-Geller, Leanne R. | 3 |
Koretz, Daniel | 3 |
Liu, Kimy | 3 |
Audience
Researchers | 66 |
Practitioners | 52 |
Teachers | 21 |
Administrators | 14 |
Policymakers | 10 |
Counselors | 2 |
Community | 1 |
Parents | 1 |
Location
Canada | 15 |
United Kingdom | 15 |
Australia | 14 |
United States | 13 |
New York | 10 |
Nebraska | 8 |
Netherlands | 7 |
Texas | 7 |
Georgia | 6 |
India | 6 |
California | 5 |
What Works Clearinghouse Rating
Meets WWC Standards without Reservations | 1 |
Meets WWC Standards with or without Reservations | 1 |
Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025
This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…
Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests
Anne Traynor; Sara C. Christopherson – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Andrew P. Jaciw – American Journal of Evaluation, 2025
By design, randomized experiments (XPs) rule out bias from confounded selection of participants into conditions. Quasi-experiments (QEs) are often considered second-best because they do not share this benefit. However, when results from XPs are used to generalize causal impacts, the benefit from unconfounded selection into conditions may be offset…
Descriptors: Elementary School Students, Elementary School Teachers, Generalization, Test Bias
Anne Wicks; Robin Berkley – George W. Bush Institute, 2025
Assessments are one of the most important--and often misunderstood--elements of education. In most cases, tests are administered by the state as well as by districts and schools. Assessments at each of these levels have distinct purposes, yield different information, and are part of a powerful, coordinated approach to improving student outcomes.…
Descriptors: Student Evaluation, Testing, Tests, Standardized Tests
Philipp Sterner; Kim De Roover; David Goretzko – Structural Equation Modeling: A Multidisciplinary Journal, 2025
When comparing relations and means of latent variables, it is important to establish measurement invariance (MI). Most methods to assess MI are based on confirmatory factor analysis (CFA). Recently, new methods have been developed based on exploratory factor analysis (EFA); most notably, as extensions of multi-group EFA, researchers introduced…
Descriptors: Error of Measurement, Measurement Techniques, Factor Analysis, Structural Equation Models
Denise Swanson; Gerald Tindal – Behavioral Research and Teaching, 2024
This technical report provides an authoritative bibliographic resource of all the studies conducted on "easyCBM"® and published on the main website for Behavioral Research and Teaching under Publications (https://brtprojects.org). The "easyCBM"© software is a direct descendant of "Curriculum-based Measurement" (CBM)…
Descriptors: Bibliographies, Computer Software, Test Construction, Test Reliability
Sonique Sailsman; Emma El-Shami – Quarterly Review of Distance Education, 2024
Nurse educators at the undergraduate level spend significant time developing and revising exam questions. Following the exam administration, course faculty have the opportunity to complete an item analysis and question revision to improve reliability and validity. A challenge faculty face is tracking these exam changes when teaching as part of a…
Descriptors: Nursing Education, Nursing Students, College Faculty, Test Construction
Scott J. Peters; Matthew C. Makel; Lindsay Ellis Lee; Tamra Stambaugh; Matthew T. McBee; D. Betsy McCoach; Kiana R. Johnson – Gifted Child Today, 2024
Universal screening is one of the most common topics and well-accepted best practices within the field of gifted and talented education. There appears to be little disagreement that universally screening all students as part of a gifted and talented identification process results in fewer missed students. But surprisingly, there is little guidance…
Descriptors: Academically Gifted, Talent Identification, Screening Tests, Test Validity
Benjawan Plengkham; Sonthaya Rattanasak; Patsawut Sukserm – Journal of Education and Learning, 2025
This academic article provides the essential steps for designing an effective English questionnaire in social science research, with a focus on ensuring clarity, cultural sensitivity and ethical integrity. Developed from key insights from related studies, it outlines potential practice in questionnaire design, item development and the importance…
Descriptors: Guidelines, Test Construction, Questionnaires, Surveys
Maddox, Bryan – OECD Publishing, 2023
The digital transition in educational testing has introduced many new opportunities for technology to enhance large-scale assessments. These include the potential to collect and use log data on test-taker response processes routinely, and on a large scale. Process data has long been recognised as a valuable source of validation evidence in…
Descriptors: Measurement, Inferences, Test Reliability, Computer Assisted Testing
Yan Jin; Jason Fan – Language Assessment Quarterly, 2023
In language assessment, AI technology has been incorporated in task design, assessment delivery, automated scoring of performance-based tasks, score reporting, and provision of feedback. AI technology is also used for collecting and analyzing performance data in language assessment validation. Research has been conducted to investigate the…
Descriptors: Language Tests, Artificial Intelligence, Computer Assisted Testing, Test Format
National Institute for Excellence in Teaching, 2023
Aspiring teachers must develop an in-depth understanding of high-quality instructional practices. In order to prepare, instruct, and coach aspiring teachers, the National Institute for Excellence in Teaching (NIET) has developed the NIET Aspiring Teacher Rubric (ATR) based on principles of excellence in instruction. This research brief…
Descriptors: Scoring Rubrics, Preservice Teachers, Test Construction, Test Validity
Read, John – Language Testing, 2023
Published work on vocabulary assessment has grown substantially in the last 10 years, but it is still somewhat outside the mainstream of the field. There has been a recent call for those developing vocabulary tests to apply professional standards to their work, especially in validating their instruments for specified purposes before releasing them…
Descriptors: Language Tests, Vocabulary Development, Second Language Learning, Test Format
NWEA, 2022
This technical report documents the processes and procedures employed by NWEA® to build and support the English MAP® Reading Fluency™ assessments administered during the 2020-2021 school year. It is written for measurement professionals and administrators to help evaluate the quality of MAP Reading Fluency. The seven sections of this report: (1)…
Descriptors: Achievement Tests, Reading Tests, Reading Achievement, Reading Fluency
Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022
We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…
Descriptors: Science Tests, Test Validity, Test Items, Test Construction