Publication Date
| In 2026 | 0 |
| Since 2025 | 3 |
| Since 2022 (last 5 years) | 8 |
| Since 2017 (last 10 years) | 15 |
| Since 2007 (last 20 years) | 30 |
Descriptor
| Test Items | 76 |
| Test Validity | 76 |
| Testing | 76 |
| Test Construction | 40 |
| Test Reliability | 35 |
| Scoring | 24 |
| Language Tests | 18 |
| Item Analysis | 16 |
| English (Second Language) | 14 |
| Scores | 14 |
| Foreign Countries | 13 |
Education Level
| Secondary Education | 5 |
| Early Childhood Education | 3 |
| Elementary Education | 3 |
| Grade 3 | 3 |
| Grade 4 | 3 |
| Grade 5 | 3 |
| Grade 7 | 3 |
| Grade 9 | 3 |
| High Schools | 3 |
| Junior High Schools | 3 |
| Middle Schools | 3 |
Location
| California | 4 |
| Maryland | 2 |
| Belgium | 1 |
| Canada | 1 |
| Delaware | 1 |
| Illinois | 1 |
| Indiana | 1 |
| Indonesia | 1 |
| Iran | 1 |
| Israel | 1 |
| Japan | 1 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 1 |
Sherwin E. Balbuena – Online Submission, 2024
This study introduces a new chi-square test statistic for testing the equality of response frequencies among distracters in multiple-choice tests. The formula uses the information from the number of correct answers and wrong answers, which becomes the basis of calculating the expected values of response frequencies per distracter. The method was…
Descriptors: Multiple Choice Tests, Statistics, Test Validity, Testing
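The abstract above is truncated and does not give the new statistic itself, so the following is only a minimal sketch of the textbook Pearson chi-square goodness-of-fit statistic for the same question: whether the wrong answers to an item are spread evenly across its distracters. It is not the author's formula. The function name `distracter_chi_square` and the example counts are hypothetical, chosen for illustration.

```python
def distracter_chi_square(distracter_counts):
    """Pearson chi-square statistic for equal response frequencies among distracters.

    Assumption: this is the standard goodness-of-fit statistic, not the new
    statistic proposed in the study. `distracter_counts` holds the number of
    examinees who chose each wrong option of one item.
    """
    k = len(distracter_counts)
    if k < 2:
        raise ValueError("need at least two distracters")

    total_wrong = sum(distracter_counts)
    if total_wrong == 0:
        raise ValueError("no wrong answers to analyse")

    # Under the null hypothesis every distracter attracts the same share
    # of the wrong answers.
    expected = total_wrong / k

    statistic = sum((observed - expected) ** 2 / expected
                    for observed in distracter_counts)
    degrees_of_freedom = k - 1
    return statistic, degrees_of_freedom


# Hypothetical item with three distracters chosen 10, 20, and 30 times.
stat, df = distracter_chi_square([10, 20, 30])
print(stat, df)
```

The statistic is compared against a chi-square distribution with `k - 1` degrees of freedom; a large value suggests that some distracters attract far more or far fewer responses than others.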
Bruno D. Zumbo – International Journal of Assessment Tools in Education, 2023
In line with the journal volume's theme, this essay considers lessons from the past and visions for the future of test validity. In the first part of the essay, a description of historical trends in test validity since the early 1900s leads to the natural question of whether the discipline has progressed in its definition and description of test…
Descriptors: Test Theory, Test Validity, True Scores, Definitions
Jeff Allen; Jay Thomas; Stacy Dreyer; Scott Johanningmeier; Dana Murano; Ty Cruce; Xin Li; Edgar Sanchez – ACT Education Corp., 2025
This report describes the process of developing and validating the enhanced ACT. The report describes the changes made to the test content and the processes by which these design decisions were implemented. The authors describe how they shared the overall scope of the enhancements, including the initial blueprints, with external expert panels,…
Descriptors: College Entrance Examinations, Testing, Change, Test Construction
Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction
Uminski, Crystal; Hubbard, Joanna K.; Couch, Brian A. – CBE - Life Sciences Education, 2023
Biology instructors use concept assessments in their courses to gauge student understanding of important disciplinary ideas. Instructors can choose to administer concept assessments based on participation (i.e., lower stakes) or the correctness of responses (i.e., higher stakes), and students can complete the assessment in an in-class or…
Descriptors: Biology, Science Tests, High Stakes Tests, Scores
Ketabi, Somaye; Alavi, Seyyed Mohammed; Ravand, Hamdollah – International Journal of Language Testing, 2021
Although Diagnostic Classification Models (DCMs) were introduced to the education system decades ago, it seems that they have not been employed for the aims for which they were originally designed. DCMs have mostly been used to analyze large-scale non-diagnostic tests and have rarely been used in developing Cognitive…
Descriptors: Diagnostic Tests, Test Construction, Goodness of Fit, Classification
NWEA, 2022
This technical report documents the processes and procedures employed by NWEA® to build and support the English MAP® Reading Fluency™ assessments administered during the 2020-2021 school year. It is written for measurement professionals and administrators to help evaluate the quality of MAP Reading Fluency. The seven sections of this report: (1)…
Descriptors: Achievement Tests, Reading Tests, Reading Achievement, Reading Fluency
Patrisius Istiarto Djiwandono; Daniel Ginting – Language Education & Assessment, 2025
The teaching of English as a foreign language in Indonesia has a long history, and it is always important to ask whether the assessment of the students' language skills has been valid and reliable. A screening of many articles in several prominent databases reveals that a number of evaluation studies have been done by Indonesian scholars in the…
Descriptors: Foreign Countries, Language Tests, English (Second Language), Second Language Learning
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
International Journal of Testing, 2018
The second edition of the International Test Commission Guidelines for Translating and Adapting Tests was prepared between 2005 and 2015 to improve upon the first edition, and to respond to advances in testing technology and practices. The 18 guidelines are organized into six categories to facilitate their use: pre-condition (3), test development…
Descriptors: Translation, Test Construction, Testing, Scoring
Ceuppens, Stijn; Deprez, Johan; Dehaene, Wim; De Cock, Mieke – Physical Review Physics Education Research, 2018
This study reports on the development, validation, and administration of a 48-item multiple-choice test to assess students' representational fluency of linear functions in a physics context (1D kinematics) and a mathematics context. The test includes three external representations: graphs, tables, and formulas, which result in six possible…
Descriptors: Secondary School Students, Mathematics Tests, Test Construction, Foreign Countries
Reynolds, Matthew R.; Niileksela, Christopher R. – Journal of Psychoeducational Assessment, 2015
"The Woodcock-Johnson IV Tests of Cognitive Abilities" (WJ IV COG) is an individually administered measure of psychometric intellectual abilities designed for ages 2 to 90+. The measure was published by Houghton Mifflin Harcourt-Riverside in 2014. Frederick Shrank, Kevin McGrew, and Nancy Mather are the authors. Richard Woodcock, the…
Descriptors: Cognitive Tests, Testing, Scoring, Test Interpretation
Behizadeh, Nadia; Engelhard, George, Jr. – Measurement: Interdisciplinary Research and Perspectives, 2015
In his focus article, Koretz (this issue) argues that accountability has become the primary function of large-scale testing in the United States. He then points out that tests being used for accountability purposes are flawed and that the high-stakes nature of these tests creates a context that encourages score inflation. Koretz is concerned about…
Descriptors: Communities of Practice, High Stakes Tests, Testing, Test Validity
Ackerman, Debra J. – ETS Research Report Series, 2018
Kindergarten entry assessments (KEAs) have increasingly been incorporated into state education policies over the past 5 years, with much of this interest stemming from Race to the Top--Early Learning Challenge (RTT-ELC) awards, Enhanced Assessment Grants, and nationwide efforts to develop common K-12 state learning standards. Drawing on…
Descriptors: Screening Tests, Kindergarten, Test Validity, Test Reliability
McCrimmon, Adam; Rostad, Kristin – Journal of Psychoeducational Assessment, 2014
This article reviews the "Autism Diagnostic Observation Schedule, Second Edition" (ADOS-2; Lord, Luyster, Gotham, & Guthrie, 2012; Lord, Rutter et al., 2012), a newly updated, semistructured, standardized measure of communication, social interaction, play/imagination, and restricted and/or repetitive behaviors published by Western…
Descriptors: Diagnostic Tests, Autism, Pervasive Developmental Disorders, Testing

