Publication Date
In 2025 | 6 |
Since 2024 | 14 |
Since 2021 (last 5 years) | 43 |
Since 2016 (last 10 years) | 87 |
Since 2006 (last 20 years) | 145 |
Descriptor
Test Items | 324 |
Test Validity | 324 |
Test Construction | 158 |
Test Reliability | 123 |
Computer Assisted Testing | 98 |
Item Analysis | 75 |
Testing | 73 |
Testing Problems | 71 |
Foreign Countries | 62 |
Scores | 49 |
Difficulty Level | 48 |
More ▼ |
Source
Author
Publication Type
Education Level
Audience
Practitioners | 17 |
Researchers | 15 |
Teachers | 15 |
Administrators | 6 |
Students | 4 |
Support Staff | 3 |
Community | 1 |
Parents | 1 |
Location
California | 8 |
Canada | 6 |
China | 6 |
Florida | 4 |
Tennessee | 4 |
Turkey | 4 |
Germany | 3 |
Israel | 3 |
Malaysia | 3 |
Nebraska | 3 |
Netherlands | 3 |
More ▼ |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 5 |
Individuals with Disabilities… | 4 |
Every Student Succeeds Act… | 3 |
Rehabilitation Act 1973… | 3 |
Comprehensive Education… | 2 |
Americans with Disabilities… | 1 |
Rehabilitation Act 1973 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Jyun-Hong Chen; Hsiu-Yi Chao – Journal of Educational and Behavioral Statistics, 2024
To solve the attenuation paradox in computerized adaptive testing (CAT), this study proposes an item selection method, the integer programming approach based on real-time test data (IPRD), to improve test efficiency. The IPRD method turns information regarding the ability distribution of the population from real-time test data into feasible test…
Descriptors: Data Use, Computer Assisted Testing, Adaptive Testing, Design
Musa Adekunle Ayanwale; Mdutshekelwa Ndlovu – Journal of Pedagogical Research, 2024
The COVID-19 pandemic has had a significant impact on high-stakes testing, including the national benchmark tests in South Africa. Current linear testing formats have been criticized for their limitations, leading to a shift towards Computerized Adaptive Testing [CAT]. Assessments with CAT are more precise and take less time. Evaluation of CAT…
Descriptors: Adaptive Testing, Benchmarking, National Competency Tests, Computer Assisted Testing
Gorney, Kylie – ProQuest LLC, 2023
Aberrant behavior refers to any type of unusual behavior that would not be expected under normal circumstances. In educational and psychological testing, such behaviors have the potential to severely bias the aberrant examinee's test score while also jeopardizing the test scores of countless others. It is therefore crucial that aberrant examinees…
Descriptors: Behavior Problems, Educational Testing, Psychological Testing, Test Bias
Süleyman Demir; Derya Çobanoglu Aktan; Nese Güler – International Journal of Assessment Tools in Education, 2023
This study has two main purposes. Firstly, to compare the different item selection methods and stopping rules used in Computerized Adaptive Testing (CAT) applications with simulative data generated based on the item parameters of the Vocational Maturity Scale. Secondly, to test the validity of CAT application scores. For the first purpose,…
Descriptors: Computer Assisted Testing, Adaptive Testing, Vocational Maturity, Measures (Individuals)
Sherwin E. Balbuena – Online Submission, 2024
This study introduces a new chi-square test statistic for testing the equality of response frequencies among distracters in multiple-choice tests. The formula uses the information from the number of correct answers and wrong answers, which becomes the basis of calculating the expected values of response frequencies per distracter. The method was…
Descriptors: Multiple Choice Tests, Statistics, Test Validity, Testing
Yasuda, Jun-ichiro; Hull, Michael M.; Mae, Naohiro – Physical Review Physics Education Research, 2022
This paper presents improvements made to a computerized adaptive testing (CAT)-based version of the FCI (FCI-CAT) in regards to test security and test efficiency. First, we will discuss measures to enhance test security by controlling for item overexposure, decreasing the risk that respondents may (i) memorize the content of a pretest for use on…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Risk Management
Collin Shepley; Amanda Leigh Duncan; Anthony P. Setari – Journal of Early Intervention, 2025
The provision of progress monitoring within publicly funded early childhood classrooms is legally required, supported by empirical research, and recommended by early childhood professional organizations, for teachers providing Part B services under the Individuals with Disabilities Education Act. Despite the widespread recognition of progress…
Descriptors: Progress Monitoring, Measures (Individuals), Test Construction, Test Validity
Paige Haley – ProQuest LLC, 2023
As the research on feigning has grown, the number and quality of performance validity tests (PVTs) has increased as well. However, while several PVTs have been developed from assessments commonly used as part of neuropsychological batteries, there has been less exploration for PVTs scored from items in cognitive screeners. The Montreal Cognitive…
Descriptors: Cognitive Measurement, Performance, Test Validity, Psychological Testing
Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024
Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…
Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Bruno D. Zumbo – International Journal of Assessment Tools in Education, 2023
In line with the journal volume's theme, this essay considers lessons from the past and visions for the future of test validity. In the first part of the essay, a description of historical trends in test validity since the early 1900s leads to the natural question of whether the discipline has progressed in its definition and description of test…
Descriptors: Test Theory, Test Validity, True Scores, Definitions
Falcão, Filipe; Pereira, Daniela Marques; Gonçalves, Nuno; De Champlain, Andre; Costa, Patrício; Pêgo, José Miguel – Advances in Health Sciences Education, 2023
Automatic Item Generation (AIG) refers to the process of using cognitive models to generate test items using computer modules. It is a new but rapidly evolving research area where cognitive and psychometric theory are combined into digital framework. However, assessment of the item quality, usability and validity of AIG relative to traditional…
Descriptors: Computer Assisted Testing, Test Construction, Test Items, Automation
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet, the evaluation of item quality persists to be a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large-language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
Rivas, Axel; Scasso, Martín Guillermo – Journal of Education Policy, 2021
Since 2000, the PISA test implemented by OECD has become the prime benchmark for international comparisons in education. The 2015 PISA edition introduced methodological changes that altered the nature of its results. PISA made no longer valid non-reached items of the final part of the test, assuming that those unanswered questions were more a…
Descriptors: Test Validity, Computer Assisted Testing, Foreign Countries, Achievement Tests
Bomna Ko; Phillip Ward; Han Joo Lee; Yaohui He; Kelsey Higginson; Insook Kim – International Journal of Kinesiology in Higher Education, 2025
Developing valid and reliable instruments to assess common content knowledge (CCK) is a prerequisite for determining and improving the content knowledge of preservice teachers (PSTs) and teachers. We report on the development and psychometric analysis of an instrument for assessing PSTs' gymnastics CCK for secondary physical education teaching…
Descriptors: Athletics, Foreign Countries, Preservice Teachers, Test Validity