Gorney, Kylie – ProQuest LLC, 2023
Aberrant behavior refers to any type of unusual behavior that would not be expected under normal circumstances. In educational and psychological testing, such behaviors have the potential to severely bias the aberrant examinee's test score while also jeopardizing the test scores of countless others. It is therefore crucial that aberrant examinees…
Descriptors: Behavior Problems, Educational Testing, Psychological Testing, Test Bias
Süleyman Demir; Derya Çobanoglu Aktan; Nese Güler – International Journal of Assessment Tools in Education, 2023
This study has two main purposes. Firstly, to compare the different item selection methods and stopping rules used in Computerized Adaptive Testing (CAT) applications with simulative data generated based on the item parameters of the Vocational Maturity Scale. Secondly, to test the validity of CAT application scores. For the first purpose,…
Descriptors: Computer Assisted Testing, Adaptive Testing, Vocational Maturity, Measures (Individuals)
Kayla V. Campaña; Benjamin G. Solomon – Assessment for Effective Intervention, 2025
The purpose of this study was to compare the classification accuracy of data produced by the previous year's end-of-year New York state assessment, a computer-adaptive diagnostic assessment ("i-Ready"), and the gating combination of both assessments to predict the rate of students passing the following year's end-of-year state assessment…
Descriptors: Accuracy, Classification, Diagnostic Tests, Adaptive Testing
James Soland – Journal of Research on Educational Effectiveness, 2024
When randomized control trials are not possible, quasi-experimental methods often represent the gold standard. One quasi-experimental method is difference-in-difference (DiD), which compares changes in outcomes before and after treatment across groups to estimate a causal effect. DiD researchers often use fairly exhaustive robustness checks to…
Descriptors: Item Response Theory, Testing, Test Validity, Intervention
Paige Haley – ProQuest LLC, 2023
As the research on feigning has grown, the number and quality of performance validity tests (PVTs) has increased as well. However, while several PVTs have been developed from assessments commonly used as part of neuropsychological batteries, there has been less exploration for PVTs scored from items in cognitive screeners. The Montreal Cognitive…
Descriptors: Cognitive Measurement, Performance, Test Validity, Psychological Testing
Karoline A. Sachse; Sebastian Weirich; Nicole Mahler; Camilla Rjosk – International Journal of Testing, 2024
In order to ensure content validity by covering a broad range of content domains, the testing times of some educational large-scale assessments last up to a total of two hours or more. Performance decline over the course of taking the test has been extensively documented in the literature. It can occur due to increases in the numbers of: (a)…
Descriptors: Test Wiseness, Test Score Decline, Testing Problems, Foreign Countries
Kalemdaroglu-Wheeler, Elif – ProQuest LLC, 2023
The purpose of this qualitative exploratory case study was to explore teachers' and administrators' perceptions of test score pollution deriving from COVID-19-related issues that may affect students' test scores on state-mandated standardized tests for grades six through 12 in a state along the Atlantic Coast of the United States. Four research…
Descriptors: Testing Problems, Scores, COVID-19, Pandemics
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Wise, Steven L. – Education Inquiry, 2019
A decision of whether to move from paper-and-pencil to computer-based tests is based largely on a careful weighing of the potential benefits of a change against its costs, disadvantages, and challenges. This paper briefly discusses the trade-offs involved in making such a transition, and then focuses on a relatively unexplored benefit of…
Descriptors: Computer Assisted Testing, Cheating, Test Wiseness, Scores
Julie Sriken; Bradley T. Erford; Martin F. Sherman; Kristen Watson; Heather L. Smith – Measurement and Evaluation in Counseling and Development, 2024
Psychometric characteristics of CESD-R scores were explored on a sample of 966 undergraduate students. Internal consistency (α = 0.92), external convergent and discriminant validity, and response bias were adequate to excellent. Strong measurement invariance was evident for gender and race comparisons, and the unidimensional model fit the…
Descriptors: Symptoms (Individual Disorders), Depression (Psychology), Measures (Individuals), Undergraduate Students
Isbell, Daniel R.; Kremmel, Benjamin – Language Testing, 2020
Administration of high-stakes language proficiency tests has been disrupted in many parts of the world as a result of the 2019 novel coronavirus pandemic. Institutions that rely on test scores have been forced to adapt, and in many cases this means using scores from a different test, or a new online version of an existing test, that can be taken…
Descriptors: Language Tests, High Stakes Tests, Language Proficiency, Second Language Learning
Raley, Sheida K.; Shogren, Karrie A.; Rifenbark, Graham G.; Anderson, Mark H.; Shaw, Leslie A. – Journal of Special Education Technology, 2020
The Self-Determination Inventory: Student Report (SDI: SR) was developed to measure the self-determination of adolescents and was recently validated for students aged 13-22 with and without disabilities across diverse racial/ethnic backgrounds. The SDI: SR is aligned with Causal Agency Theory and its theoretical conceptualizations of self-determined…
Descriptors: Testing, Self Determination, Scores, Students with Disabilities
Jiayi Wang; Michael T. Kalkbrenner; Riley Schaner – Psychology in the Schools, 2025
Teaching is a stressful profession with a high turnover rate. Schools and related institutions need to take more action to support teachers and keep teacher stress at a manageable level. The continued research and practical effort require measures to examine teachers' stress in a briefer and accurate manner. The Teacher Stress Scale is a recently…
Descriptors: Elementary School Teachers, Secondary School Teachers, Preschool Teachers, Stress Variables
Chen, Yunxiao; Lee, Yi-Hsuan; Li, Xiaoou – Journal of Educational and Behavioral Statistics, 2022
In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric…
Descriptors: Standardized Tests, Test Items, Test Validity, Scores
Choi, Yun Deok – Language Testing in Asia, 2022
A much-debated question in the L2 assessment field is if computer familiarity should be considered a potential source of construct-irrelevant variance in computer-based writing (CBW) tests. This study aims to make a partial validity argument for an online source-based writing test (OSWT) designed for English placement testing (EPT), focusing on…
Descriptors: Test Validity, Scores, Computer Assisted Testing, English (Second Language)