Gökhan Iskifoglu – Turkish Online Journal of Educational Technology - TOJET, 2024
This research paper investigated the importance of conducting measurement invariance analysis in developing measurement tools for assessing differences between and among study variables. Most of the studies, which tended to develop an inventory to assess the existence of an attitude, behavior, belief, IQ, or an intuition in a person's…
Descriptors: Testing, Testing Problems, Error of Measurement, Attitude Measures
Yan Jin; Jason Fan – Language Assessment Quarterly, 2023
In language assessment, AI technology has been incorporated in task design, assessment delivery, automated scoring of performance-based tasks, score reporting, and provision of feedback. AI technology is also used for collecting and analyzing performance data in language assessment validation. Research has been conducted to investigate the…
Descriptors: Language Tests, Artificial Intelligence, Computer Assisted Testing, Test Format
Coggeshall, Whitney Smiley – Educational Measurement: Issues and Practice, 2021
The continuous testing framework, where both successful and unsuccessful examinees have to demonstrate continued proficiency at frequent prespecified intervals, is a framework that is used in noncognitive assessment and is gaining in popularity in cognitive assessment. Despite the rigorous advantages of this framework, this paper demonstrates that…
Descriptors: Classification, Accuracy, Testing, Failure
New York State Education Department, 2024
The New York State Education Department (NYSED) has a partnership with NWEA for the development of the 2024 Grades 3-8 English Language Arts Tests. Teachers from across the State work with NYSED in a variety of activities to ensure the validity and reliability of the New York State Testing Program (NYSTP). The 2024 Grades 6 and 7 English Language…
Descriptors: Language Tests, Test Format, Language Arts, English Instruction
Meagan Karvonen; Russell Swinburne Romine; Amy K. Clark – Practical Assessment, Research & Evaluation, 2024
This paper describes methods and findings from student cognitive labs, teacher cognitive labs, and test administration observations as evidence evaluated in a validity argument for a computer-based alternate assessment for students with significant cognitive disabilities. Validity of score interpretations and uses for alternate assessments based…
Descriptors: Students with Disabilities, Intellectual Disability, Severe Disabilities, Student Evaluation
Sherwin E. Balbuena – Online Submission, 2024
This study introduces a new chi-square test statistic for testing the equality of response frequencies among distracters in multiple-choice tests. The formula uses the information from the number of correct answers and wrong answers, which becomes the basis of calculating the expected values of response frequencies per distracter. The method was…
Descriptors: Multiple Choice Tests, Statistics, Test Validity, Testing
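Illustrative note on the entry above: the abstract describes deriving expected distracter frequencies from the counts of correct and wrong answers. The sketch below is not the author's new statistic, which the truncated abstract does not give. It is the ordinary Pearson goodness-of-fit statistic under the assumption that wrong answers are spread evenly across distracters. The function name `distracter_chi_square` and the example counts are hypothetical.

```python
def distracter_chi_square(n_examinees, n_correct, distracter_counts):
    """Pearson chi-square for equal response frequencies among distracters.

    Assumption (not from the cited paper): under the null hypothesis every
    distracter attracts the same share of the wrong answers.
    """
    n_wrong = n_examinees - n_correct
    if sum(distracter_counts) != n_wrong:
        raise ValueError("distracter counts must sum to the number of wrong answers")
    k = len(distracter_counts)
    expected = n_wrong / k  # expected frequency per distracter
    stat = sum((obs - expected) ** 2 / expected for obs in distracter_counts)
    return stat, k - 1  # statistic and degrees of freedom


# Hypothetical item: 100 examinees, 40 correct, three distracters.
stat, df = distracter_chi_square(100, 40, [30, 20, 10])
print(stat, df)
```

With two degrees of freedom, the conventional 5% critical value is 5.991, so a statistic above that value would suggest the distracters are not equally attractive.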
James Soland – Journal of Research on Educational Effectiveness, 2024
When randomized control trials are not possible, quasi-experimental methods often represent the gold standard. One quasi-experimental method is difference-in-difference (DiD), which compares changes in outcomes before and after treatment across groups to estimate a causal effect. DiD researchers often use fairly exhaustive robustness checks to…
Descriptors: Item Response Theory, Testing, Test Validity, Intervention
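Illustrative note on the entry above: the abstract defines difference-in-difference as a comparison of before-and-after changes across groups. A minimal sketch of that arithmetic follows, using group means only. The function name `diff_in_diff` and all scores are hypothetical, and the sketch ignores the measurement and robustness issues the paper itself addresses.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Basic two-group, two-period DiD estimate from raw scores.

    Returns (change in treated mean) minus (change in control mean).
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical test scores for illustration only.
effect = diff_in_diff(
    treated_pre=[50, 52, 54],
    treated_post=[60, 62, 64],
    control_pre=[48, 50, 52],
    control_post=[51, 53, 55],
)
print(effect)
```

Here the treated group gains 10 points and the control group gains 3, so the estimate attributes the remaining 7 points to the treatment, provided the two groups would otherwise have followed parallel trends.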
Kun Su – ProQuest LLC, 2022
This dissertation provides a start-to-finish description of development, administration, and validation for an online middle-school physics test using a DCM framework with response-time. The first paper illustrated the process of implementing DCM with a careful selection of the content domain and a simulation approach for a Q-matrix construction.…
Descriptors: Science Instruction, Physics, Middle Schools, Testing
Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
Patrick Kyllonen; Amit Sevak; Teresa Ober; Ikkyu Choi; Jesse Sparks; Daniel Fishtein – ETS Research Report Series, 2024
Assessment refers to a broad array of approaches for measuring or evaluating a person's (or group of persons') skills, behaviors, dispositions, or other attributes. Assessments range from standardized tests used in admissions, employee selection, licensure examinations, and domestic and international large-scale assessments of cognitive and…
Descriptors: Assessment Literacy, Testing, Test Bias, Test Construction
Patael, Smadar; Shamir, Julia; Soffer, Tal; Livne, Eynat; Fogel-Grinvald, Haya; Kishon-Rabin, Liat – Journal of Computer Assisted Learning, 2022
Background: The global COVID-19 pandemic turned the adoption of online assessment in institutions of higher education from a possibility into a necessity. Thus, at the end of the Fall 20/21 semester, Tel Aviv University (TAU)--the largest university in Israel--designed and implemented a scalable procedure for administering proctored remote…
Descriptors: COVID-19, Pandemics, Computer Assisted Testing, Foreign Countries
Fairbairn, Judith; Spiby, Richard – European Journal of Special Needs Education, 2019
Language test developers have a responsibility to ensure that their tests are accessible to test takers of various backgrounds and characteristics and also that they have the opportunity to perform to the best of their ability. This principle is widely recognised by educational and language testing associations in guidelines for the production and…
Descriptors: Testing, Language Tests, Test Construction, Testing Accommodations
McLeod, Justin W.H.; McCrimmon, Adam W. – Journal of Psychoeducational Assessment, 2021
The "Raven's 2 Progressive Matrices Clinical Edition" (Raven's 2; Raven, Rust, Chan, & Zhou, 2018), published by NCS Pearson, is an individually administered nonverbal assessment of general cognitive ability developed to measure "educative abilities," defined as the ability to think clearly and solve complex problems in…
Descriptors: Test Reviews, Intelligence Tests, Testing, Test Reliability
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Bruno D. Zumbo – International Journal of Assessment Tools in Education, 2023
In line with the journal volume's theme, this essay considers lessons from the past and visions for the future of test validity. In the first part of the essay, a description of historical trends in test validity since the early 1900s leads to the natural question of whether the discipline has progressed in its definition and description of test…
Descriptors: Test Theory, Test Validity, True Scores, Definitions