Publication Date
In 2025 | 1 |
Since 2024 | 9 |
Since 2021 (last 5 years) | 26 |
Since 2016 (last 10 years) | 62 |
Since 2006 (last 20 years) | 176 |
Descriptor
Source
Author
McCrimmon, Adam W. | 6 |
Chapelle, Carol A. | 3 |
Kane, Michael | 3 |
Milton, Ohmer | 3 |
Ackerman, Debra J. | 2 |
Chalhoub-Deville, Micheline | 2 |
Chambers, Francine | 2 |
Cheng, Liying | 2 |
Cziko, Gary A. | 2 |
Davies, Alan | 2 |
Dickens, Rachel H. | 2 |
More ▼ |
Publication Type
Education Level
Audience
Practitioners | 14 |
Researchers | 9 |
Teachers | 8 |
Administrators | 2 |
Location
Canada | 12 |
China | 9 |
United Kingdom | 8 |
United Kingdom (England) | 6 |
Japan | 5 |
Australia | 4 |
Malaysia | 4 |
United States | 4 |
Brazil | 3 |
Iran | 3 |
Arizona | 2 |
More ▼ |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 5 |
Education for All Handicapped… | 1 |
Elementary and Secondary… | 1 |
Lau v Nichols | 1 |
Rehabilitation Act 1973… | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Gökhan Iskifoglu – Turkish Online Journal of Educational Technology - TOJET, 2024
This research paper investigated the importance of conducting measurement invariance analysis in developing measurement tools for assessing differences between and among study variables. Most of the studies, which tended to develop an inventory to assess the existence of an attitude, behavior, belief, IQ, or an intuition in a person's…
Descriptors: Testing, Testing Problems, Error of Measurement, Attitude Measures
James Soland – Journal of Research on Educational Effectiveness, 2024
When randomized control trials are not possible, quasi-experimental methods often represent the gold standard. One quasi-experimental method is difference-in-difference (DiD), which compares changes in outcomes before and after treatment across groups to estimate a causal effect. DiD researchers often use fairly exhaustive robustness checks to…
Descriptors: Item Response Theory, Testing, Test Validity, Intervention
Coggeshall, Whitney Smiley – Educational Measurement: Issues and Practice, 2021
The continuous testing framework, where both successful and unsuccessful examinees have to demonstrate continued proficiency at frequent prespecified intervals, is a framework that is used in noncognitive assessment and is gaining in popularity in cognitive assessment. Despite the rigorous advantages of this framework, this paper demonstrates that…
Descriptors: Classification, Accuracy, Testing, Failure
Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
McLeod, Justin W.H.; McCrimmon, Adam W. – Journal of Psychoeducational Assessment, 2021
The "Raven's 2 Progressive Matrices Clinical Edition" (Raven's 2; Raven, Rust, Chan, & Zhou, 2018), published by NCS Pearson, is an individually administered nonverbal assessment of general cognitive ability developed to measure "educative abilities," defined as the ability to think clearly and solve complex problems in…
Descriptors: Test Reviews, Intelligence Tests, Testing, Test Reliability
Bruno D. Zumbo – International Journal of Assessment Tools in Education, 2023
In line with the journal volume's theme, this essay considers lessons from the past and visions for the future of test validity. In the first part of the essay, a description of historical trends in test validity since the early 1900s leads to the natural question of whether the discipline has progressed in its definition and description of test…
Descriptors: Test Theory, Test Validity, True Scores, Definitions
Susan K. Johnsen – Gifted Child Today, 2024
The author provides a checklist for educators who are selecting technically adequate tests for identifying and referring students for gifted education services and programs. The checklist includes questions related to how the test was normed, reliability and validity studies as well as questions related to types of scores, administration, and…
Descriptors: Test Selection, Academically Gifted, Gifted Education, Test Validity
Ian Phil Canlas; Joyce Molino-Magtolis – Journal of Biological Education, 2024
The use of drawing as an assessment tool to reveal students' conceptions in biology specifically on human organs and organ systems is not new, however, there is a deficit in the literature that attempted to explore and reflect on its usefulness and relevance specifically, in eliciting students' preconceptions related thereto. Making use of a…
Descriptors: Foreign Countries, Preservice Teacher Education, Preservice Teachers, Biology
Yan Jin; Jason Fan – Language Assessment Quarterly, 2023
In language assessment, AI technology has been incorporated in task design, assessment delivery, automated scoring of performance-based tasks, score reporting, and provision of feedback. AI technology is also used for collecting and analyzing performance data in language assessment validation. Research has been conducted to investigate the…
Descriptors: Language Tests, Artificial Intelligence, Computer Assisted Testing, Test Format
Han, Chao – Language Testing, 2022
Over the past decade, testing and assessing spoken-language interpreting has garnered an increasing amount of attention from stakeholders in interpreter education, professional certification, and interpreting research. This is because in these fields assessment results provide a critical evidential basis for high-stakes decisions, such as the…
Descriptors: Translation, Language Tests, Testing, Evaluation Methods
Militsa G. Ivanova; Michalis P. Michaelides – Practical Assessment, Research & Evaluation, 2023
Research on methods for measuring examinee engagement with constructed-response items is limited. The present study used data from the PISA 2018 Reading domain to construct and compare indicators of test-taking effort on constructed-response items: response time, number of actions, the union (combining effortless responses detected by either…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Meagan Karvonen; Russell Swinburne Romine; Amy K. Clark – Practical Assessment, Research & Evaluation, 2024
This paper describes methods and findings from student cognitive labs, teacher cognitive labs, and test administration observations as evidence evaluated in a validity argument for a computer-based alternate assessment for students with significant cognitive disabilities. Validity of score interpretations and uses for alternate assessments based…
Descriptors: Students with Disabilities, Intellectual Disability, Severe Disabilities, Student Evaluation
Mansooreh Hosseinnia; Zahra Kafi – Language Testing in Asia, 2024
As testing involves various aspects of education as well as the ones who are involved like instructors, students, managers, teacher trainers, testers, and decision-makers, it comes to be highly crucial to develop ethical tests. In addition, as some methods of testing are more favored and practiced compared to others without considering the ethical…
Descriptors: Test Construction, Test Validity, Ethics, Testing
Yousuf, Mustafa S.; Miles, Katherine; Harvey, Heather; Al-Tamimi, Mohammad; Badran, Darwish – Journal of University Teaching and Learning Practice, 2022
Exams should be valid, reliable, and discriminative. Multiple informative methods are used for exam analysis. Displaying analysis results numerically, however, may not be easily comprehended. Using graphical analysis tools could be better for the perception of analysis results. Two such methods were employed: standardized x-bar control charts with…
Descriptors: Multiple Choice Tests, Testing, Test Reliability, Test Validity