Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 11 |
Since 2016 (last 10 years) | 27 |
Since 2006 (last 20 years) | 71 |
Descriptor
Evaluation Methods | 119 |
Scoring | 119 |
Test Validity | 61 |
Validity | 47 |
Student Evaluation | 31 |
Test Reliability | 23 |
Reliability | 20 |
Computer Assisted Testing | 18 |
Elementary Secondary Education | 18 |
Interrater Reliability | 18 |
Foreign Countries | 17 |
Author
Bejar, Isaac I. | 3 |
Baker, Eva L. | 2 |
Borko, Hilda | 2 |
Clariana, Roy B. | 2 |
Darling-Hammond, Linda | 2 |
Downer, Jason T. | 2 |
Hambleton, Ronald K. | 2 |
Han, Chao | 2 |
Johnson, Robert L. | 2 |
Kane, Thomas J. | 2 |
Oliveri, María Elena | 2 |
Location
Australia | 5 |
Vermont | 3 |
California | 2 |
Canada | 2 |
Connecticut | 2 |
New Hampshire | 2 |
New York | 2 |
Rhode Island | 2 |
Singapore | 2 |
United Kingdom (England) | 2 |
Alabama | 1 |
Laws, Policies, & Programs
Every Student Succeeds Act… | 3 |
Comprehensive Education… | 1 |
Elementary and Secondary… | 1 |
No Child Left Behind Act 2001 | 1 |
Deborah Oluwadele; Yashik Singh; Timothy Adeliyi – Electronic Journal of e-Learning, 2024
Validation is needed for any newly developed model or framework because it requires several real-life applications. The investment made into e-learning in medical education is daunting, as is the expectation for a positive return on investment. The medical education domain requires data-wise implementation of e-learning as the debate continues…
Descriptors: Electronic Learning, Evaluation Methods, Medical Education, Sustainability
Han, Chao – Language Testing, 2022
Over the past decade, testing and assessing spoken-language interpreting has garnered an increasing amount of attention from stakeholders in interpreter education, professional certification, and interpreting research. This is because in these fields assessment results provide a critical evidential basis for high-stakes decisions, such as the…
Descriptors: Translation, Language Tests, Testing, Evaluation Methods
Rafner, Janet; Biskjaer, Michael Mose; Zana, Blanka; Langsford, Steven; Bergenholtz, Carsten; Rahimi, Seyedahmad; Carugati, Andrea; Noy, Lior; Sherson, Jacob – Creativity Research Journal, 2022
Creativity assessments should be valid, reliable, and scalable to support various stakeholders (e.g., policy-makers, educators, corporations, and the general public) in their decision-making processes. Established initiatives toward scalable creativity assessments have relied on well-studied standardized tests. Although robust in many ways, most…
Descriptors: Creativity, Evaluation Methods, Video Games, Computer Assisted Testing
Roduta Roberts, Mary; Gotch, Chad M.; Cook, Megan; Werther, Karin; Chao, Iris C. I. – Measurement: Interdisciplinary Research and Perspectives, 2022
Performance-based assessment is a common approach to assess the development and acquisition of practice competencies among health professions students. Judgments related to the quality of performance are typically operationalized as ratings against success criteria specified within a rubric. The extent to which the rubric is understood,…
Descriptors: Protocol Analysis, Scoring Rubrics, Interviews, Performance Based Assessment
Dorsey, David W.; Michaels, Hillary R. – Journal of Educational Measurement, 2022
We have dramatically advanced our ability to create rich, complex, and effective assessments across a range of uses through technology advancement. Artificial Intelligence (AI) enabled assessments represent one such area of advancement--one that has captured our collective interest and imagination. Scientists and practitioners within the domains…
Descriptors: Validity, Ethics, Artificial Intelligence, Evaluation Methods
Zychowicz, Katarzyna; Biedron, Adriana; Pawlak, Miroslaw – Studies in Second Language Learning and Teaching, 2017
Individual differences in second language acquisition (SLA) encompass differences in working memory capacity, which is believed to be one of the most crucial factors influencing language learning. However, in Poland research on the role of working memory in SLA is scarce due to a lack of proper Polish instruments for measuring this construct. The…
Descriptors: Verbal Ability, Short Term Memory, Individual Differences, Second Language Learning
Mattern, Krista; Radunzel, Justine – ACT, Inc., 2019
When applicants take the ACT® more than once, how do colleges and universities reconcile and make sense of the multiple scores? In terms of validity, fairness, and impact on subgroup differences, are certain score-use policies better than others? The focus of this issue brief is to summarize evidence on the validity and fairness of various…
Descriptors: Scoring, College Entrance Examinations, Test Validity, Evaluation Methods
Chan, Sathena; May, Lyn – Language Testing, 2023
Despite the increased use of integrated tasks in high-stakes academic writing assessment, research on rating criteria which reflect the unique construct of integrated summary writing skills is comparatively rare. Using a mixed-method approach of expert judgement, text analysis, and statistical analysis, this study examines writing features that…
Descriptors: Scoring, Writing Evaluation, Reading Tests, Listening Skills
Feranchak, Bret; Deiger, Megan – AERA Online Paper Repository, 2017
Increasingly, content area projects and programs at the K-12 level, such as in mathematics, involve a programmatic component or project emphasis on developing "teacher leadership". However, there is no consistent definition or framework for this construct and even fewer validated tools for measuring it. This paper describes our efforts in…
Descriptors: Teacher Leadership, Mathematics Instruction, Guidelines, Elementary Secondary Education
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Ziwei Zhou – ProQuest LLC, 2020
In light of the ever-increasing capability of computer technology and advancement in speech and natural language processing techniques, automated speech scoring of constructed responses is gaining popularity in many high-stakes assessment and low-stakes educational settings. Automated scoring is a highly interdisciplinary and complex subject, and…
Descriptors: Certification, Speech Skills, Automation, Scoring
Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022
Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…
Descriptors: Validity, Discourse Analysis, Databases, Scoring
Sharakhimov, Shoaziz; Nurmukhamedov, Ulugbek – English Teaching Forum, 2021
Vocabulary learning is an incremental process. Vocabulary knowledge, especially for second-language learners, may develop across a lifetime. Teachers with experience in providing feedback on their students' vocabulary use in writing or speech might have noticed that it is sometimes difficult to pinpoint one aspect of word knowledge. The reason is…
Descriptors: Vocabulary Development, Second Language Learning, Second Language Instruction, English (Second Language)
Bell, Courtney A.; Jones, Nathan D.; Qi, Yi; Lewis, Jennifer M. – Educational Assessment, 2018
All 50 states use observations to evaluate practicing teachers, but we know little about how administrators actually reason when they use those observation protocols. Drawing on think-aloud and stimulated recall data, this study describes the types of strategies and warrants practicing administrators used when rating with their district's…
Descriptors: Administrators, Observation, Validity, Logical Thinking
L. Hannah; E. E. Jang; M. Shah; V. Gupta – Language Assessment Quarterly, 2023
Machines have a long-demonstrated ability to find statistical relationships between qualities of texts and surface-level linguistic indicators of writing. More recently, unlocked by artificial intelligence, the potential of using machines to identify content-related writing trait criteria has been uncovered. This development is significant,…
Descriptors: Validity, Automation, Scoring, Writing Assignments