Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 6 |
| Since 2017 (last 10 years) | 13 |
| Since 2007 (last 20 years) | 32 |
Descriptor
Source
Author
Publication Type
Education Level
Audience
Location
| Germany | 2 |
| Iran | 2 |
| Japan | 2 |
| Texas | 2 |
| Turkey | 2 |
| Arizona | 1 |
| California | 1 |
| California (Los Angeles) | 1 |
| Colorado | 1 |
| Costa Rica | 1 |
| Delaware | 1 |
| More ▼ | |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 2 |
Assessments and Surveys
What Works Clearinghouse Rating
Glory Tobiason; Adrienne Lavine – Change: The Magazine of Higher Learning, 2025
Current methods for evaluating faculty teaching fall short, and one way to address this is through campus-wide initiatives that focus on change at the level of academic units. The complex context of higher education makes meaningful teaching evaluation difficult; in particular, four sobering realities of this context must be taken into account in…
Descriptors: Teacher Evaluation, Evaluation Methods, Testing Problems, Educational Change
Atehortua, Laura – ProQuest LLC, 2022
Intelligence tests are used in a variety of settings such as schools, clinics, and courts to assess the intellectual capacity of individuals of all ages. Intelligence tests are used to make high-stakes decisions such as special education placement, employment, eligibility for social security services, and determination of the death penalty.…
Descriptors: Adults, Intelligence Tests, Children, Error of Measurement
Skaggs, Gary; Hein, Serge F.; Wilkins, Jesse L. M. – Educational Measurement: Issues and Practice, 2020
In test-centered standard-setting methods, borderline performance can be represented by many different profiles of strengths and weaknesses. As a result, asking panelists to estimate item or test performance for a hypothetical group study of borderline examinees, or a typical borderline examinee, may be an extremely difficult task and one that can…
Descriptors: Standard Setting (Scoring), Cutting Scores, Testing Problems, Profiles
Arefsadr, Sajjad; Babaii, Esmat – TESL-EJ, 2023
According to the IELTS official website, IELTS candidates usually score lower in the IELTS Writing test than in the other language skills. This is disappointing for the many IELTS candidates who fail to get the overall band score they need. Surprisingly enough, few studies have addressed this issue. The present study, then, is aimed at shedding…
Descriptors: Second Language Learning, Language Tests, English (Second Language), Foreign Countries
Leventhal, Brian C.; Grabovsky, Irina – Educational Measurement: Issues and Practice, 2020
Standard setting is arguably one of the most subjective techniques in test development and psychometrics. The decisions when scores are compared to standards, however, are arguably the most consequential outcomes of testing. Providing licensure to practice in a profession has high stake consequences for the public. Denying graduation or forcing…
Descriptors: Standard Setting (Scoring), Weighted Scores, Test Construction, Psychometrics
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020
A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelist's ability to perform well the Bookmark method, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…
Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items
LaFlair, Geoffrey T.; Langenfeld, Thomas; Baig, Basim; Horie, André Kenji; Attali, Yigal; von Davier, Alina A. – Journal of Computer Assisted Learning, 2022
Background: Digital-first assessments leverage the affordances of technology in all elements of the assessment process--from design and development to score reporting and evaluation to create test taker-centric assessments. Objectives: The goal of this paper is to describe the engineering, machine learning, and psychometric processes and…
Descriptors: Computer Assisted Testing, Affordances, Scoring, Engineering
Cesur, Kursat – Educational Policy Analysis and Strategic Research, 2019
Examinees' performances are assessed using a wide variety of different techniques. Multiple-choice (MC) tests are among the most frequently used ones. Nearly, all standardized achievement tests make use of MC test items and there is a variety of ways to score these tests. The study compares number right and liberal scoring (SAC) methods. Mixed…
Descriptors: Multiple Choice Tests, Scoring, Evaluation Methods, Guessing (Tests)
Sarac, Merve; Loken, Eric – International Journal of Testing, 2023
This study is an exploratory analysis of examinee behavior in a large-scale language proficiency test. Despite a number-right scoring system with no penalty for guessing, we found that 16% of examinees omitted at least one answer and that women were more likely than men to omit answers. Item-response theory analyses treating the omitted responses…
Descriptors: English (Second Language), Language Proficiency, Language Tests, Second Language Learning
Chinda, Bordin; Cotter, Matthew; Ebrey, Matthew; Hinkelman, Don; Lambert, Peter; Miller, Annie – LEARN Journal: Language Education and Acquisition Research Network, 2022
This qualitative study investigated the washback of performance-based assessment used by three English language teachers in Hokkaido, Japan, each of whom implemented their own course-specific assessments. The study employed qualitative research of an in-depth interview with 15 students and a self-reflective method from 3 teachers. Teachers…
Descriptors: Language Tests, Testing Problems, Second Language Learning, Second Language Instruction
Xiao, Yang; Han, Jing; Koenig, Kathleen; Xiong, Jianwen; Bao, Lei – Physical Review Physics Education Research, 2018
Assessment instruments composed of two-tier multiple choice (TTMC) items are widely used in science education as an effective method to evaluate students' sophisticated understanding. In practice, however, there are often concerns regarding the common scoring methods of TTMC items, which include pair scoring and individual scoring schemes. The…
Descriptors: Hierarchical Linear Modeling, Item Response Theory, Multiple Choice Tests, Case Studies
Hoang, Ngoc Thi Huyen – Language Education & Assessment, 2019
As validity pertains to test use rather than the test itself, using a test for unintended purposes requires a new validation program using additional evidence from relevant sources. This small-scale study contributes to the validation of the use of originally academic language tests--the International English Language Testing System and the Test…
Descriptors: Language Tests, Immigrants, Immigration, Testing Problems
Clifford, Ray – Foreign Language Annals, 2016
This article summarizes some of the technical issues that add to the complexity of language testing. It focuses in particular on the criterion-referenced nature of the ACTFL Proficiency Guidelines-Speaking; and it proposes a criterion-referenced interpretation of the ACTFL guidelines for reading and listening. It then demonstrates how using…
Descriptors: Criterion Referenced Tests, Language Tests, Language Proficiency, Guidelines
Emadian, Farzaneh; Gholami, Javad; Sarkhosh, Mehdi – Journal of Teacher Education for Sustainability, 2018
The first and most crucial step towards developing a sustainable curriculum for instructors teaching English for Specific Academic Purposes (ESAP) is a needs analysis. Therefore, the main aim of conducting this study was to investigate the in-service needs of language instructors and content specialists teaching ESAP and to spot the differences…
Descriptors: English for Academic Purposes, Second Language Learning, Second Language Instruction, Inservice Teacher Education
Kahn, Josh; Nese, Joseph T.; Alonzo, Julie – Behavioral Research and Teaching, 2016
There is strong theoretical support for oral reading fluency (ORF) as an essential building block of reading proficiency. The current and standard ORF assessment procedure requires that students read aloud a grade-level passage (˜ 250 words) in a one-to-one administration, with the number of words read correctly in 60 seconds constituting their…
Descriptors: Teacher Surveys, Oral Reading, Reading Tests, Computer Assisted Testing

Peer reviewed
Direct link
