Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 5 |
Since 2006 (last 20 years) | 14 |
Descriptor
Item Analysis | 14 |
Testing Problems | 14 |
Test Items | 9 |
Foreign Countries | 4 |
Item Response Theory | 4 |
Test Validity | 4 |
Evaluation Methods | 3 |
Test Construction | 3 |
Test Reliability | 3 |
Academic Standards | 2 |
Alternative Assessment | 2 |
Publication Type
Journal Articles | 12 |
Reports - Research | 5 |
Reports - Evaluative | 4 |
Reports - Descriptive | 3 |
Dissertations/Theses -… | 2 |
Education Level
Elementary Secondary Education | 3 |
Higher Education | 2 |
Postsecondary Education | 2 |
Secondary Education | 2 |
Location
Colombia | 1 |
Netherlands | 1 |
Russia | 1 |
Turkey | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
Kaufman Test of Educational… | 1 |
Wechsler Individual… | 1 |
von Davier, Matthias; Bezirhan, Ummugul – Educational and Psychological Measurement, 2023
Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population…
Descriptors: Robustness (Statistics), Test Items, Item Analysis, Goodness of Fit
Camenares, Devin – International Journal for the Scholarship of Teaching and Learning, 2022
Balancing assessment of learning outcomes with the expectations of students is a perennial challenge in education. Difficult exams, in which many students perform poorly, exacerbate this problem and can inspire a wide variety of interventions, such as a grading curve. However, addressing poor performance can sometimes distort or inflate grades and…
Descriptors: College Students, Student Evaluation, Tests, Test Items
Janssen, Gerriet – Language Testing, 2022
This article provides a single, common-case study of a test retrofit project at one Colombian university. It reports on how the test retrofit project was carried out and describes the different areas of language assessment literacy the project afforded local teacher stakeholders. This project was successful in that it modified the test constructs…
Descriptors: Language Tests, Placement Tests, Language Teachers, College Faculty
Paneerselvam, Bavani – ProQuest LLC, 2017
Multiple-choice retrieval practice with additional lures reduces retention on a later test (Roediger & Marsh, 2005). However, the mechanism underlying the negative outcomes with additional lures is poorly understood. Given that the positive outcomes of retrieval practice are associated with enhanced relational and item-specific processing…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Recall (Psychology)
Karagöl, Efecan – Journal of Language and Linguistic Studies, 2020
Turkish and Foreign Languages Research and Application Center (TÖMER) is one of the important institutions for learning Turkish as a foreign language. In these institutions, proficiency tests are applied at the end of each level. However, test applications in TÖMERs vary between each center as there is no shared program in teaching Turkish as a…
Descriptors: Language Tests, Turkish, Language Proficiency, Second Language Learning
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Marushina, Albina – Journal of Mathematics Education at Teachers College, 2012
This paper aims to tell how the Russian national examination in mathematics (the Uniform State Examination or USE) has been conducted most recently. The author must say at once that the history of the system of secondary school graduation examinations or even the history of the USE will be covered only to the small degree that is necessary for…
Descriptors: Foreign Countries, Mathematics Tests, National Competency Tests, Secondary School Mathematics
Sawchuk, Stephen – Education Week, 2010
Most experts in the testing community have presumed that the $350 million promised by the U.S. Department of Education to support common assessments would promote those that made greater use of open-ended items capable of measuring higher-order critical-thinking skills. But as measurement experts consider the multitude of possibilities for an…
Descriptors: Test Items, Federal Legislation, Scoring, Accountability
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
van Rijn, P. W.; Beguin, A. A.; Verstralen, H. H. F. M. – Assessment in Education: Principles, Policy & Practice, 2012
While measurement precision is relatively easy to establish for single tests and assessments, it is much more difficult to determine for decision making with multiple tests on different subjects. This latter is the situation in the system of final examinations for secondary education in the Netherlands and is used as an example in this paper. This…
Descriptors: Secondary Education, Tests, Foreign Countries, Decision Making
Kozloff, Allison Burstein – ProQuest LLC, 2009
Comprehensive academic achievement tests are routinely used by school psychologists in psycho-educational assessment batteries to identify learning disabled students. A variety of assessment measures are used across age groups to determine if a discrepancy exists between academic achievement and intellectual functioning; however, among the most…
Descriptors: Intelligence, Educational Assessment, Academic Achievement, Achievement Tests
Wollack, James A. – Applied Measurement in Education, 2006
Many of the currently available statistical indexes to detect answer copying lack sufficient power at small α levels or when the amount of copying is relatively small. Furthermore, there is no one index that is uniformly best. Depending on the type or amount of copying, certain indexes are better than others. The purpose of this article was…
Descriptors: Statistical Analysis, Item Analysis, Test Length, Sample Size
Ketterlin-Geller, Leanne R. – Remedial and Special Education, 2007
When accurately assigned and administered appropriately, testing accommodations help ameliorate the effects of personal characteristics that limit access to critical information and prevent a person from demonstrating his or her true abilities in the tested domain. Inaccurate assignment or misuse of accommodations may counteract the benefits of…
Descriptors: Testing Accommodations, Individualized Instruction, Individualized Education Programs, Error of Measurement