Fu, Jianbin; Qu, Yanxuan – ETS Research Report Series, 2018
Various subscore estimation methods that use auxiliary information to improve subscore accuracy and stability have been developed. This report provides a review of various subscore estimation methods described in the literature. The methodology of each method is described, then research studies on these subscore estimation methods are summarized.…
Descriptors: Scores, Evaluation Methods, Item Response Theory, Test Items
Bao, Lei; Xiao, Yang; Koenig, Kathleen; Han, Jing – Physical Review Physics Education Research, 2018
In science, technology, engineering, and mathematics education there has been increased emphasis on teaching goals that include not only the learning of content knowledge but also the development of scientific reasoning skills. The Lawson classroom test of scientific reasoning (LCTSR) is a popular assessment instrument for scientific reasoning.…
Descriptors: Science Tests, Science Process Skills, Logical Thinking, Test Validity
Giraldo, Frank – HOW, 2019
The purpose of this article of reflection is to raise awareness of how poor design of language assessments may have detrimental effects, if crucial qualities and technicalities of test design are not met. The article first discusses these central qualities for useful language assessments. Then, guidelines for creating listening assessments, as an…
Descriptors: Test Construction, Consciousness Raising, Language Tests, Second Language Learning
Autman, Hamlet; Kelly, Stephanie – Business and Professional Communication Quarterly, 2017
This article contains two measurement development studies on writing apprehension. Study 1 reexamines the validity of the writing apprehension measure based on the finding from prior research that a second false factor was embedded. The findings from Study 1 support the validity of a reduced measure with 6 items versus the original 20-item…
Descriptors: Writing Apprehension, Writing Tests, Test Validity, Test Reliability
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Kirkpatrick, Robert; Hlaing, Hmone Lian – Language Testing in Asia, 2013
This study examines the English section of the university entrance examination in Myanmar in terms of validity, reliability, practicality, and washback. The study highlights the significance of the matriculation examination, evaluates individual test items, and presents the opinions of teachers and students about the test. The results reveal that…
Descriptors: Foreign Countries, College Entrance Examinations, Test Reliability, Test Items
Taskinen, Päivi H.; Steimel, Jochen; Gräfe, Linda; Engell, Sebastian; Frey, Andreas – Peabody Journal of Education, 2015
This study examined students' competencies in engineering education at the university level. First, we developed a competency model in one specific field of engineering: process dynamics and control. Then, the theoretical model was used as a frame to construct test items to measure students' competencies comprehensively. In the empirical…
Descriptors: Models, Engineering Education, Test Items, Outcome Measures
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format

Conger, Anthony J. – Educational and Psychological Measurement, 1983
A paradoxical phenomenon of decreases in reliability as the number of elements averaged over increases is shown to be possible in multifacet reliability procedures (intraclass correlations or generalizability coefficients). Conditions governing this phenomenon are presented along with implications and cautions. (Author)
Descriptors: Generalizability Theory, Test Construction, Test Items, Test Length
Andrich, David – 1984
Both the attenuation paradox of traditional test theory and the assumption of local independence in person-item response theory have caused problems in interpretation. This paper demonstrates that the two are related concepts, and, through this demonstration, both are clarified. It is demonstrated that the breakdown of local independence leads to…
Descriptors: Latent Trait Theory, Test Interpretation, Test Items, Test Reliability

Lord, Frederic M. – Journal of Educational Measurement, 1977
Two approaches for determining the optimal number of choices for a test item, presently in the literature, are compared with two new approaches. (Author)
Descriptors: Forced Choice Technique, Latent Trait Theory, Multiple Choice Tests, Test Items
Fishman, Judith – Writing Program Administration, 1984
Examines the CUNY-WAT program and questions many aspects of it, especially the choice and phrasing of topics. (FL)
Descriptors: Essay Tests, Higher Education, Test Format, Test Items

Rusch, Reuben; Steiner, Judith – Journal of Experimental Education, 1979
The Selected Marker Tests were examined for scoring problems and internal consistency and were administered orally to sixth and seventh graders. Scoring problems were discovered and changes were suggested. The problem was found to be item reliability rather than interrater reliability. (Author/MH)
Descriptors: Cognitive Tests, Elementary Education, Item Analysis, Problem Solving
Haenn, Joseph F. – 1981
Procedures for conducting functional level testing have been available for use by practitioners for some time. However, the Title I Evaluation and Reporting System (TIERS), developed in response to the Education Amendments of 1974 to the Elementary and Secondary Education Act (ESEA), has provided the impetus for widespread adoption of this…
Descriptors: Achievement Tests, Difficulty Level, Scores, Scoring

Popham, W. James – Reading Horizons, 1982
Details the steps followed in the development of the Basic Skills Word List. (FL)
Descriptors: Elementary Education, Readability, Reading Tests, Test Construction