Isbell, Dan; Winke, Paula – Language Testing, 2019
The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…
Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format

Conger, Anthony J. – Educational and Psychological Measurement, 1983
A paradoxical phenomenon of decreases in reliability as the number of elements averaged over increases is shown to be possible in multifacet reliability procedures (intraclass correlations or generalizability coefficients). Conditions governing this phenomenon are presented along with implications and cautions. (Author)
Descriptors: Generalizability Theory, Test Construction, Test Items, Test Length

Budescu, David – Journal of Educational Measurement, 1985
An important determinant of equating process efficiency is the correlation between the anchor test and components of each form. Use of some monotonic function of this correlation as a measure of equating efficiency is suggested. A model relating anchor test length and test reliability to this measure of efficiency is presented. (Author/DWH)
Descriptors: Correlation, Equated Scores, Mathematical Models, Standardized Tests
Myers, Charles T. – 1978
The viewpoint is expressed that adding to test reliability by either selecting a more homogeneous set of items, restricting the range of item difficulty as closely as possible to the most efficient level, or increasing the number of items will not add to test validity and that there is considerable danger that efforts to increase reliability may…
Descriptors: Achievement Tests, Item Analysis, Multiple Choice Tests, Test Construction
Oosterhof, Albert C.; Coats, Pamela K. – 1981
Instructors who develop classroom examinations that require students to provide a numerical response to a mathematical problem are often very concerned about the appropriateness of the multiple-choice format. The present study augments previous research relevant to this concern by comparing the difficulty and reliability of multiple-choice and…
Descriptors: Comparative Analysis, Difficulty Level, Grading, Higher Education
Hopper, Margaret F. – 2001
This paper provides an overview of the types of testing accommodations used for students with disabilities and presents arguments for and against their use. It begins by discussing student participation in educational assessments and federal requirements concerning the participation of students with disabilities. The types of accommodations are…
Descriptors: Academic Accommodations (Disabilities), Academic Standards, Disabilities, Educational Assessment
Freedman, Sarah Warshauer – 1991
Writing teachers and educators can add to information from large-scale testing and teachers can strengthen classroom assessment by creating a tight fit between large-scale testing and classroom assessment. Across the years, large-scale testing programs have struggled with a difficult problem: how to evaluate student writing reliably and…
Descriptors: Elementary Secondary Education, Foreign Countries, Informal Assessment, Portfolios (Background Materials)
Lenel, Julia C.; Gilmer, Jerry S. – 1986
In some testing programs an early item analysis is performed before final scoring in order to validate the intended keys. As a result, some items which are flawed and do not discriminate well may be keyed so as to give credit to examinees no matter which answer was chosen. This is referred to as allkeying. This research examined how varying the…
Descriptors: Equated Scores, Item Analysis, Latent Trait Theory, Licensing Examinations (Professions)
Hambleton, Ronald K. – 1986
The problem of determining optimal test lengths with fixed total testing time has proved to be a difficult one for criterion-referenced test developers. An algorithm is needed which can be used by test developers to allocate available testing time to maximize the validity of their total criterion-referenced tests or testing programs. To be…
Descriptors: Algorithms, Criterion Referenced Tests, Elementary Secondary Education, Psychometrics
Wilcox, Rand R. – 1979
Mastery tests are analyzed in terms of the number of skills to be mastered and the number of items per skill, in order that correct decisions of mastery or nonmastery will be made to a desired degree of probability. It is assumed that a random sample of skills will be selected for measurement, that each skill will be measured by the same number of…
Descriptors: Achievement Tests, Cutting Scores, Decision Making, Equivalency Tests
Harnisch, Delwyn L. – 1985
Computer adaptive testing systems are feasible for certification and licensure testing. This is in part due to the availability of extensive yet inexpensive computers. Modern item response theory, combined with computerized adaptive testing, yields a powerful new method of testing which provides greater accuracy and efficiency and less boredom for…
Descriptors: Adaptive Testing, Certification, Computer Assisted Testing, Cost Effectiveness
Larson, Gordon A.; And Others – 1981
Adult educators in the state of New Jersey have expressed concern about the adequacy of the 1978 General Educational Development (GED) test. Based on these concerns research was conducted in the use and applicability of the current GED test. The project investigated whether the existing GED test was meeting the needs of the large majority of GED…
Descriptors: Adult Students, Community Attitudes, Educational Attitudes, Educational Needs