Tao, Wei; Cao, Yi – Applied Measurement in Education, 2016
Current procedures for equating number-correct scores using traditional item response theory (IRT) methods assume local independence. However, when tests are constructed using testlets, one concern is the violation of the local item independence assumption. The testlet response theory (TRT) model is one way to accommodate local item dependence.…
Descriptors: Item Response Theory, Equated Scores, Test Format, Models
Engelhard, George, Jr.; Wind, Stefanie A. – College Board, 2013
The major purpose of this study is to examine the quality of ratings assigned to CR (constructed-response) questions in large-scale assessments from the perspective of Rasch Measurement Theory. Rasch Measurement Theory provides a framework for the examination of rating scale category structure that can yield useful information for interpreting the…
Descriptors: Measurement Techniques, Rating Scales, Test Theory, Scores
Steinmetz, Jean-Paul; Brunner, Martin; Loarer, Even; Houssemand, Claude – Psychological Assessment, 2010
The Wisconsin Card Sorting Test (WCST) assesses executive and frontal lobe function and can be administered manually or by computer. Despite the widespread application of the 2 versions, the psychometric equivalence of their scores has rarely been evaluated and only a limited set of criteria has been considered. The present experimental study (N =…
Descriptors: Computer Assisted Testing, Psychometrics, Test Theory, Scores
van der Linden, Wim J. – Measurement: Interdisciplinary Research and Perspectives, 2010
The traditional way of equating the scores on a new test form X to those on an old form Y is equipercentile equating for a population of examinees. Because the population is likely to change between the two administrations, a popular approach is to equate for a "synthetic population." The authors of the articles in this issue of the…
Descriptors: Test Format, Equated Scores, Population Distribution, Population Trends
Brooks, Lindsay – Language Testing, 2009
This study, framed within sociocultural theory, examines the interaction of adult ESL test-takers in two tests of oral proficiency: one in which they interacted with an examiner (the individual format) and one in which they interacted with another student (the paired format). The data for the eight pairs in this study were drawn from a larger…
Descriptors: Testing, Rating Scales, Program Effectiveness, Interaction

Holland, Paul W.; Hoskens, Machteld – Psychometrika, 2003
Gives an account of classical test theory that shows how it can be viewed as a mean and variance approximation to a general version of item response theory and then shows how this approach can give insight into predicting the true score of a test and the true scores of tests not necessarily parallel to the given test. (SLD)
Descriptors: Prediction, Test Format, Test Theory, True Scores

Kolstad, Rosemarie K.; And Others – Journal of Research and Development in Education, 1985
Multiple-choice questions that could logically provide two or more choices block the expression of judgment, thereby suppressing measurement of learning and failing to provide feedback to students and teachers. This study compares the effects of content-identical multiple-choice and multiple true-false items on students' decisions. (MT)
Descriptors: Evaluation Methods, Higher Education, Knowledge Level, Test Format
Purves, Alan; And Others – 1990
A study examined the results of an administration of a series of theoretically based prototype tests to 857 high school students in California, New York, and Wisconsin. By revising the existing framework of a prior study, tests were devised which attempted to measure three interrelated aspects of school literature: background knowledge, the…
Descriptors: Educational Research, Educational Testing, High Schools, Literature

Chambers, William V. – Social Behavior and Personality, 1985
Personal construct psychologists have suggested various psychological functions explain differences in the stability of constructs. Among these functions are constellatory and loose construction. This paper argues that measurement error is a more parsimonious explanation of the differences in construct stability reported in these studies. (Author)
Descriptors: Error of Measurement, Test Construction, Test Format, Test Reliability

Pumfrey, Peter D. – Journal of Research in Reading, 1987
Discusses, for the benefit of research workers and other test users, the ongoing controversy concerning the relative merits of conventional test theory and Rasch scaling in the construction of reading tests. Concludes that a great deal of further research is required to see whether these approaches are educationally valid. (JD)
Descriptors: Reading Research, Reading Tests, Test Construction, Test Format

Adler, Nurit; Guttman, Ruth – Educational and Psychological Measurement, 1982
Thirteen ability tests were administered as defined within a mapping sentence containing four content facets: rule type, expression mode, language of communication and dimensionality of portrayed object. Smallest Space Analysis of intercorrelations among test scores showed the radex structure of the two-dimensional space conformed to the…
Descriptors: Content Analysis, Factor Structure, Intelligence Tests, Scores
Svinicki, Marilla; Koch, Bill – Innovation Abstracts, 1984
The decision of whether to use essay tests or multiple choice tests depends on several qualifiers related to the different characteristics of the tests and the needs of the situation. The most important qualifier involves matching the type of test to the instructional objectives being tested, with multiple choice tests being used to measure a…
Descriptors: Comparative Analysis, Essay Tests, Multiple Choice Tests, Test Format
Berger, Martijn P. F.; Veerkamp, Wim J. J. – 1994
The designing of tests has been a source of concern for test developers over the past decade. Various kinds of test forms have been applied. Among these are the fixed-form test, the adaptive test, and the testlet. Each of these forms has its own design. In this paper, the construction of test forms is placed within the general framework of optimal…
Descriptors: Adaptive Testing, Foreign Countries, Research Design, Selection

Bieliauskas, Vytautas J.; Farragher, John – Journal of Clinical Psychology, 1983
Administered the House-Tree-Person test to male college students (N=24) to examine the effects of varying the size of the drawing form on the scores. Results suggested that use of the drawing sheet did not have a significant influence upon the quantitative aspects of the drawing. (LLL)
Descriptors: College Students, Higher Education, Intelligence Tests, Males

Little, Roderick J. A.; Rubin, Donald B. – Journal of Educational and Behavioral Statistics, 1994
Equating a new standard test to an old reference test is considered for the case in which samples for equating are not randomly selected from the target population of test takers, and two problems that arise when equating from biased samples are identified. An empirical example with data from the Armed Services Vocational Aptitude Battery illustrates the approach. (SLD)
Descriptors: Equated Scores, Military Personnel, Sampling, Statistical Analysis