ERIC - Search Results

Showing 1 to 15 of 66 results

An Extension of IRT-Based Equating to the Dichotomous Testlet Response Theory Model

Peer reviewed

Tao, Wei; Cao, Yi – Applied Measurement in Education, 2016

Current procedures for equating number-correct scores using traditional item response theory (IRT) methods assume local independence. However, when tests are constructed using testlets, one concern is the violation of the local item independence assumption. The testlet response theory (TRT) model is one way to accommodate local item dependence.…

Descriptors: Item Response Theory, Equated Scores, Test Format, Models

Rating Quality Studies Using Rasch Measurement Theory. Research Report 2013-3

Engelhard, George, Jr.; Wind, Stefanie A. – College Board, 2013

The major purpose of this study is to examine the quality of ratings assigned to CR (constructed-response) questions in large-scale assessments from the perspective of Rasch Measurement Theory. Rasch Measurement Theory provides a framework for the examination of rating scale category structure that can yield useful information for interpreting the…

Descriptors: Measurement Techniques, Rating Scales, Test Theory, Scores

Incomplete Psychometric Equivalence of Scores Obtained on the Manual and the Computer Version of the Wisconsin Card Sorting Test?

Peer reviewed

Steinmetz, Jean-Paul; Brunner, Martin; Loarer, Even; Houssemand, Claude – Psychological Assessment, 2010

The Wisconsin Card Sorting Test (WCST) assesses executive and frontal lobe function and can be administered manually or by computer. Despite the widespread application of the 2 versions, the psychometric equivalence of their scores has rarely been evaluated and only a limited set of criteria has been considered. The present experimental study (N =…

Descriptors: Computer Assisted Testing, Psychometrics, Test Theory, Scores

On Bias in Linear Observed-Score Equating

Peer reviewed

van der Linden, Wim J. – Measurement: Interdisciplinary Research and Perspectives, 2010

The traditional way of equating the scores on a new test form X to those on an old form Y is equipercentile equating for a population of examinees. Because the population is likely to change between the two administrations, a popular approach is to equate for a "synthetic population." The authors of the articles in this issue of the…

Descriptors: Test Format, Equated Scores, Population Distribution, Population Trends

Interacting in Pairs in a Test of Oral Proficiency: Co-Constructing a Better Performance

Peer reviewed

Brooks, Lindsay – Language Testing, 2009

This study, framed within sociocultural theory, examines the interaction of adult ESL test-takers in two tests of oral proficiency: one in which they interacted with an examiner (the individual format) and one in which they interacted with another student (the paired format). The data for the eight pairs in this study were drawn from a larger…

Descriptors: Testing, Rating Scales, Program Effectiveness, Interaction

Classical Test Theory as a First-Order Item Response Theory: Application to True-Score Prediction from a Possibly Nonparallel Test.

Peer reviewed

Holland, Paul W.; Hoskens, Machteld – Psychometrika, 2003

Gives an account of classical test theory that shows how it can be viewed as a mean and variance approximation to a general version of item response theory and then shows how this approach can give insight into predicting the true score of a test and the true scores of tests not necessarily parallel to the given test. (SLD)

Descriptors: Prediction, Test Format, Test Theory, True Scores

Format-Dependent Selection of Choices on MC and MTF Test Items.

Peer reviewed

Kolstad, Rosemarie K.; And Others – Journal of Research and Development in Education, 1985

Multiple choice questions that could logically provide two or more choices block the expression of judgment, thereby suppressing measurement of learning and failing to provide feedback to students and teachers. This study compares the effects of content-identical multiple choice and multiple true-false items on students' decisions. (MT)

Descriptors: Evaluation Methods, Higher Education, Knowledge Level, Test Format

Prototype Measures of the Domain of Learning in Literature. Report Series 3.3.

Purves, Alan; And Others – 1990

A study examined the results of an administration of a series of theoretically based prototype tests to 857 high school students in California, New York, and Wisconsin. By revising the existing framework of a prior study, tests were devised which attempted to measure three interrelated aspects of school literature: background knowledge, the…

Descriptors: Educational Research, Educational Testing, High Schools, Literature

Measurement Error and Changes in Personal Constructs.

Peer reviewed

Chambers, William V. – Social Behavior and Personality, 1985

Personal construct psychologists have suggested various psychological functions explain differences in the stability of constructs. Among these functions are constellatory and loose construction. This paper argues that measurement error is a more parsimonious explanation of the differences in construct stability reported in these studies. (Author)

Descriptors: Error of Measurement, Test Construction, Test Format, Test Reliability

Rasch Scaling and Reading Tests.

Peer reviewed

Pumfrey, Peter D. – Journal of Research in Reading, 1987

Discusses, for the benefit of research workers and other test users, the ongoing controversy concerning the relative merits of conventional test theory and Rasch scaling in the construction of reading tests. Concludes that a great deal of further research is required to see whether these approaches are educationally valid. (JD)

Descriptors: Reading Research, Reading Tests, Test Construction, Test Format

The Radex Structure of Intelligence: A Replication.

Peer reviewed

Adler, Nurit; Guttman, Ruth – Educational and Psychological Measurement, 1982

Thirteen ability tests were administered as defined within a mapping sentence containing four content facets: rule type, expression mode, language of communication and dimensionality of portrayed object. Smallest Space Analysis of intercorrelations among test scores showed the radex structure of the two-dimensional space conformed to the…

Descriptors: Content Analysis, Factor Structure, Intelligence Tests, Scores

The Great Essay/Multiple Choice Debate: Different Strokes for Different Folks.

Svinicki, Marilla; Koch, Bill – Innovation Abstracts, 1984

The decision of whether to use essay tests or multiple choice tests depends on several qualifiers related to the different characteristics of the tests and the needs of the situation. The most important qualifier involves matching the type of test to the instructional objectives being tested, with multiple choice tests being used to measure a…

Descriptors: Comparative Analysis, Essay Tests, Multiple Choice Tests, Test Format

A Review of Selection Methods for Optimal Test Design. Research Report 94-4.

Berger, Martijn P. F.; Veerkamp, Wim J. J. – 1994

The designing of tests has been a source of concern for test developers over the past decade. Various kinds of test forms have been applied. Among these are the fixed-form test, the adaptive test, and the testlet. Each of these forms has its own design. In this paper, the construction of test forms is placed within the general framework of optimal…

Descriptors: Adaptive Testing, Foreign Countries, Research Design, Selection

The Effect of Change of the Size of the Drawing Sheet on the H-T-P IQ Scores.

Peer reviewed

Bieliauskas, Vytautas J.; Farragher, John – Journal of Clinical Psychology, 1983

Administered the House-Tree-Person test to male college students (N=24) to examine the effects of varying the size of the drawing form on the scores. Results suggested that use of the drawing sheet did not have a significant influence upon the quantitative aspects of the drawing. (LLL)

Descriptors: College Students, Higher Education, Intelligence Tests, Males

Test Equating from Biased Samples, with Application to the Armed Services Vocational Aptitude Battery.

Peer reviewed

Little, Roderick J. A.; Rubin, Donald B. – Journal of Educational and Behavioral Statistics, 1994

Equating a new standard test to an old reference test is considered for the case in which samples for equating are not randomly selected from the target population of test takers, and two problems with equating from biased samples are identified. An empirical example with data from the Armed Services Vocational Aptitude Battery illustrates the approach. (SLD)

Descriptors: Equated Scores, Military Personnel, Sampling, Statistical Analysis
