Showing all 11 results
Peer reviewed
He, Wei; Reckase, Mark D. – Educational and Psychological Measurement, 2014
For computerized adaptive tests (CATs) to work well, they must have an item pool with sufficient numbers of good quality items. Many researchers have pointed out that, in developing item pools for CATs, not only is the item pool size important but also the distribution of item parameters and practical considerations such as content distribution…
Descriptors: Item Banks, Test Length, Computer Assisted Testing, Adaptive Testing
Clements, Andrea D.; Rothenberg, Lori – Research in the Schools, 1996
Undergraduate psychology examinations from 48 schools were analyzed to determine the proportion of items at each level of Bloom's Taxonomy, item format, and test length. Analyses indicated significant relationships between item complexity and test length even when taking format into account. Use of higher-level items may be related to shorter tests,…
Descriptors: Classification, Difficulty Level, Educational Objectives, Higher Education
Hambleton, Ronald K.; Cook, Linda L. – 1978
The purpose of the present research was to study, systematically, the "goodness-of-fit" of the one-, two-, and three-parameter logistic models. We studied, using computer-simulated test data, the effects of four variables: variation in item discrimination parameters, the average value of the pseudo-chance level parameters, test length,…
Descriptors: Career Development, Difficulty Level, Goodness of Fit, Item Analysis
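The one-, two-, and three-parameter logistic models whose fit Hambleton and Cook studied share one item response function, with the simpler models obtained by fixing parameters. A minimal sketch of the three-parameter (3PL) form (function name and example values are illustrative, not from the study):

```python
import math

def p_correct(theta, a, b, c):
    """Three-parameter logistic (3PL) item response function.

    theta: examinee ability
    a: item discrimination parameter
    b: item difficulty parameter
    c: pseudo-chance (lower-asymptote) parameter
    Returns the probability of a correct response.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Setting c = 0 gives the 2PL model; additionally fixing a = 1 gives the 1PL.
```

At theta equal to the difficulty b, the probability is halfway between the pseudo-chance level c and 1, which is one way the parameters are interpreted.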
Bergstrom, Betty; And Others – 1994
Examinee response times from a computerized adaptive certification examination taken by 204 examinees were analyzed using a hierarchical linear model. Two equations were posed: a within-person model and a between-person model. Variance within persons was eight times greater than variance between persons. Several variables…
Descriptors: Adaptive Testing, Adults, Certification, Computer Assisted Testing
Wainer, Howard; Thissen, David – 1994
When an examination consists in whole or part of constructed response test items, it is common practice to allow the examinee to choose a subset of the constructed response questions from a larger pool. It is sometimes argued that, if choice were not allowed, the limitations on domain coverage forced by the small number of items might unfairly…
Descriptors: Constructed Response, Difficulty Level, Educational Testing, Equated Scores
Scheetz, James P.; Forsyth, Robert A. – 1977
Empirical evidence is presented related to the effects of using a stratified sampling of items in multiple matrix sampling on the accuracy of estimates of the population mean. Data were obtained from a sample of 600 high school students for a 36-item mathematics test and a 40-item vocabulary test, both subtests of the Iowa Tests of Educational…
Descriptors: Achievement Tests, Difficulty Level, Item Analysis, Item Sampling
Byars, Alvin Gregg – 1980
The objectives of this investigation are to develop, describe, assess, and demonstrate procedures for constructing mastery tests to minimize errors of classification and to maximize decision reliability. The guidelines are based on conditions where item exchangeability is a reasonable assumption and the test constructor can control the number of…
Descriptors: Cutting Scores, Difficulty Level, Grade 4, Intermediate Grades
Catts, Ralph – 1978
The reliability of multiple choice tests--containing different numbers of response options--was investigated for 260 students enrolled in technical college economics courses. Four test forms, constructed from previously used four-option items, were administered, consisting of (1) 60 two-option items--two distractors randomly discarded; (2) 40…
Descriptors: Answer Sheets, Difficulty Level, Foreign Countries, Higher Education
Robertson, David W.; And Others – 1977
A comparative study of item analysis was conducted on the basis of race to determine whether alternative test construction or processing might increase the proportion of black enlisted personnel among those passing various military technical knowledge examinations. The study used data from six specialties at four grade levels and investigated item…
Descriptors: Difficulty Level, Enlisted Personnel, Item Analysis, Occupational Tests
Hambleton, Ronald K.; And Others – 1987
The study compared two promising item response theory (IRT) item-selection methods, optimal and content-optimal, with two non-IRT item selection methods, random and classical, for use in fixed-length certification exams. The four methods were used to construct 20-item exams from a pool of approximately 250 items taken from a 1985 certification…
Descriptors: Comparative Analysis, Content Validity, Cutting Scores, Difficulty Level
Manpower Administration (DOL), Washington, DC. – 1972
The Basic Occupational Literacy Test (BOLT) was developed as an achievement test of basic skills in reading and arithmetic, for educationally disadvantaged adults. The objective was to develop a test appropriate for this population with regard to content, format, instructions, timing, norms, and difficulty level. A major issue, the use of grade…
Descriptors: Achievement Tests, Adult Basic Education, Adults, Basic Skills