ERIC - Search Results

Showing all 13 results

ACTFL Oral Proficiency Interview -- Computer (OPIc)

Peer reviewed

Isbell, Dan; Winke, Paula – Language Testing, 2019

The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…

Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning

Ongoing Issues in Test Fairness

Peer reviewed

Camilli, Gregory – Educational Research and Evaluation, 2013

In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…

Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format

Estimating the Reliability of a Test Containing Multiple Item Formats.

Peer reviewed

Qualls, Audrey L. – Applied Measurement in Education, 1995

Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)

Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format

The Classification Accuracy of Shortened versus Full Length Tests with Number Correct Scoring.

Schulz, E. Matthew; Wang, Lin – 2001

In this study, items were drawn from a full-length test of 30 items in order to construct shorter tests for the purpose of making accurate pass/fail classifications with regard to a specific criterion point on the latent ability metric. A three-parameter Item Response Theory (IRT) framework was used. The criterion point on the latent ability…

Descriptors: Ability, Classification, Item Response Theory, Pass Fail Grading

Corrected Estimates of WAIS-R Short Form Reliability and Standard Error of Measurement.

Peer reviewed

Axelrod, Bradley N.; And Others – Psychological Assessment, 1996

The calculations of D. Schretlen, R. H. B. Benedict, and J. H. Bobholz (1994) for the reliabilities of a short form of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) consistently overestimated the values. More accurate values are provided for the WAIS-R and a seven-subtest short form. (SLD)

Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests

Selective Reminding Test Short Form Administration: A Comparison of Two through Twelve Trials.

Peer reviewed

Smith, Renee L.; And Others – Psychological Assessment, 1995

The clinical utility of using fewer than 12 trials of the Selective Reminding Test, a task to assess verbal memory, was studied with 100 cardiac patients and 100 brain injury patients. Results suggest that as few as 6 trials might be adequate, providing information consistent with that from 12 trials. (SLD)

Descriptors: Clinical Diagnosis, Diagnostic Tests, Head Injuries, Memory

Three Practical Issues for Modern Adaptive Testing Item Pools.

Stocking, Martha L. – 1994

As adaptive testing moves toward operational implementation in large scale testing programs, where it is important that adaptive tests be as parallel as possible to existing linear tests, a number of practical issues arise. This paper concerns three such issues. First, optimum item pool size is difficult to determine in advance of pool…

Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Standards

Some Empirical Guidelines for Building Testlets. Program Statistics Research Technical Report No. 91-14.

Wainer, Howard; And Others – 1991

A series of computer simulations was run to measure the relationship between testlet validity and the factors of item pool size and testlet length for both adaptive and linearly constructed testlets. Results confirmed the generality of earlier empirical findings of H. Wainer and others (1991) that making a testlet adaptive yields only marginal…

Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Simulation, Item Banks

Inferring Examinee Ability When Some Item Responses Are Missing.

Mislevy, Robert J.; Wu, Pao-Kuei – 1988

The basic equations of item response theory provide a foundation for inferring examinees' abilities and items' operating characteristics from observed responses. In practice, though, examinees will usually not have provided a response to every available item--for reasons that may or may not have been intended by the test administrator, and that…

Descriptors: Ability, Adaptive Testing, Equations (Mathematics), Estimation (Mathematics)

Development of a Short Form for the Raven Advanced Progressive Matrices Test.

Peer reviewed

Arthur, Winfred, Jr.; Day, David V. – Educational and Psychological Measurement, 1994

The development of a short form of the Raven Advanced Progressive Matrices Test is reported. Results from 3 studies with 663 college students indicate that the short form demonstrates psychometric properties similar to the long form yet requires a substantially shorter administration time. (SLD)

Descriptors: Cognitive Ability, College Students, Educational Research, Higher Education

Testing Testing Testing.

Peer reviewed

Deville, Craig; O'Neill, Thomas; Wright, Benjamin D.; Woodcock, Richard W.; Munoz-Sandoval, Ana; Gershon, Richard C.; Bergstrom, Betty – Popular Measurement, 1998

Articles in this special section consider (1) flow in test taking (Craig Deville); (2) testwiseness (Thomas O'Neill); (3) test length (Benjamin Wright); (4) cross-language test equating (Richard W. Woodcock and Ana Munoz-Sandoval); (5) computer-assisted testing and testwiseness (Richard Gershon and Betty Bergstrom); and (6) Web-enhanced testing…

Descriptors: Computer Assisted Testing, Educational Testing, Equated Scores, Measurement Techniques

On Examinee Choice in Educational Testing. GRE Board Professional Report No. 91-17P.

Wainer, Howard; Thissen, David – 1994

When an examination consists in whole or part of constructed response test items, it is common practice to allow the examinee to choose a subset of the constructed response questions from a larger pool. It is sometimes argued that, if choice were not allowed, the limitations on domain coverage forced by the small number of items might unfairly…

Descriptors: Constructed Response, Difficulty Level, Educational Testing, Equated Scores

An Adaptive Algebra Test: A Testlet-Based, Hierarchically-Structured Test with Validity-Based Scoring. Technical Report No. 90-92.

Wainer, Howard; And Others – 1990

The initial development of a testlet-based algebra test was previously reported (Wainer and Lewis, 1990). This account provides the details of this excursion into the use of hierarchical testlets and validity-based scoring. A pretest of two 15-item hierarchical testlets was carried out in which examinees' performance on a 4-item subset of each…

Descriptors: Adaptive Testing, Algebra, Comparative Analysis, Computer Assisted Testing
