Showing 1 to 15 of 18 results
Peer reviewed
Uysal, Ibrahim; Sahin-Kürsad, Merve; Kiliç, Abdullah Faruk – Participatory Educational Research, 2022
The aim of the study was to examine whether the common items in mixed-format tests (e.g., multiple-choice and essay items) contain parameter drift in test equating performed with the common-item nonequivalent groups design. In this study, which was carried out using a Monte Carlo simulation with a fully crossed design, the factors of test…
Descriptors: Test Items, Test Format, Item Response Theory, Equated Scores
Peer reviewed
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
Peer reviewed
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology
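The HCI itself is not reproduced here, but a simplified person-fit index over a prerequisite hierarchy conveys the idea. In the Python sketch below, the prerequisite pairs, item names, and normalization are illustrative assumptions rather than Cui and Leighton's published formula; only the -1.0 to 1.0 range is taken from the abstract.

```python
# Simplified sketch of a hierarchy-based person-fit index in the spirit of
# the HCI described above. The prerequisite structure and the exact
# normalization are illustrative assumptions, not the published statistic.

def hierarchy_fit(responses, prerequisites):
    """responses: dict item -> 0/1 score for one examinee.
    prerequisites: list of (easier, harder) pairs, where answering the
    harder item correctly presumes mastery of the easier one."""
    comparisons = 0
    misfits = 0
    for easier, harder in prerequisites:
        if responses[harder] == 1:       # examinee solved the harder item
            comparisons += 1
            if responses[easier] == 0:   # ...but missed its prerequisite
                misfits += 1
    if comparisons == 0:
        return 1.0                       # no evidence of misfit
    # Scale so zero misfits -> +1.0 and all-misfit -> -1.0, matching the
    # -1.0 to 1.0 range reported for the HCI.
    return 1.0 - 2.0 * misfits / comparisons

# Example: items a (easy) -> b -> c (hard)
prereqs = [("a", "b"), ("b", "c")]
print(hierarchy_fit({"a": 1, "b": 1, "c": 1}, prereqs))  # 1.0, consistent
print(hierarchy_fit({"a": 0, "b": 0, "c": 1}, prereqs))  # -1.0, misfitting
```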
Peer reviewed
Conger, Anthony J. – Educational and Psychological Measurement, 1983
A paradoxical phenomenon of decreases in reliability as the number of elements averaged over increases is shown to be possible in multifacet reliability procedures (intraclass correlations or generalizability coefficients). Conditions governing this phenomenon are presented along with implications and cautions. (Author)
Descriptors: Generalizability Theory, Test Construction, Test Items, Test Length
Peer reviewed
Wainer, Howard; Kiely, Gerard L. – Journal of Educational Measurement, 1987
The testlet, a bundle of test items, alleviates some problems associated with computerized adaptive testing: context effects, lack of robustness, and item difficulty ordering. While testlets may be linear or hierarchical, the most useful ones are four-level hierarchical units, containing 15 items and partitioning examinees into 16 classes. (GDC)
Descriptors: Adaptive Testing, Computer Assisted Testing, Context Effect, Item Banks
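The figures in this abstract follow from binary-tree arithmetic: a four-level routing tree holds 1 + 2 + 4 + 8 = 15 items, each examinee answers one item per level, and the 2^4 = 16 exit points are the classes. The sketch below assumes a complete binary tree stored in an array; the routing rule (right after a correct answer, left after an error) is an illustrative convention, not necessarily Wainer and Kiely's.

```python
# Sketch of routing through a four-level hierarchical testlet. Items sit at
# array indices 0..14 (children of node i are 2i+1 and 2i+2); after the
# fourth answer the examinee lands in one of 16 exit classes.

def route_testlet(answer_item):
    """answer_item(index) -> True if the examinee answers item `index`
    correctly. Returns the exit class, 0..15."""
    node = 0
    for _ in range(4):             # one item per level, four items in total
        correct = answer_item(node)
        # right child after a correct answer, left child after an error
        node = 2 * node + 2 if correct else 2 * node + 1
    return node - 15               # exit nodes 15..30 map to classes 0..15

# Example: an examinee who answers every routing item correctly
print(route_testlet(lambda i: True))    # 15, the highest of the 16 classes
print(route_testlet(lambda i: False))   # 0, the lowest class
```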
Stocking, Martha L. – 1994
As adaptive testing moves toward operational implementation in large scale testing programs, where it is important that adaptive tests be as parallel as possible to existing linear tests, a number of practical issues arise. This paper concerns three such issues. First, optimum item pool size is difficult to determine in advance of pool…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Standards
Graham, Darol L. – 1974
The adequacy of a test developed for statewide assessment of basic mathematics skills was investigated. The test, composed of multiple-choice items reflecting a series of behavioral objectives, was compared with a more extensive criterion measure generated from the same objectives by the application of a strict item sampling model. In many…
Descriptors: Comparative Testing, Criterion Referenced Tests, Educational Assessment, Item Sampling
Myers, Charles T. – 1978
The viewpoint is expressed that adding to test reliability by selecting a more homogeneous set of items, restricting the range of item difficulty as closely as possible to the most efficient level, or increasing the number of items will not add to test validity, and that there is considerable danger that efforts to increase reliability may…
Descriptors: Achievement Tests, Item Analysis, Multiple Choice Tests, Test Construction
Davey, Tim; Pommerich, Mary; Thompson, Tony D. – 1999
In computerized adaptive testing (CAT), new or experimental items are frequently administered alongside operational tests to gather the pretest data needed to replenish and replace item pools. The two basic strategies used to combine pretest and operational items are embedding and appending. Variable-length CATs are preferred because of the…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Measurement Techniques
Wainer, Howard – 1985
It is important to estimate the number of examinees who reached a test item, because item difficulty is defined by the number who answered correctly divided by the number who reached the item. A new method is presented and compared to the previously used definition of three categories of response to an item: (1) answered; (2) omitted--a…
Descriptors: College Entrance Examinations, Difficulty Level, Estimation (Mathematics), High Schools
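The abstract states the classical definition directly: item difficulty is the number answering correctly divided by the number reaching the item. Below is a minimal sketch under the conventional rule that a trailing run of blanks counts as "not reached"; the paper's proposed estimation method is not reproduced here.

```python
# Minimal sketch of the classical item-difficulty computation described
# above: p = (number answering correctly) / (number reaching the item).
# The "trailing blanks are not reached" rule is the classic convention.

def item_difficulty(response_matrix, key):
    """response_matrix: one list of answers per examinee, None = blank.
    key: list of correct answers per item. Returns p per item."""
    p_values = []
    for j in range(len(key)):
        reached = correct = 0
        for answers in response_matrix:
            # last item this examinee actually answered; blanks after it
            # count as not reached, blanks before it count as omitted
            last = max((k for k, a in enumerate(answers) if a is not None),
                       default=-1)
            if j <= last:
                reached += 1
                if answers[j] == key[j]:
                    correct += 1
        p_values.append(correct / reached if reached else float("nan"))
    return p_values

answers = [["b", "c", None, None],   # reached items 1-2 only
           ["b", "a", "d", None]]    # reached items 1-3
print(item_difficulty(answers, ["b", "c", "d", "a"]))  # [1.0, 0.5, 1.0, nan]
```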
Rudner, Lawrence M. – 1978
Tailored testing provides the same information as group-administered standardized tests, but can do so using fewer items because the items administered are selected for the ability of the individual student. Thus, tailored testing offers several advantages over traditional methods. Because individual tailored tests are not timed, anxiety is…
Descriptors: Ability, Adaptive Testing, Bayesian Statistics, Computer Assisted Testing
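The core of tailoring as summarized here is selecting each item to match the examinee's current ability estimate, so fewer items are needed than on a fixed test. The sketch below uses a hypothetical item pool, a halving step-size update, and a fixed test length; all are illustrative assumptions rather than Rudner's procedure.

```python
# Sketch of the tailoring idea described above: pick the unused item whose
# difficulty is closest to the running ability estimate (under a Rasch model
# this is roughly the most informative item), then adjust the estimate.

def tailored_test(pool, answer, n_items=5):
    """pool: dict item_id -> difficulty (logits). answer(item_id) -> bool.
    Returns a rough ability estimate after n_items adaptive items."""
    theta, step = 0.0, 1.0
    administered = set()
    for _ in range(n_items):
        item = min((i for i in pool if i not in administered),
                   key=lambda i: abs(pool[i] - theta))
        administered.add(item)
        theta += step if answer(item) else -step
        step /= 2                   # shrink the adjustment each round
    return theta

pool = {f"q{k}": d for k, d in enumerate([-2, -1, -0.5, 0, 0.5, 1, 2])}
# Example: a high-ability examinee who answers everything correctly
print(tailored_test(pool, lambda item: True))   # estimate climbs toward 2
```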
Oosterhof, Albert C.; Coats, Pamela K. – 1981
Instructors who develop classroom examinations that require students to provide a numerical response to a mathematical problem are often very concerned about the appropriateness of the multiple-choice format. The present study augments previous research relevant to this concern by comparing the difficulty and reliability of multiple-choice and…
Descriptors: Comparative Analysis, Difficulty Level, Grading, Higher Education
Lenel, Julia C.; Gilmer, Jerry S. – 1986
In some testing programs an early item analysis is performed before final scoring in order to validate the intended keys. As a result, some items which are flawed and do not discriminate well may be keyed so as to give credit to examinees no matter which answer was chosen. This is referred to as all-keying. This research examined how varying the…
Descriptors: Equated Scores, Item Analysis, Latent Trait Theory, Licensing Examinations (Professions)
Boyd, Thomas A.; Tramontana, Michael G. – 1984
To examine the validity of short forms of the Wechsler Intelligence Scale for Children-Revised (WISC-R), the WISC-R was first administered to 106 hospitalized psychiatric patients, aged 8-16. No subjects had a primary diagnosis of mental retardation or learning disability, and one-third were receiving psychotropic medication. WISC-R IQ scores…
Descriptors: Adolescents, Children, Correlation, Elementary Secondary Education
Wilcox, Rand R. – 1979
Mastery tests are analyzed in terms of the number of skills to be mastered and the number of items per skill, in order that correct decisions of mastery or nonmastery will be made to a desired degree of probability. It is assumed that a random sample of skills will be selected for measurement, that each skill will be measured by the same number of…
Descriptors: Achievement Tests, Cutting Scores, Decision Making, Equivalency Tests
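The decision framework in this abstract lends itself to a binomial calculation: with a fixed number of items per skill and an assumed per-item success probability, the chance of a correct mastery or nonmastery decision can be computed directly. The cutoff and probabilities below are assumed values for illustration, not Wilcox's.

```python
# Sketch of the binomial calculation behind the abstract above: given
# n items for a skill and a per-item success probability, how often is
# the mastery decision at a given cutoff correct?

from math import comb

def p_at_least(n_items, cutoff, p_item):
    """P(at least `cutoff` of `n_items` correct), binomial with p_item."""
    return sum(comb(n_items, k) * p_item**k * (1 - p_item)**(n_items - k)
               for k in range(cutoff, n_items + 1))

# A true master (p = .8) passes a 4-of-5 cutoff with probability:
print(round(p_at_least(5, 4, 0.8), 3))       # 0.737
# A nonmaster (p = .4) is correctly failed with probability:
print(round(1 - p_at_least(5, 4, 0.4), 3))   # 0.913
```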