ERIC - Search Results

Showing all 14 results

Using Generalizability Analysis to Estimate Parameters for Anatomy Assessments: A Multi-institutional Study

Peer reviewed

Byram, Jessica N.; Seifert, Mark F.; Brooks, William S.; Fraser-Cotlin, Laura; Thorp, Laura E.; Williams, James M.; Wilson, Adam B. – Anatomical Sciences Education, 2017

With integrated curricula and multidisciplinary assessments becoming more prevalent in medical education, there is a continued need for educational research to explore the advantages, consequences, and challenges of integration practices. This retrospective analysis investigated the number of items needed to reliably assess anatomical knowledge in…

Descriptors: Anatomy, Science Tests, Test Items, Test Reliability

Multidimensional CAT Item Selection Methods for Domain Scores and Composite Scores: Theory and Applications

Peer reviewed

Yao, Lihua – Psychometrika, 2012

Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure…

Descriptors: Item Banks, Test Length, Simulation, Adaptive Testing

Ongoing Issues in Test Fairness

Peer reviewed

Camilli, Gregory – Educational Research and Evaluation, 2013

In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…

Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format

A Method for Increasing the Reliability of a Short Multiple-Choice Test.

Peer reviewed

Serlin, Ronald C.; Kaiser, Henry F. – Educational and Psychological Measurement, 1978

When multiple-choice tests are scored in the usual manner, giving each correct answer one point, information concerning response patterns is lost. A method for utilizing this information is suggested. An example is presented and compared with two conventional methods of scoring. (Author/JKS)

Descriptors: Correlation, Factor Analysis, Item Analysis, Multiple Choice Tests

Item Homogeneity, Scale Reliability, and the Self-Concept Hypothesis

Peer reviewed

Taylor, James B. – Educational and Psychological Measurement, 1977

The reliability and item homogeneity of personality scales are in part dependent on the content domain being sampled, and this characteristic reliability cannot be explained by item ambiguity or scale length. It is suggested that clarity of self concept is also a determinant. (Author/JKS)

Descriptors: Item Analysis, Personality Assessment, Personality Measures, Personality Theories

An Investigation of the Differential Effort Received by Items on a Low-Stakes Computer-Based Test

Peer reviewed

Wise, Steven L. – Applied Measurement in Education, 2006

In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found…

Descriptors: Computer Assisted Testing, Motivation, Test Validity, Item Response Theory

Test Length and Validity: An Application of Test Theory to a Finite World.

Myers, Charles T. – 1978

The viewpoint is expressed that adding to test reliability by either selecting a more homogeneous set of items, restricting the range of item difficulty as closely as possible to the most efficient level, or increasing the number of items will not add to test validity and that there is considerable danger that efforts to increase reliability may…

Descriptors: Achievement Tests, Item Analysis, Multiple Choice Tests, Test Construction

Comparative Racial Analysis of Enlisted Advancement Exams: Item Differentiation. Final Report.

Robertson, David W.; And Others – 1977

A comparative study of item analysis was conducted on the basis of race to determine whether alternative test construction or processing might increase the proportion of black enlisted personnel among those passing various military technical knowledge examinations. The study used data from six specialists at four grade levels and investigated item…

Descriptors: Difficulty Level, Enlisted Personnel, Item Analysis, Occupational Tests

The Effect of Keying All Options Correct on Equating Functions and Scores.

Lenel, Julia C.; Gilmer, Jerry S. – 1986

In some testing programs an early item analysis is performed before final scoring in order to validate the intended keys. As a result, some items which are flawed and do not discriminate well may be keyed so as to give credit to examinees no matter which answer was chosen. This is referred to as allkeying. This research examined how varying the…

Descriptors: Equated Scores, Item Analysis, Latent Trait Theory, Licensing Examinations (Professions)

An Approach to Measuring the Achievement or Proficiency of an Examinee.

Wilcox, Rand R. – 1979

Mastery tests are analyzed in terms of the number of skills to be mastered and the number of items per skill, in order that correct decisions of mastery or nonmastery will be made to a desired degree of probability. It is assumed that a random sample of skills will be selected for measurement, that each skill will be measured by the same number of…

Descriptors: Achievement Tests, Cutting Scores, Decision Making, Equivalency Tests

Computer Application Issues in Certification and Licensure Testing.

Harnisch, Delwyn L. – 1985

Computer adaptive testing systems are feasible for certification and licensure testing. This is in part due to the availability of extensive yet inexpensive computers. Modern item response theory, combined with computerized adaptive testing, yields a powerful new method of testing which provides greater accuracy and efficiency and less boredom for…

Descriptors: Adaptive Testing, Certification, Computer Assisted Testing, Cost Effectiveness

Manual for the USES Basic Occupational Literacy Test. Section 2: Development.

Manpower Administration (DOL), Washington, DC. – 1972

The Basic Occupational Literacy Test (BOLT) was developed as an achievement test of basic skills in reading and arithmetic, for educationally disadvantaged adults. The objective was to develop a test appropriate for this population with regard to content, format, instructions, timing, norms, and difficulty level. A major issue, the use of grade…

Descriptors: Achievement Tests, Adult Basic Education, Adults, Basic Skills

Evaluations of Implied Orders as a Basis for Tailored Testing Using Simulations. Technical Report No. 4.

Cliff, Norman; And Others – 1977

TAILOR is a computer program that uses the implied orders concept as the basis for computerized adaptive testing. The basic characteristics of TAILOR, which does not involve pretesting, are reviewed here and two studies of it are reported. One is a Monte Carlo simulation based on the four-parameter Birnbaum model and the other uses a matrix of…

Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Programs, Difficulty Level

The AAMD Adaptive Behavior Scale--Public School Version: A Normative Study.

Boyd, Lenore A.; Chissom, Brad – 1977

This normative study of the American Association on Mental Deficiency (AAMD) Adaptive Behavior Scale--Public School Version was based on 291 Texas public school children divided into 12 categories. The categories were: age, ethnic, or racial group (white or non-white), and assignment to regular classes or special education classes for the educable…

Descriptors: Adjustment (to Environment), Behavior Rating Scales, Elementary Education, Handicapped Children
