Showing 1 to 15 of 23 results
Benton, Tom – Research Matters, 2021
Computer adaptive testing is intended to make assessment more reliable by tailoring the difficulty of the questions a student has to answer to their level of ability. Most commonly, this benefit is used to justify shortening tests while retaining the reliability of a longer, non-adaptive test. Improvements due to adaptive…
Descriptors: Risk, Item Response Theory, Computer Assisted Testing, Difficulty Level
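The Benton abstract above describes the core mechanism of computer adaptive testing: each new item is chosen to match the current estimate of the examinee's ability. A minimal sketch of that loop, assuming a Rasch (one-parameter logistic) item model and a simple fixed-step ability update rather than any estimator used in the studies listed here:

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model,
    given ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(theta, bank, used):
    """Pick the unused item whose difficulty is closest to the current
    ability estimate (where Rasch item information is maximal)."""
    return min((i for i in range(len(bank)) if i not in used),
               key=lambda i: abs(bank[i] - theta))

def run_cat(bank, answers, n_items=5, step=0.7):
    """Minimal adaptive loop: after each response, nudge the ability
    estimate up (correct) or down (incorrect) by a fixed step."""
    theta, used = 0.0, set()
    for _ in range(n_items):
        i = next_item(theta, bank, used)
        used.add(i)
        theta += step if answers(i) else -step
    return theta
```

Because items near the provisional ability estimate are the most informative, a short adaptive sequence can match the measurement precision of a longer fixed-form test, which is the trade-off Benton examines.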
Peer reviewed
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Peer reviewed
Isbell, Dan; Winke, Paula – Language Testing, 2019
The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…
Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning
Peer reviewed
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
Peer reviewed
Streiner, David L.; Miller, Harold R. – Journal of Clinical Psychology, 1986
Numerous short forms of the Minnesota Multiphasic Personality Inventory have been proposed in the last 15 years. In each case, the initial enthusiasm has been replaced by questions about the clinical utility of the abbreviated version. Argues that the statistical properties of the test and reduced reliability due to shortening the scales…
Descriptors: Test Construction, Test Format, Test Length, Test Reliability
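The reliability loss from abbreviating a scale, which Streiner and Miller raise against MMPI short forms, can be projected with the standard Spearman-Brown prophecy formula. A minimal sketch (the specific short forms discussed in the article are not reproduced here; the numbers below are illustrative):

```python
def spearman_brown(rho, k):
    """Projected reliability when test length is multiplied by k.
    k < 1 models a shortened form; k > 1 a lengthened one."""
    return k * rho / (1 + (k - 1) * rho)

# Illustration: halving a test whose full-length reliability is .90
# drops the projected reliability to about .82.
short_rel = spearman_brown(0.90, 0.5)
```

The formula shows why shortening is not free: reliability falls faster than many short-form proponents assume, which is the core of the authors' caution.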
Peer reviewed
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Peer reviewed
Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
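The Qualls abstract does not reproduce the specific reliability estimate it proposes; one standard coefficient for a test composed of multiple item formats is stratified alpha, which treats each format as a stratum. A sketch under the assumption that each stratum's score variance and internal-consistency alpha are known:

```python
def stratified_alpha(part_vars, part_alphas, total_var):
    """Stratified coefficient alpha for a composite of item-format
    strata: one minus the summed unreliable variance of the parts,
    divided by the variance of the total score."""
    unreliable = sum(v * (1 - a) for v, a in zip(part_vars, part_alphas))
    return 1.0 - unreliable / total_var
```

Because the parts are allowed to differ in variance and reliability, this coefficient avoids the underestimation that a single alpha over a mixed-format total can produce.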
Peer reviewed
Donders, Jacques – Psychological Assessment, 1997
Eight subtests were selected from the Wechsler Intelligence Scale for Children--Third Edition (WISC-III) to make a short form for clinical use. Results with the 2,200 children from the WISC-III standardization sample indicated adequate reliability and validity of the short form. (SLD)
Descriptors: Children, Clinical Diagnosis, Intelligence Tests, Test Format
Peer reviewed
Axelrod, Bradley N.; And Others – Psychological Assessment, 1996
The calculations of D. Schretlen, R. H. B. Benedict, and J. H. Bobholz (1994) for the reliabilities of a short form of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) consistently overestimated the values. More accurate values are provided for the WAIS-R and a seven-subtest short form. (SLD)
Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests
Peer reviewed
Green, Kathy – Journal of Experimental Education, 1979
Reliabilities and concurrent validities of teacher-made multiple-choice and true-false tests were compared. No significant differences were found even when multiple-choice reliability was adjusted to equate testing time. (Author/MH)
Descriptors: Comparative Testing, Higher Education, Multiple Choice Tests, Test Format
Haladyna, Tom; Roid, Gale – 1981
Two approaches to criterion-referenced test construction are compared. Classical test theory is based on the practice of random sampling from a well-defined domain of test items; latent trait theory suggests that the difficulty of the items should be matched to the achievement level of the student. In addition to these two methods of test…
Descriptors: Criterion Referenced Tests, Error of Measurement, Latent Trait Theory, Test Construction
Rodriguez-Aragon, Graciela; And Others – 1993
The predictive power of the Split-Half version of the Wechsler Intelligence Scale for Children--Revised (WISC-R) Object Assembly (OA) subtest was compared to that of the full administration of the OA subtest. A cohort of 218 male and 49 female adolescent offenders detained in a Texas juvenile detention facility between 1990 and 1992 was used. The…
Descriptors: Adolescents, Cohort Analysis, Comparative Testing, Correlation
Metropolitan Atlanta Consortium of Consultants and Lead Speech-Language Pathologists, GA. – 1990
This guide presents ratings of assessment instruments for use by speech-language pathologists with preschool students. Tests are reviewed in alphabetical order on forms filled out by practicing speech-language pathologists, including data on speech components covered by each test, age range, factors of norms where norms are used, reliability,…
Descriptors: Diagnostic Tests, Examiners, Preschool Education, Preschool Tests
Oosterhof, Albert C.; Coats, Pamela K. – 1981
Instructors who develop classroom examinations that require students to provide a numerical response to a mathematical problem are often very concerned about the appropriateness of the multiple-choice format. The present study augments previous research relevant to this concern by comparing the difficulty and reliability of multiple-choice and…
Descriptors: Comparative Analysis, Difficulty Level, Grading, Higher Education
Eignor, Daniel R.; Hambleton, Ronald K. – 1979
The purpose of the investigation was to obtain some relationships among (1) test lengths, (2) shape of domain-score distributions, (3) advancement scores, and (4) several criterion-referenced test score reliability and validity indices. The study was conducted using computer simulation methods. The values of variables under study were set to be…
Descriptors: Comparative Analysis, Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores