ERIC - Search Results

Showing 1 to 15 of 17 results

IRT Approaches to Modeling Scores on Mixed-Format Tests

Peer reviewed

Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020

This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…

Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests

A Comparison of Strategies for Smoothing Parameter Selection for Mixed-Format Tests under the Random Groups Design

Peer reviewed

Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2018

Smoothing techniques are designed to improve the accuracy of equating functions. The main purpose of this study is to compare seven model selection strategies for choosing the smoothing parameter (C) for polynomial loglinear presmoothing and one procedure for model selection in cubic spline postsmoothing for mixed-format pseudo tests under the…

Descriptors: Comparative Analysis, Accuracy, Models, Sample Size

Creating IRT-Based Parallel Test Forms Using the Genetic Algorithm Method

Peer reviewed

Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen – Applied Measurement in Education, 2008

In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, genetic algorithm (GA), to construct parallel…

Descriptors: Test Format, Measurement Techniques, Equations (Mathematics), Item Response Theory

Clarifying Problems and Offering Solutions for Correlated Error when Assessing the Validity of Selected-Subtest Short Forms

Peer reviewed

Girard, Todd A.; Christensen, Bruce K. – Psychological Assessment, 2008

The correlation between a short-form (SF) test and its full-scale (FS) counterpart is a mainstay in the evaluation of SF validity. However, in correcting for overlapping error variance in this measure, investigators have overattenuated the validity coefficient through an intuitive misapplication of P. Levy's (1967) formula. The authors of the…

Descriptors: Error of Measurement, Computation, Psychiatric Services, Correlation

Checking the Equivalence of Nearly Identical Test Editions.

Dorans, Neil J.; Lawrence, Ida M. – 1988

A procedure for checking the score equivalence of nearly identical editions of a test is described. The procedure employs the standard error of equating (SEE) and utilizes graphical representation of score conversion deviation from the identity function in standard error units. Two illustrations of the procedure involving Scholastic Aptitude Test…

Descriptors: Equated Scores, Error of Measurement, Test Construction, Test Format

The Performance of a Method for the Long-Term Equating of Mixed-Format Assessment

Peer reviewed

Kamata, Akihito; Tate, Richard – Journal of Educational Measurement, 2005

The goal of this study was the development of a procedure to predict the equating error associated with the long-term equating method of Tate (2003) for mixed-format tests. An expression for the determination of the error of an equating based on multiple links using the error for the component links was derived and illustrated with simulated data.…

Descriptors: Computer Simulation, Item Response Theory, Test Format, Evaluation Methods

Standard Errors of Levine Linear Equating.

Peer reviewed

Hanson, Bradley A.; And Others – Applied Psychological Measurement, 1993

The delta method was used to derive standard errors (SEs) of the Levine observed score and Levine true score linear test equating methods using data from two test forms. SEs derived without the normality assumption and bootstrap SEs were very close. The situation with skewed score distributions is also discussed. (SLD)

Descriptors: Equated Scores, Equations (Mathematics), Error of Measurement, Sampling

Corrected Estimates of WAIS-R Short Form Reliability and Standard Error of Measurement.

Peer reviewed

Axelrod, Bradley N.; And Others – Psychological Assessment, 1996

The reliabilities calculated by D. Schretlen, R. H. B. Benedict, and J. H. Bobholz (1994) for a short form of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) were consistently overestimated. More accurate values are provided for the WAIS-R and a seven-subtest short form. (SLD)

Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests

A Comparison of Three Equating Approaches to A Random-Groups, Common-Forms Design.

Ito, Kyoko; Sykes, Robert C. – 1996

Equating multiple test forms is frequently desired. When multiple forms are linked in a chain of equating, error tends to build up in the process. This paper compares three procedures for equating multiple forms in a common-form design where each school administered, in a spiraled fashion, only a subset of multiple forms. Data used were from a…

Descriptors: Comparative Analysis, Equated Scores, Error of Measurement, Grade 11

Equating Multiple Tests via an IRT Linking Design: Utilizing a Single Set of Anchor Items with Fixed Common Item Parameters during the Calibration Process.

Li, Yuan H.; Griffith, William D.; Tam, Hak P. – 1997

This study explores the relative merits of a potentially useful item response theory (IRT) linking design: using a single set of anchor items with fixed common item parameters (FCIP) during the calibration process. An empirical study was conducted to investigate the appropriateness of this linking design using 6 groups of students taking 6 forms…

Descriptors: Ability, Difficulty Level, Equated Scores, Error of Measurement

Equating Scores from Adaptive to Linear Tests

Peer reviewed

van der Linden, Wim J. – Applied Psychological Measurement, 2006

Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test. In an empirical study, the methods were evaluated against a method based on the test characteristic function (TCF) of the linear test and traditional equipercentile equating applied to the ability estimates on the adaptive test…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Format, Equated Scores

A Test Reliability Analysis of an Abbreviated Version of the Pupil Control Ideology Form.

Gaffney, Patrick V. – 1997

A reliability analysis was conducted of an abbreviated, 10-item version of the Pupil Control Ideology Form (PCI), using the Cronbach's alpha technique (L. J. Cronbach, 1951) and the computation of the standard error of measurement. The PCI measures a teacher's orientation toward pupil control. Subjects were 168 preservice teachers from one private…

Descriptors: Classroom Techniques, Discipline, Error of Measurement, Higher Education

Methodological Issues Related to the Study of Context Effects in Multisection Tests.

Stewart, E. Elizabeth – 1981

Context effects are defined as being influences on test performance associated with the content of successively presented test items or sections. Four types of context effects are identified: (1) direct context effects (practice effects) which occur when performance on items is affected by the examinee having been exposed to similar types of…

Descriptors: Context Effect, Data Collection, Error of Measurement, Evaluation Methods

Confidence in Pass/Fail Decisions for Computer Adaptive and Paper and Pencil Examinations.

Peer reviewed

Bergstrom, Betty A.; Lunz, Mary E. – Evaluation and the Health Professions, 1992

The level of confidence in pass/fail decisions obtained with computerized adaptive tests and paper-and-pencil tests was greater for 645 medical technology students when the computer adaptive test implemented a 90 percent confidence stopping rule than for paper-and-pencil tests of comparable length. (SLD)

Descriptors: Adaptive Testing, Comparative Testing, Computer Assisted Testing, Confidence Testing

The Determination of Empirical Standard Errors of Equating the Scores on SAT-Verbal and SAT-Mathematical.

Angoff, William H. – 1991

An attempt was made to evaluate the standard error of equating (at the mean of the scores) in an ongoing testing program. The interest in estimating the empirical standard error of equating is occasioned by some discomfort with the error normally reported for test scores. Data used for this evaluation came from the Admissions Testing Program of…

Descriptors: College Entrance Examinations, Equated Scores, Error of Measurement, High School Students
