Showing all 14 results
Peer reviewed
Mingfeng Xue; Ping Chen – Journal of Educational Measurement, 2025
Response styles pose a serious threat to psychological measurement. This research compares IRTree models and anchoring vignettes in addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and at the total-score level (ratios of extreme and middle responses to vignettes). Four models…
Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes
Peer reviewed
Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang – Journal of Educational Measurement, 2015
Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
Descriptors: Classification, Reliability, Accuracy, Cognitive Tests
Peer reviewed
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests
Peer reviewed
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Journal of Educational Measurement, 2008
This study addressed the sampling error and linking bias that occur with small samples in a nonequivalent groups anchor test design. We proposed a linking method called the synthetic function, which is a weighted average of the identity function and a traditional equating function (in this case, the chained linear equating function). Specifically,…
Descriptors: Equated Scores, Sample Size, Test Reliability, Comparative Analysis
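The synthetic function described in the Kim, von Davier, and Haberman abstract above is, in words, a weighted average of the identity function and a chained linear equating function. A minimal sketch, assuming a user-supplied weight and purely illustrative linear coefficients (none of these values come from the study):

```python
# Hedged sketch of a "synthetic" linking function: a weighted average of the
# identity function and a chained linear equating function. The weight w and
# the linear coefficients are illustrative assumptions, not study values.

def chained_linear(x: float, slope: float, intercept: float) -> float:
    """Chained linear equating of a Form X score onto the Form Y scale."""
    return slope * x + intercept

def synthetic_link(x: float, w: float, slope: float, intercept: float) -> float:
    """Weighted average of the traditional equating function and the identity."""
    return w * chained_linear(x, slope, intercept) + (1.0 - w) * x

# With w = 0.5, a raw score of 20 lands halfway between itself and its linear link.
print(synthetic_link(20.0, w=0.5, slope=1.1, intercept=-3.0))  # 19.5
```

The apparent rationale, per the abstract, is that pulling the link toward the identity stabilizes equating when the sample is too small to estimate the traditional function reliably.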
Peer reviewed
Lord, Frederic M. – Journal of Educational Measurement, 1974
When comparing two tests that measure the same trait, separate comparisons should be made at different levels of the trait. A simple, practical, approximate formula is given for doing this. The adequacy of the approximation is illustrated using data comparing seven nationally known sixth-grade reading tests. (Author/RC)
Descriptors: Ability Identification, Comparative Analysis, Reading Tests, Statistical Analysis
Peer reviewed
Crehan, Kevin D. – Journal of Educational Measurement, 1974
Various item selection techniques are compared on criterion-referenced reliability and validity. Techniques compared include three nominal criterion-referenced methods, a traditional point biserial selection, teacher selection, and random selection. (Author)
Descriptors: Comparative Analysis, Criterion Referenced Tests, Item Analysis, Item Banks
Peer reviewed
Ebel, Robert L. – Journal of Educational Measurement, 1975
Descriptors: Comparative Analysis, Multiple Choice Tests, Objective Tests, Teachers
Peer reviewed
Huynh, Huynh; Saunders, Joseph C. – Journal of Educational Measurement, 1980
Single administration (beta-binomial) estimates for the raw agreement index p and the corrected-for-chance kappa index in mastery testing are compared with those based on two test administrations in terms of estimation bias and sampling variability. Bias is about 2.5 percent for p and 10 percent for kappa. (Author/RL)
Descriptors: Comparative Analysis, Error of Measurement, Mastery Tests, Mathematical Models
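For readers unfamiliar with the two agreement indices named in the Huynh and Saunders abstract above, the corrected-for-chance kappa relates to the raw agreement index p through the usual chance correction. A minimal illustration, where the chance-agreement value is an assumed input rather than the article's beta-binomial estimate:

```python
# Illustrative relation between raw agreement p and corrected-for-chance kappa.
# p_chance is supplied directly; the single-administration (beta-binomial)
# estimation discussed in the article is not reproduced here.

def kappa(p: float, p_chance: float) -> float:
    """Corrected-for-chance agreement: (p - p_c) / (1 - p_c)."""
    return (p - p_chance) / (1.0 - p_chance)

print(kappa(p=0.85, p_chance=0.60))  # 0.625
```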
Peer reviewed
Kolen, Michael J.; Whitney, Douglas R. – Journal of Educational Measurement, 1982
The adequacy of equipercentile, linear, one-parameter (Rasch), and three-parameter logistic item-response theory procedures for equating 12 forms of five tests of general educational development was compared. Results indicated that the adequacy of an equating method depends on a variety of factors such as test characteristics, equating design, and sample…
Descriptors: Achievement Tests, Comparative Analysis, Equated Scores, Equivalency Tests
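Of the four procedures compared in the Kolen and Whitney abstract above, the equipercentile method is the most direct to illustrate: a Form X score is mapped to the Form Y score holding the same percentile rank. A rough sketch on made-up score vectors, omitting the smoothing and continuization a real application would require:

```python
# Rough equipercentile mapping on made-up data: send a Form X score to the
# Form Y score with the same percentile rank. Illustration only.
import numpy as np

rng = np.random.default_rng(1)
form_x = rng.binomial(50, 0.55, size=1000)   # hypothetical Form X number-correct scores
form_y = rng.binomial(50, 0.60, size=1000)   # hypothetical (slightly easier) Form Y scores

def equipercentile(x: int) -> float:
    rank = np.mean(form_x <= x)               # percentile rank of x on Form X
    return float(np.quantile(form_y, rank))   # Form Y score at the same rank

print(equipercentile(28))
```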
Peer reviewed
Frary, Robert B. – Journal of Educational Measurement, 1985
Responses to a sample test were simulated for examinees under free-response and multiple-choice formats. Test score sets were correlated with randomly generated sets of unit-normal measures. The superiority of free-response tests was small enough that other considerations might justifiably dictate the choice of format. (Author/DWH)
Descriptors: Comparative Analysis, Computer Simulation, Essay Tests, Guessing (Tests)
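A toy simulation in the spirit of the Frary abstract above, with all design choices (item count, knowledge model, four options) assumed for illustration: free-response scores credit only known items, multiple-choice scores also credit lucky guesses, and both are correlated with the unit-normal criterion.

```python
# Toy simulation: compare how strongly free-response and multiple-choice
# number-correct scores correlate with unit-normal "true" measures.
# All parameters below are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n_examinees, n_items, n_options = 2000, 40, 4

theta = rng.standard_normal(n_examinees)                 # unit-normal criterion measures
p_know = 1.0 / (1.0 + np.exp(-theta[:, None]))           # assumed prob. of knowing each item

knows = rng.random((n_examinees, n_items)) < p_know
fr_score = knows.sum(axis=1)                             # free response: credit only if known
lucky = rng.random((n_examinees, n_items)) < 1.0 / n_options
mc_score = (knows | lucky).sum(axis=1)                   # multiple choice: known or guessed

print(np.corrcoef(theta, fr_score)[0, 1])
print(np.corrcoef(theta, mc_score)[0, 1])
```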
Peer reviewed
Frisbie, David A.; Sweeney, Daryl C. – Journal of Educational Measurement, 1982
A 100-item, five-choice multiple-choice (MC) biology final exam was converted to multiple true-false (MTF) form to yield two content-parallel test forms composed of the two item types. Students found the MTF items easier and preferred MTF over MC; the MTF subtests were more reliable. (Author/GK)
Descriptors: Biology, College Science, Comparative Analysis, Difficulty Level
Peer reviewed
Marsh, Herbert W. – Journal of Educational Measurement, 1993
Structural equation models of the same construct measured on different occasions are evaluated in two studies: evaluations of 157 college instructors over 8 years, and data on more than 2,200 high school students over 4 years from the Youth in Transition Study. Results challenge overreliance on simplex models. (SLD)
Descriptors: College Faculty, Comparative Analysis, High School Students, High Schools
Peer reviewed
Kansup, Wanlop; Hakstian, A. Ralph – Journal of Educational Measurement, 1975
The effects on reliability and validity of logically weighting incorrect item options in conventional tests, and of different scoring functions with confidence tests, were examined. Ninth graders took conventionally administered Verbal and Mathematical Reasoning tests, scored conventionally and by a procedure assigning degree-of-correctness weights to…
Descriptors: Comparative Analysis, Confidence Testing, Junior High School Students, Multiple Choice Tests
Peer reviewed
Hakstian, A. Ralph; Kansup, Wanlop – Journal of Educational Measurement, 1975
A comparison of reliability and validity was made for three testing procedures: 1) responding conventionally to Verbal Ability and Mathematical Reasoning tests; 2) using a confidence weighting response procedure with the same tests; and 3) using the elimination response method. The experimental testing procedures were not psychometrically superior…
Descriptors: Comparative Analysis, Confidence Testing, Guessing (Tests), Junior High School Students