ERIC - Search Results

Showing all 10 results

Comparison of Automated Scoring Methods for a Computerized Performance Assessment of Clinical Judgment

Peer reviewed

Harik, Polina; Baldwin, Peter; Clauser, Brian – Applied Psychological Measurement, 2013

Growing reliance on complex constructed response items has generated considerable interest in automated scoring solutions. Many of these solutions are described in the literature; however, relatively few studies have been published that "compare" automated scoring strategies. Here, comparisons are made among five strategies for…

Descriptors: Computer Assisted Testing, Automation, Scoring, Comparative Analysis

A Comparison of Four Methods of IRT Subscoring

Peer reviewed

de la Torre, Jimmy; Song, Hao; Hong, Yuan – Applied Psychological Measurement, 2011

Lack of sufficient reliability is the primary impediment for generating and reporting subtest scores. Several current methods of subscore estimation do so either by incorporating the correlational structure among the subtest abilities or by using the examinee's performance on the overall test. This article conducted a systematic comparison of four…

Descriptors: Item Response Theory, Scoring, Methods, Comparative Analysis

Comparing Methods for Item Analysis: The Impact of Different Item-Selection Statistics on Test Difficulty

Peer reviewed

Jones, Andrew T. – Applied Psychological Measurement, 2011

Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them. These include the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research as of yet has…

Descriptors: Test Items, Item Analysis, Cutting Scores, Statistics

Two Approaches for Using Multiple Anchors in NEAT Equating: A Description and Demonstration

Peer reviewed

Moses, Tim; Deng, Weiling; Zhang, Yu-Li – Applied Psychological Measurement, 2011

Nonequivalent groups with anchor test (NEAT) equating functions that use a single anchor can have accuracy problems when the groups are extremely different and/or when the anchor weakly correlates with the tests being equated. Proposals have been made to address these issues by incorporating more than one anchor into NEAT equating functions. These…

Descriptors: Equated Scores, Tests, Comparative Analysis, Correlation

Coefficient Alpha and Reliability of Scale Scores

Peer reviewed

Almehrizi, Rashid S. – Applied Psychological Measurement, 2013

The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…

Descriptors: Raw Scores, Scaling, Reliability, Computation

The Comparative Performance of Conditional Independence Indices

Peer reviewed

Kim, Doyoung; De Ayala, R. J.; Ferdous, Abdullah A.; Nering, Michael L. – Applied Psychological Measurement, 2011

To realize the benefits of item response theory (IRT), one must have model-data fit. One facet of a model-data fit investigation involves assessing the tenability of the conditional item independence (CII) assumption. In this Monte Carlo study, the comparative performance of 10 indices for identifying conditional item dependence is assessed. The…

Descriptors: Item Response Theory, Monte Carlo Methods, Error of Measurement, Statistical Analysis

An Equal-Level Approach to the Investigation of Multitrait-Multimethod Matrices.

Peer reviewed

Schweizer, Karl – Applied Psychological Measurement, 1991

An equal-level approach is proposed for investigating multitrait-multimethod (MTMM) matrices with respect to other organizational units that contain additional information concerning a MTMM matrix's validity. The approach requires equality in "data level" before coefficients are submitted for evaluation. Disaggregation is central to…

Descriptors: Comparative Analysis, Correlation, Equations (Mathematics), Mathematical Models

A Comparative Study of Test Data Dimensionality Assessment Procedures Under Nonparametric IRT Models

Peer reviewed

van Abswoude, Alexandra A. H.; van der Ark, L. Andries; Sijtsma, Klaas – Applied Psychological Measurement, 2004

In this article, an overview of nonparametric item response theory methods for determining the dimensionality of item response data is provided. Four methods were considered: MSP, DETECT, HCA/CCPROX, and DIMTEST. First, the methods were compared theoretically. Second, a simulation study was done to compare the effectiveness of MSP, DETECT, and…

Descriptors: Comparative Analysis, Computer Software, Simulation, Nonparametric Statistics

On the Robustness of a Class of Naive Estimators.

Peer reviewed

Wainer, Howard; Thissen, David – Applied Psychological Measurement, 1979

A class of naive estimators of correlation was tested for robustness, accuracy, and efficiency against Pearson's r, Tukey's r, and Spearman's r. It was found that this class of estimators seems to be superior, being less affected by outliers, reasonably efficient, and frequently more easily calculated. (Author/CTM)

Descriptors: Comparative Analysis, Correlation, Goodness of Fit, Nonparametric Statistics

Three Approaches to Determining the Dimensionality of Binary Items.

Peer reviewed

Roznowski, Mary; And Others – Applied Psychological Measurement, 1991

Three heuristic methods of assessing the dimensionality of binary item pools were evaluated in a Monte Carlo investigation. The indices were based on (1) the local independence of unidimensional tests; (2) patterns of second-factor loadings derived from simplex theory; and (3) the shape of the curve of successive eigenvalues. (SLD)

Descriptors: Comparative Analysis, Computer Simulation, Correlation, Evaluation Methods
