ERIC - Search Results

Showing all 6 results

A Comparison of Score Aggregation Methods for Unidimensional Tests on Different Dimensions. Research Report. ETS RR-18-01

Peer reviewed

Fu, Jianbin; Feng, Yuling – ETS Research Report Series, 2018

In this study, we propose aggregating test scores with unidimensional within-test structure and multidimensional across-test structure based on a 2-level, 1-factor model. In particular, we compare 6 score aggregation methods: average of standardized test raw scores (M1), regression factor score estimate of the 1-factor model based on the…

Descriptors: Comparative Analysis, Scores, Correlation, Standardized Tests

A Comparison of Item Fit Statistics for Mixed IRT Models

Peer reviewed

Chon, Kyong Hee; Lee, Won-Chan; Dunbar, Stephen B. – Journal of Educational Measurement, 2010

In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G², Orlando and Thissen's S-X² and S-G², and Stone's χ²* and G²*. To investigate the…

Descriptors: Test Length, Goodness of Fit, Item Response Theory, Simulation

An Investigation of the Performance of the Generalized S-X² Item-Fit Index for Polytomous IRT Models. ACT Research Report Series, 2007-1

Kang, Taehoon; Chen, Troy T. – ACT, Inc., 2007

Orlando and Thissen (2000, 2003) proposed an item-fit index, S-X², for dichotomous item response theory (IRT) models, which has performed better than traditional item-fit statistics such as Yen's (1981) Q₁ and McKinley and Mills's (1985) G². This study extends the utility of S-X² to polytomous…

Descriptors: Item Response Theory, Models, Computer Software, Statistical Analysis

Comparison of Multistage Tests with Computerized Adaptive and Paper-and-Pencil Tests. Research Report. ETS RR-07-04

Peer reviewed

Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007

Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…

Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models

The Factor Structure of the Bem Sex-Role Inventory (BSRI): A Confirmatory Analysis.

Campbell, Todd; And Others – 1995

In the early 1970s A. Constantinople wrote a seminal article that led to the development of the construct of psychological androgyny. The Bem Sex-Role Inventory is a popular measure of the construct, but the measure remains controversial. The construct validity of scores from the measure was explored using confirmatory factor analysis on data from…

Descriptors: Androgyny, College Students, Construct Validity, Factor Structure

Multiple Choice and True/False Tests: Reliability Measures and Some Implications of Negative Marking

Peer reviewed

Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004

The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…

Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items