ERIC - Search Results

Showing all 13 results

An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models

Peer reviewed

Sen, Sedat; Cohen, Allan S. – Educational and Psychological Measurement, 2024

A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…

Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification

Closed Formula of Test Length Required for Adaptive Testing with Medium Probability of Solution

Peer reviewed

Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023

Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…

Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level

A Comparison of Score Aggregation Methods for Unidimensional Tests on Different Dimensions. Research Report. ETS RR-18-01

Peer reviewed

Fu, Jianbin; Feng, Yuling – ETS Research Report Series, 2018

In this study, we propose aggregating test scores with unidimensional within-test structure and multidimensional across-test structure based on a 2-level, 1-factor model. In particular, we compare 6 score aggregation methods: average of standardized test raw scores (M1), regression factor score estimate of the 1-factor model based on the…

Descriptors: Comparative Analysis, Scores, Correlation, Standardized Tests

Assessing the Performance of Classical Test Theory Item Discrimination Estimators in Monte Carlo Simulations

Peer reviewed

Bazaldua, Diego A. Luna; Lee, Young-Sun; Keller, Bryan; Fellers, Lauren – Asia Pacific Education Review, 2017

The performance of various classical test theory (CTT) item discrimination estimators has been compared in the literature using both empirical and simulated data, resulting in mixed results regarding the preference of some discrimination estimators over others. This study analyzes the performance of various item discrimination estimators in CTT:…

Descriptors: Test Items, Monte Carlo Methods, Item Response Theory, Correlation

Equating Multidimensional Tests under a Random Groups Design: A Comparison of Various Equating Procedures

Lee, Eunjung – ProQuest LLC, 2013

The purpose of this research was to compare the equating performance of various equating procedures for multidimensional tests. To examine the procedures, simulated data sets were generated based on a multidimensional item response theory (MIRT) framework. Various equating procedures were examined, including…

Descriptors: Equated Scores, Tests, Comparative Analysis, Item Response Theory

Indexing Creativity Fostering Teacher Behaviour: Replication and Modification

Dikici, Ayhan; Soh, Kaycheng – Online Submission, 2015

Many measurement tools on creativity are available in the literature. One of these scales is the Creativity Fostering Teacher Behaviour Index (CFTIndex), originally developed for Singaporean teachers. It was then translated into Turkish and trialled on teachers in Nigde province with acceptable reliability and factorial validity. The main purpose of…

Descriptors: Creativity, Teacher Behavior, Comparative Analysis, Turkish

A Comparison of Four Methods of IRT Subscoring

Peer reviewed

de la Torre, Jimmy; Song, Hao; Hong, Yuan – Applied Psychological Measurement, 2011

Lack of sufficient reliability is the primary impediment for generating and reporting subtest scores. Several current methods of subscore estimation do so either by incorporating the correlational structure among the subtest abilities or by using the examinee's performance on the overall test. This article conducted a systematic comparison of four…

Descriptors: Item Response Theory, Scoring, Methods, Comparative Analysis

A Comparison of Three Content Balancing Methods for Fixed and Variable Length Computerized Adaptive Tests

Shin, Chingwei David; Chien, Yuehmei; Way, Walter Denny – Pearson, 2012

Content balancing is one of the most important components of computerized adaptive testing (CAT), especially in K-12 large-scale tests, where a complex constraint structure is required to cover a broad spectrum of content. The purpose of this study is to compare the weighted penalty model (WPM) and the weighted deviation method (WDM) under…

Descriptors: Computer Assisted Testing, Elementary Secondary Education, Test Content, Models

The Impact of Anchor Test Length on Equating Results in a Nonequivalent Groups Design. Research Report. ETS RR-07-44

Peer reviewed

Ricker, Kathryn L.; von Davier, Alina A. – ETS Research Report Series, 2007

This study explored the effects of external anchor test length on final equating results of several equating methods, including equipercentile (frequency estimation), chained equipercentile, kernel equating (KE) poststratification PSE with optimal bandwidths, and KE PSE linear (large bandwidths) when using the nonequivalent groups anchor test…

Descriptors: Equated Scores, Test Items, Statistical Analysis, Test Length

A Comparison of Two Screening Tests (the Matrix Analogies Test--Short Form and the Kaufman Brief Intelligence Test) with the WISC-III.

Peer reviewed

Prewett, Peter N. – Psychological Assessment, 1995

The concurrent validity of 2 brief intelligence tests, the Matrix Analogies Test-Short Form (MAT) and the Kaufman Brief Intelligence Test (K-BIT), with the Wechsler Intelligence Scale for Children-Third Edition (WISC-III) was examined using a sample of 50 urban students. The MAT and K-BIT appeared equally useful as screening tests. (SLD)

Descriptors: Children, Comparative Analysis, Concurrent Validity, Correlation

A Comparison of Reliability Estimates from Single and Double Administrations of Criterion-Referenced Tests.

Schaefer, Mary M.; Gross, Susan K. – 1983

Viewing the reliability for criterion-referenced tests as that of mastery classification decisions, three models for determining reliability were examined using two test administrations so that two estimates could be compared to a standard. A major purpose of the research was to determine how several reliability coefficients (coefficient kappa, an…

Descriptors: Comparative Analysis, Correlation, Criterion Referenced Tests, Cutting Scores

An Adaptive Testing Strategy for Achievement Test Batteries. Research Report 77-6.

Brown, Joel M.; Weiss, David J. – 1977

An adaptive testing strategy is described for achievement tests covering multiple content areas. The strategy combines adaptive item selection both within and between the subtests in the multiple-subtest battery. A real-data simulation was conducted to compare the results from adaptive testing and from conventional testing, in terms of test…

Descriptors: Achievement Tests, Adaptive Testing, Branching, Comparative Analysis

Listening, a Single Trait in First and Second Language Learning.

de Jong, John H. A. L. – Toegepaste taalwetenschap in artikelen 20, 1984

A study investigated the validity of an English listening skills test by comparing the results of native speakers of American and British English with those of Dutch students of English as a second language. A hypothesis suggested that two-thirds of the items would test listening skills and the remaining third would test other knowledge. Test results…

Descriptors: Age Differences, Comparative Analysis, Correlation, Educational Background
