Showing 1 to 15 of 20 results
Uk Hyun Cho – ProQuest LLC, 2024
The present study investigates the influence of multidimensionality on linking and equating under a unidimensional IRT model. Two hypothetical multidimensional scenarios are explored under a nonequivalent group common-item equating design. The first scenario examines test forms designed to measure multiple constructs, while the second scenario examines a…
Descriptors: Item Response Theory, Classification, Correlation, Test Format
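For context on the linking step this abstract refers to: under a NEAT (nonequivalent groups, anchor test) design, parameters from the new form are rescaled to the old form's metric. Below is a minimal mean/sigma linking sketch in Python; the anchor-item difficulties and function names are invented for illustration, and this is not the study's own procedure.

    # Mean/sigma linking of IRT parameters under a NEAT design.
    # Illustrative only: the difficulty values below are made up.
    import numpy as np

    # Difficulty estimates for the common (anchor) items, calibrated
    # separately on the new form X and the old form Y.
    b_x = np.array([-1.2, -0.4, 0.1, 0.8, 1.5])
    b_y = np.array([-1.0, -0.3, 0.3, 0.9, 1.7])

    # Transformation constants for theta* = A * theta + B.
    A = b_y.std(ddof=1) / b_x.std(ddof=1)
    B = b_y.mean() - A * b_x.mean()

    def to_old_scale(theta, a=None, b=None):
        """Rescale abilities (and optionally item parameters) from
        the form-X metric to the form-Y metric."""
        out = {"theta": A * theta + B}
        if b is not None:
            out["b"] = A * b + B   # difficulties shift and stretch
        if a is not None:
            out["a"] = a / A       # discriminations scale by 1/A
        return out

    print(A, B)

The rescaling assumes a single latent trait; when the forms are actually multidimensional, as in the scenarios the study explores, that assumption is violated and the linking constants can be distorted.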
Peer reviewed
Selcuk Acar; Yuyang Shen – Journal of Creative Behavior, 2025
Creativity tests, like creativity itself, vary widely in their structure and use. These differences include instructions, test duration, environments, prompt and response modalities, and the structure of test items. A key factor is task structure, referring to the specificity of the number of responses requested for a given prompt. Classic…
Descriptors: Creativity, Creative Thinking, Creativity Tests, Task Analysis
Castle, Courtney – ProQuest LLC, 2018
The Next Generation Science Standards propose a multidimensional model of science learning, composed of Disciplinary Core Ideas, Science and Engineering Practices, and Crosscutting Concepts (NGSS Lead States, 2013). Accordingly, there is a need for student assessment aligned with the new standards. Creating assessments that validly and reliably…
Descriptors: Science Education, Student Evaluation, Science Tests, Test Construction
Li, Dongmei; Yi, Qing; Harris, Deborah – ACT, Inc., 2017
In preparation for online administration of the ACT® test, ACT conducted studies to examine the comparability of scores between online and paper administrations, including a timing study in fall 2013, a mode comparability study in spring 2014, and a second mode comparability study in spring 2015. This report presents major findings from these…
Descriptors: College Entrance Examinations, Computer Assisted Testing, Comparative Analysis, Test Format
Peer reviewed
Culligan, Brent – Language Testing, 2015
This study compared three common vocabulary test formats, the Yes/No test, the Vocabulary Knowledge Scale (VKS), and the Vocabulary Levels Test (VLT), as measures of vocabulary difficulty. Vocabulary difficulty was defined as the item difficulty estimated through Item Response Theory (IRT) analysis. Three tests were given to 165 Japanese students,…
Descriptors: Language Tests, Test Format, Comparative Analysis, Vocabulary
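As background for "item difficulty estimated through Item Response Theory": under the Rasch (1PL) model, an item's difficulty is the ability level at which a correct response has probability 0.5, which is what allows items from different formats to be placed on one scale. A toy sketch, with invented values:

    # Rasch (1PL) item response function. An item's difficulty b is
    # the ability theta at which P(correct) = 0.5, so difficulties
    # from different test formats share a single latent scale.
    import numpy as np

    def p_correct(theta, b):
        """Probability of a correct response under the Rasch model."""
        return 1.0 / (1.0 + np.exp(-(theta - b)))

    thetas = np.linspace(-3, 3, 7)
    for b in (-1.0, 0.0, 1.5):   # an easy, a medium, and a hard item
        print(f"b={b:+.1f}:", np.round(p_correct(thetas, b), 2))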
Peer reviewed
Han, Jing; Bao, Lei; Chen, Li; Cai, Tianfang; Pi, Yuan; Zhou, Shaona; Tu, Yan; Koenig, Kathleen – Physical Review Special Topics - Physics Education Research, 2015
The Force Concept Inventory (FCI) is a 30-question multiple-choice assessment that has been a building block for much of the physics education research done today. In practice, there are often concerns regarding the length of the test and possible test-retest effects. Since many studies in the literature use the mean score of the FCI as the…
Descriptors: Physics, Multiple Choice Tests, Science Instruction, Scores
Peer reviewed
Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Descriptors: Sample Size, Test Length, Correlation, Test Format
Peer reviewed
Kirschner, Sophie; Borowski, Andreas; Fischer, Hans E.; Gess-Newsome, Julie; von Aufschnaiter, Claudia – International Journal of Science Education, 2016
Teachers' professional knowledge is assumed to be a key variable for effective teaching. Since teacher education aims to enhance the professional knowledge of current and future teachers, this knowledge should be described and assessed. Nevertheless, only a limited number of studies quantitatively measure physics teachers' professional…
Descriptors: Evaluation Methods, Tests, Test Format, Science Instruction
Andrews, Benjamin James – ProQuest LLC, 2011
The equity properties can be used to assess the quality of an equating. The degree to which expected scores conditional on ability are similar between test forms is referred to as first-order equity. Second-order equity is the degree to which conditional standard errors of measurement are similar between test forms after equating. The purpose of…
Descriptors: Test Format, Advanced Placement, Simulation, True Scores
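In the standard notation (our paraphrase, with eq_Y denoting the function that equates form X scores to the form Y scale), the two properties the abstract defines can be written as

    \[ \text{First-order equity:}\quad \mathbb{E}\big[\mathrm{eq}_Y(X)\mid\theta\big] = \mathbb{E}\big[Y\mid\theta\big] \quad\text{for all }\theta \]
    \[ \text{Second-order equity:}\quad \mathrm{SEM}\big[\mathrm{eq}_Y(X)\mid\theta\big] = \mathrm{SEM}\big[Y\mid\theta\big] \quad\text{for all }\theta \]

so first-order equity preserves conditional expected scores and second-order equity preserves conditional standard errors of measurement.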
Mbella, Kinge Keka – ProQuest LLC, 2012
Mixed-format assessments are increasingly being used in large scale standardized assessments to measure a continuum of skills ranging from basic recall to higher order thinking skills. These assessments are usually comprised of a combination of (a) multiple-choice items which can be efficiently scored, have stable psychometric properties, and…
Descriptors: Educational Assessment, Test Format, Evaluation Methods, Multiple Choice Tests
Peer reviewed
Pae, Tae-Il – Language Testing, 2012
This study tracked gender differential item functioning (DIF) on the English subtest of the Korean College Scholastic Aptitude Test (KCSAT) over a nine-year period across three data points, using both the Mantel-Haenszel (MH) and item response theory likelihood ratio (IRT-LR) procedures. Further, the study identified two factors (i.e. reading…
Descriptors: Aptitude Tests, Academic Aptitude, Language Tests, Test Items
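The Mantel-Haenszel procedure named here compares reference- and focal-group odds of success on an item within matched total-score strata. A minimal sketch with invented counts (the 2x2-per-stratum layout and the ETS delta scale are standard; none of these numbers come from the study):

    # Mantel-Haenszel DIF check for a single item.
    import numpy as np

    # Per score stratum k: [A_k, B_k, C_k, D_k] =
    # [ref correct, ref incorrect, focal correct, focal incorrect].
    strata = np.array([
        [30, 20, 22, 28],
        [45, 15, 38, 22],
        [60, 10, 55, 15],
    ], dtype=float)

    A, B, C, D = strata.T
    N = strata.sum(axis=1)   # total examinees per stratum

    # Common odds ratio; values far from 1 suggest DIF.
    alpha_mh = np.sum(A * D / N) / np.sum(B * C / N)

    # ETS delta scale; |delta| >= 1.5 is commonly flagged as large DIF.
    delta_mh = -2.35 * np.log(alpha_mh)
    print(round(alpha_mh, 3), round(delta_mh, 3))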
Peer reviewed
van der Linden, Wim J. – Measurement: Interdisciplinary Research and Perspectives, 2010
The traditional way of equating the scores on a new test form X to those on an old form Y is equipercentile equating for a population of examinees. Because the population is likely to change between the two administrations, a popular approach is to equate for a "synthetic population." The authors of the articles in this issue of the…
Descriptors: Test Format, Equated Scores, Population Distribution, Population Trends
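For reference, the traditional function the abstract mentions maps each form X score to the form Y score with the same percentile rank, and the synthetic population is a weighted mixture of the two administration populations (textbook notation, not quoted from the article):

    \[ \mathrm{eq}_Y(x) = F_Y^{-1}\big(F_X(x)\big) \]
    \[ F_s(x) = w_1 F_1(x) + w_2 F_2(x), \qquad w_1 + w_2 = 1 \]

where F_X and F_Y are the cumulative score distributions of the two forms in the synthetic population s, and F_1, F_2 are the distributions in the two administration populations.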
Peer reviewed
Kim, Seonghoon; Kolen, Michael J. – Applied Measurement in Education, 2006
Four item response theory linking methods (two moment methods and two characteristic curve methods) were compared to concurrent (CO) calibration, with a focus on the degree of robustness to format effects (FEs) when applying the methods to multidimensional data that reflected the FEs associated with mixed-format tests. Based on the quantification of…
Descriptors: Item Response Theory, Robustness (Statistics), Test Format, Comparative Analysis
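Characteristic curve methods of the kind compared here choose linking constants A and B by minimizing the distance between the common items' test characteristic curves; the Stocking-Lord criterion is the usual example (the truncated abstract does not say which two were used, so take this as the generic form rather than the study's):

    \[ F(A,B) = \sum_{i}\Big[\sum_{j\in\mathcal{C}} P_j\big(\theta_i;\, a_{Yj}, b_{Yj}\big) - \sum_{j\in\mathcal{C}} P_j\big(\theta_i;\, a_{Xj}/A,\; A\,b_{Xj}+B\big)\Big]^2 \]

Here C is the common-item set and theta_i are quadrature points; moment methods instead match the means (and standard deviations) of the common-item parameter estimates directly.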
Peer reviewed
Ferrando, Pere J. – Structural Equation Modeling, 2000
Discusses a procedure for testing the equivalence among different item response formats used in personality and attitude measurement. The procedure is based on the assumption that latent response variables underlie the observed item responses. It uses a nested series of confirmatory factor analysis models based on K. Jöreskog's (1971) method for…
Descriptors: Attitude Measures, Correlation, Item Response Theory, Personality Assessment
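A compact statement of the latent-response setup behind such procedures (our paraphrase of the Jöreskog-style model, with tau denoting category thresholds; not quoted from the article):

    \[ x^{*}_{ij} = \lambda_j\,\xi_i + \delta_{ij}, \qquad x_{ij} = c \iff \tau_{j,c-1} < x^{*}_{ij} \le \tau_{j,c} \]

Format equivalence is then examined through a nested sequence of CFA models that progressively constrain the loadings and thresholds to equality across formats, comparing model fit at each step.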
Wang, Yu-Chung Lawrence – 1994
The first purpose of this study was to investigate the stability of two essential dimensionality measures across 10 random samples within a particular assessment item (AT1) selection. Other purposes were to investigate the discrepancy of the essential unidimensionality estimates for a test across different AT1 selections and sample sizes and to…
Descriptors: Correlation, Educational Assessment, Estimation (Mathematics), Item Response Theory