ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	2
Since 2016 (last 10 years)	11
Since 2006 (last 20 years)	26

Descriptor

Item Response Theory	27
Statistical Analysis	27
Test Format	27
Test Items	18
Comparative Analysis	12
Difficulty Level	10
Simulation	8
Equated Scores	7
Foreign Countries	7
Computer Assisted Testing	6
Mathematics Tests	6
Multiple Choice Tests	6
Correlation	5
Science Tests	5
Evaluation Methods	4
Item Analysis	4
Models	4
Sample Size	4
Scores	4
Test Construction	4
Test Reliability	4
College Entrance Examinations	3
Grade 8	3
Physics	3
Science Instruction	3
More ▼

Source

ETS Research Report Series	5
ProQuest LLC	3
Applied Measurement in…	2
Grantee Submission	2
International Journal of…	2
Journal of Psychoeducational…	2
Language Testing	2
ACT, Inc.	1
Applied Psychological…	1
Educational Sciences: Theory…	1
Educational and Psychological…	1
International Journal of…	1
Journal of Interactive Online…	1
Physical Review Special…	1
Research in Science &…	1
South African Journal of…	1
More ▼

Publication Type

Reports - Research	23
Journal Articles	21
Dissertations/Theses -…	3
Speeches/Meeting Papers	2
Collected Works - General	1
Numerical/Quantitative Data	1
Reports - Evaluative	1
Tests/Questionnaires	1

Education Level

Higher Education	7
Postsecondary Education	6
Secondary Education	6
Elementary Education	4
Grade 8	4
Elementary Secondary Education	3
Junior High Schools	3
Middle Schools	3
High Schools	2
Grade 12	1
Grade 4	1
Grade 7	1
Grade 9	1
More ▼

Audience

Location

Australia	2
Germany	2
Florida	1
Japan	1
South Africa	1
Turkey	1

Laws, Policies, & Programs

Assessments and Surveys

ACT Assessment	2
Advanced Placement…	1
College Level Examination…	1
Defining Issues Test	1
Law School Admission Test	1
National Assessment of…	1
SAT (College Admission Test)	1
Trends in International…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 27 results Save | Export

A Comparison of IRT Linking Approaches under the Nonequivalent Groups Anchor Test Design

Direct link

Jiajing Huang – ProQuest LLC, 2022

The nonequivalent-groups anchor-test (NEAT) data-collection design is commonly used in large-scale assessments. Under this design, different test groups take different test forms. Each test form has its own unique items and all test forms share a set of common items. If item response theory (IRT) models are applied to analyze the test data, the…

Descriptors: Item Response Theory, Test Format, Test Items, Test Construction

Subscore Equating and Profile Reporting

Peer reviewed

Direct link

Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020

The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…

Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level

An Investigation of Item Position Effects by Means of IRT-Based Differential Item Functioning Methods

Peer reviewed
PDF on ERIC

Download full text

Soysal, Sumeyra; Yilmaz Kogar, Esin – International Journal of Assessment Tools in Education, 2021

In this study, whether item position effects lead to DIF in the condition where different test booklets are used was investigated. To do this the methods of Lord's chi-square and Raju's unsigned area with the 3PL model under with and without item purification were used. When the performance of the methods was compared, it was revealed that…

Descriptors: Item Response Theory, Test Bias, Test Items, Comparative Analysis

Extension of Caution Indices to Mixed-Format Tests

Peer reviewed
PDF on ERIC

Download full text

Direct link

Sinharay, Sandip – Grantee Submission, 2018

Tatsuoka (1984) suggested several extended caution indices and their standardized versions that have been used as person-fit statistics by researchers such as Drasgow, Levine, and McLaughlin (1987), Glas and Meijer (2003), and Molenaar and Hoijtink (1990). However, these indices are only defined for tests with dichotomous items. This paper extends…

Descriptors: Test Format, Goodness of Fit, Item Response Theory, Error Patterns

On Using Simulations to Inform Decision Making during Instrument Development

Peer reviewed

Direct link

Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018

Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…

Descriptors: Simulation, Decision Making, Test Construction, Validity

Using Necessary Information to Identify Item Dependence in Passage-Based Reading Comprehension Tests

Peer reviewed

Direct link

Baldonado, Angela Argo; Svetina, Dubravka; Gorin, Joanna – Applied Measurement in Education, 2015

Applications of traditional unidimensional item response theory models to passage-based reading comprehension assessment data have been criticized based on potential violations of local independence. However, simple rules for determining dependency, such as including all items associated with a particular passage, may overestimate the dependency…

Descriptors: Reading Tests, Reading Comprehension, Test Items, Item Response Theory

Comparing Student Performance on Paper-and-Pencil and Computer-Based-Tests

Peer reviewed
PDF on ERIC

Download full text

Hardcastle, Joseph; Herrmann-Abell, Cari F.; DeBoer, George E. – Grantee Submission, 2017

Can student performance on computer-based tests (CBT) and paper-and-pencil tests (PPT) be considered equivalent measures of student knowledge? States and school districts are grappling with this question, and although studies addressing this question are growing, additional research is needed. We report on the performance of students who took…

Descriptors: Academic Achievement, Computer Assisted Testing, Comparative Analysis, Student Evaluation

Evidence for Paper and Online ACT® Comparability: Spring 2014 and 2015 Mode Comparability Studies. ACT Research Report Series 2017-1

Download full text

Li, Dongmei; Yi, Qing; Harris, Deborah – ACT, Inc., 2017

In preparation for online administration of the ACT® test, ACT conducted studies to examine the comparability of scores between online and paper administrations, including a timing study in fall 2013, a mode comparability study in spring 2014, and a second mode comparability study in spring 2015. This report presents major findings from these…

Descriptors: College Entrance Examinations, Computer Assisted Testing, Comparative Analysis, Test Format

Examining the Teachers' Sense of Efficacy Scale at the Item Level with Rasch Measurement Model

Peer reviewed

Direct link

Chang, Mei-Lin; Engelhard, George, Jr. – Journal of Psychoeducational Assessment, 2016

The purpose of this study is to examine the psychometric quality of the Teachers' Sense of Efficacy Scale (TSES) with data collected from 554 teachers in a U.S. Midwestern state. The many-facet Rasch model was used to examine several potential contextual influences (years of teaching experience, school context, and levels of emotional exhaustion)…

Descriptors: Models, Teacher Attitudes, Self Efficacy, Item Response Theory

The Need for Invariant Assessments in South African Education

Peer reviewed
PDF on ERIC

Download full text

Dampier, Graham A. – South African Journal of Education, 2014

Presently, a plethora of instruments designed to assess a mathematical skill, disposition, or competence prevail in South Africa. Yet few of them adhere to the basic requirements of the unidimensionality and invariance of measures. The Marko-D is a mathematical instrument designed to test learners between the ages of 4 and 8. The instrument, thus…

Descriptors: Foreign Countries, Student Evaluation, Mathematics Skills, Measurement Techniques

The Impact of Sub-Skills and Item Content on Students' Skills with Regard to the Control-of-Variables Strategy

Peer reviewed

Direct link

Schwichow, Martin; Christoph, Simon; Boone, William J.; Härtig, Hendrik – International Journal of Science Education, 2016

The so-called control-of-variables strategy (CVS) incorporates the important scientific reasoning skills of designing controlled experiments and interpreting experimental outcomes. As CVS is a prominent component of science standards appropriate assessment instruments are required to measure these scientific reasoning skills and to evaluate the…

Descriptors: Thinking Skills, Science Instruction, Science Experiments, Science Tests

Dividing the Force Concept Inventory into Two Equivalent Half-Length Tests

Peer reviewed

Direct link

Han, Jing; Bao, Lei; Chen, Li; Cai, Tianfang; Pi, Yuan; Zhou, Shaona; Tu, Yan; Koenig, Kathleen – Physical Review Special Topics - Physics Education Research, 2015

The Force Concept Inventory (FCI) is a 30-question multiple-choice assessment that has been a building block for much of the physics education research done today. In practice, there are often concerns regarding the length of the test and possible test-retest effects. Since many studies in the literature use the mean score of the FCI as the…

Descriptors: Physics, Multiple Choice Tests, Science Instruction, Scores

The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating

Peer reviewed
PDF on ERIC

Download full text

Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016

The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…

Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores

An Investigation of Sample Size Splitting on ATFIND and DIMTEST

Peer reviewed

Direct link

Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013

Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…

Descriptors: Sample Size, Test Length, Correlation, Test Format

An Item-Driven Adaptive Design for Calibrating Pretest Items. Research Report. ETS RR-14-38

Peer reviewed
PDF on ERIC

Download full text

Ali, Usama S.; Chang, Hua-Hua – ETS Research Report Series, 2014

Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may also offer similar advantages, and verification of such a hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…

Descriptors: Adaptive Testing, Simulation, Pretests Posttests, Test Items

Previous Page | Next Page »

Pages: 1 | 2

Ali, Usama S.	1
Baldonado, Angela Argo	1
Bao, Lei	1
Batty, Aaron Olaf	1
Boone, William J.	1
Borowski, Andreas	1
Boughton, Keith A.	1
Cai, Tianfang	1
Chang, Hua-Hua	1
Chang, Mei-Lin	1
Chang, Wanchen	1
Chen, Li	1
Christoph, Simon	1
Dampier, Graham A.	1
DeBoer, George E.	1
DeMars, Christine E.	1
Dodd, Barbara G.	1
Duong, Minh Quang	1
Engelhard, George, Jr.	1
Filipi, Anna	1
Fischer, Hans E.	1
Floyd, Harlee S.	1
Gess-Newsome, Julie	1
Gorin, Joanna	1
Han, Jing	1
More ▼