Showing 1 to 15 of 23 results
Peer reviewed
Direct link
Facon, Bruno; Magis, David – Journal of Speech, Language, and Hearing Research, 2016
Purpose: An item analysis of Bishop's (1983) Test for Reception of Grammar (TROG) in its French version (F-TROG; Lecocq, 1996) was conducted to determine whether the difficulty of items is similar for participants with or without intellectual disability (ID). Method: In Study 1, responses to the 92 F-TROG items by 55 participants with Down…
Descriptors: Item Analysis, Grammar, Children, Adolescents
Peer reviewed
Direct link
Cheng, Ying; Chen, Peihua; Qian, Jiahe; Chang, Hua-Hua – Applied Psychological Measurement, 2013
Differential item functioning (DIF) analysis is an important step in the data analysis of large-scale testing programs. Many such programs now adopt matrix sampling designs, such as the balanced incomplete block (BIB) design, to reduce the load on examinees. These designs pose challenges to the traditional DIF analysis methods. For example,…
Descriptors: Test Bias, Equated Scores, Test Items, Effect Size
Peer reviewed
PDF on ERIC
Chubbuck, Kay; Curley, W. Edward; King, Teresa C. – ETS Research Report Series, 2016
This study gathered quantitative and qualitative evidence concerning gender differences in performance by using critical reading material on the "SAT"® test with sports and science content. The fundamental research questions guiding the study were: If sports and science are to be included in a skills test, what kinds of material are…
Descriptors: College Entrance Examinations, Gender Differences, Critical Reading, Reading Tests
Peer reviewed
PDF on ERIC
Zwick, Rebecca; Ye, Lei; Isham, Steven – ETS Research Report Series, 2013
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. Although it is often assumed that refinement of the matching criterion always provides more accurate DIF results, the actual situation proves to be more complex. To explore the effectiveness of refinement, we…
Descriptors: Test Bias, Statistical Analysis, Simulation, Educational Testing
Peer reviewed
Direct link
Hopfenbeck, Therese N.; Lenkeit, Jenny; El Masri, Yasmine; Cantrell, Kate; Ryan, Jeanne; Baird, Jo-Anne – Scandinavian Journal of Educational Research, 2018
International large-scale assessments are on the rise, with the Programme for International Student Assessment (PISA) seen by many as having strategic prominence in education policy debates. The present article reviews PISA-related English-language peer-reviewed articles from the programme's first cycle in 2000 to its most current in 2015. Five…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Peer reviewed
Direct link
Federer, Meghan Rector; Nehm, Ross H.; Pearl, Dennis K. – CBE - Life Sciences Education, 2016
Understanding sources of performance bias in science assessment provides important insights into whether science curricula and/or assessments are valid representations of student abilities. Research investigating assessment bias due to factors such as instrument structure, participant characteristics, and item types is well documented across a…
Descriptors: Gender Differences, Biology, Science Instruction, Case Studies
Peer reviewed
Direct link
Batty, Aaron Olaf – Language Testing, 2015
The rise in the affordability of quality video production equipment has resulted in increased interest in video-mediated tests of foreign language listening comprehension. Although research on such tests has continued fairly steadily since the early 1980s, studies have relied on analyses of raw scores, despite the growing prevalence of item…
Descriptors: Listening Comprehension Tests, Comparative Analysis, Video Technology, Audio Equipment
Peer reviewed
Direct link
Stols, Gerrit; Long, Caroline; Dunne, Tim – African Journal of Research in Mathematics, Science and Technology Education, 2015
The purpose of this study is to apply the Rasch model to investigate both the Van Hiele theory for geometric development and an associated test. In terms of the test, the objective is to investigate the functioning of a classic 25-item instrument designed to identify levels of geometric proficiency. The dataset of responses by 244 students (106…
Descriptors: Item Response Theory, Geometry, Geometric Concepts, Mathematical Concepts
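The Rasch model applied in the study above relates the probability of a correct response to the difference between a person's ability theta and an item's difficulty b. A minimal sketch of the model's response function (function and variable names are illustrative, not from the study):

```python
import math

def rasch_prob(theta, b):
    """Rasch model: probability that a person with ability theta answers
    an item of difficulty b correctly,
    P = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A person whose ability equals the item's difficulty has a 50% chance:
rasch_prob(0.5, 0.5)  # → 0.5
```

Fitting the model to a 25-item instrument then amounts to estimating one difficulty per item and one ability per examinee from the response matrix.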
Peer reviewed
Direct link
Wainer, Howard; Bradlow, Eric; Wang, Xiaohui – Journal of Educational and Behavioral Statistics, 2010
Confucius pointed out that the first step toward wisdom is calling things by the right name. The term "Differential Item Functioning" (DIF) did not arise fully formed from the miasma of psychometrics; it evolved from a variety of less accurate terms. Among its forebears was "item bias," but that term has a pejorative connotation…
Descriptors: Test Bias, Difficulty Level, Test Items, Statistical Analysis
Shea, Christine A. – ProQuest LLC, 2013
The purpose of this study was to determine whether an eighth grade state-level math assessment contained items that function differentially (DIF) for English Learner students (EL) as compared to English Only students (EO) and if so, what factors might have caused DIF. To determine this, Differential Item Functioning (DIF) analysis was employed.…
Descriptors: Item Response Theory, English Language Learners, Grade 8, Mathematics Tests
Peer reviewed
PDF on ERIC
Qian, Xiaoyu; Nandakumar, Ratna; Glutting, Joseph; Ford, Danielle; Fifield, Steve – ETS Research Report Series, 2017
In this study, we investigated gender and minority achievement gaps on 8th-grade science items employing a multilevel item response methodology. Both gaps were wider on physics and earth science items than on biology and chemistry items. Larger gender gaps were found on items with specific topics favoring male students than on other items, for…
Descriptors: Item Analysis, Gender Differences, Achievement Gap, Grade 8
Peer reviewed
Direct link
Santelices, Maria Veronica; Wilson, Mark – Educational and Psychological Measurement, 2012
The relationship between differential item functioning (DIF) and item difficulty on the SAT is such that more difficult items tend to exhibit DIF in favor of the focal group (usually minority groups). These results were reported by Kulick and Hu and by Freedle, and they have been enthusiastically discussed in more recent literature. Examining the…
Descriptors: Test Bias, Test Items, Difficulty Level, Statistical Analysis
Peer reviewed
Direct link
Miller, Tess; Chahine, Saad; Childs, Ruth A. – Practical Assessment, Research & Evaluation, 2010
This study illustrates the use of differential item functioning (DIF) and differential step functioning (DSF) analyses to detect differences in item difficulty that are related to experiences of examinees, such as their teachers' instructional practices, that are relevant to the knowledge, skill, or ability the test is intended to measure. This…
Descriptors: Test Bias, Difficulty Level, Test Items, Mathematics Tests
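One standard statistic for the kind of DIF screening described above (not necessarily the one used in this particular study) is the Mantel-Haenszel common odds ratio, computed over 2x2 tables stratified by the matching score. A minimal sketch, with illustrative names:

```python
def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio across 2x2 tables, one per
    matching stratum (e.g., total-score level).  Each table is
    [[ref_correct, ref_incorrect], [focal_correct, focal_incorrect]].
    A ratio near 1 indicates no DIF; values far from 1 flag the item."""
    num = den = 0.0
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        num += a * d / n  # reference-correct, focal-incorrect
        den += b * c / n  # reference-incorrect, focal-correct
    return num / den
```

Differential step functioning (DSF) extends the same idea to the individual score steps of polytomous items.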
Peer reviewed
Direct link
Stubbe, Tobias C. – Educational Research and Evaluation, 2011
The challenge inherent in cross-national research of providing instruments in different languages measuring the same construct is well known. But even instruments in a single language may be biased towards certain countries or regions due to local linguistic specificities. Consequently, it may be appropriate to use different versions of an…
Descriptors: Test Items, International Studies, Foreign Countries, German
Breland, Hunter M. – 1974
Examples of cross-cultural stability or instability of mental test items are illustrated. A statistical procedure involving the cross-plotting of item difficulties for two different groups and generating a line of mutual regression through the resulting scatter of points is described. D-values, representing the perpendicular distance, in delta…
Descriptors: Cross Cultural Studies, Difficulty Level, Item Analysis, Statistical Analysis
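The cross-plotting procedure Breland describes is essentially the delta-plot method: item difficulties for the two groups are placed on the ETS delta scale, a mutual (major-axis) regression line is fitted through the scatter, and each item's perpendicular distance from the line is its D-value. A minimal sketch, with illustrative names and using the conventional delta scale (mean 13, SD 4):

```python
import numpy as np
from statistics import NormalDist

def delta_plot(p_ref, p_focal):
    """Delta plot: map each item's proportion correct in two groups onto
    the delta scale, fit the major-axis ("mutual") regression line through
    the scatter, and return each item's signed perpendicular distance from
    that line (the D-value).  Items with large |D| behave differently
    across the two groups."""
    z = NormalDist().inv_cdf
    d1 = np.array([13.0 - 4.0 * z(p) for p in p_ref])
    d2 = np.array([13.0 - 4.0 * z(p) for p in p_focal])
    # Major-axis regression treats both variables as error-prone,
    # unlike ordinary least squares.
    s1, s2 = d1.std(ddof=1), d2.std(ddof=1)
    cov = np.corrcoef(d1, d2)[0, 1] * s1 * s2
    slope = ((s2**2 - s1**2 + np.sqrt((s2**2 - s1**2)**2 + 4 * cov**2))
             / (2 * cov))
    intercept = d2.mean() - slope * d1.mean()
    # Signed perpendicular distance of each (d1, d2) point from the line
    return (slope * d1 - d2 + intercept) / np.sqrt(slope**2 + 1)
```

If both groups rank the items identically, the points fall on the line and every D-value is near zero; an item that is disproportionately hard for one group falls off the line.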