Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 1
Since 2006 (last 20 years): 5
Descriptor
Difficulty Level: 5
Test Items: 5
Testing Programs: 5
Item Response Theory: 3
Data Analysis: 2
Probability: 2
Reading Tests: 2
Statistical Analysis: 2
Testing: 2
Academic Ability: 1
Artificial Intelligence: 1
Source
Educational and Psychological Measurement: 2
Applied Measurement in Education: 1
Journal of Educational and Behavioral Statistics: 1
Language Testing: 1
Author
Filipi, Anna: 1
Huggins-Manley, Anne Corinne: 1
Leite, Walter: 1
Longford, Nicholas T.: 1
Meyers, Jason L.: 1
Miller, G. Edward: 1
Way, Walter D.: 1
Wyse, Adam E.: 1
Xue, Kang: 1
Publication Type
Journal Articles: 5
Reports - Research: 3
Reports - Evaluative: 2
Education Level
Grade 11: 1
Grade 5: 1
Higher Education: 1
Postsecondary Education: 1
Secondary Education: 1
Xue, Kang; Huggins-Manley, Anne Corinne; Leite, Walter – Educational and Psychological Measurement, 2022
In data collected from virtual learning environments (VLEs), item response theory (IRT) models can be used to guide the ongoing measurement of student ability. However, such applications of IRT rely on unbiased item parameter estimates associated with test items in the VLE. Without formal piloting of the items, one can expect a large amount of…
Descriptors: Virtual Classrooms, Artificial Intelligence, Item Response Theory, Item Analysis
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2015
An equating procedure for a testing program with evolving distribution of examinee profiles is developed. No anchor is available because the original scoring scheme was based on expert judgment of the item difficulties. Pairs of examinees from two administrations are formed by matching on coarsened propensity scores derived from a set of…
Descriptors: Equated Scores, Testing Programs, College Entrance Examinations, Scoring
Wyse, Adam E. – Educational and Psychological Measurement, 2011
Standard setting is a method used to set cut scores on large-scale assessments. One of the most popular standard setting methods is the Bookmark method. In the Bookmark method, panelists are asked to envision a response probability (RP) criterion and move through a booklet of ordered items based on a RP criterion. This study investigates whether…
Descriptors: Testing Programs, Standard Setting (Scoring), Cutting Scores, Probability
Filipi, Anna – Language Testing, 2012
The Assessment of Language Competence (ALC) certificate program is an annual, international testing program developed by the Australian Council for Educational Research to test the listening and reading comprehension skills of lower to middle year levels of secondary school. The tests are developed for three levels in French, German, Italian and…
Descriptors: Listening Comprehension Tests, Item Response Theory, Statistical Analysis, Foreign Countries
Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…
Descriptors: Test Items, Test Content, Testing Programs, Simulation