Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 12 |
| Since 2017 (last 10 years) | 35 |
| Since 2007 (last 20 years) | 48 |
Descriptor
| Test Items | 195 |
| Testing Problems | 195 |
| Test Construction | 75 |
| Item Analysis | 54 |
| Higher Education | 40 |
| Difficulty Level | 39 |
| Test Bias | 38 |
| Foreign Countries | 35 |
| Multiple Choice Tests | 34 |
| Test Validity | 33 |
| Latent Trait Theory | 32 |
Author
| Plake, Barbara S. | 4 |
| Wilcox, Rand R. | 4 |
| Hambleton, Ronald K. | 3 |
| Lord, Frederic M. | 3 |
| Smith, Richard M. | 3 |
| Wainer, Howard | 3 |
| Childs, Ruth A. | 2 |
| Debeer, Dries | 2 |
| Frary, Robert B. | 2 |
| Gilmer, Jerry S. | 2 |
| Jaeger, Richard M. | 2 |
Audience
| Researchers | 31 |
| Practitioners | 5 |
| Teachers | 3 |
| Counselors | 2 |
Location
| Canada | 4 |
| Germany | 2 |
| South Africa | 2 |
| Sweden | 2 |
| United Kingdom | 2 |
| United Kingdom (Great Britain) | 2 |
| United States | 2 |
| Arizona | 1 |
| Brazil | 1 |
| Burma | 1 |
| China | 1 |
Laws, Policies, & Programs
| Elementary and Secondary… | 1 |
| Individuals with Disabilities… | 1 |
| No Child Left Behind Act 2001 | 1 |
| Perkins Loan Program | 1 |
James D. Weese; Ronna C. Turner; Allison Ames; Xinya Liang; Brandon Crawford – Journal of Experimental Education, 2024
In this study a standardized effect size was created for use with the SIBTEST procedure. Using this standardized effect size, a single set of heuristics was developed that is appropriate for data fitting different item response models (e.g., 2-parameter logistic, 3-parameter logistic). The standardized effect size rescales the raw beta-uni value…
Descriptors: Test Bias, Test Items, Item Response Theory, Effect Size
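The heuristics above are calibrated against standard item response models. As a hypothetical illustration (not the authors' SIBTEST implementation, and with made-up parameter values), the 2-parameter and 3-parameter logistic response functions named in the abstract can be sketched as:

```python
import math

def irt_prob(theta, a, b, c=0.0):
    # 3-parameter logistic (3PL) probability of a correct response:
    #   P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    # where a is discrimination, b is difficulty, and c is the
    # pseudo-guessing parameter. With c = 0 this reduces to the 2PL model.
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# 2PL: when ability theta equals difficulty b, the probability is exactly 0.5
p_2pl = irt_prob(theta=1.0, a=1.2, b=1.0)

# 3PL: a nonzero guessing parameter c raises the lower asymptote above c
p_3pl = irt_prob(theta=-3.0, a=1.2, b=0.0, c=0.2)
```

The function name and values are illustrative only; SIBTEST itself operates on observed subtest scores rather than on these model probabilities.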
Kim, Sooyeon; Walker, Michael E. – Educational Measurement: Issues and Practice, 2022
Test equating requires collecting data to link the scores from different forms of a test. Problems arise when equating samples are not equivalent and the test forms to be linked share no common items by which to measure or adjust for the group nonequivalence. Using data from five operational test forms, we created five pairs of research forms for…
Descriptors: Ability, Tests, Equated Scores, Testing Problems
von Davier, Matthias; Bezirhan, Ummugul – Educational and Psychological Measurement, 2023
Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population…
Descriptors: Robustness (Statistics), Test Items, Item Analysis, Goodness of Fit
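The abstract contrasts robust approaches with methods that assume a perfectly fitting model. A classical baseline for DIF screening (the Mantel-Haenszel common odds ratio, not the authors' proposal) can be sketched as follows, assuming one 2×2 table of correct/incorrect counts per matched score stratum:

```python
def mantel_haenszel_odds(tables):
    # Mantel-Haenszel common odds ratio across score strata.
    # Each table is (A, B, C, D):
    #   A = reference group correct,  B = reference group incorrect,
    #   C = focal group correct,      D = focal group incorrect.
    num = den = 0.0
    for A, B, C, D in tables:
        n = A + B + C + D
        num += A * D / n
        den += B * C / n
    return num / den

# Two strata with identical odds in both groups -> ratio of 1 (no DIF signal)
tables = [(40, 10, 40, 10), (20, 20, 20, 20)]
odds = mantel_haenszel_odds(tables)
```

A value near 1 indicates comparable odds of success for the two groups after matching on total score; the sequential and robust refinements the abstract alludes to go beyond this simple statistic.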
Andrés Christiansen; Rianne Janssen – Educational Assessment, Evaluation and Accountability, 2024
In international large-scale assessments, students may not be compelled to answer every test item: a student can decide to skip a seemingly difficult item or may drop out before the end of the test is reached. The way these missing responses are treated will affect the estimation of the item difficulty and student ability, and ultimately affect…
Descriptors: Test Items, Item Response Theory, Grade 4, International Assessment
Mario I. Suárez – Educational Studies: Journal of the American Educational Studies Association, 2024
The increase in youth's self-identification as trans in the United States and Canada has created new urgency in schools to meet the needs of these students, yet education survey researchers have yet to find ways to assess their educational outcomes based on sex and gender. In this critical systematic review, I provide an overview of surveys from…
Descriptors: Measures (Individuals), Sexual Identity, Identification (Psychology), LGBTQ People
Skaggs, Gary; Hein, Serge F.; Wilkins, Jesse L. M. – Educational Measurement: Issues and Practice, 2020
In test-centered standard-setting methods, borderline performance can be represented by many different profiles of strengths and weaknesses. As a result, asking panelists to estimate item or test performance for a hypothetical group of borderline examinees, or a typical borderline examinee, may be an extremely difficult task and one that can…
Descriptors: Standard Setting (Scoring), Cutting Scores, Testing Problems, Profiles
Kim, Sooyeon; Walker, Michael – ETS Research Report Series, 2021
In this investigation, we used real data to assess potential differential effects associated with taking a test in a test center (TC) versus testing at home using remote proctoring (RP). We used a pseudo-equivalent groups (PEG) approach to examine group equivalence at the item level and the total score level. If our assumption holds that the PEG…
Descriptors: Testing, Distance Education, Comparative Analysis, Test Items
Janssen, Gerriet – Language Testing, 2022
This article provides a single, common-case study of a test retrofit project at one Colombian university. It reports on how the test retrofit project was carried out and describes the different areas of language assessment literacy the project afforded local teacher stakeholders. This project was successful in that it modified the test constructs…
Descriptors: Language Tests, Placement Tests, Language Teachers, College Faculty
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020
A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…
Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items
Lee, Chansoon; Qian, Hong – Educational and Psychological Measurement, 2022
Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential…
Descriptors: Computer Assisted Testing, Adaptive Testing, Licensing Examinations (Professions), Item Response Theory
Uysal, Ibrahim; Sahin-Kürsad, Merve; Kiliç, Abdullah Faruk – Participatory Educational Research, 2022
The aim of the study was to examine whether the common items in mixed-format tests (e.g., multiple-choice and essay items) contain parameter drift in the test equating processes performed with the common-item nonequivalent groups design. In this study, which was carried out using Monte Carlo simulation with a fully crossed design, the factors of test…
Descriptors: Test Items, Test Format, Item Response Theory, Equated Scores
Yi Zou; Ying Zheng; Jingwen Wang – International Journal of Language Testing, 2025
The Pearson Test of English Academic (PTE-A), a widely used high-stakes language proficiency test for university admissions and migration purposes, underwent a notable change from a three-hour to a two-hour version in November 2021. The implementation of the new version has prompted inquiries into the washback effects on various stakeholders.…
Descriptors: Testing Problems, Test Preparation, High Stakes Tests, English (Second Language)
Robitzsch, Alexander; Lüdtke, Oliver – Large-scale Assessments in Education, 2023
One major aim of international large-scale assessments (ILSA) like PISA is to monitor changes in student performance over time. To accomplish this task, a set of common items (i.e., link items) is repeatedly administered in each assessment. Linking methods based on item response theory (IRT) models are used to align the results from the different…
Descriptors: Educational Trends, Trend Analysis, International Assessment, Achievement Tests
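IRT-based linking through repeated common items, as described above, can be illustrated with the classical mean-sigma method (one of several linking approaches; the paper's comparisons go well beyond it). A minimal sketch with made-up difficulty values for the link items:

```python
import statistics as st

def mean_sigma_link(b_ref, b_new):
    # Mean-sigma linking constants A, B that place new-form item
    # difficulties onto the reference scale: b_linked = A * b_new + B.
    A = st.pstdev(b_ref) / st.pstdev(b_new)
    B = st.mean(b_ref) - A * st.mean(b_new)
    return A, B

b_ref = [-1.0, 0.0, 1.0, 2.0]   # link-item difficulties, reference calibration
b_new = [-0.5, 0.5, 1.5, 2.5]   # same items, new-assessment calibration

A, B = mean_sigma_link(b_ref, b_new)
linked = [A * b + B for b in b_new]   # difficulties on the reference scale
```

Here the new calibration is simply shifted by 0.5, so linking recovers the reference values; real link items also carry estimation error, which is where the choice of linking method starts to matter.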
Chun Wang; Ping Chen; Shengyu Jiang – Journal of Educational Measurement, 2020
Many large-scale educational surveys have moved from linear form design to multistage testing (MST) design. One advantage of MST is that it can provide more accurate latent trait (θ) estimates using fewer items than required by linear tests. However, MST generates incomplete response data by design; hence, questions remain as to how to…
Descriptors: Test Construction, Test Items, Adaptive Testing, Maximum Likelihood Statistics
Luo, Yong; Liang, Xinya – Measurement: Interdisciplinary Research and Perspectives, 2019
Current methods that simultaneously model differential testlet functioning (DTLF) and differential item functioning (DIF) constrain the variances of latent ability and testlet effects to be equal between the focal and the reference groups. Such a constraint can be stringent and unrealistic with real data. In this study, we propose a multigroup…
Descriptors: Test Items, Item Response Theory, Test Bias, Models
