Publication Date
In 2025: 2
Since 2024: 4
Since 2021 (last 5 years): 20
Since 2016 (last 10 years): 32
Since 2006 (last 20 years): 55
Descriptor
Test Items: 56
Item Response Theory: 24
Difficulty Level: 16
Scores: 13
Test Format: 13
Multiple Choice Tests: 12
Foreign Countries: 10
Item Analysis: 10
Statistical Analysis: 10
Test Construction: 10
Test Validity: 9
Source
Practical Assessment,…: 56
Author
Han, Kyung T.: 3
Metsämuuronen, Jari: 3
Baghaei, Purya: 2
Buckendahl, Chad W.: 2
Russell, Michael: 2
Agus Santoso: 1
Ahmadi, Alireza: 1
Anthony Sparks: 1
Asmundson, Gordon J. G.: 1
Babcock, Ben: 1
Bao, Han: 1
Publication Type
Journal Articles: 56
Reports - Research: 38
Reports - Descriptive: 10
Reports - Evaluative: 8
Tests/Questionnaires: 2
Education Level
Higher Education: 8
Postsecondary Education: 8
Elementary Education: 6
Middle Schools: 6
Junior High Schools: 5
Elementary Secondary Education: 4
Intermediate Grades: 4
Secondary Education: 4
Grade 5: 3
Grade 6: 2
Grade 7: 2
Assessments and Surveys
Trends in International…: 2
Massachusetts Comprehensive…: 1
United States Medical…: 1
Tom Benton – Practical Assessment, Research & Evaluation, 2025
This paper proposes an extension of linear equating that may be useful in one of two fairly common assessment scenarios. One is where different students have taken different combinations of test forms. This might occur, for example, where students have some free choice over the exam papers they take within a particular qualification. In this…
Descriptors: Equated Scores, Test Format, Test Items, Computation
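Linear equating places a score from form X on the scale of form Y by matching the two forms' means and standard deviations. As a point of reference for the extension Benton proposes (which is not reproduced here), a minimal sketch of ordinary linear equating; the function name is illustrative, not from the paper:

```python
# Linear equating: match the first two moments of the two score
# distributions. A score one SD above the form-X mean maps to the
# score one SD above the form-Y mean.

def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Return the form-Y equivalent of score x on form X."""
    return mean_y + (sd_y / sd_x) * (x - mean_x)

# Example: form X has mean 50, SD 10; form Y has mean 55, SD 12.
# A form-X score of 60 (one SD above the mean) maps to 55 + 12 = 67.
equated = linear_equate(60.0, 50.0, 10.0, 55.0, 12.0)
```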
Jianbin Fu; TsungHan Ho; Xuan Tan – Practical Assessment, Research & Evaluation, 2025
Item parameter estimation using an item response theory (IRT) model with fixed ability estimates is useful in equating with small samples on anchor items. The current study explores the impact of three ability estimation methods (weighted likelihood estimation [WLE], maximum a posteriori [MAP], and posterior ability distribution estimation [PST])…
Descriptors: Item Response Theory, Test Items, Computation, Equated Scores
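The setup the study describes, holding person ability estimates fixed while estimating item parameters, can be sketched for the simple Rasch (1PL) case. This illustrates the general idea only; the paper's WLE, MAP, and PST ability estimators and its anchor-item equating design are not implemented here:

```python
import math

# With abilities treated as known constants, each item's difficulty b
# can be estimated by Newton-Raphson on the item's log-likelihood.

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_difficulty(thetas, responses, b=0.0, iters=25):
    """ML estimate of one item's difficulty given fixed abilities
    `thetas` and 0/1 `responses` to that item."""
    for _ in range(iters):
        ps = [rasch_p(t, b) for t in thetas]
        grad = sum(p - u for p, u in zip(ps, responses))  # dlogL/db
        info = sum(p * (1 - p) for p in ps)               # Fisher information
        b += grad / info                                  # Newton step
    return b
```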
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2023
Traditional estimators of reliability such as coefficients alpha, theta, omega, and rho (maximal reliability) are prone to radically underestimate the reliability of the kinds of tests common in educational achievement testing. These tests are often composed of items with widely deviating difficulties. This is a typical pattern where the traditional…
Descriptors: Test Reliability, Achievement Tests, Computation, Test Items
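For contrast with the article's argument, here is the textbook computation of coefficient alpha, one of the traditional estimators it says is deflated on tests with widely varying item difficulties. A minimal sketch of the standard formula, not the paper's method:

```python
# Coefficient (Cronbach's) alpha from a persons-by-items score matrix:
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores)).

def cronbach_alpha(scores):
    """`scores`: one row per person, one column per item."""
    n_items = len(scores[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in scores]) for i in range(n_items)]
    total_var = var([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)
```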
Deschênes, Marie-France; Dionne, Éric; Dorion, Michelle; Grondin, Julie – Practical Assessment, Research & Evaluation, 2023
The use of the aggregate scoring method for scoring concordance tests requires the weighting of test items to be derived from the performance of a group of experts who take the test under the same conditions as the examinees. However, the average score of experts constituting the reference panel remains a critical issue in the use of these tests.…
Descriptors: Scoring, Tests, Evaluation Methods, Test Items
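The aggregate scoring method described here gives each answer option credit in proportion to how many panel experts chose it, with the modal expert answer worth full credit. A minimal sketch under that common normalization; the paper's specific panel-composition concerns are not modeled:

```python
# Aggregate scoring weights for one concordance-test item: each option's
# weight is its expert count divided by the modal option's count.

def option_weights(expert_choices):
    """`expert_choices`: the option each panel expert picked, e.g. ['a', 'a', 'b']."""
    counts = {}
    for choice in expert_choices:
        counts[choice] = counts.get(choice, 0) + 1
    modal = max(counts.values())
    return {opt: n / modal for opt, n in counts.items()}
```

An examinee who picks an option no expert chose earns zero credit; one who picks the modal expert answer earns full credit.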
Pentecost, Thomas C.; Raker, Jeffery R.; Murphy, Kristen L. – Practical Assessment, Research & Evaluation, 2023
Using multiple versions of an assessment has the potential to introduce item environment effects. These effects produce version-dependent item characteristics (i.e., difficulty and discrimination). Methods to detect such effects, and their resulting implications, are important for all levels of assessment where multiple forms of an assessment…
Descriptors: Item Response Theory, Test Items, Test Format, Science Tests
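Version-dependent drift in difficulty and discrimination can be inspected by computing classical item statistics separately for each test version and comparing them across versions. A sketch using proportion correct and the item-rest point-biserial correlation; the paper's own IRT-based detection methods are not implemented here:

```python
import math

# Classical item statistics for one item in a 0/1 response matrix:
# difficulty = proportion correct; discrimination = point-biserial
# correlation between the item and the rest score (total minus the item).

def item_stats(responses, item):
    """`responses`: one row of 0/1 scores per person; `item`: column index."""
    xs = [row[item] for row in responses]
    ys = [sum(row) - row[item] for row in responses]  # rest score
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    difficulty = mx
    discrimination = cov / (sx * sy)
    return difficulty, discrimination
```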
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022
The reliability of a test score is usually underestimated, and the deflation may be profound: 0.40 to 0.60 units of reliability, or 46 to 71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…
Descriptors: Test Reliability, Scores, Test Items, Correlation
Svihla, Vanessa; Gallup, Amber – Practical Assessment, Research & Evaluation, 2021
In making validity arguments, a central consideration is whether the instrument fairly and adequately covers intended content, and this is often evaluated by experts. While common procedures exist for quantitatively assessing this, the effect of loss aversion--a cognitive bias that would predict a tendency to retain items--on these procedures has…
Descriptors: Content Validity, Anxiety, Bias, Test Items
Stemler, Steven E.; Naples, Adam – Practical Assessment, Research & Evaluation, 2021
When students receive the same score on a test, does that mean they know the same amount about the topic? The answer to this question is more complex than it may first appear. This paper compares classical and modern test theories in terms of how they estimate student ability. Crucial distinctions between the aims of Rasch Measurement and IRT are…
Descriptors: Item Response Theory, Test Theory, Ability, Computation
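One of the crucial distinctions the paper draws can be made concrete: under the Rasch model the raw score is a sufficient statistic for ability, so examinees with the same total score receive the same estimate regardless of which items they answered correctly, whereas under 2PL-style IRT models the particular response pattern matters. A minimal Newton-Raphson sketch for the Rasch case, illustrative rather than taken from the paper:

```python
import math

# ML ability estimate under the Rasch model: solve
# raw_score = sum over items of P(correct | theta, b).
# Only the raw score enters the estimating equation.

def rasch_theta(raw_score, difficulties, theta=0.0, iters=50):
    """Ability estimate for a raw score with 0 < raw_score < len(difficulties)."""
    for _ in range(iters):
        ps = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        grad = raw_score - sum(ps)           # dlogL/dtheta
        info = sum(p * (1 - p) for p in ps)  # Fisher information
        theta += grad / info                 # Newton step
    return theta
```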
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022
This article discusses visual techniques for identifying test items that are optimal to select into the final compilation and, conversely, for screening out items that would lower the quality of the compilation. Some classic visual tools are discussed, first, in a practical manner in diagnosing the logical,…
Descriptors: Test Items, Item Analysis, Item Response Theory, Cutting Scores
Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine the effects of differences in group ability and of features of the anchor test form on equating bias and the standard error of equating (SEE), using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Wan, Siyu; Keller, Lisa A. – Practical Assessment, Research & Evaluation, 2023
Statistical process control (SPC) charts have been widely used in the field of educational measurement. The cumulative sum (CUSUM) is an established SPC method for detecting aberrant responses in educational assessments. Many studies have investigated the performance of CUSUM in different test settings. This paper describes the CUSUM…
Descriptors: Visual Aids, Educational Assessment, Evaluation Methods, Item Response Theory
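The upper one-sided CUSUM recursion behind such charts accumulates departures above a reference value k and signals when the running sum crosses a decision limit h. A generic sketch with illustrative thresholds, not the settings studied in the paper:

```python
# Upper CUSUM over a sequence of standardized residuals (e.g. person-fit
# residuals across items): s_i = max(0, s_{i-1} + r_i - k); flag when s > h.

def cusum_upper(residuals, k=0.5, h=4.0):
    """Return (list of cumulative sums, index of first crossing of h, or None)."""
    s, sums, signal = 0.0, [], None
    for i, r in enumerate(residuals):
        s = max(0.0, s + r - k)
        sums.append(s)
        if signal is None and s > h:
            signal = i
    return sums, signal
```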
Agus Santoso; Heri Retnawati; Timbul Pardede; Ibnu Rafi; Munaya Nikma Rosyada; Gulzhaina K. Kassymova; Xu Wenxin – Practical Assessment, Research & Evaluation, 2024
The test blueprint is important in test development: it guides the item writer in creating test items according to the desired objectives and specifications (so-called a priori item characteristics), such as the intended difficulty level of each item and the distribution of items across difficulty levels.…
Descriptors: Foreign Countries, Undergraduate Students, Business English, Test Construction
Wiberg, Marie – Practical Assessment, Research & Evaluation, 2021
The overall aim was to examine the equated values when using different linkage plans and different observed-score equipercentile equating methods with the equivalent groups (EG) design and the nonequivalent groups with anchor test (NEAT) design. Both real data from a college admissions test and simulated data were used with frequency estimation,…
Descriptors: Equated Scores, Test Items, Methods, College Entrance Examinations
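Equipercentile equating maps a score on form X to the form-Y score holding the same percentile rank. A bare-bones sketch on empirical score distributions; operational methods such as the frequency estimation approach mentioned above add presmoothing and continuization, which are omitted here:

```python
# Equipercentile equating on raw empirical distributions: find the
# percentile rank of x on form X, then the form-Y score whose
# percentile rank is closest to it.

def percentile_rank(scores, x):
    """Fraction of the distribution at or below x (midpoint convention)."""
    below = sum(1 for s in scores if s < x)
    at = sum(1 for s in scores if s == x)
    return (below + 0.5 * at) / len(scores)

def equipercentile(x, form_x_scores, form_y_scores):
    """Form-Y score whose percentile rank best matches that of x on form X."""
    target = percentile_rank(form_x_scores, x)
    return min(sorted(set(form_y_scores)),
               key=lambda y: abs(percentile_rank(form_y_scores, y) - target))
```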
Sarah Wellberg; Anthony Sparks; Leanne Ketterlin-Geller – Practical Assessment, Research & Evaluation, 2023
The early development of spatial reasoning skills has been linked to future success in mathematics (Wai, Lubinski, & Benbow, 2009), but research to date has mainly focused on the development of these skills within classroom settings rather than at home. The home environment is often the first place students are exposed to, and develop, early…
Descriptors: Test Construction, Test Validity, Measures (Individuals), Surveys
An Intersectional Approach to Differential Item Functioning: Reflecting Configurations of Inequality
Russell, Michael; Kaplan, Larry – Practical Assessment, Research & Evaluation, 2021
Differential Item Functioning (DIF) is commonly employed to examine measurement bias of test scores. Current approaches to DIF compare item functioning separately for select demographic identities such as gender, racial stratification, and economic status. Examining potential item bias fails to recognize and capture the intersecting configurations…
Descriptors: Test Bias, Test Items, Demography, Identification
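The Mantel-Haenszel procedure, a standard DIF method of the kind the authors critique when applied to one demographic attribute at a time, compares reference- and focal-group performance on an item within total-score strata. An intersectional analysis in the paper's spirit would instead define the focal group by a configuration of identities; the statistic itself is unchanged. A minimal sketch:

```python
import math

# Mantel-Haenszel common odds ratio across score strata. Each stratum is
# a 2x2 table: (ref correct, ref wrong, focal correct, focal wrong).

def mh_odds_ratio(strata):
    """Pooled odds ratio; values > 1 mean the item favors the reference group."""
    num = den = 0.0
    for a, b, c, d in strata:
        t = a + b + c + d
        num += a * d / t
        den += b * c / t
    return num / den

def mh_delta(strata):
    """ETS delta scale: negative values favor the reference group."""
    return -2.35 * math.log(mh_odds_ratio(strata))
```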