ERIC - Search Results

Showing 1 to 15 of 57 results

Integration of Historical Data for the Analysis of Multiple Assessment Studies

Peer reviewed

Marcoulides, Katerina M. – Measurement: Interdisciplinary Research and Perspectives, 2023

Integrative data analyses have recently been shown to be an effective tool for researchers interested in synthesizing datasets from multiple studies in order to draw statistical or substantive conclusions. The actual process of integrating the different datasets depends on the availability of some common measures or items reflecting the same…

Descriptors: Data Analysis, Synthesis, Test Items, Simulation

Influences of Carry-Over Effects across Scales on Mediation Analyses

Peer reviewed

Kuan-Yu Jin; Yi-Jhen Wu; Ming Ming Chiu – Measurement: Interdisciplinary Research and Perspectives, 2025

Many education tests and psychological surveys elicit respondent views of similar constructs across scenarios (e.g., story followed by multiple choice questions) by repeating common statements across scales (one-statement-multiple-scale, OSMS). However, a respondent's earlier responses to the common statement can affect later responses to it…

Descriptors: Administrator Surveys, Teacher Surveys, Responses, Test Items

Examining the Wording Effect: What Are We Measuring?

Peer reviewed

Abdullah Faruk Kiliç; Meltem Acar Güvendir; Gül Güler; Tugay Kaçak – Measurement: Interdisciplinary Research and Perspectives, 2025

In this study, the extent to which wording effects impact structure and factor loadings, internal consistency, and measurement invariance was outlined. The modified form, which includes items that are semantically reversed, explains 21.5% more variance than the original form. Also, reversed items' factor loadings are higher. As a result of CFA, indexes…

Descriptors: Test Items, Factor Structure, Test Reliability, Semantics

Analysis of Mixed-Format Assessments Using Measurement Models and Topic Modeling

Peer reviewed

Jiawei Xiong; George Engelhard; Allan S. Cohen – Measurement: Interdisciplinary Research and Perspectives, 2025

Mixed-format data commonly result from the use of both multiple-choice (MC) and constructed-response (CR) questions on assessments. Dealing with these mixed response types involves understanding what the assessment is measuring and using suitable measurement models to estimate latent abilities. Past research in educational…

Descriptors: Responses, Test Items, Test Format, Grade 8

Evaluation of Response Probabilities along Studied Latent Dimensions: A Polytomous Item Extension

Peer reviewed

Raykov, Tenko; Huber, Chuck; Marcoulides, George A.; Pusic, Martin; Menold, Natalja – Measurement: Interdisciplinary Research and Perspectives, 2021

A readily and widely applicable procedure is discussed that can be used to obtain point and interval estimates of the probabilities of particular responses on polytomous items at pre-specified points along underlying latent continua. The items are thereby assumed to be part of unidimensional multi-component measuring instruments that may also contain binary…

Descriptors: Probability, Computation, Test Items, Responses

From Likert to Forced Choice: Statement Parameter Invariance and Context Effects in Personality Assessment

Peer reviewed

Jianbin Fu; Patrick C. Kyllonen; Xuan Tan – Measurement: Interdisciplinary Research and Perspectives, 2024

Users of forced-choice questionnaires (FCQs) to measure personality commonly assume statement parameter invariance across contexts -- between Likert and forced-choice (FC) items and between different FC items that share a common statement. In this paper, an empirical study was designed to check these two assumptions for an FCQ assessment measuring…

Descriptors: Measurement Techniques, Questionnaires, Personality Measures, Interpersonal Competence

Efficiency of PROMIS MCAT Assessments for Orthopaedic Care

Peer reviewed

Michael Bass; Scott Morris; Sheng Zhang – Measurement: Interdisciplinary Research and Perspectives, 2025

Administration of patient-reported outcome measures (PROs) using multidimensional computer adaptive tests (MCATs) has the potential to reduce patient burden, but the efficiency of MCAT depends on the degree to which an individual's responses fit the psychometric properties of the assessment. Assessing patients' symptom burden through the…

Descriptors: Adaptive Testing, Computer Assisted Testing, Patients, Outcome Measures

Sample Size Requirements for Parameter Recovery in the 4-Parameter Logistic Model

Peer reviewed

Cuhadar, Ismail – Measurement: Interdisciplinary Research and Perspectives, 2022

In practice, some test items may display misfit at the upper asymptote of the item characteristic curve due to distraction, anxiety, or carelessness by the test takers (i.e., the slipping effect). Conventional item response theory (IRT) models do not take the slipping effect into consideration, which may violate the model fit assumption in IRT.…

Descriptors: Sample Size, Item Response Theory, Test Items, Mathematical Models

Item Response Theory and Modeling with Stata

Peer reviewed

Raykov, Tenko – Measurement: Interdisciplinary Research and Perspectives, 2023

This software review discusses the capabilities of Stata to conduct item response theory modeling. The commands needed for fitting the popular one-, two-, and three-parameter logistic models are initially discussed. The procedure for testing the discrimination parameter equality in the one-parameter model is then outlined. The commands for fitting…

Descriptors: Item Response Theory, Models, Comparative Analysis, Item Analysis

There Are Many Greater Lower Bounds than Cronbach's α: A Monte Carlo Simulation Study

Peer reviewed

Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023

A Monte Carlo simulation study was conducted to examine the performance of α, λ2, λ_4, λ_2, ω_T, GLB_MRFA, and GLB_Algebraic coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…

Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation

An Approach to Test Equating under the Latent "D"-Scoring Method

Peer reviewed

Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Measurement: Interdisciplinary Research and Perspectives, 2021

This study offers an approach to test equating under the latent D-scoring method (DSM-L) using the nonequivalent groups with anchor tests (NEAT) design. The accuracy of the test equating was examined via a simulation study under a 3 × 3 design by two conditions: group ability at three levels and test difficulty at three levels. The results for…

Descriptors: Equated Scores, Scoring, Test Items, Accuracy

Regarding Item Parameter Invariance for the Rasch and the 2-Parameter Logistic Models: An Investigation under Finite Non-Representative Sample Calibrations

Peer reviewed

Paek, Insu; Liang, Xinya; Lin, Zhongtian – Measurement: Interdisciplinary Research and Perspectives, 2021

The property of item parameter invariance in item response theory (IRT) plays a pivotal role in the applications of IRT such as test equating. The scope of parameter invariance when using estimates from finite biased samples in the applications of IRT does not appear to be clearly documented in the IRT literature. This article provides information…

Descriptors: Item Response Theory, Computation, Test Items, Bias

Utilizing Linear Logistic Test Models to Explore Item Characteristics of Medical Subspecialty Certification Examinations

Peer reviewed

Emily K. Toutkoushian; Huaping Sun; Mark T. Keegan; Ann E. Harman – Measurement: Interdisciplinary Research and Perspectives, 2024

Linear logistic test models (LLTMs), leveraging item response theory and linear regression, offer an elegant method for learning about item characteristics in complex content areas. This study used LLTMs to model single-best-answer, multiple-choice-question response data from two medical subspecialty certification examinations in multiple years…

Descriptors: Licensing Examinations (Professions), Certification, Medical Students, Test Items

Interval Estimation of Item Response Probabilities along Studied Latent Dimensions

Peer reviewed

Raykov, Tenko; Marcoulides, George A.; Pusic, Martin – Measurement: Interdisciplinary Research and Perspectives, 2021

An interval estimation procedure is discussed that can be used to evaluate the probability of a particular response for a binary or binary scored item at a pre-specified point along an underlying latent continuum. The item is assumed to: (a) be part of a unidimensional multi-component measuring instrument that may also contain polytomous items,…

Descriptors: Item Response Theory, Computation, Probability, Test Items

An Investigation of Item Calibration Methods in Multistage Testing

Peer reviewed

Cai, Liuhan; Albano, Anthony D.; Roussos, Louis A. – Measurement: Interdisciplinary Research and Perspectives, 2021

Multistage testing (MST), an adaptive test delivery mode that involves algorithmic selection of predefined item modules rather than individual items, offers a practical alternative to linear and fully computerized adaptive testing. However, interactions across stages between item modules and examinee groups can lead to challenges in item…

Descriptors: Adaptive Testing, Test Items, Item Response Theory, Test Construction
