ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	3
Since 2016 (last 10 years)	5
Since 2006 (last 20 years)	10

Descriptor

Difficulty Level	18
Error of Measurement	18
Simulation	18
Test Items	12
Item Response Theory	9
Equated Scores	8
Item Analysis	4
Statistical Analysis	4
Ability	3
Correlation	3
Sample Size	3
Sampling	3
Testing	3
Ability Grouping	2
Bayesian Statistics	2
Comparative Analysis	2
Computation	2
Factor Analysis	2
Item Banks	2
Latent Trait Theory	2
Mathematical Models	2
Measurement	2
Monte Carlo Methods	2
Psychometrics	2
Regression (Statistics)	2
More ▼

Source

Applied Measurement in…	2
ETS Research Report Series	2
Journal of Educational…	2
Applied Psychological…	1
Educational and Psychological…	1
International Journal of…	1
Journal of Experimental…	1
Practical Assessment,…	1
Research Matters	1

Publication Type

Journal Articles	11
Reports - Research	11
Reports - Evaluative	5
Speeches/Meeting Papers	4
Numerical/Quantitative Data	1
Reports - Descriptive	1

Education Level

Higher Education	1
Postsecondary Education	1
Secondary Education	1

Audience

Researchers

Location

Laws, Policies, & Programs

Assessments and Surveys

Program for International…

What Works Clearinghouse Rating

Showing 1 to 15 of 18 results Save | Export

Impacts of Differences in Group Abilities and Anchor Test Features on Three Non-IRT Test Equating Methods

Peer reviewed
PDF on ERIC

Download full text

Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024

The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, Postratification kernel equating, and Circle-arc equating were studied. A college admissions test with four different…

Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed
PDF on ERIC

Download full text

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Comparing Small-Sample Equating with Angoff Judgement for Linking Cut-Scores on Two Tests

Download full text

Bramley, Tom – Research Matters, 2020

The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…

Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy

Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing

Peer reviewed

Direct link

Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022

When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…

Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis

Using E-Z Reader to Examine the Consequences of Fixation-Location Measurement Error

Peer reviewed

Direct link

Reichle, Erik D.; Drieghe, Denis – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2015

There is an ongoing debate about whether fixation durations during reading are only influenced by the processing difficulty of the words being fixated (i.e., the serial-attention hypothesis) or whether they are also influenced by the processing difficulty of the previous and/or upcoming words (i.e., the attention-gradient hypothesis). This article…

Descriptors: Reading, Eye Movements, Error of Measurement, Difficulty Level

The Effect of Anchor Test Construction on Scale Drift

Peer reviewed

Direct link

Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014

In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…

Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory

Monitoring Items in Real Time to Enhance CAT Security

Peer reviewed

Direct link

Zhang, Jinming; Li, Jie – Journal of Educational Measurement, 2016

An IRT-based sequential procedure is developed to monitor items for enhancing test security. The procedure uses a series of statistical hypothesis tests to examine whether the statistical characteristics of each item under inspection have changed significantly during CAT administration. This procedure is compared with a previously developed…

Descriptors: Computer Assisted Testing, Test Items, Difficulty Level, Item Response Theory

Observed-Score Equating with a Heterogeneous Target Population

Peer reviewed

Direct link

Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012

Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…

Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis

Multidimensional Item Response Theory Parameter Estimation with Nonsimple Structure Items

Peer reviewed

Direct link

Finch, Holmes – Applied Psychological Measurement, 2011

Estimation of multidimensional item response theory (MIRT) model parameters can be carried out using the normal ogive with unweighted least squares estimation with the normal-ogive harmonic analysis robust method (NOHARM) software. Previous simulation research has demonstrated that this approach does yield accurate and efficient estimates of item…

Descriptors: Item Response Theory, Computation, Test Items, Simulation

A Subdividing Method for Generalizability Theory: Precision of Measurement Errors and Patterns of Missing Data.

Chiu, Christopher W. T. – 2000

A procedure was developed to analyze data with missing observations by extracting data from a sparsely filled data matrix into analyzable smaller subsets of data. This subdividing method, based on the conceptual framework of meta-analysis, was accomplished by creating data sets that exhibit structural designs and then pooling variance components…

Descriptors: Difficulty Level, Error of Measurement, Generalizability Theory, Interrater Reliability

Choice of Anchor Test in Equating. Research Report. ETS RR-06-35

Peer reviewed
PDF on ERIC

Download full text

Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006

It is a widely held belief that anchor tests should be miniature versions (i.e., minitests), with respect to content and statistical characteristics of the tests being equated. This paper examines the foundations for this belief. It examines the requirement of statistical representativeness of anchor tests that are content representative. The…

Descriptors: Test Items, Equated Scores, Evaluation Methods, Difficulty Level

Standard Errors of Estimate in Item-Examinee Sampling as a Function of Test Reliability, Variation in Item Difficulty Indices and Degree of Skewness in the Normative Distribution

Peer reviewed

Shoemaker, David M. – Educational and Psychological Measurement, 1972

Descriptors: Difficulty Level, Error of Measurement, Item Sampling, Simulation

A Simultaneous Approach to Multi-Factor DIF Analysis.

Download full text

Tang, Huixing – 1994

A method is presented for the simultaneous analysis of differential item functioning (DIF) in multi-factor situations. The method is unique in that it combines item response theory (IRT) and analysis of variance (ANOVA), takes a simultaneous approach to multifactor DIF analysis, and is capable of capturing interaction and controlling for possible…

Descriptors: Ability, Analysis of Variance, Difficulty Level, Error of Measurement

Multi-level IRT with Measurement Error in the Predictor Variables. Research Report 98-16.

Download full text

Fox, Jean-Paul; Glas, Cees A. W. – 1998

A two-level regression model is imposed on the ability parameters in an item response theory (IRT) model. The advantage of using latent rather than observed scores as dependent variables of a multilevel model is that this offers the possibility of separating the influence of item difficulty and ability level and modeling response variation and…

Descriptors: Ability, Bayesian Statistics, Difficulty Level, Error of Measurement

Equating Multiple Tests via an IRT Linking Design: Utilizing a Single Set of Anchor Items with Fixed Common Item Parameters during the Calibration Process.

Download full text

Li, Yuan H.; Griffith, William D.; Tam, Hak P. – 1997

This study explores the relative merits of a potentially useful item response theory (IRT) linking design: using a single set of anchor items with fixed common item parameters (FCIP) during the calibration process. An empirical study was conducted to investigate the appropriateness of this linking design using 6 groups of students taking 6 forms…

Descriptors: Ability, Difficulty Level, Equated Scores, Error of Measurement

Previous Page | Next Page »

Pages: 1 | 2

Li, Yuan H.	2
Abulela, Mohammed A. A.	1
Antal, Judit	1
Bramley, Tom	1
Chiu, Christopher W. T.	1
Curry, Allen R.	1
Dorans, Neil J.	1
Drieghe, Denis	1
Duong, Minh Q.	1
Finch, Holmes	1
Fox, Jean-Paul	1
Glas, Cees A. W.	1
Griffith, William D.	1
Guo, Hongwen	1
Holland, Paul	1
Inga Laukaityte	1
Jones, Patricia B.	1
Li, Jie	1
Lissitz, Robert W.	1
Lu, Ru	1
Marie Wiberg	1
Melican, Gerald J.	1
Proctor, Thomas P.	1
Reichle, Erik D.	1
More ▼