ERIC - Search Results

Showing all 8 results

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Comparison of Passing Scores Determined by the Angoff Method in Different Item Samples

Peer reviewed

Kara, Hakan; Cetin, Sevda – International Journal of Assessment Tools in Education, 2020

In this study, the efficiency of various random sampling methods to reduce the number of items rated by judges in an Angoff standard-setting study was examined and the methods were compared with each other. First, the full-length test was formed by combining the Placement Test 2012 and 2013 mathematics subsets. Then, simple random sampling…

Descriptors: Cutting Scores, Standard Setting (Scoring), Sampling, Error of Measurement

Comparing Small-Sample Equating with Angoff Judgement for Linking Cut-Scores on Two Tests

Bramley, Tom – Research Matters, 2020

The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…

Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy

Impact of Missing Data on Rasch Model Estimations

Soysal, Sümeyra; Arikan, Çigdem Akin; Inal, Hatice – Online Submission, 2016

This study investigates how methods for handling missing data affect item difficulty estimates under different test lengths and sample sizes. To this end, data sets with 10, 20 and 40 items and sample sizes of 100 and 5000 were prepared. Deletion was applied at rates of 5%, 10% and 20% under conditions…

Descriptors: Research Problems, Data Analysis, Item Response Theory, Test Items

Selection of Common Items as an Unrecognized Source of Variability in Test Equating: A Bootstrap Approximation Assuming Random Sampling of Common Items

Peer reviewed

Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014

The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…

Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference

An Investigation of Measurement Invariance of the Key Stage 2 National Curriculum Science Sampling Test in England

Peer reviewed

He, Qingping; Anwyll, Steve; Glanville, Matthew; Opposs, Dennis – Research Papers in Education, 2014

Since 2010, the whole national cohort Key Stage 2 (KS2) National Curriculum test in science in England has been replaced with a sampling test taken by pupils at the age of 11 from a nationally representative sample of schools annually. The study reported in this paper compares the performance of different subgroups of the samples (classified by…

Descriptors: National Curriculum, Sampling, Foreign Countries, Factor Analysis

Observed-Score Equating with a Heterogeneous Target Population

Peer reviewed

Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012

Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…

Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis

Accounting for the Uncertainty in Performance Standards.

deGruijter, Dato N. M. – 1980

The setting of standards involves subjective value judgments. The inherent arbitrariness of specific standards has been severely criticized by Glass. His antagonists agree that standard setting is a judgmental task but they have pointed out that arbitrariness in the positive sense of serious judgmental decisions is unavoidable. Further, small…

Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests