Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
Kara, Hakan; Cetin, Sevda – International Journal of Assessment Tools in Education, 2020
In this study, the efficiency of various random sampling methods to reduce the number of items rated by judges in an Angoff standard-setting study was examined and the methods were compared with each other. First, the full-length test was formed by combining the Placement Test 2012 and 2013 mathematics subsets. After that, simple random sampling…
Descriptors: Cutting Scores, Standard Setting (Scoring), Sampling, Error of Measurement
Bramley, Tom – Research Matters, 2020
The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…
Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy
Soysal, Sümeyra; Arikan, Çigdem Akin; Inal, Hatice – Online Submission, 2016
This study aims to investigate the effect of methods for dealing with missing data on item difficulty estimates under different test lengths and sample sizes. To this end, a data set including 10, 20 and 40 items with sample sizes of 100 and 5000 was prepared. Deletion was applied at rates of 5%, 10% and 20% under conditions…
Descriptors: Research Problems, Data Analysis, Item Response Theory, Test Items
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
He, Qingping; Anwyll, Steve; Glanville, Matthew; Opposs, Dennis – Research Papers in Education, 2014
Since 2010, the whole national cohort Key Stage 2 (KS2) National Curriculum test in science in England has been replaced with a sampling test taken by pupils at the age of 11 from a nationally representative sample of schools annually. The study reported in this paper compares the performance of different subgroups of the samples (classified by…
Descriptors: National Curriculum, Sampling, Foreign Countries, Factor Analysis
Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012
Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…
Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis
deGruijter, Dato N. M. – 1980
The setting of standards involves subjective value judgments. The inherent arbitrariness of specific standards has been severely criticized by Glass. His antagonists agree that standard setting is a judgmental task but they have pointed out that arbitrariness in the positive sense of serious judgmental decisions is unavoidable. Further, small…
Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests