Chen, Haiwen H.; von Davier, Matthias; Yamamoto, Kentaro; Kong, Nan – ETS Research Report Series, 2015
One major issue with large-scale assessments is that the respondents might give no responses to many items, resulting in less accurate estimations of both assessed abilities and item parameters. This report studies how the types of items affect the item-level nonresponse rates and how different methods of treating item-level nonresponses have an…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
Livingston, Samuel A.; Kim, Sooyeon – ETS Research Report Series, 2010
A series of resampling studies investigated the accuracy of equating by four different methods in a random groups equating design with samples of 400, 200, 100, and 50 test takers taking each form. Six pairs of forms were constructed. Each pair was constructed by assigning items from an existing test taken by 9,000 or more test takers. The…
Descriptors: Equated Scores, Accuracy, Sample Size, Sampling
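As a rough illustration of what equating in a random groups design involves, the sketch below maps scores on one form onto the scale of another by matching means and standard deviations. The abstract does not name the four methods studied, so linear equating is used here only as an illustrative stand-in; the function name `linear_equating` and the sample scores are hypothetical.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence


def linear_equating(
    x_scores: Sequence[float], y_scores: Sequence[float]
) -> Callable[[float], float]:
    """Return a function mapping a Form X score onto the Form Y scale.

    Assumes a random groups design: the two samples are treated as
    randomly equivalent, so differences in score distributions are
    attributed to differences in form difficulty.
    """
    mu_x, mu_y = mean(x_scores), mean(y_scores)
    sd_x, sd_y = pstdev(x_scores), pstdev(y_scores)
    if sd_x == 0:
        raise ValueError("Form X scores have zero variance")
    slope = sd_y / sd_x
    # Match the first two moments: y = mu_y + (sd_y / sd_x) * (x - mu_x)
    return lambda x: mu_y + slope * (x - mu_x)


# Hypothetical scores from two randomly equivalent groups.
form_x = [10, 12, 14]
form_y = [20, 24, 28]
to_y_scale = linear_equating(form_x, form_y)
equated = to_y_scale(14)
```

With small samples such as those in the abstract (50 to 400 test takers), the estimated means and standard deviations are themselves noisy, which is the kind of accuracy question a resampling study can address.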
Fu, Jianbin; Wise, Maxwell – ETS Research Report Series, 2012
In the Cognitively Based Assessment of, for, and as Learning (CBAL™) research initiative, innovative K-12 prototype tests based on cognitive competency models are developed. This report presents the statistical results of the 2 CBAL Grade 8 writing tests and 2 Grade 7 reading tests administered to students in 20 states in spring 2011.…
Descriptors: Cognitive Ability, Grade 8, Writing Tests, Grade 7

Cziko, Gary A. – Educational and Psychological Measurement, 1984
Some problems associated with the criteria of reproducibility and scalability as they are used in Guttman scalogram analysis to evaluate cumulative, nonparametric scales of dichotomous items are discussed. A computer program is presented which analyzes response patterns elicited by dichotomous scales designed to be cumulative. (Author/DWH)
Descriptors: Scaling, Statistical Analysis, Test Construction, Test Items
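For readers unfamiliar with the reproducibility criterion mentioned above, the following is a minimal sketch of one common way to compute a coefficient of reproducibility for dichotomous items. It is not the program described in the article. It assumes the error-counting convention in which each respondent's observed pattern is compared with the ideal cumulative pattern implied by their total score; the function name and data are hypothetical.

```python
from typing import List


def coefficient_of_reproducibility(responses: List[List[int]]) -> float:
    """Compute 1 - errors / (persons * items) for 0/1 response data.

    An error is any response that differs from the ideal Guttman
    pattern, in which a respondent with total score s endorses
    exactly the s easiest items.
    """
    n_persons = len(responses)
    n_items = len(responses[0])
    # Order items from easiest (most endorsed) to hardest.
    totals = [sum(row[j] for row in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -totals[j])
    errors = 0
    for row in responses:
        score = sum(row)
        for rank, j in enumerate(order):
            ideal = 1 if rank < score else 0
            errors += int(row[j] != ideal)
    return 1.0 - errors / (n_persons * n_items)


# Hypothetical data: the last respondent breaks the cumulative pattern.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
cr = coefficient_of_reproducibility(data)
```

Other error-counting conventions exist and yield different values for the same data, which is one reason the criterion can be hard to interpret.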
Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006
It is a widely held belief that anchor tests should be miniature versions (i.e., minitests) of the tests being equated, with respect to both content and statistical characteristics. This paper examines the foundations for this belief. It examines the requirement of statistical representativeness of anchor tests that are content representative. The…
Descriptors: Test Items, Equated Scores, Evaluation Methods, Difficulty Level
Zhang, Jinming – ETS Research Report Series, 2005
Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…
Descriptors: Statistical Bias, Maximum Likelihood Statistics, Computation, Ability
Olsen, Rolf Vegar – Scandinavian Journal of Educational Research, 2004
In the Programme for International Student Assessment (PISA) the items are organised in small clusters relating to the same stimulus material (called 'units'). Homogeneity analysis (HA) is used to develop a detailed description of the relationship between all the items in one unit, using the categorical information available in the PISA data. The…
Descriptors: Thinking Skills, Knowledge Level, Student Evaluation, Foreign Countries
Kjaernsli, Marit; Lie, Svein – Scandinavian Journal of Educational Research, 2004
In this paper we have set out to search for similarities and differences between the Nordic countries concerning patterns of competencies defined as scientific literacy in the Programme for International Student Assessment (PISA) study. The first part focuses on gender differences concerning the two types of competencies, understanding of…
Descriptors: Foreign Countries, Scientific Literacy, Thinking Skills, Gender Differences
Pomplun, Mark; Ritchie, Timothy – Journal of Educational Computing Research, 2004
This study investigated the statistical and practical significance of context effects for items randomized within testlets for administration during a series of computerized non-adaptive tests. One hundred and twenty-five items from four primary school reading tests were studied. Logistic regression analyses identified from one to four items for…
Descriptors: Psychometrics, Context Effect, Effect Size, Primary Education
Sheehan, Kathleen M.; Kostin, Irene; Futagi, Yoko; Hemat, Ramin; Zuckerman, Daniel – ETS Research Report Series, 2006
This paper describes the development, implementation, and evaluation of an automated system for predicting the acceptability status of candidate reading-comprehension stimuli extracted from a database of journal and magazine articles. The system uses a combination of classification and regression techniques to predict the probability that a given…
Descriptors: Automation, Prediction, Reading Comprehension, Classification

Reynolds, Trudy; And Others – Language Testing, 1994
Presents a study conducted to provide a comparative analysis of five item analysis indices using both IRT and non-IRT indices to describe the characteristics of flagged items and to investigate the appropriateness of logistic regression as an item analysis technique for further studies. The performance of five item analysis indices was examined.…
Descriptors: College Students, Comparative Analysis, English (Second Language), Item Analysis