Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 10 |
Descriptor
Evaluation Methods | 19 |
Scaling | 19 |
Test Items | 19 |
Item Response Theory | 8 |
Test Construction | 7 |
Psychometrics | 6 |
Difficulty Level | 4 |
Equated Scores | 4 |
Foreign Countries | 4 |
Item Analysis | 4 |
Measures (Individuals) | 4 |
Author
Wu, Margaret | 2 |
Ainley, John | 1 |
Avery, Marybell | 1 |
Chan, David W. | 1 |
Chen, Hanwei | 1 |
Cook, Linda L. | 1 |
Cui, Zhongmin | 1 |
Dancer, L. Suzanne | 1 |
Deng, Hui | 1 |
Donovan, Jenny | 1 |
Dyson, Ben | 1 |
Publication Type
Reports - Research | 10 |
Speeches/Meeting Papers | 7 |
Journal Articles | 6 |
Reports - Descriptive | 4 |
Reports - Evaluative | 4 |
Numerical/Quantitative Data | 2 |
Guides - General | 1 |
Tests/Questionnaires | 1 |
Education Level
Elementary Secondary Education | 5 |
Elementary Education | 3 |
Secondary Education | 3 |
Grade 8 | 2 |
Grade 4 | 1 |
Grade 6 | 1 |
Grade 9 | 1 |
High Schools | 1 |
Higher Education | 1 |
Intermediate Grades | 1 |
Junior High Schools | 1 |
Audience
Researchers | 3 |
Practitioners | 1 |
Teachers | 1 |
Location
Asia | 2 |
Australia | 1 |
Florida | 1 |
Netherlands | 1 |
Assessments and Surveys
Program for International… | 2 |
Florida Comprehensive… | 1 |
Piers Harris Childrens Self… | 1 |
Tennessee Self Concept Scale | 1 |
Trends in International… | 1 |
Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017
This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…
Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei – Applied Psychological Measurement, 2013
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
Descriptors: Regression (Statistics), Item Response Theory, Test Items, Equated Scores
Improving Comprehension Assessment for Middle and High School Students: Challenges and Opportunities
Sabatini, John; Petscher, Yaacov; O'Reilly, Tenaha; Truckenmiller, Adrea – Grantee Submission, 2015
For decades, standardized reading comprehension tests have consisted of a series of passages and associated multiple-choice questions. Although widely used in and out of the classroom, there continues to be considerable disagreement regarding how or whether such tests have net value in the service of advancing educational progress in reading. This…
Descriptors: Middle School Students, High School Students, Reading Comprehension, Reading Tests
An Investigation of Scale Drift for Arithmetic Assessment of ACCUPLACER®. Research Report No. 2010-2
Deng, Hui; Melican, Gerald – College Board, 2010
This study extends the existing literature on scale drift in computerized adaptive testing (CAT) as part of improving the quality control and calibration process for ACCUPLACER, a battery of large-scale adaptive placement tests. It evaluates item parameter drift using empirical data spanning four years from the ACCUPLACER Arithmetic…
Descriptors: Student Placement, Adaptive Testing, Computer Assisted Testing, Mathematics Tests
Chan, David W. – Gifted Child Quarterly, 2010
Item response data on the Impossible Figures Task (IFT) from 492 Chinese primary, secondary, and university students were analyzed using the dichotomous Rasch measurement model. Item difficulty estimates and person ability estimates located on the same logit scale revealed that the pooled sample of Chinese students, who were relatively highly…
Descriptors: Test Items, Adaptive Testing, Scaling, Talent Identification
Zhu, Weimo; Rink, Judy; Placek, Judith H.; Graber, Kim C.; Fox, Connie; Fisette, Jennifer L.; Dyson, Ben; Park, Youngsik; Avery, Marybell; Franck, Marian; Raynes, De – Measurement in Physical Education and Exercise Science, 2011
New testing theories, concepts, and psychometric methods (e.g., item response theory, test equating, and item bank) developed during the past several decades have many advantages over previous theories and methods. In spite of their introduction to the field, they have not been fully accepted by physical educators. Further, the manner in which…
Descriptors: Physical Education, Quality Control, Psychometrics, Item Response Theory
Wu, Margaret – OECD Publishing (NJ1), 2010
This paper makes an in-depth comparison of the PISA (OECD) and TIMSS (IEA) mathematics assessments conducted in 2003. First, a comparison of survey methodologies is presented, followed by an examination of the mathematics frameworks in the two studies. The methodologies and the frameworks in the two studies form the basis for providing…
Descriptors: Mathematics Achievement, Foreign Countries, Gender Differences, Comparative Analysis
Frisbie, David A. – 1981
The relative difficulty ratio (RDR) is used as a method of representing test difficulty. The RDR is the ratio of a test mean to the ideal mean, the point midway between the perfect score and the mean chance score for the test. The RDR transformation is a linear scale conversion method but not a linear equating method in the classical sense. The…
Descriptors: Comparative Testing, Difficulty Level, Evaluation Methods, Raw Scores
Zwick, Rebecca; Thayer, Dorothy T. – 1994
Several recent studies have investigated the application of statistical inference procedures to the analysis of differential item functioning (DIF) in test items that are scored on an ordinal scale. Mantel's extension of the Mantel-Haenszel test is a possible hypothesis-testing method for this purpose. The development of descriptive statistics for…
Descriptors: Error of Measurement, Evaluation Methods, Hypothesis Testing, Item Bias
Micceri, Theodore; And Others – 1987
Several issues relating to agreement estimates for different types of data from performance evaluations are considered. New indices of agreement are presented for ordinal level items and for summative scores produced by nominal or ordinal level items. Two sets of empirical data illustrate the performance of the two formulas derived to estimate…
Descriptors: Correlation, Data Analysis, Educational Research, Estimation (Mathematics)

Snyder, Scott; Sheehan, Robert – Journal of Early Intervention, 1992
This examination of the Rasch scaling model concludes that the model could potentially facilitate objective comparisons of status and change of young children with disabilities at individual and group levels. The paper discusses applications of the model to early childhood assessment in the areas of item banking, test analysis, and subject…
Descriptors: Disabilities, Evaluation Methods, Item Response Theory, Measurement Techniques
May, Henry – Journal of Educational and Behavioral Statistics, 2006
In this article, a new method is presented and implemented for deriving a scale of socioeconomic status (SES) from international survey data using a multilevel Bayesian item response theory (IRT) model. The proposed model incorporates both international anchor items and nation-specific items and is able to (a) produce student family SES scores…
Descriptors: Item Response Theory, Bayesian Statistics, Socioeconomic Status, Scaling
Thomas, Julia Anne – 1985
A sample of 234 fifth- and 259 sixth-grade students scaled the items of the Piers-Harris, Tennessee, Coopersmith, and Lipsett self-concept measures. The scaling of the Piers-Harris and the Tennessee inventories was examined in reference to their subscales. The present technique placed items on a bivariate plane of two orthogonal dimensions…
Descriptors: Evaluation Methods, Factor Structure, Intermediate Grades, Orthogonal Rotation
van der Linden, Wim J.; Zwarts, Michel A. – 1994
It is argued that judgments in evaluative research are ultimately subjective, but that good criteria are available to assess their quality. One of these criteria is the robustness of the judgments against incompleteness or uncertainty in the data used to describe the educational system. The use of the robustness criterion is demonstrated through…
Descriptors: Ability, Case Studies, Criteria, Decision Making
Dancer, L. Suzanne – 1990
Methods proposed by L. A. Goodman (1987) for analyzing intrinsic properties of cross-classified nominal or ordered categorical variables were used to examine the performance of items measuring psychological adjustment (PA). These methods were applied to 35 3x4 tables cross-classifying 1,158 persons according to their mental health status (MHS) and…
Descriptors: Adjustment (to Environment), Adults, Behavioral Science Research, Evaluation Methods