Briggs, Derek C.; Dadey, Nathan – Educational Assessment, 2015
This study focuses on an instance in which the mean grade-to-grade scale scores on a vertical scale showed evidence of common test items that do not get easier from one grade to the next. The issue was examined as part of a 2-day workshop in which participants were asked to predict the growth on all linking items used in the construction of…
Descriptors: Test Items, Grading, Scores, Scaling
Attali, Yigal; Saldivia, Luis; Jackson, Carol; Schuppan, Fred; Wanamaker, Wilbur – ETS Research Report Series, 2014
Previous investigations of the ability of content experts and test developers to estimate item difficulty have, for the most part, produced disappointing results. These investigations were based on a noncomparative method of independently rating the difficulty of items. In this article, we argue that, by eliciting comparative judgments of…
Descriptors: Test Items, Difficulty Level, Comparative Analysis, College Entrance Examinations
Ihme, Jan Marten; Senkbeil, Martin; Goldhammer, Frank; Gerick, Julia – European Educational Research Journal, 2017
The combination of different item formats is found quite often in large scale assessments, and analyses on the dimensionality often indicate multi-dimensionality of tests regarding the task format. In ICILS 2013, three different item types (information-based response tasks, simulation tasks, and authoring tasks) were used to measure computer and…
Descriptors: Foreign Countries, Computer Literacy, Information Literacy, International Assessment
Straat, J. Hendrik; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2014
An automated item selection procedure in Mokken scale analysis partitions a set of items into one or more Mokken scales, if the data allow. Two algorithms are available that pursue the same goal of selecting Mokken scales of maximum length: Mokken's original automated item selection procedure (AISP) and a genetic algorithm (GA). Minimum…
Descriptors: Sampling, Test Items, Effect Size, Scaling
Topczewski, Anna Marie – ProQuest LLC, 2013
Developmental score scales represent the performance of students along a continuum, where as students learn more they move higher along that continuum. Unidimensional item response theory (UIRT) vertical scaling has become a commonly used method to create developmental score scales. Research has shown that UIRT vertical scaling methods can be…
Descriptors: Item Response Theory, Scaling, Scores, Student Development
Almehrizi, Rashid S. – Applied Psychological Measurement, 2013
The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…
Descriptors: Raw Scores, Scaling, Reliability, Computation
Kroopnick, Marc Howard – ProQuest LLC, 2010
When Item Response Theory (IRT) is operationally applied for large scale assessments, unidimensionality is typically assumed. This assumption requires that the test measures a single latent trait. Furthermore, when tests are vertically scaled using IRT, the assumption of unidimensionality would require that the battery of tests across grades…
Descriptors: Simulation, Scaling, Standard Setting, Item Response Theory
ACT, Inc., 2013
This manual contains information about the American College Test (ACT) Plan® program. The principal focus of this manual is to document the Plan program's technical adequacy in light of its intended purposes. This manual supersedes the 2011 edition. The content of this manual responds to requirements of the testing industry as established in the…
Descriptors: College Entrance Examinations, Formative Evaluation, Evaluation Research, Test Bias
OECD Publishing (NJ1), 2009
The Organisation for Economic Cooperation and Development's (OECD's) Programme for International Student Assessment (PISA) surveys, which take place every three years, have been designed to collect information about 15-year-old students in participating countries. PISA examines how well students are prepared to meet the challenges of the future,…
Descriptors: Policy Formation, Scaling, Academic Achievement, Interrater Reliability
Micceri, Theodore; And Others – 1987
Several issues relating to agreement estimates for different types of data from performance evaluations are considered. New indices of agreement are presented for ordinal level items and for summative scores produced by nominal or ordinal level items. Two sets of empirical data illustrate the performance of the two formulas derived to estimate…
Descriptors: Correlation, Data Analysis, Educational Research, Estimation (Mathematics)
Zwick, Rebecca – 1986
Although perfectly scalable items rarely occur in practice, Guttman's concept of a scale has proved to be valuable to the development of measurement theory. If the score distribution is uniform and there is an equal number of items at each difficulty level, both the elements and the eigenvalues of the Pearson correlation matrix of dichotomous…
Descriptors: Correlation, Difficulty Level, Item Analysis, Latent Trait Theory
Kingston, Neal; And Others – 1985
A necessary prerequisite to the operational use of item response theory (IRT) in any testing program is the investigation of the feasibility of such an approach. This report presents the results of such research for the Graduate Management Admission Test (GMAT). Despite the fact that GMAT data appear to violate a basic assumption of the…
Descriptors: College Entrance Examinations, Computer Software, Correlation, Equated Scores