ERIC - Search Results

Showing all 12 results

Making Sense of Common Test Items That Do Not Get Easier over Time: Implications for Vertical Scale Designs

Peer reviewed

Briggs, Derek C.; Dadey, Nathan – Educational Assessment, 2015

This study focuses on an instance in which the mean grade-to-grade scale scores on a vertical scale showed evidence of common test items that do not get easier from one grade to the next. The issue was examined as part of a 2-day workshop in which participants were asked to predict the growth on all linking items used in the construction of…

Descriptors: Test Items, Grading, Scores, Scaling

Estimating Item Difficulty with Comparative Judgments. Research Report. ETS RR-14-39

Peer reviewed

Attali, Yigal; Saldivia, Luis; Jackson, Carol; Schuppan, Fred; Wanamaker, Wilbur – ETS Research Report Series, 2014

Previous investigations of the ability of content experts and test developers to estimate item difficulty have, for the most part, produced disappointing results. These investigations were based on a noncomparative method of independently rating the difficulty of items. In this article, we argue that, by eliciting comparative judgments of…

Descriptors: Test Items, Difficulty Level, Comparative Analysis, College Entrance Examinations

Assessment of Computer and Information Literacy in ICILS 2013: Do Different Item Types Measure the Same Construct?

Peer reviewed

Ihme, Jan Marten; Senkbeil, Martin; Goldhammer, Frank; Gerick, Julia – European Educational Research Journal, 2017

The combination of different item formats is found quite often in large scale assessments, and analyses on the dimensionality often indicate multi-dimensionality of tests regarding the task format. In ICILS 2013, three different item types (information-based response tasks, simulation tasks, and authoring tasks) were used to measure computer and…

Descriptors: Foreign Countries, Computer Literacy, Information Literacy, International Assessment

Minimum Sample Size Requirements for Mokken Scale Analysis

Peer reviewed

Straat, J. Hendrik; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2014

An automated item selection procedure in Mokken scale analysis partitions a set of items into one or more Mokken scales, if the data allow. Two algorithms are available that pursue the same goal of selecting Mokken scales of maximum length: Mokken's original automated item selection procedure (AISP) and a genetic algorithm (GA). Minimum…

Descriptors: Sampling, Test Items, Effect Size, Scaling

Effect of Violating Unidimensional Item Response Theory Vertical Scaling Assumptions on Developmental Score Scales

Topczewski, Anna Marie – ProQuest LLC, 2013

Developmental score scales represent the performance of students along a continuum, where as students learn more they move higher along that continuum. Unidimensional item response theory (UIRT) vertical scaling has become a commonly used method to create developmental score scales. Research has shown that UIRT vertical scaling methods can be…

Descriptors: Item Response Theory, Scaling, Scores, Student Development

Coefficient Alpha and Reliability of Scale Scores

Peer reviewed

Almehrizi, Rashid S. – Applied Psychological Measurement, 2013

The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…

Descriptors: Raw Scores, Scaling, Reliability, Computation

Exploring Unidimensional Proficiency Classification Accuracy from Multidimensional Data in a Vertical Scaling Context

Kroopnick, Marc Howard – ProQuest LLC, 2010

When Item Response Theory (IRT) is operationally applied for large scale assessments, unidimensionality is typically assumed. This assumption requires that the test measures a single latent trait. Furthermore, when tests are vertically scaled using IRT, the assumption of unidimensionality would require that the battery of tests across grades…

Descriptors: Simulation, Scaling, Standard Setting, Item Response Theory

ACT Plan: Technical Manual. 2013/2014

ACT, Inc., 2013

This manual contains information about the American College Test (ACT) Plan® program. The principal focus of this manual is to document the Plan program's technical adequacy in light of its intended purposes. This manual supersedes the 2011 edition. The content of this manual responds to requirements of the testing industry as established in the…

Descriptors: College Entrance Examinations, Formative Evaluation, Evaluation Research, Test Bias

PISA 2006 Technical Report

OECD Publishing (NJ1), 2009

The Organisation for Economic Cooperation and Development's (OECD's) Programme for International Student Assessment (PISA) surveys, which take place every three years, have been designed to collect information about 15-year-old students in participating countries. PISA examines how well students are prepared to meet the challenges of the future,…

Descriptors: Policy Formation, Scaling, Academic Achievement, Interrater Reliability

Interrater Agreement: Same Data, Different Definitions, Different Outcomes.

Micceri, Theodore; And Others – 1987

Several issues relating to agreement estimates for different types of data from performance evaluations are considered. New indices of agreement are presented for ordinal level items and for summative scores produced by nominal or ordinal level items. Two sets of empirical data illustrate the performance of the two formulas derived to estimate…

Descriptors: Correlation, Data Analysis, Educational Research, Estimation (Mathematics)

Some Properties of the Pearson Correlation Matrix of Guttman-Scalable Items.

Zwick, Rebecca – 1986

Although perfectly scalable items rarely occur in practice, Guttman's concept of a scale has proved to be valuable to the development of measurement theory. If the score distribution is uniform and there is an equal number of items at each difficulty level, both the elements and the eigenvalues of the Pearson correlation matrix of dichotomous…

Descriptors: Correlation, Difficulty Level, Item Analysis, Latent Trait Theory

An Exploratory Study of the Applicability of Item Response Theory Methods to the Graduate Management Admission Test.

Kingston, Neal; And Others – 1985

A necessary prerequisite to the operational use of item response theory (IRT) in any testing program is the investigation of the feasibility of such an approach. This report presents the results of such research for the Graduate Management Admission Test (GMAT). Despite the fact that GMAT data appear to violate a basic assumption of the…

Descriptors: College Entrance Examinations, Computer Software, Correlation, Equated Scores
