ERIC - Search Results

Showing all 14 results

Measuring Language Ability of Students with Compensatory Multidimensional CAT: A Post-Hoc Simulation Study

Peer reviewed

Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022

The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…

Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency

Exploring Alternative Test Form Linking Designs with Modified Equating Sample Size and Anchor Test Length. Research Report. ETS RR-13-02

Peer reviewed

Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013

The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…

Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation

Comparing OECD PISA Reading in English to Other Languages: Identifying Potential Sources of Non-Invariance

Peer reviewed

Asil, Mustafa; Brown, Gavin T. L. – International Journal of Testing, 2016

The use of the Programme for International Student Assessment (PISA) across nations, cultures, and languages has been criticized. The key criticisms point to the linguistic and cultural biases potentially underlying the design of reading comprehension tests, raising doubts about the legitimacy of comparisons across economies. Our research focused…

Descriptors: Comparative Analysis, Reading Achievement, Achievement Tests, Secondary School Students

Measurement Properties of DIBELS Oral Reading Fluency in Grade 2: Implications for Equating Studies

Peer reviewed

Stoolmiller, Michael; Biancarosa, Gina; Fien, Hank – Assessment for Effective Intervention, 2013

Lack of psychometric equivalence of oral reading fluency (ORF) passages used within a grade for screening and progress monitoring has recently become an issue with calls for the use of equating methods to ensure equivalence. To investigate the nature of the nonequivalence and to guide the choice of equating method to correct for nonequivalence,…

Descriptors: School Personnel, Reading Fluency, Emergent Literacy, Psychometrics

Methods of Linking with Small Samples in a Common-Item Design: An Empirical Comparison. Research Report. ETS RR-09-38

Peer reviewed

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2009

A series of resampling studies was conducted to compare the accuracy of equating in a common item design using four different methods: chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating, and the circle-arc method. Four operational test forms, each containing more than 100 items, were used for…

Descriptors: Sampling, Sample Size, Accuracy, Test Items

Investigating the Justifiability of an Additional Test Use: An Application of Assessment Use Argument to an English as a Foreign Language Test

Wang, Huan – ProQuest LLC, 2010

Multiple uses of the same assessment may present challenges for both the design and use of an assessment. Little advice, however, has been given to assessment developers as to how to understand the phenomena of multiple assessment use and meet the challenges these present. Particularly problematic is the case in which an assessment is used for…

Descriptors: Test Use, Testing Programs, Program Effectiveness, Test Construction

Making Inferences about Growth and Value-Added: Design Issues for the PARCC Consortium. A White Paper

Briggs, Derek C. – Partnership for Assessment of Readiness for College and Careers, 2011

There is often confusion about distinctions between growth models and value-added models. The first half of this paper attempts to dispel some of these confusions by clarifying terminology and illustrating by example how the results from a large-scale assessment can and will be used to make inferences about student growth and the value-added…

Descriptors: Value Added Models, Language Usage, Measurement, Inferences

E-Assessment within the Bologna Paradigm: Evidence from Portugal

Peer reviewed

Ferrao, Maria – Assessment & Evaluation in Higher Education, 2010

The Bologna Declaration brought reforms into higher education that imply changes in teaching methods, didactic materials and textbooks, infrastructures and laboratories, etc. Statistics and mathematics are disciplines that traditionally have the worst success rates, particularly in non-mathematics core curricula courses. This research project,…

Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Educational Assessment

The Effect of Sequential Dependence on the Sampling Distributions of KR-20, KR-21, and Split-Halves Reliabilities.

Sullins, Walter L. – 1971

Five-hundred dichotomously scored response patterns were generated with sequentially independent (SI) items and 500 with dependent (SD) items for each of thirty-six combinations of sampling parameters (i.e., three test lengths, three sample sizes, and four item difficulty distributions). KR-20, KR-21, and Split-Half (S-H) reliabilities were…

Descriptors: Comparative Analysis, Correlation, Error of Measurement, Item Analysis

Assessing the Psychometric Quality of Performance Rating Scales: Comparisons among Evaluative Criteria.

Lance, Charles E.; Moomaw, Michael E. – 1983

Direct assessments of the accuracy with which raters can use a rating instrument are presented. This study demonstrated how surplus behavioral incidents scaled during the development of Behaviorally Anchored Rating Scales (BARS) can be used effectively in the evaluation of the newly developed scales. Construction of scenarios of hypothetical…

Descriptors: Behavior Rating Scales, Comparative Analysis, Error of Measurement, Evaluation Criteria

A Comparison of Three Types of Test Development Procedures Using Classical and Latent Trait Methods.

Benson, Jeri; Wilson, Michael – 1979

Three methods of item selection were used to select sets of 38 items from a 50-item verbal analogies test and the resulting item sets were compared for internal consistency, standard errors of measurement, item difficulty, biserial item-test correlations, and relative efficiency. Three groups of 1,500 cases each were used for item selection. First…

Descriptors: Comparative Analysis, Difficulty Level, Efficiency, Error of Measurement

A Comparison of Several Multiple-Choice, Linguistic-Based Item Writing Algorithms.

Roid, Gale; Haladyna, Tom – 1978

The technology of transforming sentences from prose instruction into test questions was examined by comparing two methods of selecting sentences (keyword vs. rare singleton), two types of question words (nouns vs. adjectives), and two foil construction methods (writer's choice vs. algorithmic). Four item writers created items using each…

Descriptors: Algorithms, Cloze Procedure, Comparative Analysis, Criterion Referenced Tests

The Paradox of Criterion-Referenced Measurement.

Haladyna, Tom – 1976

The existence of criterion-referenced (CR) measurement is questioned in this paper. Despite beliefs that differences exist between two alternative forms of measurement, CR and Norm Referenced (NR), an analysis of philosophical and psychological descriptions of measurement, as well as a growing number of empirical studies, reveal that the common…

Descriptors: Academic Standards, Achievement Tests, Career Development, Comparative Analysis

A Theoretical and Empirical Comparison of Three Approaches to Achievement Testing.

Haladyna, Tom; Roid, Gale – 1976

Three approaches to the construction of achievement tests are compared: construct, operational, and empirical. The construct approach is based upon classical test theory and measures an abstract representation of the instructional objectives. The operational approach specifies instructional intent through instructional objectives, facet design,…

Descriptors: Academic Achievement, Achievement Tests, Career Development, Comparative Analysis
