ERIC - Search Results

Showing all 9 results

The Effect of Anchor Test Construction on Scale Drift

Peer reviewed

Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014

In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…

Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory

An Application of Generalizability Theory to Evaluate the Technical Quality of an Alternate Assessment

Peer reviewed

Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013

Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…

Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores

Validity of the Simultaneous Approach to the Development of Equivalent Achievement Tests in English and French

Peer reviewed

Rogers, W. Todd; Lin, Jie; Rinaldi, Christia M. – Applied Measurement in Education, 2011

The evidence gathered in the present study supports the use of the simultaneous development of test items for different languages. The simultaneous approach used in the present study involved writing an item in one language (e.g., French) and, before moving to the development of a second item, translating the item into the second language (e.g.,…

Descriptors: Test Items, Item Analysis, Achievement Tests, French

Estimating Non-Normal Latent Trait Distributions within Item Response Theory Using True and Estimated Item Parameters

Peer reviewed

Sass, D. A.; Schmitt, T. A.; Walker, C. M. – Applied Measurement in Education, 2008

Item response theory (IRT) procedures have been used extensively to study normal latent trait distributions and have been shown to perform well; however, less is known concerning the performance of IRT with non-normal latent trait distributions. This study investigated the degree of latent trait estimation error under normal and non-normal…

Descriptors: Difficulty Level, Item Response Theory, Test Items, Computation

Creating IRT-Based Parallel Test Forms Using the Genetic Algorithm Method

Peer reviewed

Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen – Applied Measurement in Education, 2008

In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, genetic algorithm (GA), to construct parallel…

Descriptors: Test Format, Measurement Techniques, Equations (Mathematics), Item Response Theory

Estimating Conditional Standard Errors of Measurement for Tests Composed of Testlets.

Peer reviewed

Lee, Guemin – Applied Measurement in Education, 2000

Investigated incorporating a testlet definition into the estimation of the conditional standard error of measurement (SEM) for tests composed of testlets using five conditional SEM estimation methods. Results from 3,876 tests from the Iowa Tests of Basic Skills and 1,000 simulated responses show that item-based methods provide lower conditional…

Descriptors: Error of Measurement, Estimation (Mathematics), Simulation, Test Construction

Reliability Estimation When a Test Is Split into Two Parts of Unknown Effective Length.

Peer reviewed

Feldt, Leonard S. – Applied Measurement in Education, 2002

Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…

Descriptors: Error of Measurement, Reliability, Scores, Test Construction

Experiences in the Application of Item Response Theory in Test Construction.

Peer reviewed

Green, Donald Ross; And Others – Applied Measurement in Education, 1989

Potential benefits of using item response theory in test construction are evaluated using the experience and evidence accumulated during nine years of using a three-parameter model in the development of major achievement batteries. Topics addressed include error of measurement, test equating, item bias, and item difficulty. (TJH)

Descriptors: Achievement Tests, Computer Assisted Testing, Difficulty Level, Equated Scores

Vertically Articulated Performance Standards: Logic, Procedures, and Likely Classification Accuracy

Peer reviewed

Ferrara, Steve; Johnson, Eugene; Chen, Wen-Hung – Applied Measurement in Education, 2005

Psychometricians continue to develop and evaluate methods for linking test scores, both horizontally and vertically. This article describes a social moderation process for articulating (i.e., linking) performance standards across grade levels for an operational state assessment program. The researchers used generated data to evaluate the likely…

Descriptors: Grade 2, Grade 3, Scores, Error of Measurement
