ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	2
Since 2006 (last 20 years)	6

Source

Applied Measurement in…

Publication Type

Journal Articles	13
Reports - Evaluative	6
Reports - Research	6
Information Analyses	1
Reports - Descriptive	1

Education Level

Elementary Education	1
Elementary Secondary Education	1
Grade 10	1
Grade 4	1
Grade 5	1
Grade 7	1
Grade 8	1
High Schools	1
Middle Schools	1
Secondary Education	1

Location

Israel	1
Kansas	1

Assessments and Surveys

Iowa Tests of Basic Skills

Showing all 13 results

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

Validating Human and Automated Scoring of Essays against "True" Scores

Peer reviewed

Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018

In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…

Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing

Designing, Evaluating, and Deploying Automated Scoring Systems with Validity in Mind: Methodological Design Decisions

Peer reviewed

Rupp, André A. – Applied Measurement in Education, 2018

This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…

Descriptors: Design, Automation, Scoring, Test Scoring Machines

Measurement Properties of Two Innovative Item Formats in a Computer-Based Test

Peer reviewed

Wan, Lei; Henly, George A. – Applied Measurement in Education, 2012

Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…

Descriptors: Test Items, Test Format, Computer Assisted Testing, Measurement

Stability of Rasch Scales over Time

Peer reviewed

Taylor, Catherine S.; Lee, Yoonsun – Applied Measurement in Education, 2010

Item response theory (IRT) methods are generally used to create score scales for large-scale tests. Research has shown that IRT scales are stable across groups and over time. Most studies have focused on items that are dichotomously scored. Now Rasch and other IRT models are used to create scales for tests that include polytomously scored items.…

Descriptors: Measures (Individuals), Item Response Theory, Robustness (Statistics), Item Analysis

Determining the Validity of Placement Exams for Developmental College Curricula.

Peer reviewed

Schmitz, Constance C.; delMas, Robert C. – Applied Measurement in Education, 1990

Using S. J. Messick's theoretical work concerning construct validity as a guide, underlying hypotheses for investigation when validating placement test decisions are assessed. Guidelines on validating placement decisions are offered, and the hypotheses and guidelines are applied in a validation study of the Written English Expression Placement…

Descriptors: College Freshmen, Construct Validity, Guidelines, Higher Education

Computerized-Adaptive and Self-Adapted Music-Listening Tests: Psychometric Features and Motivational Benefits.

Peer reviewed

Vispoel, Walter P.; Coffman, Don D. – Applied Measurement in Education, 1994

Computerized-adaptive (CAT) and self-adapted (SAT) music listening tests were compared for efficiency, reliability, validity, and motivational benefits with 53 junior high school students. Results demonstrate trade-offs, with greater potential motivational benefits for SAT and greater efficiency for CAT. SAT elicited more favorable responses from…

Descriptors: Adaptive Testing, Computer Assisted Testing, Efficiency, Item Response Theory

High-Stakes Testing Accommodations: Validity versus Disabled Rights.

Peer reviewed

Phillips, S. E. – Applied Measurement in Education, 1994

This article explores the measurement problems associated with granting accommodations for mental disabilities, uses existing case law to construct a legal framework for considering such accommodations, and discusses the advantages and disadvantages of alternative strategies for handling testing accommodation requests. (Author/SLD)

Descriptors: Accessibility (for Disabled), Alternative Assessment, Court Litigation, Elementary Secondary Education

State Assessment and Instructional Change: A Path Model Analysis.

Peer reviewed

Pomplun, Mark – Applied Measurement in Education, 1997

A method to investigate consequential evidence of validity for a state assessment developed to change teacher instructional practices is presented. Survey responses from over 1,000 Kansas teachers were used to construct a path model that allowed effects of the state assessment to be studied at building and teacher levels. (SLD)

Descriptors: Educational Assessment, Educational Change, Instructional Effectiveness, Path Analysis

An Investigation of the Differential Effort Received by Items on a Low-Stakes Computer-Based Test

Peer reviewed

Wise, Steven L. – Applied Measurement in Education, 2006

In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found…

Descriptors: Computer Assisted Testing, Motivation, Test Validity, Item Response Theory

Psychometric Issues in Testing Students with Disabilities.

Peer reviewed

Geisinger, Kurt F. – Applied Measurement in Education, 1994

Federal law requires that individuals with handicapping conditions be administered assessments in ways that accommodate their disabilities without penalizing them. Validation studies are needed to evaluate the meaning of scores resulting from nonstandard test administrations. The limited number of these studies to date is reviewed. (SLD)

Descriptors: Disabilities, Educational Assessment, Elementary School Students, Elementary Secondary Education

Assessing the Language Acquisition Progress of Limited English Proficient Students: Problems and a New Alternative.

Peer reviewed

Royer, James M.; Carlo, Maria S. – Applied Measurement in Education, 1991

Measures of linguistic competence for limited-English-proficient students are discussed. The results for 134 students in grades 3 through 6 from a study of the reliability and validity of the Sentence Verification Technique tests as measures of listening and reading comprehension performance in native languages and English are reported. (TJH)

Descriptors: Bilingual Education, Comparative Testing, Elementary Education, Elementary School Students

The Devaluation of Standardized Testing: One District's Response to a Mandated Assessment.

Peer reviewed

Moore, William P. – Applied Measurement in Education, 1994

Teacher testing-related attitudes and practices related to court-ordered achievement testing were studied through a mail survey completed by 79 elementary school teachers in a midwestern urban district. Teachers engaged in a large number of test preparation practices and reported finding minimal value in purpose or results of testing. (SLD)

Descriptors: Achievement Tests, Court Litigation, Educational Assessment, Educational Practices

Descriptor

Test Validity	13
Test Reliability	6
Computer Assisted Testing	5
Item Response Theory	5
Educational Assessment	3
Elementary Secondary Education	3
Item Analysis	3
Standardized Tests	3
State Programs	3
Test Construction	3
Testing Problems	3
Testing Programs	3
Achievement Tests	2
Automation	2
Comparative Testing	2
Court Litigation	2
Efficiency	2
Elementary Education	2
Elementary School Students	2
Equated Scores	2
High Stakes Tests	2
Psychometrics	2
School Districts	2
Scores	2
Scoring	2

Author

Ben-Simon, Anat	1
Carlo, Maria S.	1
Coffman, Don D.	1
Cohen, Yoav	1
Geisinger, Kurt F.	1
Henly, George A.	1
Lee, Yoonsun	1
Levi, Effi	1
Moore, William P.	1
Phillips, Gary W.	1
Phillips, S. E.	1
Pomplun, Mark	1
Royer, James M.	1
Rupp, André A.	1
Schmitz, Constance C.	1
Taylor, Catherine S.	1
Vispoel, Walter P.	1
Wan, Lei	1
Wise, Steven L.	1
delMas, Robert C.	1