Showing all 13 results
Peer reviewed
Tony Albano; Brian F. French; Thao Thu Vo – Applied Measurement in Education, 2024
Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…
Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests
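As a rough illustration of the interaction-based approach this abstract describes (the authors' exact model is not stated), the sketch below uses logistic-regression DIF with a hypothetical interaction between two grouping variables; all variable names and data are invented.

```python
# Illustrative sketch only: a common way to operationalize "intersectional" DIF
# is to add an interaction between grouping variables (e.g., gender x race) to a
# logistic-regression DIF model, rather than testing each variable separately.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "total": rng.normal(0, 1, n),          # matching criterion (e.g., rest score)
    "gender": rng.integers(0, 2, n),
    "race": rng.integers(0, 2, n),
})
# Simulate an item that is harder only for one intersectional subgroup
logit_p = 0.9 * df["total"] - 0.3 * (df["gender"] * df["race"])
df["item"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Traditional DIF: separate main effects; intersectional DIF: add the interaction
traditional = smf.logit("item ~ total + gender + race", data=df).fit(disp=0)
intersectional = smf.logit("item ~ total + gender * race", data=df).fit(disp=0)

# A likelihood-ratio test of the interaction term flags intersectional DIF
lr = 2 * (intersectional.llf - traditional.llf)
print(f"LR chi-square for gender x race interaction: {lr:.2f}")
```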
Peer reviewed
Keller, Lisa A.; Keller, Robert; Cook, Robert J.; Colvin, Kimberly F. – Applied Measurement in Education, 2016
The equating of tests is an essential process in high-stakes, large-scale testing conducted over multiple forms or administrations. By adjusting for differences in difficulty and placing scores from different administrations of a test on a common scale, equating allows scores from these different forms and administrations to be directly compared…
Descriptors: Item Response Theory, Equated Scores, Test Format, Testing Programs
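To illustrate the idea of placing scores from different administrations on a common scale, here is a minimal mean-sigma IRT linking sketch; the abstract does not say which linking or equating method the authors examine, and the anchor-item difficulties below are made up.

```python
# Mean-sigma linking: rescale new-form parameters so anchor-item difficulties
# match the base calibration, then transform examinee abilities the same way.
import numpy as np

# Difficulty estimates for the same anchor items, calibrated separately on two forms
b_old = np.array([-1.2, -0.4, 0.1, 0.7, 1.5])   # base-scale calibration
b_new = np.array([-1.0, -0.1, 0.4, 1.1, 1.9])   # new-administration calibration

# Linear transformation: theta_base = A * theta_new + B
A = b_old.std(ddof=1) / b_new.std(ddof=1)
B = b_old.mean() - A * b_new.mean()

theta_new = np.array([-0.5, 0.0, 0.8])           # abilities on the new scale
theta_on_base_scale = A * theta_new + B          # now comparable across administrations
print(A, B, theta_on_base_scale)
```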
Peer reviewed
Wyse, Adam E.; Albano, Anthony D. – Applied Measurement in Education, 2015
This article used several data sets from a large-scale state testing program to examine the feasibility of combining general and modified assessment items in computerized adaptive testing (CAT) for different groups of students. Results suggested that several of the assumptions made when employing this type of mixed-item CAT may not be met for…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Testing Programs
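For readers unfamiliar with CAT mechanics, the sketch below shows one item-selection step under a 2PL model (maximum Fisher information at the provisional ability estimate). The item parameters are invented, and the study's actual CAT design, including how general and modified item pools are mixed, is not specified in the abstract.

```python
# One CAT step: pick the unadministered item with maximum information at theta_hat.
import numpy as np

a = np.array([0.8, 1.2, 1.5, 0.9, 1.1])     # discriminations
b = np.array([-1.0, -0.2, 0.3, 0.9, 1.6])   # difficulties
administered = {2}                           # indices already given to the examinee
theta_hat = 0.4                              # current provisional ability estimate

p = 1 / (1 + np.exp(-a * (theta_hat - b)))   # 2PL response probabilities
info = a**2 * p * (1 - p)                    # Fisher information per item
info[list(administered)] = -np.inf           # exclude items already administered

next_item = int(np.argmax(info))
print(f"Next item to administer: {next_item}")
```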
Peer reviewed
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
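The point about underestimated sampling error can be made concrete with the standard cluster-sampling design effect, DEFF = 1 + (m - 1) * ICC; the numbers below are hypothetical and are not taken from the article.

```python
# With students clustered in schools, the simple-random-sampling standard error
# understates the design-based standard error by a factor of sqrt(DEFF).
import math

n = 2000          # students sampled
m = 25            # average students sampled per school
icc = 0.20        # intraclass correlation of scores within schools
sd = 30.0         # score standard deviation

deff = 1 + (m - 1) * icc                  # design effect for a cluster sample
se_srs = sd / math.sqrt(n)                # naive SE assuming simple random sampling
se_cluster = se_srs * math.sqrt(deff)     # SE acknowledging the sampling design

print(f"DEFF = {deff:.1f}; naive SE = {se_srs:.2f}; design-based SE = {se_cluster:.2f}")
```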
Peer reviewed
Buckendahl, Chad W.; Plake, Barbara S.; Davis, Susan L. – Applied Measurement in Education, 2009
The National Assessment of Educational Progress (NAEP) program is a series of periodic assessments administered nationally to samples of students and designed to measure different content areas. This article describes a multi-year study that focused on the breadth of the development, administration, maintenance, and renewal of the assessments in…
Descriptors: National Competency Tests, Audits (Verification), Testing Programs, Program Evaluation
Peer reviewed
Engelhard, George, Jr.; Anderson, David W. – Applied Measurement in Education, 1998
A new approach that uses a Binomial Trials Model (BTM) to examine the quality of judgments from standard-setting judges is presented and illustrated with 26 judges from the Georgia High School Graduation Test. Results suggest that the BTM provides information not available from other methods. (SLD)
Descriptors: Graduation Requirements, High Schools, Judges, Standard Setting (Scoring)
Peer reviewed
Goodman, Dean P.; Hambleton, Ronald K. – Applied Measurement in Education, 2004
A critical, but often neglected, component of any large-scale assessment program is the reporting of test results. In the past decade, a body of evidence has been compiled that raises concerns over the ways in which these results are reported to and understood by their intended audiences. In this study, current approaches for reporting…
Descriptors: Test Results, Student Evaluation, Scores, Testing Programs
Peer reviewed
Sicoly, Fiore – Applied Measurement in Education, 2002
Calculated year-1 to year-2 stability of assessment data from 21 states and 2 Canadian provinces. The median stability coefficient was 0.78 in mathematics and reading, and lower in writing. A stability coefficient of 0.80 is recommended as the standard for large-scale assessments of student performance. (SLD)
Descriptors: Educational Testing, Elementary Secondary Education, Foreign Countries, Mathematics
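A year-to-year stability coefficient of the kind described here can be computed as the correlation between aggregate scores in consecutive years; the school-level data below are simulated for illustration, and the article's exact computation may differ.

```python
# Stability coefficient: correlation of (e.g., school-level) mean scores across years.
import numpy as np

rng = np.random.default_rng(1)
year1 = rng.normal(250, 20, 100)                             # mean scores, 100 schools, year 1
year2 = 0.8 * (year1 - 250) + 250 + rng.normal(0, 12, 100)   # correlated year-2 means

stability = np.corrcoef(year1, year2)[0, 1]
print(f"Stability coefficient: {stability:.2f}")             # compare to the 0.80 benchmark
```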
Peer reviewed
Pomplun, Mark; Sundbye, Nita – Applied Measurement in Education, 1999
Gender differences in answers to constructed-response reading items from a state assessment program were studied with four raters rating approximately 500 papers at two grade levels. Results indicate that number of words written and number of unrelated responses show significant gender differences and are related to holistic scores. (SLD)
Descriptors: Constructed Response, Holistic Evaluation, Reading Tests, Secondary Education
Peer reviewed
Holland, Paul W.; Wainer, Howard – Applied Measurement in Education, 1990
Two attempts to adjust state mean Scholastic Aptitude Test (SAT) scores for differential participation rates are examined. Both attempts are rejected, and five rules for performing adjustments are outlined to foster follow-up checks on untested assumptions. National Assessment of Educational Progress state data are determined to be more accurate.…
Descriptors: College Applicants, College Entrance Examinations, Estimation (Mathematics), Item Bias
Peer reviewed
Gabrielson, Stephen; And Others – Applied Measurement in Education, 1995
The effects of presenting a choice of writing tasks on the quality of essays produced by eleventh graders were studied with 34,200 students in Georgia. The choice condition had no substantive effect on the quality of essays, but race, gender, and the writing task variable did. (SLD)
Descriptors: Essay Tests, Grade 11, High School Students, High Schools
Peer reviewed
Gao, Xiaohong; And Others – Applied Measurement in Education, 1994
This study provides empirical evidence about the sampling variability and generalizability (reliability) of a statewide performance assessment for grade six. Results for 600 students at individual and school levels indicate that task-sampling variability was the major source of measurement error. Rater-sampling variability was negligible. (SLD)
Descriptors: Achievement Tests, Educational Assessment, Elementary School Students, Error of Measurement
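To show how task- and rater-sampling variability enter a generalizability (reliability) coefficient for a person x task x rater design, the sketch below uses hypothetical variance components chosen to echo the abstract's pattern (task variance large, rater variance negligible); they are not the study's estimates.

```python
# D-study style calculation: relative error variance and generalizability coefficient.
var = {"p": 0.30, "pt": 0.45, "pr": 0.01, "ptr,e": 0.20}   # person-related variance components

def g_coefficient(n_tasks, n_raters):
    # relative error variance shrinks as tasks and raters are added
    rel_error = (var["pt"] / n_tasks
                 + var["pr"] / n_raters
                 + var["ptr,e"] / (n_tasks * n_raters))
    return var["p"] / (var["p"] + rel_error)

for n_tasks in (1, 3, 6):
    print(n_tasks, round(g_coefficient(n_tasks, n_raters=2), 2))
# With these components, adding tasks raises reliability far more than adding raters.
```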
Peer reviewed
Moore, William P. – Applied Measurement in Education, 1994
Teacher attitudes and practices related to court-ordered achievement testing were studied through a mail survey completed by 79 elementary school teachers in a midwestern urban district. Teachers engaged in a large number of test preparation practices and reported finding minimal value in the purpose or results of the testing. (SLD)
Descriptors: Achievement Tests, Court Litigation, Educational Assessment, Educational Practices