Showing all 14 results
Peer reviewed
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
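The item-by-item comparisons described above are usually made with an agreement statistic such as quadratic weighted kappa (QWK), the de facto standard in automated scoring evaluations. The abstract does not specify the paper's ranking procedure, so the sketch below illustrates only the underlying agreement metric; the function name and 0-based score scale are illustrative.

```python
import numpy as np

def quadratic_weighted_kappa(human, machine, n_categories):
    """Quadratic weighted kappa between two raters' integer scores (0-based)."""
    O = np.zeros((n_categories, n_categories))
    for h, m in zip(human, machine):
        O[h, m] += 1
    O /= O.sum()
    # expected joint distribution if the two raters were independent
    E = np.outer(O.sum(axis=1), O.sum(axis=0))
    # quadratic disagreement weights
    i, j = np.indices((n_categories, n_categories))
    W = (i - j) ** 2 / (n_categories - 1) ** 2
    return 1.0 - (W * O).sum() / (W * E).sum()

# e.g., quadratic_weighted_kappa([0, 1, 2, 2], [0, 2, 2, 1], n_categories=3)
```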
Peer reviewed
Finch, W. Holmes – Applied Measurement in Education, 2016
Differential item functioning (DIF) assessment is a crucial component in test construction, serving as the primary way in which instrument developers ensure that measures perform in the same way for multiple groups within the population. When such is not the case, scores may not accurately reflect the trait of interest for all individuals in the…
Descriptors: Test Bias, Monte Carlo Methods, Comparative Analysis, Population Groups
Peer reviewed
Kang, Hyeon-Ah; Lu, Ying; Chang, Hua-Hua – Applied Measurement in Education, 2017
Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. Three scaling procedures are considered: (a) concurrent…
Descriptors: Item Response Theory, Accuracy, Educational Assessment, Test Items
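The abstract is cut off before naming all three procedures, but placing separately calibrated field-test items on an existing metric is commonly done with a linear transformation of the item parameters. A minimal sketch of mean/sigma linking, assuming Rasch-type difficulties estimated for a set of anchor items in both calibrations (all names illustrative):

```python
import numpy as np

def mean_sigma_link(b_anchor_new, b_anchor_ref, b_new_pool):
    """Place newly calibrated item difficulties on the reference metric,
    using anchor items whose difficulties were estimated in both runs."""
    b_new = np.asarray(b_anchor_new, dtype=float)
    b_ref = np.asarray(b_anchor_ref, dtype=float)
    A = b_ref.std(ddof=1) / b_new.std(ddof=1)   # slope of the linear link
    B = b_ref.mean() - A * b_new.mean()         # intercept
    return A * np.asarray(b_new_pool, dtype=float) + B
```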
Peer reviewed
Lee, Wooyeol; Cho, Sun-Joo – Applied Measurement in Education, 2017
Utilizing a longitudinal item response model, this study investigated the effect of item parameter drift (IPD) on item parameters and person scores via a Monte Carlo study. Item parameter recovery was investigated for various IPD patterns in terms of bias and root mean-square error (RMSE), and the percentage of time the 95% confidence interval covered…
Descriptors: Item Response Theory, Test Items, Bias, Computation
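Bias, RMSE, and confidence-interval coverage can be computed directly from the replications. A minimal sketch of these standard recovery summaries (the Monte Carlo design itself is the paper's; this helper is illustrative):

```python
import numpy as np

def recovery_summary(estimates, std_errors, true_value, z=1.96):
    """Bias, RMSE, and 95% CI coverage for one parameter across replications."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    bias = (est - true_value).mean()
    rmse = np.sqrt(((est - true_value) ** 2).mean())
    coverage = (np.abs(est - true_value) <= z * se).mean()
    return bias, rmse, coverage
```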
Peer reviewed
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
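Comparative judgment studies typically scale the pairwise decisions with a Bradley-Terry-type model; whether this paper uses exactly that model is not stated in the excerpt. The sketch below shows one standard way to turn judge decisions into essay scores, via the MM algorithm, assuming every essay wins and loses at least once:

```python
import numpy as np

def bradley_terry_scores(wins, n_iter=500):
    """MM algorithm for the Bradley-Terry model.
    wins[i, j] = number of judgments in which essay i beat essay j."""
    n = wins.shape[0]
    s = np.ones(n)                      # essay 'strength' parameters
    pair = wins + wins.T                # comparisons made between each pair
    total_wins = wins.sum(axis=1)
    for _ in range(n_iter):
        denom = (pair / (s[:, None] + s[None, :])).sum(axis=1)
        s = total_wins / denom
        s /= s.sum()                    # fix the scale
    return np.log(s)                    # logit-scale essay scores
```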
Peer reviewed
Dogan, Enis; Ogut, Burhan; Kim, Young Yee – Applied Measurement in Education, 2015
The relationship between reading skills in earlier grades and achieving "Proficiency" on the National Assessment of Educational Progress (NAEP) grade 8 reading assessment was examined by establishing a statistical link between NAEP and the Early Childhood Longitudinal Study (ECLS) grade 8 reading assessments using data from a common…
Descriptors: Reading Skills, National Competency Tests, Reading Tests, Grade 8
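The abstract is truncated before describing the linking method. One common way to establish such a statistical link when both assessments are taken by a common sample is equipercentile linking, sketched below purely as an illustration (not necessarily the authors' method):

```python
import numpy as np

def equipercentile_link(scores_x, scores_y, x_points):
    """Map each score in x_points to the Y score with the same percentile rank."""
    scores_x = np.asarray(scores_x, dtype=float)
    ranks = np.array([np.mean(scores_x <= v) for v in x_points])
    return np.quantile(scores_y, ranks)
```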
Peer reviewed
Davis, Laurie Laughlin; Kong, Xiaojing; McBride, Yuanyuan; Morrison, Kristin M. – Applied Measurement in Education, 2017
The definition of what it means to take a test online continues to evolve with the inclusion of a broader range of item types and a wide array of devices used by students to access test content. To assure the validity and reliability of test scores for all students, device comparability research should be conducted to evaluate the impact of…
Descriptors: Educational Technology, Technology Uses in Education, High School Students, Tests
Peer reviewed
Oliveri, Maria Elena; Lawless, Rene; Robin, Frederic; Bridgeman, Brent – Applied Measurement in Education, 2018
We analyzed a pool of items from an admissions test for differential item functioning (DIF) for groups based on age, socioeconomic status, citizenship, or English language status using Mantel-Haenszel and item response theory. DIF items were systematically examined to identify their possible sources by item type, content, and wording. DIF was…
Descriptors: Test Bias, Comparative Analysis, Item Banks, Item Response Theory
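For each item, the Mantel-Haenszel procedure named above reduces to a common odds ratio across ability strata, usually reported on the ETS delta scale. A minimal sketch (array names are illustrative):

```python
import numpy as np

def mh_ddif(ref_right, ref_wrong, foc_right, foc_wrong):
    """Mantel-Haenszel D-DIF for one item. Each array holds, per total-score
    stratum k, correct/incorrect counts in the reference and focal groups."""
    A = np.asarray(ref_right, dtype=float)   # reference correct
    B = np.asarray(ref_wrong, dtype=float)   # reference incorrect
    C = np.asarray(foc_right, dtype=float)   # focal correct
    D = np.asarray(foc_wrong, dtype=float)   # focal incorrect
    N = A + B + C + D
    alpha_mh = (A * D / N).sum() / (B * C / N).sum()  # common odds ratio
    # ETS delta metric; |D-DIF| of roughly 1.5+ is conventionally large DIF
    return -2.35 * np.log(alpha_mh)
```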
Peer reviewed
Kong, Xiaojing; Davis, Laurie Laughlin; McBride, Yuanyuan; Morrison, Kristin – Applied Measurement in Education, 2018
Item response time data were used in investigating the differences in student test-taking behavior between two device conditions: computer and tablet. Analyses were conducted to address the questions of whether or not the device condition had a differential impact on rapid guessing and solution behaviors (with response time effort used as an…
Descriptors: Educational Technology, Technology Uses in Education, Computers, Handheld Devices
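Response time effort (RTE), the index the abstract cites, is the proportion of a student's responses classified as solution behavior rather than rapid guessing, following Wise and Kong. A minimal sketch, assuming per-item rapid-guessing time thresholds have already been set:

```python
import numpy as np

def response_time_effort(response_times, thresholds):
    """RTE index for one examinee: the proportion of items answered with
    solution behavior, i.e., response time above the item's
    rapid-guessing threshold."""
    rt = np.asarray(response_times, dtype=float)
    th = np.asarray(thresholds, dtype=float)
    return (rt > th).mean()
```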
Peer reviewed
Buzick, Heather; Oliveri, Maria Elena; Attali, Yigal; Flor, Michael – Applied Measurement in Education, 2016
Automated essay scoring is a developing technology that can provide efficient scoring of large numbers of written responses. Its use in higher education admissions testing provides an opportunity to collect validity and fairness evidence to support current uses and inform its emergence in other areas such as K-12 large-scale assessment. In this…
Descriptors: Essays, Learning Disabilities, Attention Deficit Hyperactivity Disorder, Scoring
Peer reviewed
Briggs, Derek C. – Applied Measurement in Education, 2008
This article illustrates the use of an explanatory item response modeling (EIRM) approach in the context of measuring group differences in science achievement. The distinction between item response models and EIRMs, recently elaborated by De Boeck and Wilson (2004), is presented within the statistical framework of generalized linear mixed models.…
Descriptors: Science Achievement, Science Tests, Measurement, Error of Measurement
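In the GLMM formulation referenced here, the Rasch model becomes a logistic mixed model in which group differences enter as person-level fixed effects. A schematic latent regression Rasch model (notation mine, not necessarily the article's):

```latex
% Latent regression Rasch model written as a GLMM:
%   y_{pi} = response of person p to item i, G_p = group indicator,
%   b_i    = item difficulty (fixed effect), eps_p = person random effect.
\operatorname{logit} P(y_{pi} = 1 \mid \theta_p) = \theta_p - b_i,
\qquad \theta_p = \gamma G_p + \varepsilon_p,
\qquad \varepsilon_p \sim N(0, \sigma^2)
```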
Peer reviewed
Wollack, James A. – Applied Measurement in Education, 2006
Many of the currently available statistical indexes to detect answer copying lack sufficient power at small α levels or when the amount of copying is relatively small. Furthermore, no single index is uniformly best: depending on the type or amount of copying, certain indexes are better than others. The purpose of this article was…
Descriptors: Statistical Analysis, Item Analysis, Test Length, Sample Size
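Most copying indexes compared in this literature share the same skeleton: count the matching responses between a suspected copier and a source, then standardize against the matches expected under independent responding from an IRT model. A generic sketch of that skeleton (Wollack's ω derives the probabilities from the nominal response model; this version takes them as given):

```python
import numpy as np

def match_z(observed_matches, match_probs):
    """Generic answer-copying statistic: standardize the observed number of
    identical responses between a suspected copier and a source against the
    matches expected under independent responding.
    match_probs[i] = model-based probability that the copier independently
    gives the source's answer on item i."""
    p = np.asarray(match_probs, dtype=float)
    expected = p.sum()
    sd = np.sqrt((p * (1 - p)).sum())
    return (observed_matches - expected) / sd
```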
Peer reviewed
Porter, Andrew C.; Smithson, John; Blank, Rolf; Zeidner, Timothy – Applied Measurement in Education, 2007
With the exception of the procedures developed by Porter and colleagues (Porter, 2002), methods of defining and measuring alignment are essentially limited to alignment between tests and standards. Porter's procedures have been generalized to investigating the alignment between content standards, tests, textbooks, and even classroom…
Descriptors: Teaching Methods, Computer Uses in Education, Instructional Innovation, Guidance Programs
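Porter's (2002) alignment index itself is a simple overlap statistic on two content-emphasis distributions defined over the same topic-by-cognitive-demand cells. A minimal sketch:

```python
import numpy as np

def porter_alignment(x, y):
    """Porter's (2002) alignment index between two content distributions.
    x and y are proportions over the same topic-by-cognitive-demand cells;
    each must sum to 1. Returns 1 for identical emphasis, 0 for disjoint."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return 1.0 - np.abs(x - y).sum() / 2.0
```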
Peer reviewed
Plake, Barbara S. – Applied Measurement in Education, 1995
This article provides a framework for the rest of the articles in this special issue comparing the utility of three standard-setting methods with complex performance assessments. The context of the standard setting study is described, and the methods are outlined. (SLD)
Descriptors: Comparative Analysis, Criteria, Decision Making, Educational Assessment