ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	0
Since 2006 (last 20 years)	10

Source

Educational and Psychological…	3
Journal of Applied Testing…	3
Applied Measurement in…	2
Educational Measurement:…	2
Behavioral Research and…	1
Florida Journal of…	1
International Journal of…	1
Journal of Educational…	1
Journal of Educational…	1
Learning Disabilities: A…	1
Psychometrika	1
Review of Educational Research	1

Publication Type

Reports - Evaluative	24
Journal Articles	17
Speeches/Meeting Papers	3
Information Analyses	1
Numerical/Quantitative Data	1

Education Level

Elementary Secondary Education	5
Early Childhood Education	1
Elementary Education	1
Grade 2	1
High Schools	1
Middle Schools	1

Location

Connecticut	1
Dominica	1
Grenada	1
Hawaii	1
Massachusetts	1
Michigan	1
Pennsylvania	1
Saint Lucia	1
Saint Vincent and the…	1
Singapore	1

Laws, Policies, & Programs

No Child Left Behind Act 2001	2
Individuals with Disabilities…	1

Assessments and Surveys

National Assessment of…	2
Program for International…	1
Stanford Achievement Tests	1

Showing 1 to 15 of 24 results

Accumulative Equating Error after a Chain of Linear Equatings

Peer reviewed

Guo, Hongwen – Psychometrika, 2010

After many equatings have been conducted in a testing program, equating errors can accumulate to a degree that is not negligible compared to the standard error of measurement. In this paper, the author investigates the asymptotic accumulative standard error of equating (ASEE) for linear equating methods, including chained linear, Tucker, and…

Descriptors: Testing Programs, Testing, Error of Measurement, Equated Scores

A Comparison of Approaches for Improving the Reliability of Objective Level Scores

Peer reviewed

Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010

This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…

Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores

Design of a Computer-Adaptive Test to Measure English Literacy and Numeracy in the Singapore Workforce: Considerations, Benefits, and Implications

Peer reviewed

Jacobsen, Jared; Ackermann, Richard; Eguez, Jane; Ganguli, Debalina; Rickard, Patricia; Taylor, Linda – Journal of Applied Testing Technology, 2011

A computer adaptive test (CAT) is a delivery methodology that serves the larger goals of the assessment system in which it is embedded. A thorough analysis of the assessment system for which a CAT is being designed is critical to ensure that the delivery platform is appropriate and addresses all relevant complexities. As such, a CAT engine must be…

Descriptors: Delivery Systems, Testing Programs, Computer Assisted Testing, Foreign Countries

Extended Time Testing Accommodations for Students with Disabilities: Answers to Five Fundamental Questions

Peer reviewed

Lovett, Benjamin J. – Review of Educational Research, 2010

Extended time is one of the most common testing accommodations provided to students with disabilities. It is also controversial; critics of extended time accommodations argue that extended time is used too readily, without concern for how it changes the skills measured by tests, leading to scores that cannot be compared fairly with those of other…

Descriptors: Testing Accommodations, Academic Accommodations (Disabilities), Literature Reviews, Meta Analysis

Item Position and Item Difficulty Change in an IRT-Based Common Item Equating Design

Peer reviewed

Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009

In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…

Descriptors: Test Items, Test Content, Testing Programs, Simulation

Detecting and Correcting Scale Drift in Test Equating: An Illustration from a Large Scale Testing Program

Peer reviewed

Puhan, Gautam – Applied Measurement in Education, 2009

The purpose of this study is to determine the extent of scale drift on a test that employs cut scores. It was essential to examine scale drift for this testing program because new forms in this testing program are often put on scale through a series of intermediate equatings (known as equating chains). This process may cause equating error to…

Descriptors: Testing Programs, Testing, Measurement Techniques, Item Response Theory

Differential Item Functioning Analysis Using Rasch Item Information Functions

Peer reviewed

Wyse, Adam E.; Mapuranga, Raymond – International Journal of Testing, 2009

Differential item functioning (DIF) analysis is a statistical technique used for ensuring the equity and fairness of educational assessments. This study formulates a new DIF analysis method using the information similarity index (ISI). ISI compares item information functions when data fits the Rasch model. Through simulations and an international…

Descriptors: Test Bias, Evaluation Methods, Test Items, Educational Assessment

Technical Adequacy of the easyCBM Grade 2 Reading Measures. Technical Report #1004

Jamgochian, Elisa; Park, Bitnara Jasmine; Nese, Joseph F. T.; Lai, Cheng-Fei; Saez, Leilani; Anderson, Daniel; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2010

In this technical report, we provide reliability and validity evidence for the easyCBM® Reading measures for grade 2 (word and passage reading fluency and multiple choice reading comprehension). Evidence for reliability includes internal consistency and item invariance. Evidence for validity includes concurrent, predictive, and construct…

Descriptors: Grade 2, Reading Comprehension, Testing Programs, Reading Fluency

Automated Simultaneous Assembly of Multistage Testlets for a High-Stakes Licensing Examination

Peer reviewed

Breithaupt, Krista; Hare, Donovan R. – Educational and Psychological Measurement, 2007

Many challenges exist for high-stakes testing programs offering continuous computerized administration. The automated assembly of test questions to exactly meet content and other requirements, provide uniformity, and control item exposure can be modeled and solved by mixed-integer programming (MIP) methods. A case study of the computerized…

Descriptors: Testing Programs, Psychometrics, Certification, Accounting

Using Traditional Psychometric Methodologies and the Rasch Model in Designing a Test.

Crislip, Marian A.; Chin-Chance, Selvin – 2001

This paper discusses the use of two theories of item analysis and test construction, their strengths and weaknesses, and applications to the design of the Hawaii State Test of Essential Competencies (HSTEC). Traditional analyses of the data collected from the HSTEC field test were viewed from the perspectives of item difficulty levels and item…

Descriptors: Difficulty Level, Item Response Theory, Psychometrics, Reliability

An Introduction to Differential Item Functioning Analysis

Peer reviewed

Kamata, Akihito; Vaughn, Brandon K. – Learning Disabilities: A Contemporary Journal, 2004

This article provides a brief primer overview of Differential Item Functioning (DIF) analysis. DIF analysis investigates a differential characteristic of a test item between subpopulations of examinees and is useful in detecting possibly biased items toward a particular subpopulation. As demonstration, a dataset from a 40-item math test in a…

Descriptors: Test Bias, Testing Accommodations, Test Items, Testing Programs

Computer-Based Signing Accommodations: Comparing a Recorded Human with an Avatar

Peer reviewed

Russell, Michael; Kavanaugh, Maureen; Masters, Jessica; Higgins, Jennifer; Hoffmann, Thomas – Journal of Applied Testing Technology, 2009

Many students who are deaf or hard-of-hearing are eligible for a signing accommodation for state and other standardized tests. The signing accommodation, however, presents several challenges for testing programs that attempt to administer tests under standardized conditions. One potential solution for many of these challenges is the use of…

Descriptors: Testing Programs, Student Attitudes, Standardized Tests, Academic Achievement

A Comparison of the Minimax and Rasch Approaches to Set Simultaneous Passing Scores for Subtests.

Peer reviewed

Huynh, Huynh; Casteel, Jim – Journal of Educational Statistics, 1985

Two approaches, the minimax approach and the Rasch procedure, are described for the simultaneous determination of passing scores for subtests when the passing score for the total test is known. (Author/LMO)

Descriptors: Cutting Scores, Educational Assessment, Elementary Secondary Education, Latent Trait Theory

Overview of the Most Difficult Technical Issues on the VNT.

Skaggs, Gary; Bourque, Mary Lyn – 1998

Political and legislative pressures have posed a number of measurement issues and challenges to the development of sound, valid voluntary national tests (VNTs). This paper focuses on what appear to be the most difficult technical issues related to the VNT proposed by President Clinton in 1997. Technical issues refer to psychometric issues, as…

Descriptors: Academic Achievement, Achievement Tests, Classification, Difficulty Level

OCOD-CTTP Test Evaluation Report.

Shorey, Leonard – 1991

Tests in social studies and integrated science given in Saint Vincent, Saint Lucia, Grenada, and Dominica were analyzed by the Organization for Co-operation in Overseas Development (OCOD) Comprehensive Teacher Training Program (CTTP) for discrimination, difficulty, and reliability, as well as other characteristics. There were 767 examinees for the…

Descriptors: Difficulty Level, Elementary Secondary Education, Evaluation Methods, Foreign Countries

Descriptor

Psychometrics	24
Testing Programs	24
State Programs	9
Educational Assessment	8
Test Construction	8
Elementary Secondary Education	7
Standardized Tests	7
Test Items	7
Test Reliability	7
Item Response Theory	6
Achievement Tests	5
Computer Assisted Testing	5
Evaluation Methods	5
Test Use	5
Test Validity	5
Academic Achievement	4
Comparative Analysis	4
Difficulty Level	4
High Stakes Tests	4
Performance Based Assessment	4
Scoring	4
Student Evaluation	4
Test Content	4
Alternative Assessment	3
Educational Technology	3

Author

Ackermann, Richard	1
Alonzo, Julie	1
Anderson, Daniel	1
Bourque, Mary Lyn	1
Breithaupt, Krista	1
Burns, Matthew	1
Carvajal, Jorge	1
Casteel, Jim	1
Chin-Chance, Selvin	1
Crislip, Marian A.	1
Eguez, Jane	1
Ferrara, Steven	1
Ganguli, Debalina	1
Guo, Hongwen	1
Hare, Donovan R.	1
Higgins, Jennifer	1
Hoffmann, Thomas	1
Huynh, Huynh	1
Jacobsen, Jared	1
Jamgochian, Elisa	1
Kahl, Stuart R.	1
Kamata, Akihito	1
Kavanaugh, Maureen	1
Lai, Cheng-Fei	1