Peer reviewed
Guo, Hongwen – Psychometrika, 2010
After many equatings have been conducted in a testing program, equating errors can accumulate to a degree that is not negligible compared to the standard error of measurement. In this paper, the author investigates the asymptotic accumulative standard error of equating (ASEE) for linear equating methods, including chained linear, Tucker, and…
Descriptors: Testing Programs, Testing, Error of Measurement, Equated Scores
Peer reviewed
Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010
This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…
Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores
Peer reviewed
Jacobsen, Jared; Ackermann, Richard; Eguez, Jane; Ganguli, Debalina; Rickard, Patricia; Taylor, Linda – Journal of Applied Testing Technology, 2011
A computer adaptive test (CAT) is a delivery methodology that serves the larger goals of the assessment system in which it is embedded. A thorough analysis of the assessment system for which a CAT is being designed is critical to ensure that the delivery platform is appropriate and addresses all relevant complexities. As such, a CAT engine must be…
Descriptors: Delivery Systems, Testing Programs, Computer Assisted Testing, Foreign Countries
Peer reviewed
Lovett, Benjamin J. – Review of Educational Research, 2010
Extended time is one of the most common testing accommodations provided to students with disabilities. It is also controversial; critics of extended time accommodations argue that extended time is used too readily, without concern for how it changes the skills measured by tests, leading to scores that cannot be compared fairly with those of other…
Descriptors: Testing Accommodations, Academic Accommodations (Disabilities), Literature Reviews, Meta Analysis
Peer reviewed
Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…
Descriptors: Test Items, Test Content, Testing Programs, Simulation
Peer reviewed
Puhan, Gautam – Applied Measurement in Education, 2009
The purpose of this study is to determine the extent of scale drift on a test that employs cut scores. It was essential to examine scale drift for this testing program because new forms in this testing program are often put on scale through a series of intermediate equatings (known as equating chains). This process may cause equating error to…
Descriptors: Testing Programs, Testing, Measurement Techniques, Item Response Theory
Peer reviewed
Wyse, Adam E.; Mapuranga, Raymond – International Journal of Testing, 2009
Differential item functioning (DIF) analysis is a statistical technique used for ensuring the equity and fairness of educational assessments. This study formulates a new DIF analysis method using the information similarity index (ISI). ISI compares item information functions when data fits the Rasch model. Through simulations and an international…
Descriptors: Test Bias, Evaluation Methods, Test Items, Educational Assessment
Jamgochian, Elisa; Park, Bitnara Jasmine; Nese, Joseph F. T.; Lai, Cheng-Fei; Saez, Leilani; Anderson, Daniel; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2010
In this technical report, we provide reliability and validity evidence for the easyCBM® Reading measures for grade 2 (word and passage reading fluency and multiple choice reading comprehension). Evidence for reliability includes internal consistency and item invariance. Evidence for validity includes concurrent, predictive, and construct…
Descriptors: Grade 2, Reading Comprehension, Testing Programs, Reading Fluency
Peer reviewed
Breithaupt, Krista; Hare, Donovan R. – Educational and Psychological Measurement, 2007
Many challenges exist for high-stakes testing programs offering continuous computerized administration. The automated assembly of test questions to exactly meet content and other requirements, provide uniformity, and control item exposure can be modeled and solved by mixed-integer programming (MIP) methods. A case study of the computerized…
Descriptors: Testing Programs, Psychometrics, Certification, Accounting
Crislip, Marian A.; Chin-Chance, Selvin – 2001
This paper discusses the use of two theories of item analysis and test construction, their strengths and weaknesses, and applications to the design of the Hawaii State Test of Essential Competencies (HSTEC). Traditional analyses of the data collected from the HSTEC field test were viewed from the perspectives of item difficulty levels and item…
Descriptors: Difficulty Level, Item Response Theory, Psychometrics, Reliability
Peer reviewed
Kamata, Akihito; Vaughn, Brandon K. – Learning Disabilities: A Contemporary Journal, 2004
This article provides a brief primer overview of Differential Item Functioning (DIF) analysis. DIF analysis investigates a differential characteristic of a test item between subpopulations of examinees and is useful in detecting possibly biased items toward a particular subpopulation. As demonstration, a dataset from a 40-item math test in a…
Descriptors: Test Bias, Testing Accommodations, Test Items, Testing Programs
Peer reviewed
Russell, Michael; Kavanaugh, Maureen; Masters, Jessica; Higgins, Jennifer; Hoffmann, Thomas – Journal of Applied Testing Technology, 2009
Many students who are deaf or hard-of-hearing are eligible for a signing accommodation for state and other standardized tests. The signing accommodation, however, presents several challenges for testing programs that attempt to administer tests under standardized conditions. One potential solution for many of these challenges is the use of…
Descriptors: Testing Programs, Student Attitudes, Standardized Tests, Academic Achievement
Peer reviewed
Huynh, Huynh; Casteel, Jim – Journal of Educational Statistics, 1985
Two approaches, the minimax approach and the Rasch procedure, are described for the simultaneous determination of passing scores for subtests when the passing score for the total test is known. (Author/LMO)
Descriptors: Cutting Scores, Educational Assessment, Elementary Secondary Education, Latent Trait Theory
Skaggs, Gary; Bourque, Mary Lyn – 1998
Political and legislative pressures have posed a number of measurement issues and challenges to the development of sound, valid voluntary national tests (VNTs). This paper focuses on what appear to be the most difficult technical issues related to the VNT proposed by President Clinton in 1997. Technical issues refer to psychometric issues, as…
Descriptors: Academic Achievement, Achievement Tests, Classification, Difficulty Level
Shorey, Leonard – 1991
Tests in social studies and integrated science given in Saint Vincent, Saint Lucia, Grenada, and Dominica were analyzed by the Organization for Co-operation in Overseas Development (OCOD) Comprehensive Teacher Training Program (CTTP) for discrimination, difficulty, and reliability, as well as other characteristics. There were 767 examinees for the…
Descriptors: Difficulty Level, Elementary Secondary Education, Evaluation Methods, Foreign Countries