Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 22 |
Descriptor
Source
Applied Measurement in… | 62 |
Author
Pomplun, Mark | 3 |
Wise, Steven L. | 3 |
Chen, Wen-Hung | 2 |
Ferrara, Steve | 2 |
Haberman, Shelby | 2 |
Hambleton, Ronald K. | 2 |
Johnson, Eugene | 2 |
Kingsbury, G. Gage | 2 |
Miller, G. Edward | 2 |
Puhan, Gautam | 2 |
Zara, Anthony R. | 2 |
More ▼ |
Publication Type
Journal Articles | 62 |
Reports - Evaluative | 62 |
Information Analyses | 8 |
Speeches/Meeting Papers | 2 |
Collected Works - General | 1 |
Opinion Papers | 1 |
Reports - Research | 1 |
Education Level
Higher Education | 5 |
Elementary Secondary Education | 4 |
Grade 8 | 3 |
Grade 3 | 2 |
Postsecondary Education | 2 |
Elementary Education | 1 |
Grade 11 | 1 |
Grade 2 | 1 |
Grade 4 | 1 |
Grade 5 | 1 |
Grade 6 | 1 |
More ▼ |
Audience
Laws, Policies, & Programs
Assessments and Surveys
National Assessment of… | 3 |
SAT (College Admission Test) | 3 |
Georgia Criterion Referenced… | 1 |
Wechsler Intelligence Scale… | 1 |
Woodcock Johnson Tests of… | 1 |
What Works Clearinghouse Rating
Poe, Mya; Oliveri, Maria Elena; Elliot, Norbert – Applied Measurement in Education, 2023
Since 1952, the "Standards for Educational and Psychological Testing" has provided criteria for developing and evaluating educational and psychological tests and testing practice. Yet, we argue that the foundations, operations, and applications in the "Standards" are no longer sufficient to meet the current U.S. testing demands…
Descriptors: Racism, Social Justice, Standards, Psychological Testing
Canivez, Gary L.; Youngstrom, Eric A. – Applied Measurement in Education, 2019
The Cattell-Horn-Carroll (CHC) taxonomy of cognitive abilities married John Horn and Raymond Cattell's Extended Gf-Gc theory with John Carroll's Three-Stratum Theory. While there are some similarities in arrangements or classifications of tasks (observed variables) within similar broad or narrow dimensions, other salient theoretical features and…
Descriptors: Taxonomy, Cognitive Ability, Intelligence, Cognitive Tests
Abedi, Jamal – Applied Measurement in Education, 2014
Among the several forms of accommodations used in the assessment of English language learners (ELLs), language-based accommodations are the most effective in making assessments linguistically accessible to these students. However, there are significant challenges associated with the implementation of many of these accommodations. This article…
Descriptors: Testing Accommodations, English Language Learners, Language Aptitude, Academic Accommodations (Disabilities)
Kopriva, Rebecca J. – Applied Measurement in Education, 2014
In this commentary, Rebecca Kopriva examines the articles in this special issue by drawing on her experience from three series of investigations examining how English language learners (ELLs) and other students perceive what test items ask and how they can successfully represent what they know. The first series examined the effect of different…
Descriptors: English Language Learners, Test Items, Educational Assessment, Access to Education
Boyd, Aimee M.; Dodd, Barbara; Fitzpatrick, Steven – Applied Measurement in Education, 2013
This study compared several exposure control procedures for CAT systems based on the three-parameter logistic testlet response theory model (Wang, Bradlow, & Wainer, 2002) and Masters' (1982) partial credit model when applied to a pool consisting entirely of testlets. The exposure control procedures studied were the modified within 0.10 logits…
Descriptors: Computer Assisted Testing, Item Response Theory, Test Construction, Models
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Applied Measurement in Education, 2011
The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different…
Descriptors: Testing Programs, Equated Scores, Investigations, Data Analysis
Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…
Descriptors: Test Items, Test Content, Testing Programs, Simulation
Puhan, Gautam – Applied Measurement in Education, 2009
The purpose of this study is to determine the extent of scale drift on a test that employs cut scores. It was essential to examine scale drift for this testing program because new forms in this testing program are often put on scale through a series of intermediate equatings (known as equating chains). This process may cause equating error to…
Descriptors: Testing Programs, Testing, Measurement Techniques, Item Response Theory
Swerdzewski, Peter J.; Harmes, J. Christine; Finney, Sara J. – Applied Measurement in Education, 2011
Many universities rely on data gathered from tests that are low stakes for examinees but high stakes for the various programs being assessed. Given the lack of consequences associated with many collegiate assessments, the construct-irrelevant variance introduced by unmotivated students is potentially a serious threat to the validity of the…
Descriptors: Computer Assisted Testing, Student Motivation, Inferences, Universities
Wan, Lei; Henly, George A. – Applied Measurement in Education, 2012
Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…
Descriptors: Test Items, Test Format, Computer Assisted Testing, Measurement
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Will subscores provide additional information than what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Stone, Elizabeth; Cook, Linda; Cahalan-Laitusis, Cara; Cline, Frederick – Applied Measurement in Education, 2010
This validity study examined differential item functioning (DIF) results on large-scale state standards-based English-language arts assessments at grades 4 and 8 for students without disabilities taking the test under standard conditions and students who are blind or visually impaired taking the test with either a large print or braille form.…
Descriptors: Test Bias, Large Type Materials, Testing Accommodations, Language Arts
Engelhard, George, Jr.; Fincher, Melissa; Domaleski, Christopher S. – Applied Measurement in Education, 2011
This study examines the effects of two test administration accommodations on the mathematics performance of students within the context of a large-scale statewide assessment. The two test administration accommodations were resource guides and calculators. A stratified random sample of schools was selected to represent the demographic…
Descriptors: Testing Accommodations, Disabilities, High Stakes Tests, Program Effectiveness
Kingston, Neal M. – Applied Measurement in Education, 2009
There have been many studies of the comparability of computer-administered and paper-administered tests. Not surprisingly (given the variety of measurement and statistical sampling issues that can affect any one study) the results of such studies have not always been consistent. Moreover, the quality of computer-based test administration systems…
Descriptors: Multiple Choice Tests, Computer Assisted Testing, Printed Materials, Effect Size
Wise, Steven L.; Pastor, Dena A.; Kong, Xiaojing J. – Applied Measurement in Education, 2009
Previous research has shown that rapid-guessing behavior can degrade the validity of test scores from low-stakes proficiency tests. This study examined, using hierarchical generalized linear modeling, examinee and item characteristics for predicting rapid-guessing behavior. Several item characteristics were found significant; items with more text…
Descriptors: Guessing (Tests), Achievement Tests, Correlation, Test Items