Showing 1 to 15 of 62 results
Peer reviewed
Poe, Mya; Oliveri, Maria Elena; Elliot, Norbert – Applied Measurement in Education, 2023
Since 1952, the "Standards for Educational and Psychological Testing" has provided criteria for developing and evaluating educational and psychological tests and testing practice. Yet, we argue that the foundations, operations, and applications in the "Standards" are no longer sufficient to meet the current U.S. testing demands…
Descriptors: Racism, Social Justice, Standards, Psychological Testing
Peer reviewed
Canivez, Gary L.; Youngstrom, Eric A. – Applied Measurement in Education, 2019
The Cattell-Horn-Carroll (CHC) taxonomy of cognitive abilities married John Horn and Raymond Cattell's Extended Gf-Gc theory with John Carroll's Three-Stratum Theory. While there are some similarities in arrangements or classifications of tasks (observed variables) within similar broad or narrow dimensions, other salient theoretical features and…
Descriptors: Taxonomy, Cognitive Ability, Intelligence, Cognitive Tests
Peer reviewed
Abedi, Jamal – Applied Measurement in Education, 2014
Among the several forms of accommodations used in the assessment of English language learners (ELLs), language-based accommodations are the most effective in making assessments linguistically accessible to these students. However, there are significant challenges associated with the implementation of many of these accommodations. This article…
Descriptors: Testing Accommodations, English Language Learners, Language Aptitude, Academic Accommodations (Disabilities)
Peer reviewed
Kopriva, Rebecca J. – Applied Measurement in Education, 2014
In this commentary, Rebecca Kopriva examines the articles in this special issue by drawing on her experience from three series of investigations examining how English language learners (ELLs) and other students perceive what test items ask and how they can successfully represent what they know. The first series examined the effect of different…
Descriptors: English Language Learners, Test Items, Educational Assessment, Access to Education
Peer reviewed
Boyd, Aimee M.; Dodd, Barbara; Fitzpatrick, Steven – Applied Measurement in Education, 2013
This study compared several exposure control procedures for CAT systems based on the three-parameter logistic testlet response theory model (Wang, Bradlow, & Wainer, 2002) and Masters' (1982) partial credit model when applied to a pool consisting entirely of testlets. The exposure control procedures studied were the modified within 0.10 logits…
Descriptors: Computer Assisted Testing, Item Response Theory, Test Construction, Models
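The "modified within 0.10 logits" idea can be illustrated with a short sketch: rather than always administering the single most informative testlet, the selector chooses at random among testlets whose average difficulty falls within 0.10 logits of the provisional ability estimate, which spreads exposure across the pool. The window, the difficulty summary per testlet, and the function names below are illustrative assumptions, not the paper's implementation.

import random

def select_testlet(theta, testlet_difficulties, window=0.10, rng=random):
    # Randomized-within-window selection (illustrative sketch).
    # theta: provisional ability estimate in logits
    # testlet_difficulties: dict mapping testlet id -> average item difficulty
    # window: half-width of the difficulty window in logits (0.10 here)
    candidates = [tid for tid, b in testlet_difficulties.items()
                  if abs(b - theta) <= window]
    if not candidates:
        # Fall back to the closest testlet if none fall inside the window
        return min(testlet_difficulties,
                   key=lambda tid: abs(testlet_difficulties[tid] - theta))
    # Random choice among near-optimal testlets controls item exposure
    return rng.choice(candidates)

# Example usage with a hypothetical pool of four testlets
pool = {"T1": -0.35, "T2": 0.05, "T3": 0.12, "T4": 0.90}
print(select_testlet(theta=0.08, testlet_difficulties=pool))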
Peer reviewed
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Applied Measurement in Education, 2011
The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different…
Descriptors: Testing Programs, Equated Scores, Investigations, Data Analysis
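The synthetic function described above translates directly into code: it mixes the identity with a traditional equating, with a weight w controlling how far the conversion is shrunk toward the identity. The linear-equating stand-in and the weight value below are assumptions chosen for illustration.

def synthetic_equate(x, traditional_equate, w=0.5):
    # Weighted average of a traditional equating function and the identity.
    # w = 1 reproduces the traditional equating; w = 0 treats the forms as parallel.
    return w * traditional_equate(x) + (1.0 - w) * x

# Illustrative traditional method: a linear equating y = a*x + b
linear = lambda x: 1.05 * x + 1.2

for score in (10, 20, 30):
    print(score, synthetic_equate(score, linear, w=0.5))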
Peer reviewed
Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…
Descriptors: Test Items, Test Content, Testing Programs, Simulation
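A minimal version of "model change in Rasch item difficulty as a function of item position change" is an ordinary regression of the difficulty shift on the position shift. The toy data and the simple linear form below are assumptions; the study's actual model may include additional terms.

import numpy as np

# Hypothetical data: change in item position (live minus field test) and the
# observed change in Rasch item difficulty (RID), in logits
position_change = np.array([-20, -10, -5, 0, 5, 10, 20])
rid_change = np.array([-0.12, -0.07, -0.02, 0.00, 0.03, 0.05, 0.11])

# Fit RID change as a linear function of position change
slope, intercept = np.polyfit(position_change, rid_change, deg=1)
print(f"estimated drift per position shift: {slope:.4f} logits")

# Predicted difficulty change for an item moved 15 positions later
print(intercept + slope * 15)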
Peer reviewed
Puhan, Gautam – Applied Measurement in Education, 2009
The purpose of this study is to determine the extent of scale drift on a test that employs cut scores. Examining scale drift was essential for this testing program because new forms are often put on scale through a series of intermediate equatings (known as equating chains). This process may cause equating error to…
Descriptors: Testing Programs, Testing, Measurement Techniques, Item Response Theory
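The equating-chain concern can be illustrated with a small sketch: when each new form is placed on scale through a chain of intermediate linear equatings, small errors in each link compose, and the chained conversion can drift away from a direct equating of the new form to the reference form. The linear links and the size of the per-link discrepancy below are illustrative assumptions.

# Each link converts scores on one form to the scale of the previous form: y = a*x + b
chain = [(1.02, 0.3), (0.98, -0.2), (1.01, 0.4), (0.99, 0.1)]

def chained_convert(x, links):
    # Apply a chain of linear equatings in sequence
    for a, b in links:
        x = a * x + b
    return x

def direct_convert(x):
    # Hypothetical single-link equating of the newest form to the base scale
    return 1.00 * x + 0.5

for raw in (10, 25, 40):
    print(raw, round(chained_convert(raw, chain), 2), round(direct_convert(raw), 2))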
Peer reviewed
Swerdzewski, Peter J.; Harmes, J. Christine; Finney, Sara J. – Applied Measurement in Education, 2011
Many universities rely on data gathered from tests that are low stakes for examinees but high stakes for the various programs being assessed. Given the lack of consequences associated with many collegiate assessments, the construct-irrelevant variance introduced by unmotivated students is potentially a serious threat to the validity of the…
Descriptors: Computer Assisted Testing, Student Motivation, Inferences, Universities
Peer reviewed
Wan, Lei; Henly, George A. – Applied Measurement in Education, 2012
Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…
Descriptors: Test Items, Test Format, Computer Assisted Testing, Measurement
Peer reviewed
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Do subscores provide additional information beyond what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or by the total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
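One widely used way to frame the first question is a proportional-reduction-in-mean-squared-error (PRMSE) comparison in the spirit of Haberman's approach: a subscore adds value only if the true subscore is predicted better from the observed subscore than from the observed total. The reliability and correlation values below are assumed inputs for illustration and are not from the study.

def subscore_added_value(sub_reliability, total_reliability, true_sub_true_total_corr):
    # Compare PRMSE from the observed subscore vs. from the observed total score.
    # sub_reliability: reliability of the observed subscore (= PRMSE from subscore)
    # total_reliability: reliability of the observed total score
    # true_sub_true_total_corr: correlation between true subscore and true total
    prmse_from_subscore = sub_reliability
    prmse_from_total = total_reliability * true_sub_true_total_corr ** 2
    return prmse_from_subscore, prmse_from_total, prmse_from_subscore > prmse_from_total

# Hypothetical values: a fairly unreliable subscore that tracks the total closely
print(subscore_added_value(sub_reliability=0.70,
                           total_reliability=0.92,
                           true_sub_true_total_corr=0.95))

In this made-up example the total score predicts the true subscore better than the observed subscore does, so the subscore would not be reported as adding value.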
Peer reviewed
Stone, Elizabeth; Cook, Linda; Cahalan-Laitusis, Cara; Cline, Frederick – Applied Measurement in Education, 2010
This validity study examined differential item functioning (DIF) results on large-scale state standards-based English-language arts assessments at grades 4 and 8 for students without disabilities taking the test under standard conditions and students who are blind or visually impaired taking the test with either a large print or braille form.…
Descriptors: Test Bias, Large Type Materials, Testing Accommodations, Language Arts
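One standard DIF procedure for dichotomous items is the Mantel-Haenszel method, which compares the odds of a correct response for the two groups within ability strata. The abstract does not say which DIF statistic was used, so the sketch below is only an illustration of the general approach, with made-up counts.

import numpy as np

def mantel_haenszel_odds_ratio(tables):
    # Common MH odds ratio across ability strata.
    # tables: list of 2x2 arrays [[ref_correct, ref_incorrect],
    #                             [focal_correct, focal_incorrect]]
    num = den = 0.0
    for t in tables:
        t = np.asarray(t, dtype=float)
        n = t.sum()
        num += t[0, 0] * t[1, 1] / n   # ref correct * focal incorrect
        den += t[0, 1] * t[1, 0] / n   # ref incorrect * focal correct
    return num / den

# Hypothetical counts in three total-score strata
strata = [[[40, 10], [35, 15]],
          [[30, 20], [28, 22]],
          [[20, 30], [15, 35]]]
alpha_mh = mantel_haenszel_odds_ratio(strata)
# ETS delta scale: values far from 0 suggest DIF favoring one group
print(alpha_mh, -2.35 * np.log(alpha_mh))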
Peer reviewed
Engelhard, George, Jr.; Fincher, Melissa; Domaleski, Christopher S. – Applied Measurement in Education, 2011
This study examines the effects of two test administration accommodations on the mathematics performance of students within the context of a large-scale statewide assessment. The two test administration accommodations were resource guides and calculators. A stratified random sample of schools was selected to represent the demographic…
Descriptors: Testing Accommodations, Disabilities, High Stakes Tests, Program Effectiveness
Peer reviewed
Kingston, Neal M. – Applied Measurement in Education, 2009
There have been many studies of the comparability of computer-administered and paper-administered tests. Not surprisingly (given the variety of measurement and statistical sampling issues that can affect any one study), the results of such studies have not always been consistent. Moreover, the quality of computer-based test administration systems…
Descriptors: Multiple Choice Tests, Computer Assisted Testing, Printed Materials, Effect Size
Peer reviewed
Wise, Steven L.; Pastor, Dena A.; Kong, Xiaojing J. – Applied Measurement in Education, 2009
Previous research has shown that rapid-guessing behavior can degrade the validity of test scores from low-stakes proficiency tests. This study examined, using hierarchical generalized linear modeling, examinee and item characteristics for predicting rapid-guessing behavior. Several item characteristics were found significant; items with more text…
Descriptors: Guessing (Tests), Achievement Tests, Correlation, Test Items
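A minimal version of the predictive part of that analysis: flag responses as rapid guesses using a response-time threshold, then relate per-item rapid-guessing rates to an item characteristic such as the amount of text. The 3-second threshold, the data, and the single predictor are illustrative assumptions; the study used hierarchical generalized linear modeling rather than this simple correlation.

import numpy as np

# Hypothetical response times (seconds) for 5 examinees x 4 items
response_times = np.array([[12.0, 2.1, 30.5, 1.8],
                           [15.2, 25.0, 28.1, 2.2],
                           [ 2.4, 22.3, 35.0, 40.1],
                           [18.7, 2.9, 2.6, 33.3],
                           [20.1, 24.8, 31.2, 2.0]])
word_counts = np.array([25, 80, 120, 160])   # amount of text per item

# Flag rapid-guessing behavior with a simple 3-second threshold
rapid = response_times < 3.0

# Per-item rapid-guessing rate and its relationship to item text length
rates = rapid.mean(axis=0)
print("rapid-guess rate per item:", rates)
print("correlation with word count:", np.corrcoef(word_counts, rates)[0, 1])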