Hogan, Thomas; DeStefano, Marissa; Gilby, Caitlin; Kosman, Dana; Peri, Joshua – Applied Measurement in Education, 2021
Buros' "Mental Measurements Yearbook (MMY)" has provided professional reviews of commercially published psychological and educational tests for over 80 years. It serves as a kind of conscience for the testing industry. For a random sample of 50 entries in the "19th MMY" (a total of 100 separate reviews) this study determined…
Descriptors: Test Reviews, Interrater Reliability, Psychological Testing, Educational Testing
Sarah Alahmadi; Christine E. DeMars – Applied Measurement in Education, 2024
Large-scale educational assessments are sometimes considered low-stakes, increasing the possibility of confounding true performance level with low motivation. These concerns are amplified in remote testing conditions. To remove the effects of low effort levels in responses observed in remote low-stakes testing, several motivation filtering methods…
Descriptors: Multiple Choice Tests, Item Response Theory, College Students, Scores
Youn Seon Lim – Applied Measurement in Education, 2024
Educational testing has been criticized for its disconnect from modern cognitive science and its limited role in improving instruction and student learning. Reform efforts emphasize the need for testing to provide specific diagnostic insights into students' skills and knowledge. Cognitive diagnosis (CD), an emerging paradigm in educational…
Descriptors: Q Methodology, Matrices, Models, Design
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Applied Measurement in Education, 2011
The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different…
Descriptors: Testing Programs, Equated Scores, Investigations, Data Analysis
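The weighted average described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the names `synthetic_link` and `mean_equate`, the 2-point shift, and the convention that the weight `w` applies to the equating function (with `1 - w` on the identity) are all assumptions made for the example.

```python
from typing import Callable


def synthetic_link(x: float, equate: Callable[[float], float], w: float) -> float:
    """Weighted average of an equating function and the identity.

    Assumed convention: w weights the equating function and (1 - w)
    weights the identity, so w = 0 returns the raw score unchanged.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * equate(x) + (1.0 - w) * x


def mean_equate(x: float) -> float:
    """Hypothetical equating function: shifts every score up by 2 points."""
    return x + 2.0


if __name__ == "__main__":
    # Halfway between the identity (30) and the equated score (32).
    print(synthetic_link(30.0, mean_equate, 0.5))
```

With `w = 0` the result is the raw score, and with `w = 1` it is the fully equated score; intermediate weights pull the conversion toward the identity.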
Livingston, Samuel A.; Antal, Judit – Applied Measurement in Education, 2010
A simultaneous equating of four new test forms to each other and to one previous form was accomplished through a complex design incorporating seven separate equating links. Each new form was linked to the reference form by four different paths, and each path produced a different score conversion. The procedure used to resolve these inconsistencies…
Descriptors: Measurement Techniques, Measurement, Educational Assessment, Educational Testing
Lee, Won-Chan; Ban, Jae-Chun – Applied Measurement in Education, 2010
Various applications of item response theory often require linking to achieve a common scale for item parameter estimates obtained from different groups. This article used a simulation to examine the relative performance of four different item response theory (IRT) linking procedures in a random groups equating design: concurrent calibration with…
Descriptors: Item Response Theory, Simulation, Comparative Analysis, Measurement Techniques
Penfield, Randall D. – Applied Measurement in Education, 2007
A widely used approach for categorizing the level of differential item functioning (DIF) in dichotomous items is the scheme proposed by Educational Testing Service (ETS) based on a transformation of the Mantel-Haenszel common odds ratio. In this article two classification schemes for DIF in polytomous items (referred to as the P1 and P2 schemes)…
Descriptors: Simulation, Educational Testing, Test Bias, Evaluation Methods
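For the dichotomous case mentioned in this abstract, the Mantel-Haenszel common odds ratio and its transformation to the ETS delta scale (commonly written MH D-DIF = -2.35 ln α) can be sketched as follows. This is only an illustration under stated assumptions: the function names and the tuple layout (reference correct, reference incorrect, focal correct, focal incorrect, one tuple per matched score level) are hypothetical, the data are invented, and the sketch covers neither the significance tests used in the ETS categories nor the polytomous P1 and P2 schemes from the article.

```python
import math
from typing import Iterable, Tuple

# One 2x2 table per matched score level:
# (reference correct, reference incorrect, focal correct, focal incorrect)
Table = Tuple[float, float, float, float]


def mantel_haenszel_odds_ratio(tables: Iterable[Table]) -> float:
    """Mantel-Haenszel common odds ratio pooled across score levels."""
    num = 0.0
    den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    # Raises ZeroDivisionError if no level contributes to the denominator.
    return num / den


def mh_d_dif(alpha: float) -> float:
    """Transform the common odds ratio to the delta scale."""
    return -2.35 * math.log(alpha)


if __name__ == "__main__":
    # Invented counts for two score levels, for illustration only.
    example = [(20, 10, 10, 20), (30, 10, 20, 20)]
    alpha = mantel_haenszel_odds_ratio(example)
    print(alpha, mh_d_dif(alpha))
```

An odds ratio of 1 maps to 0 on the delta scale. Values above 1 (the item favors the reference group) map to negative MH D-DIF values.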

Sicoly, Fiore – Applied Measurement in Education, 2002
Calculated year-1 to year-2 stability of assessment data from 21 states and 2 Canadian provinces. The median stability coefficient was 0.78 in mathematics and reading, and lower in writing. A stability coefficient of 0.80 is recommended as the standard for large-scale assessments of student performance. (SLD)
Descriptors: Educational Testing, Elementary Secondary Education, Foreign Countries, Mathematics

Fisher, Thomas H. – Applied Measurement in Education, 1988
Future trends in high school basic skills testing programs are discussed. Topics include subject area testing, national comparison testing, test security concerns, and technology's impact. The future is likely to bring more testing, rather than less, but there will be significant changes in the ways tests are implemented. (SLD)
Descriptors: Basic Skills, Educational Change, Educational Testing, Educational Trends

Parkes, Jay; Stevens, Joseph J. – Applied Measurement in Education, 2003
Extrapolates from existing case law on educational testing to the likely ways in which school accountability systems may be challenged in the courts. Also offers a list of recommended actions for accountability system developers. (SLD)
Descriptors: Accountability, Court Litigation, Educational Testing, Elementary Secondary Education
McCarty, F. A.; Oshima, T. C.; Raju, Nambury S. – Applied Measurement in Education, 2007
Oshima, Raju, Flowers, and Slinde (1998) described procedures for identifying sources of differential functioning for dichotomous data using differential bundle functioning (DBF) derived from the differential functioning of items and test (DFIT) framework (Raju, van der Linden, & Fleer, 1995). The purpose of this study was to extend the…
Descriptors: Rating Scales, Test Bias, Scoring, Test Items
Banks, Kathleen – Applied Measurement in Education, 2006
The purpose of this article is to present a working definition of the term "culture," as well as to describe and demonstrate a comprehensive framework for evaluating hypotheses about cultural bias in educational testing. The framework is demonstrated using 5th-grade reading and language arts data from the Terra Nova test (CTB/McGraw-Hill, 1999).…
Descriptors: Test Bias, Educational Testing, Test Items, Hispanic Americans