Ballou, Dale – National Center on Performance Incentives, 2008
As currently practiced, value-added assessment relies on a strong assumption about the scales used to measure student achievement, namely that these are interval scales, with equal-sized gains at all points on the scale representing the same increment of learning. Many of the metrics in which test results are expressed do not have this property…
Descriptors: Test Items, Intervals, Data Analysis, Item Response Theory
Ashvind Nand Singh – ProQuest LLC, 2008
Due to the relative inability of individuals with intellectual disabilities (ID) to provide an accurate and reliable self-report, assessment in this population is more difficult than with individuals in the general population. As such, assessment procedures must be adjusted to compensate for the relative lack of information that the individual can…
Descriptors: Test Items, Item Analysis, Test Construction, Behavior Rating Scales
Liu, Ou Lydia; Lee, Hee-Sun; Hofstetter, Carolyn; Linn, Marcia C. – Educational Assessment, 2008
In response to the demand for sound science assessments, this article presents the development of a latent construct called knowledge integration as an effective measure of science inquiry. Knowledge integration assessments ask students to link, distinguish, evaluate, and organize their ideas about complex scientific topics. The article focuses on…
Descriptors: Standardized Tests, Scoring Rubrics, Psychometrics, Concept Mapping
Lee, Young-Sun; Grossman, Jennifer; Krishnan, Anita – Educational and Psychological Measurement, 2008
This study examined the cultural relevance of adult attachment within a Korean sample (N = 390) using Rasch rating scale modeling. The psychometric properties of scores from the Korean version of the Revised Experiences in Close Relationships, comprised of two subscales of Anxiety (self) and Avoidance (other), were assessed. Results obtained from…
Descriptors: Cultural Relevance, Attachment Behavior, Rating Scales, Psychometrics
Belov, Dmitry I.; Armstrong, Ronald D.; Weissman, Alexander – Applied Psychological Measurement, 2008
This article presents a new algorithm for computerized adaptive testing (CAT) when content constraints are present. The algorithm is based on shadow CAT methodology to meet content constraints but applies Monte Carlo methods and provides the following advantages over shadow CAT: (a) lower maximum item exposure rates, (b) higher utilization of the…
Descriptors: Test Items, Monte Carlo Methods, Law Schools, Adaptive Testing
Hattie, John A. C.; Brown, Gavin T. L. – Journal of Educational Technology Systems, 2008
National assessment systems can be enhanced with effective school-based assessment (SBA) that allows teachers to focus on improvement decisions. Modern computer-assisted technology systems are often used to deploy SBA systems. Since 2000, New Zealand has researched, developed, and deployed a national, computer-assisted SBA system. Eight major…
Descriptors: Computers, Information Technology, Foreign Countries, Computer Uses in Education
Wells, Craig S.; Bolt, Daniel M. – Applied Measurement in Education, 2008
Tests of model misfit are often performed to validate the use of a particular model in item response theory. Douglas and Cohen (2001) introduced a general nonparametric approach for detecting misfit under the two-parameter logistic model. However, the statistical properties of their approach, and empirical comparisons to other methods, have not…
Descriptors: Test Length, Test Items, Monte Carlo Methods, Nonparametric Statistics
Mahoney, Kate – International Journal of Testing, 2008
Education policy in many countries has undergone changes regarding the testing of English Language Learners (ELLs), who by definition are not yet proficient in the language of the test. As policies mandate the inclusion of ELLs in large-scale testing, many question the validity of achievement test scores because the degree to which the test score…
Descriptors: Test Items, Linguistics, Testing, Second Language Learning
Bietau, Lisa Artman – ProQuest LLC, 2011
A foundational mission of our public schools is dedicated to preserving a democratic republic dependent on a literate and actively engaged citizenry. Civic literacy is essential to supporting the rights and responsibilities of all citizens in a democratic society. Civic knowledge is the foundation of our citizens' civic literacy. National…
Descriptors: National Standards, Test Items, Feedback (Response), Citizenship
Ricker, Kathryn L.; von Davier, Alina A. – ETS Research Report Series, 2007
This study explored the effects of external anchor test length on final equating results of several equating methods, including equipercentile (frequency estimation), chained equipercentile, kernel equating (KE) poststratification equating (PSE) with optimal bandwidths, and KE PSE linear (large bandwidths) when using the nonequivalent groups anchor test…
Descriptors: Equated Scores, Test Items, Statistical Analysis, Test Length
Hendrickson, Amy – Educational Measurement: Issues and Practice, 2007
Multistage tests are those in which sets of items are administered adaptively and are scored as a unit. These tests have all of the advantages of adaptive testing, with more efficient and precise measurement across the proficiency scale as well as time savings, without many of the disadvantages of an item-level adaptive test. As a seemingly…
Descriptors: Adaptive Testing, Test Construction, Measurement Techniques, Evaluation Methods
Kim, Seock-Ho; Cohen, Allan S.; Alagoz, Cigdem; Kim, Sukwoo – Journal of Educational Measurement, 2007
Data from a large-scale performance assessment (N = 105,731) were analyzed with five differential item functioning (DIF) detection methods for polytomous items to examine the congruence among the DIF detection methods. Two different versions of the item response theory (IRT) model-based likelihood ratio test, the logistic regression likelihood…
Descriptors: Performance Based Assessment, Performance Tests, Item Response Theory, Test Bias
Passos, Valeria Lima; Berger, Martijn P. F.; Tan, Frans E. – Applied Psychological Measurement, 2007
The early stage of computerized adaptive testing (CAT) refers to the phase of the trait estimation during the administration of only a few items. This phase can be characterized by bias and instability of estimation. In this study, an item selection criterion is introduced in an attempt to lessen this instability: the D-optimality criterion. A…
Descriptors: Test Construction, Test Items, Item Response Theory, Computer Assisted Testing
Marshall, Robert C.; Wright, Heather Harris – American Journal of Speech-Language Pathology, 2007
Purpose: The Kentucky Aphasia Test (KAT) is an objective measure of language functioning for persons with aphasia. This article describes materials, administration, and scoring of the KAT; presents the rationale for development of test items; reports information from a pilot study; and discusses the role of the KAT in aphasia assessment. Method:…
Descriptors: Aphasia, Test Format, Language Tests, Expressive Language
Cheng, Ying; Chang, Hua-Hua; Yi, Qing – Applied Psychological Measurement, 2007
Content balancing is an important issue in the design and implementation of computerized adaptive testing (CAT). Content-balancing techniques that have been applied in fixed content balancing, where the number of items from each content area is fixed, include constrained CAT (CCAT), the modified multinomial model (MMM), modified constrained CAT…
Descriptors: Adaptive Testing, Item Analysis, Computer Assisted Testing, Item Response Theory

