Publication Date
| Date range | Results |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 222 |
| Since 2022 (last 5 years) | 1091 |
| Since 2017 (last 10 years) | 2601 |
| Since 2007 (last 20 years) | 4962 |
Audience
| Audience | Results |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Location | Results |
| --- | --- |
| Turkey | 227 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating
| Rating | Results |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Hendrickson, Amy – Educational Measurement: Issues and Practice, 2007
Multistage tests are those in which sets of items are administered adaptively and are scored as a unit. These tests have all of the advantages of adaptive testing, with more efficient and precise measurement across the proficiency scale as well as time savings, without many of the disadvantages of an item-level adaptive test. As a seemingly…
Descriptors: Adaptive Testing, Test Construction, Measurement Techniques, Evaluation Methods
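The multistage design described above hinges on routing: each stage's item set is scored as a unit, and that score decides which module comes next. The sketch below illustrates the idea with an invented two-stage design and number-correct cut scores; none of the modules or thresholds come from Hendrickson (2007).

```python
# Minimal two-stage multistage-test (MST) routing sketch.
# The stage-1 routing module is scored as a unit (number correct); the
# stage-2 modules and cut scores are illustrative assumptions.

STAGE2_MODULES = {
    "easy":   ["E1", "E2", "E3", "E4", "E5"],
    "medium": ["M1", "M2", "M3", "M4", "M5"],
    "hard":   ["H1", "H2", "H3", "H4", "H5"],
}

def route(stage1_responses: list[bool]) -> str:
    """Pick a stage-2 module from the number-correct score on stage 1."""
    number_correct = sum(stage1_responses)
    if number_correct <= 2:
        return "easy"
    if number_correct <= 4:
        return "medium"
    return "hard"

# Example: 4 of 6 routing items correct -> medium second-stage module.
module = route([True, True, False, True, False, True])
print(module, STAGE2_MODULES[module])
```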
Kim, Seock-Ho; Cohen, Allan S.; Alagoz, Cigdem; Kim, Sukwoo – Journal of Educational Measurement, 2007
Data from a large-scale performance assessment (N = 105,731) were analyzed with five differential item functioning (DIF) detection methods for polytomous items to examine the congruence among the DIF detection methods. Two different versions of the item response theory (IRT) model-based likelihood ratio test, the logistic regression likelihood…
Descriptors: Performance Based Assessment, Performance Tests, Item Response Theory, Test Bias
Passos, Valeria Lima; Berger, Martijn P. F.; Tan, Frans E. – Applied Psychological Measurement, 2007
The early stage of computerized adaptive testing (CAT) refers to the phase of the trait estimation during the administration of only a few items. This phase can be characterized by bias and instability of estimation. In this study, an item selection criterion is introduced in an attempt to lessen this instability: the D-optimality criterion. A…
Descriptors: Test Construction, Test Items, Item Response Theory, Computer Assisted Testing
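The D-optimality criterion named in this abstract selects the next item so as to maximize the determinant of the Fisher information matrix at the provisional trait estimate. The sketch below assumes a unidimensional 2PL model, where that determinant reduces to the scalar item information; the item pool and parameters are invented, not taken from the study.

```python
import math

def p_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under a 2PL item response model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta_hat: float,
                     item_pool: dict[str, tuple[float, float]]) -> str:
    """With one latent trait, the D-optimality criterion (maximize the
    determinant of the information matrix) reduces to maximizing information."""
    return max(item_pool, key=lambda item: item_information(theta_hat, *item_pool[item]))

pool = {"item1": (1.2, -0.5), "item2": (0.8, 0.0), "item3": (1.5, 0.4)}
print(select_next_item(0.2, pool))  # item with the largest information near theta = 0.2
```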
Marshall, Robert C.; Wright, Heather Harris – American Journal of Speech-Language Pathology, 2007
Purpose: The Kentucky Aphasia Test (KAT) is an objective measure of language functioning for persons with aphasia. This article describes materials, administration, and scoring of the KAT; presents the rationale for development of test items; reports information from a pilot study; and discusses the role of the KAT in aphasia assessment. Method:…
Descriptors: Aphasia, Test Format, Language Tests, Expressive Language
Cheng, Ying; Chang, Hua-Hua; Yi, Qing – Applied Psychological Measurement, 2007
Content balancing is an important issue in the design and implementation of computerized adaptive testing (CAT). Content-balancing techniques that have been applied in fixed content balancing, where the number of items from each content area is fixed, include constrained CAT (CCAT), the modified multinomial model (MMM), modified constrained CAT…
Descriptors: Adaptive Testing, Item Analysis, Computer Assisted Testing, Item Response Theory
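One way to picture fixed content balancing in CAT, such as the modified multinomial model (MMM) named above, is to sample the next content area with probability tied to how many items each area still owes toward its quota. The sketch below is a simplified reading of that idea with invented targets; it is not the exact MMM procedure from the article.

```python
import random

def choose_content_area(targets: dict[str, int],
                        administered: dict[str, int]) -> str:
    """Sample the next content area with probability proportional to the
    number of items that area still needs to meet its fixed quota."""
    remaining = {area: targets[area] - administered.get(area, 0) for area in targets}
    remaining = {area: n for area, n in remaining.items() if n > 0}
    areas = list(remaining)
    weights = [remaining[area] for area in areas]
    return random.choices(areas, weights=weights, k=1)[0]

targets = {"algebra": 10, "geometry": 6, "statistics": 4}
administered = {"algebra": 3, "geometry": 5, "statistics": 1}
print(choose_content_area(targets, administered))
```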
Braden, Jeffery P.; Iribarren, Jacqueline A. – Journal of Psychoeducational Assessment, 2007
In this article, the authors review the Wechsler Intelligence Scale for Children-Fourth Edition Spanish (WISC-IV Spanish), a Spanish translation and adaptation of the WISC-IV. The test was developed to measure the intellectual ability of Spanish-speaking children in the United States ages 6 years, 0 months, through 16 years, 11 months. These…
Descriptors: Intelligence Tests, Spanish, Translation, Children
Wainer, Howard; Robinson, Daniel H. – Journal of Educational and Behavioral Statistics, 2007
Fumiko Samejima is best known for her pioneering work in polytomous response item response theory (IRT), yielding the eponymous model that has been used broadly for more than 30 years. In this interview, Samejima, on the verge of retiring from her faculty position at the University of Tennessee, discusses her life and career. She also describes…
Descriptors: Foreign Countries, Psychometrics, Item Response Theory, Test Items
Sireci, Stephen G. – Educational Researcher, 2007
Lissitz and Samuelsen (2007) propose a new framework for conceptualizing test validity that separates analysis of test properties from analysis of the construct measured. In response, the author of this article reviews fundamental characteristics of test validity, drawing largely from seminal writings as well as from the accepted standards. He…
Descriptors: Test Content, Test Validity, Guidelines, Test Items
van Ginkel, Joost R.; van der Ark, L. Andries; Sijtsma, Klaas – Multivariate Behavioral Research, 2007
The performance of five simple multiple imputation methods for dealing with missing data was compared. In addition, random imputation and multivariate normal imputation were used as lower and upper benchmarks, respectively. Test data were simulated and item scores were deleted such that they were either missing completely at random, missing at…
Descriptors: Evaluation Methods, Psychometrics, Item Response Theory, Scores
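As an illustration of the "lower benchmark" in the study above, random imputation can be read as filling each missing item score with a draw from the observed scores of the same item. The sketch below is a minimal, assumption-laden version of that idea, not the authors' procedure or code.

```python
import random

def random_impute(item_scores: list[list]) -> list[list]:
    """Fill each missing entry (None) with a random draw from the observed
    scores of the same item (column) -- the 'lower benchmark' style of imputation."""
    n_items = len(item_scores[0])
    completed = [row[:] for row in item_scores]
    for j in range(n_items):
        observed = [row[j] for row in item_scores if row[j] is not None]
        for row in completed:
            if row[j] is None:
                row[j] = random.choice(observed)
    return completed

data = [[1, 0, None], [None, 1, 2], [2, 1, 0]]
print(random_impute(data))
```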
Ramirez, Sylvia Z.; Lukenbill, James F. – Research in Developmental Disabilities: A Multidisciplinary Journal, 2007
This paper describes the development of the Fear Survey for Adults with Mental Retardation (FSAMR) and provides initial evidence of its psychometric properties. The FSAMR was designed to be sensitive to the assessment needs of individuals with mental retardation. The items were developed through open-ended interviews, a review of existing…
Descriptors: Psychometrics, Test Validity, Fear, Mental Retardation
Scharf, Eric M.; Baldwin, Lynne P. – Active Learning in Higher Education: The Journal of the Institute for Learning and Teaching, 2007
The reasoning behind popular methods for analysing the raw data generated by multiple choice question (MCQ) tests is not always appreciated, occasionally with disastrous results. This article discusses and analyses three options for processing the raw data produced by MCQ tests. The article shows that one extreme option is not to penalize a…
Descriptors: Guessing (Tests), Test Items, Multiple Choice Tests, Questioning Techniques
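The scoring options discussed in this article contrast counting only correct answers with penalizing wrong ones. One common penalty is the classical correction for guessing, R - W / (k - 1) for k-option items, which makes blind guessing worth zero in expectation; whether this matches the article's exact options is an assumption. A minimal sketch:

```python
def number_right(correct: int) -> float:
    """Option 1: no penalty -- count only correct answers."""
    return float(correct)

def formula_scored(correct: int, wrong: int, n_options: int) -> float:
    """Option 2: classical correction for guessing, R - W / (k - 1), which
    gives an expected score of zero for blind guessing on k-option items."""
    return correct - wrong / (n_options - 1)

# 40-item test with 4 options per item: 28 right, 9 wrong, 3 omitted.
print(number_right(28))          # 28.0
print(formula_scored(28, 9, 4))  # 25.0
```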
Weiss, Michael Kevin – ProQuest LLC, 2009
How can the secondary Geometry course serve as an opportunity for students to learn to "be like" a mathematician--that is, to acquire a mathematical sensibility? In the first part of this dissertation, I investigate what might be meant by "mathematical sensibility". By analyzing narratives of mathematicians and their work, I identify a collection…
Descriptors: Feedback (Response), Geometry, Mathematics Instruction, Secondary School Mathematics
Al-Shabatat, Ahmad Mohammad; Abbas, Merza; Ismail, Hairul Nizam – International Journal of Special Education, 2009
Many people believe that environmental factors promote giftedness and invest in many programs to support gifted students by providing them with challenging activities. Intellectual giftedness is founded on fluid intelligence and extends to more specific abilities through growth and inputs from the environment. Acknowledging the roles played by the…
Descriptors: Intelligence, Test Items, Academically Gifted, Foreign Countries
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology
Sawaki, Yasuyo; Kim, Hae-Jin; Gentile, Claudia – Language Assessment Quarterly, 2009
In cognitive diagnosis, a Q-matrix (Tatsuoka, 1983, 1990), which is an incidence matrix that defines the relationships between test items and constructs of interest, has a great impact on the nature of the performance feedback that can be provided to score users. The purpose of the present study was to identify meaningful skill coding categories that…
Descriptors: Feedback (Response), Test Items, Test Content, Identification
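A Q-matrix, as defined in this abstract, is simply an incidence matrix linking test items to the skills they are assumed to require; skill-level feedback is read off its columns. The toy example below uses invented items and skills, not the coding categories identified by Sawaki, Kim, and Gentile.

```python
# A toy Q-matrix: rows are test items, columns are skills (constructs of
# interest), and a 1 means the item is assumed to require that skill.
SKILLS = ["vocabulary", "inference", "syntax"]
Q_MATRIX = {
    "item1": [1, 0, 0],
    "item2": [1, 1, 0],
    "item3": [0, 1, 1],
}

def skills_required(item: str) -> list[str]:
    """Read off which skills an item measures -- the basis for skill-level feedback."""
    return [skill for skill, needed in zip(SKILLS, Q_MATRIX[item]) if needed]

print(skills_required("item2"))  # ['vocabulary', 'inference']
```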
