van der Linden, Wim J. – 2001
This report contains a review of procedures for computerized assembly of linear, sequential, and adaptive tests. The common approach to these test assembly problems is to view them as instances of constrained combinatorial optimization. For each testing format, several potentially useful objective functions and types of constraints are discussed.…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Construction, Test Format
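To make the constrained combinatorial optimization framing concrete, here is a minimal sketch (item pool, information values, and constraints all invented, not taken from the report): a brute-force search for the three-item linear form that maximizes summed information at theta = 0 while covering every content area. Real assembly problems use integer programming solvers rather than enumeration.

```python
from itertools import combinations

# Hypothetical item pool: (item id, Fisher information at theta = 0, content area).
pool = [
    ("i1", 0.42, "algebra"),  ("i2", 0.35, "algebra"),
    ("i3", 0.51, "geometry"), ("i4", 0.28, "geometry"),
    ("i5", 0.60, "numbers"),  ("i6", 0.33, "numbers"),
]
TEST_LENGTH = 3
AREAS = {"algebra", "geometry", "numbers"}

best_form, best_info = None, -1.0
for form in combinations(pool, TEST_LENGTH):
    if {area for _, _, area in form} != AREAS:   # constraint: cover every content area
        continue
    info = sum(i for _, i, _ in form)            # objective: summed information at theta = 0
    if info > best_info:
        best_form, best_info = form, info

print([item_id for item_id, _, _ in best_form], round(best_info, 2))
```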
Chang, Shun-Wen; Hanson, Bradley A.; Harris, Deborah J. – 2001
The requirement of large sample sizes for calibrating items based on item response theory (IRT) models is not easily met in many practical pretesting situations. Although classical item statistics could be estimated with much smaller samples, the values may not be comparable across different groups of examinees. This study extended the authors'…
Descriptors: Item Response Theory, Pretests Posttests, Sample Size, Test Items
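A small simulation (Rasch model, made-up difficulties, two 200-examinee groups) illustrates the comparability problem the abstract raises: classical difficulty estimates drift with the ability of the calibration group even when the items are identical.

```python
import numpy as np

rng = np.random.default_rng(7)
b = np.array([-1.0, 0.0, 1.0])   # generating Rasch difficulties, identical for both groups

def simulate(theta_mean, n):
    theta = rng.normal(theta_mean, 1.0, size=(n, 1))
    p = 1 / (1 + np.exp(-(theta - b)))          # Rasch probability of a correct response
    return (rng.random((n, b.size)) < p).astype(int)

low, high = simulate(-0.5, 200), simulate(0.5, 200)
# Classical difficulties (proportions correct) shift with group ability,
# even though the generating item parameters are unchanged:
print("lower-ability group :", low.mean(axis=0).round(2))
print("higher-ability group:", high.mean(axis=0).round(2))
```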
Chang, Shun-Wen; Twu, Bor-Yaun – 2001
To satisfy the security requirements of computerized adaptive tests (CATs), efforts have been made to control the exposure rates of optimal items directly by incorporating statistical methods into the item selection procedure. Since differences are likely to occur between the exposure control parameter derivation stage and the operational CAT…
Descriptors: Adaptive Testing, Computer Assisted Testing, Selection, Simulation
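The abstract does not name a specific method, but a common way to control exposure directly within item selection is a Sympson-Hetter-style probabilistic gate, sketched below with invented exposure-control parameters.

```python
import random

# Each item i has an exposure-control parameter k[i]; a *selected* item is
# *administered* only with probability k[i], otherwise the next-best item is tried.
k = {"itemA": 0.35, "itemB": 0.80, "itemC": 1.00}

def administer(ranked_items, rng=random.Random(0)):
    for item in ranked_items:        # candidates ranked by information, best first
        if rng.random() <= k[item]:  # probabilistic exposure gate
            return item
    return ranked_items[-1]          # guaranteed fallback

print(administer(["itemA", "itemB", "itemC"]))
```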
Massachusetts State Dept. of Education, Boston. – 2002
This report shares with educators and the public all of the test items on which spring 2002 student results from the Massachusetts Comprehensive Assessment System (MCAS) are based. The release of these items provides information on the kinds of knowledge and skills students are expected to demonstrate on the MCAS tests. Local educators are…
Descriptors: Achievement Tests, Elementary Secondary Education, State Programs, Test Items
Hwang, Dae-Yeop – 2002
This study compared classical test theory (CTT) and item response theory (IRT). The behavior of the item and person statistics derived from these two measurement frameworks was examined analytically and empirically using a data set obtained from BILOG (R. Mislevy and R. D. Bock, 1997). The example was a 15-item test with a sample size of 600…
Descriptors: Comparative Analysis, Measurement Techniques, Scores, Statistical Distributions
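A rough sketch of the kind of comparison described, using a simulated 600 x 15 response matrix echoing the abstract's design; the logit transform is a textbook bridge between CTT difficulty and IRT difficulty, not the study's actual method.

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.normal(size=(600, 1))
b = np.linspace(-2, 2, 15)                          # generating IRT difficulties
resp = (rng.random((600, 15)) < 1 / (1 + np.exp(-(theta - b)))).astype(int)

p = resp.mean(axis=0)                               # CTT difficulty (proportion correct)
total = resp.sum(axis=1)
r_pb = np.array([np.corrcoef(resp[:, j], total)[0, 1] for j in range(15)])  # discrimination
logit_b = np.log((1 - p) / p)                       # crude logit bridge to IRT difficulty

print("difficulty agreement:", np.corrcoef(logit_b, b)[0, 1].round(3))
print("discrimination range:", r_pb.min().round(2), "-", r_pb.max().round(2))
```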
Dimitrov, Dimiter M. – 2002
Exact formulas for classical error variance are provided for Rasch measurement with logistic distributions. An approximation formula with the normal ability distribution is also provided. With the proposed formulas, the additive contribution of individual items to the population error variance can be determined without knowledge of the other test…
Descriptors: Ability, Error of Measurement, Item Response Theory, Test Items
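The paper's closed-form results are not reproduced here; the sketch below merely illustrates the quantity involved, with made-up difficulties: each Rasch item contributes E_theta[P(1 - P)] to the population error variance, additively and without reference to the other items.

```python
import numpy as np

def item_error_variance(b, mu=0.0, sigma=1.0, nodes=2001):
    """Per-item contribution E_theta[P(theta) * (1 - P(theta))] to the classical
    error variance, integrated numerically over a normal ability distribution
    (a quadrature stand-in for the paper's closed-form formulas)."""
    theta = np.linspace(mu - 6 * sigma, mu + 6 * sigma, nodes)
    dx = theta[1] - theta[0]
    density = np.exp(-0.5 * ((theta - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    p = 1 / (1 + np.exp(-(theta - b)))          # Rasch item response function
    return float(np.sum(p * (1 - p) * density) * dx)

# Contributions are additive, so no knowledge of the other test items is needed:
print(round(sum(item_error_variance(b) for b in (-1.0, 0.0, 1.0)), 4))
```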
Deane, Paul; Sheehan, Kathleen – 2003
This paper is an exploration of the conceptual issues that have arisen in the course of building a natural language generation (NLG) system for automatic test item generation. While natural language processing techniques are applicable to general verbal items, mathematics word problems are particularly tractable targets for natural language…
Descriptors: Natural Language Processing, Semantics, Test Construction, Test Items
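As a toy illustration of template-based generation in this spirit (the template, names, and number ranges are invented), note how the key and the distractors fall out of the same semantic slots that fill the stem; this tractability is what makes word problems attractive NLG targets.

```python
import random

TEMPLATE = ("{name} buys {n} boxes of pencils. Each box holds {k} pencils. "
            "How many pencils does {name} have?")

def generate_item(rng):
    slots = {"name": rng.choice(["Ana", "Ben", "Chao"]),
             "n": rng.randint(2, 9), "k": rng.randint(3, 12)}
    stem = TEMPLATE.format(**slots)
    key = slots["n"] * slots["k"]                       # answer follows from the semantics
    distractors = {key + slots["k"], key - slots["n"],  # plausible miscomputations
                   slots["n"] + slots["k"]} - {key}
    return stem, key, sorted(distractors)

stem, key, distractors = generate_item(random.Random(3))
print(stem, "| key:", key, "| distractors:", distractors)
```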
Lee, Guemin – 1999
Previous studies have indicated that the reliability of test scores composed of testlets is overestimated by conventional item-based reliability estimation methods (S. Sireci, D. Thissen, and H. Wainer, 1991; H. Wainer, 1995; H. Wainer and D. Thissen, 1996; G. Lee and D. Frisbie). In light of these studies, it seems reasonable to ask whether the…
Descriptors: Definitions, Error of Measurement, Estimation (Mathematics), Reliability
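A quick simulation shows the overestimation at issue: when a shared passage effect induces local dependence within testlets, item-level coefficient alpha exceeds the more appropriate testlet-level alpha (all parameters below are invented).

```python
import numpy as np

def cronbach_alpha(X):
    k = X.shape[1]
    return k / (k - 1) * (1 - X.var(axis=0, ddof=1).sum() / X.sum(axis=1).var(ddof=1))

rng = np.random.default_rng(11)
n, n_testlets, items_per = 1000, 4, 5
theta = rng.normal(size=(n, 1))
u = rng.normal(scale=0.8, size=(n, n_testlets))   # shared passage effects -> local dependence
scores = np.concatenate(
    [(rng.random((n, items_per)) < 1 / (1 + np.exp(-(theta + u[:, [t]])))).astype(int)
     for t in range(n_testlets)], axis=1)

testlet_scores = scores.reshape(n, n_testlets, items_per).sum(axis=2)
print("item-level alpha   :", round(cronbach_alpha(scores), 3))           # inflated
print("testlet-level alpha:", round(cronbach_alpha(testlet_scores), 3))
```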
Lee, Guemin – 1998
The primary purpose of this study was to investigate the appropriateness and implication of incorporating a testlet definition into the estimation of the conditional standard error of measurement (SEM) for tests composed of testlets. The five conditional SEM estimation methods used in this study were classified into two categories: item-based and…
Descriptors: Definitions, Error of Measurement, Estimation (Mathematics), Reliability
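For reference, the classic item-based baseline such studies start from is Lord's conditional SEM for a number-correct score; the study's testlet-based variants redefine the test's "parts" and are not reproduced here.

```python
import math

def lord_sem(x, n):
    """Lord's conditional SEM for number-correct score x on an n-item test:
    SEM(x) = sqrt(x * (n - x) / (n - 1)). An item-based method; the study asks
    how such estimates change when testlets, not items, serve as the parts."""
    return math.sqrt(x * (n - x) / (n - 1))

n = 20
for x in (0, 5, 10, 15, 20):       # measurement error peaks at mid-range scores
    print(x, round(lord_sem(x, n), 2))
```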
Egan, Karla L.; Sireci, Stephen G.; Swaminathan, Hariharan; Sweeney, Kevin P. – 1998
The primary purpose of this study was to assess the effect of item bundling on multidimensional data. A second purpose was to compare three methods for assessing dimensionality. Eight multidimensional data sets consisting of 100 items and 1,000 examinees were simulated varying in terms of dimensionality, inter-dimensional correlation, and number…
Descriptors: Certified Public Accountants, Evaluation Methods, Licensing Examinations (Professions), Simulation
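A minimal version of such a simulation (two factors, invented loadings, inter-dimensional correlation 0.5): eigenvalues of the inter-item correlation matrix give a first, scree-style read on dimensionality, one of several assessment methods a study like this might compare.

```python
import numpy as np

rng = np.random.default_rng(5)
n, n_items = 1000, 20
loadings = np.zeros((n_items, 2))
loadings[:10, 0], loadings[10:, 1] = 0.7, 0.7          # 10 items per dimension
factor_cov = np.array([[1.0, 0.5], [0.5, 1.0]])        # inter-dimensional correlation 0.5
F = rng.multivariate_normal([0.0, 0.0], factor_cov, size=n)
X = F @ loadings.T + rng.normal(scale=0.6, size=(n, n_items))

eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
print(eigvals[:4].round(2))    # two dominant eigenvalues point to two dimensions
```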
Li, Yuan H.; Lissitz, Robert W.; Yang, Yu Nu – 1999
Recent years have seen growing use of tests with mixed item formats, e.g., tests containing both dichotomously scored and polytomously scored items. A characteristic curve method (CCM) that matches the two test characteristic curves to place these mixed-format items on the same metric is described and evaluated in this paper under a common-item…
Descriptors: Equated Scores, Estimation (Mathematics), Item Response Theory, Test Format
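A sketch of characteristic curve matching for the dichotomous 2PL case only (the mixed-format extension and the paper's estimation details are omitted; the anchor parameters are invented with a known linear transformation so the recovered constants can be checked).

```python
import numpy as np

theta = np.linspace(-4, 4, 81)

def tcc(a, b):
    """Test characteristic curve: expected number-correct score at each theta (2PL)."""
    return (1 / (1 + np.exp(-a * (theta[:, None] - b)))).sum(axis=1)

# Hypothetical anchor items whose new-form parameters differ from the old form
# by a known transformation (A = 1.1, B = 0.3), so the answer is checkable:
a_new, b_new = np.array([1.10, 0.88, 1.32]), np.array([-1.0, 0.0, 0.5])
a_old, b_old = a_new / 1.1, 1.1 * b_new + 0.3
target = tcc(a_old, b_old)

# Characteristic curve matching: choose A, B to minimize the squared TCC difference.
best = min(((A, B) for A in np.arange(0.5, 1.51, 0.01)
                   for B in np.arange(-1.0, 1.01, 0.01)),
           key=lambda AB: float(np.sum((tcc(a_new / AB[0], AB[0] * b_new + AB[1]) - target) ** 2)))
print("recovered A = %.2f, B = %.2f" % best)
```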
Shen, Linjun – 1999
A multilevel approach was proposed for the assessment of differential item functioning and compared with the traditional logistic regression approach. Data from the Comprehensive Osteopathic Medical Licensing Examination for 2,300 freshman osteopathic medical students were analyzed. The multilevel approach used three-level hierarchical generalized…
Descriptors: Evaluation Methods, Higher Education, Item Bias, Medical Students
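The traditional logistic regression approach it is compared against can be sketched in a few lines: regress item responses on a matching variable and a group indicator, and read uniform DIF off the group coefficient. The data below are simulated with a DIF effect of -0.4; the numpy-only IRLS fitter stands in for any standard logistic regression routine.

```python
import numpy as np

def logit_fit(X, y, iters=25):
    """Logistic regression via iteratively reweighted least squares (numpy only)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        W = p * (1 - p)
        beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return beta

rng = np.random.default_rng(2)
n = 2000
total = rng.normal(size=n)                    # matching variable (e.g., rest score)
group = rng.integers(0, 2, size=n)            # 0 = reference group, 1 = focal group
y = (rng.random(n) < 1 / (1 + np.exp(-(total - 0.4 * group)))).astype(int)  # uniform DIF

X = np.column_stack([np.ones(n), total, group])    # intercept, ability, group
print(logit_fit(X, y).round(2))               # group coefficient near -0.4 flags DIF
```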
Leung, Chi-Keung; Chang, Hua-Hua; Hau, Kit-Tai – 2000
Information based item selection methods in computerized adaptive tests (CATs) tend to choose the item that provides maximum information at an examinee's estimated trait level. As a result, these methods can yield extremely skewed item exposure distributions in which items with high "a" values may be overexposed, while those with low…
Descriptors: Adaptive Testing, Computer Assisted Testing, Selection, Simulation
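The skew is easy to reproduce: under maximum-information selection with a 2PL pool (invented parameters, and no interim ability update, so this is a simplification of a real CAT), high-"a" items are administered constantly while much of the pool goes unused.

```python
import numpy as np

rng = np.random.default_rng(9)
n_items = 200
a, b = rng.uniform(0.4, 2.0, size=n_items), rng.normal(size=n_items)

def info_2pl(theta, a, b):
    p = 1 / (1 + np.exp(-a * (theta - b)))
    return a**2 * p * (1 - p)                 # 2PL Fisher item information

exposure = np.zeros(n_items)
for _ in range(1000):                         # 1000 simulated examinees
    theta_hat, used = rng.normal(), []
    for _ in range(20):                       # 20-item test, no interim ability update
        info = info_2pl(theta_hat, a, b)
        info[used] = -np.inf                  # never readminister an item
        used.append(int(np.argmax(info)))
    exposure[used] += 1
exposure /= 1000

print("max exposure rate:", exposure.max().round(2),
      "| never-used items:", int((exposure == 0).sum()))
```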
Shen, Linjun – 2000
This study assessed the effects of the type of medical curriculum on differential item functioning (DIF) and group differences at the test level in Level 1 of the Comprehensive Osteopathic Medical Licensing Examinations (COMLEX). The study also explored the relationship of the DIF and group differences at the test level. There are generally two…
Descriptors: Curriculum, Item Bias, Licensing Examinations (Professions), Medical Students
Deng, Hui; Chang, Hua-Hua – 2001
The purpose of this study was to compare a proposed revised a-stratified (USTR) method of test item selection with the original a-stratified multistage computerized adaptive testing approach (STR) and with maximum Fisher information selection (FSH), with respect to test efficiency and item pool usage, using simulated computerized…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Selection
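A bare-bones version of the original a-stratified idea (not the revised USTR variant): stratify the pool by discrimination, match on difficulty within the current stratum, and spend low-"a" items early so high-"a" items are saved for later stages. All parameters are invented and interim ability updating is omitted.

```python
import numpy as np

rng = np.random.default_rng(4)
a, b = rng.uniform(0.4, 2.0, size=200), rng.normal(size=200)

# Stratify the pool by discrimination: low-a strata feed the early stages.
K, per_stage = 4, 5
strata = np.array_split(np.argsort(a), K)     # K strata, ascending in a

administered, used, theta_hat = [], set(), 0.0
for stage in range(K):
    for _ in range(per_stage):
        candidates = [j for j in strata[stage] if j not in used]
        j = min(candidates, key=lambda jj: abs(b[jj] - theta_hat))  # b-matching in stratum
        used.add(j)
        administered.append(j)                # (interim ability update omitted)

print(a[administered].round(2))               # discriminations climb stage by stage
```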


