Chaudhary, Shreesh – TESL-EJ, 2008
Courses in Spoken English (SE) are yet to be acceptable in Indian universities because conducting session-end tests in SE is assumed to be logistically difficult and academically problematic. This article argues that it need not necessarily be so; session-end tests can be conducted just as in other courses. With voice recording, preferably a…
Descriptors: Educational Technology, Computer Networks, French, English (Second Language)
MacSwan, Jeff; Mahoney, Kate – Journal of Educational Research & Policy Studies, 2008
Construct validity concerns for the IPT I Oral Grades K-6 Spanish Second Edition (IPT-S) as a measure of native oral language proficiency are examined. The examination included describing a subset of items that contributes most to overall score and native-language proficiency designation. Correlations between this subset of items and the overall…
Descriptors: Language Research, Oral Language, Language Tests, Construct Validity
Frey, Andreas; Hartig, Johannes; Rupp, Andre A. – Educational Measurement: Issues and Practice, 2009
In most large-scale assessments of student achievement, several broad content domains are tested. Because more items are needed to cover the content domains than can be presented in the limited testing time to each individual student, multiple test forms or booklets are utilized to distribute the items to the students. The construction of an…
Descriptors: Measures (Individuals), Test Construction, Theory Practice Relationship, Design
Smith, Richard M.; And Others – 1995
In the mid to late 1970s, considerable research was conducted on the properties of Rasch fit mean squares, resulting in transformations to convert the mean squares into approximate t-statistics. In the late 1980s and the early 1990s, the trend seems to have reversed, with numerous researchers using the untransformed fit mean squares as a means of…
Descriptors: Evaluation Methods, Goodness of Fit, Item Response Theory, Sample Size
Stocking, Martha L.; And Others – 1991
A previously developed method of automatically selecting items for inclusion in a test subject to constraints on item content and statistical properties is applied to real data. Two tests are first assembled by experts in test construction who normally assemble such tests on a routine basis. Using the same pool of items and constraints articulated…
Descriptors: Algorithms, Automation, Coding, Computer Assisted Testing
Linacre, John Michael – 1995
The effects on Rasch measurement of both response underfit (noise) and overfit (mutedness or superuniformity) are described and illustrated. Misfit is identified by mean-square fit statistics. Person separation and reliability are shown to be deceptive indicators of measurement effectiveness when some items exhibit marked overfit. Theoretical…
Descriptors: Children, Goodness of Fit, Item Response Theory, Measurement Techniques
Wang, Tianyou; Zeng, Lingjia – 1996
F. Samejima (1973) proposed a continuous response model in which item response is on a continuous scale rather than some discrete levels. This model has potential because in many psychological and educational assessments, the responses are on a conceptual continuum rather than on some fixed levels. As a first step toward studying the applicability…
Descriptors: Ability, Educational Assessment, Estimation (Mathematics), Item Response Theory
Samejima, Fumiko – 1996
Traditionally, the test score represented by the number of items answered correctly was taken as an indicator of the examinee's ability level. Researchers still tend to think that the number-correct score is a way of ordering individuals with respect to the latent trait. The objective of this study is to depict the benefits of using ability…
Descriptors: Ability, Attitude Measures, Estimation (Mathematics), Models
Stocking, Martha L. – 1989
The success of applications of item response theory (IRT) depends upon the properties of the estimates of model parameters. Many theoretical properties of these estimates have been extensively studied. However, the properties of estimates obtained empirically from real data depend not only on the theoretical results, but also on the data and the…
Descriptors: Estimation (Mathematics), Item Response Theory, Maximum Likelihood Statistics, Models
Lee, William M.; And Others – 1989
Projects to develop an automated item banking and test development system have been undertaken on several occasions at the Air Force Human Resources Laboratory (AFHRL) throughout the past 10 years. Such a system permits the construction of tests in far less time and with a higher degree of accuracy than earlier test construction procedures. This…
Descriptors: Automation, Computer Assisted Testing, Item Banks, Item Response Theory
Krass, Iosif A. – 1998
In the process of item calibration for a computerized adaptive test (CAT), many well-established calibrating packages show weakness in the estimation of item parameters. This paper introduces an on-line calibration algorithm based on the convexity of likelihood functions. This package consists of: (1) an algorithm that estimates examinee ability…
Descriptors: Ability, Adaptive Testing, Algorithms, Computer Assisted Testing
Glas, Cees A. W. – 1998
In this paper it is shown that various violations of the two-parameter logistic (2PL) model can be evaluated using the Lagrange multiplier test (J. Aitchison and S. Silvey, 1958) or the equivalent efficient score test. The tests focus on violation of local stochastic independence and insufficient capture of the form of the item characteristic…
Descriptors: Foreign Countries, Goodness of Fit, Item Response Theory, Maximum Likelihood Statistics
Sireci, Stephen G.; Wiley, Andrew; Keller, Lisa A. – 1998
Seven specific guidelines included in the taxonomy proposed by T. Haladyna and S. Downing (1989) for writing multiple-choice test items were evaluated. These specific guidelines are: (1) avoid the complex multiple-choice, K-type format; (2) state the stem in question format; (3) word the stem positively; (4) avoid the phrase "all of the…
Descriptors: Certified Public Accountants, Licensing Examinations (Professions), Multiple Choice Tests, Test Construction
Spray, Judith A. – 1993
Sequential probability ratio testing (SPRT), which usually is applied in situations requiring a decision between two simple hypotheses or a single decision point, is extended to include situations involving k decision points and [(k + 1)-choose-2] sets of simultaneous, simple hypotheses, where k > 1. The multiple-decision point or…
Descriptors: Classification, Computation, Computer Simulation, Decision Making
Stone, Kathy Kees; And Others – 1983
Looking beyond the overall effectiveness of sensory stimulation, this study aimed to identify specific aspects of infant behavior most responsive to early stimulation. Subjects were 65 premature infants with a birth weight of less than 5 pounds, 8 ounces and a gestational age under 37 weeks. Experimental group members had completed a multimodal…
Descriptors: Comparative Analysis, Discriminant Analysis, Infant Behavior, Premature Infants

