Showing 1 to 15 of 23 results
Peer reviewed
Willse, John T.; Goodman, Joshua T. – Educational and Psychological Measurement, 2008
This research provides a direct comparison of effect size estimates based on structural equation modeling (SEM), item response theory (IRT), and raw scores. Differences between the SEM, IRT, and raw score approaches are examined under a variety of data conditions (IRT models underlying the data, test lengths, magnitude of group differences, and…
Descriptors: Test Length, Structural Equation Models, Effect Size, Raw Scores
Pennsylvania Department of Education, 2010
This handbook describes the responsibilities of district and school assessment coordinators in the administration of the Pennsylvania System of School Assessment (PSSA). This updated guidebook contains the following sections: (1) General Assessment Guidelines for All Assessments; (2) Writing Specific Guidelines; (3) Reading and Mathematics…
Descriptors: Guidelines, Guides, Educational Assessment, Writing Tests
Lautenschlager, Gary J.; Park, Dong-Gun – 1987
The effects of variations in degree of range restriction and different subgroup sample sizes on the validity of several item bias detection procedures based on Item Response Theory (IRT) were investigated in a simulation study. The degree of range restriction for each of two subpopulations was varied by cutting the specified subpopulation ability…
Descriptors: Computer Simulation, Item Analysis, Latent Trait Theory, Mathematical Models
Chang, S. Tai; Bashaw, W. L. – 1984
The purpose of this study was twofold: to investigate to what extent characteristics of anchor tests may affect precision of item calibration, and to estimate to what extent precision of item calibration may be affected by removal of persons whose response patterns deviate from those normally expected from the Rasch one-parameter logistic model.…
Descriptors: Aptitude Tests, Difficulty Level, Equated Scores, Junior High Schools
Hwang, Chi-en; Cleary, T. Anne – 1986
The results obtained from two basic types of pre-equatings of tests were compared: the item response theory (IRT) pre-equating and section pre-equating (SPE). The simulated data were generated from a modified three-parameter logistic model with a constant guessing parameter. Responses of two replication samples of 3000 examinees on two 72-item…
Descriptors: Computer Simulation, Equated Scores, Latent Trait Theory, Mathematical Models
Lutz, William – 1983
After an extensive review of the available research on large-scale writing assessment, certain issues in writing assessment seem to be unresolved, and still other issues are not supported by adequate research. This paper reviews the basic issues in writing assessment, points out which topics are supported by strong research, and which topics are…
Descriptors: Educational Assessment, Essay Tests, Higher Education, Multiple Choice Tests
Livingston, Samuel A. – 1984
Much previously published material for estimating the reliability of classification has been based on the assumption that a test consists of a known number of equally weighted items. The test score is the number of those items answered correctly. These methods cannot be used with classifications based on weighted composite scores, especially if…
Descriptors: Equated Scores, Essay Tests, Estimation (Mathematics), Mathematical Models
Peer reviewed
Hill, Kennedy T.; Wigfield, Allan – Elementary School Journal, 1984
Discusses the problem of and solution to anxiety in school testing situations. Focuses on Hill and his colleagues' long term program of research. Describes school intervention studies where new evaluation procedures and teaching programs have been developed to help students perform better in evaluative situations. (CB)
Descriptors: Elementary School Students, Elementary Secondary Education, Grades (Scholastic), Intervention
Mitchell, Karen J.; Anderson, Judith A. – 1987
The Association of American Medical Colleges is conducting research to develop, implement, and evaluate a Medical College Admission Test (MCAT) essay testing program. Essay administration in the spring and fall of 1985 and 1986 suggested that additional research was needed on the development of topics which elicit similar skills and meet standard…
Descriptors: College Entrance Examinations, Essay Tests, Estimation (Mathematics), Generalizability Theory
Jolly, S. Jean; And Others – 1985
Scores from the Stanford Achievement Tests administered to 50,000 students in Palm Beach County, Florida, were studied in order to determine whether the speeded nature of the reading comprehension subtest was related to inconsistencies in the score profiles. Specifically, the probable effect of random guessing was examined. Reading scores were…
Descriptors: Achievement Tests, Elementary Secondary Education, Guessing (Tests), Item Analysis
Wingersky, Marilyn S.; Lord, Frederic M. – 1983
The sampling errors of maximum likelihood estimates of item-response theory parameters are studied in the case where both people and item parameters are estimated simultaneously. A check on the validity of the standard error formulas is carried out. The effect of varying sample size, test length, and the shape of the ability distribution is…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Banks, Latent Trait Theory
Hambleton, Ronald K.; And Others – 1987
The study compared two promising item response theory (IRT) item-selection methods, optimal and content-optimal, with two non-IRT item selection methods, random and classical, for use in fixed-length certification exams. The four methods were used to construct 20-item exams from a pool of approximately 250 items taken from a 1985 certification…
Descriptors: Comparative Analysis, Content Validity, Cutting Scores, Difficulty Level
Lenel, Julia C.; Gilmer, Jerry S. – 1986
In some testing programs an early item analysis is performed before final scoring in order to validate the intended keys. As a result, some items which are flawed and do not discriminate well may be keyed so as to give credit to examinees no matter which answer was chosen. This is referred to as all-keying. This research examined how varying the…
Descriptors: Equated Scores, Item Analysis, Latent Trait Theory, Licensing Examinations (Professions)
Samejima, Fumiko – 1986
Item analysis data fitting the normal ogive model were simulated in order to investigate the problems encountered when applying the three-parameter logistic model. Binary item tests containing 10 and 35 items were created, and Monte Carlo methods simulated the responses of 2,000 and 500 examinees. Item parameters were obtained using Logist 5.…
Descriptors: Computer Simulation, Difficulty Level, Guessing (Tests), Item Analysis
Olsen, James B.; And Others – 1986
Student achievement test scores were compared and equated, using three different testing methods: paper-administered, computer-administered, and computerized adaptive testing. The tests were developed from third and sixth grade mathematics item banks of the California Assessment Program. The paper and the computer-administered tests were identical…
Descriptors: Achievement Tests, Adaptive Testing, Comparative Testing, Computer Assisted Testing