Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
The Pearson product-moment correlation coefficient between item g and test score X, known as the item-test or item-total correlation ("Rit"), and the item-rest correlation ("Rir") are two of the most used classical estimators of item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP caused by the…
Descriptors: Correlation, Test Items, Scores, Difficulty Level
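
As an illustration of the two estimators named in the abstract above, the following is a minimal sketch, assuming dichotomously scored (0/1) responses held in a list of rows, one row per examinee. The function names `pearson` and `item_discrimination` are hypothetical and do not come from the cited article.

```python
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def item_discrimination(responses, g):
    """Return (Rit, Rir) for item index g.

    Assumption: responses is a list of rows of 0/1 item scores.
    Rit correlates the item with the total score X (item included);
    Rir correlates the item with the rest score (item removed).
    """
    item = [row[g] for row in responses]
    total = [sum(row) for row in responses]
    rest = [t - i for t, i in zip(total, item)]
    return pearson(item, total), pearson(item, rest)
```

In this sketch Rit includes the item in its own criterion, so it will usually come out higher than Rir for the same item; a constant item or constant score gives a zero variance and raises a division error.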
Guangming Li; Zhengyan Liang – SAGE Open, 2024
To investigate the influence of the separation of grade distributions and the ratio of common items on the precision of vertical scaling, this simulation study uses a common-item design with first grade as the base grade. Four grades of 1,000 students each take a 100-item test. Monte Carlo simulation method is used…
Descriptors: Elementary School Students, Grade 1, Grade 2, Grade 3
Quaigrain, Kennedy; Arhin, Ato Kwamina – Cogent Education, 2017
Item analysis is essential in improving items which will be used again in later tests; it can also be used to eliminate misleading items in a test. The study focused on item and test quality and explored the relationship between difficulty index (p-value) and discrimination index (DI) with distractor efficiency (DE). The study was conducted among…
Descriptors: Item Analysis, Teacher Developed Materials, Test Reliability, Educational Assessment
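
The difficulty index and discrimination index mentioned in the abstract above can be sketched as follows. This is an assumption-laden illustration, not the procedure of the cited study: it assumes 0/1 item scores and the common upper-group/lower-group convention, with the group fraction exposed as a parameter (27% is a frequently used default). The function name is hypothetical.

```python
def difficulty_and_discrimination(responses, g, fraction=0.27):
    """Return (p, DI) for item index g.

    p  : proportion of all examinees answering item g correctly.
    DI : proportion correct in the upper group minus the proportion
         correct in the lower group, with groups formed by total score.
    Assumption: responses is a list of rows of 0/1 item scores.
    """
    # Rank examinees from highest to lowest total score.
    ranked = sorted(responses, key=sum, reverse=True)
    # Size of the upper and lower groups (at least one examinee each).
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    p = sum(row[g] for row in responses) / len(responses)
    di = (sum(row[g] for row in upper) - sum(row[g] for row in lower)) / k
    return p, di
```

Distractor efficiency is not computed here because it needs the option each examinee chose, not just right/wrong scores.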
Ling, Guangming; Bochenek, Jennifer; Burkander, Kri – Journal of Education for Business, 2015
By applying multilevel models with random effects, the authors reviewed and synthesized findings from 30 studies that were published in the last 20 years exploring the relationship between the Educational Testing Service Major Field Test for a Bachelor's Degree in Business (MFTB) and related factors. The results suggest that MFTB scores correlated…
Descriptors: Bachelors Degrees, Institutional Research, Educational Testing, Scores
Ling, Guangming – International Journal of Testing, 2016
To investigate a possible iPad-related mode effect, we tested 403 8th graders in Indiana, Maryland, and New Jersey under three mode conditions through random assignment: a desktop computer, an iPad alone, and an iPad with an external keyboard. All students had used an iPad or computer for six months or longer. The 2-hour test included reading, math,…
Descriptors: Educational Testing, Computer Assisted Testing, Handheld Devices, Computers
Newton, Paul E. – Educational Research, 2009
Background: National curriculum tests have been administered in England for well over a decade. Although reliability evidence has been published, critics have argued that there is not enough evidence (of the right kind) and that test results may be insufficiently reliable. Purpose: This article collates and discusses evidence on the reliability of…
Descriptors: National Curriculum, Test Results, Foreign Countries, Elementary Secondary Education
Macy, Marisa; Bagnato, Stephen J. – NHSA Dialog, 2010
The inclusion of young children with disabilities has remained a function of the Head Start program since its inception in the 1960s when the United States Congress mandated that children with disabilities comprise 10% of the Head Start enrollment (Zigler & Styfco, 2000). Standardized, norm-referenced tests used to identify children with…
Descriptors: Performance Based Assessment, Disadvantaged Youth, Norm Referenced Tests, Disabilities

Davis, Ken – Research in the Teaching of English, 1979
Impromptu pre- and post-test essays by 302 students randomly selected from over 80 sections of a first-semester freshman composition course revealed significant improvement. (DD)
Descriptors: College Freshmen, Educational Research, Educational Testing, Higher Education

Williams, Richard H.; Zimmerman, Donald W. – Journal of Experimental Education, 1984
This paper provides a list of 10 salient features of the standard error of measurement, contrasting it to the reliability coefficient. It is concluded that the standard error of measurement should be regarded as a primary characteristic of a mental test. (Author/DWH)
Descriptors: Educational Testing, Error of Measurement, Evaluation Methods, Psychological Testing
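
For readers comparing the two quantities contrasted in the abstract above, the standard classical-test-theory relation is SEM = SD × sqrt(1 − reliability). The sketch below applies it; the function name and the example numbers in the note are illustrative assumptions, not values from the cited paper.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical SEM: observed-score SD times sqrt(1 - reliability).

    Assumption: reliability is a coefficient in the interval [0, 1].
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)
```

For a test with a score standard deviation of 15 and a reliability of 0.91, this gives an SEM of about 4.5 score points, expressed in the same units as the test scores, whereas the reliability coefficient itself is unit-free.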

Chase, Clint – Mid-Western Educational Researcher, 1996
Classical procedures for calculating the two indices of decision consistency (P and Kappa) for criterion-referenced tests require two testings on each child. Huynh, Peng, and Subkoviak have presented one-testing procedures for these indices. These indices can be estimated without any test administration using Ebel's estimates of the mean, standard…
Descriptors: Criterion Referenced Tests, Educational Research, Educational Testing, Estimation (Mathematics)
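
The classical two-testing procedure referred to in the abstract above can be sketched as follows, assuming each examinee has a score from two administrations and a single cut score defines mastery. This illustrates only the two-testing indices; it is not the one-testing or Ebel-based estimation that the article discusses. Names are hypothetical.

```python
def decision_consistency(scores_a, scores_b, cut):
    """Return (P, kappa) for mastery decisions on two administrations.

    P     : proportion of examinees classified the same way both times.
    kappa : P corrected for the agreement expected by chance.
    Assumption: a score at or above `cut` counts as mastery.
    """
    n = len(scores_a)
    pass_a = [s >= cut for s in scores_a]
    pass_b = [s >= cut for s in scores_b]
    p_obs = sum(a == b for a, b in zip(pass_a, pass_b)) / n
    rate_a = sum(pass_a) / n
    rate_b = sum(pass_b) / n
    # Chance agreement: both pass or both fail, if decisions were independent.
    p_chance = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    # Undefined (division by zero) when chance agreement equals 1.
    kappa = (p_obs - p_chance) / (1 - p_chance)
    return p_obs, kappa
```

Kappa is undefined when everyone passes or everyone fails on both administrations, because chance agreement is then 1.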

Klinger, Don A.; Rogers, W. Todd – Alberta Journal of Educational Research, 2003
The estimation accuracy of procedures based on classical test score theory and item response theory (generalized partial credit model) was compared for examinations consisting of multiple-choice and extended-response items. Analysis of British Columbia Scholarship Examination results found an error rate of about 10 percent for both methods, with…
Descriptors: Academic Achievement, Educational Testing, Foreign Countries, High Stakes Tests

Klein, Howard A. – Reading Improvement, 1989
Examines whether using a combined silent reading-listening mode to administer the "Social Studies Inference Test" optimized information gathering. Finds that the combined modality produced more correct inferences than did silent reading alone. Finds only one gender difference--girls' "caution score" was higher than that for…
Descriptors: Data Collection, Educational Testing, Grade 6, Intermediate Grades
Bauer, Joseph J.; Smith, Douglas K. – 1988
Stability of performance on the Kaufman Assessment Battery for Children (K-ABC) and the Stanford-Binet Intelligence Scale: Fourth Edition (S-B:4) over a 1-year interval was examined with a sample of 28 nonhandicapped preschoolers. Each child was administered both tests in counterbalanced order and retested in 1 year with either the K-ABC or the…
Descriptors: Early Childhood Education, Educational Testing, Intelligence Tests, Middle Class

Weiten, Wayne – Journal of Experimental Education, 1982
A comparison of double as opposed to single multiple-choice questions yielded significant differences in regard to item difficulty, item discrimination, and internal reliability, but not concurrent validity. (Author/PN)
Descriptors: Difficulty Level, Educational Testing, Higher Education, Multiple Choice Tests
Houser, Ronald L.; And Others – 1983
This report describes a procedure that promises to improve the stability, accuracy, and efficiency of the employment of latent trait models and an application of the procedure to the Rasch model. Data were collected from the Portland Public Schools Level Tests administered to 25,740 students. Since each of the 173 items (chosen from the total…
Descriptors: Academic Achievement, Educational Testing, Item Banks, Latent Trait Theory