Showing 6,526 to 6,540 of 10,090 results
Arizona Department of Education, 2006
Arizona's Instrument to Measure Standards (AIMS), a Standards-Based test, provides educators and the public with valuable information regarding the progress of Arizona's students toward mastering Arizona's reading, writing and mathematics Standards. This specific test, Arizona's Instrument to Measure Standards Dual Purpose Assessment (AIMS DPA) is…
Descriptors: Grade 8, Reference Materials, Test Items, Scoring
Schafer, William D.; Swanson, Gwenyth; Bene, Nancy; Newberry, George – 1999
The hypothesis that enhanced knowledge of assessment rubrics by teachers and thus by students results in improved student achievement was studied in the context of the development of mandatory high school assessments for the Maryland State Department of Education. Rubrics were under development to score constructed-response items in the content…
Descriptors: Academic Achievement, Biology, Civics, Constructed Response
Peer reviewed
Attali, Yigal; Burstein, Jill – Journal of Technology, Learning, and Assessment, 2006
E-rater® has been used by the Educational Testing Service for automated essay scoring since 1999. This paper describes a new version of e-rater (V.2) that is different from other automated essay scoring systems in several important respects. The main innovations of e-rater V.2 are a small, intuitive, and meaningful set of features used for…
Descriptors: Educational Testing, Test Scoring Machines, Scoring, Writing Evaluation
Cross, Lawrence H. – 1995
This Digest addresses several aspects of assigning grades, beginning with a discussion of the variability of test scores, and reviewing the use of standardized scores, ideas on assigning letter grades, and recommendations for grading. An important characteristic of grades as initially recorded, the variability of the scores of each test or…
Descriptors: Academic Achievement, Elementary Secondary Education, Grades (Scholastic), Grading
Choi, Hee-sook – 1991
Twenty-eight protocols of the Stanford-Binet Fourth Edition (SB:IV) obtained from graduate students were examined for scoring and clerical errors that contributed to the inaccuracy of test scores. Scoring of individual items was identified as the most error prone process, as evidenced by the fact that 96% of the protocols contained scoring errors.…
Descriptors: Error Patterns, Graduate Students, Higher Education, Intelligence Tests
Spray, Judith; Miller, Tim – 1994
Computer simulations under three conditions of polytomous differential item functioning (DIF) compared the ability of three different statistical procedures to detect nonuniform DIF. The procedures were a nominal and an ordinal extension of the Mantel-Haenszel statistic, and logistic discriminant function analysis. Results showed that only the…
Descriptors: Computer Simulation, Identification, Item Bias, Sample Size
Sykes, Robert C.; Ito, Kyoko – 1998
A common procedure for obtaining multiple readings (ratings) for a constructed response item, especially in high-stakes tests, is to have two readers read the papers independently, with a third reading if the results differ by more than one point. This necessitates a scoring rule that specifies how the ratings will be aggregated into a single item…
Descriptors: Ability, Constructed Response, High Stakes Tests, Judges
Kurz, Terri Barber – 1999
Multiple-choice tests are generally scored using a conventional number right scoring method. While this method is easy to use, it has several weaknesses. These weaknesses include decreased validity due to guessing and failure to credit partial knowledge. In an attempt to address these weaknesses, psychometricians have developed various scoring…
Descriptors: Algorithms, Guessing (Tests), Item Response Theory, Multiple Choice Tests
Peer reviewed
Humphreys, Donald W. – Education, 1975
In this study a Q sort was developed to measure pupil self-image of science achievement in high school. The Q sort was described so that similar tests could be constructed to measure affective behavior for most classroom environments. (Editor/RK)
Descriptors: Measurement Instruments, Science Education, Scoring Formulas, Self Concept
Katch, Frank I.; And Others – Research Quarterly, 1974
Descriptors: Human Body, Muscular Strength, Physical Education, Physical Fitness
Peer reviewed
Neville, M. H.; Pugh, A. K. – British Journal of Educational Psychology, 1974
Sixty-six children aged 9-10 years were tested in two groups with two parallel cloze tests of reading comprehension. The same tests were then given as cloze tests of listening comprehension. (Editor)
Descriptors: Cloze Procedure, Educational Psychology, Listening Comprehension, Reading Comprehension
Peer reviewed
Bryant, William H. – Canadian Modern Language Review, 1975
Descriptors: French, Grading, Language Instruction, Scoring
Patience, Wayne; Auchter, Joan – 1988
A central aim in any assessment program is to ensure fair and stable scoring from administration to administration. When administrations are decentralized, not only in location, but in frequency and in logistical configuration, it is imperative to construct training, certifying, and monitoring systems that provide continuity between the original…
Descriptors: Equivalency Tests, Essay Tests, Scoring, Secondary Education
Littlefield, John H.; Troendle, G. Roger – 1987
The effect of different types of rating task instructions on rater behavior was examined using experts, as opposed to novices, as raters. The experts were instructed to (1) form a global categorical judgment (early hypothesis generation); (2) assess 19 detailed elements; or (3) both. Subjects were 8 dental faculty members who ranged in age from 28…
Descriptors: Dentistry, Evaluation Methods, Higher Education, Holistic Evaluation
Allert, Adrienne; And Others – 1982
This paper outlines a methodology for assessing change in the parent-child relationship between infancy and toddlerhood. Specifically, the paper focuses on the utilization of the same or comparable tasks to observe and rate parent-child interactions at infancy and toddlerhood, and points out the scoring modifications that proved helpful in…
Descriptors: Infants, Longitudinal Studies, Parent Child Relationship, Personality