Veldhuijzen, Niels H. – Evaluation in Education: International Progress, 1982
Setting a cutting score, a key problem in criterion-referenced measurement, is discussed within a decision-theoretic approach for the case in which just one student is considered. A minimum information solution is given and compared with approaches that apply when information about a group of students is available. Formulas illustrate the discussion. (CM)
Descriptors: Criterion Referenced Tests, Cutting Scores, Educational Testing, Measurement Techniques
Peer reviewed: Tallmadge, G. Kasten – Evaluation Review, 1982
Correction for guessing does not fulfill its intended function when test takers who have nothing to gain from scoring respond randomly to items they could have answered correctly had they tried. In that case, raw scores underestimate abilities. If random guessing is more prevalent in the control group, correction for guessing inflates treatment effects.…
Descriptors: Guessing (Tests), Research Methodology, Research Problems, Responses
Peer reviewed: Raju, Nambury S. – Educational and Psychological Measurement, 1982
Rajaratnam, Cronbach and Gleser's generalizability formula for stratified-parallel tests and Raju's coefficient beta are generalized to estimate the reliability of a composite of criterion-referenced tests, where the parts have different cutting scores. (Author/GK)
Descriptors: Criterion Referenced Tests, Cutting Scores, Mathematical Formulas, Scoring Formulas
Peer reviewed: Abu-Sayf, F. K. – Educational Review, 1979
The purpose of this article is to discuss some recent developments in the scoring of multiple-choice items from two angles. The first consists of the recent developments in the test instructions of the conventional scoring procedures, and the second consists of a discussion of new scoring methods and formulas. (Author)
Descriptors: Confidence Testing, Guessing (Tests), Measurement Objectives, Multiple Choice Tests
Peer reviewed: Naglieri, Jack A.; Maxwell, Susanna – Perceptual and Motor Skills, 1981
Inter-rater reliability of the Goodenough-Harris and McCarthy Draw-A-Child scoring systems was examined for a sample of 60 children, including 20 school-labeled learning disabled, 20 mentally retarded, and 20 normal children between the ages of six and eight-and-one-half years. (Author)
Descriptors: Correlation, Intelligence Tests, Learning Disabilities, Mental Retardation
Peer reviewed: Pflaum, Susanna W. – Reading Teacher, 1979
Describes a new system for scoring informal reading inventories that helps eliminate problems inherent in other scoring systems. (DD)
Descriptors: Elementary Education, Informal Reading Inventories, Oral Reading, Reading Diagnosis
Peer reviewed: Wallbrown, Fred H.; Fremont, Theodore – Psychology in the Schools, 1980
Findings support Koppitz's assertion that the total error score for the Bender Gestalt Test is stable and reliable. Working time is a relatively stable dimension of Bender performance, which may be of value in assessment activities. Perseveration and integration should not be used in differential diagnosis. (Author)
Descriptors: Children, Educational Diagnosis, Followup Studies, Psychological Testing
Peer reviewed: Wilcox, Rand R. – Educational and Psychological Measurement, 1980
Technical problems in achievement testing associated with using latent structure models to estimate the probability that examinees guess correct responses are studied, as is the absence of such problems when Wilcox's formula score is used. Maximum likelihood estimates are derived which may be applied when items are hierarchically related.…
Descriptors: Guessing (Tests), Item Analysis, Mathematical Models, Maximum Likelihood Statistics
Peer reviewed: Elliott, Max – Journal of Learning Disabilities, 1981
The article reviews current estimation techniques and their resultant effects on the learning disability classroom's composition. An alternate estimation methodology, using z-score conversions, is presented. (Author/SBH)
Descriptors: Achievement Tests, Elementary Secondary Education, Evaluation Methods, Learning Disabilities
Peer reviewed: Brannigan, Gary G.; Brunner, Nancy A. – Journal of School Psychology, 1993
Examined two scoring systems for Modified Version of the Bender-Gestalt Test. Administered Bender-Gestalt and Otis-Lennon School Ability Test to 75 first-grade and 84 second-grade students. Both systems were significantly correlated with school ability. Results of tests for differences between correlations indicated that Qualitative Scoring System…
Descriptors: Grade 1, Grade 2, Intelligence Tests, Primary Education
Peer reviewed: Rutledge, Michael L.; Warden, Melissa A. – School Science and Mathematics, 1999
Describes the development and validation of the Measure of Acceptance of the Theory of Evolution (MATE), a 20-item, Likert-scaled instrument that assesses teachers' overall acceptance of evolutionary theory. (Author/CCM)
Descriptors: Evolution, Higher Education, Mathematics Education, Scaling
MacCann, Robert G. – Psychometrika, 2004
For (0, 1) scored multiple-choice tests, a formula giving test reliability as a function of the number of item options is derived, assuming the "knowledge or random guessing model," the parallelism of the new and old tests (apart from the guessing probability), and the assumptions of classical test theory. It is shown that the formula is a more…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Reliability, Test Theory
Xi, Xiaoming; Higgins, Derrick; Zechner, Klaus; Williamson, David M. – ETS Research Report Series, 2008
This report presents the results of a research and development effort for SpeechRater℠ Version 1.0 (v1.0), an automated scoring system for the spontaneous speech of English language learners used operationally in the Test of English as a Foreign Language™ (TOEFL®) Practice Online assessment (TPO). The report includes a summary of the validity…
Descriptors: Speech, Scoring, Scoring Rubrics, Scoring Formulas
Parker, Kevin R.; Chao, Joseph T.; Ottaway, Thomas A.; Chang, Jane – Journal of Information Technology Education, 2006
The selection of a programming language for introductory courses has long been an informal process involving faculty evaluation, discussion, and consensus. As the number of faculty, students, and language options grows, this process becomes increasingly unwieldy. As it stands, the process currently lacks structure and replicability. Establishing a…
Descriptors: Programming Languages, Introductory Courses, Selection, Criteria
Kim, Daesang; Gilman, David A. – Educational Technology & Society, 2008
This study is an investigation of the use of multimedia components such as visual text, spoken text, and graphics in a Web-based self-instruction program to increase learners' English vocabulary learning at Myungin Middle School in Seoul, South Korea. A total of 172 middle school students (14 years of age) in five classes participated in the…
Descriptors: Vocabulary, Multimedia Materials, Foreign Countries, Vocabulary Development

