Sophie Litschwartz – Society for Research on Educational Effectiveness, 2021
Background/Context: Pass/fail standardized exams frequently selectively rescore failing exams and retest failing examinees. This practice distorts the test score distribution and can confuse those who do analysis on these distributions. In 2011, the Wall Street Journal showed large discontinuities in the New York City Regent test score…
Descriptors: Standardized Tests, Pass Fail Grading, Scoring Rubrics, Scoring Formulas
Yun, Young Ho; Kim, Yaeji; Sim, Jin A.; Choi, Soo Hyuk; Lim, Cheolil; Kang, Joon-ho – Journal of School Health, 2018
Background: The objective of this study was to develop the School Health Score Card (SHSC) and validate its psychometric properties. Methods: The development of the SHSC questionnaire included 3 phases: item generation, construction of domains and items, and field testing with validation. To assess the instrument's reliability and validity, we…
Descriptors: School Health Services, Psychometrics, Test Construction, Test Validity
Severo, Milton; Gaio, A. Rita; Povo, Ana; Silva-Pereira, Fernanda; Ferreira, Maria Amélia – Anatomical Sciences Education, 2015
In theory the formula scoring methods increase the reliability of multiple-choice tests in comparison with number-right scoring. This study aimed to evaluate the impact of the formula scoring method in clinical anatomy multiple-choice examinations, and to compare it with that from the number-right scoring method, hoping to achieve an…
Descriptors: Anatomy, Multiple Choice Tests, Scoring, Decision Making
Cetin, Bayram; Guler, Nese; Sarica, Rabia – Eurasian Journal of Educational Research, 2016
Problem Statement: In addition to being teaching tools, concept maps can be used as effective assessment tools. The use of concept maps for assessment has raised the issue of scoring them. Concept maps generated and used in different ways can be scored via various methods. Holistic and relational scoring methods are two of them. Purpose of the…
Descriptors: Generalizability Theory, Concept Mapping, Scoring, Scoring Formulas
Zechner, Klaus; Chen, Lei; Davis, Larry; Evanini, Keelan; Lee, Chong Min; Leong, Chee Wee; Wang, Xinhao; Yoon, Su-Youn – ETS Research Report Series, 2015
This research report presents a summary of research and development efforts devoted to creating scoring models for automatically scoring spoken item responses of a pilot administration of the Test of English-for-Teaching ("TEFT"™) within the "ELTeach"™ framework. The test consists of items for all four language modalities:…
Descriptors: Scoring, Scoring Formulas, Speech Communication, Task Analysis
Jancarík, Antonín; Kostelecká, Yvona – Electronic Journal of e-Learning, 2015
Electronic testing has become a regular part of online courses. Most learning management systems offer a wide range of tools that can be used in electronic tests. With respect to time demands, the most efficient tools are those that allow automatic assessment. The presented paper focuses on one of these tools: matching questions in which one…
Descriptors: Online Courses, Computer Assisted Testing, Test Items, Scoring Formulas
Buri, John R.; Cromett, Cristina E.; Post, Maria C.; Landis, Anna Marie; Alliegro, Marissa C. – Online Submission, 2015
Rationale is presented for the derivation of a new measure of stressful life events for use with students [Negative Life Events Scale for Students (NLESS)]. Ten stressful life events questionnaires were reviewed, and the more than 600 items mentioned in these scales were culled based on the following criteria: (a) only long-term and unpleasant…
Descriptors: Experience, Social Indicators, Stress Variables, Affective Measures
Partnership for Assessment of Readiness for College and Careers, 2015
The Partnership for Assessment of Readiness for College and Careers (PARCC) is a group of states working together to develop a modern assessment that replaces previous state standardized tests. It provides better information for teachers and parents to identify where a student needs help, or is excelling, so they are able to enhance instruction to…
Descriptors: Literacy, Language Arts, Scoring Formulas, Scoring
Northwest Evaluation Association, 2014
Recently, Northwest Evaluation Association (NWEA) completed a study to connect the scale of the Minnesota Comprehensive Assessments (MCA) Testing Program used for Minnesota's mathematics and reading assessments with NWEA's RIT (Rasch Unit) scale. Information from the state assessments was used in a study to establish performance-level scores on…
Descriptors: Alignment (Education), Testing Programs, State Programs, Mathematics Tests
Dimoliatis, Ioannis D. K.; Jelastopulu, Eleni – Universal Journal of Educational Research, 2013
The surgical theatre educational environment measures STEEM, OREEM and mini-STEEM for students (student-STEEM) comprise an up to now disregarded systematic overestimation (OE) due to inaccurate percentage calculation. The aim of the present study was to investigate the magnitude of and suggest a correction for this systematic bias. After an…
Descriptors: Educational Environment, Scores, Grade Prediction, Academic Standards
Ahmed, Ayesha; Pollitt, Alastair – Assessment in Education: Principles, Policy & Practice, 2011
At the heart of most assessments lies a set of questions, and those who write them must achieve "two" things. Not only must they ensure that each question elicits the kind of performance that shows how "good" pupils are at the subject, but they must also ensure that each mark scheme gives more marks to those who are…
Descriptors: Academic Achievement, Classification, Educational Quality, Quality Assurance
Murphy, Brooke; Dionigi, Rylee A.; Litchfield, Chelsea – Issues in Educational Research, 2014
We argue that gender issues in physical education (PE) remain in some schools, despite advances in PE research and curricula aimed at engaging females in PE. We interviewed five Australian PE teachers (1 male and 4 females) at a co-educational, regional high school about the factors affecting female participation in PE and the strategies they used…
Descriptors: Physical Education, Females, Case Studies, Teacher Attitudes
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012
Automated scoring models for the "e-rater"® scoring engine were built and evaluated for the "GRE"® argument and issue-writing tasks. Prompt-specific, generic, and generic with prompt-specific intercept scoring models were built and evaluation statistics such as weighted kappas, Pearson correlations, standardized difference in…
Descriptors: Scoring, Test Scoring Machines, Automation, Models
Barkaoui, Khaled – Assessment in Education: Principles, Policy & Practice, 2011
This study examined the effects of marking method and rater experience on ESL (English as a Second Language) essay test scores and rater performance. Each of 31 novice and 29 experienced raters rated a sample of ESL essays both holistically and analytically. Essay scores were analysed using a multi-faceted Rasch model to compare test-takers'…
Descriptors: Writing Evaluation, Writing Tests, Essay Tests, Interrater Reliability
Dorans, Neil J.; Liang, Longjuan; Puhan, Gautam – Educational Testing Service, 2010
Scores are the most visible and widely used products of a testing program. The choice of score scale has implications for test specifications, equating, and test reliability and validity, as well as for test interpretation. At the same time, the score scale should be viewed as infrastructure likely to require repair at some point. In this report…
Descriptors: Testing Programs, Standard Setting (Scoring), Test Interpretation, Certification