| Publication Date | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 38 |
| Since 2007 (last 20 years) | 122 |
| Descriptor | Count |
| --- | --- |
| Comparative Analysis | 145 |
| Statistical Analysis | 145 |
| Scoring | 73 |
| Scoring Rubrics | 63 |
| Teaching Methods | 40 |
| Foreign Countries | 39 |
| Pretests Posttests | 34 |
| Control Groups | 26 |
| Experimental Groups | 25 |
| Scores | 25 |
| Correlation | 24 |
| Author | Count |
| --- | --- |
| Awada, Ghada M. | 2 |
| Bulunuz, Mizrap | 2 |
| Bulunuz, Nermin | 2 |
| Chuang, Chi-ching | 2 |
| Fujiki, Mayo | 2 |
| Herman, Keith | 2 |
| Liu, Ou Lydia | 2 |
| Puhan, Gautam | 2 |
| Reinke, Wendy | 2 |
| Rohrer, David | 2 |
| Singh, Chandralekha | 2 |
| Laws, Policies, & Programs | Count |
| --- | --- |
| No Child Left Behind Act 2001 | 2 |
| What Works Clearinghouse Rating | Count |
| --- | --- |
| Does not meet standards | 2 |
Sinclair, Andrea L., Ed.; Thacker, Arthur, Ed. – Human Resources Research Organization (HumRRO), 2019
California's Commission on Teacher Credentialing (Commission) requires all programs of preliminary multiple and single subject teacher preparation to use a Commission-approved Teaching Performance Assessment (TPA) as one of the program completion requirements for prospective teacher candidates. Three TPA models were approved by the Commission: (1)…
Descriptors: Preservice Teachers, Performance Based Assessment, Models, Credentials
Puhan, Gautam; Kim, Sooyeon – Journal of Educational Measurement, 2022
As a result of the COVID-19 pandemic, at-home testing has become a popular delivery mode in many testing programs. When programs offer at-home testing to expand their service, the score comparability between test takers testing remotely and those testing in a test center is critical. This article summarizes statistical procedures that could be…
Descriptors: Scores, Scoring, Comparative Analysis, Testing
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
Kelleher, Leila K.; Beach, Tyson A. C.; Frost, David M.; Johnson, Andrew M.; Dickey, James P. – Measurement in Physical Education and Exercise Science, 2018
The scoring scheme for the functional movement screen implicitly assumes that the factor structure is consistent, stable, and congruent across different populations. To determine if this is the case, we compared principal components analyses of three samples: a healthy, general population (n = 100), a group of varsity athletes (n = 101), and a…
Descriptors: Factor Structure, Test Reliability, Screening Tests, Motion
Maries, Alexandru; Singh, Chandralekha – Physical Review Physics Education Research, 2018
Drawing appropriate diagrams is a useful problem solving heuristic that can transform a problem into a representation that is easier to exploit for solving it. One major focus while helping introductory physics students learn effective problem solving is to help them understand that drawing diagrams can facilitate problem solution. We conducted an…
Descriptors: Science Instruction, Physics, Introductory Courses, Comparative Analysis
Reilly, Erin D.; Williams, Kyle M.; Stafford, Rose E.; Corliss, Stephanie B.; Walkow, Janet C.; Kidwell, Donna K. – Online Learning, 2016
This paper utilizes a case-study design to discuss global aspects of massive open online course (MOOC) assessment. Drawing from the literature on open-course models and linguistic gatekeeping in education, we position freeform assessment in MOOCs as both challenging and valuable, with an emphasis on current practices and student resources. We…
Descriptors: Online Courses, Case Studies, Higher Education, Electronic Learning
Yun, Jiyeo – ProQuest LLC, 2017
Since researchers began investigating automated scoring systems for writing assessments, they have examined the relationships between human and machine scoring and proposed evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…
Descriptors: Interrater Reliability, Essays, Scoring, Evaluators
Clayton, Zachary S.; Wilds, Gabriel P.; Mangum, Joshua E.; Hocker, Austin D.; Dawson, Sierra M. – Advances in Physiology Education, 2016
We investigated how students performed on weekly two-page laboratory reports based on whether the grading rubric was provided to the student electronically or in paper form and the inclusion of one- to two-sentence targeted comments. Subjects were registered for a 289-student, third-year human physiology class with laboratory and were randomized…
Descriptors: Physiology, Feedback (Response), Integrated Learning Systems, Questionnaires
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick; Schmitt, Neal – ETS Research Report Series, 2016
In this report, systematic applications of statistical and psychometric methods are used to develop and evaluate scoring rules in terms of test reliability. Data collected from a situational judgment test are used to facilitate the comparison. For a well-developed item with appropriate keys (i.e., the correct answers), agreement among various…
Descriptors: Scoring, Test Reliability, Statistical Analysis, Psychometrics
Velasco-Martínez, Leticia-Concepción; Tójar-Hurtado, Juan-Carlos – International Education Studies, 2018
Competency-based learning requires changes to the higher education model in response to current socio-educational demands. Rubrics are an innovative educational tool for competence evaluation, for both students and educators. Ever since rubrics arrived in university systems, their application in evaluation programs has grown…
Descriptors: Competency Based Education, Higher Education, Scoring Rubrics, Evaluation Methods
Mattern, Krista; Radunzel, Justine; Bertling, Maria; Ho, Andrew – ACT, Inc., 2017
The percentage of students retaking college admissions tests is rising (Harmston & Crouse, 2016). Researchers and college admissions offices currently use a variety of methods for summarizing these multiple scores. Testing companies, interested in validity evidence like correlations with college first-year grade-point averages (FYGPA), often…
Descriptors: College Entrance Examinations, Grade Point Average, College Freshmen, Correlation
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
Han, Jing; Koenig, Kathleen; Cui, Lili; Fritchman, Joseph; Li, Dan; Sun, Wanyi; Fu, Zhao; Bao, Lei – Physical Review Physics Education Research, 2016
In a recent study, the 30-question Force Concept Inventory (FCI) was theoretically split into two 14-question "half-length" tests (HFCIs) covering the same set of concepts and producing mean scores that can be equated to those of the original FCI. The HFCIs require less administration time and reduce test-retest issues when different…
Descriptors: Physics, Scientific Concepts, Science Instruction, College Science
Eckes, Thomas – Language Testing, 2017
This paper presents an approach to standard setting that combines the prototype group method (PGM; Eckes, 2012) with a receiver operating characteristic (ROC) analysis. The combined PGM-ROC approach is applied to setting cut scores on a placement test of English as a foreign language (EFL). To implement the PGM, experts first named learners whom…
Descriptors: English (Second Language), Language Tests, Cutting Scores, Standard Setting (Scoring)
Zeng, Songtian – ProQuest LLC, 2017
Over 30 states have adopted the Early Childhood Environmental Rating Scale-Revised (ECERS-R) as a component of their program quality assessment systems, but the use of ECERS-R on such a large scale has raised important questions about implementation. One of the most pressing questions centers on decisions users must make between two scoring…
Descriptors: Rating Scales, Scoring, Validity, Comparative Analysis
