Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 5
Since 2006 (last 20 years): 11
Descriptor
Scoring Formulas: 68
Test Items: 68
Multiple Choice Tests: 26
Test Reliability: 25
Guessing (Tests): 22
Item Analysis: 19
Difficulty Level: 18
Higher Education: 17
Test Construction: 16
Scoring: 14
Testing Problems: 12
Author
Angoff, William H.: 3
Plake, Barbara S.: 3
Frary, Robert B.: 2
Huynh, Huynh: 2
Schrader, William B.: 2
Smith, Richard M.: 2
Weiss, David J.: 2
Aaronson, May: 1
Aghbar, Ali A.: 1
Aiken, Lewis R.: 1
Alliegro, Marissa C.: 1
Education Level
Higher Education: 4
Postsecondary Education: 4
Secondary Education: 1
Audience
Researchers: 7
Practitioners: 2
Teachers: 2
Policymakers: 1
Laws, Policies, & Programs
Education for All Handicapped…: 1
Yun, Young Ho; Kim, Yaeji; Sim, Jin A.; Choi, Soo Hyuk; Lim, Cheolil; Kang, Joon-ho – Journal of School Health, 2018
Background: The objective of this study was to develop the School Health Score Card (SHSC) and validate its psychometric properties. Methods: The development of the SHSC questionnaire included 3 phases: item generation, construction of domains and items, and field testing with validation. To assess the instrument's reliability and validity, we…
Descriptors: School Health Services, Psychometrics, Test Construction, Test Validity
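For readers unfamiliar with the reliability index typically reported in validation studies of this kind: internal consistency is commonly summarized by Cronbach's alpha, \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_X^2}\right), where k is the number of items, s_i^2 the variance of item i, and s_X^2 the variance of the total score. Whether the SHSC study used this particular index is not stated in the abstract excerpt above.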
Partnership for Assessment of Readiness for College and Careers, 2016
The Partnership for Assessment of Readiness for College and Careers (PARCC) is a group of states working together to develop a set of assessments that measure whether students are on track to be successful in college and careers. Administrations of the PARCC assessment included three Prose Constructed Responses (PCR), one per task for English…
Descriptors: Scoring Rubrics, Test Items, Literacy, Language Arts
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Partnership for Assessment of Readiness for College and Careers, 2015
The 2014-2015 administrations of the PARCC assessment included two separate test administration windows: the Performance-Based Assessment (PBA) and the End-of-Year (EOY), both of which were administered in paper-based and computer-based formats. The first window was for administration of the PBA, and the second window was for the administration of…
Descriptors: Mathematics Tests, Scoring Formulas, Scoring Rubrics, Performance Based Assessment
Gierl, Mark J.; Bulut, Okan; Guo, Qi; Zhang, Xinxin – Review of Educational Research, 2017
Multiple-choice testing is considered one of the most effective and enduring forms of educational assessment that remains in practice today. This study presents a comprehensive review of the literature on multiple-choice testing in education focused, specifically, on the development, analysis, and use of the incorrect options, which are also…
Descriptors: Multiple Choice Tests, Difficulty Level, Accuracy, Error Patterns
Zechner, Klaus; Chen, Lei; Davis, Larry; Evanini, Keelan; Lee, Chong Min; Leong, Chee Wee; Wang, Xinhao; Yoon, Su-Youn – ETS Research Report Series, 2015
This research report presents a summary of research and development efforts devoted to creating scoring models for automatically scoring spoken item responses of a pilot administration of the Test of English-for-Teaching ("TEFT"™) within the "ELTeach"™ framework. The test consists of items for all four language modalities:…
Descriptors: Scoring, Scoring Formulas, Speech Communication, Task Analysis
Jancarík, Antonín; Kostelecká, Yvona – Electronic Journal of e-Learning, 2015
Electronic testing has become a regular part of online courses. Most learning management systems offer a wide range of tools that can be used in electronic tests. With respect to time demands, the most efficient tools are those that allow automatic assessment. The presented paper focuses on one of these tools: matching questions in which one…
Descriptors: Online Courses, Computer Assisted Testing, Test Items, Scoring Formulas
Buri, John R.; Cromett, Cristina E.; Post, Maria C.; Landis, Anna Marie; Alliegro, Marissa C. – Online Submission, 2015
Rationale is presented for the derivation of a new measure of stressful life events for use with students [Negative Life Events Scale for Students (NLESS)]. Ten stressful life events questionnaires were reviewed, and the more than 600 items mentioned in these scales were culled based on the following criteria: (a) only long-term and unpleasant…
Descriptors: Experience, Social Indicators, Stress Variables, Affective Measures
Partnership for Assessment of Readiness for College and Careers, 2015
The Partnership for Assessment of Readiness for College and Careers (PARCC) is a group of states working together to develop a modern assessment that replaces previous state standardized tests. It provides better information for teachers and parents to identify where a student needs help, or is excelling, so they are able to enhance instruction to…
Descriptors: Literacy, Language Arts, Scoring Formulas, Scoring
Holster, Trevor A.; Lake, J. – Language Assessment Quarterly, 2016
Stewart questioned Beglar's use of Rasch analysis of the Vocabulary Size Test (VST) and advocated the use of 3-parameter logistic item response theory (3PLIRT) on the basis that it models a non-zero lower asymptote for items, often called a "guessing" parameter. In support of this theory, Stewart presented fit statistics derived from…
Descriptors: Guessing (Tests), Item Response Theory, Vocabulary, Language Tests
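For context on the model at issue in this entry: the 3PL item response function is conventionally written P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}, where a_i is the item discrimination, b_i the difficulty, and c_i the lower asymptote often interpreted as a "guessing" parameter. The Rasch analysis that Stewart questioned corresponds to constraining all a_i to be equal and all c_i to zero.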
Taskinen, Päivi H.; Steimel, Jochen; Gräfe, Linda; Engell, Sebastian; Frey, Andreas – Peabody Journal of Education, 2015
This study examined students' competencies in engineering education at the university level. First, we developed a competency model in one specific field of engineering: process dynamics and control. Then, the theoretical model was used as a frame to construct test items to measure students' competencies comprehensively. In the empirical…
Descriptors: Models, Engineering Education, Test Items, Outcome Measures

Frary, Robert B. – Applied Measurement in Education, 1989
Multiple-choice response and scoring methods that attempt to determine an examinee's degree of knowledge about each item in order to produce a total test score are reviewed. There is apparently little advantage to such schemes; however, they may have secondary benefits such as providing feedback to enhance learning. (SLD)
Descriptors: Knowledge Level, Multiple Choice Tests, Scoring, Scoring Formulas
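For background on the classical scoring formula these alternatives are measured against: the standard correction-for-guessing score is S = R - \frac{W}{k - 1}, where R is the number of right answers, W the number of wrong answers, and k the number of options per item. The methods Frary reviews go further by asking examinees to signal their degree of knowledge about each item (confidence weighting and option elimination are typical examples in this literature) rather than scoring responses simply as right or wrong.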

Claudy, John G. – Applied Psychological Measurement, 1978
Option weighting is an alternative to increasing test length as a means of improving the reliability of a test. The effects on test reliability of option weighting procedures were compared in two empirical studies using four independent sets of items. Biserial weights were found to be superior. (Author/CTM)
Descriptors: Higher Education, Item Analysis, Scoring Formulas, Test Items
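A sketch of what biserial option weights typically mean in this literature (not necessarily Claudy's exact procedure): each option receives a weight proportional to the biserial correlation between choosing that option and the total or criterion score, r_\mathrm{bis} = \frac{\bar{X}_1 - \bar{X}_0}{s_X} \cdot \frac{pq}{\phi(z_p)}, where \bar{X}_1 and \bar{X}_0 are the mean scores of examinees who did and did not select the option, p and q are the corresponding proportions, and \phi(z_p) is the standard normal ordinate at the point cutting off proportion p. An examinee's test score is then the sum of the weights of the options selected, rather than a simple count of correct answers.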

Oltman, Phillip K.; Stricker, Lawrence J. – Language Testing, 1990
A recent multidimensional scaling analysis of the Test of English-as-a-Foreign-Language (TOEFL) item response data identified clusters of items in the test sections that, being more homogeneous than their parent sections, might be better for diagnostic use. The analysis was repeated using different scoring techniques. Results diverged only for…
Descriptors: English (Second Language), Item Analysis, Language Tests, Scaling
Budescu, David V. – 1979
This paper outlines a technique for differentially weighting the options of a multiple-choice test in a fashion that maximizes item predictive validity. The rule can be applied with different numbers of categories, and the "optimal" number of categories can be determined by significance tests and/or through the R² criterion. Our theoretical analysis…
Descriptors: Multiple Choice Tests, Predictive Validity, Scoring Formulas, Test Items
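As a rough illustration of how option weights that maximize predictive validity can be derived (a sketch assuming "predictive validity" is operationalized as the squared correlation with an external criterion; this is not a reproduction of Budescu's rule, and all function and variable names are hypothetical):

import numpy as np

def option_weights(responses, criterion, n_options):
    # responses: (n_examinees, n_items) array of chosen option codes 0..n_options-1
    # criterion: (n_examinees,) external criterion scores
    n, k = responses.shape
    X = np.zeros((n, k * n_options))
    # One-hot code the chosen option of every item.
    for j in range(k):
        X[np.arange(n), j * n_options + responses[:, j]] = 1.0
    # Least-squares weights maximize the in-sample R^2 of the weighted score
    # against the criterion; lstsq tolerates the rank deficiency of dummy coding.
    w, *_ = np.linalg.lstsq(X, criterion, rcond=None)
    return w.reshape(k, n_options)

def weighted_score(responses, weights):
    # Total score = sum over items of the weight attached to the chosen option.
    n, k = responses.shape
    return weights[np.arange(k), responses].sum(axis=1)

The R² of the resulting weighted scores against the criterion can then be compared across different numbers of response categories, in the spirit of the criterion mentioned in the abstract.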