Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Holster, Trevor A.; Lake, J. – Language Assessment Quarterly, 2016
Stewart questioned Beglar's use of Rasch analysis of the Vocabulary Size Test (VST) and advocated the use of 3-parameter logistic item response theory (3PLIRT) on the basis that it models a non-zero lower asymptote for items, often called a "guessing" parameter. In support of this theory, Stewart presented fit statistics derived from…
Descriptors: Guessing (Tests), Item Response Theory, Vocabulary, Language Tests
Taskinen, Päivi H.; Steimel, Jochen; Gräfe, Linda; Engell, Sebastian; Frey, Andreas – Peabody Journal of Education, 2015
This study examined students' competencies in engineering education at the university level. First, we developed a competency model in one specific field of engineering: process dynamics and control. Then, the theoretical model was used as a frame to construct test items to measure students' competencies comprehensively. In the empirical…
Descriptors: Models, Engineering Education, Test Items, Outcome Measures
Willingness to Answer Multiple-Choice Questions as Manifested Both in Genuine and in Nonsense Items.
Frary, Robert B.; Hutchinson, T.P. – Educational and Psychological Measurement, 1982
Alternate versions of Hutchinson's theory were compared, and one which implies the existence of partial knowledge was found to be better than one which implies that an appropriate measure of ability is obtained by applying the conventional correction for guessing. (Author/PN)
Descriptors: Guessing (Tests), Latent Trait Theory, Multiple Choice Tests, Scoring Formulas
Livingston, Samuel A. – 1986
This paper deals with test fairness regarding a test consisting of two parts: (1) a "common" section, taken by all students; and (2) a "variable" section, in which some students may answer a different set of questions from other students. For example, a test taken by several thousand students each year contains a common multiple-choice portion and…
Descriptors: Difficulty Level, Error of Measurement, Essay Tests, Mathematical Models
Lawrence, Ida M.; Schmidt, Amy Elizabeth – College Entrance Examination Board, 2001
The SAT® I: Reasoning Test is administered seven times a year. Primarily for security purposes, several different test forms are given at each administration. How is it possible to compare scores obtained from different test forms and from different test administrations? The purpose of this paper is to provide an overview of the statistical…
Descriptors: Scores, Comparative Analysis, Standardized Tests, College Entrance Examinations
Legg, Sue M. – 1982
A case study of the Florida Teacher Certification Examination (FTCE) program was described to assist others launching the development of large scale item banks. FTCE has four subtests: Mathematics, Reading, Writing, and Professional Education. Rasch calibrated item banks have been developed for all subtests except Writing. The methods used to…
Descriptors: Cutting Scores, Difficulty Level, Field Tests, Item Analysis
Rippey, Robert M. – 1971
Technical improvements, which may be made in the reliability and validity of tests through confidence scores, are discussed. However, studies indicate that subjects do not handle their confidence uniformly. (MS)
Descriptors: Computer Programs, Confidence Testing, Correlation, Difficulty Level