ERIC - Search Results

Showing all 8 results

On Using Simulations to Inform Decision Making during Instrument Development

Peer reviewed

Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018

Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…

Descriptors: Simulation, Decision Making, Test Construction, Validity

Guessing and the Rasch Model

Peer reviewed

Holster, Trevor A.; Lake, J. – Language Assessment Quarterly, 2016

Stewart questioned Beglar's use of Rasch analysis of the Vocabulary Size Test (VST) and advocated the use of 3-parameter logistic item response theory (3PLIRT) on the basis that it models a non-zero lower asymptote for items, often called a "guessing" parameter. In support of this theory, Stewart presented fit statistics derived from…

Descriptors: Guessing (Tests), Item Response Theory, Vocabulary, Language Tests
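
The contrast the abstract draws between the Rasch model and the 3-parameter logistic model can be sketched as follows. This is a minimal illustration of the standard textbook forms of the two response functions, not code from the article; the function names and the example parameter values (a = 1.0, c = 0.25) are assumptions chosen for illustration.

```python
import math


def rasch_probability(theta, b):
    """Rasch model: P(correct) for ability theta and item difficulty b.

    The lower asymptote is 0, so very low ability implies P near 0.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def three_pl_probability(theta, a, b, c):
    """3PL model: a is discrimination, b is difficulty, and c is the
    non-zero lower asymptote often called the "guessing" parameter.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # Hypothetical values: a four-option item, so c = 0.25 is assumed.
    for theta in (-4.0, 0.0, 4.0):
        p_rasch = rasch_probability(theta, b=0.0)
        p_3pl = three_pl_probability(theta, a=1.0, b=0.0, c=0.25)
        print(f"theta={theta:+.1f}  Rasch={p_rasch:.3f}  3PL={p_3pl:.3f}")
```

With c = 0 and a = 1 the 3PL function reduces to the Rasch function, which is why the dispute summarized above centers on whether the lower asymptote is needed.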

A Competency Model for Process Dynamics and Control and Its Use for Test Construction at University Level

Peer reviewed

Taskinen, Päivi H.; Steimel, Jochen; Gräfe, Linda; Engell, Sebastian; Frey, Andreas – Peabody Journal of Education, 2015

This study examined students' competencies in engineering education at the university level. First, we developed a competency model in one specific field of engineering: process dynamics and control. Then, the theoretical model was used as a frame to construct test items to measure students' competencies comprehensively. In the empirical…

Descriptors: Models, Engineering Education, Test Items, Outcome Measures

Willingness to Answer Multiple-Choice Questions as Manifested Both in Genuine and in Nonsense Items.

Peer reviewed

Frary, Robert B.; Hutchinson, T.P. – Educational and Psychological Measurement, 1982

Alternate versions of Hutchinson's theory were compared, and one which implies the existence of partial knowledge was found to be better than one which implies that an appropriate measure of ability is obtained by applying the conventional correction for guessing. (Author/PN)

Descriptors: Guessing (Tests), Latent Trait Theory, Multiple Choice Tests, Scoring Formulas
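
The "conventional correction for guessing" mentioned in this abstract is usually taken to mean the formula score S = R - W/(k - 1), where R is the number right, W the number wrong, and k the number of options per item. The sketch below assumes that reading; the function name and the example numbers are hypothetical and do not come from the article.

```python
def formula_score(num_right, num_wrong, num_options):
    """Conventional correction for guessing: R - W / (k - 1).

    Omitted items are not passed in, so they neither add to nor
    subtract from the score.
    """
    if num_options < 2:
        raise ValueError("an item needs at least two options")
    return num_right - num_wrong / (num_options - 1)


if __name__ == "__main__":
    # Hypothetical examinee: 30 right, 8 wrong, 2 omitted, 5-option items.
    print(formula_score(num_right=30, num_wrong=8, num_options=5))
```

Under this formula, an examinee who guesses blindly on every item has an expected score of zero, which is the assumption the partial-knowledge version of the theory calls into question.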

Adjusting Scores on Examinations Offering a Choice of Questions.

Livingston, Samuel A. – 1986

This paper deals with test fairness regarding a test consisting of two parts: (1) a "common" section, taken by all students; and (2) a "variable" section, in which some students may answer a different set of questions from other students. For example, a test taken by several thousand students each year contains a common multiple-choice portion and…

Descriptors: Difficulty Level, Error of Measurement, Essay Tests, Mathematical Models

Ensuring Comparable Scores on the SAT® I: Reasoning Test. Research Notes. RN-14

Lawrence, Ida M.; Schmidt, Amy Elizabeth – College Entrance Examination Board, 2001

The SAT® I: Reasoning Test is administered seven times a year. Primarily for security purposes, several different test forms are given at each administration. How is it possible to compare scores obtained from different test forms and from different test administrations? The purpose of this paper is to provide an overview of the statistical…

Descriptors: Scores, Comparative Analysis, Standardized Tests, College Entrance Examinations

The Use of Precalibrated Item Bank to Establish and Maintain Cutoff Scores: A Case Study of the Florida Teacher Certification Examination.

Legg, Sue M. – 1982

A case study of the Florida Teacher Certification Examination (FTCE) program was described to assist others launching the development of large scale item banks. FTCE has four subtests: Mathematics, Reading, Writing, and Professional Education. Rasch calibrated item banks have been developed for all subtests except Writing. The methods used to…

Descriptors: Cutting Scores, Difficulty Level, Field Tests, Item Analysis

Scoring and Analyzing Confidence Tests. Final Report.

Rippey, Robert M. – 1971

Technical improvements, which may be made in the reliability and validity of tests through confidence scores, are discussed. However, studies indicate that subjects do not handle their confidence uniformly. (MS)

Descriptors: Computer Programs, Confidence Testing, Correlation, Difficulty Level