ERIC - Search Results

Showing all 7 results

The Reliability and Precision of Total Scores and IRT Estimates as a Function of Polytomous IRT Parameters and Latent Trait Distribution

Peer reviewed

Culpepper, Steven Andrew – Applied Psychological Measurement, 2013

A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…

Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement

Development and Validation of the Star Properties Concept Inventory

Peer reviewed

Bailey, Janelle M.; Johnson, Bruce; Prather, Edward E.; Slater, Timothy F. – International Journal of Science Education, 2012

Concept inventories (CIs), typically multiple-choice instruments that focus on a single topic or a small subset of closely related topics, have been used in science education for more than a decade. This paper describes the development and validation of a new CI for astronomy, the "Star Properties Concept Inventory" (SPCI). Questions cover the areas of…

Descriptors: Educational Strategies, Validity, Testing, Astronomy

Development of the Enzyme-Substrate Interactions Concept Inventory

Peer reviewed

Bretz, Stacey Lowery; Linenberger, Kimberly J. – Biochemistry and Molecular Biology Education, 2012

Enzyme function is central to student understanding of multiple topics within the biochemistry curriculum. In particular, students must understand how enzymes and substrates interact with one another. This manuscript describes the development of a 15-item Enzyme-Substrate Interactions Concept Inventory (ESICI) that measures student understanding…

Descriptors: Biochemistry, Science Education, Science Instruction, Scientific Concepts

Coefficient Alpha and Reliability of Scale Scores

Peer reviewed

Almehrizi, Rashid S. – Applied Psychological Measurement, 2013

The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…

Descriptors: Raw Scores, Scaling, Reliability, Computation
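
For orientation, the standard raw-score form of coefficient alpha that the abstract above refers to can be sketched as follows. This is only the textbook formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); it is not the scale-score extension the article develops, which the truncated abstract does not show. The function name `coefficient_alpha` and the row-per-examinee input layout are assumptions made for illustration.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Raw-score coefficient alpha (hypothetical helper, textbook formula).

    scores: list of rows, one row per examinee, each row a list of item scores.
    """
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least 2 examinees and 2 items")
    k = len(scores[0])  # number of items
    # Sample variance of each item (column) across examinees.
    item_vars = [variance(col) for col in zip(*scores)]
    # Sample variance of each examinee's total (summed) score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

With perfectly parallel items, such as `[[1, 1, 1], [2, 2, 2], [3, 3, 3]]`, the item variances sum to one third of the total-score variance and the function returns 1.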

The Utility of Augmented Subscores in a Licensure Exam: An Evaluation of Methods Using Empirical Data

Peer reviewed

Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010

Will subscores provide information beyond what the total score provides? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…

Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods

Evaluating IRT- and CTT-Based Methods of Estimating Classification Consistency and Accuracy Indices from Single Administrations

Deng, Nina – ProQuest LLC, 2011

Three decision consistency and accuracy (DC/DA) methods, the Livingston and Lewis (LL) method, LEE method, and the Hambleton and Han (HH) method, were evaluated. The purposes of the study were: (1) to evaluate the accuracy and robustness of these methods, especially when their assumptions were not well satisfied, (2) to investigate the "true"…

Descriptors: Item Response Theory, Test Theory, Computation, Classification

A Comparison of Decision-Making Methods for Criterion-Referenced Tests

Haladyna, Tom; Roid, Gale – 1980

The problems associated with misclassifying students when pass-fail decisions are based on test scores are discussed. One protection against misclassification is to set a confidence interval around the cutting score. Those whose scores fall above the interval are passed; those whose scores fall below the interval are failed; and those whose scores…

Descriptors: Bayesian Statistics, Classification, Comparative Analysis, Criterion Referenced Tests
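
The confidence-interval rule described in the abstract above can be illustrated with a minimal sketch. Several details are assumptions, not taken from the report: the interval is built as the cutting score plus or minus `z` times a standard error of measurement (`sem`), the default `z=1.96` is arbitrary, and because the abstract is truncated before it says what happens to scores inside the interval, they are simply labeled "undecided" here. The function name `classify` is hypothetical.

```python
def classify(score, cut_score, sem, z=1.96):
    """Pass/fail decision with a confidence band around the cutting score.

    Assumed band: cut_score +/- z * sem. Scores inside the band get the
    placeholder label "undecided" (an assumption; the source is truncated).
    """
    half_width = z * sem
    if score > cut_score + half_width:
        return "pass"
    if score < cut_score - half_width:
        return "fail"
    return "undecided"
```

For example, with a cutting score of 70 and an assumed `sem` of 3, the band runs from 64.12 to 75.88, so a score of 80 passes, 60 fails, and 72 falls inside the band.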