Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 4 |
Since 2016 (last 10 years) | 7 |
Descriptor
Test Reliability | 7 |
Test Items | 3 |
Anxiety | 2 |
Computation | 2 |
Foreign Countries | 2 |
Measures (Individuals) | 2 |
Multiple Choice Tests | 2 |
Scores | 2 |
Scoring | 2 |
Academic Achievement | 1 |
Academic Standards | 1 |
Source
Applied Measurement in… | 7 |
Author
Almehrizi, Rashid S. | 1 |
Clark, Amy K. | 1 |
George A. Marcoulides | 1 |
Godfrey, Alan T. K. | 1 |
Hau, Kit-Tai | 1 |
Jak, Suzanne | 1 |
Jansen in de Wal, Joost | 1 |
Musch, Jochen | 1 |
Nash, Brooke | 1 |
Natalja Menold | 1 |
Papenberg, Martin | 1 |
Publication Type
Journal Articles | 7 |
Reports - Research | 6 |
Reports - Evaluative | 1 |
Education Level
Higher Education | 1 |
Postsecondary Education | 1 |
Secondary Education | 1 |
Raykov, Tenko; Marcoulides, George A.; Menold, Natalja – Applied Measurement in Education, 2024
We discuss an application of Bayesian factor analysis for estimation of the optimal linear combination and associated maximal reliability of a multi-component measuring instrument. The described procedure yields point and credibility interval estimates of this reliability coefficient, which are readily obtained in educational and behavioral…
Descriptors: Bayesian Statistics, Test Reliability, Error of Measurement, Measurement Equipment
Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023
We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed that, for unidimensional scales, (a) all indices except omega h performed similarly well under most conditions; (b) alpha still performed well; (c) GLB and coefficient H overestimated reliability with small…
Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length
van Alphen, Thijmen; Jak, Suzanne; Jansen in de Wal, Joost; Schuitema, Jaap; Peetsma, Thea – Applied Measurement in Education, 2022
Intensive longitudinal data is increasingly used to study state-like processes such as changes in daily stress. Measures aimed at collecting such data require the same level of scrutiny regarding scale reliability as traditional questionnaires. The most prevalent methods used to assess reliability of intensive longitudinal measures are based on…
Descriptors: Test Reliability, Measures (Individuals), Anxiety, Data Collection
Almehrizi, Rashid S. – Applied Measurement in Education, 2021
KR-20 reliability and its extension (coefficient α) give the reliability estimate of test scores under the assumption of tau-equivalent forms. KR-21 reliability gives the reliability estimate for summed scores for dichotomous items when items are randomly sampled from an infinite pool of similar items (randomly parallel forms). The article…
Descriptors: Test Reliability, Scores, Scoring, Computation
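For readers unfamiliar with the coefficients named in the abstract above, the following is a minimal sketch of the textbook formulas for coefficient alpha (which equals KR-20 when items are scored 0/1) and KR-21. It is not the article's procedure. The function names `coefficient_alpha` and `kr21`, the use of population variances, and the small `responses` matrix are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def coefficient_alpha(scores):
    """Coefficient alpha for a respondents-by-items score matrix.

    With 0/1 items and population variances this is the same value as KR-20,
    because the variance of a dichotomous item is p * (1 - p).
    """
    k = len(scores[0])  # number of items
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr21(scores):
    """KR-21 for 0/1 items: uses only the mean and variance of summed scores."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    m = mean(totals)
    total_var = pvariance(totals)
    return k / (k - 1) * (1 - m * (k - m) / (k * total_var))


# Hypothetical data: 5 respondents, 4 dichotomous items of unequal difficulty.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha_value = coefficient_alpha(responses)
kr21_value = kr21(responses)
print(round(alpha_value, 3), round(kr21_value, 3))
```

Because KR-21 replaces the individual item variances with a value based on the average item difficulty, it cannot exceed KR-20 on the same data, and the two agree only when all items are equally difficult.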
Thompson, W. Jake; Clark, Amy K.; Nash, Brooke – Applied Measurement in Education, 2019
As the use of diagnostic assessment systems transitions from research applications to large-scale assessments for accountability purposes, reliability methods that provide evidence at each level of reporting are needed. The purpose of this paper is to summarize one simulation-based method for estimating and reporting reliability for an…
Descriptors: Test Reliability, Diagnostic Tests, Classification, Computation
Slepkov, Aaron D.; Godfrey, Alan T. K. – Applied Measurement in Education, 2019
The answer-until-correct (AUC) method of multiple-choice (MC) testing involves test respondents making selections until the keyed answer is identified. Despite attendant benefits that include improved learning, broad student adoption, and facile administration of partial credit, the use of AUC methods for classroom testing has been extremely…
Descriptors: Multiple Choice Tests, Test Items, Test Reliability, Scores
Papenberg, Martin; Musch, Jochen – Applied Measurement in Education, 2017
In multiple-choice tests, the quality of distractors may be more important than their number. We therefore examined the joint influence of distractor quality and quantity on test functioning by providing a sample of 5,793 participants with five parallel test sets consisting of items that differed in the number and quality of distractors.…
Descriptors: Multiple Choice Tests, Test Items, Test Validity, Test Reliability