Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 6 |
Descriptor
Models | 7 |
Test Length | 7 |
Test Reliability | 7 |
Error of Measurement | 4 |
Test Items | 4 |
Item Response Theory | 3 |
Psychometrics | 3 |
Comparative Analysis | 2 |
Scores | 2 |
Test Validity | 2 |
Alignment (Education) | 1 |
More ▼ |
Source
Assessment & Evaluation in… | 2 |
ETS Research Report Series | 1 |
Educational Sciences: Theory… | 1 |
Educational and Psychological… | 1 |
Journal of Educational… | 1 |
Measurement in Physical… | 1 |
Author
Burton, Richard F. | 2 |
Ellis, Jules L. | 1 |
Jowett, Sophia | 1 |
Kahraman, Nilufer | 1 |
Patsula, Liane | 1 |
Rizavi, Saba | 1 |
Rotou, Ourania | 1 |
Sengul Avsar, Asiye | 1 |
Steffen, Manfred | 1 |
Tavsancil, Ezel | 1 |
Thompson, Tony | 1 |
More ▼ |
Publication Type
Journal Articles | 7 |
Reports - Research | 5 |
Reports - Evaluative | 2 |
Education Level
Audience
Location
China | 1 |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Ellis, Jules L. – Educational and Psychological Measurement, 2021
This study develops a theoretical model for the costs of an exam as a function of its duration. Two kind of costs are distinguished: (1) the costs of measurement errors and (2) the costs of the measurement. Both costs are expressed in time of the student. Based on a classical test theory model, enriched with assumptions on the context, the costs…
Descriptors: Test Length, Models, Error of Measurement, Measurement
Sengul Avsar, Asiye; Tavsancil, Ezel – Educational Sciences: Theory and Practice, 2017
This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three samples sizes (100, 250 and 500)--were generated by conducting 20…
Descriptors: Test Items, Psychometrics, Nonparametric Statistics, Item Response Theory
Yang, Sophie Xin; Jowett, Sophia – Measurement in Physical Education and Exercise Science, 2013
The Coach-Athlete Relationship Questionnaire was developed to effectively measure affective, cognitive, and behavioral aspects, represented by the interpersonal constructs of closeness, commitment, and complementarity, of the quality of the relationship within the context of sport coaching. The current study sought to determine the internal…
Descriptors: Foreign Countries, Athletes, Athletic Coaches, Interpersonal Relationship
Kahraman, Nilufer; Thompson, Tony – Journal of Educational Measurement, 2011
A practical concern for many existing tests is that subscore test lengths are too short to provide reliable and meaningful measurement. A possible method of improving the subscale reliability and validity would be to make use of collateral information provided by items from other subscales of the same test. To this end, the purpose of this article…
Descriptors: Test Length, Test Items, Alignment (Education), Models
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2006
Many academic tests (e.g. short-answer and multiple-choice) sample required knowledge with questions scoring 0 or 1 (dichotomous scoring). Few textbooks give useful guidance on the length of test needed to do this reliably. Posey's binomial error model of 1932 provides the best starting point, but allows neither for heterogeneity of question…
Descriptors: Item Sampling, Tests, Test Length, Test Reliability
Multiple Choice and True/False Tests: Reliability Measures and Some Implications of Negative Marking
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004
The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…
Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items