Kelly, William E.; Daughtry, Don – College Student Journal, 2018
This study developed an abbreviated form of Barron's (1953) Ego Strength Scale for use in research among college student samples. A version of Barron's scale was administered to 100 undergraduate college students. Using item-total score correlations and internal consistency, the scale was reduced to 18 items (Es18). The Es18 possessed adequate…
Descriptors: Undergraduate Students, Self Concept Measures, Test Length, Scores
Paek, Insu – Educational and Psychological Measurement, 2016
The effect of guessing on the point estimate of coefficient alpha has been studied in the literature, but the impact of guessing and its interactions with other test characteristics on the interval estimators for coefficient alpha has not been fully investigated. This study examined the impact of guessing and its interactions with other test…
Descriptors: Guessing (Tests), Computation, Statistical Analysis, Test Length
Anthony, Christopher James; DiPerna, James Clyde – School Psychology Quarterly, 2017
The Academic Competence Evaluation Scales-Teacher Form (ACES-TF; DiPerna & Elliott, 2000) was developed to measure student academic skills and enablers (interpersonal skills, engagement, motivation, and study skills). Although ACES-TF scores have demonstrated psychometric adequacy, the length of the measure may be prohibitive for certain…
Descriptors: Test Items, Efficiency, Item Response Theory, Test Length
Bulut, Okan; Kan, Adnan – Eurasian Journal of Educational Research, 2012
Problem Statement: Computerized adaptive testing (CAT) is a sophisticated and efficient way of delivering examinations. In CAT, items for each examinee are selected from an item bank based on the examinee's responses to the items. In this way, the difficulty level of the test is adjusted based on the examinee's ability level. Instead of…
Descriptors: Adaptive Testing, Computer Assisted Testing, College Entrance Examinations, Graduate Students
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models

Cureton, Edward E.; And Others – Educational and Psychological Measurement, 1973
Study based on F. M. Lord's arguments in 1957 and 1959 that tests of the same length do have the same standard error of measurement. (CB)
Descriptors: Error of Measurement, Statistical Analysis, Test Interpretation, Test Length

Rowley, Glenn – Journal of Educational Measurement, 1978
The reliabilities of various observational measures were determined, and the influence of both the number and the length of the observation periods on reliability was examined, both separately and jointly. A single simplifying assumption leads to a variant of the Spearman-Brown formula, which may have wider application. (Author/CTM)
Descriptors: Career Development, Classroom Observation Techniques, Observation, Reliability

Budescu, David – Journal of Educational Measurement, 1985
An important determinant of equating process efficiency is the correlation between the anchor test and components of each form. Use of some monotonic function of this correlation as a measure of equating efficiency is suggested. A model relating anchor test length and test reliability to this measure of efficiency is presented. (Author/DWH)
Descriptors: Correlation, Equated Scores, Mathematical Models, Standardized Tests
Livingston, Samuel A. – 1984
Much previously published material for estimating the reliability of classification has been based on the assumption that a test consists of a known number of equally weighted items. The test score is the number of those items answered correctly. These methods cannot be used with classifications based on weighted composite scores, especially if…
Descriptors: Equated Scores, Essay Tests, Estimation (Mathematics), Mathematical Models

Kristof, Walter – Psychometrika, 1971
Descriptors: Cognitive Measurement, Error of Measurement, Mathematical Models, Psychological Testing

Gilmer, Jerry S.; Feldt, Leonard S. – 1982
The Feldt-Gilmer congeneric reliability coefficients make it possible to estimate the reliability of a test composed of parts of unequal, unknown length. The approximate standard errors of the Feldt-Gilmer coefficients are derived via a method using the multivariate Taylor's expansion. Monte Carlo simulation is employed to corroborate the…
Descriptors: Educational Testing, Error of Measurement, Mathematical Formulas, Mathematical Models
de Jong, John H. A. L. – 1984
The Netherlands' secondary education system is highly differentiated, with four different school types for four scholastic ability levels. Final examinations must accommodate these four levels, and require a test-independent definition of the intended final ability levels as well as a sample-free evaluation of the range of ability levels at which…
Descriptors: Difficulty Level, Efficiency, Equated Scores, Foreign Countries
Subkoviak, Michael J. – 1977
Four different procedures were used for estimating the proportion of persons who would be classified consistently as either passing both of two parallel tests or failing both. These four methods were applied at each of four different mastery level scores for each of three different length tests. Data were based on 50 replications of each procedure…
Descriptors: Criterion Referenced Tests, Cutting Scores, Data Analysis, Data Collection
Hambleton, Ronald K. – 1986
The problem of determining optimal test lengths with fixed total testing time has proved to be a difficult one for criterion-referenced test developers. An algorithm is needed which can be used by test developers to allocate available testing time to maximize the validity of their total criterion-referenced tests or testing programs. To be…
Descriptors: Algorithms, Criterion Referenced Tests, Elementary Secondary Education, Psychometrics
de Jong, John H. A. L. – Toegepaste taalwetenschap in artikelen 20, 1984
A study investigated the validity of an English listening skills test by comparing the results of native speakers of American and British English with those of Dutch students of English as a second language. A hypothesis suggested that two-thirds of the items would test listening skills and the remaining third would test other knowledge. Test results…
Descriptors: Age Differences, Comparative Analysis, Correlation, Educational Background