Descriptor
Higher Education | 23 |
Test Reliability | 23 |
Test Theory | 23 |
Test Validity | 13 |
Test Construction | 9 |
Test Interpretation | 7 |
Measurement Techniques | 6 |
Comparative Analysis | 5 |
Criterion Referenced Tests | 4 |
Foreign Countries | 4 |
Item Analysis | 4 |
Source
British Educational Research… | 1 |
Evaluation Practice | 1 |
Freshman English News | 1 |
Reading Research and… | 1 |
Research Quarterly for… | 1 |
Author
Bachman, Lyle F. | 1 |
Cason, Gerald J. | 1 |
Chase, Clinton I. | 1 |
Clemmons, Sandra | 1 |
Dewalt, Mark W. | 1 |
Feldt, Leonard S. | 1 |
Gamache, LeAnn M. | 1 |
Goulden, Nancy Rost | 1 |
Haladyna, Tom | 1 |
Harris, Jimmy Carl | 1 |
Houston, Robert | 1 |
Audience
Practitioners | 3 |
Teachers | 2 |
Administrators | 1 |
Researchers | 1 |
Assessments and Surveys
California Critical Thinking… | 1 |
Learning and Study Strategies… | 1 |
New Jersey College Basic… | 1 |
Watson Glaser Critical… | 1 |

Seddon, G. M. – British Educational Research Journal, 1988
Demonstrates that some commonly used indices can be misleading in their quantification of reliability. The effects are most pronounced on gain or difference scores. Proposals are made to avoid sources of invalidity by using a procedure to assess reliability in terms of upper and lower limits for the true scores of each examinee. (Author/JDH)
Descriptors: Foreign Countries, Higher Education, Research Problems, Statistical Studies
Houston, Robert – Freshman English News, 1981
Provides information on statistical data and jargon so that English department members can more confidently and responsibly identify the inevitable weaknesses and limitations of both tests of writing ability and the research on them. (RL)
Descriptors: Higher Education, Measurement Techniques, Standardized Tests, Test Reliability
Goulden, Nancy Rost – 1989
Since speech communication evaluators are beginning to adapt the analytic and holistic instruments and methods used for rating written products to oral products and performance, this research review investigated: (1) what the labels "analytic" and "holistic" mean; (2) the theoretical bases of the two scoring approaches; and (3)…
Descriptors: Comparative Analysis, Higher Education, Holistic Evaluation, Rating Scales

Wheeler, Patricia H. – Evaluation Practice, 1995
This volume is the fourth in a series for college faculty and advanced graduate students, "Survival Skills for Scholars." It offers practical advice for developing, using, and grading classroom examinations, focusing on traditional multiple-choice and constructed-response tests rather than alternative assessments. (SLD)
Descriptors: College Faculty, Constructed Response, Grading, Higher Education
Ross, Steven; Hua, Te-Fang – 1994
A general issue related to language program development involves the empirical rationalization of cut score decisions in criterion-referenced language tests. Cut score dependability focuses on the consistency of the decisions in repeated testing or the assessment of language learner performances. In this case, the issue is to determine the optimal…
Descriptors: Achievement Gains, Criterion Referenced Tests, English (Second Language), Higher Education
Sammon, Susan F. – 1988
A study investigated whether a positive correlation existed between scores obtained by incoming freshmen on the recently developed Degrees of Reading Power Test (DRP) and the required Reading Comprehension subtest of the New Jersey College Basic Skills Placement Test (NJCBSPT). The subjects, 217 William Paterson College freshmen enrolled in a…
Descriptors: Comparative Analysis, Comparative Testing, Correlation, Educational Testing
Dewalt, Mark W.; Loyd, Brenda H. – 1985
Attitude measurement through Likert-type surveys usually provides no opportunity to assess the importance of the statements of the subjects. This study, involving 479 graduate and undergraduate students, examines the question of whether importance and agreement measures have different underlying dimensions, and examines the question of whether the…
Descriptors: Affective Measures, Attitude Measures, Factor Analysis, Factor Structure

Feldt, Leonard S.; Spray, Judith A. – Research Quarterly for Exercise and Sport, 1983
The reliabilities of two types of measurement plans were compared across six hypothetical distributions of true scores or abilities. The measurement plans were: (1) fixed-length, where the number of trials for all examinees is set in advance; and (2) trials-to-criterion, where examinees must keep trying until they complete a given number of trials…
Descriptors: Criterion Referenced Tests, Evaluation Methods, Higher Education, Measurement Techniques
Naizer, Gilbert – 1992
A measurement approach called generalizability theory (G-theory) is an important alternative to the more familiar classical measurement theory, which yields less useful coefficients such as alpha or the KR-20 coefficient. G-theory is a theory about the dependability of behavioral measurements that allows the simultaneous estimation of multiple…
Descriptors: Error of Measurement, Estimation (Mathematics), Generalizability Theory, Higher Education
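For readers unfamiliar with the classical coefficients named in the abstract above, the following is a minimal sketch of how coefficient alpha and KR-20 are conventionally computed from an examinee-by-item score matrix. It is not taken from the Naizer paper: the function names and the small `responses` data set are hypothetical, and population variances are assumed throughout so that the two coefficients coincide for right/wrong (0/1) items.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of examinee rows (one score per item)."""
    k = len(scores[0])  # number of items
    # Variance of each item across examinees (population variance assumed).
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the examinees' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """KR-20 for dichotomous (0/1) items; uses p*(1-p) as each item variance."""
    k = len(scores[0])
    n = len(scores)
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n  # proportion answering item i correctly
        pq_sum += p * (1 - p)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical data: 5 examinees by 4 right/wrong items.
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

print(cronbach_alpha(responses), kr20(responses))
```

Both functions summarize measurement error with a single coefficient. G-theory, as the abstract describes it, instead estimates multiple sources of error at once, which this sketch does not attempt.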
Bachman, Lyle F.; And Others – 1993
This paper outlines the development of a performance assessment measure of language speaking ability, the Language Ability Assessment System (LAAS), which is highly reliable and can be examined for reliability through modern measurement theories, such as generalizability theory (G-theory) and the many-facet Rasch theory. LAAS was developed to…
Descriptors: College Students, Higher Education, Interrater Reliability, Language Proficiency
Gamache, LeAnn M. – 1983
Scales constructed under procedures and criteria outlined by the various traditional and latent trait methods were examined as to whether they varied in characteristics related to scale quality. Scales were constructed from a common pool of items analyzed in full form according to Likert and a one-parameter Rasch model for non-dichotomous data.…
Descriptors: Comparative Analysis, Correlation, Higher Education, Item Analysis

Nist, Sherrie L.; And Others – Reading Research and Instruction, 1990
Investigates the utility and predictive validity of the Learning and Study Strategies Inventory (LASSI) as a means of measuring college students' cognitive and affective growth following a study strategies course. Finds cognitive and affective growth in both regularly admitted and developmental studies students. Finds that LASSI cannot yet be used…
Descriptors: Affective Measures, Cognitive Measurement, College Students, Developmental Studies Programs
Cason, Gerald J.; And Others – 1983
Prior research in a single clinical training setting has shown Cason and Cason's (1981) simplified model of their performance rating theory can improve rating reliability and validity through statistical control of rater stringency error. Here, the model was applied to clinical performance ratings of 14 cohorts (about 250 students and 200 raters)…
Descriptors: Clinical Experience, Error of Measurement, Evaluation Methods, Higher Education
Soh, Kay Cheng – 1986
This research study investigates the validity of the Teacher Locus of Control Scale (TLCS) in a different cultural environment. The scale, developed in the United States by Taylor et al., measures teachers' beliefs about their own potential to influence student performance and classroom events. Specifically, the study investigates the…
Descriptors: College Faculty, Correlation, Cultural Context, Foreign Countries
Marsh, Herbert W.; And Others – 1984
Items from two American instruments (Students' Evaluation of Educational Effectiveness, and the Endeavor instrument) designed to measure students' evaluations of teaching effectiveness were translated into Spanish and administered to a sample of Spanish university students. Most of the items were judged by the students to be appropriate; every…
Descriptors: Attitude Measures, Factor Analysis, Factor Structure, Foreign Countries