Showing 241 to 255 of 418 results
Kemerer, Richard; Wahlstrom, Merlin – Performance and Instruction, 1985
Compares the features, learning outcomes tested, reliability, viability, and cost effectiveness of essay tests with those of interpretive tests used in training programs. Includes a case study showing how an essay test was converted to an interpretive test and pilot tested, illustrating the advantages of interpretive testing. (MBR)
Descriptors: Case Studies, Comparative Analysis, Cost Effectiveness, Essay Tests
Peer reviewed
Allison, Donald E. – Alberta Journal of Educational Research, 1984
Reports that no significant difference in reliability appeared between a heterogeneous and a homogeneous form of the same general science matching-item test administered to 316 sixth-grade students but that scores on the heterogeneous form of the test were higher, independent of the examinee's sex or intelligence. (SB)
Descriptors: Comparative Analysis, Comparative Testing, Elementary Education, Grade 6
Shrock, Sharon A.; Foshay, Wellesley R. – Performance and Instruction, 1984
Discusses methods of sampling the best information from instruction/training developers/candidates for professional certification and examines the problems of interpreting that information and making classification decisions. Assessment strategies including criterion-referenced, multiple-choice, short answer, and essay questions, and portfolio…
Descriptors: Certification, Competence, Criterion Referenced Tests, Instructional Development
Peer reviewed
Scott, Owen; Hsu, Yi-Ming – Perceptual and Motor Skills, 1982
Based on data from classes of 23 instructors in two institutions, it was concluded that specific items on an appraisal-of-instruction inventory probably do not influence students' global appraisals of instruction. If true, this conclusion has important implications for use of such inventories in appraising the effectiveness of college instruction.…
Descriptors: College Instruction, Course Evaluation, Higher Education, Student Evaluation of Teacher Performance
Peer reviewed
Baldauf, Richard B., Jr. – Educational and Psychological Measurement, 1982
A Monte Carlo design examined how the effects of guessing and item dependence influence test characteristics and student scores. Although validity for cloze variants was high, multiple-choice cloze had significantly lower reliabilities than did true score equivalents. (Author/PN)
Descriptors: Cloze Procedure, Elementary Education, Guessing (Tests), Reading Comprehension
Peer reviewed
Douglas, Dan – Annual Review of Applied Linguistics, 1995
Reviews recent theoretical, methodological, and analytical developments in language testing, focusing on more refined models of language ability, reliability and validity, performance testing, innovative test formats, and new applications of Item Response Theory and Generalizability Theory to test performance. An annotated bibliography discusses seven…
Descriptors: Annotated Bibliographies, Evaluation Methods, Language Proficiency, Language Tests
Peer reviewed
Woodburn, Jim; Sutcliffe, Nick – Assessment & Evaluation in Higher Education, 1996
The Objective Structured Clinical Examination (OSCE), initially developed for undergraduate medical education, has been adapted for assessment of clinical skills in podiatry students. A 12-month pilot study found the test had relatively low levels of reliability, high construct and criterion validity, and good stability of performance over time.…
Descriptors: Clinical Teaching (Health Professions), Higher Education, Medical Education, Podiatry
Rodriguez-Aragon, Graciela; And Others – 1993
The predictive power of the Split-Half version of the Wechsler Intelligence Scale for Children--Revised (WISC-R) Object Assembly (OA) subtest was compared to that of the full administration of the OA subtest. A cohort of 218 male and 49 female adolescent offenders detained in a Texas juvenile detention facility between 1990 and 1992 was used. The…
Descriptors: Adolescents, Cohort Analysis, Comparative Testing, Correlation
Steele, Cam Monroe; Reinsch, N. L., Jr. – 1983
An instrument for measuring telephone apprehension was developed to facilitate research into hypothesized relationships between communication apprehension and telephone apprehension. A set of 92 Likert-type items was adapted from previous communication apprehension scales and administered to 81 undergraduate students in a speech communication…
Descriptors: Adults, Attitude Measures, Communication Apprehension, Communication Research
Herman, Joan – 1984
Diagnostic testing can provide specific information about student skills as a decision-making aid to teachers in prescribing instruction, identifying needs for remediation, determining effective instructional materials and methods, and ultimately, improving student learning. Diagnostic testing, as viewed here, includes individual and group…
Descriptors: Diagnostic Tests, Elementary Secondary Education, Skill Analysis, Student Evaluation
Brown, Stephen W. – 1987
A "modified essay examination" was used to help teach and to assess clinical problem-solving skills with 11 first trimester doctoral students. This examination provided a paper-and-pencil simulation of problems encountered in case management. Students were required to generate hypotheses, formulate questions, discuss issues, and make…
Descriptors: Case Records, Clinical Experience, Clinical Psychology, Essay Tests
Stewart, E. Elizabeth – 1981
Context effects are defined as influences on test performance associated with the content of successively presented test items or sections. Four types of context effects are identified: (1) direct context effects (practice effects) which occur when performance on items is affected by the examinee having been exposed to similar types of…
Descriptors: Context Effect, Data Collection, Error of Measurement, Evaluation Methods
Walker, Clinton B. – 1978
Standards for evaluating criterion-referenced tests are presented. Twenty-one standards, grouped in three categories, are discussed. Category one is defined as measurement properties and comprises conceptual validity, including description of the domain, test item agreement with objectives, and item representativeness of the objectives; and…
Descriptors: Course Objectives, Criterion Referenced Tests, Evaluation Criteria, Scoring
Peer reviewed
Onore, Cynthia S. – Journal of Reading, 1986
Reviews the Stanford Writing Assessment Program that has three intended uses: district-wide survey of students' writing ability, diagnosis of classroom or district-wide instructional strengths and weaknesses, and staff development through training for administering and scoring of writing samples. Notes that of these uses, none is necessarily best…
Descriptors: Educational Assessment, Educational Diagnosis, Evaluation Methods, Staff Development
Peer reviewed
Frisbie, David A.; Sweeney, Daryl C. – Journal of Educational Measurement, 1982
A 100-item, five-choice multiple-choice (MC) biology final exam was converted to multiple-choice true-false (MTF) form to yield two content-parallel test forms composed of the two item types. Students found the MTF items easier and preferred MTF over MC; the MTF subtests were more reliable. (Author/GK)
Descriptors: Biology, College Science, Comparative Analysis, Difficulty Level