Tengberg, Michael – Language Assessment Quarterly, 2018
Reading comprehension is often treated as a multidimensional construct. In many reading tests, items are distributed over reading process categories to represent the subskills expected to constitute comprehension. This study explores (a) the extent to which specified subskills of reading comprehension tests are conceptually conceivable to…
Descriptors: Reading Tests, Reading Comprehension, Scores, Test Results

Bakermans-Kranenburg, Marian J; van IJzendoorn, Marinus H. – Developmental Psychology, 1993
Examined the validity of the Adult Attachment Interview (AAI) measure by interviewing 83 mothers twice over 2 months, using different interviewers on each occasion. The results indicated that the reliability of the AAI classifications was quite high over time and across interviewers. The AAI classifications were independent of nonattachment…
Descriptors: Attachment Behavior, Examiners, Interrater Reliability, Mothers
Arnold, Margery E. – 1996
It is incorrect to say "the test is reliable" because reliability is a function not only of the test itself, but of many factors. The present paper explains how different factors affect classical reliability estimates such as test-retest, interrater, internal consistency, and equivalent forms coefficients. Furthermore, the limits of classical test…
Descriptors: Estimation (Mathematics), Generalizability Theory, Heuristics, Interrater Reliability

Van Balen, H. G. G.; Van Limbeek, J.; De Mey, H. R. A. – International Journal of Rehabilitation Research, 1997
Forty neuropsychologists, neurologists, psychiatrists, and physiatrists identified neurologically relevant items (NRIs) in the Minnesota Multiphasic Personality Inventory-2 (MMPI-2). Raters identified four sets of NRIs: one for brain damage in general and three partially overlapping sets for stroke, traumatic brain damage, and whiplash.…
Descriptors: Clinical Diagnosis, Head Injuries, Interrater Reliability, Neurological Impairments
Naizer, Gilbert – 1992
A measurement approach called generalizability theory (G-theory) is an important alternative to the more familiar classical measurement theory, which yields less useful coefficients such as alpha or the KR-20 coefficient. G-theory is a theory about the dependability of behavioral measurements that allows the simultaneous estimation of multiple…
Descriptors: Error of Measurement, Estimation (Mathematics), Generalizability Theory, Higher Education
Shale, Doug – 1986
This study is an attempt at a cohesive characterization of the concept of essay reliability. As such, it takes as a basic premise that previous and current practices in reporting reliability estimates for essay tests have certain shortcomings. The study provides an analysis of these shortcomings--partly to encourage a fuller understanding of the…
Descriptors: Analysis of Variance, Correlation, Error of Measurement, Essay Tests
Fuchs, Douglas; And Others – 1985
The present investigation represents a systematic effort to determine whether handicapped children have been included in the development of test norms, items, and indices of reliability and validity. It analyzed up-to-date user manuals and technical supplements of 27 well-known and widely used aptitude and achievement tests. Study procedure…
Descriptors: Achievement Tests, Aptitude Tests, Disabilities, Elementary Secondary Education

Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991
Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)
Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques
Sullivan, Francis J. – 1986
A study examined how pragmatic form influences evaluation of student essays in university placement testing. Specifically, the study documented how patterns in students' use of information (assumed to be either old, inferable, or new for readers) affected the holistic scores for quality given to the essays. Subjects, 99 randomly selected entering…
Descriptors: College Freshmen, Essay Tests, Evaluation Criteria, Evaluation Methods
Florida State Dept. of Education, Tallahassee. Div. of Vocational, Adult, and Community Education. – 1991
This packet contains a manual and a workbook for developing performance tests in vocational education. The manual gives an in-depth description of how to develop, score, and use performance tests. It includes the following sections: definitions of performance testing, steps in developing a performance test, selecting a performance development…
Descriptors: Interrater Reliability, Performance Tests, Postsecondary Education, Scoring