Showing 1 to 15 of 64 results
Peer reviewed
Zumbo, Bruno D.; Hubley, Anita M. – Assessment in Education: Principles, Policy & Practice, 2016
Ultimately, measures in research, testing, assessment and evaluation are used, or have implications, for ranking, intervention, feedback, decision-making or policy purposes. Explicit recognition of this fact brings the often-ignored and sometimes maligned concept of consequences to the fore. Given that measures have personal and social…
Descriptors: Testing Programs, Testing Problems, Measurement Techniques, Student Evaluation
Peer reviewed
Porayska-Pomsta, Kaska; Mavrikis, Manolis; D'Mello, Sidney; Conati, Cristina; Baker, Ryan S. J. d. – International Journal of Artificial Intelligence in Education, 2013
Research on the relationship between affect and cognition in Artificial Intelligence in Education (AIEd) brings an important dimension to our understanding of how learning occurs and how it can be facilitated. Emotions are crucial to learning, but their nature, the conditions under which they occur, and their exact impact on learning for different…
Descriptors: Intelligent Tutoring Systems, Psychological Patterns, Data Collection, Affective Measures
Peer reviewed
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Santmire, Toni E. – 1984
The purpose of this paper is to discuss ways in which developmental psychology suffers from the lack of an appropriate technology of measurement and statistical analysis. The paper begins by noting that developmental psychology is the study of change; that individuals develop through a succession of "stages" which are separated by…
Descriptors: Data Analysis, Data Collection, Developmental Psychology, Developmental Stages
Peer reviewed
Green, Samuel B. – Educational and Psychological Measurement, 1981
The proportion of agreement, G, and kappa indexes are shown to differ in how they correct for chance agreements between two observers. On the basis of the findings, it is suggested that no single agreement index is appropriate for all sets of data. (Author/BW)
Descriptors: Comparative Analysis, Measurement Techniques, Test Reliability, Testing Problems
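As a rough illustration of how these indexes can diverge on the same data, the sketch below computes all three for two observers. The abstract does not give formulas, so the definitions used here are assumptions: the proportion of agreement is the share of matching codes, G is taken as Holley and Guilford's G = 2 * Po - 1 for two categories, and kappa is Cohen's kappa with chance agreement estimated from each observer's marginal proportions. The function name `agreement_indices` is hypothetical.

```python
from collections import Counter


def agreement_indices(rater_a, rater_b):
    """Return (p_o, g, kappa) for two observers' two-category codes.

    Assumed definitions (not taken from the abstract):
      p_o   = observed proportion of agreement
      g     = 2 * p_o - 1 (chance agreement fixed at 0.5)
      kappa = (p_o - p_e) / (1 - p_e), p_e from the observers' marginals
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: fraction of cases coded identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # G treats chance agreement as 0.5 regardless of the marginals.
    g = 2 * p_o - 1

    # Kappa estimates chance agreement from each observer's base rates.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    p_e = sum((count_a[c] / n) * (count_b[c] / n) for c in categories)
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else float("nan")

    return p_o, g, kappa


if __name__ == "__main__":
    observer_1 = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
    observer_2 = [1, 1, 1, 1, 1, 0, 1, 1, 1, 0]
    p_o, g, kappa = agreement_indices(observer_1, observer_2)
    print(f"p_o={p_o:.3f}  G={g:.3f}  kappa={kappa:.3f}")
```

With these skewed base rates the three values differ noticeably (about 0.8, 0.6, and 0.375), which is consistent with the abstract's point that the indexes correct for chance in different ways.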
Shale, Doug – 1986
This study is an attempt at a cohesive characterization of the concept of essay reliability. As such, it takes as a basic premise that previous and current practices in reporting reliability estimates for essay tests have certain shortcomings. The study provides an analysis of these shortcomings--partly to encourage a fuller understanding of the…
Descriptors: Analysis of Variance, Correlation, Error of Measurement, Essay Tests
Smith, Leon I.; Greenberg, Sandra – 1973
A discussion of selected applications of new tests developed within the context of a large-scale curriculum for educable mentally retarded (EMR) children, the Social Learning Curriculum (SLC), is presented in this paper which investigates three types of reliability that need to be demonstrated in order to provide a basis of these applications. The…
Descriptors: Curriculum Evaluation, Educational Research, Evaluation Methods, Measurement Techniques
Thrash, Susan K.; Porter, Andrew C. – 1974
The purpose of this paper is to prove that one currently recommended method of obtaining the reliability of an instrument defined on a population of aggregate units is invalid. This method randomly splits the aggregate into two halves, correlates the two half unit scores by a Pearson product moment correlation coefficient, and corrects the…
Descriptors: Comparative Analysis, Correlation, Measurement Techniques, Sampling
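For orientation only, the split-half procedure that the abstract describes (and argues is invalid) might be sketched as below. This is a minimal illustration under assumptions: each aggregate unit is represented as a list of member scores, a half-unit score is the mean of that half, and the names `pearson_r` and `split_half_correlation` are hypothetical. The correction step is cut off in the abstract, so it is deliberately left out.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half_correlation(units, seed=0):
    """Randomly split each aggregate unit in two and correlate the halves.

    `units` is a list of lists: one list of member scores per aggregate
    unit (an assumed data layout). Returns the uncorrected correlation
    between the two half-unit means across units.
    """
    rng = random.Random(seed)  # seeded so the split is reproducible
    first_half, second_half = [], []
    for scores in units:
        shuffled = list(scores)
        rng.shuffle(shuffled)
        cut = len(shuffled) // 2
        first_half.append(mean(shuffled[:cut]))
        second_half.append(mean(shuffled[cut:]))
    return pearson_r(first_half, second_half)


if __name__ == "__main__":
    # Three hypothetical aggregate units with four member scores each.
    units = [[1, 2, 1, 2], [5, 6, 5, 6], [9, 10, 9, 10]]
    print(split_half_correlation(units))
```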
Peer reviewed
Luftig, Jeffrey T.; Norton, Willis P. – Journal of Studies in Technical Careers, 1981
The purpose of this article is to review applications of reliability formulas and to recommend more appropriate methods of determining the reliability of affective instruments. (SK)
Descriptors: Affective Measures, Error of Measurement, Measurement Techniques, Test Reliability
Kaplan, Bruce A.; Johnson, Eugene G. – 1992
Across the field of educational assessment the case has been made for alternatives to the multiple-choice item type. Most of the alternative types of items require a subjective evaluation by a rater. The reliability of this subjective rating is a key component of these types of alternative items. In this paper, measures of reliability are…
Descriptors: Educational Assessment, Elementary Secondary Education, Estimation (Mathematics), Evaluators
PDF pending restoration
Cundiff, D.; Schwane, J. – 1977
Observations during research involving the Bruce Treadmill Test (BTMT) indicating that Stage III for females and Stage IV for males represented speeds which are intermediate between comfortable walking and comfortable jogging for many subjects, prompted this study to determine ways to obtain more consistent group results. Twenty-eight subjects…
Descriptors: Measurement Instruments, Measurement Techniques, Physical Activities, Predictor Variables
Quellmalz, Edys – 1980
Measurement problems which jeopardize the reliability and validity of competency-based writing assessments are analyzed. Methods to stabilize rating criteria and readers' application of them are necessary. Most writing assessment programs use guidelines from norm-referenced test methodology. Use of this method of criteria application based on…
Descriptors: Measurement Techniques, Scoring, Test Reliability, Testing Problems
Follman, John; And Others – 1974
Three substudies of effects of different formats on student ratings of faculty teaching effectiveness were conducted. One substudy investigated Kinds of Keys, Agreement, Evaluation, and Needs Improvement. The second, NO TUP, (New Observation of Teaching of University Professor Rating Scale), investigated numbers of positive rating categories. The…
Descriptors: College Faculty, College Students, Measurement Techniques, Rating Scales
Peer reviewed
De Santi, Roger J.; Sullivan, Vicki Gallo – Journal of Research and Development in Education, 1985
Cloze-based evaluations of reading comprehension present room for a greater amount of subjectivity in rating reader response. A study was designed to ascertain the nature of potential subjectivity within a single-rater's ratings of cloze-based assessments of reading comprehension. (DF)
Descriptors: Cloze Procedure, Elementary Secondary Education, Error of Measurement, Interrater Reliability
Dyer, Henry S. – NJEA Review, 1973
Retired vice-president of Educational Testing Service asserts that chances for tests being misused are greater than ever. Speech delivered at ETS's Invitational Conference on Testing Problems on October 28, 1972, in New York, New York. (DS)
Descriptors: Group Testing, Intelligence Tests, Measurement Techniques, Test Bias