Cai, Zhiqiang; Li, Hiyiang; Hu, Xiangen; Graesser, Art – Grantee Submission, 2016
This paper provides an alternative way of document representation by treating topic probabilities as a vector representation for words and representing a document as a combination of the word vectors. A comparison on summary data shows that this representation is more effective in document classification. [This paper was published in:…
Descriptors: Probability, Natural Language Processing, Models, Automation
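
The representation described in this abstract can be illustrated with a minimal sketch. It assumes each word already has a vector of topic probabilities and that the "combination" is a plain average; the abstract does not say which combination the authors use. The function name `document_vector` and the toy values are hypothetical.

```python
def document_vector(tokens, word_topic_probs):
    """Average the topic-probability vectors of the known words in a document.

    Assumption: averaging stands in for the unspecified "combination"
    of word vectors mentioned in the abstract.
    """
    known = [t for t in tokens if t in word_topic_probs]
    if not known:
        return []  # no known words, so no vector can be built
    dim = len(word_topic_probs[known[0]])
    total = [0.0] * dim
    for token in known:
        for i, prob in enumerate(word_topic_probs[token]):
            total[i] += prob
    return [value / len(known) for value in total]


# Hypothetical toy data: two topics, three words.
word_topic_probs = {
    "cluster": [0.75, 0.25],
    "retrieval": [0.5, 0.5],
    "bayes": [0.25, 0.75],
}

# "unknown" has no topic vector and is skipped.
doc_vec = document_vector(["cluster", "retrieval", "unknown"], word_topic_probs)
print(doc_vec)  # [0.625, 0.375]
```

The resulting vector could then be passed to any standard classifier as the document's feature vector.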

Shaw, W. M., Jr. – Information Processing and Management, 1990
Investigates the presence of clustering structure in a document collection and the influence of the presence of clustering structure on the success of cluster-based retrieval. Term-weight and similarity thresholds are discussed, empirical and statistical significance are considered, and indexing exhaustivity for document representation is…
Descriptors: Cluster Grouping, Documentation, Indexing, Information Retrieval

Murray, Daniel McClure – 1972
A retrieval system is considered in which document descriptions are stored and accessed in groups called clusters. All items in a cluster meet common similarity criteria and are represented by a composite entity called a profile. In large collections, profiles themselves are clustered and additional levels of profiles are generated. This entire…
Descriptors: Cluster Grouping, Doctoral Dissertations, Documentation, Information Retrieval

Rasmussen, Edie M.; And Others – Information Processing and Management, 1991
This issue contains nine articles that provide an overview of trends and research in parallel information retrieval. Topics discussed include network design for text searching; the Connection Machine System; PThomas, an adaptive information retrieval system on the Connection Machine; algorithms for document clustering; and system architecture for…
Descriptors: Algorithms, Cluster Grouping, Computer Networks, Computer System Design

Gordon, Michael D. – Journal of the American Society for Information Science, 1991
Discussion of clustering of documents and queries in information retrieval systems focuses on the use of a genetic algorithm to adapt subject descriptions so that documents become more effective in matching relevant queries. Various types of clustering are explained, and simulation experiments used to test the genetic algorithm are described. (27…
Descriptors: Algorithms, Cluster Grouping, Documentation, Information Retrieval

White, Lee J.; And Others – 1975
The major advantage of sequential classification, a technique for automatically classifying documents into previously selected categories, is that the entire document need not be processed before it is classified. This method assumes the availability of a priori categories, a selection of keywords representative of these categories, and the a…
Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification
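
The early-stopping idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the report's algorithm: it assumes a naive-Bayes-style posterior update per keyword and a fixed confidence threshold as the stopping rule, neither of which is given in the truncated abstract. All names, probabilities, and the threshold are hypothetical.

```python
def classify_sequentially(keywords, priors, likelihoods, threshold=0.9, floor=1e-6):
    """Update category posteriors one keyword at a time and stop early.

    Assumptions (not from the source): keywords are treated as independent
    given the category, and classification stops once any posterior
    reaches `threshold`. `floor` is a small likelihood for unseen keywords.
    Returns (best_category, number_of_keywords_processed).
    """
    posterior = dict(priors)
    seen = 0
    for keyword in keywords:
        seen += 1
        # Bayes update: multiply by the keyword likelihood, then renormalize.
        for category in posterior:
            posterior[category] *= likelihoods[category].get(keyword, floor)
        total = sum(posterior.values())
        posterior = {c: p / total for c, p in posterior.items()}
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            return best, seen  # confident enough; the rest is never read
    return max(posterior, key=posterior.get), seen


# Hypothetical a priori categories, keyword likelihoods, and priors.
priors = {"ir": 0.5, "stats": 0.5}
likelihoods = {
    "ir": {"cluster": 0.4, "retrieval": 0.4, "bayes": 0.05},
    "stats": {"cluster": 0.1, "retrieval": 0.05, "bayes": 0.5},
}

label, used = classify_sequentially(
    ["cluster", "retrieval", "bayes", "bayes"], priors, likelihoods
)
print(label, used)  # ir 2
```

In this toy run the document is classified after two of its four keywords, which is the advantage the abstract describes: the entire document need not be processed.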

Shaw, W. M., Jr. – Information Processing and Management, 1993
Describes a study conducted on the cystic fibrosis (CF) database, a subset of MEDLINE, that investigated clustering structure and the effectiveness of cluster-based retrieval as a function of the exhaustivity of the uncontrolled subject descriptions. Results are compared to calculations for controlled descriptions based on Medical Subject Headings…
Descriptors: Bibliographic Records, Cluster Analysis, Cluster Grouping, Comparative Analysis

Kar, B. Gautam; White, Lee J. – 1975
The feasibility of using a distance measure, called the Bayesian distance, for automatic sequential document classification was studied. Results indicate that, by observing the variation of this distance measure as keywords are extracted sequentially from a document, the occurrence of noisy keywords may be detected. This property of the distance…
Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification