ERIC - Search Results

Showing all 8 results

Can Word Probabilities from LDA Be Simply Added up to Represent Documents?

Peer reviewed

Cai, Zhiqiang; Li, Hiyiang; Hu, Xiangen; Graesser, Art – Grantee Submission, 2016

This paper provides an alternative way of document representation by treating topic probabilities as a vector representation for words and representing a document as a combination of the word vectors. A comparison on summary data shows that this representation is more effective in document classification. [This paper was published in:…

Descriptors: Probability, Natural Language Processing, Models, Automation
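The representation described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the word vectors are combined, so a plain sum followed by normalization is assumed here, in line with the "added up" wording of the title. The `WORD_TOPICS` table, its values, and the function name `document_vector` are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical topic-probability vectors for a few words (toy values,
# not taken from the paper). Each list holds one weight per topic.
WORD_TOPICS = {
    "cluster": [0.7, 0.2, 0.1],
    "retrieval": [0.6, 0.3, 0.1],
    "bayes": [0.1, 0.8, 0.1],
    "keyword": [0.2, 0.5, 0.3],
}


def document_vector(tokens, word_topics=WORD_TOPICS):
    """Sum the topic vectors of the known tokens, then normalize to sum to 1.

    Assumption: the "combination" of word vectors is a simple sum.
    """
    n_topics = len(next(iter(word_topics.values())))
    total = [0.0] * n_topics
    for token, count in Counter(tokens).items():
        vec = word_topics.get(token)
        if vec is None:
            continue  # skip out-of-vocabulary words
        for k in range(n_topics):
            total[k] += count * vec[k]
    norm = sum(total)
    # A document with no known words keeps an all-zero vector.
    return [x / norm for x in total] if norm > 0 else total
```

The resulting vector can then be passed to any standard classifier as a fixed-length feature vector.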

An Investigation of Document Structures.

Peer reviewed

Shaw, W. M., Jr. – Information Processing and Management, 1990

Investigates the presence of clustering structure in a document collection and its influence on the success of cluster-based retrieval. Term-weight and similarity thresholds are discussed, empirical and statistical significance are considered, and indexing exhaustivity for document representation is…

Descriptors: Cluster Grouping, Documentation, Indexing, Information Retrieval

Document Retrieval Based on Clustered Files.

Murray, Daniel McClure – 1972

A retrieval system is considered in which document descriptions are stored and accessed in groups called clusters. All items in a cluster meet common similarity criteria and are represented by a composite entity called a profile. In large collections, profiles themselves are clustered and additional levels of profiles are generated. This entire…

Descriptors: Cluster Grouping, Doctoral Dissertations, Documentation, Information Retrieval
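The idea of storing documents in clusters that are each represented by a profile can be sketched as follows. This is only an illustration under stated assumptions: the profile is taken to be the average of the term-weight vectors in the cluster, similarity is cosine similarity, and only a single level of profiles is shown. The function names `cosine`, `profile`, and `retrieve` are hypothetical, and the dissertation's actual profile construction may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def profile(docs):
    """Assumed profile: the average term weights of a non-empty cluster."""
    terms = {}
    for doc in docs:
        for t, w in doc.items():
            terms[t] = terms.get(t, 0.0) + w
    return {t: w / len(docs) for t, w in terms.items()}


def retrieve(query, clusters):
    """Pick the cluster whose profile best matches the query,
    then rank only that cluster's documents against the query."""
    best = max(clusters, key=lambda docs: cosine(query, profile(docs)))
    return sorted(best, key=lambda doc: cosine(query, doc), reverse=True)
```

Here `clusters` is a non-empty list of non-empty lists of term-weight dicts. Comparing the query with profiles first means most documents are never scored individually, which is the usual motivation for cluster-based access.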

Parallel Processing and Information Retrieval.

Peer reviewed

Rasmussen, Edie M.; And Others – Information Processing and Management, 1991

This issue contains nine articles that provide an overview of trends and research in parallel information retrieval. Topics discussed include network design for text searching; the Connection Machine System; PThomas, an adaptive information retrieval system on the Connection Machine; algorithms for document clustering; and system architecture for…

Descriptors: Algorithms, Cluster Grouping, Computer Networks, Computer System Design

User-Based Document Clustering by Redescribing Subject Descriptions with a Genetic Algorithm.

Peer reviewed

Gordon, Michael D. – Journal of the American Society for Information Science, 1991

Discussion of clustering of documents and queries in information retrieval systems focuses on the use of a genetic algorithm to adapt subject descriptions so that documents become more effective in matching relevant queries. Various types of clustering are explained, and simulation experiments used to test the genetic algorithm are described. (27…

Descriptors: Algorithms, Cluster Grouping, Documentation, Information Retrieval

A Sequential Method for Automatic Document Classification.

White, Lee J.; And Others – 1975

The major advantage of sequential classification, a technique for automatically classifying documents into previously selected categories, is that the entire document need not be processed before it is classified. This method assumes the availability of a priori categories, a selection of keywords representative of these categories, and the a…

Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification
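The point that a document need not be read in full before it is classified can be illustrated with a small sequential Bayesian sketch. The abstract is truncated, so everything specific here is an assumption: the per-category keyword likelihoods, the category priors, the stopping threshold, the floor value for unseen keywords, and the function name `classify_sequentially` are all hypothetical and not taken from the report.

```python
def classify_sequentially(keywords, priors, likelihoods,
                          threshold=0.9, floor=1e-3):
    """Update category posteriors one keyword at a time and stop early.

    priors:      dict category -> prior probability (assumed input)
    likelihoods: dict category -> dict keyword -> P(keyword | category)
    floor:       assumed likelihood for keywords a category has not seen
    Returns (best_category, keywords_read, posterior).
    """
    posterior = dict(priors)
    for seen, kw in enumerate(keywords, start=1):
        # Bayes update with the next keyword, then renormalize.
        for cat in posterior:
            posterior[cat] *= likelihoods[cat].get(kw, floor)
        total = sum(posterior.values())
        posterior = {cat: p / total for cat, p in posterior.items()}
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            return best, seen, posterior  # confident enough: stop reading
    # Ran out of keywords: return the most probable category so far.
    best = max(posterior, key=posterior.get)
    return best, len(keywords), posterior
```

The second return value shows how many keywords were consumed before a decision was reached, which is the quantity a sequential method tries to keep small.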

Controlled and Uncontrolled Subject Descriptions in the CF Database: A Comparison of Optimal Cluster-Based Retrieval Results.

Peer reviewed

Shaw, W. M., Jr. – Information Processing and Management, 1993

Describes a study conducted on the cystic fibrosis (CF) database, a subset of MEDLINE, that investigated clustering structure and the effectiveness of cluster-based retrieval as a function of the exhaustivity of the uncontrolled subject descriptions. Results are compared to calculations for controlled descriptions based on Medical Subject Headings…

Descriptors: Bibliographic Records, Cluster Analysis, Cluster Grouping, Comparative Analysis

A Distance Measure for Automatic Sequential Document Classification.

Kar, B. Gautam; White, Lee J. – 1975

The feasibility of using a distance measure, called the Bayesian distance, for automatic sequential document classification was studied. Results indicate that, by observing the variation of this distance measure as keywords are extracted sequentially from a document, the occurrence of noisy keywords may be detected. This property of the distance…

Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification