ERIC - Search Results

Showing all 3 results

A Sequential Method for Automatic Document Classification.

White, Lee J.; And Others – 1975

The major advantage of sequential classification, a technique for automatically classifying documents into previously selected categories, is that the entire document need not be processed before it is classified. This method assumes the availability of a priori categories, a selection of keywords representative of these categories, and the a…

Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification
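
As a rough illustration of the idea in the abstract above (assigning a category before the whole document has been read), the following sketch updates category posteriors one keyword at a time and stops as soon as one category is probable enough. The abstract is truncated and does not state the authors' actual decision rule, so the Bayes-style update, the stopping threshold, the fallback probability for unlisted keywords, and all names (`sequential_classify`, `keyword_probs`, `threshold`, `unseen_prob`) are assumptions made for illustration, not the method of the report.

```python
from typing import Dict, Iterable, Optional, Tuple


def sequential_classify(
    keywords: Iterable[str],
    priors: Dict[str, float],
    keyword_probs: Dict[str, Dict[str, float]],
    threshold: float = 0.95,    # assumed stopping rule, not from the report
    unseen_prob: float = 1e-3,  # assumed fallback for keywords a category lacks
) -> Tuple[Optional[str], int, Dict[str, float]]:
    """Read keywords in order; stop once one category's posterior >= threshold.

    Returns (category or None, number of keywords consumed, final posteriors).
    """
    posterior = dict(priors)
    consumed = 0
    for word in keywords:
        consumed += 1
        # Multiply each category's belief by P(word | category).
        for cat in posterior:
            posterior[cat] *= keyword_probs[cat].get(word, unseen_prob)
        # Renormalize so the posteriors sum to 1.
        total = sum(posterior.values())
        posterior = {cat: p / total for cat, p in posterior.items()}
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            return best, consumed, posterior  # early stop: rest is never read
    return None, consumed, posterior  # keywords exhausted without a decision


# Hypothetical categories and keyword probabilities, for illustration only.
priors = {"physics": 0.5, "biology": 0.5}
keyword_probs = {
    "physics": {"quantum": 0.3, "cell": 0.01},
    "biology": {"quantum": 0.01, "cell": 0.3},
}
label, used, _ = sequential_classify(
    ["quantum", "quantum", "cell", "energy"], priors, keyword_probs, threshold=0.99
)
print(label, used)  # decides "physics" after the second keyword
```

In this toy run the last two keywords are never examined, which is the advantage the abstract describes.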

A Distance Measure for Automatic Sequential Document Classification.

Kar, B. Gautam; White, Lee J. – 1975

The feasibility of using a distance measure, called the Bayesian distance, for automatic sequential document classification was studied. Results indicate that, by observing the variation of this distance measure as keywords are extracted sequentially from a document, the occurrence of noisy keywords may be detected. This property of the distance…

Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification

Generating and Evaluating Domain-Oriented Multi-Word Terms from Texts.

Peer reviewed

Damerau, Fred J. – Information Processing and Management, 1993

Examines the use of various statistical techniques for generating domain-oriented multiword vocabulary terms for natural language database systems. Conclusions show the vocabulary clustering effect should be considered when making significance calculations and that a simple ratio of subject matter relative frequency to total sample relative…

Descriptors: Automatic Indexing, Cluster Analysis, Comparative Analysis, Database Design
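
The "simple ratio" mentioned in the abstract above can be sketched as a term's relative frequency in the subject-matter sample divided by its relative frequency in the total sample. The abstract is cut off, so reading the denominator as the term's relative frequency in the total sample is an assumption, as are the function name `relative_frequency_ratio`, the toy data, and the choice to skip terms that never occur in the total sample.

```python
from collections import Counter
from typing import Dict, Sequence


def relative_frequency_ratio(
    domain_terms: Sequence[str], total_terms: Sequence[str]
) -> Dict[str, float]:
    """Score each domain term by (relative freq. in domain) / (relative freq. overall)."""
    domain_counts = Counter(domain_terms)
    total_counts = Counter(total_terms)
    n_domain = len(domain_terms)
    n_total = len(total_terms)
    ratios: Dict[str, float] = {}
    for term, count in domain_counts.items():
        if total_counts[term] == 0:
            continue  # assumed handling: no overall frequency, so no ratio
        domain_rf = count / n_domain
        total_rf = total_counts[term] / n_total
        ratios[term] = domain_rf / total_rf
    return ratios


# Hypothetical multi-word terms; the total sample includes the domain sample.
domain = ["interest rate", "interest rate", "the bank", "stock price"]
total = domain + ["the bank"] * 3 + ["the weather"] * 3
scores = relative_frequency_ratio(domain, total)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {score:.3f}")
```

Ratios above 1 mark terms that are relatively more frequent in the subject-matter sample than overall, which is what makes them candidates for a domain-oriented vocabulary.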