Harter, Stephen P. – Journal of the American Society for Information Science, 1975
Confirms previously published research in concluding that specialty words tend to possess frequency distributions which cannot be described by a single Poisson distribution. (Author/PF)
Descriptors: Automatic Indexing, Indexing, Keywords, Mathematical Models

Harter, Stephen P. – Journal of the American Society for Information Science, 1975
A probabilistic model of keyword indexing is outlined, and some of the consequences of the model are examined. An algorithm defining a measure of indexability is developed--a measure intended to reflect the relative significance of words in documents. (Author)
Descriptors: Algorithms, Automatic Indexing, Indexing, Mathematical Models

Bookstein, Abraham; Swanson, Don R. – Journal of the American Society for Information Science, 1974
Descriptors: Automatic Indexing, Cluster Grouping, Indexes, Information Retrieval

Salton, G.; And Others – Journal of the American Society for Information Science, 1975
A new technique, known as discrimination value analysis, ranks the text words in accordance with how well they are able to discriminate the documents of a collection from each other. (Author/PF)
Descriptors: Automatic Indexing, Databases, Discriminant Analysis, Information Processing

Robertson, S. E.; Harding, P. – Journal of Documentation, 1984
Presents an adaptation of a probabilistic theoretical model, previously used in relevance feedback, for use in automatic indexing of documents (in the sense of imitating human indexers). Methods for model application are proposed, independence assumptions used in the model are interpreted, and the probability of a dependence model is discussed.…
Descriptors: Automatic Indexing, Classification, Information Retrieval, Mathematical Models

Yu, C. T.; Salton, G. – 1975
Formal proofs are given of the effectiveness under well-defined conditions of the thesaurus method in information retrieval. It is shown, in particular, that when certain semantically related terms are added to the information queries originally submitted by the user population, a superior retrieval system is obtained in the sense that for every…
Descriptors: Automatic Indexing, Information Retrieval, Information Storage, Mathematical Models

Crouch, Carolyn J. – Information Processing and Management, 1988
Describes the two basic approaches to the calculation of term discrimination values for automatic indexing. The results of an experiment that investigated the differences between algorithms of these two approaches in terms of their impact on the discrimination value model are reported and discussed. (13 references) (Author/CLB)
Descriptors: Algorithms, Automatic Indexing, Comparative Analysis, Computational Linguistics

Biru, Tesfaye; And Others – Journal of Documentation, 1989
Discusses the effect of including relevance data on the calculation of term discrimination values in bibliographic databases. Algorithms that calculate the ability of index terms to discriminate between relevant and non-relevant documents are described and tested. The results are discussed in terms of the relationship between term frequency and…
Descriptors: Algorithms, Automatic Indexing, Bibliographic Databases, Mathematical Models

Fuhr, Norbert – Information Processing and Management, 1989
Describes three models for probabilistic indexing, all based on the Darmstadt automatic indexing approach, and presents experimental evaluation results for each. The discussion covers the improved retrieval effectiveness of probabilistic indexing over binary indexing, and suggestions for using this automatic indexing method with free text terms.…
Descriptors: Automatic Indexing, Comparative Analysis, Information Retrieval, Mathematical Formulas

Bruandet, Marie-France – Information Processing and Management, 1989
Outlines an approach to the automatic construction of a knowledge base resulting in a system that is able in each phase of its construction to acquire domain knowledge from all new information that it is building, particularly index terms. Topics covered include production rules, the use of semantic networks, and user interfaces. (32 references)…
Descriptors: Automatic Indexing, Classification, Expert Systems, Information Retrieval

Deerwester, Scott; And Others – Journal of the American Society for Information Science, 1990
Describes a new method for automatic indexing and retrieval called latent semantic indexing (LSI). Problems with matching query words with document words in term-based information retrieval systems are discussed, semantic structure is examined, singular value decomposition (SVD) is explained, and the mathematics underlying the SVD model is…
Descriptors: Automatic Indexing, Documentation, Factor Analysis, Information Retrieval

White, Lee J.; And Others – 1975
The major advantage of sequential classification, a technique for automatically classifying documents into previously selected categories, is that the entire document need not be processed before it is classified. This method assumes the availability of a priori categories, a selection of keywords representative of these categories, and the a…
Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification

Salton, Gerard; Buckley, Christopher – Information Processing and Management, 1988
Summarizes the experimental evidence that indicates that text indexing systems based on the assignment of appropriately weighted single terms produce retrieval results superior to those obtained with more elaborate text representations, and provides baseline single term indexing models with which more elaborate content analysis procedures can be…
Descriptors: Automatic Indexing, Comparative Analysis, Content Analysis, Information Retrieval

Kar, B. Gautam; White, Lee J. – 1975
The feasibility of using a distance measure, called the Bayesian distance, for automatic sequential document classification was studied. Results indicate that, by observing the variation of this distance measure as keywords are extracted sequentially from a document, the occurrence of noisy keywords may be detected. This property of the distance…
Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification