ERIC - Search Results

Descriptor

Automatic Indexing	14
Mathematical Models	14
Information Retrieval	8
Relevance (Information…	6
Subject Index Terms	6
Algorithms	5
Probability	5
Classification	4
Statistical Analysis	4
Cluster Grouping	3
Comparative Analysis	3
Databases	3
Documentation	3
Tables (Data)	3
Bayesian Statistics	2
Discriminant Analysis	2
Indexing	2
Information Storage	2
Semantics	2
Sequential Approach	2
Statistical Distributions	2
Bibliographic Databases	1
Computational Linguistics	1
Content Analysis	1
Expert Systems	1

Source

Journal of the American…	5
Information Processing and…	4
Journal of Documentation	2

Author

Harter, Stephen P.	2
Salton, G.	2
White, Lee J.	2
Biru, Tesfaye	1
Bookstein, Abraham	1
Bruandet, Marie-France	1
Buckley, Christopher	1
Crouch, Carolyn J.	1
Deerwester, Scott	1
Fuhr, Norbert	1
Harding, P.	1
Kar, B. Gautam	1
Robertson, S. E.	1
Salton, Gerard	1
Swanson, Don R.	1
Yu, C. T.	1

Publication Type

Reports - Research	10
Journal Articles	7
Opinion Papers	1

Audience

Researchers

Showing all 14 results

A Probabilistic Approach to Automatic Keyword Indexing

Peer reviewed

Harter, Stephen P. – Journal of the American Society for Information Science, 1975

Confirms previously published research in concluding that specialty words tend to possess frequency distributions which cannot be described by a single Poisson distribution. (Author/PF)

Descriptors: Automatic Indexing, Indexing, Keywords, Mathematical Models

A Probabilistic Approach to Automatic Keyword Indexing: Part II, An Algorithm for Probabilistic Indexing

Peer reviewed

Harter, Stephen P. – Journal of the American Society for Information Science, 1975

A probabilistic model of keyword indexing is outlined, and some of the consequences of the model are examined. An algorithm defining a measure of indexability is developed--a measure intended to reflect the relative significance of words in documents. (Author)

Descriptors: Algorithms, Automatic Indexing, Indexing, Mathematical Models

Probabilistic Models for Automatic Indexing

Peer reviewed

Bookstein, Abraham; Swanson, Don R. – Journal of the American Society for Information Science, 1974

Descriptors: Automatic Indexing, Cluster Grouping, Indexes, Information Retrieval

A Theory of Term Importance in Automatic Text Analysis

Peer reviewed

Salton, G.; And Others – Journal of the American Society for Information Science, 1975

A new technique, known as discrimination value analysis, ranks the text words in accordance with how well they are able to discriminate the documents of a collection from each other. (Author/PF)

Descriptors: Automatic Indexing, Databases, Discriminant Analysis, Information Processing
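
The abstract above gives no formula, so the following is only a minimal sketch of one commonly described formulation of discrimination value analysis: a term's value is the change in average pairwise document similarity when that term is removed from every document vector. The cosine similarity measure, the function names, and the toy collection are all assumptions made for illustration, not details taken from the article.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length weight vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def density(docs):
    """Average similarity over all document pairs (assumed density measure)."""
    pairs = list(combinations(docs, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def discrimination_values(docs):
    """For each term k: density without term k minus density with all terms.

    A positive value means removing the term makes documents look more alike,
    so the term helps tell documents apart; a negative value means the opposite.
    """
    base = density(docs)
    n_terms = len(docs[0])
    values = []
    for k in range(n_terms):
        reduced = [[w for j, w in enumerate(doc) if j != k] for doc in docs]
        values.append(density(reduced) - base)
    return values


# Hypothetical collection: term 0 occurs in every document,
# terms 1 to 3 each occur in exactly one document.
docs = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]
dv = discrimination_values(docs)
ranking = sorted(range(len(dv)), key=lambda k: dv[k], reverse=True)
```

In this toy collection the term shared by every document receives a negative value and is ranked last, while the terms that occur in a single document receive positive values.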

Probabilistic Automatic Indexing by Learning from Human Indexers.

Peer reviewed

Robertson, S. E.; Harding, P. – Journal of Documentation, 1984

Presents an adaptation of a probabilistic theoretical model, previously used in relevance feedback, for use in automatic indexing of documents (in the sense of imitating human indexers). Methods for model application are proposed, independence assumptions used in the model are interpreted, and the probability of a dependence model is discussed.…

Descriptors: Automatic Indexing, Classification, Information Retrieval, Mathematical Models

The Effectiveness of the Thesaurus Method in Automatic Information Retrieval. Technical Report No. 75-261.

Yu, C. T.; Salton, G. – 1975

Formal proofs are given of the effectiveness under well-defined conditions of the thesaurus method in information retrieval. It is shown, in particular, that when certain semantically related terms are added to the information queries originally submitted by the user population, a superior retrieval system is obtained in the sense that for every…

Descriptors: Automatic Indexing, Information Retrieval, Information Storage, Mathematical Models

An Analysis of Approximate versus Exact Discrimination Values.

Peer reviewed

Crouch, Carolyn J. – Information Processing and Management, 1988

Describes the two basic approaches to the calculation of term discrimination values for automatic indexing. The results of an experiment that investigated the differences between algorithms of these two approaches in terms of their impact on the discrimination value model are reported and discussed. (13 references) (Author/CLB)

Descriptors: Algorithms, Automatic Indexing, Comparative Analysis, Computational Linguistics

Inclusion of Relevance Information in the Term Discrimination Model.

Peer reviewed

Biru, Tesfaye; And Others – Journal of Documentation, 1989

Discusses the effect of including relevance data on the calculation of term discrimination values in bibliographic databases. Algorithms that calculate the ability of index terms to discriminate between relevant and non-relevant documents are described and tested. The results are discussed in terms of the relationship between term frequency and…

Descriptors: Algorithms, Automatic Indexing, Bibliographic Databases, Mathematical Models

Models for Retrieval with Probabilistic Indexing.

Peer reviewed

Fuhr, Norbert – Information Processing and Management, 1989

Describes three models for probabilistic indexing, all based on the Darmstadt automatic indexing approach, and presents experimental evaluation results for each. The discussion covers the improved retrieval effectiveness of probabilistic indexing over binary indexing, and suggestions for using this automatic indexing method with free text terms.…

Descriptors: Automatic Indexing, Comparative Analysis, Information Retrieval, Mathematical Formulas

Outline of a Knowledge-Base Model for an Intelligent Information Retrieval System.

Peer reviewed

Bruandet, Marie-France – Information Processing and Management, 1989

Outlines an approach to the automatic construction of a knowledge base resulting in a system that is able in each phase of its construction to acquire domain knowledge from all new information that it is building, particularly index terms. Topics covered include production rules, the use of semantic networks, and user interfaces. (32 references)…

Descriptors: Automatic Indexing, Classification, Expert Systems, Information Retrieval

Indexing by Latent Semantic Analysis.

Peer reviewed

Deerwester, Scott; And Others – Journal of the American Society for Information Science, 1990

Describes a new method for automatic indexing and retrieval called latent semantic indexing (LSI). Problems with matching query words with document words in term-based information retrieval systems are discussed, semantic structure is examined, singular value decomposition (SVD) is explained, and the mathematics underlying the SVD model is…

Descriptors: Automatic Indexing, Documentation, Factor Analysis, Information Retrieval

A Sequential Method for Automatic Document Classification.

PDF pending restoration

White, Lee J.; And Others – 1975

The major advantage of sequential classification, a technique for automatically classifying documents into previously selected categories, is that the entire document need not be processed before it is classified. This method assumes the availability of a priori categories, a selection of keywords representative of these categories, and the a…

Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification
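
The idea that a document can be classified before it has been fully processed can be illustrated with a small sketch. It is not the report's algorithm: it assumes a naive Bayes style update with independent keywords, a fixed posterior threshold as the stopping rule, and hypothetical category priors and per-category keyword probabilities. Every name and number below is invented for illustration.

```python
def classify_sequentially(keywords, priors, likelihoods, threshold=0.95, floor=1e-6):
    """Update category posteriors one keyword at a time and stop early.

    keywords    : list of keywords in the order they are extracted
    priors      : {category: prior probability} (assumed to be given)
    likelihoods : {category: {keyword: P(keyword | category)}} (assumed to be given)
    threshold   : stop once one category's posterior reaches this value
    floor       : small probability used for keywords unseen in a category
    Returns (best category, number of keywords read, final posteriors).
    """
    posterior = dict(priors)
    for n_read, word in enumerate(keywords, start=1):
        # Bayes update under the keyword-independence assumption.
        for cat in posterior:
            posterior[cat] *= likelihoods[cat].get(word, floor)
        total = sum(posterior.values())
        posterior = {cat: p / total for cat, p in posterior.items()}
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            return best, n_read, posterior
    # Keywords exhausted without reaching the threshold: return the best guess.
    best = max(posterior, key=posterior.get)
    return best, len(keywords), posterior


# Hypothetical two-category setup.
priors = {"physics": 0.5, "biology": 0.5}
likelihoods = {
    "physics": {"quantum": 0.30, "cell": 0.01, "energy": 0.20},
    "biology": {"quantum": 0.01, "cell": 0.30, "energy": 0.10},
}
keywords = ["energy", "quantum", "cell", "cell"]
label, n_read, posterior = classify_sequentially(keywords, priors, likelihoods)
```

With these invented numbers the threshold is reached after the second keyword, so the last two keywords are never examined.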

Term-Weighting Approaches in Automatic Text Retrieval.

Peer reviewed

Salton, Gerard; Buckley, Christopher – Information Processing and Management, 1988

Summarizes the experimental evidence that indicates that text indexing systems based on the assignment of appropriately weighted single terms produce retrieval results superior to those obtained with more elaborate text representations, and provides baseline single term indexing models with which more elaborate content analysis procedures can be…

Descriptors: Automatic Indexing, Comparative Analysis, Content Analysis, Information Retrieval
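
The abstract does not state which weighting schemes were compared. As a hedged illustration of what "appropriately weighted single terms" can mean, the sketch below uses one widely known scheme, term frequency times inverse document frequency with cosine normalization. The exact formula, the function name, and the toy documents are assumptions, not details drawn from the article.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    weight = tf * log(N / df), then each vector is scaled to unit length.
    docs is a list of token lists; N is the number of documents and
    df the number of documents that contain the term.
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        weights = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: (w / norm if norm else 0.0) for t, w in weights.items()})
    return vectors


# Hypothetical tokenized documents.
docs = [
    ["index", "term", "term", "weight"],
    ["index", "retrieval"],
    ["index", "term", "query"],
]
vectors = tfidf_vectors(docs)
```

Under this scheme a term that occurs in every document ("index" here) gets weight zero, while a term confined to one document is weighted most heavily.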

A Distance Measure for Automatic Sequential Document Classification.

Kar, B. Gautam; White, Lee J. – 1975

The feasibility of using a distance measure, called the Bayesian distance, for automatic sequential document classification was studied. Results indicate that, by observing the variation of this distance measure as keywords are extracted sequentially from a document, the occurrence of noisy keywords may be detected. This property of the distance…

Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification