Nelson, Michael J. – Journal of Documentation, 1989
Presents a probability model of the occurrence of index terms used to derive discrete distributions which are mixtures of Poisson and negative binomial distributions. These distributions give better fits than the simpler Zipf distribution, have the advantage of being more explanatory, and can incorporate a time parameter if necessary. (25…
Descriptors: Goodness of Fit, Mathematical Models, Probability, Statistical Distributions

Wolfram, Dietmar – Journal of the American Society for Information Science, 1996
Explores inter-record linkage relationships of a bibliographic hypertext system through the use of descriptor term co-occurrences. Using term distribution and term exhaustivity data for an existing system, three models of term co-occurrence are developed and tested against the observed data. (Author/LRW)
Descriptors: Bibliographic Records, Hypermedia, Information Retrieval, Models

Nelson, Michael J. – Information Processing and Management, 1988
Describes a study that collected statistics on the occurrence of terms used in searching an online catalog and investigated the correlation of the selection of terms with their occurrence in indexing. The discussion of the results includes implications for indexing theories. (19 references) (CLB)
Descriptors: Correlation, Indexing, Online Catalogs, Online Searching

Nelson, Michael J.; Tague, Jean M. – Journal of the American Society for Information Science, 1985
Proposes a split model for index term distribution in a document set that uses a rank function for high-frequency terms and a size function for low-frequency terms; the point of transition is determined either empirically or by rule. Distributions to describe index term exhaustivity and term co-occurrence are considered briefly. (36 references) (EJS)
Descriptors: Databases, Indexing, Information Retrieval, Models

Todeschini, Claudio; Farrell, Michael P. – Journal of the American Society for Information Science, 1989
Describes an expert system that can identify documents erroneously categorized by indexers working within a prior category scheme. The discussion covers the results of a test of the system and its implications for automatic indexing systems. (14 references) (CLB)
Descriptors: Bibliographic Databases, Expert Systems, Indexing, Mathematical Formulas

Vizine-Goetz, Diane; Markey, Karen – Information Technology and Libraries, 1989
Details the characteristics of authority records in the machine-readable Library of Congress Subject Headings. Statistics are given on occurrences of variable-length fields and the length and maximum length of such fields in subject heading records. The usefulness of these statistics for system design is discussed. (5 references) (CLB)
Descriptors: Bibliographic Records, Machine Readable Cataloging, Statistical Distributions, Subject Index Terms

Biru, Tesfaye; And Others – Journal of Documentation, 1989
Discusses the effect of including relevance data on the calculation of term discrimination values in bibliographic databases. Algorithms that calculate the ability of index terms to discriminate between relevant and non-relevant documents are described and tested. The results are discussed in terms of the relationship between term frequency and…
Descriptors: Algorithms, Automatic Indexing, Bibliographic Databases, Mathematical Models

Wisniewski, Janusz L. – Information Processing and Management, 1986
Discussion of a new method of index term dictionary compression in an inverted-file-oriented database highlights a technique of word coding, which generates short fixed-length codes obtained from the index terms themselves by analysis of monogram and bigram statistical distributions. Substantial savings in communication channel utilization are…
Descriptors: Algorithms, Database Management Systems, Databases, Information Retrieval

Mahapatra, M.; Biswas, S. C. – International Library Review, 1984
Describes research which measured the efficiency of role operators through frequency of appearances in PRECIS input strings for 200 abstracts related to taxation, genetic psychology, and Shakespearian drama. Frequencies of appearance of major categories of role operators, role operators in different subjects, individual main line operators, and…
Descriptors: Comparative Analysis, Indexing, Information Retrieval, Relevance (Information Retrieval)

Jansen, Bernard J.; Spink, Amanda; Saracevic, Tefko – Information Processing & Management, 2000
Describes a study that analyzed transaction logs of users of Excite, an Internet search service. Highlights include data on sessions (changes in queries, number of pages viewed, and relevance feedback), queries (number of search terms and the use of logic and modifiers), and terms (rank/frequency distribution); user characteristics; and failure…
Descriptors: Information Retrieval, Relevance (Information Retrieval), Search Strategies, Statistical Distributions

Fedorowicz, Jane – Journal of the American Society for Information Science, 1982
Derives the underlying structure of the Zipf distribution, with emphasis on its application to word frequencies in the inverted files of automatic bibliographic systems, and applies the Zipfian model to the National Library of Medicine's MEDLINE database. An appendix on the Zipfian mean and 12 references are included. (Author/JL)
Descriptors: Citations (References), Databases, Information Retrieval, Mathematical Models

Hood, William; Wilson, Concepcion S. – Information Processing and Management, 1994
Summarizes the findings of a recent study on the indexing practices used in the Library and Information Science Abstracts (LISA) database. Descriptors, or indexing terms, from the thesaurus are analyzed; searching implications are discussed; and the relationship between the classification code and the descriptors is examined. (Contains 21…
Descriptors: Bibliographic Databases, Classification, Indexing, Library Science

Willett, Peter – Information Processing and Management, 1985
Reports an algorithm for calculating term discrimination values that is fast enough in operation to permit the use of exact values. Evidence is presented to show that the relationship between term discrimination and term frequency depends crucially on the type of inter-document similarity measure used to calculate discrimination values. (13…
Descriptors: Algorithms, Graphs, Information Retrieval, Information Systems

Salton, Gerard; Buckley, Christopher – Information Processing and Management, 1988
Summarizes the experimental evidence that indicates that text indexing systems based on the assignment of appropriately weighted single terms produce retrieval results superior to those obtained with more elaborate text representations, and provides baseline single term indexing models with which more elaborate content analysis procedures can be…
Descriptors: Automatic Indexing, Comparative Analysis, Content Analysis, Information Retrieval

Smith, F. J.; Devine, K. – Information Processing and Management, 1985
Zipfian laws for the frequency distributions of word pairs and longer phrases are derived from an analysis of text samples. From the crossing of the Zipfian curves, it is deduced that the number of multi-word phrases that occur frequently in text is surprisingly small, of the same order of magnitude as the number of individual word-types. (8 references) (EJS)
Descriptors: Algorithms, Graphs, Indexing, Information Retrieval