lexicon.Rd
The lexicon of a corpus consists of all the terms that occur in any document in the corpus. The lexical frequency of a term tells us how often that term occurs across all of the documents. Often the most interesting words in a document are those whose frequency within that document is higher than their frequency in the corpus as a whole.
lexicon(corpus)

update_lexicon(corpus)

# S3 method for corpus
lexicon(corpus)

# S3 method for corpus
update_lexicon(corpus)
corpus | A corpus, as returned by |
---|
# NOT RUN {
init_textanalysis()

# build document
doc1 <- string_document("First document.")
doc2 <- string_document("Second document.")

# do not automatically update
corpus <- corpus(doc1, doc2, update_lexicon = FALSE)

update_lexicon(corpus)
lexicon(corpus)
# }
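To make the idea of lexical frequency concrete, the following is a minimal base R sketch that needs no Julia session. It is not how the package computes its lexicon: the `tokenize()` and `relative_frequency()` helpers are hypothetical names introduced here, and the tokenization rule (lowercase, strip punctuation, split on whitespace) is an assumption made only for illustration.

```r
# Hypothetical toy corpus: plain character strings, not package documents.
docs <- c("First document.", "Second document.")

# Naive tokenizer (an assumption, not the package's tokenizer):
# lowercase, strip punctuation, split on whitespace.
tokenize <- function(text) {
  cleaned <- gsub("[[:punct:]]", "", tolower(text))
  tokens <- strsplit(cleaned, "[[:space:]]+")[[1]]
  tokens[nzchar(tokens)]
}

tokens_by_doc <- lapply(docs, tokenize)
all_tokens <- unlist(tokens_by_doc)

# Lexicon: every term that occurs in any document, with its total count.
lexicon_counts <- table(all_tokens)

# Relative frequency of a term within a vector of tokens.
relative_frequency <- function(term, tokens) {
  sum(tokens == term) / length(tokens)
}

# Compare a term's frequency in one document with its corpus-wide frequency.
doc_freq <- relative_frequency("first", tokens_by_doc[[1]])
corpus_freq <- relative_frequency("first", all_tokens)
```

Here `doc_freq` exceeds `corpus_freq` for "first", which is the sense in which a term can be more characteristic of one document than of the corpus as a whole.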