How does the maxDocFreq argument of create_matrix (RTextTools) work? I tried to eliminate recurring words in bill titles (act, bill, amend, etc.) by setting a high value for maxDocFreq, but the resulting DocumentTermMatrix seems unaffected.
# LOAD THE PACKAGE, SET THE SEED AND LOAD THE DATA
library(RTextTools)
set.seed(95616)
data(USCongress)
# CREATE THE DOCUMENT-TERM MATRIX WITHOUT maxDocFreq
doc_matrix <- create_matrix(USCongress$text, language="english", removeNumbers=TRUE, stemWords=TRUE, removeSparseTerms=.998)
doc_matrix
# LIST THE TERMS THAT OCCUR AT LEAST 1 TIME AND AT LEAST 800 TIMES
tm::findFreqTerms(doc_matrix, 1)
tm::findFreqTerms(doc_matrix, 800)
# SAME CALL WITH maxDocFreq=800; I EXPECTED FEWER TERMS HERE
doc_matrix1 <- create_matrix(USCongress$text, maxDocFreq=800, language="english", removeNumbers=TRUE, stemWords=TRUE, removeSparseTerms=.998)
doc_matrix1
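
For reference, this is roughly the filtering I expected maxDocFreq to perform, done by hand with tm on a toy corpus. It is only a sketch of my expectation, not a claim about what create_matrix does internally; the titles vector and the max_docs cutoff are made up for illustration.

```r
library(tm)

# Hypothetical bill titles standing in for USCongress$text.
titles <- c("a bill to amend the tax act",
            "a bill to fund public schools",
            "a bill to amend the clean water act",
            "a resolution honoring veterans")
dtm <- DocumentTermMatrix(VCorpus(VectorSource(titles)))

# Document frequency: the number of documents each term appears in.
# as.matrix() is fine for a small matrix like this one.
doc_freq <- colSums(as.matrix(dtm) > 0)

# Keep only the terms that appear in at most max_docs documents.
max_docs <- 2  # illustrative cutoff, playing the role of maxDocFreq
keep <- unname(which(doc_freq <= max_docs))
dtm_filtered <- dtm[, keep]

# Compare the vocabulary before and after filtering.
Terms(dtm)
Terms(dtm_filtered)
```

With a cutoff like this, a term that shows up in most titles (such as "bill") should disappear from the filtered matrix, which is the effect I was hoping to get from maxDocFreq=800 on the full data.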