Package: tm 0.7-15
tm: Text Mining Package
A framework for text mining applications within R.
Authors:
tm_0.7-15.tar.gz
tm_0.7-15.zip (r-4.5), tm_0.7-15.zip (r-4.4), tm_0.7-15.zip (r-4.3)
tm_0.7-15.tgz (r-4.4-x86_64), tm_0.7-15.tgz (r-4.4-arm64), tm_0.7-15.tgz (r-4.3-x86_64), tm_0.7-15.tgz (r-4.3-arm64)
tm_0.7-15.tar.gz (r-4.5-noble), tm_0.7-15.tar.gz (r-4.4-noble)
tm_0.7-15.tgz (r-4.4-emscripten), tm_0.7-15.tgz (r-4.3-emscripten)
tm.pdf | tm.html
tm/json (API)
NEWS
# Install 'tm' in R:
install.packages('tm', repos = c('https://r-forge.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://r-forge.r-project.org/projects/tm
Last updated 3 days ago from: c24c6324a0. Checks: OK: 2, ERROR: 7. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Nov 18 2024 |
R-4.5-win-x86_64 | ERROR | Nov 18 2024 |
R-4.5-linux-x86_64 | OK | Nov 18 2024 |
R-4.4-win-x86_64 | ERROR | Nov 18 2024 |
R-4.4-mac-x86_64 | ERROR | Nov 18 2024 |
R-4.4-mac-aarch64 | ERROR | Nov 18 2024 |
R-4.3-win-x86_64 | ERROR | Nov 18 2024 |
R-4.3-mac-x86_64 | ERROR | Nov 18 2024 |
R-4.3-mac-aarch64 | ERROR | Nov 18 2024 |
Exports: as.DocumentTermMatrix as.TermDocumentMatrix as.VCorpus Boost_tokenizer content_transformer Corpus DataframeSource DirSource Docs DocumentTermMatrix DublinCore DublinCore<- eoi findAssocs findFreqTerms findMostFreqTerms FunctionGenerator getElem getMeta getReaders getSources getTokenizers getTransformations Heaps_plot inspect MC_tokenizer nDocs nTerms PCorpus pGetElem PlainTextDocument read_dtm_Blei_et_al read_dtm_MC readDataframe readDOC reader readPDF readPlain readRCV1 readRCV1asPlain readReut21578XML readReut21578XMLasPlain readTagged readXML removeNumbers removePunctuation removeSparseTerms removeWords scan_tokenizer SimpleCorpus SimpleSource stemCompletion stemDocument stepNext stopwords stripWhitespace TermDocumentMatrix termFreq Terms tm_filter tm_index tm_map tm_parLapply tm_parLapply_engine tm_reduce tm_term_score URISource VCorpus VectorSource weightBin WeightFunction weightSMART weightTf weightTfIdf writeCorpus XMLSource XMLTextDocument Zipf_plot ZipSource
Readme and manuals
Help Manual
Help page | Topics |
---|---|
50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq | acq |
Content Transformers | content_transformer |
Corpora | Corpus |
20 Exemplary News Articles from the Reuters-21578 Data Set of Topic crude | crude |
Data Frame Source | DataframeSource |
Directory Source | DirSource |
Access Document IDs and Terms | Docs nDocs nTerms Terms |
Find Associations in a Term-Document Matrix | findAssocs findAssocs.DocumentTermMatrix findAssocs.TermDocumentMatrix |
Find Frequent Terms | findFreqTerms |
Find Most Frequent Terms | findMostFreqTerms findMostFreqTerms.DocumentTermMatrix findMostFreqTerms.TermDocumentMatrix findMostFreqTerms.term_frequency |
Read Document-Term Matrices | read_dtm_Blei_et_al read_dtm_MC |
Tokenizers | getTokenizers |
Transformations | getTransformations |
Parallelized 'lapply' | tm_parLapply tm_parLapply_engine |
Inspect Objects | inspect inspect.PCorpus inspect.TermDocumentMatrix inspect.TextDocument inspect.VCorpus |
Metadata Management | DublinCore DublinCore<- meta meta.PCorpus meta.PlainTextDocument meta.SimpleCorpus meta.VCorpus meta.XMLTextDocument meta<-.PCorpus meta<-.PlainTextDocument meta<-.SimpleCorpus meta<-.VCorpus meta<-.XMLTextDocument |
Permanent Corpora | PCorpus |
Plain Text Documents | PlainTextDocument |
Visualize a Term-Document Matrix | plot.TermDocumentMatrix |
Read In a Text Document from a Data Frame | readDataframe |
Read In a MS Word Document | readDOC |
Readers | FunctionGenerator getReaders Reader |
Read In a PDF Document | readPDF |
Read In a Text Document | readPlain |
Read In a Reuters Corpus Volume 1 Document | readRCV1 readRCV1asPlain |
Read In a Reuters-21578 XML Document | readReut21578XML readReut21578XMLasPlain |
Read In a POS-Tagged Word Text Document | readTagged |
Read In an XML Document | readXML |
Remove Numbers from a Text Document | removeNumbers removeNumbers.character removeNumbers.PlainTextDocument |
Remove Punctuation Marks from a Text Document | removePunctuation removePunctuation.character removePunctuation.PlainTextDocument |
Remove Sparse Terms from a Term-Document Matrix | removeSparseTerms |
Remove Words from a Text Document | removeWords removeWords.character removeWords.PlainTextDocument |
Simple Corpora | SimpleCorpus |
Sources | close.SimpleSource eoi eoi.SimpleSource getElem getElem.DataframeSource getElem.DirSource getElem.URISource getElem.VectorSource getElem.XMLSource getMeta getMeta.DataframeSource getSources length.SimpleSource open.SimpleSource pGetElem pGetElem.DataframeSource pGetElem.DirSource pGetElem.URISource pGetElem.VectorSource reader reader.SimpleSource SimpleSource Source stepNext stepNext.SimpleSource |
Complete Stems | stemCompletion |
Stem Words | stemDocument stemDocument.character stemDocument.PlainTextDocument |
Stopwords | stopwords |
Strip Whitespace from a Text Document | stripWhitespace stripWhitespace.PlainTextDocument |
Term-Document Matrix | as.DocumentTermMatrix as.TermDocumentMatrix DocumentTermMatrix TermDocumentMatrix |
Term Frequency Vector | termFreq |
Text Documents | TextDocument |
Combine Corpora, Documents, Term-Document Matrices, and Term Frequency Vectors | c.TermDocumentMatrix c.term_frequency c.TextDocument c.VCorpus |
Filter and Index Functions on Corpora | tm_filter tm_filter.PCorpus tm_filter.SimpleCorpus tm_filter.VCorpus tm_index tm_index.PCorpus tm_index.SimpleCorpus tm_index.VCorpus |
Transformations on Corpora | tm_map tm_map.PCorpus tm_map.SimpleCorpus tm_map.VCorpus |
Combine Transformations | tm_reduce |
Compute Score for Matching Terms | tm_term_score tm_term_score.DocumentTermMatrix tm_term_score.PlainTextDocument tm_term_score.TermDocumentMatrix tm_term_score.term_frequency |
Tokenizers | Boost_tokenizer MC_tokenizer scan_tokenizer |
Uniform Resource Identifier Source | URISource |
Volatile Corpora | as.VCorpus VCorpus |
Vector Source | VectorSource |
Weight Binary | weightBin |
Weighting Function | WeightFunction |
SMART Weightings | weightSMART |
Weight by Term Frequency | weightTf |
Weight by Term Frequency - Inverse Document Frequency | weightTfIdf |
Write a Corpus to Disk | writeCorpus |
XML Source | XMLSource |
XML Text Documents | XMLTextDocument |
Explore Corpus Term Frequency Characteristics | Heaps_plot Zipf_plot |
ZIP File Source | ZipSource |