Fast, memory-friendly tools for text vectorization and state-of-the-art word embeddings (GloVe). The package provides a source-agnostic streaming API that lets researchers analyze document collections much larger than available RAM. All core functions are parallelized to take advantage of multicore machines.
Version: | 0.3.0 |
Depends: | R (≥ 3.2.0), methods |
Imports: | Matrix (≥ 1.1), Rcpp (≥ 0.11), RcppParallel (≥ 4.3.14), digest (≥ 0.6.8), iterators (≥ 1.0.8), foreach (≥ 1.4.3), data.table (≥ 1.9.6), stringr (≥ 1.0.0), magrittr (≥ 1.5) |
LinkingTo: | Rcpp, RcppParallel, digest |
Suggests: | testthat, knitr, rmarkdown, glmnet, parallel |
Published: | 2016-03-31 |
Author: | Dmitriy Selivanov [aut, cre], Lincoln Mullen [ctb] |
Maintainer: | Dmitriy Selivanov <selivanov.dmitriy at gmail.com> |
BugReports: | https://github.com/dselivanov/text2vec/issues |
License: | MIT + file LICENSE |
URL: | https://github.com/dselivanov/text2vec |
NeedsCompilation: | yes |
SystemRequirements: | GNU make, C++11 |
Materials: | README |
CRAN checks: | text2vec results |
Reference manual: | text2vec.pdf |
Vignettes: | Advanced topics; GloVe Word Embeddings; Analyzing Texts with the text2vec Package |
Package source: | text2vec_0.3.0.tar.gz |
Windows binaries: | r-devel: text2vec_0.3.0.zip, r-release: text2vec_0.3.0.zip, r-oldrel: text2vec_0.3.0.zip |
OS X Mavericks binaries: | r-release: text2vec_0.3.0.tgz, r-oldrel: text2vec_0.3.0.tgz |
Old sources: | text2vec archive |
Reverse imports: | textmineR |