text2vec: Fast Text Mining Framework for Vectorization and Word Embeddings

Fast, memory-friendly tools for text vectorization and state-of-the-art word embeddings (GloVe). The package provides a source-agnostic streaming API that lets researchers analyze document collections much larger than available RAM. All core functions are parallelized to take advantage of multicore machines.
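
The streaming idea in the description can be illustrated with a small, language-neutral sketch. The Python below is not text2vec's API: the helper names stream_documents and hash_vectorize, the whitespace tokenizer, and the bucket count are all assumptions made for illustration. It shows only how documents pulled one at a time from any iterable source can be turned into sparse count rows without holding the whole corpus in memory.

```python
import zlib
from collections import Counter
from typing import Dict, Iterable, Iterator


def stream_documents(lines: Iterable[str]) -> Iterator[str]:
    """Yield one non-empty document at a time.

    `lines` can be any iterable source (a list, an open file handle,
    a database cursor), so the corpus never has to fit in memory.
    """
    for line in lines:
        text = line.strip()
        if text:
            yield text


def hash_vectorize(docs: Iterable[str], n_buckets: int = 2 ** 18) -> Iterator[Dict[int, int]]:
    """Map each document to a sparse {bucket: count} row (hashing trick).

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    for doc in docs:
        counts: Counter = Counter()
        for token in doc.lower().split():
            # crc32 is stable across runs, unlike Python's built-in hash()
            counts[zlib.crc32(token.encode("utf-8")) % n_buckets] += 1
        yield dict(counts)


# Toy in-memory source; an open file object would work the same way.
corpus = ["The cat sat", "", "the dog sat on the cat"]
rows = list(hash_vectorize(stream_documents(corpus), n_buckets=1024))
```

Because both functions are generators, each document is read, tokenized, and discarded before the next one is loaded; only the sparse rows that the caller chooses to keep stay in memory.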

Version: 0.3.0
Depends: R (≥ 3.2.0), methods
Imports: Matrix (≥ 1.1), Rcpp (≥ 0.11), RcppParallel (≥ 4.3.14), digest (≥ 0.6.8), iterators (≥ 1.0.8), foreach (≥ 1.4.3), data.table (≥ 1.9.6), stringr (≥ 1.0.0), magrittr (≥ 1.5)
LinkingTo: Rcpp, RcppParallel, digest
Suggests: testthat, knitr, rmarkdown, glmnet, parallel
Published: 2016-03-31
Author: Dmitriy Selivanov [aut, cre], Lincoln Mullen [ctb]
Maintainer: Dmitriy Selivanov <selivanov.dmitriy at gmail.com>
BugReports: https://github.com/dselivanov/text2vec/issues
License: MIT + file LICENSE
URL: https://github.com/dselivanov/text2vec
NeedsCompilation: yes
SystemRequirements: GNU make, C++11
Materials: README
CRAN checks: text2vec results

Downloads:

Reference manual: text2vec.pdf
Vignettes: Advanced topics, GloVe Word Embeddings, Analyzing Texts with the text2vec Package
Package source: text2vec_0.3.0.tar.gz
Windows binaries: r-devel: text2vec_0.3.0.zip, r-release: text2vec_0.3.0.zip, r-oldrel: text2vec_0.3.0.zip
OS X Mavericks binaries: r-release: text2vec_0.3.0.tgz, r-oldrel: text2vec_0.3.0.tgz
Old sources: text2vec archive

Reverse dependencies:

Reverse imports: textmineR