R: word cloud from a dataframe
OK, so let's imagine we have a dataframe called text_dataframe with the raw text in a column named text.
Step 1 Clean the data
Let's load stop words which we will remove from our text.
install.packages("stopwords")
library("stopwords")
library(tidyverse)
library(tidytext)
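# stopwords("ru") returns a character vector, but anti_join() needs a data frame,
# so wrap the vector in a one-column tibble first
stop_words_ru <- tibble(word = stopwords("ru"))
freq_dataframe <- text_dataframe %>%
  unnest_tokens(word, text) %>%           # unnest_tokens - tokenizes the text column, one word per row
  anti_join(stop_words_ru, by = "word")   # anti_join - removes the stop words
# You can choose any supported language instead of Russian: "en", "es", "fr", ...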
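To see what Step 1 produces, here is a minimal sketch on toy data. The two sample sentences, the id column, and the name stop_words_en are made up for illustration, and English stop words are used so the result is easy to check by eye.

```r
library(dplyr)
library(tidytext)
library(stopwords)

# Hypothetical toy data: one row per document, raw text in the `text` column
text_dataframe <- tibble(
  id = 1:2,
  text = c("The cat sat on the mat", "A dog and a cat played in the garden")
)

# stopwords() returns a character vector, so wrap it in a one-column tibble
stop_words_en <- tibble(word = stopwords("en"))

freq_dataframe <- text_dataframe %>%
  unnest_tokens(word, text) %>%           # lowercases the text and gives one word per row
  anti_join(stop_words_en, by = "word")   # drops rows whose word is a stop word

print(freq_dataframe)
```

Words such as "the" and "and" should be gone, while "cat" and "dog" remain, each on its own row next to the id of the document it came from.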
Step 2 Count
count_dataframe <- freq_dataframe %>% count(word)
# word - the column of tokenized text; count() puts the frequencies in a new column n
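Before drawing anything, it can help to look at the counts. The sketch below uses a made-up token column in place of the real freq_dataframe and adds sort = TRUE, an optional argument of count() that puts the most frequent words first.

```r
library(dplyr)

# Hypothetical tokenized data standing in for freq_dataframe from Step 1
freq_dataframe <- tibble(word = c("cat", "dog", "cat", "garden", "cat", "dog"))

count_dataframe <- freq_dataframe %>%
  count(word, sort = TRUE)   # one row per word, counts in column n, largest first

print(count_dataframe)
```

Sorting is not needed for the cloud itself; it only makes the table easier to read.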
Step 3 Build the cloud
install.packages("wordcloud")
library("wordcloud")
wordcloud(words = count_dataframe$word, freq = count_dataframe$n, max.words = 300)
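Two things are worth knowing about wordcloud(): word placement is random, and by default min.freq is 3, so words that appear only once or twice are left out. The sketch below fixes the random seed, lowers min.freq, and writes the picture to a PNG file. The counts and the output path are made up for illustration.

```r
library(wordcloud)

# Hypothetical counts standing in for count_dataframe from Step 2
count_dataframe <- data.frame(
  word = c("cat", "dog", "garden", "mat"),
  n = c(5, 3, 2, 1)
)

set.seed(42)  # placement is random; a fixed seed gives the same layout on every run

out_file <- file.path(tempdir(), "wordcloud.png")  # hypothetical output path
png(out_file, width = 800, height = 800)           # open a PNG graphics device
wordcloud(
  words = count_dataframe$word,
  freq = count_dataframe$n,
  min.freq = 1,          # keep words that appear only once
  max.words = 300,
  random.order = FALSE   # plot words in decreasing order of frequency
)
dev.off()                # close the device so the file is written
```

If some words are missing from the picture, min.freq is the first argument to check.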