R : word cloud from dataframe

Published May 05, 2020

OK, so let's imagine we have a data frame, text_dataframe, with a raw text column called text.
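If you don't have such a data frame at hand, a tiny stand-in is enough to follow along. This is only a sketch: the id column and the sentences are made up for illustration, and because the sample text is English it pairs with stopwords("en") rather than "ru".

```r
library(tibble)

# Hypothetical sample data: one row per document, raw text in the `text` column
text_dataframe <- tibble(
  id = 1:3,
  text = c(
    "The quick brown fox jumps over the lazy dog",
    "A word cloud shows the most frequent words",
    "The fox and the dog are frequent words here"
  )
)

print(text_dataframe)
```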

Step 1 Clean the data

Let's load stop words which we will remove from our text.

# install the packages once, then load them
install.packages(c("stopwords", "tidyverse", "tidytext"))
library(stopwords)
library(tidyverse)
library(tidytext)


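freq_dataframe <- text_dataframe %>%
  unnest_tokens(word, text) %>%
  anti_join(tibble(word = stopwords("ru")), by = "word")

# unnest_tokens - tokenizes the text column: one lowercase token per row, stored in the word column
# stopwords("ru") - returns a character vector, so we wrap it in a tibble before joining
# anti_join - removes every row whose word is in the stop word list

## You can choose any language instead of Russian: "en", "es", "fr", ...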
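As a sketch of switching languages, the same pipeline with English stop words might look like this. The names sample_df, stop_df and tokens_en are hypothetical, and the two sentences are invented for the example. Since stopwords() returns a plain character vector, the sketch puts it into a one-column tibble so that anti_join has a data frame to join against.

```r
library(dplyr)
library(tibble)
library(tidytext)
library(stopwords)

# Hypothetical two-row sample with English text
sample_df <- tibble(text = c("The cat sat on the mat", "The dog chased the cat"))

# stopwords("en") is a character vector; wrap it so it can be joined on `word`
stop_df <- tibble(word = stopwords("en"))

tokens_en <- sample_df %>%
  unnest_tokens(word, text) %>%      # one lowercase token per row
  anti_join(stop_df, by = "word")    # drop tokens found in the stop word list

print(tokens_en)
```

Only the language code changes between languages; the rest of the pipeline stays the same.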

Step 2 Count

count_dataframe <- freq_dataframe %>% count(word)
# word - the column of tokens; count() adds a column n with the frequency of each word
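Before drawing anything, it can help to look at the most frequent words. A minimal sketch, assuming a small hypothetical tokens table standing in for freq_dataframe: passing sort = TRUE to count() puts the most frequent words first.

```r
library(dplyr)
library(tibble)

# Hypothetical tokenized data standing in for freq_dataframe
tokens <- tibble(word = c("cat", "dog", "cat", "mat", "cat", "dog"))

top_words <- tokens %>%
  count(word, sort = TRUE)   # columns: word, n; sorted by n, descending

print(head(top_words, 2))
```

If stop words or stray tokens still dominate this list, it is easier to go back and fix the cleaning step now than after the cloud is drawn.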

Step 3 Build the cloud

install.packages("wordcloud")
library("wordcloud")
wordcloud(words = count_dataframe$word, freq = count_dataframe$n, max.words = 300)
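Two details are worth knowing here. The layout is random, so the picture changes on every run unless you fix the seed, and wordcloud() skips words that occur fewer than min.freq times (3 by default), which can leave a small data set almost empty. The sketch below assumes a made-up counts table and a temporary output file; both are placeholders, not part of the original example.

```r
library(wordcloud)

# Hypothetical counts standing in for count_dataframe
counts <- data.frame(word = c("cat", "dog", "mat", "fox"), n = c(5, 3, 2, 1))

# Hypothetical output path; use any file name you like
out_file <- file.path(tempdir(), "wordcloud.png")

set.seed(42)                                  # same layout on every run
png(out_file, width = 800, height = 800)      # draw into a PNG file instead of the screen
wordcloud(
  words = counts$word,
  freq = counts$n,
  min.freq = 1,            # keep rare words too (the default is 3)
  max.words = 300,
  random.order = FALSE     # place the most frequent words first, near the center
)
dev.off()                                     # close the device to finish writing the file
```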
Discover and read more posts from Alex Polymath