R: word cloud from a dataframe

Published May 05, 2020

Ok, so let's imagine we have a dataframe with a raw text field.
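If you don't have such a dataframe at hand, here is a minimal sketch of one. The name text_dataframe and the column text match the code in the steps of this post; the id column and the sentences are made up for illustration. The sample text is English, so with this data you would pick "en" as the stop word language.

```r
library(tibble)

# Hypothetical example data: one row per document, raw text in the `text` column
text_dataframe <- tibble(
  id = 1:3,
  text = c(
    "Word clouds show the most frequent words in a text.",
    "The more frequent a word is, the bigger it is drawn.",
    "Stop words such as the, a and is are usually removed first."
  )
)
```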

Step 1 Clean the data

Let's load the stop words that we will remove from our text.

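install.packages("stopwords")  # only needed once
library(stopwords)
library(tidyverse)
library(tidytext)

# stopwords("ru") returns a character vector, but anti_join() needs a data frame,
# so we wrap it in a one-column tibble named like the token column
stop_words_ru <- tibble(word = stopwords("ru"))

freq_dataframe <- text_dataframe %>%
  unnest_tokens(word, text) %>%          # tokenize: one lowercased word per row
  anti_join(stop_words_ru, by = "word")  # drop the rows whose word is a stop word

# You can choose any supported language instead of Russian: "en", "es", "fr", ...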
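To see what tokenizing and removing stop words actually do, here is a small sketch on a single made-up sentence. The names demo, tokens and cleaned are hypothetical. Note that anti_join() expects a data frame, so the character vector returned by stopwords() is wrapped in a tibble here.

```r
library(dplyr)
library(tidytext)
library(stopwords)

# Hypothetical one-row dataframe, English text this time
demo <- tibble(text = "The cat and the dog")

# One lowercased word per row: the, cat, and, the, dog
tokens <- demo %>% unnest_tokens(word, text)

# Keep only the rows whose word is NOT in the English stop word list
cleaned <- tokens %>%
  anti_join(tibble(word = stopwords("en")), by = "word")

print(cleaned$word)
```

Only "cat" and "dog" should survive, because "the" and "and" are in the default English list.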

Step 2 Count

count_dataframe <- freq_dataframe %>% count(word)
# word is the column of tokens created by unnest_tokens()
# count() adds a column n with the number of times each word occurs
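Before drawing anything, it can help to look at the most frequent words and check that no junk is left. A minimal sketch, assuming a small made-up token table (freq_demo and top_words are hypothetical names): passing sort = TRUE to count() orders the result from the most to the least frequent word.

```r
library(dplyr)

# Hypothetical stand-in for the tokenized dataframe: one word per row
freq_demo <- tibble(word = c("cat", "dog", "cat", "bird", "cat", "dog"))

# sort = TRUE puts the most frequent words first
top_words <- freq_demo %>% count(word, sort = TRUE)

# Show the top of the table
print(head(top_words, 10))
```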

Step 3 Build the cloud

install.packages("wordcloud")  # only needed once
library(wordcloud)
# max.words = 300 plots at most 300 words, the least frequent ones are dropped
wordcloud(words = count_dataframe$word, freq = count_dataframe$n, max.words = 300)
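The layout of the cloud is random, so it changes on every run. Here is a sketch of a reproducible, colored version saved to a file, assuming a small made-up count table; count_demo and the file name wordcloud.png are hypothetical. Keep in mind that min.freq defaults to 3, so rarer words are dropped unless you lower it.

```r
library(wordcloud)
library(RColorBrewer)

# Hypothetical stand-in for count_dataframe: a word column and a frequency column n
count_demo <- data.frame(
  word = c("data", "cloud", "word", "text", "plot"),
  n = c(12, 9, 7, 4, 2)
)

set.seed(42)  # fix the random layout so the picture is the same on every run

png("wordcloud.png", width = 800, height = 800)  # assumed output file name
wordcloud(
  words = count_demo$word,
  freq = count_demo$n,
  min.freq = 1,                    # keep rare words too (the default is 3)
  max.words = 300,
  random.order = FALSE,            # place the most frequent words first, near the center
  colors = brewer.pal(8, "Dark2")  # palette from RColorBrewer
)
dev.off()
```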


Alex Polymath
