How to Scrape Word Documents with R
Written by Matt Dancho
![](/assets/2020-09-17-scrape-word-docs/scrape_word_docs_cover.jpeg)
This article is part of R-Tips Weekly, a weekly video tutorial that shows you step-by-step how to do common R coding tasks.
Today we discuss an awesome skill: automating data collection from Word documents.
Here’s a common situation: your company has LOTS OF WORD FILES.
They contain tables of information that look like this:
![Word Tables](/assets/2020-09-17-scrape-word-docs/scrape_word_doc_1.jpg)
Thinking like a programmer, you can extract this data using officer:
![](/assets/2020-09-17-scrape-word-docs/scrape_word_doc_2.jpg)
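If you can't read the screenshot, here is a minimal sketch of the same idea using officer's `read_docx()` and `docx_summary()`. It is not the exact code from the video. So that it runs on its own, the sketch first writes a small sample Word file to a temporary path; the table values and the `Course`, `Students`, and `Lessons` column names are made up for illustration.

```r
library(officer)

# Hypothetical sample data (not the table from the article), used only to
# build a Word file so this sketch is self-contained
sample_tbl <- data.frame(
  Course   = c("101", "201", "202"),
  Students = c("100", "80", "60"),
  Lessons  = c("150", "400", "90"),
  stringsAsFactors = FALSE
)

# Write a small .docx containing one paragraph and one table
docx_path <- tempfile(fileext = ".docx")
doc <- read_docx()
doc <- body_add_par(doc, "Course activity")
doc <- body_add_table(doc, value = sample_tbl)
print(doc, target = docx_path)

# Read the Word file back in and summarize its contents:
# docx_summary() returns one row per paragraph or table cell
doc_in  <- read_docx(docx_path)
content <- docx_summary(doc_in)

# Keep only the rows that came from table cells
table_cells <- content[content$content_type == "table cell", ]
head(table_cells[, c("row_id", "cell_id", "text")])
```

With your own files, you would skip the sample-building step and point `read_docx()` at the path of a real document.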
With a little bit of data wrangling with the tidyverse, you’ve got your table extracted & formatted:
![](/assets/2020-09-17-scrape-word-docs/format_data_2.jpg)
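The wrangling step can be sketched roughly like this with dplyr and tidyr. The input is a hand-built stand-in for the table-cell rows that `docx_summary()` returns (one row per cell, with `row_id`, `cell_id`, `is_header`, and `text`). The values, the column names, and the `activity_ratio` name are assumptions for illustration, not the data from the video.

```r
library(dplyr)
library(tidyr)

# Hand-built stand-in for the "table cell" rows of officer::docx_summary()
# (hypothetical values; one row per cell, in long format)
table_cells <- tibble(
  row_id    = rep(1:4, each = 3),
  cell_id   = rep(1:3, times = 4),
  is_header = rep(c(TRUE, FALSE, FALSE, FALSE), each = 3),
  text      = c("Course", "Students", "Lessons",
                "101", "100", "150",
                "201", "80", "400",
                "202", "60", "90")
)

# Column names come from the header row, in cell order
col_names <- table_cells %>%
  filter(is_header) %>%
  arrange(cell_id) %>%
  pull(text)

# Spread the remaining cells into one row per table row, name the columns,
# then convert the text columns to numbers and compute the ratio
course_tbl <- table_cells %>%
  filter(!is_header) %>%
  select(row_id, cell_id, text) %>%
  pivot_wider(names_from = cell_id, values_from = text) %>%
  select(-row_id) %>%
  setNames(col_names) %>%
  mutate(
    Students       = as.numeric(Students),
    Lessons        = as.numeric(Lessons),
    activity_ratio = Lessons / Students
  )

course_tbl
```

The key move is `pivot_wider()`: the summary comes back with one row per cell, and pivoting on `cell_id` turns it back into a regular table with one row per Word table row.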
Then you use ggplot2 to make a sweet plot:
![](/assets/2020-09-17-scrape-word-docs/plot_code.jpg)
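For a rough idea of the plotting step, here is a minimal ggplot2 sketch. It is a plain bar chart rather than the styled plot from the video, and the data frame is a hypothetical stand-in for the extracted table (made-up numbers, assumed `Course`, `Students`, and `Lessons` columns).

```r
library(ggplot2)

# Hypothetical stand-in for the extracted and formatted table
course_tbl <- data.frame(
  Course   = c("101", "201", "202"),
  Students = c(100, 80, 60),
  Lessons  = c(150, 400, 90)
)

# Activity Ratio = lessons completed / students enrolled
course_tbl$activity_ratio <- course_tbl$Lessons / course_tbl$Students

# One bar per course, with bar height showing the Activity Ratio
p <- ggplot(course_tbl, aes(x = Course, y = activity_ratio)) +
  geom_col() +
  labs(
    title = "Activity Ratio by Course",
    x     = "Course",
    y     = "Lessons completed per student enrolled"
  )

p
```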
Whoa, look at 201! It has a high “Activity Ratio”, the ratio of lessons completed to number of students enrolled:
![](/assets/2020-09-17-scrape-word-docs/plot.jpg)
You’ve just automated extracting Word tables in R. BOOM! 💥💥💥
![](/assets/2020-09-17-scrape-word-docs/boom.gif)
SETUP R-TIPS WEEKLY PROJECT
- Get the Code
- Check out the R-Tips Setup Video.
Once you take these actions, you’ll be set up to receive R-Tips with Code every week. =)