How To Geocode In R For FREE
Written by Matt Dancho
What’s the one thing that helps you add value to your company’s raw geospatial data? GEOCODING.
Geocoding is the process of converting raw physical addresses to latitude and longitude geospatial points that can be viewed on a map and used for geospatial calculations. Heck - Geocoding has been known to increase my machine learning model performance by up to 10%!
Table of Contents
Today I’m going to show you how to do Geocoding in R for FREE using tidygeocoder. Here’s what you’re learning today:
- Tutorial Part 1: How to use tidygeocoder to effortlessly geocode addresses (convert your company addresses to Lat/Long)
- Tutorial Part 2: And I’m going to show you how to do Reverse Geocoding (go from Lat/Long to Physical Addresses)
- Bonus: I’m going to show you how to Map lat/long data using Simple Features + Mapview!
SPECIAL ANNOUNCEMENT: ChatGPT for Data Scientists Workshop on October 18th
Inside the workshop I’ll share how I built a Machine Learning Powered Production Shiny App with ChatGPT (extends this data analysis to an insane production app):
What: ChatGPT for Data Scientists
When: Wednesday October 18th, 2pm EST
How It Will Help You: Whether you are new to data science or are an expert, ChatGPT is changing the game. There’s a ton of hype. But how can ChatGPT actually help you become a better data scientist and help you stand out in your career? I’ll show you inside my free ChatGPT for Data Scientists workshop.
Price: Does Free sound good?
How To Join: 👉 Register Here
This article is part of R-Tips Weekly, a weekly video tutorial that shows you step-by-step how to do common R coding tasks. Pretty cool, right?
This Tutorial is Available in Video
I have a companion video tutorial that gives you the bonus Mapview Shortcuts shown in this video (plus walks you through how to use it). And, I’m finding that a lot of my students prefer the dialogue that goes along with coding. So check out this video to see me running the code in this tutorial. 👇
Why Geocoding is a Must
Look, I’ve been working with customer data for a long time…
And one of the RICHEST sources of data is raw company addresses!
Think about it. If you know where a company is located, do you think that might be important to their purchasing behavior?
Well it was for me. In fact I found out that just simply adding the Latitude and Longitude information to my customer churn prediction models…
Gave my models a 10% increase in performance!
Lots of Value to Machine Learning in Raw Customer Addresses
The Latitude and Longitude was key!
And that’s just one of the benefits of working with geospatial data (and geocoding).
But you’re probably thinking geospatial data is really tough.
Listen, I get it. Geospatial data is a little weird.
But, you have good ole Matt Dancho to help you out.
And my promise is today, I’m going to get you on the right track.
So let’s fix that geospatial problem, and make one small step today. And it starts with geocoding.
Thank You to the Developer (and Community).
Before we do our deep-dive into tidygeocoder, I want to take a brief moment to thank the developers working on the tidygeocoder project: Jesse Cambon, Diego Hernangómez, Christopher Belanger and Daniel Possenriede. Without their hard work, this tutorial (and easy Geocoding) wouldn’t be possible. Thank you!
Free Gift: Cheat Sheet for my Top 100 R Packages (Special Geospatial Analysis Topics Included)
Before we dive in…
You’re going to need R packages to complete the geospatial analysis that helps your company. So why not speed up the process?
To help, I’m going to share my secret weapon…
Even I forget which R packages to use from time to time. And this cheat sheet saves me so much time. Instead of googling to filter through 20,000 R packages to find a needle in a haystack, I keep my cheat sheet handy so I know which to use and when to use them. Seriously. This cheat sheet is my bible.
Once you download it, head over to page 3 and you’ll see several R packages I use frequently just for Data Analysis.
Which is important when you want to work in these fields:
- Machine Learning
- Time Series
- Financial Analysis
- Geospatial Analysis
- Text Analysis and NLP
- Shiny Web App Development
So steal my cheat sheet. It will save you a ton of time.
Tutorial: How to Geocode in R for Free with tidygeocoder
Time for geocoding with tidygeocoder. Let’s have some fun!
Step 1: Load the Libraries
Load the following libraries.
- tidygeocoder is the main library.
- My bonus lat/long map hack uses sf (Simple Features) and mapview.
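As a rough sketch, the setup could look like the block below. Loading tidyverse is an assumption made here for data wrangling and the pipe (dplyr and readr on their own would also do). The other three packages are the ones this tutorial names.

```r
# One-time install, if needed:
# install.packages(c("tidyverse", "tidygeocoder", "sf", "mapview"))

library(tidyverse)     # assumed here for data wrangling and the %>% pipe
library(tidygeocoder)  # geocode() and reverse_geocode()
library(sf)            # Simple Features: spatial data frames
library(mapview)       # interactive maps
```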
Step 2: Get My Pittsburgh Pharmacies Dataset
Next, you can steal my Pittsburgh Pharmacies dataset. This dataset is a great way to test your skills with Geocoding.
Steal The Pittsburgh Pharmacies Data Set
We’ll use the Pittsburgh Pharmacies dataset (171 geocoded pharmacies) throughout the rest of this tutorial.
Get it here. It’s in the
Next, read the data set into R.
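The dataset's file name and columns aren't shown here, so the sketch below is only a stand-in. It writes two placeholder rows (well-known Pittsburgh landmarks, not pharmacies from the real data) to a temporary CSV and reads them back with readr. The `name` and `address` column names and the `pharmacies_tbl` object are hypothetical. With the real file, you'd point `read_csv()` at wherever you saved it.

```r
library(readr)

# Hypothetical stand-in for the downloaded file: two placeholder rows,
# NOT taken from the Pittsburgh Pharmacies dataset.
csv_path <- tempfile(fileext = ".csv")
write_csv(
  tibble::tibble(
    name    = c("University of Pittsburgh", "Carnegie Mellon University"),
    address = c("4200 Fifth Ave, Pittsburgh, PA 15260",
                "5000 Forbes Ave, Pittsburgh, PA 15213")
  ),
  csv_path
)

# Read the CSV into a tibble (swap csv_path for the real file's location)
pharmacies_tbl <- read_csv(csv_path, show_col_types = FALSE)
pharmacies_tbl
```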
Step 3: Geocode the Address Column to get Latitude and Longitude
Next, use the geocode() function to convert a company’s physical address to a Latitude / Longitude.
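Here's a minimal sketch of that call, assuming the addresses sit in a single column named `address` (a hypothetical name). The two rows are placeholder landmarks rather than rows from the pharmacy data, and the call needs an internet connection because it queries the Nominatim service.

```r
library(dplyr)
library(tidygeocoder)

# Placeholder rows (Pittsburgh landmarks), NOT the pharmacy dataset
places_tbl <- tibble::tribble(
  ~name,                        ~address,
  "University of Pittsburgh",   "4200 Fifth Ave, Pittsburgh, PA 15260",
  "Carnegie Mellon University", "5000 Forbes Ave, Pittsburgh, PA 15213"
)

# Look up each address and append the results as new columns
geocoded_tbl <- places_tbl %>%
  geocode(
    address = address,    # column holding the single-line address
    method  = "osm",      # OpenStreetMap Nominatim (the default)
    lat     = latitude,   # name for the new latitude column
    long    = longitude   # name for the new longitude column
  )

geocoded_tbl
```

Addresses the service can't match should come back with NA coordinates, so it's worth checking for missing values before mapping or modeling.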
Here’s what happens: geocode() looks up each address and adds latitude and longitude columns to your data.
A quick point on the API being used. The default is method = "osm", which connects to the FREE OpenStreetMap Nominatim API. This is great, but it may be too slow for your needs, since the service limits you to about one request per second. Other free and paid APIs exist (and yes, Google’s Maps API is an option).
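Switching services is mostly a matter of changing the `method` argument. The sketch below uses `method = "arcgis"`, one of the services tidygeocoder supports without an API key. The choice of service and the placeholder address are assumptions for illustration. Paid services such as Google typically need an API key set up first.

```r
library(dplyr)
library(tidygeocoder)

# Placeholder address (a Pittsburgh landmark), NOT from the pharmacy dataset
places_tbl <- tibble::tibble(
  address = "4200 Fifth Ave, Pittsburgh, PA 15260"
)

# Same geocode() call, different geocoding service
arcgis_tbl <- places_tbl %>%
  geocode(address = address, method = "arcgis")

arcgis_tbl  # results land in the default `lat` and `long` columns
```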
Step 4: Reverse Geocode to go from Lat/Long to Physical Address
Sometimes you have a latitude and longitude and want a physical address. For example, your salesperson needs to know what addresses to visit (you wouldn’t send them a Lat/Long… or else they’d think you’re nuts!).
Did you know that you can reverse geocode?
You can! Here’s how to go from Latitude / Longitude to a Physical Address. (And save your inter-office reputation)
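A minimal sketch of the reverse direction, assuming columns named `latitude` and `longitude` (hypothetical names). The coordinates are approximate points in Pittsburgh's Oakland neighborhood, chosen only for illustration, and the call needs an internet connection.

```r
library(dplyr)
library(tidygeocoder)

# Approximate placeholder coordinates in Pittsburgh, NOT from the pharmacy dataset
coords_tbl <- tibble::tibble(
  latitude  = c(40.4443, 40.4433),
  longitude = c(-79.9532, -79.9436)
)

# Look up the nearest address for each coordinate pair
reverse_tbl <- coords_tbl %>%
  reverse_geocode(
    lat     = latitude,
    long    = longitude,
    address = address_found,  # name for the new address column
    method  = "osm"
  )

reverse_tbl
```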
And you can see that reverse geocoding creates an address from Lat/Long coordinates.
Bonus: Steal My Map Hack to Visualize Lat/Long Data
Want to visualize the geocoded data?
Steal my bonus script here. (It’s in the
Here’s what it does in 2 lines of code:
Now you can visualize all 171 Pittsburgh Pharmacies in an interactive map!
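The bonus script itself isn't reproduced here, but the general idea can be sketched with sf and mapview. Convert the lat/long columns to a Simple Features object, then hand it to `mapview()`. The `latitude` and `longitude` column names and the two placeholder rows are assumptions for illustration, not the pharmacy data.

```r
library(sf)
library(mapview)

# Placeholder geocoded rows (approximate coordinates), NOT the pharmacy dataset
geocoded_tbl <- tibble::tribble(
  ~name,                        ~latitude, ~longitude,
  "University of Pittsburgh",     40.4443,   -79.9532,
  "Carnegie Mellon University",   40.4433,   -79.9436
)

# Line 1: turn lat/long columns into point geometries (x = longitude, y = latitude)
places_sf <- st_as_sf(geocoded_tbl, coords = c("longitude", "latitude"), crs = 4326)

# Line 2: draw the points on an interactive map
mapview(places_sf)
```

The `crs = 4326` argument declares the coordinates as standard WGS 84 latitude/longitude, which is what geocoding services generally return.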
You learned how to use the tidygeocoder library to geocode and reverse geocode. Great work! But, there’s a lot more to becoming a data scientist.
If you’d like to become a Business Data Scientist (and have an awesome career, improve your quality of life, enjoy your job, and all the fun that comes along), then I can help with that.
Struggling to become a data scientist?
You know the feeling. Being unhappy with your current job.
Promotions aren’t happening. You’re stuck. Feeling Hopeless. Confused…
And you’re praying that the next job interview will go better than the last 12…
… But you know it won’t. Not unless you take control of your career.
The good news is…
I Can Help You Speed It Up.
I’ve helped 6,107+ students learn data science for business from an elite business consultant’s perspective.
I’ve worked with Fortune 500 companies like S&P Global, Apple, MRM McCann, and more.
And I built a training program that gets my students life-changing data science careers (don’t believe me? see my testimonials here):
6-Figure Data Science Job at CVS Health ($125K)
Senior VP Of Analytics At JP Morgan ($200K)
50%+ Raises & Promotions ($150K)
Lead Data Scientist at Northwestern Mutual ($175K)
2X-ed Salary (From $60K to $120K)
2 Competing ML Job Offers ($150K)
Promotion to Lead Data Scientist ($175K)
Data Scientist Job at Verizon ($125K+)
Data Scientist Job at CitiBank ($100K + Bonus)
Whenever you are ready, here’s the system they are taking:
Here’s the system that has gotten aspiring data scientists, career transitioners, and lifelong learners data science jobs and promotions…
Join My 5-Course R-Track Program
(And Become The Data Scientist You Were Meant To Be...)
P.S. - Samantha landed her NEW Data Science R Developer job at CVS Health (Fortune 500). This could be you.