Part 1 - Five Full-Stack Data Science Technologies for 2020 (and Beyond)

Written by Matt Dancho



Moving into 2020, three things are clear: organizations want Data Science, Cloud, and Apps. Here are the top 5 essential skills for Data Scientists who need to build and deploy applications in 2020 and beyond.

Articles in Series

  1. Part 1 - Five Full-Stack Data Science Technologies for 2020 (and Beyond) (You Are Here)
  2. Part 2 - AWS Cloud
  3. Part 3 - Docker
  4. Part 4 - Git Version Control
  5. Part 5 - H2O Automated Machine Learning (AutoML)
  6. Part 6 - R Shiny vs Tableau (3 Business Application Examples)
  7. [NEW BOOK] - The Shiny Production with AWS Book

Top 20 Tech Skills 2014-2019

Indeed, the popular employment-related search engine, released an article on changing trends in “Technology-Related Job Postings” from 2014 to 2019, examining the 5-year change in the most requested technology skills.

Today's Top Tech Skills

Top 20 Tech Skills 2014-2019
Source: Indeed Hiring Lab.

I’m generally not a big fan of these reports because the technology landscape changes so quickly. But I was pleasantly surprised by the time span of the analysis: Indeed looked at changes over a 5-year period, which gives a much better sense of the long-term trends.

Why No R, Shiny, Tableau, PowerBI, Alteryx?

The skills reported are not “Data Science”-specific, which is why you don’t see R, Tableau, PowerBI, or Alteryx on the list.

However, we can glean insights based on the technologies present…

Cloud, Machine Learning, Apps Driving Growth

From the technology growth, it’s clear that Businesses need Cloud + ML + Apps.

Key Technologies Driving Tech Skill Growth


My Takeaway

This assessment has led me to my key technologies for Data Scientists heading into 2020. I focus on key technologies related to Cloud + ML + Apps.

Top 5 Data Science Technologies for Cloud + ML + Apps

These are the technologies Data Scientists should learn for 2020 and beyond. They are geared towards the business demands: Cloud + ML + Apps. In other words, businesses need data science and machine learning-powered web applications deployed into production via the Cloud.

Here's what you need to learn to build ML-powered web applications and deploy them in the Cloud.

*Note that R and Python are skills that you should be learning before you jump into these.

5 Key Data Science Technologies for Cloud + Machine Learning + Applications


1. AWS Cloud Services

AWS is the most popular cloud service provider. EC2 is a staple for hosting apps, running Jupyter or RStudio in the cloud, and leveraging cloud resources rather than investing in expensive computers and servers.

Learn More: Data Science with AWS (A Top Skill for 2020)

2. Docker for Web Apps

Creating Docker environments drastically reduces the risk of software incompatibility in production. Docker Hub makes it easy to share your environment with other Data Scientists or DevOps. Further, Docker and Docker Hub make it easy to deploy applications into production.

Learn More: Docker for Data Scientists (A Top Skill for 2020)

3. Git Version Control

Git and GitHub are staples for reproducible research and web application development. Git tracks past versions and enables software upgrades to be performed on branches. GitHub makes it easy to share your research and/or web applications with other Data Scientists, DevOps, or Data Engineering. Further, Git and GitHub make it easy to deploy changes to apps in production.

Learn More: Git for Data Science Applications (A Top Skill for 2020)

4. H2O Machine Learning

H2O is an automated machine learning library available in Python and R. It works well on structured data (the format for 95% of business problems). Automation drastically increases productivity in machine learning.

Learn More: 5 Reasons to Learn H2O Machine Learning
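To make the H2O description above concrete, here is a minimal Python sketch of an AutoML run. It is an illustration under assumptions, not the workflow from any course: the function name train_automl, the CSV path, the target column, and the 80/20 split are all hypothetical. It requires the h2o package (and Java); the imports are deferred into the function so the file loads even where h2o is not installed.

```python
def train_automl(csv_path, target, max_models=10, seed=1, classification=True):
    """Run H2O AutoML on a structured (tabular) CSV file.

    Hypothetical helper for illustration; csv_path and target are assumptions
    supplied by the caller.
    """
    # Deferred imports: h2o is a third-party package and needs Java to run.
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()  # start (or connect to) a local H2O cluster

    frame = h2o.import_file(csv_path)
    if classification:
        # H2O treats the problem as classification when the target is a factor.
        frame[target] = frame[target].asfactor()

    # Hold out 20% of the rows so the best model can be checked afterwards.
    train, test = frame.split_frame(ratios=[0.8], seed=seed)
    predictors = [col for col in frame.columns if col != target]

    # AutoML trains and ranks up to max_models candidate models automatically.
    aml = H2OAutoML(max_models=max_models, seed=seed)
    aml.train(x=predictors, y=target, training_frame=train)
    return aml, test
```

A call such as train_automl("customers.csv", "churn") (both names hypothetical) returns the AutoML object and the held-out frame. aml.leaderboard ranks the trained models, and aml.leader.model_performance(test) scores the best one on the held-out rows.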

5. Shiny Web Apps

Shiny is a comprehensive web framework designed for data scientists, with a rich ecosystem of extension libraries (dubbed the “shinyverse”).

Learn More: Shiny vs Tableau (3 Business Application Examples)

Other Technologies Worth Mentioning

  1. SQL - For data scientists who need to create complex SQL queries but don’t have time to deal with messy SQL, dbplyr is a massive productivity booster: it converts R (dplyr) code to SQL. You can use it for 95% of SQL queries.

  2. Bootstrap - For data scientists who build apps, Bootstrap is the front-end web framework that Shiny is built on top of, and it powers much of the web (e.g. Twitter’s app). Bootstrap makes it easy to control the User Interface (UI) of your application.

  3. MongoDB - For data scientists who build apps, MongoDB is a NoSQL database that is useful for storing your application’s complex user information in a single collection (one nested document per user). This is much easier than creating a multi-table SQL database.
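The MongoDB point above can be sketched with the Python standard library alone. This is an illustration under assumptions, not MongoDB itself: a JSON string stands in for the document store, SQLite stands in for the multi-table SQL database, and the user record (field names, stock symbols, settings) is hypothetical.

```python
import json
import sqlite3

# Hypothetical user record for an app; all field names are assumptions.
user_doc = {
    "user": "jane",
    "favorites": ["AAPL", "GOOG"],
    "settings": {"mavg_short": 20, "mavg_long": 50},
}

# Document style: the whole nested record is stored and read back in one step.
stored = json.dumps(user_doc)
restored = json.loads(stored)

# Relational style: the same record is split across three tables.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE favorites (user_id INTEGER, symbol TEXT);
    CREATE TABLE settings (user_id INTEGER, key TEXT, value INTEGER);
""")
con.execute("INSERT INTO users VALUES (1, ?)", (user_doc["user"],))
con.executemany(
    "INSERT INTO favorites VALUES (1, ?)",
    [(symbol,) for symbol in user_doc["favorites"]],
)
con.executemany(
    "INSERT INTO settings VALUES (1, ?, ?)",
    list(user_doc["settings"].items()),
)


def load_user(con, user_id):
    """Rebuild the nested record from the three tables."""
    name = con.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()[0]
    favorites = [
        row[0]
        for row in con.execute(
            "SELECT symbol FROM favorites WHERE user_id = ? ORDER BY rowid",
            (user_id,),
        )
    ]
    settings = dict(
        con.execute(
            "SELECT key, value FROM settings WHERE user_id = ?", (user_id,)
        )
    )
    return {"user": name, "favorites": favorites, "settings": settings}
```

Both routes recover the same record, but the relational route needs a schema, three inserts, and three queries, while the document route is a single write and a single read.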

Real Shiny App + AWS + Docker Case Example

In my Shiny Developer with AWS Course (NEW), you work with the following application architecture: AWS EC2 provides an Ubuntu Linux server that hosts a Shiny app called the Stock Analyzer in the cloud.

Data Science Web Application Architecture
From Shiny Developer with AWS Course

You use AWS EC2 to build a server to run your Stock Analyzer application along with several other web apps.

AWS EC2 Instance used for Cloud Deployment
From Shiny Developer with AWS Course

Next, you use a Dockerfile to containerize the application’s software environment.

Dockerfile for Stock Analyzer App
From Shiny Developer with AWS Course

You then deploy your “Stock Analyzer” application so it’s accessible anywhere via the AWS Cloud.

Dockerfile for Stock Analyzer App
From Shiny Developer with AWS Course

If you are ready to learn how to build and deploy Shiny Applications in the cloud using AWS, then I recommend my NEW 4-Course R-Track System.



I look forward to providing you with the best data science for business education.

Matt Dancho

Founder, Business Science

Lead Data Science Instructor, Business Science University