What is the Career Path for a Data Scientist? (From $75,000 to $150,000 salary in 1-year)
Written by Matt Dancho
It was 2018 and Mohana was a struggling business analyst. He’d been getting a measly 3.5% raise each year since he joined his company.
Then in 2019 he got one raise (10%)…
Six months later, Mohana got another raise (this time 26%)…
And then, just 2 months after that, another one (a 40% hike).
In total, in under one year, Mohana got a 94% increase in his salary!
Today, Mohana is the Lead Data Scientist at Money View, one of India’s fastest-growing startups, which recently closed a $75 million Series D round of investment.
Mohana is kicking butt, this time in a different capacity.
As Lead Data Scientist, he’s helping Money View grow their talented and high-productivity team as they move into a new phase of startup growth.
I told Mohana how happy I was for him.
But how was Mohana able to double (2X) his salary in under 12-months?
What did he do to climb the career ladder so quickly and land a job wherever he wanted?
And how did he maneuver his career into working at a leading startup, Money View, where he’s now responsible for growing a team as a Lead Data Scientist?
The rest of this post will show you exactly how Mohana did it.
This post includes career research from Glassdoor, and case studies from 2 data scientists that are growing their careers faster than I’ve ever seen anyone do it. In this post, we’ll answer questions like:
- What data science roles exist? (and which to steer clear of)
- The career path for a data scientist (you start at $125,000)
- The skills needed to get promoted to Senior and Lead Data Scientist (you start at $150,000)
- Case Study 1: How to 2X your salary in 1-year ($75,000 -> $150,000)
- Case Study 2: How to make a splash (How one data scientist saved his company $5,000,000 each year)
1. What data science roles exist? (and which to steer clear of)
Data Science and Analytics Job Roles
The first question that comes to mind when you are learning about data science or trying to figure your way through is which roles exist?
There are 4 main categories:
Data Science Job Salaries (Glassdoor 2022)
We can see from the table that, in general, Data Science, Machine Learning, and NLP (Natural Language Processing) positions are compensated about 40% more than Business Analyst positions.
So it’s clear that you want to avoid Business Analyst (more on this in a minute).
We can also see that, depending on your interest (general data science vs specialized NLP), more specialization tends to mean more pay.
But, we’ll also see that pay will increase as your position changes from entry-level to senior/lead data scientist (coming shortly, hang in there).
But first, what does each of these data science roles do?
What does each of the data science roles do?
I’ve written extensively about the differences between each of the data science and analytics job roles here, but I’ll briefly recap in this table:
Data Science Job Roles Uncovered
We can see that it’s more typical for the hard-core engineering disciplines to use Python versus the more business analytical disciplines to use R/Python and Excel/PowerBI/Tableau.
If you are looking to move from Business Analysis to Data Science, we can see from the chart that you should add: R or Python to your skillset.
I explain in-depth exactly which skills are important to become a data scientist here.
What about Data Engineering?
The question I always get at this point is: What about Data Engineering?
Well, that’s a great question. Let me explain why Data Engineering is not in the table.
Here’s a typical conversation between a Business Analyst and a Data Engineer…
Typical conversation between a Business Analyst and a Data Engineer (Day in the Life of a Data Engineer)
So, what Data Engineers do is make the jobs of the Business Analysts and Data Scientists much easier.
They do this by giving data scientists access to data in a nice tidy looking format that comes from what they call a “data pipeline”.
According to the article, “A day in the life of a data engineer”, Data Engineers regularly deal with:
- Development of a data pipeline/API/microservice.
- Setup/Maintenance infrastructure
- Fixing bugs, improving code base, documentation
Data Engineers are valuable (if not essential) to data science program success
No question - Data Engineers are valuable.
But, since we are focused on the Data Science Career Path, it’s more important to focus on downstream tasks like production and business results rather than upstream tasks like data engineering, which is why I’m excluding Data Engineering career path from the conversation.
And quite honestly, I’m not the person to give you the pros and cons about data engineering.
This is why I’ll point you to a data engineering guru like Andreas Kretz.
It’s a mistake to go for NLP or Machine Learning Engineering (right away)
If you want to migrate into specialized roles like ML Ops or NLP Engineering from a Data Scientist position, then I’m all for that.
But, when you are just starting out, you’re best served learning data science first before moving into the more specialized fields.
Remember, you can always learn more later (become specialized), but in the beginning it’s important to gain general business domain and data science experience.
Then make your key moves after learning the business.
Avoid the business analyst position (RIGHT NOW!)
Popular Opinion: People should start as a business analyst, work there 2-4 years, and then migrate into data scientist positions.
Matt’s Opinion: People are regularly getting 50% raises by snatching up lucrative data scientist positions. You should do that.
We are in a once-in-a-lifetime generational disparity between the number of data scientists available (supply) and the number of positions needed (demand).
Massive Labor Shortage!
Because of COVID, there’s a bullwhip-effect in the US labor market. Let me explain.
In response to COVID, governments enforced a shutdown, forcing labor to decline swiftly and without notice.
Upon reopening, not all workers came back. This created a supply imbalance forcing companies to fill spots any way they could.
So what happened next is a once in a lifetime generational supply/demand imbalance that is working in your favor.
Companies began poaching data scientists from other companies stealing their highest value assets: their employees.
And the training time for most new hires is 1-2 years, so companies either had to offer higher salaries and benefits or be at risk of data scientists being poached.
Now you benefit.
Because you can SKIP the whole “business analyst -> data scientist” game and…
Jump right into Data Scientist roles.
How to jump right into data scientist roles
If you want to jump right into data scientist roles, check out these 2 articles:
- Which data science skills are important (To get a $50,000 increase in salary)
- How To Become A Financial Data Scientist (Or A Data Scientist In Any Domain)
Now that you know data science is right for you, let’s show you how to get promoted to Senior and Lead Data Scientist (to make $150,000/yr).
2. The career path for a Data Scientist (Start at $125,000/yr)
First, let’s cover the career path for a data scientist, which for 85% of organizations looks like this:
Data Science Career Path - Flow Chart
I’ve done the hard work of doing all the research on each of the positions. Here’s what it looks like in table form:
Data Science Career Path - Compensation
The things you need to think about are:
- How to get to Senior / Lead Data Scientist as fast as possible ($150,000 - $160,000)
- Pick a path - Strategic or Technical
- Then keep repeating until you get to the top
The General Path
Most organizations have a general track which will take you to a Lead Data Scientist. The path looks like this:
The General Path
You start as a data scientist
You’ll start at data scientist making around $125,000 per year in total compensation. All you need to do is get the skills listed here.
In fact, I even made a convenient cheat sheet to make it even easier (which you can download for free here).
And a Pro-Tip: Skip the Business Analyst position. Companies NEED YOU right now. Get the skills and go for it!
Next, you’ll become a Senior Data Scientist.
These guys and gals make $150,000 per year in total compensation. And they are more experienced, probably have some big data experience, cloud experience (AWS, Docker, Git), and can do more advanced analyses when compared to the regular data scientists.
So learn big data and cloud. And learn to do more advanced analyses: Time Series, NLP, and Web Applications.
Next, you’ll become a Lead Data Scientist
These fellas make $160,000 per year in total compensation. And what really separates the Leads from the Seniors is their ability to work with Management, craft persuasive arguments, and deliver insights (in the face of scrutiny), plus a well-developed EQ (not just IQ).
So learn to make and deliver presentations, work with others well, and build persuasive arguments.
Let’s put this ALL together (comparing Senior/Lead vs Data Scientist)
If you really want to compare these 3 general job roles, then I’ll make it even simpler for you. Just learn these skills.
Comparing Senior/Lead vs Data Scientist
Now you are probably thinking…
3. What skills do I need to become a Senior/Lead Data Scientist ($150,000+ year)?
The easiest way is to cheat!
What I mean is use a cheat sheet. Here’s my R-Cheat Sheet that will help you learn the skills you need to go from Data Scientist to Senior Data Scientist.
How to cheat to become a Senior/Lead Data Scientist.
If we head to my cheat sheet (page 3) you’ll find links to my go-to advanced tools for Senior/Lead Data Scientists. (PS: Check out this article for the tools for Data Scientists if you are becoming a Data Scientist.)
Matt's Goto Advanced Tools for Senior Data Scientists
Listen, I’m going to give you a little secret. THIS is how the Senior and Lead Data Scientists separate themselves from the novice Data Scientists.
Advanced Machine Learning, Feature Engineering, and Cross Validation
In the section titled, “Machine Learning”, you have all of the most powerful tools used for advanced machine learning, feature engineering, and cross-validation/hyperparameter tuning. THIS is a goldmine!
Advanced Machine Learning
Here are my personal favorites. I’m a big fan of two machine learning packages (or ecosystems):
- Tidymodels: I use this for making ad hoc models and then explaining them
- H2O: I use this for automatic machine learning and in production
Another (extremely important) skill is feature engineering. I’m always using THIS package to create features:
- Recipes: Has preprocessing tools to transform numeric data and create features from date, time, and text data.
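To make "creating features from date data" concrete, here's a minimal sketch in plain Python (standard library only). It is not the recipes API, just the general idea; the function name `date_features` and the particular features chosen are assumptions for illustration.

```python
from datetime import date

def date_features(d):
    """Turn a single date into simple features a model can learn from."""
    return {
        "year": d.year,
        "month": d.month,
        "day_of_week": d.weekday(),          # Monday = 0 ... Sunday = 6
        "is_weekend": d.weekday() >= 5,      # Saturday or Sunday
        "quarter": (d.month - 1) // 3 + 1,   # 1 through 4
    }

features = date_features(date(2022, 3, 5))
print(features)
```

A raw date is one opaque value to most models. Broken out like this, the model can pick up on weekly or quarterly patterns.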
Next is hyperparameter tuning / cross validation. Here are my goto packages:
- Tune: For hyperparameter tuning
- Rsample: For resampling and cross-validation sets that are inputs to Tune
- Yardstick: For using pre-built accuracy metrics to minimize/maximize your loss during cross-validation.
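If the resample / tune / score loop feels abstract, here's a rough sketch of what those three steps do together, written in plain Python (standard library only). This is not the tidymodels API. The helper names (`k_fold_splits`, `fit_ridge_1d`, `rmse`), the toy one-variable model, and the synthetic data are all assumptions made up for illustration.

```python
import random
from statistics import mean

def k_fold_splits(n, k, seed=42):
    """Resampling step: yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def fit_ridge_1d(xs, ys, penalty):
    """Toy model: y = slope * x, with an L2 penalty shrinking the slope (closed form)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)

def rmse(actual, predicted):
    """Metric step: root mean squared error."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted)) ** 0.5

# Synthetic data: y is roughly 3 * x plus noise
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [3 * x + rng.gauss(0, 1) for x in xs]

# Tuning step: score every candidate penalty on every fold, then keep the best
results = {}
for penalty in [0.0, 1.0, 10.0, 100.0]:
    fold_scores = []
    for train, test in k_fold_splits(len(xs), k=5):
        slope = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], penalty)
        fold_scores.append(rmse([ys[i] for i in test], [slope * xs[i] for i in test]))
    results[penalty] = mean(fold_scores)

best_penalty = min(results, key=results.get)
print(results)
print("best penalty:", best_penalty)
```

The point is the shape of the loop: every candidate setting gets scored on data it was not trained on, and the average score across folds decides the winner.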
Data Engineering (Big Data)
Another key skill of the “big dogs” is “big data”. This is where you work with data that is very large, sometimes SO large that it doesn’t fit inside your computer’s memory.
But don’t worry, I’ve got you covered here with some AMAZING packages.
Data Engineering in R (Big Data Tools)
If we head on down a little further on Page 3 of the cheat sheet, we find sections called “Speed and Scale” and “Integrating Python”.
First up is Data.Table
- data.table: This is the premier package for blazing speed. You can see how fast this is by exploring the Data Table Benchmarks here. In those benchmarks, it’s faster than Spark, dplyr, pandas, dask, and most major data engineering and database software.
- dtplyr: Now the big knock from tidyverse people (like me) that are used to dplyr is that the data.table syntax is weird. I eventually learned it, but people that want to skip the pain can use dtplyr. Dtplyr translates your dplyr code into data.table for you. And, if you want to get up to speed quickly, I wrote a comprehensive dtplyr tutorial here.
Next is databases
- dbplyr: This stands for “database” dplyr and allows us to run dplyr scripts on your database, which is mindblowing! Why? Because databases are built for speed and scale (a database server normally has far more RAM than your puny MacBook Pro) and we don’t need to transfer the data to our MacBook until it’s been chopped down, aggregated, and summarized. I wanted to help you get up to speed, so I made a free dbplyr tutorial here.
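The core idea (let the database do the heavy lifting, then pull back only the summary) can be sketched with Python's built-in sqlite3 module. This is not dbplyr itself, which writes the SQL for you from dplyr verbs. The `sales` table, its columns, and its rows are hypothetical.

```python
import sqlite3

# An in-memory database stands in for a real database server
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 250.0), ("west", 75.0), ("west", 125.0), ("west", 50.0)],
)

# The aggregation runs inside the database; only one row per region comes back
summary = conn.execute(
    "SELECT region, COUNT(*) AS n, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(summary)  # [('east', 2, 350.0), ('west', 3, 250.0)]
```

With five rows it makes no difference. With five hundred million rows, transferring two summary rows instead of the whole table is the whole game.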
Out-of-memory errors 😰
Now sometimes you’re going to run out of memory right before a presentation.
This is what happened to young Matt, before I knew about the next 2 packages.
I’d run code for my presentation tomorrow, and I’d get an error 2-hours in saying something like “out-of-memory” or “vector can’t be allocated.” 😰
Fortunately, I’ll help save your job (the way I eventually learned how to save mine). Here’s how.
Spark and Disk Frame (Fix Out of Memory Errors)
Head over to Speed and Scale (Page 3). Then click the links to sparklyr and Disk Frame.
Spark in R
- sparklyr: Spark is a tool that runs on cloud clusters and allows you to do all of your big data analysis in the cloud! And even better, sparklyr allows you to run all of the computations using dplyr translations, which makes you 10X more productive than your Python counterparts.
But you’re probably thinking, “But Matt, I don’t know how to do Spark from R. Can you help me?”
Yes… I’ll help. Here’s my Spark in R Masterclass that I opened up for free. Normally these are only available through my Learning Labs PRO membership program, but I can’t let you lose your job over an out-of-memory error. I wouldn’t be able to live with myself.
Disk Frame (R’s little big data secret)
Now, what happens if you don’t have access to a Spark Cluster? Well, another AWESOME package is the little known
- disk.frame: Disk frame allows you to chunk your datasets into blazingly fast fst files, which can then be treated as a single dataset. Disk frame integrates with data.table and dplyr, meaning you can use the syntax you already know, whether you are a data.table person OR a tidyverse person.
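The chunking idea is simple enough to sketch in plain Python (standard library only): read a slice that fits in memory, summarize it, throw it away, and move on. This is not the disk.frame API. The helper `read_in_chunks` is hypothetical, and a tiny in-memory CSV stands in for a file that would be too big to load at once.

```python
import csv
import io
from itertools import islice

def read_in_chunks(file_obj, chunk_size):
    """Yield lists of at most chunk_size rows, so only one chunk is in memory at a time."""
    reader = csv.DictReader(file_obj)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk

# Stand-in for a file too large to load at once: ids 1..10, amount = id * 2
big_file = io.StringIO("id,amount\n" + "\n".join(f"{i},{i * 2}" for i in range(1, 11)))

total, rows = 0.0, 0
for chunk in read_in_chunks(big_file, chunk_size=4):
    total += sum(float(r["amount"]) for r in chunk)  # summarize this chunk only
    rows += len(chunk)

print(rows, total, total / rows)  # 10 110.0 11.0
```

Sums and counts combine cleanly across chunks, which is why the overall mean can be computed without ever holding the full dataset.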
Finally, there’s Python in R
The last thing that separates Senior/Lead Data Scientists from the entry level is the ability to use Python with R.
Yep, you CAN use Python in R. Here’s how.
Reticulate: R's Python Connector
This is the most mind-blowing thing about R. And, it’s a super-power that will:
- Empower you to work collaboratively with Python teams (even though you’re an R user)
- Give you the key ingredient to make R packages that connect to Python packages. Here’s an R+Python package that I created called modeltime.gluonts that connects to the GluonTS Python package for forecasting. Pretty sweet!!
Ok, now that you have the skills to become a Senior / Lead Data Scientist, we need to consider where you go after you become a Lead Data Scientist…
The Technical Path vs Strategic Path
Technical vs Strategic Career Path
You see, there are two paths… so choose wisely.
Don’t worry, I’ll help make this decision crystal clear.
I’ll share my perspective and how I chose when it was my time.
You see, back in the day, before I was this amazing data science educator, I was a data scientist without a title (it was before “data scientist” existed at my previous employer).
I worked at a small company called Bonney Forge.
And, more than anything I loved the idea of influencing the direction of the company.
I was entrepreneurial, and enjoyed working with people.
Business was like a game of chess and I wanted to master it.
My customers were my unsuspecting opponent. And I used data science to checkmate them into more revenue.
Can anyone guess the path I chose?
If you guessed “STRATEGIC” you are 100% correct!
What about technical?
Even though I chose the strategic path, I don’t recommend it for everyone. Especially if you don’t like dealing with personnel issues as a manager.
I actually didn’t like this aspect one bit, but learned to be good with it, then busted my butt to get promoted out of a line manager position as fast as possible.
I eventually became a director, and my life was once again in harmony (like 38% of the time).
So what’s my point?
Well, if you can’t stand personnel issues for a year or two, then don’t go into the strategic path.
Directors and chiefs are great, but I’m nowhere near that level
Listen, I get it.
But if you are reading this, you’re probably also highly motivated.
And guess what, those highly motivated people are the ones that eventually become directors and chiefs.
So it would be a mistake not to explain to you the ins-and-outs of the entire data science career path.
Not just simply how to double your salary… capisce?!
Three ways to get promotions (FAST)
The 3 ways to get promotions fast are:
- Be more productive than everyone else around you
- Do something big!! (and repeat)
- Job hopping
I’m a big fan of case-studies (it’s what we do in MBA school), and they work. So let’s cover some case studies of how to get promoted.
Note, I’m not going to discuss job-hopping. I’ll have a different article soon on how to get a job in data science (with interview hacks and back-office secrets guaranteed to land you a job). Stay tuned.
Onto our first case-study.
4. Case Study 1: How one data scientist 2X’ed his salary in 1-year
People are lazy. (I’m just going to say it.)
The simple fact is that people get comfortable.
But you don’t have to. In fact, the comfort of others CAN be something you can exploit.
An edge (if you’re smart).
Surely, you can’t be serious?
I am serious!
In fact, here’s the story of how Mohana did it (remember Mohana from the beginning of this article?).
Mohana was the analyst that got 3 raises in the span of a year totaling a 94% increase.
So if his salary was $75,000 starting out, by the end of the year it was just shy of $150,000.
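Here's the quick math behind that, as a small Python sketch. The three raises compound (each one applies to the already-raised salary), so they multiply rather than add. The $75,000 starting salary is the illustrative figure used above, and the function name `apply_raises` is made up for this example.

```python
def apply_raises(salary, raises_pct):
    """Apply percentage raises one after another (each compounds on the last)."""
    for pct in raises_pct:
        salary *= 1 + pct / 100
    return salary

start = 75_000
end = apply_raises(start, [10, 26, 40])
increase_pct = (end / start - 1) * 100

print(f"${end:,.0f} ({increase_pct:.0f}% increase)")  # $145,530 (94% increase)
```

Simply adding the raises (10 + 26 + 40) would give 76%. Compounding is what gets it to 94%, which lands a little under the rounded $150,000 figure.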
So, how did Mohana do it?
Mohana says, “I just wanted to thank you again. You are my career savior.”
He continues, “Before when I had no idea about you and your courses, my growth as an analyst just sucked! I got a hike of 3.5% [per year].”
Mohana exclaims, “After your entry into my life, I got a 10% hike, and then a 26% hike, and then a 40% hike”.
So what changed?
Mohana enrolled in my 5-Course R-Track Program.
That’s when the flood of raises started.
Let’s dive into how Mohana tripled (yes 3x-ed) his productivity.
3X-ing his productivity with my R-courses
Here’s the scoop. Mohana was working with a bunch of Python coders.
These guys are slow and comfortable.
But Mohana isn’t like them. He’s motivated.
Mohana just needs a little edge.
And, Mohana got that when he met Matt Dancho (me). :)
You see I gave him the edge he needed to triple (yes, 3X!) his productivity versus his peers.
How did I 3X Mohana’s productivity?
I taught him the way I code in R. He was able to write half the code and get twice as much done versus his Python counterparts.
I taught him how to make hundreds of machine learning models in minutes. I gave him my playbook for consulting with the secrets I used to spend less time on machine learning and more time on feature engineering.
I taught him the secrets to unlocking Shiny web apps that his organization can use. You see, while his Python counterparts were trying to get their first app launched, Mohana already had three done.
And, I taught him the hidden way to scale time series to 1000’s of forecasts in minutes. This gave him a skill that no one… I mean no one had in his company.
Then, Mohana simply applied what I taught him to his business. And…
Now he’s a Lead Data Scientist
Mohana kept repeating. He kept growing.
Today he’s the Lead Data Scientist at Money View, one of the fastest growing startups in India. And they are about to grow even faster with the $75 million Series D round of investment they just received.
And, this is what I live for. Seeing my students succeed like this.
But that’s just one case. I couldn’t possibly duplicate it, could I?
5. Case Study 2: How one data scientist saved his company $5,000,000 per year
What if you could save your company $5,000,000 every year in perpetuity?
Would your company value you?
Would you be promoted?
Well, this is exactly what happened to another one of my students.
Auggie learned how to make attrition models
Here’s what Auggie did…
Through my R-Track Program, Auggie learned the necessary skills to build complex attrition models.
Auggie was then able to apply the course framework to his business problem.
In the car insurance industry, his company needs to make assessments of whether or not vehicles were totaled in collisions. An incorrect assessment can be very costly to the car insurance firm.
Using my coursework, Auggie made a better model. In fact so much better that…
Auggie’s model saved the organization $400,000 every month
A quick math check ($400,000 × 12 months) shows that Auggie saved his organization $4,800,000 per year, or roughly $5,000,000. And these estimates may actually be low (meaning the model is likely saving more).
Auggie was recognized.
Auggie says, “The project was a huge success. I got a personal message from the CTO and the CEO mentioned the model in our most recent investor call.”
Auggie was rewarded with a promotion.
He exclaims, “The skills displayed during the project were a major consideration factor in my promotion to Analytics Manager a few months later. And it was all thanks to the skills I picked up in your R-Track courses.”
This is why organizations everywhere will value you if you learn data science.
And, I can help.
How to go from a $75,000 to a $150,000 salary
If you’ve read this article, you now have all of the information that is needed to take you from a $75,000 salary to a $150,000 salary.
But, you still don’t have a plan to do it fast.
It will take 2-years (or longer) on your own.
In fact, it actually took me 5 years of struggle to learn data science on my own. I took bootcamps, read books, researched paper after paper, and nothing worked.
But that’s why I created my R-Track Program. To help people like me, struggling to get the 6-figure career they deserve.
Imagine what earning $125,000+ 6 months from now could do for you
How amazing would it be to know you have the financial freedom to do anything you want?
You can take a vacation.
Spend more time with family.
Have financial stability and less stress.
And this is why an investment in yourself will unlock those dreams.
Remember Mohana? (3.5% raise to 94% raise in under 1-year)
Mohana was getting 3.5% raises.
He’s now a Lead Data Scientist at Money View, one of India’s fastest growing start-ups.
He says, “I just want to thank you again. You are my career savior.”
I replied, “Congratulations. You are seeing what happens when you invest in yourself.”
If you are ready to learn, I’m ready to teach.