# ggside: A new R package for plotting distributions in side-plots

Written by Matt Dancho

I fell in love with a new ggplot2 extension. It made it much simpler to uncover relationships in my complex business data. ggside is a new R package that draws "marginal distribution plots", which are the density side-plot panels to the top and right of a scatter plot (made popular by the Python Seaborn package). Let's get you up and running with ggside in under 5 minutes with this quick R-Tip.

# This Tutorial Is Available In Video

I have a companion video tutorial that shows even more secrets (plus mistakes to avoid). And, I'm finding that a lot of my students prefer the dialogue that goes along with coding. So check out this video to see me running the code in this tutorial. 👇

Watch my 5-minute tutorial on YouTube

# What are Marginal Distributions?

Marginal Distribution (Density) plots extend a plot of numeric data with side panels that show how each variable is distributed, usually as a density (histograms and box plots work too).

Marginal Distribution Plots were made popular with the seaborn jointplot() side-panels in Python. These add side plots that highlight distributions.

# Side-Plot Tutorial with ggside

Marginal distributions can now be made in R using ggside, a new ggplot2 extension. You can make linear regression plots with marginal distributions drawn as histograms, densities, box plots, and more. Bonus: the side panels are super customizable for uncovering complex relationships.

Here are two examples of what you will do in this tutorial! 👇

### Plot 1: Linear Regression with Marginal Distribution (Density) Side-Plots (Top and Right)

The first plot you'll make…

### Plot 2: Facet-Plot with Marginal Box Plots (Top)

The second plot you'll make…

# Thank You Developers

I want to thank jtlandis for his amazing software contribution. JT is a data scientist at the University of North Carolina at Chapel Hill and an R developer who created ggside. Thank you for all you do!

# Before we get started, get the Cheat Sheet

ggside is great for making marginal distribution side plots. But, you'll still need to learn how to visualize data with ggplot2. For those topics, I'll use the Ultimate R Cheat Sheet to refer to ggplot2 code in my workflow.

### Quick Example:

Download the Ultimate R Cheat Sheet. Then click the "CS" next to "ggplot2", which opens the Data Visualization with ggplot2 Cheat Sheet.

Now you're ready to quickly reference ggplot2 functions.

The libraries we'll need today are ggside, patchwork, ggridges, ggrepel, maps, tidyverse, and lubridate. All packages are available on CRAN and can be installed with install.packages(). Note: I'm using the development version of ggside, which is what I recommend in the YouTube video.

The dataset is the mpg data that comes with ggplot2.
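A minimal setup sketch follows. It loads only ggplot2 (part of the tidyverse) and ggside, since those are the two packages the plots in this tutorial depend on. The commented-out install lines are assumptions: in particular, the GitHub location `jtlandis/ggside` for the development version is not stated in this post, so verify it before running.

```r
# Install once (uncomment as needed).
# install.packages(c("tidyverse", "ggside"))     # CRAN releases
# Assumption: the development version is hosted at this GitHub repo.
# remotes::install_github("jtlandis/ggside")

library(ggplot2)  # part of the tidyverse; ships the mpg dataset
library(ggside)   # side-panel geoms such as geom_xsidedensity()

# Peek at the columns used later: hwy, cty, class, and cyl.
head(mpg[, c("hwy", "cty", "class", "cyl")])
```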

# Plot 1: Linear Regression with Marginal Distribution Plot

We'll start by replicating what you can do with Python's Seaborn jointplot(). We'll accomplish this with ggside::geom_xsidedensity() and ggside::geom_ysidedensity().

### We set up the plot just like a normal ggplot.

Refer to the Ultimate R Cheat Sheet for:

• ggplot()
• geom_point()
• geom_smooth()
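The base plot built from these three functions might look like the sketch below. The choice of `hwy` on the x-axis, `cty` on the y-axis, and the point styling are assumptions made for illustration. Setting `aes(color = NULL)` inside `geom_smooth()` drops the per-class grouping so that one loess curve is fit to all vehicles.

```r
library(ggplot2)

# Scatter plot of city vs. highway fuel economy, colored by vehicle class,
# with a single loess smoother fit across all classes.
p_base <- ggplot(mpg, aes(x = hwy, y = cty, color = class)) +
  geom_point(size = 2, alpha = 0.3) +
  geom_smooth(aes(color = NULL), method = "loess", formula = y ~ x, se = TRUE)

p_base
```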

### Next we add from ggside:

• geom_xsidedensity() - Adds a side density panel (top panel).
• geom_ysidedensity() - Adds a side density panel (right panel).

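The trick is using after_stat(density) for the side panel's free axis (y in the top panel, x in the right panel), which makes an awesome-looking marginal density side panel. Without it, the side layer inherits the main plot's y (or x) mapping. I increased the size of the marginal density panels with theme(ggside.panel.scale.x).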
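Putting the pieces together, here is a hedged sketch of Plot 1. The axis mapping, alpha values, labels, and the 0.4 panel scale are illustrative assumptions rather than the exact settings from the video. `ggside.panel.scale.y` is added alongside `ggside.panel.scale.x` so that the right panel is enlarged as well.

```r
library(ggplot2)
library(ggside)

p_density <- ggplot(mpg, aes(x = hwy, y = cty, color = class)) +
  geom_point(size = 2, alpha = 0.3) +
  geom_smooth(aes(color = NULL), method = "loess", formula = y ~ x, se = TRUE) +
  # Top panel: density of hwy, stacked by class.
  geom_xsidedensity(
    aes(y = after_stat(density), fill = class),
    alpha = 0.5, position = "stack"
  ) +
  # Right panel: density of cty, stacked by class.
  geom_ysidedensity(
    aes(x = after_stat(density), fill = class),
    alpha = 0.5, position = "stack"
  ) +
  labs(
    title = "Fuel Economy by Vehicle Type",
    x = "Highway MPG", y = "City MPG"
  ) +
  # Side panels take up 40% of the main panel's size (illustrative value).
  theme(
    ggside.panel.scale.x = 0.4,
    ggside.panel.scale.y = 0.4
  )

p_density
```

Stacking the densities by class is what makes the differences between vehicle classes visible in the side panels.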

### Loess Regression w/ Marginal Density

We generate the regression plot with marginal distributions (density) to highlight key differences between the automobile classes. We can see:

• Pickup, SUV - Have the lowest Highway Fuel Economy (MPG)
• 2seater, Compact, Midsize, Subcompact - Have the highest Highway Fuel Economy

# Plot 2: Faceted Side-Panels

Next, let's try out some advanced functionality. I want to see how ggside handles faceted plots, which are subplots that vary based on a categorical feature. We'll use the "cyl" column to facet, which is for engine size (number of cylinders).

### Faceted Side Panels? No problem.

Awesome! I have included facets by "cyl", which creates four plots based on the engine size. ggside picked up on the facets and has made 4 side-panel plots.
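A sketch of the faceted plot is shown below, under a few assumptions: the top box plots are drawn per vehicle class using `aes(y = class)` with `orientation = "y"`, the facets use `facet_grid()` with free x scales, and `scale_xsidey_discrete()` is added so the discrete class axis in the side panel stays separate from the continuous y-axis of the main panel (depending on your ggside version this scale may not be strictly required). The smoother is left out to keep the example short.

```r
library(ggplot2)
library(ggside)

p_facet <- ggplot(mpg, aes(x = cty, y = hwy, color = class)) +
  geom_point() +
  # Top panel: horizontal box plots of cty, one per vehicle class.
  geom_xsideboxplot(aes(y = class), orientation = "y", alpha = 0.5) +
  scale_xsidey_discrete() +
  # One column of panels per number of cylinders (4, 5, 6, 8).
  facet_grid(cols = vars(cyl), scales = "free_x") +
  labs(title = "Fuel Economy by Engine Size (Cylinders)") +
  theme(ggside.panel.scale.x = 0.4)

p_facet
```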

# 💡 Conclusions

You learned how to use ggside. Great work! But, there's a lot more to becoming a Business Scientist (my term for an incredibly valuable data scientist that has business problem-solving skills).

If you'd like to become a Business Scientist…

With an awesome 6-figure data science career, improved quality of life, a fulfilling job that helps your business, and all the fun that comes along with a career that gives you the freedom to be creative and a problem solver in industry, then I would love to help you.
