Business Science

Learn Data Science with R

    Supply Chain Analysis with R Using the planr Package

    Written by Matt Dancho on October 20, 2024
    ...

    Effectively managing inventory is critical to a well-functioning supply chain. Learn how to project inventories using the planr package in R to optimize your supply chain operations.

    Read More...

    Top 10 R Packages for Exploratory Data Analysis (EDA) (Bookmark this!)

    Written by Matt Dancho on October 3, 2024
    ...

    Enhance your data analysis workflow with these top 10 R packages for exploratory data analysis (EDA). Discover how to gain deeper insights into your data using tools like skimr, psych, corrplot, and more.

    Read More...

    Introducing gt_summarytools: Analyze Your Data Faster With R

    Written by Matt Dancho on September 22, 2024
    ...

    In today's fast-paced data science environment, speeding up exploratory data analysis (EDA) is more critical than ever. This is where gt_summarytools() comes in. A new function I’ve developed, gt_summarytools(), combines the best features of gt and summarytools, allowing you to create detailed, interactive data summaries faster and with more flexibility than ever.

    Read More...

    How to Analyze Your Data Faster With R Using summarytools

    Written by Matt Dancho on September 15, 2024
    ...

    Getting quick insights into your data is absolutely critical to data understanding, predictive modeling, and production. Learn how to use the summarytools package in R to analyze your data faster.

    Read More...

    Top 25 R Packages (You Need To Learn In 2024)

    Written by Matt Dancho on August 25, 2024
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's lesson, I'm sharing a curated list of the top 25 R packages that you aren't using (but you need to learn in 2024). Let's go!

    Read More...

    Tableau in R for $0 (Introducing GWalkR)

    Written by Matt Dancho on August 9, 2024
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's lesson, I'm sharing how to do data exploration in R with a visualization tool called GWalkR that is like Tableau for $0. Let's go!

    Read More...

    Shockingly-fast data manipulation in R with polars

    Written by Matt Dancho on July 19, 2024
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's lesson, I'm sharing how to use Polars in R for shockingly-fast data manipulation. Let's go!

    Read More...

    Large Language Models (LLMs) in R with tidychatmodels

    Written by Matt Dancho on June 14, 2024
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's lesson, I'm sharing how to use Large Language Models (LLMs) in R with tidychatmodels. Let's go!

    Read More...

    How to Get ChatGPT in R with chattr

    Written by Matt Dancho on May 11, 2024
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's lesson, I'm sharing how to get ChatGPT in R with chattr. Let's go!

    Read More...

    How to Make Mobile Apps with R Shiny

    Written by Matt Dancho on April 14, 2024
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's lesson, I'm sharing how to make Mobile Web Apps with R Shiny using shinyMobile. Let's go!

    Read More...

    How to Scrape PDF Documents and Search PDFs with OpenAI LLMs (in R)

    Written by Matt Dancho on March 31, 2024
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's lesson, I'm sharing how to scrape a PDF financial statement. Then I'll show you how to summarize it with OpenAI LLMs in R. Let's go!

    Read More...

    Make Microsoft Word Reports with R + officedown

    Written by Matt Dancho on February 24, 2024
    ...

    What's the one thing that will impress your company? A professional business report. And Microsoft Word is the de facto standard (NOT Jupyter Notebooks or HTML web reports). Even PDFs aren't ideal, especially if your colleagues need to review them.

    Read More...

    XGBoost: Tuning the Hyperparameters (My Secret 2 Step Process in R)

    Written by Matt Dancho on January 12, 2024
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's lesson, I'm sharing how I tune XGBoost Hyperparameters in R. Let's go!

    Read More...

    The Top 5 Time Series Analysis Concepts (that helped me the most in my career)

    Written by Matt Dancho on December 23, 2023
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's lesson, I'm sharing how to do Time Series Analysis in R. Let's go!

    Read More...

    Introduction to A/B Testing in R (For Marketing Analytics)

    Written by Matt Dancho on December 16, 2023
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's video, I'm sharing how to do A/B testing in R. Let's go!

    Read More...

    Data Engineering in R: How to Build Your First Data Pipeline with R, Mage, and Google Cloud Platform (in under 45 Minutes)

    Written by Arben Kqiku (Intro by Matt Dancho) on December 2, 2023
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's lesson, we're sharing how to use R in production, with Mage.ai and Google Cloud. Let's go!

    Read More...

    How to Make a Data Science Portfolio Website (in Under 15 Minutes with R)

    Written by Matt Dancho on November 4, 2023
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's video, I'm sharing how to make a professional data science portfolio in under 15 minutes. Let's go!

    Read More...

    Introducing anomalize for timetk in R (For Time Series Anomaly Detection)

    Written by Matt Dancho on October 29, 2023
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's video, I'm sharing the cheat code to detecting anomalies. We'll cover a full financial analysis. Let's go!

    Read More...

    ChatGPT: How to use ChatGPT to make a Shiny App (an Excel Analyzer)

    Written by Matt Dancho on October 15, 2023
    ...

    Hey guys, welcome back to my R-tips newsletter. In today's video, I'm sharing the cheat code to making an R shiny web app that automatically analyzes your Excel Files. Plus, I'm sharing exactly how I made it in under 15 minutes. AND how you can do it for ANY company. Let's go!

    Read More...

    Top 9 R packages (that every Data Scientist must know)

    Written by Matt Dancho on September 30, 2023
    ...

    In this article, I share 9 R packages that have helped me the most. Let's go!

    Read More...

    ChatGPT: How to automate Google Sheets in under 2 minutes (with R)

    Written by Matt Dancho on September 5, 2023
    ...

    In this article, I share the cheat code to automating Google Sheets. Plus, I'm sharing exactly how I made it in under 2 minutes. AND how you can do it for ANY company (using ChatGPT and R). Let's go!

    Read More...

    How to make Data Visualizations THAT GO VIRAL (with ggplot2 in R)

    Written by Matt Dancho on August 27, 2023
    ...

    Quit trying to tell stories with data visualization. Do this instead. If you're a data scientist or data analyst who wants to get an executive to stop looking at his phone and start looking at your slide deck, OR someone on LinkedIn to stop mindlessly scrolling and start reading your post, then this tutorial will help.

    Read More...

    How to make Ridiculous Tables in R (from Excel)

    Written by Matt Dancho on August 6, 2023
    ...

    Transitioning from Excel to R for data analysis enhances efficiency and enables more complex operations, and R's capability to convert Excel tables simplifies this transition. This article illustrates the importance of this shift and guides readers through the process of converting Excel tables into R.

    Read More...

    How to use ChatGPT for Time Series in R

    Written by Matt Dancho on July 20, 2023
    ...

    Writing code is a slow process especially when you are first learning time series. What if you could speed it up? You can and this is how.

    Read More...

    K-Means Clustering in R the Tidy Way (Feat. Tidyclust)

    Written by Matt Dancho on July 6, 2023
    ...

    Need to cluster faster? I've been playing around with a new R package that makes it super simple. It's called Tidyclust.

    Read More...

    How to improve your storytelling with R

    Written by Matt Dancho on June 5, 2023
    ...

    Storytelling is critical to your success as a Data Scientist. Your career hinges on whether or not you can persuade management, executives, and leadership to make decisions. But how do you do this effectively?

    Read More...

    How to R code faster with ChatGPT

    Written by Matt Dancho on April 30, 2023
    ...

    Writing code is a slow process especially when you are first learning data science. What if you could speed it up? You can and this is how.

    Read More...

    ChatGPT: Made this Shiny App in 10 Minutes

    Written by Matt Dancho on April 2, 2023
    ...

    Want to learn how to build a Shiny app in under 10 minutes? You can with ChatGPT!

    Read More...

    How To Geocode In R For FREE

    Written by Matt Dancho on March 19, 2023
    ...

    Want to impact your business? Learn how to use geospatial data... And the first step is Geocoding addresses and latitude/longitude.

    Read More...

    Exploratory DataXray Analysis (EDXA)

    Written by Matt Dancho on January 5, 2023
    ...

    Do you know how long EDA (exploratory data analysis) used to take me? Not hours, not days... A full week! Today I'm going to show you how to use dataxray. With this new R package, you'll cut your EDA time down to 5 minutes.

    Read More...

    How to make a plot with two different y-axis in R with ggplot2? (a secret ggplot2 hack)

    Written by Matt Dancho on December 1, 2022
    ...

    Your company lives off them... Excel files. Why not automate them & save some time? Here's an Excel File you're going to make in this tutorial from R.

    Read More...

    ggradar: radar plots with ggplot in R

    Written by Matt Dancho on October 4, 2022
    ...

    Need to quickly compare multiple groups in your data? Radar plots are the perfect way to analyze groups across many numeric metrics.

    Read More...

    explore: simplified exploratory data analysis (EDA) in R

    Written by Matt Dancho on September 23, 2022
    ...

    Did you know most Data Scientists spend 80% of their time just trying to understand and prepare data for analysis? R has an Insane Exploratory Data Analysis productivity-enhancer. It's called Explore.

    Read More...

    ggdensity: A new R package for plotting high-density regions

    Written by Matt Dancho on July 21, 2022
    ...

    As data scientists, it can be downright impossible to drill into messy data. Fortunately, there's a new R package that helps us focus on a "high-density region". It's called ggdensity.

    Read More...

    modelDown: Automate Explainable AI (Machine Learning) in R

    Written by Matt Dancho on July 6, 2022
    ...

    Machine learning is great... until you have to explain it. Stakeholders are normally non-technical C-suite executives who ultimately want to know what the model does for the business, and how it helps increase revenue or decrease costs. A new R package, modelDown, can help.

    Read More...

    Survival Analysis in R (in under 10-minutes)

    Written by Matt Dancho on June 9, 2022
    ...

    Learn how to do survival analysis in R in under 10-minutes. Plus get 3 bonuses to take your survival plots to the NEXT LEVEL. Let's go!

    Read More...

    The Most Overlooked R Package (That Can Get You Through A Data Science Job Interview)

    Written by Matt Dancho on June 2, 2022
    ...

    If you are looking to learn about the most overlooked R package that can help you get through a job interview AND you probably don't know it yet, you've come to the right place, my friend! And, if you want a job in data science, I'm going to show you how THIS R package can help you get through an interview with 5 lines of code.

    Read More...

    How I analyze 100+ ggplots at once

    Written by Matt Dancho on March 30, 2022
    ...

    Big data? Lots of time series? Traditionally you'd use ggplot facets. But that only works for a few datasets. Enter trelliscopejs. It's a game changer!

    Read More...

    What is the Career Path for a Data Scientist? (From $75,000 to $150,000 salary in 1-year)

    Written by Matt Dancho on March 15, 2022
    ...

    The data science career path is demystified in this article showing you 2 case studies and a ton of research on how to set your career up for going from $75,000 to $150,000 in 1-year.

    Read More...

    The 14 most important data science skills (To get a $50,000 increase in salary)

    Written by Matt Dancho on March 11, 2022
    ...

    Which skills are important to becoming a data scientist? How to pick a language? How to learn the skills? These questions and many more are answered in this post.

    Read More...

    My 4 most important explainable AI visualizations (modelStudio)

    Written by Matt Dancho on February 22, 2022
    ...

    The modelStudio library offers an interactive studio for developing and exploring explainable AI visualizations.

    Read More...

    A new R package for Business Analytics... radiant

    Written by Matt Dancho on February 1, 2022
    ...

    I'm super impressed by the radiant R package. With no prior exposure to radiant, I was able to complete a short business analytics report in under 10-minutes.

    Read More...

    Introducing portfoliodown: The Data Science Portfolio Website Builder

    Written by Matt Dancho on December 20, 2021
    ...

    I'm super excited to introduce a new R package that makes it painless for data scientists to create a professional.

    Read More...

    How to Make a Heatmap in R

    Written by Matt Dancho on October 12, 2021
    ...

    The ggplot2 package is an essential tool in every data scientist's toolkit. Today we show you how to use ggplot2 to make a professional heatmap that organizes customers by their sales purchasing habits.

    Read More...

    Tidy Parallel Processing in R with furrr

    Written by Matt Dancho on September 14, 2021
    ...

    furrr is a critical package to speed up iterative calculations using tidyverse purrr syntax.

    Read More...

    ggalt: Make a Lollipop Plot to Compare Categories in ggplot2

    Written by Matt Dancho on August 24, 2021
    ...

    ggalt is a ggplot2 extension that adds many new ggplot geometries. In this tutorial, we'll learn how to make lollipop plots for comparing categories within our data using geom_lollipop().

    Read More...

    ggalt: Make a Dumbbell Plot to Visualize Change in ggplot2

    Written by Matt Dancho on August 12, 2021
    ...

    ggalt is a ggplot2 extension that adds many new ggplot geometries. In this tutorial, we'll learn how to make dumbbell plots for visualizing change within our data using geom_dumbbell().

    Read More...

    ggforce: Make a Hull Plot to Visualize Clusters in ggplot2

    Written by Matt Dancho on July 27, 2021
    ...

    ggforce is a ggplot2 extension that adds many exploratory data analysis features. In this tutorial, we'll learn how to make hull plots for visualizing clusters or groups within our data.

    Read More...

    ggdist: Make a Raincloud Plot to Visualize Distribution in ggplot2

    Written by Matt Dancho on July 22, 2021
    ...

    The ggdist package is a ggplot2 extension that is made for visualizing distributions and uncertainty. We'll see how ggdist can be used to make a raincloud plot.

    Read More...

    easystats: Quickly investigate model performance

    Written by Matt Dancho on July 13, 2021
    ...

    The easystats performance R package makes it easy to investigate the relevant assumptions for regression models. Simply use the check_model() function to produce a visualization that combines 6 tests for model performance.

    Read More...

    R is for Research, Python is for Production

    Written by Matt Dancho and Jarrell Chalmers on July 12, 2021
    ...

    Both R and Python are great. We’ll showcase some of the strengths of each language in this article by showcasing where the major development efforts are within each ecosystem.

    Read More...

    Gentle Introduction to Forecasting with Modeltime [Video Tutorial]

    Written by Matt Dancho on June 29, 2021
    ...

    A gentle introduction to our forecasting package, Modeltime. Modeltime extends the Tidymodels ecosystem for time series forecasting. Learn how to forecast with ARIMA, Prophet, and linear regression time series models.

    Read More...

    grafify: Make 5 powerful ggplot2 graphs quickly with R

    Written by Matt Dancho on June 15, 2021
    ...

    grafify offers 19 plotting functions that make it quick and easy to make great-looking plots in R.

    Read More...

    Siuba: Data wrangling with dplyr in Python

    Written by Matt Dancho on June 8, 2021
    ...

    The siuba python library brings the power of R's dplyr and the tidyverse to Python. Gain access to functions like group_by(), mutate(), summarize(), and more!

    Read More...

    gghalves: Make Half Boxplot | Half Dotplot Visualizations with ggplot2

    Written by Matt Dancho on May 25, 2021
    ...

    gghalves is a new R package that makes it easy to compose your own half-plots using ggplot2.

    Read More...

    DataEditR: The GUI for Interactive Dataframe Editing in R

    Written by Matt Dancho on May 18, 2021
    ...

    Now you can edit data in R using a GUI that is reminiscent of Excel.

    Read More...

    ggside: A new R package for plotting distributions in side-plots

    Written by Matt Dancho on May 18, 2021
    ...

    Marginal distributions can now be made in R using ggside, a new ggplot2 extension.

    Read More...

    patchwork: ggplot2 plot combiner

    Written by Matt Dancho on May 11, 2021
    ...

    Now you can make publication-ready storyboards. Patchwork makes it simple to combine separate ggplots into the same graphic.

    Read More...

    mmtable2: ggplot2 for tables

    Written by Matt Dancho on May 4, 2021
    ...

    I love ggplot2 for plotting. The grammar of graphics allows us to add elements to plots. Tables seem to be forgotten in terms of an intuitive grammar with tidy data philosophy - Until now.

    Read More...

    ggplot2 Extension: corrmorrant for Flexible Correlation Plots in R

    Written by Matt Dancho on April 27, 2021
    ...

    Productivity is essential in data science. Businesses need value quickly so they can make decisions. Corrmorrant gets this.

    Read More...

    Webscraping Tables in R: Datapasta Copy-and-Paster

    Written by Matt Dancho on April 20, 2021
    ...

    Datapasta is an amazing package that allows us to copy-and-paste any HTML or Excel Tables into R.

    Read More...

    Enhance Your Storytelling - Interactive Slide Decks with Rmarkdown

    Written by Matt Dancho on April 6, 2021
    ...

    Slide Decks are so important for storytelling in business. We can use Rmarkdown to tell our story with engaging interactivity thanks to the xaringan library. Here's how to make PowerPoint-style Slide Presentations that are interactive straight from R.

    Read More...

    Make PDF Data Analysis Reports with R | Rmarkdown Visual Editor

    Written by Matt Dancho on March 30, 2021
    ...

    Let's make a professional business report in 5 minutes, in HTML and PDF formats, that incorporates your data analysis in R. Reporting used to take me much longer and is now faster with the new Rmarkdown Visual Editor.

    Read More...

    Build GGPLOT Code with Tableau Drag-and-Drop (R esquisse)

    Written by Matt Dancho on March 23, 2021
    ...

    Tableau users rejoice! The esquisse R package is here to make your life much easier: make ggplot2 plot code using a drag-and-drop Tableau interface. Here's what you need to do.

    Read More...

    A Gentle Introduction to R Shiny

    Written by Matt Dancho on March 16, 2021
    ...

    What if you could turn your #datascience analysis into a web application? You can do exactly that with R Shiny. R Shiny is an amazing framework built to convert your data analysis into a web app - FAST! Create amazing applications your business can use in hours (not months!).

    Read More...

    Assess Your DATA QUALITY in R with skimr

    Written by Matt Dancho on March 9, 2021
    ...

    Skimr is my go-to R package for fast data quality assessment, and Skimr is my first step in exploratory data analysis. Before I do anything else, I check data quality with skimr.

    Read More...

    Hierarchical Time Series Forecasting [Full Code Tutorial]

    Written by Matt Dancho on March 4, 2021
    ...

    In Learning Labs PRO Episode 50, Matt tackles an in-depth tutorial on Hierarchical Forecasting using the M5 Forecasting Competition.

    Read More...

    DataExplorer: Exploratory Data Analysis in R

    Written by Matt Dancho on March 2, 2021
    ...

    Did you know most Data Scientists spend 80% of their time just trying to understand and prepare data for analysis? R has an Insane Exploratory Data Analysis productivity-enhancer. It's called DataExplorer.

    Read More...

    5 Reasons You Should Learn Shiny

    Written by Matt Dancho and Jarrell Chalmers on February 25, 2021
    ...

    Many data scientists struggle with distributing their work. However, you can make that a problem of the past thanks to Shiny. Here are five reasons you should learn Shiny and why it is a game-changer for upskilling your career.

    Read More...

    How to Add Shiny to Rmarkdown

    Written by Matt Dancho on February 24, 2021
    ...

    Shiny is an R web framework with a HUGE ECOSYSTEM of interactive widgets, themes, and customizable user interfaces called the Shinyverse. In this article, we use Shiny to make our R Markdown Report interactive.

    Read More...

    R is for Research, Python is for Production

    Written by Matt Dancho and Jarrell Chalmers on February 18, 2021
    ...

    Both R and Python are great. We’ll showcase some of the strengths of each language in this article by showcasing where the major development efforts are within each ecosystem.

    Read More...

    Predictive Power Score vs CorrelationFunnel

    Written by Matt Dancho on February 16, 2021
    ...

    Exploratory Data Analysis is what every data scientist does to understand actionable insights from the data. This process used to take forever. Not anymore...

    Read More...

    Full Feature Engineering Tutorial with Max Kuhn

    Written by Matt Dancho on February 11, 2021
    ...

    Max Kuhn, from RStudio, discusses in-depth feature engineering for customer analytics. Watch Max and Matt tackle a tough feature engineering problem for customer analytics prediction.

    Read More...

    Make Awesome Statistical Plots in R

    Written by Matt Dancho on February 9, 2021
    ...

    I never thought I'd be able to make publication-ready statistical plots so easily. Seriously. Thanks to ggstatsplot.

    Read More...

    Should I Become a Data Scientist or Data Analyst?

    Written by Matt Dancho and Jarrell Chalmers on February 5, 2021
    ...

    In order to determine where you wish to set your career trajectory, you need to understand the grey area and differences between data scientists and data analysts.

    Read More...

    4 Ways to make Data Frames in R!

    Written by Matt Dancho on February 2, 2021
    ...

    Data frames (like Excel tables) are the main way for storing, organizing, and analyzing data in R. Here are 4 ways using the tidyverse.

    Read More...

    Should you learn Python or R in 2021?

    Written by Matt Dancho on January 28, 2021
    ...

    For years Python and R have been pitted as mortal enemies in the world of data science, enticing its practitioners to choose a side and never look back - not anymore. It's time for these two titans to join forces through reticulate which allows us to use Python and R together!

    Read More...

    How to Write SQL From R

    Written by Matt Dancho on January 26, 2021
    ...

    SQL queries getting you down? Let R write SQL queries for you!

    Read More...

    What kind of job can you get if you master machine learning?

    Written by Matt Dancho on January 21, 2021
    ...

    One reason interest in machine learning jobs will continue to grow is how lucrative the pay is. Another is how interesting the work is. If you're looking to plant your foot in a growing industry, then machine learning could be for you. The average machine learning salary, according to Indeed's research, can be anywhere between $96,00 - $146,085.

    Read More...

    How to Handle Missing Data in R with simputation

    Written by Matt Dancho on January 19, 2021
    ...

    In 10 minutes, learn how to visualize and impute missing data in R using ggplot, dplyr, and 3 more packages for simple imputation. Here are the links to get set up.

    Read More...

    R/Python Teams Course Announcement

    Written by Matt Dancho on January 14, 2021
    ...

    Add value as part of a collaborative R/Python team, and be confident working with Python and with Python users.

    Read More...

    How to Make 3D Plots in R

    Written by Matt Dancho on January 12, 2021
    ...

    Learn how to make AMAZING 3D Plots in R by combining ggplot2 and rayshader.

    Read More...

    6 Life-Altering RStudio Keyboard Shortcuts

    Written by Matt Dancho on January 5, 2021
    ...

    The RStudio IDE is amazing. You can enhance your R productivity even more with these simple keyboard shortcuts.

    Read More...

    Plotting Time Series in R (New Cyberpunk Theme)

    Written by Matt Dancho on December 30, 2020
    ...

    One of the most common data science visualizations is a Time Series plot. In this tutorial we'll learn how to plot time series using ggplot, plotly and timetk.

    Read More...

    Build and Evaluate A Logistic Regression Classifier

    Written by Matt Dancho on December 22, 2020
    ...

    Logistic regression is a simple, yet powerful classification model. In this tutorial, learn how to build a predictive classifier that classifies the age of a vehicle.

    Read More...

    6 Reasons To Learn R For Business [2021]

    Written by Matt Dancho on December 17, 2020
    ...

    Learn R for business - Data science for business is the future of business analytics. Here are 6 reasons why R is the right choice.

    Read More...

    Interactive Principal Component Analysis in R

    Written by Matt Dancho on December 15, 2020
    ...

    Identify Clusters in your Data. We'll make an Interactive PCA visualization to investigate clusters and learn why observations are similar to each other.

    Read More...

    How To Make Geographic Map Visualizations In R

    Written by Matt Dancho on December 8, 2020
    ...

    If you are explaining data related to geography or just want to visualize by latitude / longitude location, you need to know ggplot2 and the tidyverse for making maps.

    Read More...

    Top 5 Best Articles on R for Business [November 2020]

    Written by Matt Dancho on December 4, 2020
    ...

    Each month, we release tons of great content on R for Business. These are the 5 Top Articles in R for Business over the past month. We have some great ones in November 2020.

    Read More...

    Analyzing Solar Power Energy (IoT Analysis)

    Written by Nathaniel Whitlock on December 1, 2020
    ...

    Solar power is a form of renewable clean energy that is created when photons from the sun excite electrons in a photovoltaic panel, generating electricity. The power generated is usually tracked via sensor, with measurements happening on a time-based cadence.

    Read More...

    Time Series Demand Forecasting

    Written by Luciano Oliveira Batista on November 25, 2020
    ...

    Demand Forecasting is a technique for estimation of probable demand for a product or services. It is based on the analysis of past demand for that product or service in the present market condition.

    Read More...

    Forecasting Time Series ARIMA Models (10 Must-Know Tidyverse Functions #5)

    Written by Matt Dancho on November 24, 2020
    ...

    Making multiple ARIMA Time Series models in R used to be difficult. But, with the tidyr nest() function and modeltime, forecasting has never been easier. Learn how to make many ARIMA models in this tutorial.

    Read More...

    Detect Relationships With Linear Regression (10 Must-Know Tidyverse Functions #4)

    Written by Matt Dancho on November 17, 2020
    ...

    Group Split and Map are SECRET TOOLS in my data science arsenal. Combining them will help us scale up to 15 linear regression summaries to assess relationship strength and combine in a GT table.

    Read More...

    10 Must-Know Tidyverse Functions: #3 - Pivot Wider and Longer

    Written by Matt Dancho on November 13, 2020
    ...

    Pivoting wider is essential for making summary tables that go into reports and help humans understand key information.

    Read More...

    Top 5 Best Articles on R for Business [October 2020]

    Written by Matt Dancho on November 6, 2020
    ...

    Each month, we release tons of great content on R for Business. These are the 5 Top Articles in R for Business over the past month. We have some great ones in October. Let's dive in.

    Read More...

    10 Must-Know Tidyverse Functions: #2 - across()

    Written by Matt Dancho on November 3, 2020
    ...

    The across() function was released in dplyr 1.0.0. It's a new tidyverse function that extends group_by and summarize for multiple column and function summaries.

    Read More...

    10 Must-Know Tidyverse Functions: #1 - relocate()

    Written by Matt Dancho on October 27, 2020
    ...

    relocate() is like arrange() for columns. It keeps all of the columns, but provides much more flexibility for reordering. Notice how all of the columns are returned.

    Read More...

    How to Visualize Time Series Data: Tidy Forecasting in R

    Written by Joon Im on October 22, 2020
    ...

    Plot time series data using the fpp2, fpp3, and timetk forecasting frameworks.

    Read More...

    How to Automate PDF Reporting with R

    Written by Matt Dancho on October 21, 2020
    ...

    Why create PDFs manually when you can automate PDFs with R? That's exactly what I show you how to do in this video showcasing parameterized Rmarkdown.

    Read More...

    How to Make Publication-Quality Excel Pivot Tables with R

    Written by Matt Dancho on October 14, 2020
    ...

    The biggest thing I missed when I transitioned from Excel to R was PIVOT TABLES! Seriously, Pivot Tables are so useful. You can summarize and reshape (aka Pivot) data so easily with them in Excel.

    Read More...

    How to Automate Exploratory Analysis Plots

    Written by Luciano Oliveira Batista on October 8, 2020
    ...

    A better way to do your EDA, with less unnecessary coding and more flexibility, using GGPLOT2 + PURRR. When you are plotting different charts during your exploratory data analysis, you sometimes end up doing a lot of repeated coding...

    Read More...

    How to Automate Excel with R

    Written by Matt Dancho on October 7, 2020
    ...

    Your company lives off them... Excel files. Why not automate them & save some time? Here's an Excel File you're going to make in this tutorial from R.

    Read More...

    Top 5 Best Articles on R for Business [September 2020]

    Written by Matt Dancho on October 2, 2020
    ...

    The top 5 best articles on R for Business from last month.

    Read More...

    Finance in R - Evaluating American Funds Portfolio

    Written by David Lucey on September 30, 2020
    ...

    Active funds have done poorly over the last ten years, and in most cases, struggled to justify their fees. In the post, there is a supporting chart showing a group of American Funds funds compared to the Vanguard Total Market index.

    Read More...

    Using Drake for ETL - Building A Shiny Real Estate App

    Written by David Lucey on September 24, 2020
    ...

    The drake plan organizes the project work flow according to targets, which are generated by scripts of functions and often functions of functions. The natural flow for our ETL was to check if the raw data was available on the local disc...

    Read More...

    How to Automate PowerPoint Slidedecks with R

    Written by Matt Dancho on September 22, 2020
    ...

    Here's a common situation: you have to make a Monday Morning Slide Deck. It's the same deck each week; only the date ranges for your data change. Here's how to automate this process with R!

    Read More...

    How to Scrape Word Documents with R

    Written by Matt Dancho on September 16, 2020
    ...

    Your company has tons of them - Microsoft Word Documents! Scraping word documents is a powerful technique for extracting data. Let's learn how with R, officer, & tidyverse.

    Read More...

    How To Get My Company To Pay For My Data Science Courses

    Written by Matt Dancho on September 7, 2020
    ...

    Read More...

    Win Data Science Competitions with Shiny

    Written by Matt Dancho on August 12, 2020
    ...

    The secret to accelerating your career - SHOW THAT YOU CAN PROVIDE BUSINESS VALUE! Check out the story of Raj, who won a Shiny data science competition using Shiny.

    Read More...

    From No-Shiny Experience to Deploying My First Shiny App in 3-Months

    Written by Matt Dancho on August 5, 2020
    ...

    Data science doesn't have to take years to learn. Here's an inspiring use-case from one of our students & how data science education helped add value to his company by creating a decision-making application.

    Read More...

    How One Student Landed a VP-Level Analytics Role at a Major Bank

    Written by Matt Dancho on May 26, 2020
    ...

    Data Science is the perfect field for those who are naturally curious and aspire to learn continuously throughout their career.

    Read More...

    How to Set Up TensorFlow 2 in R in 5 Minutes (BONUS Image Recognition Tutorial)

    Written by Matt Dancho on May 15, 2020
    ...

    Python can be run from R to leverage the strengths of both R and Python Data Science languages. Learn how to set up Python's TensorFlow Library in 5 minutes.

    Read More...

    How to Set Up Python's Scikit-Learn in R in 5 minutes

    Written by Matt Dancho on April 20, 2020
    ...

    Python can be run from R to leverage the strengths of both R and Python Data Science languages. Learn how to set up Python's Scikit-Learn Library in 5 minutes.

    Read More...

    Increase Your Salary With Data Science Skills

    Written by Matt Dancho on April 13, 2020
    ...

    The majority of us have experienced the average pay increase, because this is what most people receive. How would it feel to save your organization money or increase revenue for your organization and receive more compensation because of your work?

    Read More...

    3 Free Resources to Learn R - Now Open

    Written by Matt Dancho on March 20, 2020
    ...

    Business Science is offering free educational resources as a response to the coronavirus outbreak and social distancing measures.

    Read More...

    Time Series Machine Learning (and Feature Engineering) in R

    Written by Matt Dancho on March 18, 2020
    ...

    Machine learning is a powerful way to forecast Time Series. Feature Engineering is critical. A new innovation is coming in timetk - to help generate 200+ time-series features.

    Read More...

    Part 6 - R Shiny vs Tableau (3 Business Application Examples)

    Written by Matt Dancho on March 9, 2020
    ...

    Shiny is much more than just a dashboarding tool. Here we illustrate 3 powerful use cases for R Shiny Apps in business.

    Read More...

    tidyquant v1.0.0: Pivot Tables, VLOOKUPs in R

    Written by Matt Dancho on March 4, 2020
    ...

    The NEW tidyquant package (v1.0.0) makes popular Excel functions like Pivot Tables, VLOOKUP(), SUMIFS(), and much more possible in R.

    Read More...

    R for Excel Users: Pivot Tables, VLOOKUPs in R

    Written by Matt Dancho on February 26, 2020
    ...

    Learn how to use popular Excel functions in R like Pivot Tables, VLOOKUP(), SUMIFS(), and much more.

    Read More...

    Tidy Discounted Cash Flow Analysis in R (for Company Valuation)

    Written by Rafael Nicolas Fermin Cota on February 21, 2020
    ...

    Learn how to use the Tidy Data Principles to perform a discounted cash flow analysis for Saudi Aramco, an oil giant with a listed value of 1.7 Trillion USD.

    Read More...

    Shiny Real Estate with Zillow API (Free Course)

    Written by Matt Dancho on February 10, 2020
    ...

    Zillow is a FREE TOOL with an API that allows Data Scientists to connect to a massive repository of Home Prices and Features. In this lab, we use Shiny, Crosstalk, and Zillow API to create an ML-powered Zillow Home Price Explanation tool.

    Read More...

    Google Trends Email Automation with Shiny

    Written by Matt Dancho on January 24, 2020
    ...

    Google Trends is a FREE tool to gain insights about Google Search Terms your organization cares about. What if you could streamline the process of analyzing keyword search trends and emailing a report in 3-4 seconds? Learn how to do just that.

    Read More...

    Product Price Prediction: A Tidy Hyperparameter Tuning and Cross Validation Tutorial

    Written by Matt Dancho on January 21, 2020
    ...

    Learn how to model product prices using the tune library for hyperparameter tuning and cross-validation.

    Read More...

    Part 5 - Five Reasons to Learn H2O for High-Performance Machine Learning

    Written by Matt Dancho on January 13, 2020
    ...

    H2O is the scalable, open-source ML library that features AutoML. Here's why it's an essential library for me (and you).

    Read More...

    NEW BOOK - The Shiny Production with AWS Book

    Written by Matt Dancho on January 2, 2020
    ...

    The enterprise-grade process for deploying, hosting, and maintaining Shiny web applications using AWS, Docker, and Git.

    Read More...

    Part 4 - Git for Data Science Applications (A Top Skill for 2020)

    Written by Matt Dancho on December 9, 2019
    ...

    Moving into 2020, three things are clear - Organizations want Data Science, Cloud, and Apps. A key skill that companies need is Git for application development (I call this Full Stack Data Science). Here's what is driving Git's growth, and why you should learn Git for data science application development.

    Read More...

    Part 1 - Five Full Stack Data Science Technologies for 2020 (and Beyond)

    Written by Matt Dancho on December 9, 2019
    ...

    Moving into 2020, three things are clear - Organizations want Data Science, Cloud, and Apps. Here are the essential skills for Data Scientists that need to build and deploy applications in 2020 and beyond.

    Read More...

    How I Landed My Data Science Job

    Written by Matt Dancho on November 27, 2019
    ...

    Getting a job in Data Science is difficult. Here's how one Business Science student aced his Data Science Interview and landed a job at a top Management Consulting Firm.

    Read More...

    Part 3 - Docker for Data Scientists (A Top Skill for 2020)

    Written by Matt Dancho on November 22, 2019
    ...

    Moving into 2020, three things are clear - Organizations want Data Science, Cloud, and Apps. Here's how Docker plays a part in the essential skills of 2020.

    Read More...

    Customer Churn Modeling using Machine Learning with parsnip

    Written by Diego Usai on November 18, 2019
    ...

    Learn how to apply a tidy approach to a classification problem with the new parsnip R package for machine learning.

    Read More...

    Is Your Job's Data Science Tech Stack Intimidating?

    Written by Matt Dancho on November 15, 2019
    ...

    A data science team has many tools that all need to be integrated. And, this can be INTIMIDATING. Here are some tips to deal with the complexity of a data science tech stack.

    Read More...

    Part 2 - Data Science with AWS (A Top Skill for 2020)

    Written by Matt Dancho on November 13, 2019
    ...

    Organizations depend on the Data Science team to build distributed applications that solve business needs. AWS provides an infrastructure to host data science products for stakeholders to access.

    Read More...

    Apply Data Science to Improve Addiction Treatment

    Written by Matt Dancho on November 11, 2019
    ...

    Learn how one Business Science student created a data product that aims to help his organization improve the quality of care while reducing cost.

    Read More...

    Web Scraping Product Data in R with rvest and purrr

    Written by Joon Im on October 7, 2019
    ...

    Learn how to web scrape HTML, wrangle JSON, and visualize product data from the Bicycle Manufacturer, Specialized Bicycles.

    Read More...

    Cleaning Anomalies to Reduce Forecast Error by 9% with anomalize

    Written by Matt Dancho on September 30, 2019
    ...

    We can often improve forecast performance by cleaning anomalous data prior to forecasting. This is the perfect use case for integrating the clean_anomalies() function from anomalize into your forecast workflow.

    Read More...

    PDF Scraping in R with tabulizer

    Written by Jennifer Cooper on September 23, 2019
    ...

    Learn how to scrape and wrangle PDF tables of a Report on Endangered Species with the tabulizer R package and visualize trends with ggplot2.

    Read More...

    Big Data: Wrangling 4.6M Rows with dtplyr (the NEW data.table backend for dplyr)

    Written by Matt Dancho on August 15, 2019
    ...

    Wrangling Big Data is one of the best features of the R programming language - which boasts a Big Data Ecosystem that contains fast in-memory tools (e.g. data.table) and distributed computational tools (sparklyr). With the NEW dtplyr package, data scientists with dplyr experience gain the benefits of data.table backend. We saw a 3X speed boost for dplyr!

    Read More...

    Introducing correlationfunnel v0.1.0 - Speed Up Exploratory Data Analysis by 100X

    Written by Matt Dancho on August 7, 2019
    ...

    I'm pleased to announce the introduction of correlationfunnel version 0.1.0, which officially hit CRAN yesterday. The correlationfunnel package is something I've been using for a while to efficiently explore data, understand relationships, and get to business insights as fast as possible.

    Read More...

    How I Started My Data Science Business

    Written by Matt Dancho on July 22, 2019
    ...

    This is a true story based on how I created my data science company from scratch. It's a detailed documentation of my personal journey along with the company I founded, Business Science.

    Read More...

    Excel to R, Part 2 - Speed Up Exploratory Data Analysis 100X (R Code!)

    Written by Matt Dancho on July 8, 2019
    ...

    Learn how going from Excel to R can speed up Exploratory Data Analysis, getting business insights 100X FASTER.

    Read More...

    Build A R Shiny App (Tutorial) - Wedding Risk Model

    Written by Bryan Clark on June 9, 2019
    ...

    Learn step-by-step how to build a wedding risk model Shiny app.

    Read More...

    Introducing the Ultimate R Cheat Sheet Version 2.0: The Shinyverse

    Written by Matt Dancho on June 3, 2019
    ...

    The Ultimate R Cheat Sheet now covers the Shinyverse - An Ecosystem of R Packages for Shiny Web Application Development, Deployment, and putting Machine Learning into Production. Download the Cheat Sheet for Free!

    Read More...

    How To Become A Financial Data Scientist (Or A Data Scientist In Any Domain)

    Written by Matt Dancho on May 23, 2019
    ...

    Becoming a data scientist in Finance can be a lofty challenge... unless you know how to streamline the path.

    Read More...

    A/B Testing with Machine Learning - A Step-by-Step Tutorial

    Written by Matt Dancho on March 11, 2019
    ...

    Experience how to implement Machine Learning for A/B Testing step-by-step

    Read More...

    New Cheat Sheet - Customer Segmentation and Clustering Workflow

    Written by Matt Dancho on February 24, 2019
    ...

    Student feedback led to a BRAND NEW CHEAT SHEET on customer segmentation and a major overhaul to our Week 6 Modeling Chapter in our Business Analysis with R Course.

    Read More...

    Excel to R, Part 1 - The 10X Productivity Boost

    Written by Matt Dancho on February 20, 2019
    ...

    The first article in a 3-part series on Excel to R, this article walks the reader through a Marketing Case Study exposing the 10X productivity boost from switching from Excel to R.

    Read More...

    Data Science In R - The Ultimate R Cheat Sheet - The Ultimateness Just Doubled!

    Written by Matt Dancho on January 7, 2019
    ...

    The ultimate R cheat sheet links to the documentation and cheat sheets for every major R package. It just got even better with a brand new second page containing special topics and R packages!

    Read More...

    Time Series Analysis for Business Forecasting with Artificial Neural Networks

    Written by Blaine Bateman on December 4, 2018
    ...

    This article demonstrates a real-world case study for business forecasting with regression models including artificial neural networks (ANNs) with Keras

    Read More...

    R Cheat Sheet: Data Science Workflow with R

    Written by Matt Dancho on November 4, 2018
    ...

    Get the new R Cheat Sheet that makes learning data science with R quick and efficient.

    Read More...

    R and Python: How to Integrate the Best of Both into Your Data Science Workflow

    Written by Matt Dancho on October 8, 2018
    ...

    R and Python - learn how to integrate both R and Python into your data science workflow. Use the strengths of the two dominant data science languages.

    Read More...

    How To Complete A Kaggle Competition In 30 Minutes - Home Credit Default Challenge

    Written by Matt Dancho on August 7, 2018
    ...

    Real world data science - Learn how to compete in a Kaggle Competition using Machine Learning with R.

    Read More...

    DALEX: Interpretable Machine Learning Algorithms with Dalex and H2O

    Written by Brad Boehmke on July 23, 2018
    ...

    Interpret machine learning algorithms with R to explain why one prediction is made over another.

    Read More...

    New Course Content: DS4B 201 Chapter 7, The Expected Value Framework For Modeling Churn With H2O

    Written by Matt Dancho on July 16, 2018
    ...

    I’m pleased to announce that we released brand new content for our flagship course, Data Science For Business (DS4B 201). Over the course of 10 weeks, the DS4B 201 course teaches students an end-to-end data science project solving Employee Churn with R, H2O, & LIME. The latest content is focused on transitioning from modeling Employee Churn with H2O and LIME to evaluating our binary classification model using Return-On-Investment (ROI), thus delivering business value. We do this through application of a special tool called the Expected Value Framework. Let’s learn about the new course content available now in DS4B 201, Chapter 7, which covers the Expected Value Framework for modeling churn with H2O!

    Read More...

    Data Science For Business: Course Now Open!

    Written by Matt Dancho on April 30, 2018
    ...

    We are pleased to announce that our Data Science For Business (#DS4B) Course (HR 201) is OFFICIALLY OPEN! This course is for intermediate to advanced data scientists looking to apply H2O and LIME to a real-world binary classification problem in an organization: Employee Attrition. If you are interested in applying data science for business in a real-world setting with advanced tools using a client-proven system that delivers ROI to the organization, then this is the course for you. For a limited time we are offering 15% off enrollment.

    Read More...

    Data Science For Business: Course Launch In 5 Days!!!

    Written by Matt Dancho on April 25, 2018
    ...

    Last November, our data science team embarked on a journey to build the ultimate Data Science For Business (DS4B) learning platform. We saw a problem: A gap exists in organizations between the data science team and the business. To bridge this gap, we’ve created Business Science University, an online learning platform that teaches DS4B, using high-end machine learning algorithms, and organized in the fashion of an on-premise workshop but at a fraction of the price. I’m pleased to announce that, in 5 days, we will launch our first course, HR 201, as part of a 4-course Virtual Workshop. We crafted the Virtual Workshop after the data science program that we wished we had when we began data science (after we got through the basics of course!). Now, our data science process is being opened up to you. We guide you through our process for solving high impact business problems with data science!

    Read More...

    How To Learn R, Part 1: Learn From A Master Data Scientist's Code

    Written by Matt Dancho on March 3, 2018
    ...

    The R programming language is a powerful tool used in data science for business (DS4B), but R can be unnecessarily challenging to learn. We believe you can learn R quickly by taking an 80/20 approach to learning the most in-demand functions and packages. In this article, we seek to ultimately understand what techniques are most critical to a beginner’s success through analyzing a master data scientist’s code base. Half of this article covers the web scraping procedure (using rvest and purrr) we used to collect our data (if new to R, you can skip this). The second half covers the insights gained from analyzing a master’s code base. In the next article in our series, we’ll develop a strategic learning plan built on our knowledge of the master. Last, there’s a bonus at the end of the article that shows how you can analyze your own code base using the new fs package. Enjoy.

    Read More...

    The Tidy Time Series Platform: tibbletime 0.1.0

    Written by Davis Vaughan on January 4, 2018
    ...

    We’re happy to announce the third release of the tibbletime package. This is a huge update, mainly due to a complete rewrite of the package. It contains a ton of new functionality and a number of breaking changes that existing users need to be aware of. All of the changes have been well documented in the NEWS file, but it’s worthwhile to touch on a few of them here and discuss the future of the package. We’re super excited so let’s check out the vision for tibbletime and its new functionality!

    Read More...

    6 Reasons To Learn R For Business

    Written by Matt Dancho on December 27, 2017
    ...

    Learn R for business - Data science for business is the future of business analytics. Here are 6 reasons why R is the right choice.

    Read More...

    Demo Week: Time Series Machine Learning with h2o and timetk

    Written by Matt Dancho on October 28, 2017
    ...

    Learn R in this time series using H2O machine learning demonstration.

    Read More...

    Demo Week: Tidy Time Series Analysis with tibbletime

    Written by Matt Dancho on October 26, 2017
    ...

    We’re into the fourth day of Business Science Demo Week. We have a really cool one in store today: tibbletime, which uses a new tbl_time class that is time-aware!! For those that may have missed it, every day this week we are demo-ing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Let’s take tibbletime for a spin!

    Read More...

    LIVE DataTalk on HR Analytics Tonight: Using Machine Learning to Predict Employee Turnover

    Written by Matt Dancho on October 26, 2017
    ...

    Tonight at 7PM EST, we will be giving a LIVE #DataTalk on Using Machine Learning to Predict Employee Turnover. Employee turnover (attrition) is a major cost to an organization, and predicting turnover is at the forefront of needs of Human Resources (HR) in many organizations. Until now the mainstream approach has been to use logistic regression or survival curves to model employee attrition. However, with advancements in machine learning (ML), we can now get both better predictive performance and better explanations of what critical features are linked to employee attrition. We used two cutting edge techniques: the h2o package’s new FREE automatic machine learning algorithm, h2o.automl(), to develop a predictive model that is in the same ballpark as commercial products in terms of ML accuracy. Then we used the new lime package that enables breakdown of complex, black-box machine learning models into variable importance plots. The talk will cover HR Analytics and how we used R, H2O, and LIME to predict employee turnover.

    Read More...

    Demo Week: Tidy Forecasting with sweep

    Written by Matt Dancho on October 25, 2017
    ...

    We’re into the third day of Business Science Demo Week. Hopefully by now you’re getting a taste of some interesting and useful packages. For those that may have missed it, every day this week we are demo-ing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Today is sweep, which has broom-style tidiers for forecasting. Let’s get going!

    Read More...

    Demo Week: Time Series Machine Learning with timetk

    Written by Matt Dancho on October 24, 2017
    ...

    We’re into the second day of Business Science Demo Week. What’s demo week? Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Second up is timetk, your toolkit for time series in R. Here we go!

    Read More...

    Demo Week: class(Monday) <- tidyquant

    Written by Matt Dancho on October 23, 2017
    ...

    We’ve got an exciting week ahead of us at Business Science: we’re launching our first ever Business Science Demo Week. Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. First up is tidyquant, our flagship package that’s useful for financial and time series analysis. Here we go!

    Read More...

    It's tibbletime v0.0.2: Time-Aware Tibbles, New Functions, Weather Analysis and More

    Written by Davis Vaughan on October 8, 2017
    ...

    Today we are introducing tibbletime v0.0.2, and we’ve got a ton of new features in store for you. We have functions for converting to flexible time periods with the ~period formula~ and making/calculating custom rolling functions with rollify() (plus a bunch more new functionality!). We’ll take the new functionality for a spin with some weather data (from the weatherData package). However, the new tools make tibbletime useful in a number of broad applications such as forecasting, financial analysis, business analysis and more! We truly view tibbletime as the next phase of time series analysis in the tidyverse. If you like what we do, please connect with us on social media to stay up on the latest Business Science news, events and information!

    Read More...

    It's tibbletime: Time-Aware Tibbles

    Written by Davis Vaughan on September 7, 2017
    ...

    We are very excited to announce the initial release of our newest R package, tibbletime. As evident from the name, tibbletime is built on top of the tibble package (and more generally on top of the tidyverse) with the main purpose of being able to create time-aware tibbles through a one-time specification of an “index” column (a column containing timestamp information). There are a ton of useful time functions that we can now use such as time_filter(), time_summarize(), tmap(), as_period() and time_collapse(). We’ll walk through the basics in this post.

    Read More...

    alphavantager: An R interface to the Free Alpha Vantage Financial Data API

    Written by Matt Dancho on September 3, 2017
    ...

    We’re excited to announce the alphavantager package, a lightweight R interface to the Alpha Vantage API! Alpha Vantage is a FREE API for retrieving real-time and historical financial data. It’s very easy to use, and, with the recent glitch with the Yahoo Finance API, Alpha Vantage is a solid alternative for retrieving financial data for FREE! It’s definitely worth checking out if you are interested in financial analysis. We’ll go through the alphavantager R interface in this post to show you how easy it is to get real-time and historical financial data. In the near future, we have plans to incorporate the alphavantager into tidyquant to enable scaling from one equity to many.

    Read More...

    BizSci Package Updates: Formerly timekit... Now timetk :)

    Written by Matt Dancho on July 27, 2017
    ...

    We have several announcements regarding Business Science R packages. First, as of this week the R package formerly known as timekit has changed to timetk for time series tool kit. There are a few “breaking” changes because of the name change, and this is discussed further below. Second, the sweep and tidyquant packages have several improvements, which are discussed in detail below. Finally, don’t miss a beat on future news, events and information by following us on social media.

    Read More...

    sweep: Extending broom for time series forecasting

    Written by Matt Dancho on July 9, 2017
    ...

    We’re pleased to introduce a new package, sweep, now on CRAN! Think of it like broom for the forecast package. The forecast package is the most popular package for forecasting, and for good reason: it has a number of sophisticated forecast modeling functions. There’s one problem: forecast is based on the ts system, which makes it difficult to work within the tidyverse. This is where sweep fits in! The sweep package has tidiers that convert the output from forecast modeling and forecasting functions to “tidy” data frames. We’ll go through a quick introduction to show how the tidiers can be used, and then show a fun example of forecasting GDP trends of US states. If you’re familiar with broom it will feel like second nature. If you like what you read, don’t forget to follow us on social media to stay up on the latest Business Science news, events and information!

    Read More...

    timekit: New Documentation, Function Improvements, Forecasting Vignette

    Written by Matt Dancho on May 17, 2017
    ...

    We’ve just released timekit v0.3.0 to CRAN. The package updates include changes that help with making an accurate future time series with tk_make_future_timeseries() and we’ve added a few features to tk_get_timeseries_signature(). Most important are the new vignettes that cover both the making of future time series task and forecasting using the timekit package. If you saw our last timekit post, you were probably surprised to learn that you can use machine learning to forecast using the time series signature as an engineered feature space. Now we are expanding on that concept by providing two new vignettes that teach you how to use ML and data mining for time series predictions. We’re really excited about the prospects of ML applications with time series. If you are too, I strongly encourage you to explore the timekit package important links below. Don’t forget to check out our announcements and to follow us on social media to stay up on the latest Business Science news, events and information! Here’s a summary of the updates.

    Read More...

    tidyquant: New Tools for Performing Financial Analysis within the Tidy Ecosystem

    Written by Matt Dancho on May 11, 2017
    ...

    In advance of upcoming Business Science talks on tidyquant at R/Finance and EARL San Francisco, we are releasing a technical paper entitled “New Tools For Performing Financial Analysis within the ‘Tidy’ Ecosystem”. The technical paper covers an overview of the current R financial package landscape, the independent development of the “tidyverse” data science tools, and the tidyquant package that bridges the gap between the two underlying systems. Several usage cases are discussed. We encourage anyone interested in financial analysis and financial data science to check out the technical paper. We will be giving talks related to the paper at R/Finance on May 19th in Chicago and EARL on June 7th in San Francisco. If you can’t make it, I encourage you to read the technical paper and to follow us on social media to stay up on the latest Business Science news, events and information.

    Read More...

    timekit: Time Series Forecast Applications Using Data Mining

    Written by Matt Dancho on May 2, 2017
    ...

    The timekit package contains a collection of tools for working with time series in R. There’s a number of benefits. One of the biggest is the ability to use a time series signature to predict future values (forecast) through data mining techniques. While this post is geared toward exposing the user to the timekit package, there are examples showing the power of data mining a time series as well as how to work with time series in general. A number of timekit functions will be discussed and implemented in the post. The first group of functions works with the time series index, and these include functions tk_index(), tk_get_timeseries_signature(), tk_augment_timeseries_signature() and tk_get_timeseries_summary(). We’ll spend the bulk of this post introducing you to these. The next function deals with creating a future time series from an existing index, tk_make_future_timeseries(). The last set of functions deal with coercion to and from the major time series classes in R, tk_tbl(), tk_xts(), tk_zoo() (and tk_zooreg()), and tk_ts().

    Read More...

    tidyquant 0.5.0: select, rollapply, and Quandl

    Written by Davis Vaughan on April 4, 2017
    ...

    We’ve got some good stuff cooking over at Business Science. Yesterday, we had the fifth official release (0.5.0) of tidyquant to CRAN. The release includes some great new features. First, the Quandl integration is complete, which now enables getting Quandl data in “tidy” format. Second, we have a new mechanism to handle selecting which columns get sent to the mutation functions. The new argument name is… select, and it provides increased flexibility which we show off in a rollapply example. Finally, we have added several PerformanceAnalytics functions that deal with modifying returns to the mutation functions. In this post, we’ll go over a few of the new features in version 0.5.0.

    Read More...

    tidyquant Integrates Quandl: Getting Data Just Got Easier

    Written by Matt Dancho on March 19, 2017
    ...

    Today I’m very pleased to introduce the new Quandl API integration that is available in the development version of tidyquant. Normally I’d introduce this feature during the next CRAN release (v0.5.0 coming soon), but it’s really useful and honestly I just couldn’t wait. If you’re unfamiliar with Quandl, it’s amazing: it’s a web service that has partnered with top-tier data publishers to enable users to retrieve a wide range of financial and economic data sets, many of which are FREE! Quandl has its own R package (aptly named Quandl) that is overall very good but has one minor inconvenience: it doesn’t return multiple data sets in a “tidy” format. This slight inconvenience has been addressed in the integration that comes packaged in the latest development version of tidyquant. Now users can use the Quandl API from within tidyquant with three functions: quandl_api_key(), quandl_search(), and the core function tq_get(get = "quandl"). In this post, we’ll go through a user-contributed example, How To Perform a Fama French 3 Factor Analysis, that showcases how the Quandl integration fits into the “Collect, Modify, Analyze” financial analysis workflow. Interested readers can download the development version using devtools::install_github("business-science/tidyquant"). More information is available on the tidyquant GitHub page including the updated development vignettes.

    Read More...

    tidyquant 0.4.0: PerformanceAnalytics, Improved Documentation, ggplot2 Themes and More

    Written by Matt Dancho on March 4, 2017
    ...

    I’m excited to announce the release of tidyquant version 0.4.0!!! The release is yet again sizable. It includes integration with the PerformanceAnalytics package, which now enables full financial analyses to be performed without ever leaving the “tidyverse” (i.e. with DATA FRAMES). The integration includes the ability to perform performance analysis and portfolio attribution at scale (i.e. with many stocks or many portfolios at once)! But wait, there’s more… In addition to an introduction vignette, we created five (yes, five!) topic-specific vignettes designed to reduce the learning curve for financial data scientists. We also have new ggplot2 themes to assist with creating beautiful and meaningful financial charts. We included tq_get support for “compound getters” so multiple data sources can be brought into a nested data frame all at once. Last, we have added new tq_index() and tq_exchange() functions to make collecting stock data with tq_get even easier. I’ll briefly touch on several of the updates. The package is open source, and you can view the code on the tidyquant GitHub page.

    Read More...

    tidyquant 0.3.0: ggplot2 Enhancements, Real-Time Data, and More

    Written on January 22, 2017
    ...

    tidyquant, version 0.3.0, is a pretty sizable release that includes a little bit for everyone, including new financial charting and moving average geoms for use with ggplot2, a new tq_get get option called "key.stats" for retrieving real-time stock information, and several nice integrations that improve the ease of scaling your analyses. If you’re not already familiar with tidyquant, it integrates the best quantitative resources for collecting and analyzing quantitative data (xts, zoo, quantmod and TTR) with the tidyverse, allowing for seamless interaction between each. I’ll briefly touch on some of the updates by going through some neat examples. The package is open source, and you can view the code on the tidyquant GitHub page.

    Read More...

    Speed Up Your Code Part 2: Parallel Processing Financial Data with multidplyr + tidyquant

    Written on January 21, 2017
    ...

    Since my initial post on parallel processing with multidplyr, there have been some recent changes in the tidy ecosystem: namely the package tidyquant, which brings financial analysis to the tidyverse. The tidyquant package drastically increases the amount of tidy financial data we have access to and reduces the amount of code needed to get financial data into the tidy format. The multidplyr package adds parallel processing capability to improve the speed at which analysis can be scaled. I seriously think these two packages were made for each other. I’ll go through the same example used previously, updated with the new tidyquant functionality.

    Read More...

    tidyquant 0.2.0: Added Functionality for Financial Engineers and Business Analysts

    Written on January 8, 2017
    ...

    tidyquant, version 0.2.0, is now available on CRAN. If you’re not already familiar, tidyquant integrates the best quantitative resources for collecting and analyzing quantitative data (xts, zoo, quantmod and TTR) with the tidy data infrastructure of the tidyverse, allowing for seamless interaction between each. I’ll briefly touch on some of the updates. The package is open source, and you can view the code on the tidyquant GitHub page.

    Read More...

    tidyquant: Bringing Quantitative Financial Analysis to the tidyverse

    Written on January 1, 2017
    ...

    My new package, tidyquant, is now available on CRAN. tidyquant integrates the best quantitative resources for collecting and analyzing quantitative data (xts, quantmod and TTR) with the tidy data infrastructure of the tidyverse, allowing for seamless interaction between each. While this post aims to introduce tidyquant to the R community, it just scratches the surface of the features and benefits. We’ll go through a simple stock visualization using ggplot2, which shows off the integration. The package is open source, and you can view the code on the tidyquant GitHub page.

    Read More...

    Speed Up Your Code: Parallel Processing with multidplyr

    Written on December 18, 2016
    ...

    Use parallel processing to speed up your R code with multidplyr, a tidyverse-style package.

    Read More...

Copyright © www.business-science.io 2024