Customer Analytics: Using Deep Learning With Keras To Predict Customer Churn

    Written by Matt Dancho on November 28, 2017

    Customer churn is a problem that all companies need to monitor, especially those that depend on subscription-based revenue streams. The simple fact is that most organizations have data that can be used to target customers at risk of churning and to understand the key drivers of churn. We now have Keras for Deep Learning available in R (Yes, in R!!), and in this article our model predicted customer churn with 82% accuracy. We’re super excited for this article because we are using the new keras package to produce an Artificial Neural Network (ANN) model on the IBM Watson Telco Customer Churn Data Set! As with most business problems, it’s equally important to explain what features drive the model, which is why we’ll use the lime package for explainability. We cross-checked the LIME results with a Correlation Analysis using the corrr package. We’re not done yet. In addition, we use three new packages to assist with Machine Learning (ML): recipes for preprocessing, rsample for sampling data and yardstick for model metrics. These are relatively new additions to CRAN developed by Max Kuhn at RStudio (creator of the caret package). It seems that R is quickly developing ML tools that rival Python. Good news if you’re interested in applying Deep Learning in R! We are, so let’s get going!!

    EARL Presentation on HR Analytics: Using ML to Predict Employee Turnover

    Written by Matt Dancho on November 6, 2017

    The EARL Boston 2017 conference was held November 1-3 in Boston, Mass. There were some excellent presentations illustrating how R is being embraced in enterprises, especially in the financial and pharmaceutical industries. Matt Dancho, founder of Business Science, presented on using machine learning to predict and explain employee turnover, a hot topic in HR! We’ve uploaded the HR Analytics presentation to YouTube. Check out the presentation, and don’t forget to follow us on social media to stay up to date on the latest Business Science news, events and information!

    Demo Week: Time Series Machine Learning with h2o and timetk

    Written by Matt Dancho on October 28, 2017

    We’re at the final day of Business Science Demo Week. Today we are demoing the h2o package for machine learning on time series data. What’s demo week? Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Today you’ll see how we can use timetk + h2o to get really accurate time series forecasts. Here we go!

    Demo Week: Tidy Time Series Analysis with tibbletime

    Written by Matt Dancho on October 26, 2017

    We’re into the fourth day of Business Science Demo Week. We have a really cool one in store today: tibbletime, which uses a new tbl_time class that is time-aware!! For those who may have missed it, every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Let’s take tibbletime for a spin!

    LIVE DataTalk on HR Analytics Tonight: Using Machine Learning to Predict Employee Turnover

    Written by Matt Dancho on October 26, 2017

    Tonight at 7 PM EST, we will be giving a LIVE #DataTalk on Using Machine Learning to Predict Employee Turnover. Employee turnover (attrition) is a major cost to an organization, and predicting turnover is at the forefront of the needs of Human Resources (HR) in many organizations. Until now, the mainstream approach has been to use logistic regression or survival curves to model employee attrition. However, with advancements in machine learning (ML), we can now get both better predictive performance and better explanations of what critical features are linked to employee attrition. We used two cutting-edge techniques. First, we used the h2o package’s new FREE automatic machine learning algorithm, h2o.automl(), to develop a predictive model that is in the same ballpark as commercial products in terms of ML accuracy. Then we used the new lime package, which enables the breakdown of complex, black-box machine learning models into variable importance plots. The talk will cover HR Analytics and how we used R, H2O, and LIME to predict employee turnover.

    Demo Week: Tidy Forecasting with sweep

    Written by Matt Dancho on October 25, 2017

    We’re into the third day of Business Science Demo Week. Hopefully by now you’re getting a taste of some interesting and useful packages. For those who may have missed it, every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Today is sweep, which has broom-style tidiers for forecasting. Let’s get going!

    Demo Week: Time Series Machine Learning with timetk

    Written by Matt Dancho on October 24, 2017

    We’re into the second day of Business Science Demo Week. What’s demo week? Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Second up is timetk, your toolkit for time series in R. Here we go!

    Demo Week: class(Monday) <- tidyquant

    Written by Matt Dancho on October 23, 2017

    We’ve got an exciting week ahead of us at Business Science: we’re launching our first ever Business Science Demo Week. Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. First up is tidyquant, our flagship package that’s useful for financial and time series analysis. Here we go!
