Customer Segmentation Part 3: Network Visualization

    Written on October 1, 2016

    This post is the third and final part in the customer segmentation analysis. The first post focused on K-Means Clustering to segment customers into distinct groups based on purchasing habits. The second post took a different approach, using Principal Component Analysis (PCA) to visualize customer groups. This post performs Network Visualization (Graph Drawing) using the igraph and networkD3 libraries to visualize customer connections and relationship strengths.

    Read More...

    Customer Segmentation Part 2: PCA for Segment Visualization

    Written on September 4, 2016

    This post is the second part in the customer segmentation analysis. The first post focused on k-means clustering in R to segment customers into distinct groups based on purchasing habits. This post takes a different approach, using Principal Component Analysis (PCA) in R as a tool to view customer groups. Because PCA attacks the problem from a different angle than k-means, we can get different insights. We’ll compare the k-means results with the PCA visualization. Let’s see what happens when we apply PCA.

    Read More...

    Customer Segmentation Part 1: K-Means Clustering

    Written on August 7, 2016

    In this post, we’ll be using k-means clustering in R to segment customers into distinct groups based on purchasing habits. k-means clustering is an unsupervised learning technique, which means we don’t need to have a target for clustering. All we need is to format the data in a way the algorithm can process, and we’ll let it determine the customer segments or clusters. This makes k-means clustering great for exploratory analysis as well as a jumping-off point for more detailed analysis. We’ll walk through a relevant example using the Cannondale bikes data set from the orderSimulatoR project GitHub repository.

    Read More...

    orderSimulatoR: Simulate Orders for Business Analytics

    Written on July 12, 2016

    In this post, we will be discussing orderSimulatoR, which enables fast and easy order simulation in R for customer and product learning. The basic premise is to simulate data that you’d retrieve from a SQL query of an ERP system. The data can then be merged with products and customers tables for data mining. I’ll go through the basic steps to create an order data set that combines customers and products, and I’ll wrap up with some visualizations to show how you can use order data to expose trends. You can get the scripts and the Cannondale bikes data set at the orderSimulatoR GitHub repository.

    Read More...

    Marketing Strategy: Why MBAs Can Benefit from Learning Analytics

    Written by Matt Dancho on May 1, 2016

    Just because you’re a business professional does not mean you can’t or shouldn’t develop your skills in analytics. Businesses view strategic decision making as a competitive advantage. You should too! Learning the basics behind data science not only adds value to your organization, it also increases your own value and, with it, the demand for your skills.

    Read More...

    A Data Scientist's Resources

    Written by Matt Dancho on April 9, 2016

    Getting up and running in data science is tough. It’s easy to get overwhelmed, and your biggest asset is time (don’t waste it). Here are some resources to help speed you along. I’ll continually update these as I get time. Feel free to comment or email me if I’m missing something.

    Read More...