Data Science in Finance

Finance in R - Evaluating American Funds Portfolio

Written by David Lucey on September 30, 2020

Active funds have done poorly over the last ten years and, in most cases, have struggled to justify their fees. The post includes a supporting chart comparing a group of American Funds funds with the Vanguard Total Market index.

tidyquant v1.0.0: Pivot Tables, VLOOKUPs in R

Written by Matt Dancho on March 4, 2020

The NEW tidyquant package (v1.0.0) makes popular Excel functions like Pivot Tables, VLOOKUP(), SUMIFS(), and much more possible in R.

R for Excel Users: Pivot Tables, VLOOKUPs in R

Written by Matt Dancho on February 26, 2020

Learn how to use popular Excel functions in R like Pivot Tables, VLOOKUP(), SUMIFS(), and much more.

Tidy Discounted Cash Flow Analysis in R (for Company Valuation)

Written by Rafael Nicolas Fermin Cota on February 21, 2020

Learn how to use the Tidy Data Principles to perform a discounted cash flow analysis for Saudi Aramco, an oil giant with a listed value of 1.7 trillion USD.
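As a rough illustration of what a tidy discounted cash flow calculation can look like, here is a minimal sketch. It is not the code from the post: the cash flows and the 10% discount rate are made-up numbers (not Saudi Aramco data), and the column names are our own choices. Each row holds one projected year, and the present value is computed column-wise.

```r
# Minimal sketch of a tidy discounted cash flow (DCF) calculation.
# All figures are hypothetical and chosen only for illustration.
library(dplyr)
library(tibble)

discount_rate <- 0.10  # assumed discount rate

cash_flows <- tibble(
  year      = 1:3,
  cash_flow = c(100, 110, 121)  # hypothetical projected free cash flows
)

dcf <- cash_flows %>%
  mutate(
    # Discount factor for year t is 1 / (1 + r)^t
    discount_factor = 1 / (1 + discount_rate)^year,
    present_value   = cash_flow * discount_factor
  )

# Sum of discounted cash flows (no terminal value in this sketch)
total_present_value <- sum(dcf$present_value)
```

A full valuation would typically also add a discounted terminal value, which this sketch leaves out.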

Big Data: Wrangling 4.6M Rows with dtplyr (the NEW data.table backend for dplyr)

Written by Matt Dancho on August 15, 2019

Wrangling Big Data is one of the best features of the R programming language, which boasts a Big Data Ecosystem that contains fast in-memory tools (e.g. data.table) and distributed computational tools (e.g. sparklyr). With the NEW dtplyr package, data scientists with dplyr experience gain the benefits of the data.table backend. We saw a 3X speed boost for dplyr!
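For readers who have not seen dtplyr before, the basic pattern can be sketched as follows. This is a minimal example on the built-in mtcars data set (our choice, not the 4.6M-row data from the post): `lazy_dt()` wraps a data frame, the usual dplyr verbs are recorded and translated to data.table, and `as_tibble()` triggers the actual computation.

```r
# Minimal dtplyr sketch on a small built-in data set (not the post's data).
library(dplyr)
library(dtplyr)

# Wrap the data frame in a lazy data.table; nothing is computed yet.
cars_dt <- lazy_dt(mtcars)

mpg_by_cyl <- cars_dt %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg)) %>%
  as_tibble()  # forces evaluation via data.table and returns a tibble
```

On a data set this small there is no speed benefit to expect; the sketch only shows the workflow.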

How To Become A Financial Data Scientist (Or A Data Scientist In Any Domain)

Written by Matt Dancho on May 23, 2019

Becoming a data scientist in Finance can be a lofty challenge... unless you know how to streamline the path.

Finance in R with tidyquant - New Learning Lab 1-Hour Course

Written by Matt Dancho on May 15, 2019

We have just released a new 1-HOUR COURSE - Finance in R with tidyquant - Available in Learning Labs PRO.

How To Complete A Kaggle Competition In 30 Minutes - Home Credit Default Challenge

Written by Matt Dancho on August 7, 2018

Real world data science - Learn how to compete in a Kaggle Competition using Machine Learning with R.

Algorithmic Trading: Using Quantopian's Zipline Python Library In R And Backtest Optimizations By Grid Search And Parallel Processing

Written by Davis Vaughan and Matt Dancho on May 31, 2018

We are ready to demo our new experimental package for Algorithmic Trading, flyingfox, which uses reticulate to bring Quantopian’s open source algorithmic trading Python library, Zipline, to R. The flyingfox library is part of our NEW Business Science Labs innovation lab, which is dedicated to bringing experimental packages to our followers early on so they can test them out and let us know what they think before they make their way to CRAN. This article includes a long-form code tutorial on how to perform backtest optimizations of trading algorithms via grid search and parallel processing. In this article, we’ll show you how to use the combination of tibbletime (time-based extension of tibble) + furrr (a parallel-processing complement to purrr) + flyingfox (Zipline in R) to develop a backtested trading algorithm that can be optimized via grid search and parallel processing. We are releasing this article as a complement to the R/Finance Conference presentation “A Time Series Platform For The Tidyverse”, which Matt will present on Saturday (June 2nd, 2018). Enjoy!
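The general idea of a grid search run in parallel with furrr can be sketched without flyingfox or Zipline. In the minimal example below, `toy_backtest()` is a hypothetical stand-in for a real backtest (it just scores a parameter pair with a simple formula), and the parameter names `fast` and `slow` are our own assumptions rather than anything taken from the article.

```r
# Minimal sketch: grid search over two parameters, evaluated in parallel.
library(furrr)  # parallel versions of purrr's map functions
library(tidyr)  # expand_grid() builds the parameter grid

# Hypothetical stand-in for a backtest: returns a score for one parameter pair.
# A real backtest would run a trading algorithm and return a performance metric.
toy_backtest <- function(fast, slow) {
  -(fast - 10)^2 - (slow - 50)^2
}

# Every combination of the candidate parameter values (3 x 3 = 9 rows)
grid <- expand_grid(fast = c(5, 10, 20), slow = c(30, 50, 100))

# Run the evaluations on two background R sessions
future::plan(future::multisession, workers = 2)

grid$score <- future_map2_dbl(grid$fast, grid$slow, toy_backtest)

# Pick the parameter pair with the highest score
best <- grid[which.max(grid$score), ]

# Return to sequential processing when done
future::plan(future::sequential)
```

Because each grid point is evaluated independently, the work splits cleanly across workers, which is what makes grid search a natural fit for parallel processing.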

tidyquant: New Tools for Performing Financial Analysis within the Tidy Ecosystem

Written by Matt Dancho on May 11, 2017

In advance of upcoming Business Science talks on tidyquant at R/Finance and EARL San Francisco, we are releasing a technical paper entitled “New Tools For Performing Financial Analysis within the ‘Tidy’ Ecosystem”. The technical paper covers an overview of the current R financial package landscape, the independent development of the “tidyverse” data science tools, and the tidyquant package that bridges the gap between the two underlying systems. Several usage cases are discussed. We encourage anyone interested in financial analysis and financial data science to check out the technical paper. We will be giving talks related to the paper at R/Finance on May 19th in Chicago and EARL on June 7th in San Francisco. If you can’t make it, I encourage you to read the technical paper and to follow us on social media to stay up on the latest Business Science news, events and information.

tidyquant 0.5.0: select, rollapply, and Quandl

Written by Davis Vaughan on April 4, 2017

We’ve got some good stuff cooking over at Business Science. Yesterday, we had the fifth official release (0.5.0) of tidyquant to CRAN. The release includes some great new features. First, the Quandl integration is complete, which now enables getting Quandl data in “tidy” format. Second, we have a new mechanism to handle selecting which columns get sent to the mutation functions. The new argument name is… select, and it provides increased flexibility which we show off in a rollapply example. Finally, we have added several PerformanceAnalytics functions that deal with modifying returns to the mutation functions. In this post, we’ll go over a few of the new features in version 0.5.0.

tidyquant Integrates Quandl: Getting Data Just Got Easier

Written by Matt Dancho on March 19, 2017

Today I’m very pleased to introduce the new Quandl API integration that is available in the development version of tidyquant. Normally I’d introduce this feature during the next CRAN release (v0.5.0 coming soon), but it’s really useful and honestly I just couldn’t wait. If you’re unfamiliar with Quandl, it’s amazing: it’s a web service that has partnered with top-tier data publishers to enable users to retrieve a wide range of financial and economic data sets, many of which are FREE! Quandl has its own R package (aptly named Quandl) that is overall very good but has one minor inconvenience: it doesn’t return multiple data sets in a “tidy” format. This slight inconvenience has been addressed in the integration that comes packaged in the latest development version of tidyquant. Now users can use the Quandl API from within tidyquant with three functions: quandl_api_key(), quandl_search(), and the core function tq_get(get = "quandl"). In this post, we’ll go through a user-contributed example, How To Perform a Fama French 3 Factor Analysis, that showcases how the Quandl integration fits into the “Collect, Modify, Analyze” financial analysis workflow. Interested readers can download the development version using devtools::install_github("business-science/tidyquant"). More information is available on the tidyquant GitHub page including the updated development vignettes.

tidyquant 0.4.0: PerformanceAnalytics, Improved Documentation, ggplot2 Themes and More

Written by Matt Dancho on March 4, 2017

I’m excited to announce the release of tidyquant version 0.4.0!!! The release is yet again sizable. It includes integration with the PerformanceAnalytics package, which now enables full financial analyses to be performed without ever leaving the “tidyverse” (i.e. with DATA FRAMES). The integration includes the ability to perform performance analysis and portfolio attribution at scale (i.e. with many stocks or many portfolios at once)! But wait, there’s more… In addition to an introduction vignette, we created five (yes, five!) topic-specific vignettes designed to reduce the learning curve for financial data scientists. We also have new ggplot2 themes to assist with creating beautiful and meaningful financial charts. We included tq_get support for “compound getters” so multiple data sources can be brought into a nested data frame all at once. Last, we have added new tq_index() and tq_exchange() functions to make collecting stock data with tq_get even easier. I’ll briefly touch on several of the updates. The package is open source, and you can view the code on the tidyquant GitHub page.

Recreating RViews' “Reproducible Finance With R: Sector Correlations”

Written by Davis Vaughan on February 2, 2017

The folks at RStudio have a segment on their RViews blog, “Reproducible Finance with R”, one that we at Business Science are very fond of! In the spirit of reproducibility, we thought that it would be appropriate to recreate the RViews post, “Reproducible Finance with R: Sector Correlations”. This time, however, the tidyquant package will be used to streamline much of the code that is currently used. The main advantage of tidyquant is that it bridges the gap between the best quantitative resources for collecting and manipulating quantitative data (xts, zoo, quantmod and TTR) and the data modeling workflow and infrastructure of the tidyverse. When implemented, tidyquant cuts the code down by about half and simplifies the workflow.

tidyquant 0.2.0: Added Functionality for Financial Engineers and Business Analysts

Written on January 8, 2017

tidyquant, version 0.2.0, is now available on CRAN. If you’re not already familiar, tidyquant integrates the best quantitative resources for collecting and analyzing quantitative data (xts, zoo, quantmod and TTR) with the tidy data infrastructure of the tidyverse, allowing for seamless interaction between each. I’ll briefly touch on some of the updates. The package is open source, and you can view the code on the tidyquant GitHub page.

tidyquant: Bringing Quantitative Financial Analysis to the tidyverse

Written on January 1, 2017

My new package, tidyquant, is now available on CRAN. tidyquant integrates the best quantitative resources for collecting and analyzing quantitative data (xts, quantmod and TTR) with the tidy data infrastructure of the tidyverse, allowing for seamless interaction between each. While this post aims to introduce tidyquant to the R community, it just scratches the surface of the features and benefits. We’ll go through a simple stock visualization using ggplot2, which shows off the integration. The package is open source, and you can view the code on the tidyquant GitHub page.

Russell 2000 Quantitative Stock Analysis in R: Six Stocks with Amazing, Consistent Growth

Written on November 30, 2016

The Russell 2000 Small-Cap Index, ticker symbol: ^RUT, is the hottest index of 2016 with YTD gains of over 18%. The index components are interesting not only because of recent performance, but because the top performers either grow to become mid-cap stocks or are bought by large-cap companies at premium prices. This means selecting the best components can result in large gains. In this post, I’ll perform a quantitative stock analysis on the entire list of Russell 2000 stock components using the R programming language. Building on the methodology from my S&P Analysis Post, I develop screening and ranking metrics to identify the top stocks with amazing growth and the greatest consistency. I use R for the analysis including the rvest library for web scraping the list of Russell 2000 stocks, quantmod to collect historical prices for all 2000+ stock components, purrr to map modeling functions, and various other tidyverse libraries such as ggplot2, dplyr, and tidyr to visualize and manage the data workflow. Last, I use plotly to create an interactive visualization used in the screening process. Whether you are familiar with quantitative stock analysis, just beginning, or just interested in the R programming language, you’ll gain both knowledge of data science in R and immediate insights into the best Russell 2000 stocks, quantitatively selected for future returns!

Quantitative Stock Analysis Tutorial: Screening the Returns for Every S&P500 Stock in Less than 5 Minutes

Written on October 23, 2016

Develop quantitative trading strategies in R. Analyze every stock in the S&P 500 to screen risk versus reward.
