Blog

Tidy Time Series Analysis, Part 2: Rolling Functions

Written by Matt Dancho on July 23, 2017

In the second part in a series on Tidy Time Series Analysis, we’ll again use tidyquant to investigate CRAN downloads this time focusing on Rolling Functions. If you haven’t checked out the previous post on period apply functions, you may want to review it to get up to speed. Both zoo and TTR have a number of “roll” and “run” functions, respectively, that are integrated with tidyquant. In this post, we’ll focus on the rollapply function from zoo because of its flexibility with applying custom functions across rolling windows. If you like what you read, please follow us on social media to stay up on the latest Business Science news, events and information! As always, we are interested in both expanding our network of data scientists and seeking new clients interested in applying data science to business and finance.

sweep: Extending broom for time series forecasting

Written by Matt Dancho on July 9, 2017

We’re pleased to introduce a new package, sweep, now on CRAN! Think of it like broom for the forecast package. The forecast package is the most popular package for forecasting, and for good reason: it has a number of sophisticated forecast modeling functions. There’s one problem: forecast is based on the ts system, which makes it difficult work within the tidyverse. This is where sweep fits in! The sweep package has tidiers that convert the output from forecast modeling and forecasting functions to “tidy” data frames. We’ll go through a quick introduction to show how the tidiers can be used, and then show a fun example of forecasting GDP trends of US states. If you’re familiar with broom it will feel like second nature. If you like what you read, don’t forget to follow us on social media to stay up on the latest Business Science news, events and information!

Tidy Time Series Analysis, Part 1

Written by Matt Dancho on July 2, 2017

In the first part in a series on Tidy Time Series Analysis, we’ll use tidyquant to investigate CRAN downloads. You’re probably thinking, “Why tidyquant?” Most people think of tidyquant as purely a financial package and rightfully so. However, because of its integration with xts, zoo and TTR, it’s naturally suited for “tidy” time series analysis. In this post, we’ll discuss the the “period apply” functions from the xts package, which make it easy to apply functions to time intervals in a “tidy” way using tq_transmute()!

Business Science EARL SF 2017 Presentation: tidyquant, timekit, and more!

Written by Matt Dancho on June 18, 2017

The EARL SF 2017 conference was just held June 5 - 7 in San Francisco, CA. There were some amazing presentations illustrating how R is truly being embraced in enterprises. We gave a three-part presentation on tidyquant for financial data science at scale, timekit for time series machine learning, and Business Science enterprise applications. We’ve uploaded the EARL presentation to YouTube. Check out the presentation, and don’t forget to check out our announcements and to follow us on social media to stay up on the latest Business Science news, events and information!

tidyquant: R/Finance 2017 Presentation

Written by Matt Dancho on May 28, 2017

The R/Finance 2017 conference was just held at the UIC in Chicago, and the event was a huge success. There were a ton of high quality presentations really showcasing innovation in finance. We gave a presentation on tidyquant illustrating the key benefits related to financial analysis in the tidyverse. We’ve uploaded the tidyquant presentation to YouTube. Check out the presentation. Don’t forget to check out our announcements and to follow us on social media to stay up on the latest Business Science news, events and information!

timekit: New Documentation, Function Improvements, Forecasting Vignette

Written by Matt Dancho on May 17, 2017

We’ve just released timekit v0.3.0 to CRAN. The package updates include changes that help with making an accurate future time series with tk_make_future_timeseries() and we’ve added a few features to tk_get_timeseries_signature(). Most important are the new vignettes that cover both the making of future time series task and forecasting using the timekit package. If you saw our last timekit post, you were probably surprised to learn that you can use machine learning to forecast using the time series signature as an engineered feature space. Now we are expanding on that concept by providing two new vignettes that teach you how to use ML and data mining for time series predictions. We’re really excited about the prospects of ML applications with time series. If you are too, I strongly encourage you to explore the timekit package important links below. Don’t forget to check out our announcements and to follow us on social media to stay up on the latest Business Science news, events and information! Here’s a summary of the updates.

tidyquant: New Tools for Performing Financial Analysis within the Tidy Ecosystem

Written by Matt Dancho on May 11, 2017

In advance of upcoming Business Science talks on tidyquant at R/Finance and EARL San Francisco, we are releasing a technical paper entitled “New Tools For Performing Financial Analysis within the ‘Tidy’ Ecosystem”. The technical paper covers an overview of the current R financial package landscape, the independent development of the “tidyverse” data science tools, and the tidyquant package that bridges the gap between the two underlying systems. Several usage cases are discussed. We encourage anyone interested in financial analysis and financial data science to check out the technical paper. We will be giving talks related to the paper at R/Finance on May 19th in Chicago and EARL on June 7th in San Francisco. If you can’t make it, I encourage you to read the technical paper and to follow us on social media to stay up on the latest Business Science news, events and information.

timekit: Time Series Forecast Applications Using Data Mining

Written by Matt Dancho on May 2, 2017

The timekit package contains a collection of tools for working with time series in R. There’s a number of benefits. One of the biggest is the ability to use a time series signature to predict future values (forecast) through data mining techniques. While this post is geared toward exposing the user to the timekit package, there are examples showing the power of data mining a time series as well as how to work with time series in general. A number of timekit functions will be discussed and implemented in the post. The first group of functions works with the time series index, and these include functions tk_index(), tk_get_timeseries_signature(), tk_augment_timeseries_signature() and tk_get_timeseries_summary(). We’ll spend the bulk of this post introducing you to these. The next function deals with creating a future time series from an existing index, tk_make_future_timeseries(). The last set of functions deal with coercion to and from the major time series classes in R, tk_tbl(), tk_xts(), tk_zoo() (and tk_zooreg()), and tk_ts().

tidyquant 0.5.0: select, rollapply, and Quandl

Written by Davis Vaughan on April 4, 2017

We’ve got some good stuff cooking over at Business Science. Yesterday, we had the fifth official release (0.5.0) of tidyquant to CRAN. The release includes some great new features. First, the Quandl integration is complete, which now enables getting Quandl data in “tidy” format. Second, we have a new mechanism to handle selecting which columns get sent to the mutation functions. The new argument name is… select, and it provides increased flexibility which we show off in a rollapply example. Finally, we have added several PerformanceAnalytics functions that deal with modifying returns to the mutation functions. In this post, we’ll go over a few of the new features in version 5.

tidyquant Integrates Quandl: Getting Data Just Got Easier

Written by Matt Dancho on March 19, 2017

Today I’m very pleased to introduce the new Quandl API integration that is available in the development version of tidyquant. Normally I’d introduce this feature during the next CRAN release (v0.5.0 coming soon), but it’s really useful and honestly I just couldn’t wait. If you’re unfamiliar with Quandl, it’s amazing: it’s a web service that has partnered with top-tier data publishers to enable users to retrieve a wide range of financial and economic data sets, many of which are FREE! Quandl has it’s own R package (aptly named Quandl) that is overall very good but has one minor inconvenience: it doesn’t return multiple data sets in a “tidy” format. This slight inconvenience has been addressed in the integration that comes packaged in the latest development version of tidyquant. Now users can use the Quandl API from within tidyquant with three functions: quandl_api_key(), quandl_search(), and the core function tq_get(get = "quandl"). In this post, we’ll go through a user-contributed example, How To Perform a Fama French 3 Factor Analysis, that showcases how the Quandl integration fits into the “Collect, Modify, Analyze” financial analysis workflow. Interested readers can download the development version using devtools::install_github("business-science/tidyquant"). More information is available on the tidyquant GitHub page including the updated development vignettes.

tidyquant 0.4.0: PerformanceAnalytics, Improved Documentation, ggplot2 Themes and More

Written by Matt Dancho on March 4, 2017

I’m excited to announce the release of tidyquant version 0.4.0!!! The release is yet again sizable. It includes integration with the PerformanceAnalytics package, which now enables full financial analyses to be performed without ever leaving the “tidyverse” (i.e. with DATA FRAMES). The integration includes the ability to perform performance analysis and portfolio attribution at scale (i.e. with many stocks or many portfolios at once)! But wait there’s more… In addition to an introduction vignette, we created five (yes, five!) topic-specific vignettes designed to reduce the learning curve for financial data scientists. We also have new ggplot2 themes to assist with creating beautiful and meaningful financial charts. We included tq_get support for “compound getters” so multiple data sources can be brought into a nested data frame all at once. Last, we have added new tq_index() and tq_exchange() functions to make collecting stock data with tq_get even easier. I’ll briefly touch on several of the updates. The package is open source, and you can view the code on the tidyquant github page.

Recreating RView's ''Reproducible Finance With R: Sector Correlations''

Written by Davis Vaughan on February 2, 2017

The folks at RStudio have a segment on their RViews blog, “Reproducible Finance with R”, one that we at Business Science are very fond of! In the spirit of reproducibility, we thought that it would be appropriate to recreate the RViews post, “Reproducible Finance with R: Sector Correlations”. This time, however, the tidyquant package will be used to streamline much of the code that is currently used. The main advantage of tidyquant is to bridge the gap between the best quantitative resources for collecting and manipulating quantitative data: xts, zoo, quantmod and TTR, and the data modeling workflow and infrastructure of the tidyverse. When implemented, tidyquant cuts the code down by about half and simplifies the workflow.