timekit: Time Series Forecast Applications Using Data Mining

    Written by Matt Dancho on May 2, 2017

    The timekit package contains a collection of tools for working with time series in R. There’s a number of benefits. One of the biggest is the ability to use a time series signature to predict future values (forecast) through data mining techniques. While this post is geared toward exposing the user to the timekit package, there are examples showing the power of data mining a time series as well as how to work with time series in general. A number of timekit functions will be discussed and implemented in the post. The first group of functions works with the time series index, and these include functions tk_index(), tk_get_timeseries_signature(), tk_augment_timeseries_signature() and tk_get_timeseries_summary(). We’ll spend the bulk of this post introducing you to these. The next function deals with creating a future time series from an existing index, tk_make_future_timeseries(). The last set of functions deal with coercion to and from the major time series classes in R, tk_tbl(), tk_xts(), tk_zoo() (and tk_zooreg()), and tk_ts().

    Read More...

    tidyquant 0.5.0: select, rollapply, and Quandl

    Written by Davis Vaughan on April 4, 2017

    We’ve got some good stuff cooking over at Business Science. Yesterday, we had the fifth official release (0.5.0) of tidyquant to CRAN. The release includes some great new features. First, the Quandl integration is complete, which now enables getting Quandl data in “tidy” format. Second, we have a new mechanism to handle selecting which columns get sent to the mutation functions. The new argument name is… select, and it provides increased flexibility which we show off in a rollapply example. Finally, we have added several PerformanceAnalytics functions that deal with modifying returns to the mutation functions. In this post, we’ll go over a few of the new features in version 5.

    Read More...

    tidyquant Integrates Quandl: Getting Data Just Got Easier

    Written by Matt Dancho on March 19, 2017

    Today I’m very pleased to introduce the new Quandl API integration that is available in the development version of tidyquant. Normally I’d introduce this feature during the next CRAN release (v0.5.0 coming soon), but it’s really useful and honestly I just couldn’t wait. If you’re unfamiliar with Quandl, it’s amazing: it’s a web service that has partnered with top-tier data publishers to enable users to retrieve a wide range of financial and economic data sets, many of which are FREE! Quandl has it’s own R package (aptly named Quandl) that is overall very good but has one minor inconvenience: it doesn’t return multiple data sets in a “tidy” format. This slight inconvenience has been addressed in the integration that comes packaged in the latest development version of tidyquant. Now users can use the Quandl API from within tidyquant with three functions: quandl_api_key(), quandl_search(), and the core function tq_get(get = "quandl"). In this post, we’ll go through a user-contributed example, How To Perform a Fama French 3 Factor Analysis, that showcases how the Quandl integration fits into the “Collect, Modify, Analyze” financial analysis workflow. Interested readers can download the development version using devtools::install_github("business-science/tidyquant"). More information is available on the tidyquant GitHub page including the updated development vignettes.

    Read More...

    tidyquant 0.4.0: PerformanceAnalytics, Improved Documentation, ggplot2 Themes and More

    Written by Matt Dancho on March 4, 2017

    I’m excited to announce the release of tidyquant version 0.4.0!!! The release is yet again sizable. It includes integration with the PerformanceAnalytics package, which now enables full financial analyses to be performed without ever leaving the “tidyverse” (i.e. with DATA FRAMES). The integration includes the ability to perform performance analysis and portfolio attribution at scale (i.e. with many stocks or many portfolios at once)! But wait there’s more… In addition to an introduction vignette, we created five (yes, five!) topic-specific vignettes designed to reduce the learning curve for financial data scientists. We also have new ggplot2 themes to assist with creating beautiful and meaningful financial charts. We included tq_get support for “compound getters” so multiple data sources can be brought into a nested data frame all at once. Last, we have added new tq_index() and tq_exchange() functions to make collecting stock data with tq_get even easier. I’ll briefly touch on several of the updates. The package is open source, and you can view the code on the tidyquant github page.

    Read More...

    Recreating RView's ''Reproducible Finance With R: Sector Correlations''

    Written by Davis Vaughan on February 2, 2017

    The folks at RStudio have a segment on their RViews blog, “Reproducible Finance with R”, one that we at Business Science are very fond of! In the spirit of reproducibility, we thought that it would be appropriate to recreate the RViews post, “Reproducible Finance with R: Sector Correlations”. This time, however, the tidyquant package will be used to streamline much of the code that is currently used. The main advantage of tidyquant is to bridge the gap between the best quantitative resources for collecting and manipulating quantitative data: xts, zoo, quantmod and TTR, and the data modeling workflow and infrastructure of the tidyverse. When implemented, tidyquant cuts the code down by about half and simplifies the workflow.

    Read More...

    tidyquant 0.3.0: ggplot2 Enhancements, Real-Time Data, and More

    Written on January 22, 2017

    tidyquant, version 0.3.0, is a pretty sizable release that includes a little bit for everyone, including new financial charting and moving average geoms for use with ggplot2, a new tq_get get option called "key.stats" for retrieving real-time stock information, and several nice integrations that improve the ease of scaling your analyses. If your not already familiar with tidyquant, it integrates the best quantitative resources for collecting and analyzing quantitative data, xts, zoo, quantmod and TTR, with the tidyverse allowing for seamless interaction between each. I’ll briefly touch on some of the updates by going through some neat examples. The package is open source, and you can view the code on the tidyquant github page.

    Read More...

    Speed Up Your Code Part 2: Parallel Processing Financial Data with multidplyr + tidyquant

    Written on January 21, 2017

    Since my initial post on parallel processing with multidplyr, there have been some recent changes in the tidy eco-system: namely the package tidyquant, which brings financial analysis to the tidyverse. The tidyquant package drastically increase the amount of tidy financial data we have access to and reduces the amount of code needed to get financial data into the tidy format. The multidplyr package adds parallel processing capability to improve the speed at which analysis can be scaled. I seriously think these two packages were made for each other. I’ll go through the same example used previously, updated with the new tidyquant functionality.

    Read More...

    tidyquant 0.2.0: Added Functionality for Financial Engineers and Business Analysts

    Written on January 8, 2017

    tidyquant, version 0.2.0, is now available on CRAN. If your not already familiar, tidyquant integrates the best quantitative resources for collecting and analyzing quantitative data, xts, zoo, quantmod and TTR, with the tidy data infrastructure of the tidyverse allowing for seamless interaction between each. I’ll briefly touch on some of the updates. The package is open source, and you can view the code on the tidyquant github page.

    Read More...

    tidyquant: Bringing Quantitative Financial Analysis to the tidyverse

    Written on January 1, 2017

    My new package, tidyquant, is now available on CRAN. tidyquant integrates the best quantitative resources for collecting and analyzing quantitative data, xts, quantmod and TTR, with the tidy data infrastructure of the tidyverse allowing for seamless interaction between each. While this post aims to introduce tidyquant to the R community, it just scratches the surface of the features and benefits. We’ll go through a simple stock visualization using ggplot2, which which shows off the integration. The package is open source, and you can view the code on the tidyquant github page.

    Read More...

    Russell 2000 Quantitative Stock Analysis in R: Six Stocks with Amazing, Consistent Growth

    Written on November 30, 2016

    The Russell 2000 Small-Cap Index, ticker symbol: ^RUT, is the hottest index of 2016 with YTD gains of over 18%. The index components are interesting not only because of recent performance, but because the top performers either grow to become mid-cap stocks or are bought by large-cap companies at premium prices. This means selecting the best components can result in large gains. In this post, I’ll perform a quantitative stock analysis on the entire list of Russell 2000 stock components using the R programming language. Building on the methodology from my S&P Analysis Post, I develop screening and ranking metrics to identify the top stocks with amazing growth and most consistency. I use R for the analysis including the rvest library for web scraping the list of Russell 2000 stocks, quantmod to collect historical prices for all 2000+ stock components, purrr to map modeling functions, and various other tidyverse libraries such as ggplot2, dplyr, and tidyr to visualize and manage the data workflow. Last, I use plotly to create an interactive visualization used in the screening process. Whether you are familiar with quantitative stock analysis, just beginning, or just interested in the R programming language, you’ll gain both knowledge of data science in R and immediate insights into the best Russell 2000 stocks, quantitatively selected for future returns!

    Read More...