tidyquant, version 0.3.0, is a pretty sizable release that includes a little bit for everyone: new financial charting and moving average geoms for use with ggplot2, a new tq_get get option called "key.stats" for retrieving real-time stock information, and several nice integrations that make it easier to scale your analyses. If you're not already familiar with tidyquant, it integrates the best resources for collecting and analyzing quantitative data (xts, zoo, quantmod and TTR) with the tidyverse, allowing for seamless interaction between each. I’ll briefly touch on some of the updates by going through some neat examples. The package is open source, and you can view the code on the tidyquant GitHub page.

tidyquant: Bringing financial analysis to the tidyverse

When I said this was a big release, I wasn’t kidding. We have some major enhancements in tidyquant:

1. Financial Visualizations for ggplot2: Candlestick charts, barcharts, moving averages and Bollinger Bands can be used in the ggplot “grammar of graphics” workflow. There’s a new vignette, Charting with tidyquant, that details the new financial charting capabilities.

2. Key stats from Yahoo Finance: Users can now get 55 different key statistics in real time from Yahoo Finance with the new "key.stats" get option. The statistics include Bid, Ask, Day’s High, Day’s Low, Last Trade Price, Current P/E Ratio, and many more, most of which change throughout the day. With the addition of the key statistics, tq_get is now truly a one-stop shop for financial information. The user can now get:
• Real-time key stock statistics with "key.stats"
• Historical key ratios and financial information over the past 10 years with "key.ratios"
• Quarterly and annual financial statement data with "financials"
• Historical daily stock prices with "stock.prices"
• Stock lists for 18 different indexes with "stock.index"
• And more!
3. Enhancements that Make Scaling Financial Analysis Simple:

• tq_get now accepts multiple stocks in the form of either a character vector (e.g. c("AAPL", "GOOG", "FB")) or a data frame with the stocks in the first column. This means scaling is ridiculously simple now. A call to tq_get(c("AAPL", "GOOG", "FB"), get = "stock.prices") now gets 10 years of daily stock prices for all three stocks in one data frame!

• tq_mutate and tq_transform now work with grouped data frames. This means that you can extend the xts, zoo, quantmod and TTR functions to grouped data frames the same way that you can with dplyr::mutate. In addition, you can now more easily rename the columns of the transformed / mutated data frame with the col_rename argument. All of this saves you time and requires less code!

This concludes the major changes. Now, let’s go through some examples!

# Prerequisites

First, update to tidyquant v0.3.0.

Next, load tidyquant.
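Both steps can be sketched as follows, assuming you install from CRAN and that CRAN serves v0.3.0 or later (the version check at the end is just a convenience, not something the release requires):

```r
# Install (or update) tidyquant from CRAN
install.packages("tidyquant")

# Load the package; it also attaches xts, zoo, quantmod and TTR
library(tidyquant)

# Confirm which version is installed
packageVersion("tidyquant")
```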

I also recommend the open-source RStudio IDE, which makes R programming easy and efficient.

# Examples

We’ve got some neat examples to show off the new capabilities:

1. Enhanced Financial Data Visualizations: We’ll check out how to use the new tidyquant geoms with ggplot2, which provide great visualizations for time-series and stock data!

2. Working with Key Statistics: We’ll investigate the new tq_get get option, get = "key.stats", which enables access to real-time, intraday trading information!

3. Scaling Your Analysis: We’ll test out some of the new scaling features that make it even easier to scale your analysis from one security to many!

## Example 1: Enhanced Financial Data Visualizations

I absolutely love these new ggplot geoms that come packaged with tidyquant, and I’m really excited to show them off! Two new chart types come packaged with tidyquant v0.3.0: geom_candlestick and geom_barchart (not to be confused with geom_bar). In this post, we’ll focus on the candlestick chart, but the barchart works in a very similar manner.

Before we start, let’s get some data using tq_get. The first call gets a single stock (nothing new here), and the second call retrieves the FANG stocks using the new scaling functionality by piping (%>%) a character vector of symbols to tq_get (there are other ways too!).
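A minimal sketch of the two calls is below. The date range and the exact FANG tickers ("FB", "AMZN", "NFLX", "GOOG") are my assumptions for illustration; swap in whatever symbols and dates you need.

```r
library(tidyquant)
library(dplyr)   # for the pipe, %>%

# Assumed date range for the examples
from <- "2015-01-01"
to   <- "2016-12-31"

# Single stock: same as before v0.3.0
AAPL <- tq_get("AAPL", get = "stock.prices", from = from, to = to)

# Multiple stocks: pipe a character vector of symbols to tq_get()
FANG <- c("FB", "AMZN", "NFLX", "GOOG") %>%
    tq_get(get = "stock.prices", from = from, to = to)
```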

Before v0.3.0, we used geom_line to create a line chart like so. Note that coord_x_date is a new tidyquant coordinate function that enables zooming in on part of the chart without out-of-bounds data loss (scale_x_date is similar but causes out-of-bounds data loss, which wreaks havoc on moving average geoms).
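Here is a sketch of such a line chart. The date range, the six-week zoom window, and the helper names (start, end, visible) are assumptions for illustration; the y-limits are computed from the visible window rather than hard-coded.

```r
library(tidyquant)
library(dplyr)
library(ggplot2)

AAPL <- tq_get("AAPL", get = "stock.prices",
               from = "2015-01-01", to = "2016-12-31")

# Zoom window: the last six weeks of the data (assumed)
end   <- as.Date("2016-12-31")
start <- end - 42

# Rows inside the zoom window, used only to pick sensible y-limits
visible <- dplyr::filter(AAPL, date >= start)

p_line <- AAPL %>%
    ggplot(aes(x = date, y = close)) +
    geom_line() +
    labs(title = "AAPL Line Chart", x = "", y = "Closing Price") +
    # Zooms without dropping the out-of-bounds rows
    coord_x_date(xlim = c(start, end), ylim = range(visible$close))

p_line
```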

With tidyquant, we can replace the geom_line with geom_candlestick to create a beautiful candlestick chart that shows open, high, low, close, and direction visually. The only real difference is that we need to specify the aesthetic arguments, open, high, low and close. Everything else can stay the same.
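A sketch of the swap, under the same assumed date range and six-week zoom window (start, end and visible are illustrative names):

```r
library(tidyquant)
library(dplyr)
library(ggplot2)

AAPL <- tq_get("AAPL", get = "stock.prices",
               from = "2015-01-01", to = "2016-12-31")

end     <- as.Date("2016-12-31")
start   <- end - 42
visible <- dplyr::filter(AAPL, date >= start)

p_candle <- AAPL %>%
    ggplot(aes(x = date, y = close)) +
    # Candlesticks need the four OHLC aesthetics
    geom_candlestick(aes(open = open, high = high, low = low, close = close)) +
    labs(title = "AAPL Candlestick Chart", x = "", y = "Closing Price") +
    coord_x_date(xlim = c(start, end),
                 ylim = range(c(visible$low, visible$high)))

p_candle
```

Replacing geom_candlestick with geom_barchart (same aesthetics) should give the barchart version.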

Pretty sweet! Let’s take this a step further with moving averages. The moving average geom, geom_ma, is used to quickly draw moving average lines using a moving average function, ma_fun, that is one of seven from the TTR package. We can use these to “rapid prototype” moving averages, enabling us to quickly identify changes in trends. Let’s add 15 and 50-day moving averages. Note that geom_ma takes arguments to control the moving average function (ma_fun = SMA and n = 15) and arguments to control the line such as color = "red" or linetype = 4.
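One way this might look, again with an assumed date range and zoom window. The blue color for the 50-day line is my choice; only color = "red" and linetype = 4 come from the text above.

```r
library(tidyquant)
library(dplyr)
library(ggplot2)

AAPL <- tq_get("AAPL", get = "stock.prices",
               from = "2015-01-01", to = "2016-12-31")

end     <- as.Date("2016-12-31")
start   <- end - 42
visible <- dplyr::filter(AAPL, date >= start)

p_ma <- AAPL %>%
    ggplot(aes(x = date, y = close)) +
    geom_candlestick(aes(open = open, high = high, low = low, close = close)) +
    # 15-day and 50-day simple moving averages of the y aesthetic (close)
    geom_ma(ma_fun = SMA, n = 15, color = "red",  linetype = 4) +
    geom_ma(ma_fun = SMA, n = 50, color = "blue", linetype = 4) +
    labs(title = "AAPL Candlestick Chart with 15 and 50-Day SMA",
         x = "", y = "Closing Price") +
    coord_x_date(xlim = c(start, end),
                 ylim = range(c(visible$low, visible$high)))

p_ma
```

Because coord_x_date only zooms, the 50-day average is still computed from the full history rather than from the six visible weeks.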

We can also use Bollinger Bands to help visualize volatility. BBands take a moving average, such as ma_fun = SMA from TTR, and a standard deviation, sd = 2 by default. Because BBands depend on the high, low and close prices, we need to add these as aesthetic arguments. Let’s use a 20-day simple moving average with two standard deviations. We can see that there were two periods, one in October and one in November, that had higher volatility.
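A sketch of the Bollinger Bands layer, with the same assumed data and zoom window. I pad the y-limits by 5% so the bands are not clipped; that padding is an arbitrary choice.

```r
library(tidyquant)
library(dplyr)
library(ggplot2)

AAPL <- tq_get("AAPL", get = "stock.prices",
               from = "2015-01-01", to = "2016-12-31")

end     <- as.Date("2016-12-31")
start   <- end - 24 * 7   # a wider, 24-week window (assumed)
visible <- dplyr::filter(AAPL, date >= start)

p_bbands <- AAPL %>%
    ggplot(aes(x = date, y = close)) +
    geom_candlestick(aes(open = open, high = high, low = low, close = close)) +
    # 20-day SMA with bands at two standard deviations
    geom_bbands(aes(high = high, low = low, close = close),
                ma_fun = SMA, n = 20, sd = 2) +
    labs(title = "AAPL Bollinger Bands", x = "", y = "Closing Price") +
    coord_x_date(xlim = c(start, end),
                 ylim = range(c(visible$low, visible$high)) * c(0.95, 1.05))

p_bbands
```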

Last, we can visualize multiple stocks at once by adding a group aesthetic and tacking on a facet_wrap at the end of the ggplot workflow. Note that the out-of-bounds data becomes important to the scale of the facet: too much data and the y-axis is off scale, too little data and the moving average is thrown off. An easy way to adjust is to use filter() to keep only the data from the chart's start date minus double the number of moving average periods (2 * n) onward. This reduces the out-of-bounds data without eliminating data that the moving average function needs for calculations.
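The faceted version could be sketched like this. The tickers, dates and window are assumptions, and I use the symbol.x column name that this release generates (later releases may name it differently).

```r
library(tidyquant)
library(dplyr)
library(ggplot2)

FANG <- c("FB", "AMZN", "NFLX", "GOOG") %>%
    tq_get(get = "stock.prices", from = "2015-01-01", to = "2016-12-31")

end   <- as.Date("2016-12-31")
start <- end - 24 * 7   # visible window (assumed)
n     <- 50             # longest moving average period

p_facets <- FANG %>%
    # Keep 2 * n extra days before the window so the SMA has enough history
    dplyr::filter(date >= start - 2 * n) %>%
    ggplot(aes(x = date, y = close, group = symbol.x)) +
    geom_candlestick(aes(open = open, high = high, low = low, close = close)) +
    geom_ma(ma_fun = SMA, n = 15, color = "red",  linetype = 4) +
    geom_ma(ma_fun = SMA, n = n,  color = "blue", linetype = 4) +
    labs(title = "FANG Candlestick Charts", x = "", y = "Closing Price") +
    coord_x_date(xlim = c(start, end)) +
    facet_wrap(~ symbol.x, ncol = 2, scales = "free_y")

p_facets
```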

## Example 2: Working with Key Statistics

New to tq_get is the get option get = "key.stats". So, what are key stats? Yahoo Finance has an amazing list of real-time statistics such as bid price, ask price, day’s high, day’s low, change, and many more features that change throughout the day. Key stats are our access to live data, the most current features of a stock / company, many of which are accurate to the second that they are retrieved. Pretty neat!

##### Getting Key Stats

Let’s get some key stats, and see what’s inside. We get key stats using the tq_get function, setting get = "key.stats". When we show the data, it’s kind of messy (there’s a reason) so I’ve just listed the first ten column names. It comes in the form of a one row tibble (tidy data frame) that has 55 columns, one for each key stat.

The reason that the data comes this way is because, using the new scaling capability, we can get key stats for multiple stocks, and the rows get stacked on top of each other. This makes comparing key stats very easy!
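A sketch of both cases, using the "key.stats" get option as it exists in this release (the three symbols in the second call are just an illustration):

```r
library(tidyquant)
library(dplyr)

# One stock: a one-row tibble with one column per key stat
AAPL_key_stats <- tq_get("AAPL", get = "key.stats")
names(AAPL_key_stats)[1:10]   # first ten column names

# Several stocks: one row per symbol, stacked for easy comparison
key_stats <- c("AAPL", "FB", "GOOG") %>%
    tq_get(get = "key.stats")
key_stats
```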

##### Retrieve Real-Time Data at Periodic Intervals

Something great about real-time data is that it can be collected at periodic intervals when trading is in-session! The following code chunk when run will retrieve stock prices at a periodic interval:
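One way to write such a collector is sketched below. The function name collect_key_stats, the collected.at timestamp column, and the fetch argument are all hypothetical; fetch defaults to a "key.stats" call and is only there so the loop can be tried with a stand-in function.

```r
library(dplyr)

# Poll key stats n times, pausing interval_sec seconds between polls.
# `fetch` is any function that takes symbols and returns a data frame.
collect_key_stats <- function(symbols, interval_sec, n,
                              fetch = function(s) {
                                  tidyquant::tq_get(s, get = "key.stats")
                              }) {
    snapshots <- vector("list", n)
    for (i in seq_len(n)) {
        # Stamp each snapshot with the time it was retrieved
        snapshots[[i]] <- fetch(symbols) %>%
            mutate(collected.at = Sys.time())
        if (i < n) Sys.sleep(interval_sec)
    }
    # Stack all snapshots into one data frame
    bind_rows(snapshots)
}

# Usage (needs a live connection while markets are open):
# collect_key_stats("AAPL", interval_sec = 3, n = 5)
```

Collecting into a list and binding once at the end avoids growing a data frame inside the loop.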

##### Comparing Historical Data to Current Data

We now have get = "key.stats" for current stats, and with v0.2.0 we got get = "key.ratios" for 10 years of historical ratios. When combined, we can now compare current attributes to historical trends. To put this into perspective, we will investigate the P/E Ratio: Comparing Historical Trends Versus Current Value for AAPL. The P/E ratio is a measure of the stock valuation. Stocks are considered “expensive” when they trade above historical averages or above industry averages.

We already have the key stats from AAPL, so getting the current P/E Ratio is very easy.

Due to the amount of data and time-series nature, the key ratios come as a nested tibble, grouped by section type.

We need to get the historical P/E Ratios, which are in the “Valuation Ratios” section. We will do a series of filtering and unnesting to peel away the layers and isolate the “Price to Earnings” time-series data.
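The peeling might look like the sketch below. The nested column name (data) and the inner column names (category, date, value) are my assumptions about how the key ratios tibble is laid out in this release; check names() on your own result before relying on them.

```r
library(tidyquant)
library(dplyr)
library(tidyr)

# Nested tibble: one row per section, time series in a list-column
AAPL_key_ratios <- tq_get("AAPL", get = "key.ratios")

AAPL_historical_pe <- AAPL_key_ratios %>%
    # Layer 1: keep only the valuation section
    dplyr::filter(section == "Valuation Ratios") %>%
    # Layer 2: expand the nested time-series data
    unnest(data) %>%
    # Layer 3: isolate the P/E series
    dplyr::filter(category == "Price to Earnings") %>%
    select(date, value)

AAPL_historical_pe
```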

Now, we are ready to visualize the P/E Ratio: Comparing Historical Trends Versus Current Value for AAPL. The visualization below is inspired by r-statistics.co, an awesome resource for ggplot2 and R analysis. We add the following:

• Geoms:
1. geom_line() and geom_point() to chart the historical data
2. geom_ma() to chart the three period simple moving average (the three period average helps identify the trend through the noise)
3. geom_hline() to add a horizontal line at current P/E Ratio obtained from key stats.
• Legend: We manipulate the colors with scale_color_manual() and the position in the theme() function.
• Logo: A logo is generated as a grob (grid graphical object) using the grid and png packages. The function annotation_custom() allows us to simply add it to the ggplot workflow. See Add an Image to Background for a tutorial.

The chart shows that the current valuation is slightly above the recent historical valuation, indicating that the stock price is slightly “expensive”. However, given that the P/E ratio is below the current S&P 500 average of 25, courtesy of www.multpl.com, one could also consider this stock “inexpensive”. It just depends on your perspective. :)

## Example 3: Scaling Your Analysis

Probably the single most important benefit of performing financial analysis in the tidyverse is the ability to scale. Based on some excellent feedback from @KanAugust, I have made scaling even easier. There are two new options for scaling:

New Option 1: Passing a character vector of symbols:

Send a character vector in the form c("X", "Y", "Z") to tq_get. A new column is generated, symbol.x, with the symbols that were passed to the x argument.
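A short sketch that checks the new column (the symbols and dates are arbitrary):

```r
library(tidyquant)
library(dplyr)

stocks <- c("AAPL", "GOOG", "FB") %>%
    tq_get(get = "stock.prices", from = "2016-01-01", to = "2016-12-31")

# symbol.x records which symbol each row belongs to
stocks %>% count(symbol.x)
```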

New Option 2: Passing a tibble with symbols in the first column:

We can combine tq_get calls using get = "stock.index" and get = "stock.prices" to pass a stock index to get stock prices. I’ve added slice(1:3) to get the first three stocks from the index, which reduces the download time. If you remove slice(1:3), you will get the historical prices for all stocks in an index in the next step!

First, get stocks from an index.

Then get stock prices. Note that symbols must be in the first column.
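The two steps can be sketched as follows. "DOWJONES" is my assumed index name for illustration; any index that get = "stock.index" supports in this release should work the same way.

```r
library(tidyquant)
library(dplyr)

# Step 1: get the index constituents and keep the first three
dow_first_three <- tq_get("DOWJONES", get = "stock.index") %>%
    slice(1:3)

# Step 2: pass the tibble on; symbols are read from the first column
dow_prices <- dow_first_three %>%
    tq_get(get = "stock.prices")

dow_prices
```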

We can also use tq_mutate and tq_transform with dplyr::group_by to scale analyses! Thanks to some great feedback from @dvaughan32, the col_rename argument is available to conveniently rename the newly transformed / mutated columns.

Here’s a powerful example: We can use group_by and tq_transform to collect annual returns for a tibble of stock prices for multiple stocks. The result can be piped to ggplot for charting.
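A sketch of that workflow is below. The argument names ohlc_fun and transform_fun are what I understand tq_transform to use in this release, and the tickers, dates and chart styling are assumptions.

```r
library(tidyquant)
library(dplyr)
library(ggplot2)

FANG <- c("FB", "AMZN", "NFLX", "GOOG") %>%
    tq_get(get = "stock.prices", from = "2013-01-01", to = "2016-12-31")

# Annual returns per stock, computed from adjusted prices
FANG_annual_returns <- FANG %>%
    group_by(symbol.x) %>%
    tq_transform(ohlc_fun      = Ad,
                 transform_fun = periodReturn,
                 period        = "yearly",
                 col_rename    = "annual.returns")

# Pipe the result straight into ggplot
p_returns <- FANG_annual_returns %>%
    ggplot(aes(x = format(date, "%Y"), y = annual.returns, fill = symbol.x)) +
    geom_col(position = "dodge") +
    scale_y_continuous(labels = scales::percent) +
    labs(title = "FANG: Annual Returns", x = "", y = "Annual Return")

p_returns
```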

# Conclusions

The tidyquant package has several enhancements for financial analysis:

• New ggplot2 geoms for candlestick charts, barcharts, moving averages, and Bollinger Bands, and a brand new vignette to help guide users on charting capabilities.

• New get = "key.stats" for current stats on stocks: 55 total are available. The key stats complement the key ratios (get = "key.ratios"), which contain 10 years of historical information on various key ratios and financial information.

• New capabilities for scaling financial analyses to many stocks:

• Using tq_get with character vectors or tibbles of stocks
• Using tq_mutate / tq_transform with dplyr::group_by

With these updates, we can really do full financial analyses without ever leaving the tidyverse!

# Recap

We went over a few examples to illustrate the main updates to tidyquant:

1. The first example showed an implementation of several new tidyquant geoms that work with ggplot2: geom_candlestick / geom_barchart, geom_ma, and geom_bbands.

2. The second example showed use of the new tq_get get option, get = "key.stats". The key stats provide real-time data from Yahoo Finance, and are a handy complement to the historical data provided using get options, "stock.prices", "key.ratios", and "financials".

3. The third and final example showed some of the improvements in scaling analysis with the tidyverse. You can now pipe multiple symbols into tq_get to scale any of the get options, and you can use tq_mutate and tq_transform with dplyr::group_by.

I hope you enjoy the new features as much as I enjoyed creating them. As always, there’s more to come! :)

1. r-statistics.co: You need to check out this website, which contains a wealth of quality, up-to-date R information. The Top 50 ggplot2 visualizations is amazing. This is now my go-to reference on ggplot2.

2. Tidyquant Vignettes: This tutorial just scratches the surface of tidyquant. The vignettes explain much, much more!

3. R for Data Science: A free book that thoroughly covers the tidyverse packages.

4. Quantmod Website: Covers many of the quantmod functions. Also, see the quantmod CRAN site.

5. Extensible Time-Series Website: Covers many of the xts functions. Also, see the xts vignette.

6. TTR on CRAN: The reference manual covers each of the TTR functions.

7. Zoo Vignettes: Covers the zoo rollapply functions as well as other usage.