The Tidy Time Series Platform: tibbletime 0.1.0
Written by Davis Vaughan on January 4, 2018
We’re happy to announce the third release of the tibbletime package. This is a huge update, mainly due to a complete rewrite of the package. It contains a ton of new functionality and a number of breaking changes that existing users need to be aware of. All of the changes are documented in the NEWS file, but it’s worthwhile to touch on a few of them here and discuss the future of the package. We’re super excited, so let’s check out the vision for tibbletime and its new functionality!
For those new to the package, tibbletime is a new package that enables the creation of time-aware tibbles. Its sole purpose is to make working with time series in the tidyverse much easier! The documentation explains everything, and here are a few important vignettes that can help get you up to speed on all of the functionality:
- Time-Based Filtering
- Changing Periodicity
- Rolling Calculations In tibbletime
- Using tibbletime With dplyr BRAND NEW!!
The grand vision is to have tibbletime function as a base package that others can build on, utilizing the infrastructure that “knows” about the index column and provides support for time series transformations on tibbles. This can include extensions to finance, but also has room to grow into other areas such as economic forecasting, longitudinal studies, and other general time series analyses. We’ve already begun work on one such package, but that will be a post for another time ;).
At this point, the first bit of core functionality for tibbletime is complete. A few other functions will likely be added, but we will definitely support backwards compatibility from here on out.
New time series capabilities
The tibbletime package was completely re-envisioned, making it much more flexible and general. Here are a few of the important new tools in tibbletime:
- A new index partitioning function (collapse_index()) that opens up powerful time-based analysis with any dplyr function, rather than a specific (and limited) set of functions such as time_mutate().
- Full support for POSIXct classes as indices, and experimental support for hms, which should get more stable over time.
- A consistent API along with more informative argument names that attempt to give it that intuitive look and feel of a tidyverse package.
The one downside is that we had to make a few breaking changes, but with this post you’ll be able to easily get your code up to speed with the new functionality. What follows are a few of the most important changes for those who already used tibbletime and are interested in seeing what has changed.
Load the following libraries to follow along.
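A minimal setup sketch, assuming the FB example data set that ships with tibbletime is the one used throughout (an assumption based on the Facebook example discussed later in the post):

```r
# Load libraries
library(tibbletime)
library(dplyr)

# Facebook daily stock prices, bundled with tibbletime
data(FB)

# Convert to a time-aware tibble, declaring `date` as the index column
FB <- as_tbl_time(FB, index = date)

FB
```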
time_collapse() -> collapse_index()
Rather than having a function like time_collapse() that worked on an entire tbl_time object, it has been replaced with collapse_index(), which solely manipulates the index (date) vector. This allows it to be used inside a call to mutate() and gives the user more control over the outcome (for example, whether they want to assign it to a new column or overwrite the original index column).
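As a sketch of this usage, assuming the bundled FB data and a yearly period (both chosen here for illustration), collapse_index() can be called inside mutate() to overwrite the index column:

```r
library(tibbletime)
library(dplyr)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Rows that fall in the same year now share one index value,
# while the number of rows is unchanged
fb_yearly <- FB %>%
  mutate(date = collapse_index(date, period = "yearly"))

fb_yearly
```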
The index has been collapsed. We can now do easy dplyr operations like summaries.
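A minimal sketch of such a summary, assuming the bundled FB data and its adjusted price column; the output column name adjusted_mean is made up for this example:

```r
library(tibbletime)
library(dplyr)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Collapse the index to one value per year, group on it,
# then summarise with ordinary dplyr verbs
fb_yearly_mean <- FB %>%
  mutate(date = collapse_index(date, period = "yearly")) %>%
  group_by(date) %>%
  summarise(adjusted_mean = mean(adjusted))

fb_yearly_mean
```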
An added bonus of this is that it promotes an integration with dplyr that makes time_summarise() and the other time_*() functions obsolete. Rather, you now group on the collapsed date column and can then use any dplyr function that your heart desires. For example, you can easily create 6-month summaries for every column of the Facebook data.
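One way this could look, as a sketch rather than the original listing: it assumes the bundled FB data, the period string "6 month", and dplyr's summarise_if() to restrict the mean to numeric columns (the symbol column is character):

```r
library(tibbletime)
library(dplyr)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Collapse the index into 6-month blocks, group on the collapsed
# date, and take the mean of every numeric column
fb_6m <- FB %>%
  mutate(date = collapse_index(date, period = "6 month")) %>%
  group_by(date) %>%
  summarise_if(is.numeric, mean)

fb_6m
```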
This incremental approach utilizing dplyr groups should feel natural to any tidyverse user. Because of this improved workflow, time_summarise() and friends have been removed.
time_filter() -> filter_time()
A simple change, but with the removal of the other time_*() functions it makes more sense to rename time_filter() to filter_time().
Formula style arguments
Those familiar with tibbletime may be used to the formula style shorthand used in specifying both the period and time_formula arguments found throughout the package. The period argument now only accepts characters, as there was little added benefit from using formulas. The time_formula argument found in create_series() still uses the from ~ to style syntax, but each side must be a character rather than a bare specification.
Previous way (error):
New way (quoted, no error):
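A sketch of the contrast for the period argument, assuming as_period() as the calling function and the bundled FB data. The commented-out formula shorthand (1~y) is recalled from earlier releases and is an assumption; it is not run here:

```r
library(tibbletime)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Previous way: formula shorthand (assumed old syntax, shown for contrast only)
# as_period(FB, period = 1~y)

# New way: the period is a quoted character string
fb_by_year <- as_period(FB, period = "1 year")

fb_by_year
```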
Time Formula Specification
Previous way (error):
New way (quoted, no error):
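A sketch of the same contrast for the time formula, assuming the bundled FB data; the bare specification is left commented out rather than run:

```r
library(tibbletime)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Previous way: bare specification (shown for contrast only)
# filter_time(FB, 2013 ~ 2014)

# New way: each side of the formula is a character
fb_13_14 <- filter_time(FB, "2013" ~ "2014")

# The same quoting applies to one-sided formulas in create_series()
series_2013 <- create_series(~"2013", period = "1 day")

fb_13_14
series_2013
```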
This may seem like a step backwards, but it is more robust to program with and allows the user to pass actual variables into the time formula (something that was requested a few times but was difficult to do). In this example you can use characters or real Date objects, both of which are then unquoted appropriately.
Programming with character date.
Programming with “date” class date.
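A sketch covering both cases, assuming the bundled FB data; the variable names are made up for illustration:

```r
library(tibbletime)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Programming with character dates held in variables
start_chr <- "2013-01-01"
end_chr   <- "2013-01-31"
jan_chr   <- filter_time(FB, start_chr ~ end_chr)

# Programming with Date-class dates held in variables
start_date <- as.Date("2013-01-01")
end_date   <- as.Date("2013-01-31")
jan_date   <- filter_time(FB, start_date ~ end_date)

jan_chr
jan_date
```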
While we are on the topic of filter_time(), check out the new keywords "start" and "end" that you can use in your formula specification.
There are plenty of other minor changes that make the package more consistent and easier for the user, so we encourage reading the NEWS file and checking out the updated vignettes for more information.
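As one illustration of the "start" and "end" keywords for filter_time() mentioned above, here is a minimal sketch. It assumes the bundled FB data and picks the years 2014 and 2016 arbitrarily:

```r
library(tibbletime)

data(FB)
FB <- as_tbl_time(FB, index = date)

# From the first date in the data through the end of 2014
early <- filter_time(FB, "start" ~ "2014")

# From the beginning of 2016 through the last date in the data
late <- filter_time(FB, "2016" ~ "end")

early
late
```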
Dmytro Perepolkin (@dmi3k on Twitter) gave a lot of good feedback on the previous version of tibbletime, and nicely helped promote the package on Twitter and Stack Overflow, so we just wanted to give a special shout out to him! Thanks!
We are super excited about the new release of the re-imagined tibbletime package. It has a ton of new functionality and it can now be extended as a platform to build packages on. The sky is the limit with tibbletime. Install the package, check out the docs, and let us know what you think!