Time series data has memory. It remembers what happened in the past and avenges any wrongdoings! Can you believe it? Okay, the avenging part may not be true, but it definitely remembers the past. The “memory” refers to how strongly the past influences the future in a given time series variable. If the memory is strong, then analyzing the past is really useful to us because it tells us something about what’s going to happen in the future. If you need a quick refresher, you can check out my blog post where I talked about memory in time series data. We have a high-level understanding of how we can classify time series data into short memory and long memory, but how do we actually measure the memory?
Why do we need to measure memory?
When we talk about long and short memory processes, we are basically talking about how correlated different points on the timeline are. As we increase the temporal distance between two points, how strongly do they continue to be correlated? This is what we refer to as the “memory” of the data. The reason this is important is that it has a huge impact on the predictability of the data. Put another way, if you want to do things like prediction or forecasting, you need to estimate the memory first, or else your results are likely to be spurious.
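To make the idea of correlation fading with distance concrete, here is a minimal sketch in plain Python. The helper name autocorrelation, the toy AR(1) series with a coefficient of 0.9, and the chosen lags are all assumptions made for illustration; none of them come from a particular library.

```python
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


random.seed(0)

# White noise: each point is drawn independently of every other point.
noise = [random.gauss(0, 1) for _ in range(5000)]

# Toy AR(1) series (an assumption for illustration): each point keeps
# 90% of the previous point and adds a fresh random shock.
ar1 = [0.0]
for shock in noise[1:]:
    ar1.append(0.9 * ar1[-1] + shock)

# Compare how correlation behaves as the two points move further apart.
for lag in (1, 5, 20):
    print(
        f"lag={lag:2d}  "
        f"noise={autocorrelation(noise, lag):+.3f}  "
        f"ar1={autocorrelation(ar1, lag):+.3f}"
    )
```

If things work as intended, the white noise column should hover near zero at every lag, while the AR(1) column should start high and shrink as the lag grows. A fast, geometric decay like this is typical of short memory; a long memory series would hold on to its correlation over much larger distances.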
Short memory processes are harder to exploit for forecasting because the influence of the past fades quickly. But long memory processes are fascinating because the past strongly influences the present and the future. So it would be very useful to estimate the long-term memory of a variable. This is where the Hurst exponent comes into the picture.
What is the Hurst exponent?
The Hurst exponent is basically a measure of the long-term memory of a given time series variable. It is also referred to as the index of dependence. It tells us how strongly the given time series data tends to regress to the mean, as opposed to clustering in one direction. The value of the Hurst exponent lies between 0 and 1. If the value is 0.5, then it indicates that successive changes in the data are uncorrelated, as in a random walk. It’s just Brownian!
If the value is between 0.5 and 1, it indicates that the time series data is persistent. It means that if the values are increasing right now, then the increase is more likely to be followed by another increase in the short term. Similarly, if the values are decreasing right now, then the decrease is more likely to be followed by another decrease in the short term.
If the value is between 0 and 0.5, it indicates that the time series data is anti-persistent. It means that if the values are increasing right now, then the increase is more likely to be followed by a decrease in the short term. Similarly, if the values are decreasing right now, then the decrease is more likely to be followed by an increase in the short term.
In essence, the further the value of the Hurst exponent moves away from 0.5, the more information it tends to give about the time series data. The data becomes more “predictable”.
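One classic way to estimate the exponent is rescaled range (R/S) analysis: split the series into windows, compute the range of the running sum of deviations from the window mean, divide it by the window's standard deviation, and see how that ratio grows with the window size. Below is a rough sketch in plain Python under a few loud assumptions: the function names rescaled_range and hurst_rs are made up for this post, the input is treated as a series of changes (not levels), and the three toy series are chosen only to land in the three regimes described above. R/S estimates are also known to be biased for short windows, so treat the numbers as a relative comparison and not as precise values.

```python
import math
import random


def rescaled_range(chunk):
    """Return R/S for one window, or None if the window is constant."""
    n = len(chunk)
    mean = sum(chunk) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in chunk) / n)
    if std == 0:
        return None
    # Track the range of the running sum of deviations from the mean.
    running = lowest = highest = 0.0
    for x in chunk:
        running += x - mean
        lowest = min(lowest, running)
        highest = max(highest, running)
    return (highest - lowest) / std


def hurst_rs(series, min_window=16):
    """Estimate the Hurst exponent as the slope of log(R/S) vs log(window)."""
    points = []
    window = min_window
    while window <= len(series) // 4:
        # Average R/S over non-overlapping windows of this size.
        values = []
        for start in range(0, len(series) - window + 1, window):
            rs = rescaled_range(series[start:start + window])
            if rs is not None:
                values.append(rs)
        if values:
            mean_rs = sum(values) / len(values)
            points.append((math.log(window), math.log(mean_rs)))
        window *= 2
    if len(points) < 2:
        raise ValueError("series is too short for this sketch")
    # Ordinary least squares slope through the (log window, log R/S) points.
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    return numerator / denominator


random.seed(42)

# Uncorrelated changes: plain white noise.
noise = [random.gauss(0, 1) for _ in range(4096)]

# Anti-persistent toy series: differenced noise, where an up move
# tends to be followed by a down move.
reverting = [noise[i] - noise[i - 1] for i in range(1, len(noise))]

# Persistent toy series: a running sum of the noise, where moves
# tend to keep going in the same direction.
trending = []
total = 0.0
for shock in noise:
    total += shock
    trending.append(total)

for name, data in (("reverting", reverting), ("noise", noise), ("trending", trending)):
    print(f"{name:9s} H ~ {hurst_rs(data):.2f}")
```

The interesting part is the ordering: the reverting series should come out well below the white noise, and the trending series well above it. The white noise estimate itself may sit a little off 0.5 because of the small-window bias mentioned above.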
What about stationarity?
If you recall, we discussed the concept of stationarity in our previous blog posts and how it affects the predictability of time series data. In case you are not familiar with it, you can check out this blog post. A stationary process is basically a stochastic process whose statistical properties don’t change when shifted in time. Now we are saying that the Hurst exponent also gives an idea about predictability. Are they related? Well, both concepts are designed to give us an idea of whether or not the given time series data lends itself well to predictive analysis.
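As a quick illustration of what "properties don’t change when shifted in time" means in practice, here is a small sketch that compares the mean and variance of the first and second half of a series. The helper name half_stats and the drift of 0.01 per step are assumptions made for this example, and a crude check like this is not a substitute for a proper stationarity test.

```python
import random


def half_stats(series):
    """Return [(mean, variance), (mean, variance)] for the two halves."""
    middle = len(series) // 2
    stats = []
    for part in (series[:middle], series[middle:]):
        mean = sum(part) / len(part)
        variance = sum((x - mean) ** 2 for x in part) / len(part)
        stats.append((mean, variance))
    return stats


random.seed(1)

# Stationary toy series: white noise with a fixed mean and variance.
stationary = [random.gauss(0, 1) for _ in range(4000)]

# Non-stationary toy series: the same noise plus a steady upward drift,
# so the mean keeps changing as time moves forward.
drifting = [0.01 * t + shock for t, shock in enumerate(stationary)]

for name, data in (("stationary", stationary), ("drifting", drifting)):
    (mean_a, var_a), (mean_b, var_b) = half_stats(data)
    print(f"{name:10s} first half: mean={mean_a:.2f} var={var_a:.2f}  "
          f"second half: mean={mean_b:.2f} var={var_b:.2f}")
```

For the stationary series, the two halves should look roughly alike. For the drifting series, the second half should have a clearly higher mean, which is exactly the kind of change over time that stationarity rules out.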
For a lot of time series analysis techniques, we need to make assumptions about the stationarity of the data before we extract properties from it. This is restrictive because we usually don’t know beforehand whether the data is stationary. The Hurst exponent is interesting because we can extract the properties of a time series variable without making any assumptions about its stationarity. The Hurst exponent is used extensively in finance, healthcare, manufacturing, economics, biotech, and so on.