We encounter time series data very frequently in the real world. Some common examples include real-time sensors, surveillance video, stock market prices, astrophysics, speech recognition, and so on. In order to study time series data, we try to extract various characteristics that tend to define it. One of the most important things to think about is the dependence between various points in the time series. Is there any dependence between the values? If so, how far apart in time do they have to be in order to affect each other? Understanding these aspects opens up new doors in how we analyze the data. This is where the concept of long memory comes into the picture. Let’s dig a little deeper and understand it, shall we?
What is long memory?
Long memory is a situation that we encounter when we analyze time series data. It is also referred to as long-range dependence. It basically refers to the level of statistical dependence between two points in the time series. More specifically, it relates to the rate at which that statistical dependence decays as we increase the distance in time between the two points.
Time series data usually describes an event. In order to analyze the dependence, we use exponential decay as the threshold. This event is considered to have long memory if the statistical dependence decays more slowly than an exponential decay. This decay is usually a power-like decay. If you are not sure what exponential and power-like decays look like, this should jog your memory (pun totally intended):
exponential decay: y = a^x (with 0 < a < 1)
power-like decay: y = x^(-a) (with a > 0)
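To see how different these two shapes are, here is a minimal sketch that evaluates both at increasing lags. The particular value a = 0.5 for both curves is an illustrative assumption, not something fixed by the definitions:

```python
import numpy as np

# Evaluate both decay shapes at lags 1 through 10.
# a = 0.5 is an arbitrary illustrative choice for both curves.
lags = np.arange(1, 11)
exp_decay = 0.5 ** lags         # y = a^x with 0 < a < 1
power_decay = lags ** -0.5      # y = x^(-a) with a > 0

# The exponential curve plunges toward 0 almost immediately,
# while the power-like curve is still sizable at lag 10.
print(exp_decay[-1])    # roughly 0.001
print(power_decay[-1])  # roughly 0.316
```

Even over just ten lags, the exponential curve has shrunk by three orders of magnitude while the power-like curve has barely lost two thirds of its starting value.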
Exponential decay is way faster than power-like decay. So if the dependence doesn’t decay quickly enough, we say that the event has long memory. This analysis is used extensively in various fields of study like natural language processing, speech recognition, financial modeling, computer networks, and so on. Now that we know what it means, how do we quantify it? This is where the autocorrelation function comes in handy.
What is autocorrelation function?
Autocorrelation refers to the correlation of a given signal with itself at various points in time. What is correlation? Correlation is a statistical method that’s used to understand how strongly a given pair of variables is related. It has a strong mathematical foundation underneath and it’s very easy to google.
Let’s see how autocorrelation can help us here. If you look at it closely, it measures the similarity between measurements as a function of the time difference between them. We use it to find repeating patterns within a given time series. For example, if we have a periodic signal, then we will have repeating patterns throughout the signal. Autocorrelation helps us identify those patterns even if the signal is noisy or if some parts are missing.
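As a quick sketch of that idea, we can compute the sample autocorrelation of a noisy sine wave. The period of 20 samples and the noise level are illustrative assumptions, and the `autocorrelation` helper is our own small implementation of the standard sample estimate:

```python
import numpy as np

def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

# A noisy periodic signal: a sine wave with period 20, plus noise.
rng = np.random.default_rng(0)
t = np.arange(1000)
signal = np.sin(2 * np.pi * t / 20) + 0.5 * rng.standard_normal(1000)

# Despite the noise, the autocorrelation is strongly positive at the
# period (lag 20) and strongly negative at the half period (lag 10).
print(autocorrelation(signal, 20))
print(autocorrelation(signal, 10))
```

The peak near lag 20 is exactly the repeating pattern mentioned above: even though each individual sample is noisy, values one full period apart still move together.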
How do we characterize it?
We know how to understand long and short memory processes, but we need to quantify them in some way. One way to do that is in terms of their autocorrelation functions. For a short memory process, the dependence between values at different times decreases rapidly as we increase the time difference between them. The autocorrelation either decays exponentially or simply drops to 0 after a certain time lag, indicating that the points are independent. In the case of long memory processes, the dependence is stronger: the decay of the autocorrelation function is power-like.
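Here is a small sketch of the short memory case, using an AR(1) process as a stand-in: its theoretical autocorrelation at lag k is phi**k, which is exactly an exponential decay. The choice phi = 0.6 and the series length are illustrative assumptions:

```python
import numpy as np

def sample_acf(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

# Simulate an AR(1) process: x_t = phi * x_{t-1} + noise.
# Its theoretical ACF at lag k is phi**k, an exponential decay.
rng = np.random.default_rng(1)
phi = 0.6
x = np.zeros(5000)
for t in range(1, 5000):
    x[t] = phi * x[t - 1] + rng.standard_normal()

# The sample ACF shrinks by roughly a factor of phi at every lag.
print([round(sample_acf(x, k), 2) for k in (1, 2, 3)])
```

The printed values track 0.6, 0.36, 0.216 up to sampling noise, which is the signature of a short memory process: a couple of lags out, the dependence is already close to gone.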
An important characteristic of time series data is stationarity. A stationary process is a stochastic process whose statistical properties don’t change when shifted in time. If you’re not familiar with the concept, you can quickly check out this blog post to learn more. If the given time series data is stationary, it works to our advantage. Hence we try to convert any time series into a stationary one, which is commonly achieved by taking the first order difference. A series that is already stationary is called an integrated process of order 0 and is denoted I(0), whereas a series that becomes stationary after differencing once is called an integrated process of order 1 and is denoted I(1).
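A random walk is the textbook example of this: the walk itself wanders around and is an I(1) series, but one round of differencing recovers the underlying white noise steps, which are stationary. A minimal sketch:

```python
import numpy as np

# Build a random walk (I(1)) by accumulating white noise steps.
rng = np.random.default_rng(2)
steps = rng.standard_normal(1000)   # stationary white noise, I(0)
walk = np.cumsum(steps)             # nonstationary random walk, I(1)

# The first order difference undoes the accumulation and
# recovers the stationary steps (up to the first sample).
diffed = np.diff(walk)
print(np.allclose(diffed, steps[1:]))  # → True
```

Differencing shortens the series by one sample, which is why the comparison starts at `steps[1:]`.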
Some time series are stationary to start with. In that case, the autocorrelation function declines at an exponential rate, so these processes are said to have short memory: values that are far apart in time are basically independent. But if we are dealing with an I(1) series, its sample autocorrelation function declines at a roughly linear rate, which is much slower than an exponential rate. Values that are far apart in time are not independent in this case.
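We can see this slow decline directly by computing the sample autocorrelation of a random walk, the standard I(1) example. The series length and the particular lags printed are illustrative choices:

```python
import numpy as np

def sample_acf(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

# A random walk is I(1): its sample ACF stays close to 1 and
# declines slowly with the lag, instead of decaying exponentially.
rng = np.random.default_rng(3)
walk = np.cumsum(rng.standard_normal(5000))

print([round(sample_acf(walk, k), 2) for k in (1, 50, 100, 200)])
```

Contrast this with the AR(1) example earlier, whose autocorrelation was already near zero by lag 3: here, even 200 steps apart, the values are still clearly dependent.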
Once we extract this information, we can model the time series data accordingly. Long memory behavior is crucial in how we approach data modeling and can have a significant impact in the financial world.