Article

Anti-Persistent Values of the Hurst Exponent Anticipate Mean Reversion in Pairs Trading: The Cryptocurrencies Market as a Case Study

1 Grupo de Sistemas Complejos, Universidad Politécnica de Madrid, Av Puerta de Hierro 2, 28040 Madrid, Spain
2 AGrowingData, 04001 Almería, Spain
3 Departamento de Química, Universidad Autónoma de Madrid, Cantoblanco, 28049 Madrid, Spain
4 ICAI Engineering School, Universidad Pontificia de Comillas, Alberto Aguilera 23, 28015 Madrid, Spain
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(18), 2911; https://doi.org/10.3390/math12182911
Submission received: 15 August 2024 / Revised: 12 September 2024 / Accepted: 13 September 2024 / Published: 19 September 2024
(This article belongs to the Special Issue Chaos Theory and Its Applications to Economic Dynamics)
Figure 1. The green line shows the dependence of the median difference in HMR between the control and treatment groups (HMR_control − HMR_treat) on TW. The blue line shows how the number of trading signals triggered by H < 0.5 decreases as a function of TW.
Figure 2. Median HMR for the treatment and control groups as a function of the co-movement (classified in five categories ordered from low level to high level of co-movement) according to several metrics: (A) correlation, (B) cointegration, (C) MI, and (D) DTW. In (E), the median difference in HMR between the control and treatment groups as a function of the degree of co-movement is given for the four metrics. The co-movement metric is color-coded.
Figure 3. Cumulative profit for the five strategies, plus the random version used in this work (see Section 2 for details).
Figure 4. Boxplots comparing the duration of trades where MR actually happened, for positions opened when H < 0.5 and for positions opened when H ≥ 0.5.
Figure 5. Comparison of the median MSOT as a function of the portfolio size between the strategy including H < 0.5 as a trading signal and the regular case not considering H < 0.5. Each panel contains the results for one co-movement metric: (A) Hurst, (B) correlation, (C) MI, (D) DTW, and (E) cointegration. (F) represents the median difference in MSOT between the regular version of the strategy and the one considering H < 0.5 for the five co-movement metrics. The co-movement metric is color-coded.

Abstract

Pairs trading is a short-term speculation trading strategy based on matching a long position with a short position in two assets in the hope that their prices will return to their historical equilibrium. In this paper, we focus on identifying opportunities where mean reversion will happen quickly, as the commission costs associated with keeping the positions open for an extended period of time can eliminate excess returns. To this end, we propose the use of the local Hurst exponent as a signal to open trades in the cryptocurrencies market. We conduct a natural experiment to show that the spreads of pairs with anti-persistent values of the Hurst exponent revert to their mean significantly faster. Next, we verify that this effect holds across pairs with different levels of co-movement. Finally, we back-test several pairs trading strategies that include H < 0.5 as an indicator and check that all of them result in profits. Hence, we conclude that the Hurst exponent represents a meaningful indicator to detect pairs trading opportunities in the cryptocurrencies market.

1. Introduction

Pairs trading [1,2] is a quantitative trading strategy based on the historical correlation of two assets that are expected to move in harmony. Thus, the idea behind it is to track two assets whose prices have historically moved together, detect when their prices diverge, and simultaneously open a short position on the winner and a long on the loser. If the observed pattern holds, it will not take long for both prices to converge again, and the trade will result in profits. By the same token, if the divergence among the prices continues, the trade will result in losses.
As shown in [3], despite its simplicity, this strategy can be profitable even in its most naive version, the distance method (DM). In DM, the Euclidean distance between the normalized price time series is used to monitor the pair and, when the sum of squared differences diverges by more than a given threshold, the long and short positions are opened. Several studies have backed up the profitability of this strategy [4]. However, more recent studies have highlighted the limitations of DM. First, its profitability has been reported to be declining over time at an increasing rate [5,6]. Second, it carries considerable risk, as its profitability varies strongly over time [7]. Last but not least, other authors have confirmed the weak performance of DM in the US market during the first decade of the 21st century, while pointing out that pairs trading can still be profitable when it relies on more sophisticated methodologies such as cointegration [8]. In this line of research, several studies have proposed more sophisticated strategies for pairs trading, most of them focusing on methodologies to measure co-movement for the pair selection task. These studies propose ordering the universe of pairs by their co-movement and trading the top pairs. In fact, several works propose different methods to rank the pairs, including correlation, cointegration [1,9], copulas [10,11], or the Hurst exponent [12,13], and compare them [8,14,15,16,17]. An alternative approach to co-movement consists of combining forecasts and multi-criteria decision methods [18,19]. This approach, which differs significantly from the previous ones, first forecasts the difference in return that the stocks should have at the end of the trading period, then ranks them, and finally buys the top stocks of the ranking and sells the bottom ones.
In this paper, we aim to contribute to the understanding of pairs trading and how to build profitable strategies around this paradigm. To this end, we focus on the applicability of the Hurst exponent (H) [20] to identify opportunities where mean reversion (MR) happens faster. For the profitability of pairs trading, speed plays a critical role, as the short selling fees increase proportionally with the time period that the position remains open. Additionally, as several studies [6,21] highlight, profits are very sensitive to the associated transaction costs. We propose to use H because it describes the chaotic properties [22,23] of time series, and is the most widely accepted test to measure long-term memory properties [24,25,26,27]. H is a measure of the long-term memory and fractality of a time series that quantifies the degree of persistence of similar change patterns. H ranges between 0 and 1, and provides information on whether the series presents long-term memory or not. If H = 0.5, then each step is independent of the past values of the series. Thus, there is no memory and the series behaves like a random walk whose increments are white noise. Under this setting, the time series is unpredictable, and the Efficient Market Hypothesis (EMH) [28,29] is fulfilled. When H < 0.5, the series is anti-persistent. In this scenario, the series is expected to display mean reversion. In terms of pairs trading, this implies that divergences within a pair will generally converge back to equilibrium. Finally, when H > 0.5, the series is persistent. In a persistent regime, the series is more likely to maintain its trend over a broader range than expected from a pure random walk. Thus, an initial divergence within the pair is expected to increase and drive the pair away from equilibrium. In fact, local H has already been used to describe the chaotic properties of single financial time series and to provide useful information to anticipate their trend [30,31].
Unlike most of the scientific literature around pairs trading, we do not rely on back-testing to test our hypothesis, but instead perform a natural experiment [32]. As pointed out in [33], back-testing is just a simulation of how a strategy would have performed if it had been executed over a past period of time, and therefore does not represent a proper research tool. Since we aim to find if there is a cause–effect relationship between anti-persistent values of H and an increase in the speed to MR, we take advantage of a natural experiment that allows us to control for environmental variables.
The present paper is organized as follows: in Section 2, we describe the dataset, explain the methodology followed to compute several co-movement metrics, including the local Hurst exponent, and describe the design of our natural experiment. Next, in Section 3, we present our results, which show that anti-persistent values of H reduce the time to MR. Finally, in Section 4, we present our conclusions and discuss the importance of our results.

2. Data and Methods

2.1. Data

We built a dataset with the price time series of the top 20 cryptocurrencies in terms of market capitalization, according to CoinMarketCap as of 6 June 2024. A list of all 20 cryptocurrencies can be found in Table A1. We downloaded this data from Binance through its API, at an hourly frequency, for the period that ranged from 1 January 2019 to 5 June 2024.
For each pair of cryptocurrencies, we computed the spread (s) as a continuous time model given by:
$$s = \log(p_t^A) - b \times \log(p_t^B) \tag{1}$$
where $\log(p_t^A)$ is the natural log price of A at timestamp t, $\log(p_t^B)$ is the natural log price of B at timestamp t, and b is given by the following expression:
$$b = \frac{\mathrm{std}(lr(A))}{\mathrm{std}(lr(B))} \tag{2}$$
where lr stands for the natural logarithmic returns of each cryptocurrency. Note that, to compute std, we considered the preceding 30 days (720 hourly timestamps). Thus, b represents the ratio of the 30-day volatility of cryptocurrency A to that of B. The reason for this choice of b is that, when we buy the pair, we short sell b shares of B for each share of A that we buy. Therefore, by adopting this definition of b, we attempt to minimize the differences in volatility between the position in cryptocurrency A and the position in cryptocurrency B.
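The spread of Equations (1) and (2) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `spread` is hypothetical, NumPy is assumed to be available, and the exact alignment of the 30-day window (here, the returns realized up to and including hour t) is an assumption, since the text only states that the preceding 30 days were used.

```python
import numpy as np

def spread(prices_a, prices_b, window=720):
    """Spread s_t = log(p_t^A) - b_t * log(p_t^B) for two aligned hourly price series.

    b_t = std(lr(A)) / std(lr(B)) over the `window` most recent log returns
    (720 hours = 30 days). Entries without enough history are NaN.
    """
    log_a = np.log(np.asarray(prices_a, dtype=float))
    log_b = np.log(np.asarray(prices_b, dtype=float))
    # lr[k] is the log return realized at hour k + 1
    lr_a, lr_b = np.diff(log_a), np.diff(log_b)
    s = np.full(log_a.shape, np.nan)
    for t in range(window, len(log_a)):
        # Assumption: the window covers returns realized up to and including hour t.
        b = np.std(lr_a[t - window:t], ddof=1) / np.std(lr_b[t - window:t], ddof=1)
        s[t] = log_a[t] - b * log_b[t]
    return s
```

Because b is a ratio of standard deviations, the choice of `ddof` does not affect the result as long as it is the same for both series.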

2.2. The Hurst Exponent as a Mean Reversion Indicator

In time-series analysis, the Hurst exponent is a widely used measure of the long-term memory of a time series. As such, H indicates the extent to which a time series is chaotic and unpredictable. Several methods exist to calculate H, including re-scaled range (RS) [20], detrended fluctuation analysis (DFA) [34,35], wavelet transforms [36], and the generalized Hurst exponent (GHE) [37].
In this work, we used GHE to measure the long-term memory of the spread of two price time series. GHE is based on the scaling behavior of the statistic:
$$K_q(\tau) = \frac{\langle |X(t+\tau) - X(t)|^q \rangle}{\langle |X(t)|^q \rangle}, \tag{3}$$
which scales as
$$K_q(\tau) \propto \tau^{qH}, \tag{4}$$
where $\tau$ is the time scale and can vary between 1 and $\tau_{max}$, H is the Hurst exponent, and q represents the order of the moment considered. H is calculated by taking logarithms in Equation (4) for different values of $\tau$. In this work, we use $\tau = 2^n$ ($n = 0, 1, \ldots, \log_2(N) - 2$) and $q = 1$, where $H_1$ is the closest estimation to the classical Hurst exponent [38]. We choose $q = 1$, which captures the scaling properties of the absolute deviations of the time series, because we are not interested in the multi-fractal features of GHE.
H ranges between 0 and 1. H = 0.5 means that there is no memory; H < 0.5 indicates that the series is anti-persistent and is expected to display MR; and H > 0.5 indicates that the series is persistent, being more likely to maintain the trend.
To prevent using future values when computing H from the time series, we calculated a local H exponent according to a rolling window of 24, 72, 120, 168, 240, 336, 504, and 672 h, which is equivalent to 1, 3, 5, 7, 10, 14, 21, and 28 days, and that ends one hour before the measurement timestamp. By doing this, we ensured that we only used past records to obtain each value of H.
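As a rough sketch of how the local GHE estimate described above could be computed, the snippet below fits the log-log slope of K_q(τ) against τ for τ = 2^n. The function names `hurst_ghe` and `local_hurst` are hypothetical, NumPy is assumed, and rounding log2(N) down to an integer is an assumption (the text does not say how non-integer values are handled). A constant series would make K_q zero and break the logarithm, so this sketch assumes a non-degenerate input.

```python
import numpy as np

def hurst_ghe(x, q=1):
    """Estimate H from the scaling K_q(tau) ~ tau^(q*H), with tau = 2^n."""
    x = np.asarray(x, dtype=float)
    # Assumption: log2(N) is rounded down, so n = 0, 1, ..., floor(log2(N)) - 2
    n_max = int(np.log2(len(x))) - 2
    taus = 2 ** np.arange(0, n_max + 1)
    denom = np.mean(np.abs(x) ** q)
    k_q = np.array([np.mean(np.abs(x[tau:] - x[:-tau]) ** q) / denom for tau in taus])
    # Slope of log K_q versus log tau equals q * H
    slope, _intercept = np.polyfit(np.log(taus), np.log(k_q), 1)
    return slope / q

def local_hurst(spread, t, window):
    """Local H at hour t from the `window` observations ending one hour before t."""
    return hurst_ghe(spread[t - window:t])
```

For example, `local_hurst(s, t, 168)` would use only the 168 hourly spread values strictly before hour t, which mirrors the look-ahead protection described in the text.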
Finally, we tested whether the spread of a pair was more likely to display MR properties during anti-persistent periods (H < 0.5) than during persistent or random periods (H ≥ 0.5). Thus, for each hour i, we measured the time taken for the spread s to cross its long-term mean value m as
$$HMR = j^* - i \tag{5}$$
where $j^*$ is the minimum j that satisfies
$$\begin{cases} s_j \le m_i, & \text{if } s_i \ge m_i \\ s_j \ge m_i, & \text{if } s_i < m_i \end{cases} \tag{6}$$
where $j \ge i$ and m represents the moving average of the spread, calculated with a rolling window of 1, 3, 5, 7, 10, 14, 21, and 28 days, matching the time window used for the calculation of H.
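The definition of HMR above translates into a simple forward scan. The following sketch uses a hypothetical function name and assumes that `spread` and `mean` are equally long hourly sequences; returning `None` when the spread never crosses the mean within the available data is also an assumption, as the text does not say how such cases were handled.

```python
def hours_to_mean_reversion(spread, mean, i):
    """Return HMR = j* - i, where j* >= i is the first hour at which the
    spread crosses m_i, the moving average observed at hour i."""
    m_i = mean[i]
    starts_above = spread[i] >= m_i
    for j in range(i, len(spread)):
        crossed = spread[j] <= m_i if starts_above else spread[j] >= m_i
        if crossed:
            return j - i
    return None  # assumption: no crossing within the available data
```

For instance, with `spread = [3, 2.5, 2, 0.5, 1]` and a constant mean of 1, the observation at hour 0 starts above the mean and first reaches it at hour 3, so HMR is 3.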

2.3. Natural Experiment

To test whether MR happens faster when H < 0.5, we ran a natural experiment (or quasi-experimental design) [32,39,40,41,42]. The goal was to compare the time that the spread of a pair takes to revert to its mean value under anti-persistent regimes with the time it takes under persistent or random regimes. Thus, we considered the number of hours for the spread time series to cross its long-term mean value, i.e., hours to mean reversion (HMR), as our response variable. Then, observations where H < 0.5 were assigned to the treatment group, while observations where H ≥ 0.5 were assigned to the control group.
Ideally, the observations of the control group needed to be as similar as possible to the treatment group during the pre-intervention period. In our case, the intervention corresponded to the anti-persistent value of H. Thus, to satisfy this condition, each observation in the treatment group was randomly paired with a control that satisfied the following conditions:
  • Belonged to the same pair.
  • The timestamp was within a 3 to 30-day interval around the treatment timestamp. By matching observations that are not far away in time, we minimized the variability of the environmental factors surrounding the market, along with the cryptocurrencies’ properties (such as the age of the coin). In addition, we did not match observations that were less than 3 days apart because the response variable (HMR) would have been too closely related.
  • Had the same level of volatility at that timestamp. In particular, we averaged the 30-day volatility of the cryptocurrencies that formed the pair, and classified each observation into low, medium, or high volatility. We considered avg. volatility < 0.005 as low volatility and avg. volatility > 0.015 as high volatility. Otherwise, we classified the observation as medium volatility.
  • Additionally, when performing the experiments where we also controlled for the degree of co-movement, we only matched pairs classified into the same co-movement category (these categories are reported in Table A2) at the timestamp when the trade was opened.
This matching process resulted in a final data sample of more than 8 billion control–treatment pairings.
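The matching conditions listed above can be illustrated with a short sketch. The dictionary keys (`pair`, `hour`, `H`, `avg_vol`), the function names, and the pair labels in the usage note are hypothetical; treating the 3-day and 30-day limits as inclusive is an assumption, and the optional co-movement category condition is left out for brevity.

```python
import random

def volatility_bucket(avg_vol):
    """Classify the pair's average 30-day volatility using the thresholds in the text."""
    if avg_vol < 0.005:
        return "low"
    if avg_vol > 0.015:
        return "high"
    return "medium"

def eligible_controls(treated, observations, min_hours=3 * 24, max_hours=30 * 24):
    """Control observations (H >= 0.5) that satisfy the matching conditions."""
    bucket = volatility_bucket(treated["avg_vol"])
    return [
        obs for obs in observations
        if obs["H"] >= 0.5                      # control group
        and obs["pair"] == treated["pair"]      # same pair
        and min_hours <= abs(obs["hour"] - treated["hour"]) <= max_hours
        and volatility_bucket(obs["avg_vol"]) == bucket
    ]

def match_control(treated, observations, rng=random):
    """Randomly pick one eligible control, or None if there is none."""
    candidates = eligible_controls(treated, observations)
    return rng.choice(candidates) if candidates else None
```

A treated observation such as `{"pair": "BTC-ETH", "hour": 1000, "H": 0.42, "avg_vol": 0.010}` would then be paired with a randomly chosen control from the same pair, 72 to 720 hours away, in the medium-volatility bucket.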
Finally, we performed a paired t-test and a Wilcoxon signed-rank test to determine whether HMR differed significantly between the two groups. In our data, the paired differences in HMR were approximately normal (although they did not fit perfectly). Thus, to make the results more robust, we chose to apply both the paired t-test and the Wilcoxon signed-rank test. Note that we performed these tests for the full data sample, plus separate tests also controlling for the degree of co-movement.

2.4. Pairs Trading Strategy

As stated before, pairs trading is a quantitative trading strategy based on the assumption that two assets that historically move in sync will revert to their historical relationship over time. Thus, when their prices diverge, there is a trading opportunity involving simultaneously taking a long position on the low priced asset and a short position on the high priced one. If the observed pattern holds, it will not take long for both prices to converge again and the trade will result in profit. Otherwise, the trade will result in a loss.
In this work, we used H < 0.5 as a trading signal to open a trade. Then, if the spread s was between its mean value m plus one standard deviation of m − s and m plus two standard deviations of m − s, we sold the pair. This was equivalent to selling cryptocurrency A and buying cryptocurrency B, according to Equation (1). In contrast, if the spread s was between its mean value m minus one standard deviation of m − s and m minus two standard deviations of m − s, we bought the pair. This was equivalent to buying cryptocurrency A and selling cryptocurrency B, according to Equation (1). These rules can be expressed mathematically as follows:
  • if m + std(m − s) < s < m + 2 std(m − s) and H < 0.5, sell the pair.
  • if m − std(m − s) > s > m − 2 std(m − s) and H < 0.5, buy the pair.
Then, the trade was closed when one of the following conditions was met:
  • The spread, s, of the pair reverted to its mean value.
  • The spread, s, of the pair deviated more than two std(m − s) from its mean value m.
  • The duration of the trade exceeded 3 days (72 h), reaching its expiration date.
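The entry and exit rules above can be written as two small functions. This is only a sketch: the function names and the returned labels are hypothetical, `sd` stands for std(m − s), and the order in which the exit conditions are checked is an assumption, since the text does not specify which one takes precedence when several hold at once.

```python
def entry_signal(s, m, sd, h):
    """Return 'sell', 'buy', or None for spread s, mean m, sd = std(m - s), and local H."""
    if h >= 0.5:
        return None          # only anti-persistent regimes trigger a trade
    if m + sd < s < m + 2 * sd:
        return "sell"        # sell A, buy B
    if m - 2 * sd < s < m - sd:
        return "buy"         # buy A, sell B
    return None

def exit_reason(side, s, m, sd, hours_open, max_hours=72):
    """Return why an open trade should be closed, or None to keep it open."""
    reverted = s <= m if side == "sell" else s >= m
    if reverted:
        return "mean_reversion"
    if abs(s - m) > 2 * sd:
        return "divergence"  # spread moved more than two std(m - s) from the mean
    if hours_open > max_hours:
        return "expired"     # trade exceeded 3 days (72 h)
    return None
```

For example, `entry_signal(1.5, 0.0, 1.0, 0.4)` yields a sell signal, while the same spread with H = 0.6 yields no trade.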
One of the key points of this strategy is how to select the pairs so that it is profitable. Taking this into account, we back-tested our strategies by selecting monthly portfolios made up of the top 20 pairs according to their degree of co-movement. To do so, we ranked the pairs according to the following metrics:
  • Correlation. Correlation is a statistical measure that describes the extent to which two variables are linearly related. It quantifies the strength and direction of the relationship between the variables. In that sense, the higher the correlation coefficient is, the more the variables move in sync. We used the Pearson correlation coefficient, which is the most commonly used measure of correlation.
  • Cointegration. The cointegration approach was introduced by Engle and Granger [43], and it states that two variables X and Y are cointegrated if there exists β such that the linear combination X_t − β Y_t is a stationary process. We used the ordinary least squares (OLS) method to estimate the regression parameters. Then, we used the Augmented Dickey–Fuller (ADF) test to verify whether the residual was stationary or not, and therefore whether the two assets were cointegrated. Thus, we selected the pairs with the lowest p-values in the ADF test, since the residuals of these pairs are the most likely to be stationary. In contrast to correlation, which considers movements in returns and therefore is a short-term relationship, cointegration specifies co-movement of prices and is a long-term relationship.
  • Dynamic time warping (DTW). DTW is an algorithm for measuring similarity between two temporal sequences, which may vary in speed. It works by identifying an optimal match between the sequences, stretching or compressing different sections of the time series. Then, the distance measure quantifies how similar the sequences are, taking into account the alignment cost computed as the sum of absolute differences for each matched pair of indices. We used a Python implementation of FastDTW [44], which is an approximate DTW algorithm that provides optimal or near-optimal alignments with lower time and memory complexity than DTW. The best pair is the one with the lowest distance between its returns, since this means that the coins move in sync and there is a high degree of co-movement between them.
  • Mutual Information (MI). MI is a measure of the mutual dependence between two variables. It quantifies the amount of information obtained about one variable through the other variable. In other words, it measures how much knowing one variable reduces uncertainty about the other. MI is equal to zero if and only if the two random variables are independent, and higher values mean higher dependency. For its calculation, we used the mutual_info_regression function of the Python library scikit-learn, which relies on nonparametric methods based on entropy estimation from k-nearest neighbors distances.
  • Hurst Exponent. Defined in Section 2.2.
Additionally, we compared the results with a strategy that used a random selection of the 20 pairs for every month. This random selection was repeated 50 times to avoid any bias.
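To illustrate the pair selection step, the sketch below ranks all pairs by Pearson correlation of their returns and keeps the top k. The function name and the input layout (a dictionary from symbol to return series) are hypothetical, and NumPy is assumed. The other metrics described above would follow the same pattern with a different score and, for ADF p-values and DTW distances, ascending rather than descending order.

```python
from itertools import combinations

import numpy as np

def top_pairs_by_correlation(returns, k=20):
    """Rank all pairs by Pearson correlation of their returns and keep the top k.

    `returns` maps each symbol to a 1-D sequence of returns of equal length.
    """
    scores = {}
    for a, b in combinations(sorted(returns), 2):
        scores[(a, b)] = float(np.corrcoef(returns[a], returns[b])[0, 1])
    # Higher correlation means a higher degree of co-movement
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

In a monthly back-test, this function would be called once per month on the returns available up to that point, and the resulting list would form the portfolio for the following month.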

3. Results

We began by asking whether MR of the spread between pairs happens faster when the value of H is below 0.5 (anti-persistent) or not. To this end, we conducted a natural experiment as described in Section 2. In this experiment, the treatment represents timestamps where the considered pair satisfies H < 0.5, and we repeated the experiment for several values of the time window (TW) used to compute the local H. The results are summarized in Figure 1, and full details of the experiment are given in Table A3. We found that the effect of H < 0.5 was significant (p-value < 0.001) for all the considered values of TW (1, 3, 5, 7, 10, 14, 21, and 28 days), and in all cases, MR happened much faster for the treatment group than for the control group. For example, for TW = 5 days, the median difference in the time taken to MR between the two groups was approximately 18 h in favor of the treatment group.
Next, in search of an optimal configuration of H, we explored the effect that the TW chosen to compute the local H had on MR. In Figure 1 we represent the difference in HMR between the control and treatment groups for different values of TW. On the one hand, larger differences in HMR between the two groups are preferable, as they better differentiate potentially profitable trades from losses. On the other hand, the more observations that satisfy the anti-persistence condition (H < 0.5), the more trading opportunities there are. Thus, we needed to find values of TW that resulted in an equilibrium between both quantities. As Figure 1 shows, the number of observations with H < 0.5 decreased with TW, while the difference in HMR between the groups increased with TW, reaching its maximum at TW = 10 days, which represented a turning point. We identified that the region between TW = 3 and TW = 10 had a good balance and that one could choose TW within that range according to whether one preferred more observations with H under 0.5, and therefore more trading opportunities, or a higher HMR difference between the two groups. For the remainder of the section we focus on TW = 7 as it presents a difference in HMR between the groups very close to the maximum, while at the same time keeping a reasonable fraction of observations with H < 0.5.
A widely used pair selection method relies on measuring the co-movement of the pairs and choosing the most related ones without relying on a fixed threshold. Following this line of research, we tested whether the observed effect of H < 0.5 on HMR is independent of the level of co-movement between the pairs or if, on the contrary, it disappears for the most related pairs, as the effect could already be captured by the co-movement metric. To this end, we repeated the natural experiment, but this time we also controlled for the degree of co-movement between the pairs. To measure the co-movement, we used four different metrics: correlation, cointegration, DTW, and MI. More details about the process can be found in Section 2. Figure 2 shows the results of the experiment separated by levels of co-movement considering different metrics: correlation (Figure 2A), cointegration (Figure 2B), MI (Figure 2C), and DTW (Figure 2D). We found that the effect was significant (see Table A4) across all levels of co-movement for the four considered metrics. Thus, our results suggest that the effect of H < 0.5 on reducing the average time to MR is universal. Moreover, in general, we did not find dramatic differences across pairs with a higher or lower degree of co-movement. Nevertheless, we identified some differences that are worth mentioning. For MI, the treatment effect was slightly larger for pairs exhibiting a higher degree of co-movement. Thus, in this case, one expects that the H < 0.5 indicator would work better when selecting the pairs with the highest co-movement degree. By contrast, when considering cointegration as the co-movement metric, we found larger effects for the most related and the least related pairs, while the effect was reduced for intermediate values of cointegration.
To further evaluate the potential benefits of using anti-persistent values of H as a trading signal for pairs trading, we simulated several strategies that incorporated this indicator. All these strategies are described in detail in Section 2; we summarize them here. The first of these strategies (random) consisted of selecting 20 random pairs and opening a position on each pair when the spread deviated one std(m − s) from its mean and H < 0.5. The operation was closed either when the spread reverted to the mean, when it deviated more than two std(m − s) from it, or when the trade expired after exceeding 72 h. Next, we built five more sophisticated strategies that had the same entry and exit conditions, but with the difference that the 20 pairs were not selected randomly. Instead, we ordered the pairs by level of co-movement and selected the top 20, repeating this selection monthly. In this way, we generated a strategy for each of the co-movement metrics. The results of the back-testing for all six strategies are represented in Figure 3 and Table A5. According to our back-testing simulations, all six strategies are profitable, and the most profitable one is cointegration. Hence, our results show that H < 0.5 represents an informative signal for pairs trading, as it is not only associated with a faster MR, but is also capable of generating profits regardless of how the pairs are selected.
Furthermore, we evaluated the duration of the trades executed on these strategies and compared them to the duration of potential trades where H does not satisfy the anti-persistence condition. To do this, we calculated the distribution of the duration of the trades executed for each of the strategies, which effectively reached MR. That is, we excluded the trades in which the spread diverged by more than 2 × std(m − s) and trades closed due to the expiration limit. Then, in the second step, we identified all the potential trades where all the conditions to open the trade were satisfied except for H < 0.5, and also computed the resulting distribution of the duration of these potential trades. Finally, we compared both distributions for all of the strategies, and the results are summarized in Figure 4, which shows the box-plots of each group. As can be seen, the operations triggered by H < 0.5 were closed significantly faster than potential trades not satisfying the H < 0.5 condition. Moreover, once again, this result was independent of the method used to select the pairs, as the effect was significant and equally large across the five co-movement metrics. In general, the median duration of trades that reached MR was around 13 h for our strategies, and increased up to 22 h for trades not satisfying the H < 0.5 condition.
Finally, we explored the potential benefits that incorporating H as an indicator has on enabling the construction of larger portfolios in pairs trading. If there are too many pairs in a trading portfolio, there is a risk that at some point there will be too many open trades simultaneously. Since the trading budget is usually limited, this translates into missed trading opportunities. Here, we studied the dependence of Maximum Simultaneous Open Trades (MSOT) on the portfolio size for two scenarios: when H < 0.5 is used, and the regular case where H < 0.5 is not used. To this end, we simulated for each month the pairs trading strategies described above for different portfolio sizes and compared the results obtained with and without H < 0.5 as a trigger indicator. In this way, for each strategy, month, and portfolio size, we obtained two values of MSOT: the first one for operations that satisfy H < 0.5, and the second one for the regular case where H is not used. The results are summarized in Figure 5, where panels A, B, C, D, and E show the results for H, correlation, MI, DTW, and cointegration, respectively, while panel F illustrates the difference in MSOT between the regular case and H < 0.5 for each strategy. We found that the number of parallel open trades expected when using H < 0.5 was significantly lower, and that the difference in MSOT between them increased proportionally to the portfolio size. In addition, we found that this effect was independent of the co-movement metric used to choose the pairs. Carrying out a deeper analysis, we observed that the difference in MSOT between the H < 0.5 case and the regular case grew approximately linearly with the portfolio size for the five studied cases. In the case of the H strategy, this relation is given by
$$\Delta MSOT = 0.3 \times \mathit{PortfolioSize} \tag{7}$$
This finding implies that when considering H < 0.5 as a trading signal, for each additional parallel trade that the strategy budget can afford, the portfolio size could be increased by three units. For example, when limiting the number of parallel open trades to 20, the maximum portfolio size has a limit of 32 for the regular case, and increases up to 66 for H < 0.5 . By the same token, when the limit on the number of parallel open trades is extended up to 40, the maximum portfolio size has a limit of 65 for the regular case, and increases up to 132 for H < 0.5 .

4. Discussion

The idea behind pairs trading is to track two assets whose prices move in harmony, detect when they diverge, and simultaneously open a long position on the loser and a short position on the winner, betting that the series will converge back to their equilibrium. In this work, we contribute to the understanding of pairs trading strategies and their profitability by showing how H can serve as a trade alert that detects trades where MR happens faster than average. This faster MR arises because short-term MR occurs much more frequently when H < 0.5 than for other values of H. This is related to the fact that H < 0.5 indicates that a series is anti-persistent at that moment and is expected to return to its average.
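To make the role of H < 0.5 as a trade alert concrete, the sketch below estimates H from how the standard deviation of lagged differences scales with the lag, and uses it as a gate. This is one simple, commonly used estimator chosen for illustration; it is not necessarily the estimator used in this paper, and the function names, the synthetic series, and the `max_lag` default are all assumptions.

```python
import math
import random
import statistics


def hurst_exponent(series, max_lag=20):
    """Estimate H from the scaling std(x[t + lag] - x[t]) ~ lag ** H."""
    log_lags, log_stds = [], []
    for lag in range(2, max_lag):
        diffs = [series[i + lag] - series[i] for i in range(len(series) - lag)]
        log_lags.append(math.log(lag))
        log_stds.append(math.log(statistics.pstdev(diffs)))
    # Ordinary least-squares slope of log(std) against log(lag).
    mean_x = statistics.fmean(log_lags)
    mean_y = statistics.fmean(log_stds)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_lags, log_stds))
    den = sum((x - mean_x) ** 2 for x in log_lags)
    return num / den


def is_anti_persistent(spread, threshold=0.5):
    """Hypothetical trade filter: True when the estimated H is below 0.5."""
    return hurst_exponent(spread) < threshold


rng = random.Random(0)  # fixed seed so the synthetic series are reproducible

# Random walk: increments are independent, so H should be near 0.5.
random_walk = [0.0]
for _ in range(5000):
    random_walk.append(random_walk[-1] + rng.gauss(0.0, 1.0))

# AR(1) series pulled back toward zero: a simple mean-reverting "spread".
mean_reverting = [0.0]
for _ in range(5000):
    mean_reverting.append(0.5 * mean_reverting[-1] + rng.gauss(0.0, 1.0))

print("random walk H    :", round(hurst_exponent(random_walk), 2))
print("mean-reverting H :", round(hurst_exponent(mean_reverting), 2))
print("open trade on mean-reverting spread?", is_anti_persistent(mean_reverting))
```

In a strategy, such a gate would be evaluated on the spread of a pair over a rolling time window at the moment the usual divergence signal fires, and the position would be opened only when the gate returns true.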
Most of the current scientific literature in this field focuses on developing and evaluating methods to measure the co-movement between pairs in order to select a portfolio of pairs for the strategy, and on testing the results through back-testing. Thus, despite significant advances in the methodologies used to select the pairs and in their relation with the expected returns, there is still little understanding of which factors actually determine the success or failure of the strategies. We aim to partially fill this gap by investigating what causes MR in the cryptocurrencies market. In this respect, we propose the use of anti-persistent values of H as an indicator. We do so because H measures the long-term memory and fractality of a time series, and anti-persistent time series are expected to display mean reversion. To test whether the effect is significant, we conducted a natural experiment that allowed us to search for a cause–effect relationship between anti-persistent values of H and an increase in the speed of MR, while controlling for environmental variables. Our results show that for positions opened when H < 0.5, MR happens faster and therefore the trade duration is shorter. Moreover, we show that this effect persists even after controlling for the degree of co-movement of the pair. This finding suggests that periodically re-balancing the portfolio to select the top related pairs is not enough to maximize the profitability of pairs trading. Thus, the results of this paper are of interest for professional and private investors developing pairs trading strategies. As ref. [33] points out, before designing a trading strategy, it is vital to formulate hypotheses and test them through experiments and statistical tests rather than by means of back-tests. In this respect, we have contributed to understanding what causes a faster MR, and our conclusions can be used to design strategies where trading commissions are reduced due to the execution of shorter trades.
The duration of trades is a key factor in the success of pairs trading for two reasons. Firstly, profits in pairs trading have been reported to be very sensitive to transaction costs, and short-selling fees increase proportionally with the time that the pair stays open. Hence, closing operations faster can significantly reduce fees and prevent the strategy from failing because of them. Secondly, the amount locked in the long and short positions is not available for other operations. This can drastically limit the number of trades that are actually executed, especially when trading with a small budget.
Our results also indicate that the design of pairs trading strategies should take the portfolio size into account. Most pairs trading strategies select a portfolio of pairs and open an operation each time a pair deviates from its mean, without relying on extra information about the chances that the trade will result in a profit. Alternatively, we propose the use of H as a complementary indicator to identify opportunities where MR is more probable, reducing the number of trades on each pair and focusing on the fastest ones. Thus, our approach allows a larger portfolio without varying the trade size. We analyzed the relation of MSOT with the portfolio size and showed that the number of parallel open trades is reduced significantly when the value of H is taken into account. Thus, although portfolio optimization is beyond the scope of this paper, our results represent a relevant contribution that can help private and professional traders design pairs trading strategies with larger portfolios. In fact, larger and more diversified portfolios bring important benefits related to risk reduction [45].
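The text does not spell out how MSOT is computed. One straightforward way to obtain it from a trade log is an event sweep, sketched below. The trade records, their `(open_time, close_time, H at opening)` layout, and the function name are hypothetical and serve only to illustrate how gating on H < 0.5 lowers the peak number of parallel trades.

```python
def max_simultaneous_open_trades(trades):
    """Peak number of trades open at once, given (open_time, close_time) pairs."""
    events = []
    for open_time, close_time in trades:
        events.append((open_time, 1))    # a trade opens
        events.append((close_time, -1))  # a trade closes
    # Tuples sort by time first; at equal times -1 sorts before +1, so a
    # trade closing at time t frees its slot before another opens at t.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical trade log: (open_time, close_time, H at opening).
TRADES = [(0, 5, 0.42), (1, 3, 0.55), (2, 4, 0.61), (5, 6, 0.47)]

msot_regular = max_simultaneous_open_trades([(o, c) for o, c, _ in TRADES])
msot_gated = max_simultaneous_open_trades(
    [(o, c) for o, c, h in TRADES if h < 0.5]
)
print("MSOT regular:", msot_regular)
print("MSOT with H < 0.5:", msot_gated)
```

Running the same computation per strategy, month, and portfolio size would yield the two MSOT values that are compared in Figure 5.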
Finally, we discuss future lines of research related to the results of this paper. Our findings regarding the impact that anti-persistent values of H have on MR for pairs in the cryptocurrencies market encourage us to explore their extension to other markets. In particular, it would be of great interest to explore the stock and Forex markets. In addition, we back-tested several strategies incorporating H as a trading signal and found that all of them were profitable between 2019 and 2024. The best-performing strategy selected the pairs to trade using cointegration as the co-movement metric. However, as already discussed, finding profits in a back-test is no guarantee, as its only role is to discard bad strategies. Hence, further research should be conducted to answer the question of whether, once H < 0.5 is used as a trading alert, the way the portfolio of pairs is chosen is still relevant.
In this paper, we focused on short-term MR rather than on long-term MR, and have not analyzed whether MR is sustained. We did so because, to build a profitable pairs trading strategy that minimizes commission fees, it is critical that the spread returns to its mean as fast as possible, regardless of whether MR persists in time. However, short-term MR and long-term MR are not necessarily coupled, and understanding sustained MR is crucial for long-term investors. Hence, we plan to extend the current research to explore the relationship between short-term MR and sustained MR. This will allow us to gain a deeper understanding of the effects of long-term memory on the cryptocurrency market.

Author Contributions

Conceptualization, M.G. and J.B.; Methodology, M.G. and J.B.; Validation, M.G., F.B., J.C.L. and J.B.; Formal analysis, M.G. and J.B.; Investigation, M.G. and J.B.; Resources, F.B.; Data curation, M.G.; Writing—original draft, M.G. and J.B.; Writing—review & editing, M.G., F.B. and J.B.; Visualization, M.G.; Supervision, J.B.; Project administration, J.C.L.; Funding acquisition, J.C.L. and J.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been supported by the Spanish Ministry of Science and Innovation under Contract No. PID2021-122711NB-C21, and by the DG of Research and Technological Innovation of the Community of Madrid (Spain) under Contract No. IND2022/TIC-23716.

Data Availability Statement

The data presented in this study are available on reasonable request from the corresponding author. However, all the data used are publicly accessible from Binance through its API.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Tables

Table A1. Name and symbol of the 20 cryptocurrencies used to form the pairs. The last column reports the first day that this cryptocurrency appears in our dataset.
| Name | Symbol | First Date |
|---|---|---|
| Bitcoin | BTC | 1 January 2019 |
| Ethereum | ETH | 1 January 2019 |
| Binance Coin | BNB | 1 January 2019 |
| Solana | SOL | 11 August 2020 |
| Ripple | XRP | 1 January 2019 |
| Dogecoin | DOGE | 5 July 2019 |
| Cardano | ADA | 1 January 2019 |
| Shiba Inu | SHIB | 10 May 2021 |
| Avalanche | AVAX | 22 September 2020 |
| Polkadot | DOT | 18 August 2020 |
| Chainlink | LINK | 16 January 2019 |
| Tron | TRX | 1 January 2019 |
| Bitcoin Cash | BCH | 28 November 2019 |
| Near | NEAR | 14 October 2020 |
| Polygon | MATIC | 26 April 2019 |
| Uniswap | UNI | 17 September 2020 |
| Litecoin | LTC | 1 January 2019 |
| Internet Computer | ICP | 11 May 2021 |
| Ethereum Classic | ETC | 1 January 2019 |
| Hedera | HBAR | 29 September 2019 |
Table A2. Categories of co-movement used to classify the observations regarding correlation, cointegration, MI, and DTW. In the case of cointegration, this value refers to the p-value of the ADF test.
| Category | Correlation | Cointegration | MI | DTW |
|---|---|---|---|---|
| 1 | [0, 0.5) | [0, 0.01) | [0, 1) | [0, 2.5) |
| 2 | [0.5, 0.6) | [0.01, 0.05) | [1, 1.2) | [2.5, 3.2) |
| 3 | [0.6, 0.7) | [0.05, 0.1) | [1.2, 1.5) | [3.2, 4) |
| 4 | [0.7, 0.8) | [0.1, 0.5) | [1.5, 1.8) | [4, 5) |
| 5 | [0.8, 1] | [0.5, 1] | [1.8, 3] | [5, 15] |
Table A3. Results of the t-test and Wilcoxon signed-rank test comparing the HMR of observations with H < 0.5 and H ≥ 0.5. TW indicates the time window used to compute H. In all cases, p-value < 0.001, which indicates that the differences are significant.
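| TW | Paired t-test: t | Paired t-test: p-value | Wilcoxon signed-rank test: W | Wilcoxon signed-rank test: p-value |
|---|---|---|---|---|
| 1 | 53.54 | 0.0 | 61,417,930,239.5 | 0.0 |
| 3 | 65.73 | 0.0 | 35,224,593,461.5 | 0.0 |
| 5 | 57.67 | 0.0 | 24,646,696,180.5 | 0.0 |
| 7 | 46.28 | 0.0 | 18,787,930,723.5 | 0.0 |
| 10 | 40.28 | 0.0 | 12,416,266,332.5 | 0.0 |
| 14 | 26.03 | $2.68 \cdot 10^{-149}$ | 8,578,829,181.0 | 0.0 |
| 21 | 14.75 | $3.15 \cdot 10^{-49}$ | 5,062,243,957.0 | $4.90 \cdot 10^{-129}$ |
| 28 | 13.32 | $1.89 \cdot 10^{-40}$ | 3,450,547,673.5 | $7.64 \cdot 10^{-114}$ |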
Table A4. Results of the t-test and Wilcoxon signed-rank test comparing the HMR of observations with H < 0.5 and H ≥ 0.5, and also controlling by the level of co-movement regarding different metrics: correlation, cointegration, MI, and DTW. In all cases, p-value < 0.001, which indicates that the differences are significant.
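| Metric | Paired t-test: t | Paired t-test: p-value | Wilcoxon signed-rank test: W | Wilcoxon signed-rank test: p-value |
|---|---|---|---|---|
| Correlation | 147.57 | 0.0 | 194,138,128,819.0 | 0.0 |
| Cointegration | 155.58 | 0.0 | 189,923,159,659.0 | 0.0 |
| MI | 145.61 | 0.0 | 180,748,717,882.5 | 0.0 |
| DTW | 152.36 | 0.0 | 198,569,581,457.5 | 0.0 |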
Table A5. Summary of the total investment amount, absolute profit, and percentage profit obtained with every strategy.
| Strategy | Investment ($) | Profit ($) | Profit (%) |
|---|---|---|---|
| Cointegration | 13.04 | 8.03 | 61.56 |
| MI | 11.70 | 4.98 | 42.57 |
| H | 18.44 | 7.13 | 38.67 |
| Correlation | 11.65 | 4.5 | 38.63 |
| DTW | 10.75 | 2.52 | 23.43 |

References

  1. Vidyamurthy, G. Pairs Trading: Quantitative Methods and Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2004; Volume 217.
  2. Elliott, R.J.; Van Der Hoek, J.; Malcolm, W.P. Pairs trading. Quant. Financ. 2005, 5, 271–276.
  3. Gatev, E.; Goetzmann, W.N.; Rouwenhorst, K.G. Pairs trading: Performance of a relative-value arbitrage rule. Rev. Financ. Stud. 2006, 19, 797–827.
  4. Broussard, J.P.; Vaihekoski, M. Profitability of pairs trading strategy in an illiquid market with multiple share classes. J. Int. Financ. Mark. Inst. Money 2012, 22, 1188–1201.
  5. Do, B.; Faff, R. Does simple pairs trading still work? Financ. Anal. J. 2010, 66, 83–95.
  6. Do, B.; Faff, R. Are pairs trading profits robust to trading costs? J. Financ. Res. 2012, 35, 261–287.
  7. Jacobs, H.; Weber, M. On the determinants of pairs trading profitability. J. Financ. Mark. 2015, 23, 75–97.
  8. Huck, N.; Afawubo, K. Pairs trading and selection methods: Is cointegration superior? Appl. Econ. 2015, 47, 599–613.
  9. Schmidt, A.D. Pairs trading: A cointegration approach. Trends Plant Sci. 2009, 24, P152–P164.
  10. Liew, R.Q.; Wu, Y. Pairs trading: A copula approach. J. Deriv. Hedge Funds 2013, 19, 12–30.
  11. Xie, W.; Liew, R.Q.; Wu, Y.; Zou, X. Pairs Trading with Copulas. J. Trading 2015, 11, 41–52.
  12. Ramos-Requena, J.P.; Trinidad-Segovia, J.; Sánchez-Granero, M. Introducing Hurst exponent in pair trading. Phys. A Stat. Mech. Its Appl. 2017, 488, 39–45.
  13. Bui, Q.; Ślepaczuk, R. Applying Hurst Exponent in pair trading strategies on Nasdaq 100 index. Phys. A Stat. Mech. Its Appl. 2022, 592, 126784.
  14. Rad, H.; Low, R.K.Y.; Faff, R. The profitability of pairs trading strategies: Distance, cointegration and copula methods. Quant. Financ. 2016, 16, 1541–1558.
  15. Ramos-Requena, J.P.; Trinidad-Segovia, J.E.; Sánchez-Granero, M.Á. Some notes on the formation of a pair in pairs trading. Mathematics 2020, 8, 348.
  16. Ko, P.C.; Lin, P.C.; Do, H.T.; Kuo, Y.H.; Huang, Y.F.; Chen, W.H. Pairs trading strategies in cryptocurrency markets: A comparative study between statistical methods and evolutionary algorithms. Eng. Proc. 2023, 38, 74.
  17. Krauss, C. Statistical arbitrage pairs trading strategies: Review and outlook. J. Econ. Surv. 2017, 31, 513–545.
  18. Huck, N. Pairs selection and outranking: An application to the S&P 100 index. Eur. J. Oper. Res. 2009, 196, 819–825.
  19. Huck, N. Pairs trading and outranking: The multi-step-ahead forecasting case. Eur. J. Oper. Res. 2010, 207, 1702–1716.
  20. Hurst, H.E. Long-term storage capacity of reservoirs. Trans. Am. Soc. Civ. Eng. 1951, 116, 770–799.
  21. Bowen, D.; Hutchinson, M.C.; O’Sullivan, N. High frequency equity pairs trading: Transaction costs, speed of execution and patterns in returns. J. Trading 2010, 5, 31–38.
  22. Farmer, J.D.; Sidorowich, J.J. Predicting chaotic time series. Phys. Rev. Lett. 1987, 59, 845.
  23. Sprott, J.C. Chaos and Time-Series Analysis; Oxford University Press: Oxford, UK, 2003.
  24. Greene, M.T.; Fielitz, B.D. Long-term dependence in common stock returns. J. Financ. Econ. 1977, 4, 339–349.
  25. Lillo, F.; Farmer, J.D. The long memory of the efficient market. Stud. Nonlinear Dyn. Econom. 2004, 8.
  26. Barkoulas, J.T.; Baum, C.F. Long-term dependence in stock returns. Econ. Lett. 1996, 53, 253–259.
  27. Kasman, S.; Turgutlu, E.; Ayhan, A.D. Long memory in stock returns: Evidence from the major emerging Central European stock markets. Appl. Econ. Lett. 2009, 16, 1763–1768.
  28. Fama, E.F. Efficient capital markets. J. Financ. 1970, 25, 383–417.
  29. Fama, E.F. Efficient capital markets: II. J. Financ. 1991, 46, 1575–1617.
  30. Kroha, P.; Skoula, M. Hurst Exponent and Trading Signals Derived from Market Time Series. In Proceedings of the 20th International Conference on Enterprise Information Systems (ICEIS 2018), Madeira, Portugal, 21–24 March 2018; pp. 371–378.
  31. Pérez-Sienes, L.; Grande, M.; Losada, J.C.; Borondo, J. The hurst exponent as an indicator to anticipate agricultural commodity prices. Entropy 2023, 25, 579.
  32. Rutter, M. Proceeding from observed correlation to causal inference: The use of natural experiments. Perspect. Psychol. Sci. 2007, 2, 377–395.
  33. De Prado, M.L. Advances in Financial Machine Learning; John Wiley & Sons: Hoboken, NJ, USA, 2018.
  34. Peng, C.K.; Buldyrev, S.V.; Havlin, S.; Simons, M.; Stanley, H.E.; Goldberger, A.L. Mosaic organization of DNA nucleotides. Phys. Rev. E 1994, 49, 1685.
  35. Hu, K.; Ivanov, P.C.; Chen, Z.; Carpena, P.; Stanley, H.E. Effect of trends on detrended fluctuation analysis. Phys. Rev. E 2001, 64, 011114.
  36. Simonsen, I.; Hansen, A.; Nes, O.M. Determination of the Hurst exponent by use of wavelet transforms. Phys. Rev. E 1998, 58, 2779.
  37. Barabási, A.L.; Vicsek, T. Multifractality of self-affine fractals. Phys. Rev. A 1991, 44, 2730.
  38. Di Matteo, T.; Aste, T.; Dacorogna, M.M. Long-term memories of developed and emerging markets: Using the scaling analysis to characterize their stage of development. J. Bank. Financ. 2005, 29, 827–851.
  39. Leatherdale, S.T. Natural experiment methodology for research: A review of how different methods can support real-world research. Int. J. Soc. Res. Methodol. 2019, 22, 19–35.
  40. DiNardo, J. Natural experiments and quasi-natural experiments. In Microeconometrics; Springer: Berlin/Heidelberg, Germany, 2010; pp. 139–153.
  41. Rosenzweig, M.R.; Wolpin, K.I. Natural “natural experiments” in economics. J. Econ. Lit. 2000, 38, 827–874.
  42. Butler, A.W.; Cornaggia, J. Does access to external finance improve productivity? Evidence from a natural experiment. J. Financ. Econ. 2011, 99, 184–203.
  43. Engle, R.F.; Granger, C.W. Co-integration and error correction: Representation, estimation, and testing. Econom. J. Econom. Soc. 1987, 55, 251–276.
  44. Salvador, S.; Chan, P. Toward accurate dynamic time warping in linear time and space. Intell. Data Anal. 2007, 11, 561–580.
  45. Elton, E.J.; Gruber, M.J. Risk reduction and portfolio size: An analytical solution. J. Bus. 1977, 50, 415–437.
Figure 1. The green line shows the dependence of the median difference in HMR between the control and treatment groups ($HMR_{control} - HMR_{treat}$) on TW. The blue line shows how the number of trading signals triggered by H < 0.5 decreases as a function of TW.
Figure 2. Median HMR for the treatment and control groups as a function of the co-movement (classified in five categories ordered from low level to high level of co-movement) according to several metrics: (A) correlation, (B) cointegration, (C) MI, and (D) DTW. Panel (E) shows the median difference in HMR between the control and treatment groups as a function of the degree of co-movement for the four metrics. The co-movement metric is color-coded.
Figure 3. Cumulative profit for the five strategies, plus the random version used in this work (see Section 2 for details).
Figure 4. Boxplots comparing the duration of trades, where MR actually happened, for positions opened when H < 0.5 and for positions opened when H ≥ 0.5.
Figure 5. Comparison of the median MSOT as a function of the portfolio size between the strategy including H < 0.5 as a trading signal and the regular case not considering H < 0.5. Each panel contains the results for each co-movement metric: (A) Hurst, (B) correlation, (C) MI, (D) DTW, and (E) cointegration. (F) represents the median difference in MSOT between the regular version of the strategy and the one considering H < 0.5 for the five co-movement metrics. The co-movement metric is color-coded.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Grande, M.; Borondo, F.; Losada, J.C.; Borondo, J. Anti-Persistent Values of the Hurst Exponent Anticipate Mean Reversion in Pairs Trading: The Cryptocurrencies Market as a Case Study. Mathematics 2024, 12, 2911. https://doi.org/10.3390/math12182911

AMA Style

Grande M, Borondo F, Losada JC, Borondo J. Anti-Persistent Values of the Hurst Exponent Anticipate Mean Reversion in Pairs Trading: The Cryptocurrencies Market as a Case Study. Mathematics. 2024; 12(18):2911. https://doi.org/10.3390/math12182911

Chicago/Turabian Style

Grande, Mar, Florentino Borondo, Juan Carlos Losada, and Javier Borondo. 2024. "Anti-Persistent Values of the Hurst Exponent Anticipate Mean Reversion in Pairs Trading: The Cryptocurrencies Market as a Case Study" Mathematics 12, no. 18: 2911. https://doi.org/10.3390/math12182911
