Signature-based portfolio allocation: a network approach
Applied Network Science volume 9, Article number: 54 (2024)
Abstract
Portfolio allocation represents a significant challenge within financial markets, traditionally relying on correlation or covariance matrices to delineate relationships among stocks. However, these methodologies assume time stationarity and only capture linear relationships among stocks. In this study, we propose to substitute the conventional Pearson’s correlation or covariance matrix in portfolio optimization with a similarity matrix derived from the signature. The signature, a concept from path theory, provides a unique representation of time series data, encoding their geometric patterns and inherent properties. Furthermore, we undertake a comparative analysis of network structures derived from the correlation matrix versus those obtained from the signature-based similarity matrix. Through numerical evaluation on the Standard & Poor’s 500, we show that portfolio allocation utilizing the signature-based similarity matrix yields superior results in terms of cumulative log-returns and Sharpe ratio compared to the baseline network approach based on Pearson’s correlation. This assessment was conducted across various portfolio optimization strategies. This research contributes to portfolio allocation and financial network representation by proposing the use of signature-based similarity matrices over traditional correlation or covariance matrices.
Introduction
Portfolio allocation is the process of constructing an investment portfolio by selecting a combination of assets that optimizes a suitable trade-off between risk and return (Prigent 2007). Traditionally, modeling the dependencies and risk among assets has relied on the covariance matrix or the Pearson’s correlation matrix. The former was introduced in the Mean-Variance model proposed by Markowitz (1952), which forms the basis of modern portfolio theory. Meanwhile, the Pearson’s correlation matrix is commonly employed in network-based portfolio models, where this correlation matrix is used to reproduce the dependencies among the assets using network theory in order to increase the number of choices in the portfolio selection process (Clemente et al. 2022). However, both matrices introduce biases in the portfolio model due to their assumptions of temporal stationarity and focus on capturing linear relationships in the data (Brockwell and Davis 2002). Temporal stationarity implies that the statistical properties of financial returns, such as their mean and variance, remain constant over time. Moreover, covariance and Pearson’s correlation matrices focus solely on analyzing linear relationships in the data, where changes in one variable are accompanied by proportional changes in another variable. To address these drawbacks, several variations and extensions of the original models were proposed in the literature, such as shrinkage estimation of the covariance matrix (Jorion 1985, 1986), which attempts to reduce biases associated with using historical data by imposing constraints on the moments and co-moments in the time series data. However, these approaches still rely on covariance or correlation matrices and are subject to issues of temporal stationarity and linear relationships, albeit to a lesser extent since they reduce the sampling error in using historical data.
In this research, we propose a paradigm shift by replacing the correlation or covariance matrix with a similarity matrix derived from the analysis of the so-called time series signature (Lyons 1998; Lyons et al. 2014). The time series signature, a concept derived from path theory, offers a structured and comprehensive representation of temporal evolution within a time series. In particular, the signature can be viewed as analogous to the Moment Generating Function (MGF), which is significant for comparing random variable distributions as it encodes all distribution moments into a single function uniquely characterizing the distribution itself (Resnick 2019). Its unique nature and ability to capture both temporal and geometric patterns make it a valuable tool for identifying community structure within a basket of time series, as demonstrated in Gregnanin et al. (2024). Identifying stock communities is particularly relevant for portfolio strategies and risk management tasks, as it can enhance portfolio diversification and reduce risk (Prigent 2007). Based on the unique properties of the time series signature, our proposed model begins by computing the signature for each considered time series. Subsequently, we apply a similarity function to derive a matrix that quantifies the relationships between the selected stocks. Before substituting the obtained similarity matrix for the correlation matrix traditionally used in the network-based portfolio framework, we filter out the noisy components of the similarity matrix to retain only the relevant information. Our replacement of the correlation matrix with the similarity matrix can be justified as follows. Traditional portfolio optimization methods rely on estimating either the covariance matrix, in the classical portfolio framework, or the correlation matrix, in the network portfolio framework. However, these estimations require the computation of a large number of pairwise coefficients, which can lead to highly unstable results (Zhang et al. 2021). The naive approach is to consider the historical covariance or correlation matrix, as in Markowitz’s framework (Markowitz 1952). In this research, we replace the correlation matrix with a signature-based similarity matrix. The time series signature is able to encode the information of a realization of a stochastic process (Lyons 1998; Lyons et al. 2014), allowing one to compute the pairwise coefficients using a similarity measure and thus mitigate the potential estimation errors that arise from the adoption of statistical models such as the shrinkage estimation of the covariance matrix (Jorion 1985, 1986). Finally, we demonstrate that the portfolio derived using the signature-based similarity matrix consistently achieves higher cumulative returns and Sharpe Ratio (SR) (Sharpe 1998) than the baseline models considered, albeit with higher volatility. Notably, the increased volatility is accompanied by a skewness close to 0 and a lower excess kurtosis of the obtained portfolio. This indicates that the distribution of the portfolio returns is closer to a normal distribution, which is significant from a risk management perspective.
In this framework, our contributions include:
1. Proposing and investigating a novel solution to the portfolio allocation task obtained using the signature-based similarity matrix in both modern portfolio theory and network-based portfolio models.
2. Analyzing the properties of the network derived from the signature-based similarity matrix and comparing them with the properties of the network derived from the correlation matrix.
3. Conducting a portfolio evaluation on the Standard and Poor’s 500 (S&P 500) index and illustrating how the portfolio construction changes from using the correlation matrix to exploiting the signature-based similarity matrix.
This article represents a thorough extension of our previous conference paper (Gregnanin et al. 2024). In that work, we demonstrated how, in the case of financial time series, the signature-based similarity matrix has the ability to find a better community structure (as measured through modularity) than the correlation matrix. In this study, we build upon these results and demonstrate their implications in terms of portfolio allocation. Additionally, we analyze the signature-based similarity matrix to ensure it satisfies suitable mathematical properties for its use in a portfolio allocation framework (see Proposition 1 in “Signature-based portfolio strategies” section). Then, we illustrate how to apply it in a network portfolio approach.
The remainder of the paper is structured as follows: the “Related work” section reviews relevant portfolio allocation models; the “Preliminaries” section defines the time series signature, reviews portfolio optimization models expressed in terms of Mean-Variance (and its variations) and network-based approaches, and outlines the process of deriving a graph from a basket of time series; the “Data collection” section illustrates the dataset used for the analysis; the “Why a signature-based similarity matrix? A network analysis” section investigates the network properties of graphs derived from the correlation matrix and the signature-based similarity matrix; the “Empirical evaluation” section presents the portfolio allocation analysis; and finally, the “Conclusion” section concludes the paper.
Related work
In this section, we provide a brief overview of classical portfolio allocation models and financial time series similarity measurement, followed by the development of models based on complex networks.
Modern portfolio theory, proposed by Markowitz (1952), utilizes the expected value and variance of portfolio returns to gauge portfolio performance. This approach formulates a bi-objective optimization problem, aimed at optimizing the trade-off between risk and returns to inform investment decisions. However, the model faces criticism on various fronts. Notably, estimation errors in mean and covariance can lead to poor out-of-sample portfolio performance. Additionally, the assumption of a normal distribution may not align with real-world data distributions, resulting in a biased estimate of the covariance matrix. A comprehensive discussion on the mean-variance model’s drawbacks can be found in Chung et al. (2022) and Kolm et al. (2014). To address these limitations, several variations and extensions were proposed in the literature. The Global Minimum Variance Portfolio (GMVP) model minimizes portfolio variance without considering portfolio returns. This strategy, extensively explored and supported by Jagannathan and Ma (2003), offers promising insights. Another widely used approach consists in maximizing the so-called Sharpe ratio, i.e., the risk-adjusted log-return (Sharpe 1998). Recently, Zhang et al. (2020) leveraged deep learning to optimize the Sharpe ratio directly for portfolio construction, surpassing benchmark strategies. The Equally Weighted Portfolio (EWP) model assigns equal weights to each underlying asset. Comparative studies between EWP and other portfolio models were conducted. Plyakha et al. (2015) demonstrated that equally weighted portfolios can outperform value-weighted portfolios, while Taljaard and Mare (2021) revealed that the equally weighted portfolio of stocks from the S&P 500 significantly underperformed market capitalization-weighted portfolios.
In terms of financial time series analysis, various methods have been explored in the literature for measuring similarity. These include Pearson’s correlation, mutual information, and dynamic time warping distance. For instance, Tian et al. (2022) employed Pearson’s correlation to assess stock similarity, followed by the construction of a dynamic graph to predict stock movements. In Feng et al. (2022), mutual information was utilized to measure stock similarity, leading to the creation of a graph for stock recommendation. Additionally, D’Urso et al. (2021) used dynamic time warping distance to cluster multivariate financial time series, identifying common time patterns. More recently, Gregnanin et al. (2024) introduced a signature-based matrix to measure stock similarity, subsequently employing it for community detection. Recent discussions on portfolio selection have explored network perspectives, representing the security market and interdependencies among returns using Pearson correlation. While traditional portfolio methods consider the entire covariance matrix, network-based approaches filter the correlation matrix to reduce noise and capture only relevant information. Without such filtering, a complete graph is obtained, indicating that all nodes are connected and irrelevant information is included. Various methods were proposed in the literature to filter the correlation matrix to retain only important correlations. In Tumminello et al. (2005, 2007), the correlation matrix was filtered using the Minimum Spanning Tree (MST) and the Planar Maximally Filtered Graph (PMFG). The MST yields a sub-graph where each stock is connected to only one other stock, capturing the most relevant correlations (Mantegna 1999; Tumminello et al. 2010). However, the MST does not consider cycles or cliques, potentially leading to the loss of important information. Conversely, the PMFG considers more links, allowing for cycles and cliques in the graph and containing the MST topology (Tumminello et al. 2005). While these filtering methods are typically used for studying risk propagation in financial systems, they were employed in Pozzi et al. (2013) to demonstrate that constructing a portfolio based on the peripheral nodes of the graph increases diversification while maintaining satisfactory returns. Despite their filtering ability, the PMFG and MST have a high computational complexity, equal to \(O(N^{3})\) and \(O(E \log N)\), respectively, where N is the number of stocks (nodes) and E is the number of edges of the original graph (Massara et al. 2016; Martel 2002). For these reasons, it is often preferred to filter the correlation matrix in portfolio allocation problems using the “Asset Graph” approach (Mantegna and Stanley 1999), where the entries of the correlation matrix are retained only if they exceed a predefined threshold value. Peralta and Zareei (2016) linked Markowitz’s model with network theory, illustrating that a network-based approach can enhance portfolio performance. Vỳrost et al. (2019) utilized centrality measures in financial graphs to adjust portfolio selection strategies, enhancing risk-return characteristics. Clemente et al. (2021) extracted dependence structures among assets using various methods to address asset allocation problems. Additionally, Clemente et al. (2022) compared network-based portfolios with traditional standard portfolio models on the S&P 100 index and on the world’s largest banks and insurance companies, highlighting the former’s superior performance and lower risk. Jing and Rocha (2023) filtered the correlation matrix using the MST and employed the average distance among the network’s nodes as a centrality measure to construct diversified cryptocurrency portfolios. They demonstrated competitive potential compared to stock or commodity investments. Ricca and Scozzari (2024) combined network assortativity coefficients and mixed linear programs for portfolio selection, achieving favorable out-of-sample performance based on risk-return perspectives in experimental settings.
Preliminaries
In this section, we elucidate the concept of time series signature. Subsequently, we delineate the conventional methodology employed to derive a graph from multiple time series. Lastly, we expound upon the portfolio optimization problems utilized in our analysis.
Signature
The notion of signature originates from path theory, offering a structured and comprehensive portrayal of the temporal evolution within a time series. Its efficacy lies in capturing both temporal and geometric patterns inherent in the time series. When we consider univariate time series, temporal patterns encompass long-term dependencies and recurrent trends over time, while geometric patterns encompass the shape of trajectories of suitable transformations of time series (e.g., the lead-lag transformation), along with intricate data behaviors such as loops and self-intersections (Lyons 2014).
For clarity, we adhere to the notation delineated in Liao et al. (2023) and confine our discourse to continuous functions mapping from a compact time interval \(J:=[a,b]\) to \({\mathbb {R}}^d\) with finite p-variation, all commencing from the origin. This space is denoted as \(C^{p}_0(J, {\mathbb {R}}^d)\). Let \(\textrm{T}(({\mathbb {R}}^{d})):=\oplus _{k=0}^{\infty }({\mathbb {R}}^{d})^{\otimes k}\) signify a tensor algebra space, encompassing the signatures of \({\mathbb {R}}^{d}\)-valued paths, thereby providing their comprehensive representation, where d represents the path dimension. Additionally, let \(S_i = \{s_i(t_0), s_i(t_1), \dots , s_i(t_T)\}\) represent a discrete univariate time series with \(T+1\) realizations. To bridge the discrete-continuous gap, the time series must undergo conversion into a continuous path, achieved through methods such as the lead-lag transformation or the time-joined transformation (Levin et al. 2016). Therefore, representing the stream of a generic univariate discrete time series as \(\{(t_j, s_i(t_j))\}_{j = 0}^{T}\), the lead-lag transformation of this stream can be defined as in Flint et al. (2016):

$$\begin{aligned} L_t = \left( L^{\text {lead}}_t,\, L^{\text {lag}}_t\right) = {\left\{ \begin{array}{ll} \left( s_i(t_j) + (t-2j)\left( s_i(t_{j+1})-s_i(t_j)\right) ,\; s_i(t_j)\right) , & t \in [2j,\, 2j+1),\\ \left( s_i(t_{j+1}),\; s_i(t_j) + (t-2j-1)\left( s_i(t_{j+1})-s_i(t_j)\right) \right) , & t \in [2j+1,\, 2j+2), \end{array}\right. } \end{aligned}$$
(1)

for \(t \in [0, 2T]\) and \(j \in \{0, \dots , T-1\}\). We can note that the path derived using the lead-lag transformation is a 2-dimensional path. Moreover, the first component of L corresponds to the lead, while the second corresponds to the lag.
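For example, under this definition a three-point series \((s_i(t_0), s_i(t_1), s_i(t_2))\) is mapped to the piecewise-linear path through the vertices

$$\begin{aligned} \big (s_i(t_0), s_i(t_0)\big ) \rightarrow \big (s_i(t_1), s_i(t_0)\big ) \rightarrow \big (s_i(t_1), s_i(t_1)\big ) \rightarrow \big (s_i(t_2), s_i(t_1)\big ) \rightarrow \big (s_i(t_2), s_i(t_2)\big ), \end{aligned}$$

so the lead coordinate always reaches the next observation one half-step before the lag coordinate.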
Let L denote the continuous path produced by the lead-lag transformation, defined by Eq. (1). Consequently, we define the signature \({\mathcal {S}}\) and the truncated signature at level M, denoted as \({\mathcal {S}}_M\), as follows:
Definition 1
(Signature and Truncated Signature) Let \(L \in C^{p}_0(J, {\mathbb {R}}^d)\) be a path. The signature \({\mathcal {S}}\) of the path \(L\) is defined as:

$$\begin{aligned} {\mathcal {S}}(L) := \left( 1,\, L_{J}^{1},\, L_{J}^{2},\, \dots \right) \in \textrm{T}(({\mathbb {R}}^{d})), \end{aligned}$$
(2)

where \({L_{J}^{k}={\int _{t_{1}< t_{2}< \cdots < t_{k},\; t_{1}, \dots , t_{k} \in J}}dL_{t_{1}}\otimes \cdots \otimes dL_{t_{k}}}\) are called iterated integrals.

The truncated signature of degree \(M\) is defined as:

$$\begin{aligned} {\mathcal {S}}_M(L) := \left( 1,\, L_{J}^{1},\, \dots ,\, L_{J}^{M}\right) . \end{aligned}$$
(3)
The signature structure presents a hierarchical interpretation, where lower-order components encapsulate broad path attributes, while higher-order terms unveil intricate characteristics, including higher-order moments and local geometric features. Critically, the signature maintains invariance under reparameterization, thereby preserving integral values despite transformations in time. Additionally, it adheres to translation invariance and concatenation properties (Chen 1958). When truncating the signature, the first \(\frac{d^{M+1}-1}{d-1}\) iterated integrals are preserved, where M denotes the truncation degree. The factorial decay of neglected iterated integrals ensures minimal information loss in the truncation of \({\mathcal {S}}\) (Lemercier et al. 2021).
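For example, for the 2-dimensional lead-lag path (\(d = 2\)) truncated at \(M = 3\), the truncated signature retains \(\frac{2^{4}-1}{2-1} = 15\) iterated integrals: 1 of order zero, 2 of order one, 4 of order two, and 8 of order three.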
Consider two stochastic processes, A and B, defined on a probability space \((\Omega , {\mathcal {F}}, {\mathbb {P}})\), where \(\Omega\) represents the sample space, i.e., the set of all possible outcomes; \({\mathcal {F}}\) is a sigma-algebra of subsets of \(\Omega\), i.e., the set of events to which probabilities can be assigned; and \({\mathbb {P}}\) is a probability measure defined on \({\mathcal {F}}\) (Resnick 2019). Supposing that Eq. (2) holds almost surely for both A and B, with the expected values \({\mathbb {E}}[\cdot ]\) of \({\mathcal {S}}(A)\) and \({\mathcal {S}}(B)\) being finite, the following theorem holds (Lyons and Ni 2015):
Theorem 1
(Expected Signature) Let A and B be two \(C_{0}^{1}(J,{\mathbb {R}}^{d})\)-valued random variables. If \({\mathbb {E}}[{\mathcal {S}}(A)]={\mathbb {E}}[{\mathcal {S}}(B)]\), and \({\mathbb {E}}[{\mathcal {S}}(A)]\) has infinite radius of convergence, then \(A \overset{d}{=}\ B\), i.e., A and B are equal in distribution.
The signature uniquely defines a path’s trajectory (Lyons 1998) under suitable assumptions. Moreover, the expected signatures uniquely determine the distributions of paths, akin to the role of moment generating functions (Chevyrev and Lyons 2016). A more comprehensive exposition, rigorous formulations, and visual examples, are given in Lyons (2014), Levin et al. (2016), Chevyrev and Kormilitzin (2016).
From time series to graphs
Consider a collection of N univariate time series denoted as \({\textbf{S}}\), each comprising a realization over \(T+1\) discrete time steps, represented as \(S_i = \{s_i(0), s_i(1), \dots , s_i(T)\}\). The standard approach utilized in the network-based framework to derive the graph from \({\textbf{S}}\) involves computing the correlation matrix among the N univariate time series. The entries of the correlation matrix C, denoted as \(c_{ij}\), are defined as follows:

$$\begin{aligned} c_{ij} = \frac{\sigma _{S_i,S_j}}{\sqrt{\sigma ^{2}_{S_i}\,\sigma ^{2}_{S_j}}}. \end{aligned}$$

Here, \(\sigma _{S_i,S_j}\) represents the covariance between time series i and j, while \(\sigma ^{2}_{S_i}\) denotes the variance of time series \(S_i\). These are expressed empirically as:

$$\begin{aligned} \sigma _{S_i,S_j} = \frac{1}{T+1}\sum _{t=0}^{T}\left( s_i(t)-{\bar{s}}_i\right) \left( s_j(t)-{\bar{s}}_j\right) , \qquad \sigma ^{2}_{S_i} = \frac{1}{T+1}\sum _{t=0}^{T}\left( s_i(t)-{\bar{s}}_i\right) ^{2}, \end{aligned}$$

where \({\bar{s}}_i = \frac{1}{T+1}\sum _{t=0}^{T}s_i(t)\) denotes the empirical mean of \(S_i\).
In this research, we decided to consider the “Asset Graph” approach, hence we retain all correlations that are larger than or equal to a certain threshold and discard the others. The choice of the threshold is crucial, as it can result in either a disconnected or complete graph, indicating too much or too little discarded information, respectively. Typically, multiple threshold values are evaluated in the filtering process (Ricca and Scozzari 2024), or the threshold is considered as a hyperparameter to optimize.
In this work, we derive the threshold based on its statistical significance, as illustrated in MacMahon and Garlaschelli (2015). Assuming that each of the time series contained in \({\textbf{S}}\) has T observations, and that these observations are independent and normally distributed, the null hypothesis (Fisher 1915) states that the following random variables \(x_{ij}\) follow a normal distribution with a mean of 0 and standard deviation of \(\sigma _x = (T - 3) ^{-1/2}\), where each random variable \(x_{ij}\) is defined as:

$$\begin{aligned} x_{ij} = \frac{1}{2}\ln \left( \frac{1+c_{ij}}{1-c_{ij}}\right) . \end{aligned}$$

Here, the \(c_{ij}\) represent the entries of the correlation matrix (estimated based on the available data). Therefore, the statistically significant (realizations of) random variables \(x_{ij}\) are those that are larger than or equal to \(\theta \sigma _x\) in absolute value, i.e., for which \(|x_{ij}|\ge \theta \sigma _x\), where \(\theta\) represents a suitable threshold. This means that only the realizations of random variables staying \(\theta\) standard deviations away from 0 are considered to be statistically significant. Thus, the critical value for filtering the correlation matrix can be derived as:

$$\begin{aligned} c_\theta = \frac{e^{2x_\theta }-1}{e^{2x_\theta }+1} = \tanh (x_\theta ), \end{aligned}$$

where \(x_\theta = \theta \sigma _x\) is the selected threshold for the \(|x_{ij}|\). Finally, the entries of the filtered correlation matrix, denoted as \(c^{*}_{ij}\), can be calculated as:

$$\begin{aligned} c^{*}_{ij} = {\left\{ \begin{array}{ll} c_{ij}, & |x_{ij}| \ge x_\theta ,\\ 0, & \text {otherwise}. \end{array}\right. } \end{aligned}$$
(4)
The advantage of this approach is that the threshold \(c_\theta\) can be derived from the critical value of the confidence interval of a normal distribution, which is reported in Table 8 in Appendix 1. Moreover, this method enables us to avoid treating the threshold for filtering the correlation matrix as a hyperparameter, or choosing it arbitrarily based on the observed data. On the other hand, the main disadvantage is the assumption that the random variables follow a normal distribution. This is not necessarily true when dealing with financial time series, which empirically exhibit “stylized facts”, including fatter-tailed distributions compared to the tails of the normal distribution (Cont 2001). Finally, it is important to note that increasing the value of the threshold \(c_\theta\) results in discarding more correlation entries. Consequently, the associated graphs become sparser.
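To make the filtering step concrete, the following minimal Python sketch (function and variable names are ours, not from the paper) applies the Fisher transformation above and keeps only the statistically significant correlations:

```python
import numpy as np

def asset_graph_filter(returns, theta=3.0):
    """Filter the Pearson correlation matrix via Fisher's transformation.

    returns : array of shape (N, T), one row of log-returns per stock.
    theta   : number of null-model standard deviations required to keep an entry.
    """
    N, T = returns.shape
    C = np.corrcoef(returns)                 # empirical correlation matrix
    C_off = C - np.eye(N)                    # zero the diagonal (arctanh(1) is infinite)
    x = np.arctanh(C_off)                    # Fisher transform: x_ij = atanh(c_ij)
    sigma_x = (T - 3) ** (-0.5)              # std. dev. of x_ij under the null hypothesis
    keep = np.abs(x) >= theta * sigma_x      # statistically significant entries only
    return np.where(keep, C, 0.0)            # filtered matrix with entries c*_ij
```

Equivalently, one can compare \(|c_{ij}|\) directly with \(c_\theta = \tanh (\theta \sigma _x)\); the two tests select the same entries.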
Portfolio optimization problems
Several portfolio strategies exist. In the following, we establish the mathematical formulations for the portfolio strategies employed in our subsequent numerical performance evaluation. We commence by delineating the common classical portfolio approach, followed by the definition of network-based portfolio strategies. Then, we expound upon the signature-based similarity matrix and its utilization for the construction of portfolio strategies.
Classical portfolio strategies
Let \({\textbf{S}}\) denote a collection of N stock prices, each with \(T+1\) realizations, and let \({\textbf{R}}=\{r_1, r_2, \dots , r_N \}\) represent the collection of log-returns computed on each stock in \({\textbf{S}}\), where each element of \({\textbf{R}}\) comprises T realizations. Specifically, the log-returns for a generic asset i, denoted as \(S_i = \{s_i(0), s_i(1),\dots , s_i(T)\}\), are defined as:

$$\begin{aligned} r_i(t) = \ln \left( \frac{s_i(t)}{s_i(t-1)}\right) , \quad t = 1, \dots , T. \end{aligned}$$
(5)
Let \(\mu\) denote the mean vector of \({\textbf{R}}\), and \(\Sigma\) denote the covariance matrix computed based on \({\textbf{R}}\).
Mean-Variance Portfolio Strategy. The Mean-Variance portfolio strategy, pioneered by Markowitz (1952), serves as the cornerstone for portfolio strategies. It aims to optimize a suitable trade-off between risk and returns, with risk represented by the covariance matrix, \(\Sigma\), and log-returns represented by the mean vector, \(\mu\). Mathematically, this approach is expressed as the following optimization problem:

$$\begin{aligned} \min _{w} \quad&w^{T}\Sigma w - \lambda \,\mu ^{T}w\\ \text {s.t.} \quad&\sum _{i=1}^{N}w_i = 1,\\&w_i \ge 0, \quad i = 1, \dots , N, \end{aligned}$$
(6)

where \(w=(w_1, \dots , w_N)^{T}\) represents a vector of weights to optimize and \(\lambda \ge 0\) controls the trade-off between risk and expected log-return. The first constraint ensures a budget requirement, while the second constraint prohibits short-selling.
Global Minimum Variance Portfolio Strategy. In contrast to the Mean-Variance approach, the Global Minimum Variance portfolio strategy solely considers risk in its objective function. It seeks a vector of optimal weights w for the portfolio that minimizes risk. Thus, the Global Minimum Variance optimization problem is formulated as follows:

$$\begin{aligned} \min _{w} \quad&w^{T}\Sigma w\\ \text {s.t.} \quad&\sum _{i=1}^{N}w_i = 1,\\&w_i \ge 0, \quad i = 1, \dots , N. \end{aligned}$$
(7)
Maximum Sharpe Ratio Portfolio Strategy. The Sharpe Ratio, introduced by Sharpe (1998), is a performance measure used to compare investment returns with their risk. In the portfolio context, one denotes \(R_p = \sum _{i=1}^{N}w_i\mu _i\) as the expected log-returns of a portfolio, \(\sigma _p=\sqrt{\sum _{i=1}^{N}\sum _{j=1}^{N}w_iw_j\sigma _{ij}}\) as the standard deviation of the portfolio log-returns, and \(r_f\) as the risk-free rate. Then, the Sharpe Ratio is defined as:

$$\begin{aligned} \text {SR} = \frac{R_p - r_f}{\sigma _p}. \end{aligned}$$
(8)

Hence, the Sharpe Ratio measures risk-adjusted log-returns. Finally, the Maximum Sharpe Ratio portfolio strategy is obtained by solving the following optimization problem:

$$\begin{aligned} \max _{w} \quad&\frac{R_p - r_f}{\sigma _p}\\ \text {s.t.} \quad&\sum _{i=1}^{N}w_i = 1,\\&w_i \ge 0, \quad i = 1, \dots , N. \end{aligned}$$
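These problems can be solved numerically with any constrained optimizer. Below is a minimal sketch (helper names are ours, not from the paper) using scipy.optimize’s SLSQP routine for the Global Minimum Variance and Maximum Sharpe Ratio problems:

```python
import numpy as np
from scipy.optimize import minimize

def gmv_weights(Sigma):
    """Global Minimum Variance: minimize w' Sigma w s.t. sum(w) = 1, w >= 0."""
    N = Sigma.shape[0]
    cons = [{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}]
    res = minimize(lambda w: w @ Sigma @ w, np.full(N, 1.0 / N),
                   method="SLSQP", bounds=[(0.0, 1.0)] * N, constraints=cons)
    return res.x

def max_sharpe_weights(mu, Sigma, rf=0.0):
    """Maximum Sharpe Ratio: maximize (mu'w - rf) / sqrt(w' Sigma w)."""
    N = len(mu)
    cons = [{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}]
    neg_sharpe = lambda w: -(mu @ w - rf) / np.sqrt(w @ Sigma @ w)
    res = minimize(neg_sharpe, np.full(N, 1.0 / N),
                   method="SLSQP", bounds=[(0.0, 1.0)] * N, constraints=cons)
    return res.x
```

The Mean-Variance problem (6) is obtained analogously by changing the objective to \(w^{T}\Sigma w - \lambda \mu ^{T}w\).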
Network-based portfolio strategies
In the portfolio strategies based on the network approach, the financial market is represented as a network derived from the correlation matrix among the stock’s log-returns (Li et al. 2019; Clemente et al. 2021; Peralta and Zareei 2016).
Let \(G=(V,E)\) represent a graph of N stocks in the set \({\textbf{S}}\), where \(V=\{v_1, \dots , v_N\}\) is the set of nodes (representing stocks), and \(E \subseteq V \times V\) is the set of edges (representing their relations). Two nodes \(v_i\) and \(v_j\) are connected if there exists a link \((i,j) \in E\). Consider \(A \in {\mathbb {R}}^{N\times N}\) as the adjacency matrix associated with the graph G, where its entries \(a_{ij}\) can be either 1 or 0 for an unweighted graph, or non-negative values for a weighted graph. In this study, we focus solely on undirected weighted graphs. Moreover, we derive the graph representation of \({\textbf{S}}\) by filtering the correlation matrix C, computed on \({\textbf{R}}\), using the “asset graph” method described in the “From time series to graphs” section. Specifically, the adjacency matrix is derived using Eq. (4).
The network portfolio strategies used as a baseline in this research are based on Clemente et al. (2022), with the distinction that we consider a weighted graph instead of an unweighted one as in the original formulation. The essence lies in incorporating both the volatility and the degree of clustering of nodes in a graph. The clustering coefficient \(\eta _i\) for node i is defined as the geometric average of suitable subgraph weights (Onnela et al. 2005):

$$\begin{aligned} \eta _i = \frac{1}{k_i(k_i - 1)}\sum _{j,h}\left( {\hat{a}}_{ij}\,{\hat{a}}_{jh}\,{\hat{a}}_{hi}\right) ^{1/3}. \end{aligned}$$
Here, \(k_i\) represents the degree of node i, and \({\hat{a}}_{ij}\) denotes the normalized entries of the adjacency matrix A, computed as \({\hat{a}}_{ij}=\frac{a_{ij}}{\max _{h,k}(a_{hk})}\), where \(a_{ij}=c_{ij}^*\). From the clustering coefficients, we derive the matrix \(C^{\eta }\), whose entries \(c_{ij}^{\eta }\) combine the clustering coefficients of the corresponding pairs of nodes and thus account for the level of interconnection of all the nodes in the network (see Clemente et al. 2022 for the exact construction).
Finally, we construct the following matrix H to replace the matrix \(\Sigma\) in the optimization problems (6) and (7), in a similar way as it was done in Clemente et al. (2022):

$$\begin{aligned} H = \Delta \, C^{\eta } \, \Delta . \end{aligned}$$
Here, \(\Delta\) is a diagonal matrix whose ith entry represents the ratio between the standard deviation of the log-return of the ith asset and the market standard deviation. Thus, its diagonal entries \(\delta _{ii}\) are expressed as \(\delta _{ii}=\frac{\sigma _i}{\sqrt{\sum _{n=1}^{N}\sigma ^{2}_n}}\). The key distinction between using H and \(\Sigma\) is that H implicitly includes a measure of the financial system’s stress state (Clemente et al. 2022), while \(\Sigma\) only considers the volatility of individual assets.
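A sketch of this construction in Python follows. The clustering coefficient matches Onnela et al. (2005), which is what networkx implements for weighted graphs; the off-diagonal entries of \(C^{\eta }\) used here are an illustrative placeholder of ours, since the exact definition follows Clemente et al. (2022):

```python
import numpy as np
import networkx as nx

def h_matrix(A_filtered, returns):
    """Assemble H = Delta C_eta Delta from a filtered weighted adjacency matrix.

    A_filtered : (N, N) filtered correlation (or similarity) matrix, a_ij = c*_ij.
    returns    : (N, T) log-returns, used for the volatility ratios in Delta.
    """
    A_hat = A_filtered / A_filtered.max()          # normalized weights a^_ij
    G = nx.from_numpy_array(A_hat)
    # Onnela et al. (2005) geometric-average weighted clustering coefficient
    eta = np.array(list(nx.clustering(G, weight="weight").values()))
    # Placeholder for C^eta: unit diagonal, off-diagonal sqrt(eta_i * eta_j);
    # the paper's exact entries c^eta_ij follow Clemente et al. (2022).
    C_eta = np.sqrt(np.outer(eta, eta))
    np.fill_diagonal(C_eta, 1.0)
    sigma = returns.std(axis=1)                    # individual asset volatilities
    delta = sigma / np.sqrt(np.sum(sigma ** 2))    # diagonal entries of Delta
    return np.diag(delta) @ C_eta @ np.diag(delta)
```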
Signature-based portfolio strategies
In the signature-based portfolio strategies, the idea is to substitute the correlation matrix C with a similarity matrix derived from the signature computed on the collection of log-returns, denoted as \({\textbf{R}}\). The rationale behind using the signature to derive a similarity matrix, instead of directly computing the correlation on \({\textbf{R}}\), stems from the unique properties of the expected signature stated in Theorem 1. Indeed, the expected signature of each time series can be associated with its moment generating function. Consequently, the signature serves as a potent tool for assessing the similarity between time series.
Let \(d(S_i, S_j)\) denote a distance function between two time series \(S_i\) and \(S_j\), and let \({\mathcal {S}}_M(S_i)\) denote the truncated signature with truncation degree M of the path associated with the time series \(S_i\). In this paper, it is assumed that if two time series possess highly similar signatures, they should exhibit substantial similarity in their behavior. This claim is based on Theorem 1 on the expected signature. Formally, we compute the distance in terms of the truncated signature (i.e., \(d\left( S_i,S_j\right) =d\left( {\mathcal {S}}_M(S_i),{\mathcal {S}}_M(S_j)\right)\)), and we represent the assumption above as:
Assumption 1
\(\forall S_i,S_j \in {\textbf{S}}, \quad d\left( {\mathcal {S}}_M(S_i),{\mathcal {S}}_M(S_j)\right) \simeq 0 \quad \Longrightarrow \quad S_i \sim S_j\), where the symbol \(\sim\) denotes similar behavior.
Hence, it is assumed that the closer the distance computed based on the truncated signature is to 0, the more the time series \(S_i\) and \(S_j\) exhibit similar behavior.
To substitute the correlation matrix in the asset allocation framework, we need to derive a similarity matrix based on the signature. This construction involves the following multi-step process:
(i) Derive the path, denoted as L, for each log-return series in \({\textbf{R}}\) by applying the lead-lag transformation.

(ii) Compute the truncated signature of the path L with a truncation degree equal to M.

(iii) Generate a distance matrix D using the Euclidean distance. This matrix has the following form:

$$\begin{aligned} {D = \begin{bmatrix} d\left( {\mathcal {S}}_M(S_1),{\mathcal {S}}_M(S_1)\right) & \cdots & d\left( {\mathcal {S}}_M(S_1),{\mathcal {S}}_M(S_N)\right) \\ \vdots & \ddots & \vdots \\ d\left( {\mathcal {S}}_M(S_N),{\mathcal {S}}_M(S_1)\right) & \cdots & d\left( {\mathcal {S}}_M(S_N),{\mathcal {S}}_M(S_N)\right) \end{bmatrix} ,} \end{aligned}$$

where \(d\left( {\mathcal {S}}_M(S_i),{\mathcal {S}}_M(S_i)\right) = 0\) for all \(i \in \{1, \dots , N\}\), and \(d\left( {\mathcal {S}}_M(S_i),{\mathcal {S}}_M(S_j)\right) \in [0, +\infty )\) for all \(i,j \in \{1, \dots , N\}\).

(iv) Transform the distance matrix D into a similarity matrix, denoted as P, by using a strictly monotone decreasing function, namely, the transformation \(p_{ij} = \frac{1}{a + d_{ij}}\), with \(a>0\) and \(d_{ij} = d\left( {\mathcal {S}}_M(S_i),{\mathcal {S}}_M(S_j)\right)\). For simplicity, in the following we set a equal to 1. The matrix P has the form:

$$\begin{aligned} P = \begin{bmatrix} 1 & p_{12} & \cdots & p_{1N}\\ p_{21} & 1 & \cdots & p_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ p_{N1} & p_{N2} & \cdots & 1 \end{bmatrix} , \end{aligned}$$

where \(p_{ij} \in (0,1]\) for all \(i,j \in \{1, \dots , N\}\). A sketch of this four-step construction is given in the code below.
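The following compact Python sketch covers steps (i)–(iv), assuming the third-party iisignature package for computing truncated signatures (helper names are ours):

```python
import numpy as np
import iisignature  # assumed dependency: pip install iisignature

def lead_lag(x):
    """Step (i): embed a 1-d series into a 2-d lead-lag path with 2T + 1 vertices."""
    rep = np.repeat(np.asarray(x, dtype=float), 2)
    return np.stack([rep[1:], rep[:-1]], axis=1)    # columns: (lead, lag)

def signature_similarity(returns, M=3, a=1.0):
    """Steps (ii)-(iv): truncated signatures -> distance matrix D -> similarity P."""
    sigs = np.array([iisignature.sig(lead_lag(r), M) for r in returns])
    diff = sigs[:, None, :] - sigs[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))           # pairwise Euclidean distances
    return 1.0 / (a + D)                            # p_ij = 1 / (a + d_ij)
```

With a = 1, the diagonal of P equals 1, matching the matrix displayed above.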
To use the similarity matrix P instead of the correlation matrix C, we need to verify if the matrix P is symmetric Positive Definite (PD), or at least symmetric Positive Semi-Definite (PSD). This issue is investigated in the next proposition.
Proposition 1
If the truncated signatures \({\mathcal {S}}_M(S_i)\) (\(\forall i=1,\ldots ,N\)) are all different, then the matrix P is symmetric PD. Otherwise, it is symmetric PSD.
Proof
The matrix P is symmetric by construction. Moreover, since the transformation \(p_{ij} = \frac{1}{1 + d_{ij}}\) is obtained by applying, for \(r=d_{ij} \ge 0\), the function \(f(r)=\frac{1}{1 + r}\), which is completely monotone, it follows by an application of Schoenberg’s theorem (Fasshauer 2007, Theorem 5.2) that the matrix P is also PD if the truncated signatures \({\mathcal {S}}_M(S_i)\) (\(\forall i=1,\ldots ,N\)) are all different. Otherwise, it follows by a limiting argument that it is PSD. \(\square\)
Finally, we can employ the similarity matrix P based on the signature in the portfolio allocation strategies, substituting it for the correlation matrix C, as described in the “Network-based portfolio strategies” section. It is important to note that, depending on the choices of d and M, the computational time of our approach can be higher than that of computing the empirical correlation or covariance matrix. However, this higher computational cost is justified by the advantages of using a similarity matrix based on the signature, which enables us to capture higher-order relationships within the time series. For the naive approach, the computational time for the covariance or correlation matrix is \(O(T N^{2})\), where T represents the number of observations per stock and N is the number of stocks. In our approach, we need to compute three elements: the truncated signature, the Euclidean distance, and the similarity matrix. The time complexity for computing each truncated signature is \(O(T d^{M})\), where M is the truncation degree and d is the dimension of the path associated with the time series (Morrill et al. 2021). The time complexity for computing the Euclidean distance is \(O(\tilde{T})\), where \(\tilde{T}=\frac{d^{M+1}-1}{d-1}\) is the length of the truncated signature, while the creation of a generic similarity matrix with N rows and N columns takes \(O(N^{2})\). Therefore, the overall computational time of our approach turns out to be \(O(\tilde{T}N^{2}+ N T d^{M})\).
Data collection
In this section, we describe the data selected and utilized for the various analyses conducted in this research.
We chose to consider only the Standard & Poor’s 500 (S&P 500) dataset for several reasons. Firstly, the S&P 500 is a highly liquid and efficient stock market (Amihud 2002; Chordia et al. 2001) due to the presence of the largest public companies from various industries and sectors. This allows for a comprehensive study of the market. Secondly, the S&P 500 holds significant influence in economic studies and serves as a reflection of the performance of the United States’ economy (Welch and Goyal 2008). Finally, the S&P 500 index is widely recognized as a benchmark for the United States’ equity market by practitioners and academic researchers. This ensures that the findings are relevant and comparable to widely accepted performance standards.
We downloaded the data from Yahoo Finance, which is an open-source data provider. Although using data from Yahoo Finance has its downsides as it may contain potential inaccuracies (Boritz and No 2020), we assert that this is not problematic for our research because we are focusing on asset prices rather than balance sheet data, which are more prone to such issues (Boritz and No 2020). Clayton and Schmidt (2017) investigated potential discrepancies between NASDAQ market prices and those provided by Yahoo Finance, concluding that there are no statistically significant differences. Given that we are considering the S&P 500 index, which includes the largest companies in the United States market, rather than the top 100 non-financial stocks as in the NASDAQ index, we can reasonably assume that there will be no statistically significant difference between the stock prices provided by Yahoo Finance and those from other more reliable data providers such as Bloomberg or Refinitiv Eikon. Additionally, many studies are based on S&P 500 data collected from Yahoo Finance. The choice of Yahoo Finance is also supported by reproducibility considerations, as it is an open-source data provider accessible to everyone, whereas professional financial data providers can be prohibitively expensive.
In this research, we perform two types of analysis. In the first analysis, described in the “Community detection” section, we solve a community detection problem among the stocks listed on the S&P 500. We collect the daily closing price for the stocks in the S&P 500 from Saturday \(10^{\text {th}}\) July, 2010 to Monday \(10^{\text {th}}\) July, 2023, where each stock has 3270 observations. For the second analysis, described in the “Empirical evaluation” section, we solve the portfolio optimization problem using different methods. This analysis is divided into two parts: “Asset Allocation” and “Out-of-Sample Asset Allocation”. In the former, we use the same dataset as in the community detection analysis. For the “Out-of-Sample Asset Allocation”, we collected the closing prices of stocks on the S&P 500 from Tuesday \(11^{\text {th}}\) July, 2023 to Wednesday \(31^{\text {st}}\) January, 2024, yielding 141 observations for each stock.
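As a tooling note (an assumption of ours; the paper only states that prices come from Yahoo Finance), the dataset can be reproduced with the open-source yfinance package:

```python
import numpy as np
import yfinance as yf

# Illustrative tickers; the paper uses the full list of S&P 500 constituents.
tickers = ["AAPL", "MSFT", "AMZN"]
prices = yf.download(tickers, start="2010-07-10", end="2023-07-11")["Close"]
prices = prices.dropna(axis=1)                # drop stocks with missing values
log_returns = np.log(prices).diff().dropna()  # log-returns as in Eq. (5)
```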
Why a signature-based similarity matrix? A network analysis
In this section, we elucidate our rationale for utilizing a similarity matrix derived from the time series signature. We accomplish this by examining and contrasting the network properties of this matrix with those derived from the correlation matrix. Initially, we investigate the performance of the correlation matrix and the signature-based similarity matrix in resolving the community detection problem. Subsequently, we delve into an analysis of network characteristics, including the clustering coefficient and degree distribution, for both matrices.
Community detection
Community detection refers to the task of identifying groups of nodes within a network that are more likely to be interconnected among themselves than with nodes from other communities (Barabási 2013; Fortunato 2010). In the context of asset allocation, identifying communities can be highly relevant as it enables the definition and execution of various strategies, such as market-neutral strategies aimed at mitigating market risk by investing in uncorrelated stocks (Dunis and Ho 2005). Consequently, the objective is to uncover stock communities wherein stocks exhibit positive correlation within the communities and negative correlation or almost no correlation with stocks from other communities. A comprehensive investigation on community detection for financial time series can be found in MacMahon and Garlaschelli (2015), while a study on community detection for financial time series using the signature-based similarity matrix is presented in Gregnanin et al. (2024).
In this study, we provide a brief comparison between the correlation matrix and the signature-based similarity matrix, both filtered using the “Asset Graph” method, when employed for the community detection problem. Following the methodology outlined in Gregnanin et al. (2024), we utilize the modularity optimization approach (Newman and Girvan 2004) for community identification. Modularity serves as a metric to assess the quality of the identified partitions. Specifically, partitions with high modularity exhibit dense intra-cluster connections and sparse inter-cluster connections. In accordance with MacMahon and Garlaschelli (2015), modularity, denoted as \(Q(\epsilon )\), is defined as follows:

$$\begin{aligned} Q(\epsilon ) = \frac{1}{a_{\text {TOT}}}\sum _{ij}\left( a_{ij} - \langle a_{ij} \rangle \right) \delta (\epsilon _i,\epsilon _j), \end{aligned}$$
where \(A \in {\mathbb {R}}^{N \times N}\) represents the adjacency matrix with N nodes, \(a_{\text {TOT}}\) denotes a normalized factor defined as \(a_{\text {TOT}}=\sum _{ij}a_{ij}\), \(\langle a_{ij} \rangle\) denotes the employed null model (i.e., the expectation of \(a_{ij}\) according to a suitable null hypothesis), \(\epsilon\) is an N-dimensional vector representing non-overlapping communities, \(\epsilon _i\) indicates the community to which node i belongs, and \(\delta (\epsilon _i,\epsilon _j)\) refers to the Kronecker delta function. Its value equals 1 if \(\epsilon _i = \epsilon _j\), and 0 otherwise, meaning that only nodes within the same community contribute to the computation of modularity. The modularity \(Q(\epsilon )\) lies within the range \([-0.5,1]\), indicating the density of edges within communities relative to those between communities. Higher modularity values suggest a stronger community structure, characterized by distinct clusters of nodes, whereas lower values imply a more uniform distribution of edges across the network.
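The paper optimizes modularity with the correlation-specific null model of MacMahon and Garlaschelli (2015). As a simple illustration only, a generic greedy modularity maximization on the filtered weighted graph can be run with networkx (whose null model is the standard configuration model, not the correlation-specific one):

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

def detect_communities(A_filtered):
    """Greedy modularity maximization on a filtered weighted adjacency matrix."""
    G = nx.from_numpy_array(A_filtered)
    comms = greedy_modularity_communities(G, weight="weight")
    Q = modularity(G, comms, weight="weight")
    return comms, Q
```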
For community detection, the dataset used is described in the “Data collection” section. Recall that we consider the closing prices of stocks listed in the S&P 500, following the approach outlined in Gregnanin et al. (2024), MacMahon and Garlaschelli (2015). After removing stocks with missing values, we are left with 440 stocks, each with 3270 observations. Subsequently, we compute the log-returns as defined in Eq. (5). Next, we compute the correlation matrix and the signature-based similarity matrix, as described in the “Signature-based portfolio strategies” section. Finally, we filter both matrices using the “Asset Graph” approach outlined in the “From time series to graphs” section. The S&P 500 already classifies stocks into eleven different sectors based on the structural characteristics of the companies. Here, the goal of our analysis is to identify partitions of stocks based on their similar past behavior in the financial market. Table 1 reports the modularity values for the two matrices under consideration, i.e., the correlation matrix and a similarity matrix derived from the signature, denoted as “Signature-based”. The best, i.e. largest, modularity value for each threshold value considered is highlighted in bold.
Note that the threshold value, denoted as \(c_\theta\), depends on the statistical significance level considered, as indicated in Table 8 in Appendix 1. Notably, the modularity value of the signature-based similarity matrix consistently exceeds that of the classical correlation matrix for all threshold filtering scenarios considered. Hence, we can infer that the signature-based similarity matrix identifies communities more effectively than the traditional correlation matrix. Moreover, we can observe that the number of clusters, denoted as “Num. Clusters”, is consistently lower than the number of sectors in the classification of stocks in the S&P 500 index. This indicates that our partition of stocks based on their past behavior results in fewer groups compared to the original S&P 500 index classification. The only exception is the network derived from the correlation matrix and filtered with a threshold value equal to 0.401. A possible explanation for this result is that increasing the value of the threshold leads to a sparser network, meaning that more nodes are not connected to other nodes. Consequently, disconnected nodes form clusters by themselves.
Network characteristics
We utilized the same dataset employed for the community detection task to investigate the network characteristics as well.
Analyzing network characteristics, such as the degree distribution and the clustering coefficient, is crucial for comprehending the structural properties and organization of networks. Given that we are dealing with weighted graphs, it is imperative to consider the weighted degree of a node, which signifies the total influence or interaction that the node holds within the network. The weighted degree of a node i can be defined as the sum of the weights of all edges incident to that node i. Additionally, the clustering coefficient quantifies the tendency of nodes in a network to cluster together.
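Both statistics are readily computed on the weighted graph (a short sketch; `A_filtered` stands for any filtered matrix produced by the earlier sketches):

```python
import numpy as np
import networkx as nx

A_filtered = np.load("a_filtered.npy")        # placeholder: any filtered (N, N) matrix
G = nx.from_numpy_array(A_filtered)
strength = dict(G.degree(weight="weight"))    # weighted degree (node strength)
clust = nx.clustering(G, weight="weight")     # weighted clustering coefficient per node
```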
Figures 1 and 2 illustrate a comparison of the weighted degree distribution and the clustering coefficient distribution for the graphs derived from the correlation matrix and from the signature-based similarity matrix, respectively, filtered using several threshold values. As observed from these figures, the curves associated with the signature-based similarity matrix (orange curves) are consistently above the curves corresponding to the graphs derived from the correlation matrix (blue curves) in both node weighted degree and clustering coefficient plots. This indicates that the signature-based similarity matrix identifies more connectivity patterns compared to those derived from the correlation matrix.
Another crucial comparison involves determining whether the network exhibits assortative or disassortative behavior, indicating whether nodes with similar properties tend to connect (assortativity) or nodes with differing properties tend to connect (disassortativity) (Barrat et al. 2004). Assortativity or disassortativity can be inferred by analyzing the scatter plot of the nodes’ weighted degree against the nodes’ clustering coefficient, shown in Fig. 3. In the graph derived from the correlation matrix, there is no significant correlation between the clustering coefficient and the node weighted degree. Conversely, in the graph based on the signature-based similarity matrix, a strong positive relationship is evident.
The next structural property to analyze is the relationship between the standard deviation of the log-returns of each asset i and centrality measures of the node corresponding to that asset, specifically degree centrality and eigenvector centrality (Barabási 2013). Degree centrality simply measures the number of connections a node has in a network, while eigenvector centrality considers both the number of connections a node has and the centrality of the nodes to which it is connected. Figures 4 and 5 illustrate the relationship of the standard deviation with degree centrality and eigenvector centrality, respectively. Notably, for both centrality measures, a distinct structure is evident when considering the graph derived from the signature-based similarity matrix. This structure becomes more defined as the underlying graph becomes sparser, corresponding to an increase in the threshold value used to remove noisy connections; the list of threshold values considered is reported in Table 8 in Appendix 1. In both plots, an inverse relationship is observed between the standard deviation and the respective centrality measure for the graph derived from the signature-based similarity matrix. Specifically, the stock standard deviation decreases as the centrality measure increases, up to a certain value, after which a positive relationship between the standard deviation and the centrality measure emerges.
Empirical evaluation
In this section, we present a performance evaluation of the various portfolio strategies employed in our analysis. We commence by detailing the dataset under consideration and elucidating the procedure for determining the number of stocks utilized by the strategies. Subsequently, we assess the efficacy of the portfolio strategies. Then, we relax the assumption of positive weights, thereby permitting short selling and scrutinize the asset allocation problem within the context of market-neutral strategies. Finally, we compare the strategies using an out-of-sample dataset.
Stock selection
For our analysis, the dataset used is described in the “Data collection” section. Recall that we consider the closing prices of stocks listed in the S&P 500. After eliminating stocks with missing data and calculating the log-returns for the remaining stocks, we obtained a dataset comprising 440 stocks, each with 3270 observations.
The next step involves determining the maximum number of assets to include in the portfolio. Conventionally, this is achieved by imposing a constraint known as the “cardinality constraint,” which limits the number of stocks held in the portfolio to a predefined value (Mansini et al. 2014). However, employing this approach a-priori poses several challenges. Firstly, the maximum number of assets to include is determined arbitrarily, lacking a rational basis for its selection. Secondly, imposing a maximum number of stocks does not address the possibility of including highly illiquid assets in the portfolio, as this constraint does not consider the nature of the stocks themselves. Illiquid assets typically exhibit higher expected returns due to their increased risk and trading such assets can impact their prices, potentially resulting in an unrealistic portfolio. Consequently, relying solely on the cardinality constraint may lead to the inclusion of illiquid assets, which may not be optimal in terms of practicality. Furthermore, incorporating the cardinality constraint into a quadratic programming problem, such as in the classical Mean-Variance framework, transforms it into a mixed-integer quadratic problem due to the introduction of binary variables representing asset inclusion. This escalation in complexity results in longer computational times for the portfolio optimization algorithms used to solve the respective optimization problems.
To address these drawbacks, we opted not to impose an a-priori given cardinality constraint. Instead, in an effort to reduce the size of the dataset, we chose to employ stock turnover as a criterion for selecting the subset of assets for asset allocation. Stock turnover, defined as the product of volume and price for the selected stock, is a crucial metric in financial markets as it provides insights into stock liquidity. Other studies have employed turnover as a basis for constructing investment strategies, as demonstrated by Vidović (2019). Additionally, turnover and other accounting variables have been utilized for preliminary stock selection, as discussed by Fulga et al. (2009). Specifically, we calculated the mean turnover for all stocks under consideration, then proceeded to rank the stocks in descending order based on their mean turnover values. Subsequently, we selected the top 10, 20, 40, and 80 stocks for inclusion in the asset allocation problem. This approach enabled us to circumvent the second and third challenges associated with an a-priori given cardinality constraint. Specifically, it enables us to exclude illiquid stocks in the portfolio and reduce the complexity of the portfolio optimization algorithm.
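The turnover screen can be sketched as follows (DataFrame layout and names are ours):

```python
import pandas as pd

def most_liquid(prices: pd.DataFrame, volumes: pd.DataFrame, k: int = 20) -> list:
    """Rank stocks by mean turnover (price x volume) and keep the k most liquid ones.

    prices, volumes : (T, N) DataFrames sharing the same ticker columns.
    """
    turnover = (prices * volumes).mean(axis=0)   # mean daily turnover per stock
    return turnover.sort_values(ascending=False).index[:k].tolist()

# The paper repeats the analysis with k = 10, 20, 40 and 80.
```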
Asset allocation
In our analysis, we compare the performance of various portfolio strategies. We specifically compare the network approach, detailed in the “Network-based portfolio strategies” section, with the signature implementation outlined in the “Signature-based portfolio strategies” section, across all basic portfolio strategies described in the “Classical portfolio strategies” section. The fundamental strategies considered are Mean-Variance (MV), Maximum Sharpe Ratio (MS), and Global Minimum Variance (GMV). For each basic strategy, we substitute the covariance matrix, \(\Sigma\), with the network implementation derived from the correlation matrix and from the signature-based similarity matrix, as described in the “Network-based portfolio strategies” and “Signature-based portfolio strategies” sections, respectively. We denote the baseline network implementation as “Network” and the signature-based similarity matrix implementation as “Sig”. Finally, we also report the performance of the vanilla MV, GMV, and EWP strategies. In the EWP, each stock is assigned an equal weight, defined as one divided by the number of stocks.
To evaluate the performance of the different models, we decided to consider the annualized mean and the annualized standard deviation of the log-returns, rather than the daily mean and standard deviation, because using the annualized metrics facilitates the comparison. We also employ the Sharpe Ratio (Sharpe 1998), defined in Eq. (8), and we assume that the risk-free rate \(r_f\) is equal to 0. This allows us to assess the strategies based on their risk-adjusted log-returns. Additionally, we examine the excess kurtosis and skewness of the distribution of the portfolio’s log-returns. This evaluation helps determine if the portfolio’s log-returns approximately follow a normal distribution. A skewness value of 0 and a kurtosis value of 3 indicate a normal distribution, with excess kurtosis calculated as the kurtosis of the log-returns minus 3. We also consider the cumulative log-returns of the portfolio strategies for the selected time period of length T. Lastly, we utilize two risk measures to evaluate the potential loss of an investment portfolio: the Maximum Drawdown (MDD) (Chekhlov et al. 2005) and the Conditional Value-at-Risk (CVaR) (Sarykalin et al. 2008). The MDD measures the maximum decline in the portfolio value and is calculated as the difference between the peak value of an investment and its lowest subsequent value. This metric captures the potential loss in the worst-case scenario for an investment. In contrast, CVaR provides an average estimation of the tail end of the portfolio’s loss distribution. This measure accounts for the magnitude of extreme losses, offering a more comprehensive risk assessment for heavy-tailed distributions. The MDD and CVaR can be defined as follows:

$$\begin{aligned} \text {MDD}&= \max _{i \in \{1,\dots ,T\}}\left( \max _{j \in \{1,\dots ,i\}} R_p(j) - R_p(i)\right) ,\\ \text {CVaR}_{\alpha }&= \frac{\sum _{i=1}^{T} R^{-}_p(i)\,{\textbf{1}}_{\{R^{-}_p(i) \le -\text {VaR}_{\alpha }\}}}{\sum _{i=1}^{T} {\textbf{1}}_{\{R^{-}_p(i) \le -\text {VaR}_{\alpha }\}}}, \end{aligned}$$
where \(R_p(i)\) denotes the returns of the portfolio at time \(i \in \{1,2,\ldots ,T\}\), \(R^{-}_p(i)\) denotes the negative returns of the portfolio at time \(i \in \{1,2,\ldots ,T\}\), \({\textbf{1}}_{\{\cdot \}}\) is an indicator function, \(\text {VaR}_\alpha\) is the Value-at-Risk measure (Sarykalin et al. 2008), and \(\alpha\) is the confidence level. In our analysis, we set the confidence level to 95%.
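Empirically, both measures reduce to a few lines of code (historical-simulation VaR; names ours):

```python
import numpy as np

def max_drawdown(cum_log_returns):
    """MDD: largest peak-to-trough decline of the cumulative log-return path."""
    running_peak = np.maximum.accumulate(cum_log_returns)
    return np.max(running_peak - cum_log_returns)

def cvar(returns, alpha=0.95):
    """CVaR_alpha: mean loss beyond the historical VaR at confidence level alpha."""
    losses = -np.asarray(returns)
    var = np.quantile(losses, alpha)             # historical Value-at-Risk
    return losses[losses >= var].mean()
```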
To ensure realism in our analysis, we rebalance the portfolio monthly, corresponding to approximately 20 trading days, and set the trading cost to 0. While assuming a zero trading cost may not be entirely realistic, we justify this choice based on the infrequency of portfolio rebalancing, occurring only once a month. Moreover, we consider it negligible for simplicity; otherwise, a more complex optimization problem could be considered, which would take into account such a cost.
Tables 2 and 3 present the results obtained for the portfolio strategies when considering the 20 and 40 most liquid stocks for asset allocation, while Tables 9 and 10 in Appendix 2 present the results obtained when considering the 10 and 80 most liquid stocks. The best results are highlighted in bold. Notably, for annualized log-returns, cumulative log-returns, and Sharpe ratio, higher values are considered better, while for annualized standard deviation, excess kurtosis, skewness, MDD, and CVaR, values closer to 0 are preferred. Additionally, annualized log-returns, annualized standard deviation, cumulative log-returns, MDD, and CVaR are multiplied by 100 for easier comparison. The key observations regarding Tables 2, 3, 9 and 10 include the consistently higher annualized standard deviations, MDD, and CVaR achieved by signature-based models compared to baseline network models. However, these outcomes are accompanied by higher annualized log-returns, higher cumulative log-returns, and lower excess kurtosis. The optimal models for daily Sharpe Ratio vary depending on the number of stocks considered, with no clear distinction between signature-based and network baseline models. Moreover, portfolios constructed with 10 and 20 stocks exhibit log-return distributions with kurtosis and skewness values very close to those of a normal distribution, i.e., kurtosis equal to 3 and skewness equal to 0. This finding is relevant for portfolio risk management, because a portfolio whose log-return distribution is closer to a normal distribution allows better predictability and understanding of potential outcomes. Furthermore, we observe that increasing the value of the threshold \(c_\theta\) tends to increase the values of all considered metrics. This indicates that the network-based portfolios benefit from increased sparsity in the graph in terms of log-returns and SR. Conversely, a risk-averse investor benefits from using a lower threshold \(c_\theta\), since the risk measures, i.e., standard deviation, MDD, and CVaR, achieve lower values when the graph is more connected. We also observe that the “Network” and “Sig” approaches tend to outperform the vanilla models. Specifically, Table 2 clearly shows that the “Sig MS” method improves the values of all the considered metrics compared to the MV, GMV, and EWP models. Finally, the best-performing models in terms of cumulative log-returns are those associated with maximizing the Sharpe Ratio, denoted as “Network MS” and “Sig MS”, derived from the baseline network approach and the signature-based similarity matrix, respectively. It is noteworthy that achieving both the highest Sharpe ratio and the highest cumulative log-returns is not straightforward, because the Sharpe ratio scales the log-returns of the portfolio by its associated risk, whereas cumulative log-returns represent the overall performance of the portfolio over the entire time period.
Equity market neutral strategies
To investigate whether the signature-based similarity matrix can effectively transfer some community properties of the market into the portfolio allocation problem, as suggested by the results obtained in the “Community detection” section, we construct portfolio strategies following a market-neutral approach. Such strategies aim to maintain a neutral exposure to overall market movements by balancing long and short positions. Consequently, the constraints of the portfolio optimization problems outlined in the “Classical portfolio strategies” section are modified as follows:
while the respective objective functions remain unchanged. Identifying better communities can significantly enhance portfolio performance under the market-neutral regime.
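As a sketch of this regime, a common dollar-neutral formulation replaces the budget constraint \(\sum _{i} w_i = 1\) with \(\sum _{i} w_i = 0\) and drops the non-negativity requirement on the weights. The code below solves a global-minimum-variance instance of this formulation; note that the unit-gross-exposure normalization \(\sum _{i} |w_i| = 1\), added to rule out the degenerate all-zero solution, is our own assumption and not necessarily the constraint adopted in the article.

```python
import numpy as np
from scipy.optimize import minimize

def market_neutral_gmv(cov: np.ndarray) -> np.ndarray:
    """Dollar-neutral minimum-variance weights (illustrative sketch only)."""
    n = cov.shape[0]
    constraints = [
        {"type": "eq", "fun": lambda w: w.sum()},              # long/short balance
        {"type": "eq", "fun": lambda w: np.abs(w).sum() - 1},  # unit gross exposure
    ]
    # start from a (near-)feasible half-long, half-short split
    w0 = np.concatenate([np.ones(n // 2), -np.ones(n - n // 2)]) / n
    res = minimize(lambda w: w @ cov @ w, w0, method="SLSQP",
                   bounds=[(-1.0, 1.0)] * n, constraints=constraints)
    return res.x
```

Since the gross-exposure constraint is non-smooth, a gradient-based solver such as SLSQP is only a pragmatic choice here; an exact treatment would split each weight into separate long and short parts.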
Tables 4 and 5 present the results for portfolio allocation under the market-neutral approach using the first 20 and 40 most liquid stocks, while in Appendix 2, Tables 11 and 12 present the results for portfolio allocation under the market-neutral approach using the first 10 and 80 most liquid stocks. It is important to note that, in all cases, the portfolio is rebalanced every 20 days, with transaction costs assumed to be 0. When considering 10 stocks (Table 11), we observe differing results between the signature-based and baseline models, indicating that no single model consistently outperforms the others across all the metrics employed. Specifically, the signature-based portfolio consistently achieves better cumulative log-returns, while the network approach yields a superior daily Sharpe ratio. However, increasing the number of stocks reveals that the signature-based portfolio begins to outperform the network baseline across all the considered metrics, except for annualized standard deviation, MDD, and CVaR, where the signature approach consistently exhibits higher values. Of particular interest is that the best cumulative log-returns for the signature-based portfolio are consistently from 2 to 5 times larger than those achieved with the best network-based portfolio across all the values of the filtering threshold and numbers of stocks considered. Furthermore, when considering 40 stocks and filtering the similarity matrix using a threshold of 0.292, the log-returns distribution of the “Sig MS” portfolio approximately follows a normal distribution, with kurtosis close to 3 and skewness close to 0.
Lastly, the results obtained by relaxing the constraint on positive weights, i.e., allowing short selling, align with the findings of the community detection analysis presented in the “Community detection” section. Thus, using a signature-based similarity matrix instead of the correlation matrix in a network approach improves the portfolio performance both with and without the short selling constraint, albeit at the possible expense of increased portfolio standard deviation.
Out-of-sample asset allocation
To validate our analysis, we conducted an out-of-sample study. Recall from the “Data collection” section that we collected the closing prices of stocks belonging to the S&P 500 from Tuesday \(11^{\text {th}}\) July, 2023 to Wednesday \(31^{\text {st}}\) January, 2024, yielding 141 observations for each stock. Subsequently, we computed the log-returns and selected the same liquid stocks as in the previous analysis to maintain consistency and facilitate comparison between the two approaches with the new data. We focused our comparison on the results of the best portfolio model for both the network-based and signature-based strategies, denoted as “Network MS” and “Sig MS”, respectively. Additionally, we also report the results for the vanilla MV, GMV, and EWP. Finally, we compared the performance of the portfolio with and without the short selling constraint.
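For reference, a minimal sketch of this data-preparation step, assuming the `yfinance` package as a client for the Yahoo Finance API (the article does not specify which client was used) and a placeholder list of tickers:

```python
import numpy as np
import yfinance as yf

tickers = ["AAPL", "MSFT", "JPM"]  # placeholder subset of the liquid S&P 500 stocks
prices = yf.download(tickers, start="2023-07-11", end="2024-02-01")["Close"]
log_returns = np.log(prices / prices.shift(1)).dropna()  # daily log-returns
```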
Tables 6 and 7 present the results for the approaches considered using the first 20 and 40 most liquid stocks, while in Appendix 2, Tables 13 and 14 present the results for the first 10 and 80 most liquid stocks. Notably, when the portfolio comprises a small number of stocks (i.e., 10), the network approach consistently outperforms the signature approach when short selling is disallowed. However, with short selling permitted, the signature-based portfolio consistently outperforms the network portfolio in terms of cumulative log-returns and daily Sharpe ratio. Furthermore, we observe that the network-based approaches yield negative log-returns, and consequently negative Sharpe ratios, in this scenario. Conversely, increasing the number of stocks in the portfolio allocation reveals that the signature-based model achieves superior results compared to the network approach.
It is noteworthy that the network baseline approaches with short selling consistently yield negative cumulative log-returns and Sharpe ratios across all the numbers of stocks considered, whereas the signature-based portfolio consistently yields positive cumulative log-returns and Sharpe ratios, except when considering 40 stocks, as shown in Table 7. Additionally, we observe that the performance of the signature-based portfolio, with or without the short selling constraint, does not change significantly across different threshold values.
In conclusion, the out-of-sample analysis demonstrates that the signature-based portfolio consistently outperforms the baseline network approach. Moreover, when relaxing the assumption of considering only positive weights in the portfolio, the signature-based portfolio clearly outperforms the baseline network portfolio in terms of cumulative log-returns and Sharpe ratio. This may be attributed to the fact that, unlike the correlation, the signature is capable of transferring the geometric patterns of the time series into the similarity matrix, further validating its effectiveness in portfolio optimization. Finally, it is noteworthy that the signature-based allocation consistently results in portfolios with higher risk compared to the network approach. This outcome is attributable to the use of a similarity matrix derived from the signatures of the different assets, which is designed to capture nonlinear relationships within the time series. To account for the additional risk introduced by this model, one could substitute the objective function in each optimization problem with one that explicitly penalizes this additional risk.
Conclusion
This study explored the application of a similarity matrix derived from the signature within the portfolio allocation framework. Initially, we provided an overview of several primary portfolio optimization problems. Subsequently, we introduced the network portfolio approaches, which served as our baseline models. Finally, we elucidated how the signature can be incorporated in portfolio allocation problems through network approaches. Furthermore, we conducted a comparative analysis of the network characteristics and community detection capabilities of the correlation matrix and of the signature-based similarity matrix. Our findings revealed that the signature-based approach yielded superior community detection and better-defined network properties. We then addressed portfolio allocation problems on the Standard & Poor’s 500, conducting various analyses with adjustments to the number of stocks, the filtering threshold, and the short selling constraint. Our results demonstrate that the signature-based portfolios consistently outperformed the network-baseline approaches in terms of both cumulative log-returns and Sharpe ratio.
Future research will delve deeper into the network characteristics of the signature-based similarity matrix and explore its applicability to diverse network problems. Additionally, we aim to explore and implement methodologies for substituting the covariance matrix in classical portfolio optimization problems with a signature-based matrix, and to study how the risk of the signature-based portfolio strategy can be controlled. These methodologies are not included in the present comparison because, as a preliminary step, they require investigating how to properly substitute the covariance matrix with a signature-based matrix while preserving the properties of the original matrix. Finally, as the (truncated) signature allows one to extract a large number of features from a set of time series, we plan to apply a signature-based community detection methodology similar to the one in the present article to other contexts involving time series, such as movement analysis.
Availability of data and materials
The datasets used and/or analysed during the current study were downloaded using the API of Yahoo Finance (https://finance.yahoo.com). The resulting “.csv” file is available from the first corresponding author on reasonable request.
The code used in the current study can be found in the following GitHub repository: https://github.com/GeNiN01/Signature-Based-Portfolio-Allocation.
Notes
The p-variation is a measure used to quantify the roughness or irregularity of a path, hence its variability. As the p-variation increases, so does the level of roughness exhibited by the path under consideration. More details can be found in Appendix A of Liao et al. (2023).
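For completeness, we recall the standard definition (our restatement, consistent with the cited appendix): for \(p \ge 1\), the \(p\)-variation of a path \(X :[0,T] \rightarrow {\mathbb {R}}^{d}\) is

$$\Vert X \Vert _{p\text {-var}} = \left( \sup _{D} \sum _{t_i \in D} \Vert X_{t_{i+1}} - X_{t_i} \Vert ^{p} \right) ^{1/p},$$

where the supremum is taken over all finite partitions \(D = \{0 = t_0< t_1< \cdots < t_m = T\}\) of \([0,T]\); rougher paths have finite \(p\)-variation only for larger values of \(p\).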
The lead-lag transformation is selected for its capability to directly extract various features including path volatility (emanating from the second term of the signature), which is a pivotal facet in finance, as stated in Remark 4.1 in Levin et al. (2016).
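As an illustrative sketch (function name ours), the lead-lag transformation maps a discrete series \((x_0, \ldots , x_n)\) to a two-dimensional piecewise-linear path whose coordinates alternate between advancing the lead and the lag component:

```python
import numpy as np

def lead_lag(x: np.ndarray) -> np.ndarray:
    """Map a 1-D series of length n+1 to its 2-D lead-lag path of 2n+1 points."""
    points = []
    for i in range(len(x) - 1):
        points.append((x[i], x[i]))      # lead and lag components agree...
        points.append((x[i + 1], x[i]))  # ...then the lead component moves first
    points.append((x[-1], x[-1]))
    return np.array(points)

# The area enclosed between the two coordinates (captured by the second
# signature term) relates to the quadratic variation of the series, which is
# how path volatility enters the signature.
path = lead_lag(np.array([1.0, 1.2, 0.9, 1.1]))
```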
Note that the symbol \(\otimes\) denotes all the combinations of components taken from \(dL_{t_{1}}\) to \(dL_{t_{k}}\).
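Explicitly, in coordinates, the \(k\)-th signature term collects all the iterated integrals (our restatement of the standard definition)

$$S(L)^{(i_1, \ldots , i_k)}_{[0,T]} = \int _{0< t_1< \cdots< t_k < T} dL^{(i_1)}_{t_1} \cdots dL^{(i_k)}_{t_k}, \quad i_1, \ldots , i_k \in \{1, \ldots , d\},$$

so that \(\otimes\) ranges over all \(d^{k}\) choices of coordinate indices for a \(d\)-dimensional path \(L\).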
It is recalled here that a function \(f: [0,+\infty )\rightarrow {\mathbb {R}}\) is called completely monotone if \(f \in C[0,+\infty ) \cap C^\infty (0,+\infty )\) and \((-1)^l f^{(l)}(r) \ge 0\), \(\forall r>0\) and \(l=0, 1, 2, \ldots\) (Fasshauer 2007, Definition 5.1). It is easy to check that the specific function \(f(r)=\frac{1}{1 + r}\) is completely monotone since \(f^{(l)}(r)=(-1)^l \frac{l!}{(1+r)^{l+1}}\) (see also (Fasshauer 2007, Example 5.3) for a similar check).
The same argument can be applied to the case of the transformation \(p_{ij} = \frac{1}{a + d_{ij}}\) obtained by applying, for \(r=d_{ij}\ge 0\), the function \(f(r)=\frac{1}{a + r}\) with \(a>0\), since that function is also completely monotone (as can be checked by reasoning in a similar way as at the end of the previous note).
The annualized mean is equal to: \(R^{a} = (1 + R^{d})^{252} -1\), where \(R^{d}\) is the mean of the daily log-returns, and 252 is the number of trading days in a year.
The annualized standard deviation is equal to: \(\sigma ^{a} = \sigma ^{d} \cdot \sqrt{252}\), where \(\sigma ^{d}\) is the standard deviation of the daily log-returns, and 252 is the number of trading days in a year.
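A minimal sketch of these two annualization conventions, together with the daily Sharpe ratio (our helper; the zero risk-free rate is an assumption commonly made when comparing log-return series):

```python
import numpy as np

def annualized_stats(daily_log_returns: np.ndarray, trading_days: int = 252):
    r_d = daily_log_returns.mean()
    sigma_d = daily_log_returns.std(ddof=1)
    r_a = (1.0 + r_d) ** trading_days - 1.0    # annualized mean
    sigma_a = sigma_d * np.sqrt(trading_days)  # annualized standard deviation
    sharpe_d = r_d / sigma_d                   # daily Sharpe ratio, zero risk-free rate
    return r_a, sigma_a, sharpe_d
```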
Abbreviations
- CVaR: Conditional value-at-risk
- EWP: Equally weighted portfolio
- GMV: Global minimum variance
- GMVP: Global minimum variance portfolio
- MGF: Moment generating function
- MDD: Maximum drawdown
- MS: Maximum Sharpe ratio
- MST: Minimum spanning tree
- MV: Mean-variance
- PD: Positive definite
- PMFG: Planar maximally filtered graph
- PSD: Positive semi-definite
- S&P 500: Standard and Poor's 500
- SR: Sharpe ratio
References
Amihud Y (2002) Illiquidity and stock returns: cross-section and time-series effects. J Financ Mark 5(1):31–56
Barabási A-L (2013) Network science. Philos Trans R Soc A Math Phys Eng Sci 371(1987):20120375
Barrat A, Barthelemy M, Pastor-Satorras R, Vespignani A (2004) The architecture of complex weighted networks. Proc Natl Acad Sci 101(11):3747–3752
Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):10008
Boritz JE, No WG (2020) How significant are the differences in financial data provided by key data sources? A comparison of XBRL, Compustat, Yahoo! Finance, and Google Finance. J Inf Syst 34(3):47–75
Brockwell PJ, Davis RA (2002) Introduction to time series and forecasting. Springer, New York
Chekhlov A, Uryasev S, Zabarankin M (2005) Drawdown measure in portfolio optimization. Int J Theor Appl Finance 8:13–58. https://doi.org/10.2139/ssrn.544742
Chen K-T (1958) Integration of paths—a faithful representation of paths by noncommutative formal power series. Trans Am Math Soc 89(2):395–407
Chevyrev I, Kormilitzin A (2016) A primer on the signature method in machine learning. arXiv:1603.03788
Chevyrev I, Lyons T (2016) Characteristic functions of measures on geometric rough paths. Ann Probab 44(6):4049–4082
Chordia T, Roll R, Subrahmanyam A (2001) Market liquidity and trading activity. J Financ 56(2):501–530
Chung M, Lee Y, Kim JH, Kim WC, Fabozzi FJ (2022) The effects of errors in means, variances, and correlations on the mean-variance framework. Quant Finance 22(10):1893–1903
Clayton R, Schmidt B (2017) Are capital market parameters estimated from Yahoo Finance and NASDAQ data the same? Bank Finance Rev 9(1):27–46
Clemente GP, Grassi R, Hitaj A (2021) Asset allocation: new evidence through network approaches. Ann Oper Res 299(1):61–80
Clemente GP, Grassi R, Hitaj A (2022) Smart network based portfolios. Ann Oper Res 316(2):1519–1541
Cont R (2001) Empirical properties of asset returns: stylized facts and statistical issues. Quant Finance 1(2):223
Dunis CL, Ho R (2005) Cointegration portfolios of European equities for index tracking and market neutral strategies. J Asset Manag 6(1):33–52
D’Urso P, De Giovanni L, Massari R (2021) Trimmed fuzzy clustering of financial time series based on dynamic time warping. Ann Oper Res 299(1):1379–1395
Fasshauer GE (2007) Meshfree approximation methods with MATLAB. World Scientific, Singapore
Feng S, Xu C, Zuo Y, Chen G, Lin F, XiaHou J (2022) Relation-aware dynamic attributed graph attention network for stocks recommendation. Pattern Recogn 121:108119
Fisher RA (1915) Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika 10(4):507–521
Flint G, Hambly B, Lyons T (2016) Discretely sampled signals and the rough Hoff process. Stoch Process Appl 126(9):2593–2614. https://doi.org/10.1016/j.spa.2016.02.011
Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174
Fulga C, Dedu S, Şerban F (2009) Portfolio optimization with prior stock selection. Econ Comput Econ Cybernet Stud Res 43(4):157–172
Gregnanin M, De Smedt J, Gnecco G, Parton M (2024) Signature-based community detection for time series. In: Complex networks & their applications XII, vol 1142. Springer, Cham, pp 146–158
Jagannathan R, Ma T (2003) Risk reduction in large portfolios: why imposing the wrong constraints helps. J Financ 58(4):1651–1683
Jing R, Rocha LE (2023) A network-based strategy of price correlations for optimal cryptocurrency portfolios. Financ Res Lett 58:104503
Jorion P (1985) International portfolio diversification with estimation risk. J Bus 58(3):259–278
Jorion P (1986) Bayes–Stein estimation for portfolio analysis. J Financ Quant Anal 21(3):279–292
Kolm PN, Tütüncü R, Fabozzi FJ (2014) 60 years of portfolio optimization: practical challenges and current trends. Eur J Oper Res 234(2):356–371
Lemercier M, Salvi C, Damoulas T, Bonilla E, Lyons T (2021) Distribution regression for sequential data. In: Banerjee A, Fukumizu K (eds) Proceedings of the 24th international conference on artificial intelligence and statistics. Proceedings of machine learning research, vol 130, pp 3754–3762
Levin D, Lyons T, Ni H (2016) Learning from the past, predicting the statistics for the future, learning an evolving system. arXiv:1309.0260
Li Y, Jiang X-F, Tian Y, Li S-P, Zheng B (2019) Portfolio optimization based on network topology. Phys A 515:671–681
Liao S, Ni H, Szpruch L, Wiese M, Sabate-Vidales M, Xiao B (2023) Conditional Sig-Wasserstein GANs for time series generation. arXiv:2006.05421
Lyons TJ (1998) Differential equations driven by rough signals. Revista Matemática Iberoamericana 14(2):215–310
Lyons T (2014) Rough paths, signatures and the modelling of functions on streams. arXiv:1405.4537
Lyons T, Ni H (2015) Expected signature of Brownian motion up to the first exit time from a bounded domain. Ann Probab 43(5):2729–2762
Lyons T, Ni H, Oberhauser H (2014) A feature set for streams and an application to high-frequency financial tick data. In: Proceedings of the 2014 international conference on big data science and computing, pp 1–8
MacMahon M, Garlaschelli D (2015) Community detection for correlation matrices. Phys Rev X 5(2):021006
Mansini R, Ogryczak W, Speranza MG (2014) Twenty years of linear programming based portfolio optimization. Eur J Oper Res 234(2):518–535
Mantegna RN (1999) Hierarchical structure in financial markets. Eur Phys J B Condens Matter Complex Syst 11:193–197
Mantegna RN, Stanley HE (1999) Introduction to econophysics: correlations and complexity in finance. Cambridge University Press, Cambridge
Markowitz H (1952) Portfolio selection. J Financ 7(1):77–91
Martel C (2002) The expected complexity of Prim's minimum spanning tree algorithm. Inf Process Lett 81(4):197–201
Massara GP, Di Matteo T, Aste T (2016) Network filtering for big data: triangulated maximally filtered graph. J Complex Netw 5(2):161–178
Morrill J, Fermanian A, Kidger P, Lyons T (2021) A generalised signature method for multivariate time series feature extraction. arXiv:2006.00873
Newman ME, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113
Onnela J-P, Saramäki J, Kertész J, Kaski K (2005) Intensity and coherence of motifs in weighted complex networks. Phys Rev E 71(6):065103
Peralta G, Zareei A (2016) A network approach to portfolio selection. J Empir Financ 38:157–180
Plyakha Y, Uppal R, Vilkov G (2015) Why do equal-weighted portfolios outperform value-weighted portfolios? SSRN Electron J
Pozzi F, Di Matteo T, Aste T (2013) Spread of risk across financial markets: better to invest in the peripheries. Sci Rep 3(1):1665
Prigent J-L (2007) Portfolio optimization and performance analysis. CRC Press, London
Resnick S (2019) A probability path. Springer, Boston
Ricca F, Scozzari A (2024) Portfolio optimization through a network approach: network assortative mixing and portfolio diversification. Eur J Oper Res 312(2):700–717
Sarykalin S, Serraino G, Uryasev S (2008) Value-at-risk vs. conditional value-at-risk in risk management and optimization. In: INFORMS TutORials in operations research. https://doi.org/10.1287/educ.1080.0052
Sharpe WF (1998) The Sharpe ratio. In: Streetwise: the best of the Journal of Portfolio Management. Princeton University Press, Princeton, pp 169–185
Taljaard BH, Mare E (2021) Why has the equal weight portfolio underperformed and what can we do about it? Quant Finance 21(11):1855–1868
Tian H, Zheng X, Zhao K, Liu MW, Zeng DD (2022) Inductive representation learning on dynamic stock co-movement graphs for stock predictions. INFORMS J Comput 34(4):1940–1957
Tumminello M, Aste T, Di Matteo T, Mantegna RN (2005) A tool for filtering information in complex systems. Proc Natl Acad Sci 102(30):10421–10426
Tumminello M, Di Matteo T, Aste T, Mantegna RN (2007) Correlation based networks of equity returns sampled at different time horizons. Eur Phys J B 55:209–217
Tumminello M, Lillo F, Mantegna RN (2010) Correlation, hierarchies, and networks in financial markets. J Econ Behav Organ 75(1):40–58
Vidović J (2019) Turnover based illiquidity measurement as investment strategy on Zagreb stock exchange. Am J Oper Res 10(1):1–12
Výrost T, Lyócsa Š, Baumöhl E (2019) Network-based asset allocation strategies. N Am J Econ Finance 47:516–536
Welch I, Goyal A (2008) A comprehensive look at the empirical performance of equity premium prediction. Rev Financ Stud 21(4):1455–1508
Zhang Z, Zohren S, Roberts S (2020) Deep learning for portfolio optimization. J Financ Data Sci 2(4):8–20
Zhang C, Zhang Z, Cucuringu M, Zohren S (2021) A universal end-to-end approach to portfolio optimization via deep learning. arXiv:2111.09170
Acknowledgements
Marco Gregnanin and Giorgio Gnecco were partially supported by the PRIN 2022 project “MAHATMA” (CUP: D53D23008790006) and by the PRIN PNRR 2022 project “MOTUS” (CUP: D53D23017470001), funded by the European Union - Next Generation EU program.
Funding
The work received no specific funding.
Author information
Authors and Affiliations
Contributions
Marco Gregnanin and Yanyi Zhang contributed equally to the work by preparing a first draft of it, which was then revised by Johannes De Smedt, Giorgio Gnecco, and Maurizio Parton. All authors read and approved the final manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Gregnanin, M., Zhang, Y., De Smedt, J. et al. Signature-based portfolio allocation: a network approach. Appl Netw Sci 9, 54 (2024). https://doi.org/10.1007/s41109-024-00651-1