DOI: 10.1145/3604237.3626909
Research Article
Open access

NMTucker: Non-linear Matryoshka Tucker Decomposition for Financial Time Series Imputation

Published: 25 November 2023

Abstract

Missing values are pervasive in financial time series data, and handling them appropriately is essential to ensure the accuracy and reliability of financial models and forecasts. In this paper, we focus on datasets containing multiple attributes of different firms across time, such as firm fundamentals or characteristics, which can be represented as three-dimensional tensors whose dimensions are time, firm, and attribute. Hence, the task of imputing missing values for these datasets can be formulated as a tensor completion problem. Tensor completion has a wide range of applications, including link prediction, recommendation, and scientific data extrapolation. The widely used completion algorithms, CP and Tucker decompositions, factorize an N-order tensor into N embedding matrices and use multi-linearity among the factors to reconstruct the tensor. Real-world data, however, are often highly sparse and involve complex interactions beyond simple N-order linearity; they demand models capable of capturing latent variables and their non-linear multi-way interactions. We design an algorithm, called Non-Linear Matryoshka Tucker Completion (NMTucker), that uses element-wise Tucker decomposition, multi-layer perceptrons, and non-linear activation functions to address these challenges while remaining scalable. To avoid the overfitting problem that afflicts existing neural-network-based tensor algorithms, we develop a novel strategy that recursively decomposes the Tucker core into smaller cores, reducing the number of trainable parameters and regularizing model complexity. The structure resembles Matryoshka dolls of decreasing size, each nested inside the next. We conduct experiments to show that NMTucker effectively mitigates overfitting and demonstrate its superior generalization capability (up to 53.91% lower RMSE) compared with state-of-the-art models on multiple tensor completion tasks.
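To make the idea concrete, below is a minimal sketch of how an element-wise, Matryoshka-style non-linear Tucker model could be realized in PyTorch. The class name MatryoshkaTucker, the chosen ranks, and the placement of the MLP non-linearities are illustrative assumptions for exposition, not the authors' published implementation.

```python
# Minimal sketch of an element-wise, Matryoshka-style non-linear Tucker model.
# Assumed for illustration: a 3-order tensor indexed by (time, firm, attribute),
# an outer core rank of 16, and an inner (nested) core rank of 4.
import torch
import torch.nn as nn

class MatryoshkaTucker(nn.Module):
    def __init__(self, n_time, n_firm, n_attr, outer_rank=16, inner_rank=4):
        super().__init__()
        # One embedding (factor) matrix per tensor mode.
        self.emb = nn.ModuleList([
            nn.Embedding(n, outer_rank) for n in (n_time, n_firm, n_attr)
        ])
        # Small MLPs inject non-linearity into each mode's embedding.
        self.mlp = nn.ModuleList([
            nn.Sequential(nn.Linear(outer_rank, outer_rank), nn.ReLU())
            for _ in range(3)
        ])
        # Instead of a dense outer_rank^3 core, keep a small inner core plus
        # per-mode projections -- the "doll inside a doll" that shrinks the
        # core's parameter count from outer_rank^3 to
        # inner_rank^3 + 3 * outer_rank * inner_rank.
        self.inner_core = nn.Parameter(
            torch.randn(inner_rank, inner_rank, inner_rank) * 0.1)
        self.proj = nn.ParameterList([
            nn.Parameter(torch.randn(outer_rank, inner_rank) * 0.1)
            for _ in range(3)
        ])

    def core(self):
        # Expand the nested core back to outer_rank^3 via three mode products.
        g = torch.einsum('abc,ia->ibc', self.inner_core, self.proj[0])
        g = torch.einsum('ibc,jb->ijc', g, self.proj[1])
        g = torch.einsum('ijc,kc->ijk', g, self.proj[2])
        return g

    def forward(self, t_idx, f_idx, a_idx):
        # Element-wise Tucker reconstruction for a batch of (t, f, a) indices.
        a = self.mlp[0](self.emb[0](t_idx))   # (batch, outer_rank)
        b = self.mlp[1](self.emb[1](f_idx))
        c = self.mlp[2](self.emb[2](a_idx))
        return torch.einsum('ijk,bi,bj,bk->b', self.core(), a, b, c)
```

Training such a model on only the observed (time, firm, attribute) entries, e.g. with Adam and a squared-error loss, would then impute missing values by evaluating it at the unobserved indices. The nesting is what regularizes: with outer rank R and inner rank r, the core costs r^3 + 3Rr parameters instead of R^3 (256 vs. 4096 for R=16, r=4).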




Published In

ICAIF '23: Proceedings of the Fourth ACM International Conference on AI in Finance
November 2023
697 pages
ISBN: 9798400702402
DOI: 10.1145/3604237
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Data Imputation
  2. Financial Time Series
  3. Non-linear Tensor Decomposition
  4. Sparse Tensor Completion

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICAIF '23

