research-article

Generative AI for End-to-End Limit Order Book Modelling: A Token-Level Autoregressive Generative Model of Message Flow Using a Deep State Space Network

Published: 25 November 2023

Abstract

Developing a generative model of realistic order flow in financial markets is a challenging open problem, with numerous applications for market participants. Addressing this, we propose the first end-to-end autoregressive generative model that generates tokenized limit order book (LOB) messages. These messages are interpreted by the JAX-LOB simulator, which updates the LOB state. To handle long sequences efficiently, the model employs simplified structured state-space layers to process sequences of order book states and tokenized messages. Using LOBSTER data of NASDAQ equity LOBs, we develop a custom tokenizer for message data, converting groups of successive digits to tokens, similar to tokenization in large language models. Out-of-sample results show promising performance in approximating the data distribution, as evidenced by low model perplexity. Furthermore, the mid-price returns calculated from the generated order flow exhibit a significant correlation with the data, indicating impressive conditional forecast performance. Owing to the granularity of the generated data and the accuracy of the model, the approach opens new application areas for future work beyond forecasting, e.g. acting as a world model in high-frequency financial reinforcement learning applications. Overall, our results invite the use and extension of the model in the direction of autoregressive large financial models for the generation of high-frequency financial data.
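
The digit-group tokenization and the perplexity metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' tokenizer: the three-digit group size, the field width, and the function names `tokenize_field`, `detokenize_field`, and `perplexity` are all assumptions made for illustration, and the example price value is made up.

```python
import math

GROUP_SIZE = 3  # assumption: digits per token; the abstract does not state this value


def tokenize_field(value: int, width: int, group: int = GROUP_SIZE) -> list[str]:
    """Zero-pad a non-negative integer field and split it into digit-group tokens."""
    if value < 0:
        raise ValueError("only non-negative integer fields are handled in this sketch")
    digits = str(value).zfill(width)
    if len(digits) > width:
        raise ValueError(f"{value} does not fit in {width} digits")
    # Successive groups of `group` digits become one token each.
    return [digits[i:i + group] for i in range(0, width, group)]


def detokenize_field(tokens: list[str]) -> int:
    """Invert tokenize_field by concatenating the digit groups."""
    return int("".join(tokens))


def perplexity(token_probs: list[float]) -> float:
    """Exponential of the mean negative log-probability assigned to the observed tokens."""
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # Hypothetical integer price field, padded to 9 digits.
    price_tokens = tokenize_field(1234500, width=9)
    print(price_tokens)  # ['001', '234', '500']
    print(detokenize_field(price_tokens))  # 1234500
    # A model assigning probability 0.25 to every observed token has perplexity of about 4.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

Under these assumptions each token comes from a vocabulary of 1,000 three-digit strings, so a model that guesses uniformly would have a perplexity of about 1,000; values well below that suggest the model has learned structure in the message flow.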

Published In

ICAIF '23: Proceedings of the Fourth ACM International Conference on AI in Finance
November 2023
697 pages
ISBN: 9798400702402
DOI: 10.1145/3604237
Publisher

Association for Computing Machinery
New York, NY, United States

Author Tags

1. ML
2. generative AI
3. limit order books
4. structured state space models

Qualifiers

• Research-article
• Research
• Refereed limited
