Review of Statistical Approaches for Modeling High-Frequency Trading Data

Chiranjit Dutta¹,
Kara Karpman²,
Sumanta Basu³ &
…
Nalini Ravishanker¹

Abstract

Due to technological advancements over the last two decades, algorithmic trading strategies are now widely used in financial markets. In turn, these strategies have generated high-frequency (HF) data sets, which provide information at an extremely fine scale and are useful for understanding market behaviors, dynamics, and microstructures. In this paper, we discuss how information flow impacts the behavior of HF traders and how certain high-frequency trading (HFT) strategies significantly impact market dynamics (e.g., asset prices). We also review statistical modeling approaches for analyzing HFT data, focusing on four popular ones: (i) aggregating data into regularly spaced bins and then applying regular time series models, (ii) modeling jumps in price processes, (iii) using point processes to model the occurrence of events of interest, and (iv) modeling sequences of inter-event durations. We discuss two methods for defining events, one based on the asset price and the other based on both the price and the volume of the asset. We construct durations based on these two definitions and apply models to tick-by-tick data for assets traded on the New York Stock Exchange (NYSE). We discuss some open challenges arising in HFT data analysis, including some empirical analysis, and review applications of HFT data in finance and economics, outlining several research directions.
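
As a rough illustration of approaches (i) and (iv), the sketch below bins irregularly spaced ticks into regular intervals and builds price-based durations. The tick layout, the function names, the toy data, and the threshold rule used to define a price event are all assumptions made for this example; the paper's own event definitions may differ.

```python
from typing import List, Tuple

# Hypothetical tick layout: (timestamp in milliseconds, price in cents).
# Integers are used so that the arithmetic below is exact.
Tick = Tuple[int, int]


def bin_last_price(ticks: List[Tick], width_ms: int) -> List[Tuple[int, int]]:
    """Approach (i): keep the last observed price in each regularly spaced bin.

    Bins with no ticks are simply absent from the result.
    """
    last_in_bin = {}
    for t, price in ticks:  # ticks are assumed to be sorted by time
        last_in_bin[t // width_ms] = price  # later ticks overwrite earlier ones
    return sorted(last_in_bin.items())


def price_durations(ticks: List[Tick], threshold: int) -> List[int]:
    """Approach (iv): durations between successive price events.

    Assumed event rule: an event occurs when the price has moved by at
    least `threshold` (in absolute value) since the previous event.
    """
    if not ticks:
        return []
    durations = []
    ref_time, ref_price = ticks[0]
    for t, price in ticks[1:]:
        if abs(price - ref_price) >= threshold:
            durations.append(t - ref_time)
            ref_time, ref_price = t, price  # restart from this event
    return durations


# Toy data, invented for illustration only.
ticks = [(0, 10000), (400, 10002), (1300, 10005),
         (2100, 10004), (4800, 10011), (5200, 10010)]

print(bin_last_price(ticks, width_ms=1000))
# [(0, 10002), (1, 10005), (2, 10004), (4, 10011), (5, 10010)]
print(price_durations(ticks, threshold=5))
# [1300, 3500]
```

The binned series could then be passed to a regular time series model, while the duration sequence is the kind of input a duration model would take. Note that bin 3 is empty in the toy data, which is one reason regular binning can be awkward for sparse ticks.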

Notes

Indeed, this is one of the criticisms of HFT. The May 6, 2010 “Flash Crash,” in which the Dow Jones Industrial Average dropped by almost 1,000 points in 30 min, was the result of an execution algorithm that considered only volume, not time. As a result, $4.1 billion of E-Mini S&P 500 futures contracts were sold on the Chicago Mercantile Exchange in a mere 20-min interval (Goldstein et al. 2014).
A number of approaches can be used to classify trades as buyer- or seller-initiated, including the Lee-Ready algorithm, the tick rule, and bulk volume classification (see Easley et al. 2016 and references therein).
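
To make the simplest of these classifiers concrete, here is a minimal sketch of the tick rule as it is commonly described: a trade at a higher price than the previous trade is labeled buyer-initiated, a trade at a lower price is labeled seller-initiated, and a trade at an unchanged price inherits the previous label. The function name, the +1/-1/0 coding, and the toy prices are assumptions for illustration, and this is not necessarily the exact variant used in the paper.

```python
from typing import List


def tick_rule(prices: List[int]) -> List[int]:
    """Classify trades with the tick rule.

    Returns +1 (buyer-initiated), -1 (seller-initiated), or 0 when no
    earlier price change is available to classify the trade.
    """
    signs = []
    prev_sign = 0  # the first trade has no predecessor, so it stays unclassified
    for i, price in enumerate(prices):
        if i == 0:
            sign = 0
        elif price > prices[i - 1]:
            sign = 1            # uptick
        elif price < prices[i - 1]:
            sign = -1           # downtick
        else:
            sign = prev_sign    # zero tick: carry the last classification forward
        signs.append(sign)
        prev_sign = sign
    return signs


# Toy trade prices in cents, invented for illustration only.
print(tick_rule([10000, 10001, 10001, 9999, 9999, 10002]))
# [0, 1, 1, -1, -1, 1]
```

The Lee-Ready algorithm and bulk volume classification need more inputs (quotes, or volume bars), so they are not sketched here.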

References

Aït-Sahalia, Y. and Jacod, J. (2009). Testing for jumps in a discretely observed process. The Annals of Statistics 184–222.
Aït-Sahalia, Y., Fan, J. and Xiu, D. (2010). High-frequency covariance estimates with noisy and asynchronous financial data. Journal of the American Statistical Association 105, 492, 1504–1517.
Aït-Sahalia, Y., Jacod, J. and Li, J. (2012). Testing for jumps in noisy high frequency data. Journal of Econometrics 168, 2, 207–222.
Alizadeh, S., Brandt, M.W. and Diebold, F.X. (2002). Range-based estimation of stochastic volatility models. Journal of Finance 57, 3, 1047–1091.
Andersen, T.G. and Bollerslev, T. (1997). Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance, 4, 2-3, 115–158.
Andersen, T.G. and Bollerslev, T. (1998). Answering the skeptics: yes, standard volatility models do provide accurate forecasts. International Economic Review 39, 4, 885–905.
Andersen, T.G., Benzoni, L. and Lund, J. (2002). An empirical investigation of continuous-time equity return models. The Journal of Finance, 57, 3, 1239–1284.
Andersen, T.G., Bollerslev, T. and Diebold, F.X. (2007). Roughing it up: including jump components in the measurement, modeling, and forecasting of return volatility. The Review of Economics and Statistics 89, 4, 701–720.
Ardia, D., Bluteau, K., Boudt, K., Catania, L. and Trottier, D.-A. (2019). Markov-switching GARCH models in R: the MSGARCH package. Journal of Statistical Software 91, 4.
Asai, M., Chang, C.-L. and McAleer, M. (2017). Realized stochastic volatility with general asymmetry and long memory. Journal of Econometrics 199, 2, 202–212.
Baillie, R.T. (1996). Long memory processes and fractional integration in econometrics. Journal of Econometrics, 73, 1, 5–59.
Baillie, R.T., Bollerslev, T. and Mikkelsen, H.O. (1996). Fractionally integrated generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 74, 1, 3–30.
Barndorff-Nielsen, O.E. and Shephard, N. (2002). Econometric analysis of realized volatility and its use in estimating stochastic volatility models. Journal of the Royal Statistical Society Series B (Statistical Methodology) 64, 2, 253–280.
Barndorff-Nielsen, O.E. and Shephard, N. (2004). Power and bipower variation with stochastic volatility and jumps. Journal of Financial Econometrics 2, 2, 1–37.
Barndorff-Nielsen, O.E. and Shephard, N. (2005). Variation, jumps, market frictions and high frequency data in financial econometrics.
Barndorff-Nielsen, O.E. and Shephard, N. (2006). Econometrics of testing for jumps in financial economics using bipower variation. Journal of Financial Econometrics 4, 1, 1–30.
Barndorff-Nielsen, O.E., Hansen, P.R., Lunde, A. and Shephard, N. (2011). Multivariate realised kernels: consistent positive semi-definite estimators of the covariation of equity prices with noise and non-synchronous trading. Journal of Econometrics, 162, 2, 149–169.
Bauwens, L. and Giot, P. (2000). The logarithmic ACD model: an application to the bid-ask quote process of three NYSE stocks. Annales d’Economie et de Statistique 60, 117–149.
Bauwens, L. and Hautsch, N. (2006). Stochastic conditional intensity processes. Journal of Financial Econometrics 4, 3, 450–493.
Bauwens, L. and Veredas, D. (2004). The stochastic conditional duration model: a latent variable model for the analysis of financial durations. Journal of Econometrics 119, 2, 381–412.
Belfrage, M. (2016). ACDM: tools for autoregressive conditional duration models. (R package version 1.0.4).
Beran, J. (1994). Statistics for long-memory processes. CRC Press.
Bibinger, M. (2011). Efficient covariance estimation for asynchronous noisy high-frequency data. Scandinavian Journal of Statistics 38, 1, 23–45.
Billio, M., Getmansky, M., Lo, A.W. and Pelizzon, L. (2012). Econometric measures of connectedness and systemic risk in the finance and insurance sectors. Journal of Financial Economics 104, 3, 535–559.
Bjursell, J. and Gentle, J.E. (2012). Identifying jumps in asset prices. In: Handbook of computational finance, pp. 371–399. Springer.
Black, F. (1976). Studies of stock market volatility changes. In: Proceedings of the American statistical association business and economic statistics section, pp. 177–181.
Black, F. (1986). Noise. The Journal of Finance 41, 3, 528–543.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 3, 307–327.
Boudt, K., Cornelissen, J., Payseur, S., Kleen, O. and Sjoerup, E. (2021a). highfrequency: tools for highfrequency data analysis. https://CRAN.R-project.org/package=highfrequency. R package version 0.9.0.
Boudt, K., Kleen, O. and Sjørup, E. (2021b). Analyzing intraday financial data in R: the highfrequency package. Available at SSRN 3917548.
Brouste, A., Fukasawa, M., Hino, H., Iacus, S., Kamatani, K., Koike, Y., Masuda, H., Nomura, R., Ogihara, T., Shimuzu, Y. et al (2014). The yuima project: a computational framework for simulation and inference of stochastic differential equations. Journal of Statistical Software 57, 1–51.
Buccheri, G., Bormetti, G., Corsi, F. and Lillo, F. (2021a). A score-driven conditional correlation model for noisy and asynchronous data: an application to high-frequency covariance dynamics. Journal of Business & Economic Statistics, 39, 4, 920–936.
Buccheri, G., Corsi, F. and Peluso, S. (2021b). High-frequency lead-lag effects and cross-asset linkages: a multi-asset lagged adjustment model. Journal of Business & Economic Statistics, 39, 605–621.
Cameron, A.C. and Trivedi, P.K. (2013). Regression analysis of count data. Cambridge University Press, Cambridge.
Cao, W., Hurvich, C. and Soulier, P. (2017). Drift in transaction-level asset price models. Journal of Time Series Analysis, 38, 5, 769–790.
Carr, P. and Wu, L. (2003). The finite moment log stable process and option pricing. The Journal of Finance 58, 2, 753–777.
Carr, P., Madan, D. and Chang, E. (1998). The variance gamma process and option pricing. European Finance Review 2, 1, 79–105.
Carr, P., Geman, H., Madan, D.B. and Yor, M. (2002). The fine structure of asset returns: an empirical investigation. The Journal of Business 75, 2, 305–332.
Cartea, A. and Jaimungal, S. (2013). Modelling asset prices for algorithmic and high-frequency trading. Applied Mathematical Finance, 20, 6, 512–547.
Chakrabarti, A. and Sen, R. (2019). Copula estimation for nonsynchronous financial data. arXiv:1904.10182.
Chan, K. (1992). A further analysis of the lead–lag relationship between the cash market and stock index futures market. The Review of Financial Studies 5, 1, 123–152.
Chen, C.W.S., Gerlach, R., Hwang, B.B.K and McAleer, M. (2012). Forecasting Value-at-Risk using nonlinear regression quantiles and the intra-day range. International Journal of Forecasting 28, 3, 557–574.
Chen, F., Diebold, F.X. and Schorfheide, F. (2013). A Markov-switching multifractal inter-trade duration model, with application to US equities. Journal of Econometrics 177, 2, 320–342.
Chib, S., Nardari, F. and Shephard, N. (2002). Markov chain Monte Carlo methods for stochastic volatility models. Journal of Econometrics 108, 2, 281–316.
Chib, S., Omori, Y. and Asai, M. (2009). Multivariate stochastic volatility. In: Handbook of financial time series, pp. 365–400. Springer.
Christensen, K., Kinnebrock, S. and Podolskij, M. (2010). Pre-averaging estimators of the ex-post covariance matrix in noisy diffusion models with non-synchronous data. Journal of Econometrics 159, 1, 116–133.
Christensen, K., Oomen, R.C. and Podolskij, M. (2014). Fact or friction: jumps at ultra high frequency. Journal of Financial Economics 114, 3, 576–599.
Cont, R. (2011). Statistical modeling of high-frequency financial data. IEEE Signal Processing Magazine, 28, 5, 16–25.
Cont, R. and Tankov, P. (2004). Financial modeling with jump processes. Chapman & Hall/CRC, Boca Raton.
Coroneo, L. and Veredas, D. (2012). A simple two-component model for the distribution of intraday returns. The European Journal of Finance 18, 9, 775–797.
Corsi, F. and Audrino, F. (2012). Realized covariance tick-by-tick in presence of rounded time stamps and general microstructure effects. J. Financ. Econom. 10, 591–616.
Cox, D.R. (1972). Regression models and life-tables. J. R. Stat. Soc. Ser. B (Methodol.) 34, 187–202.
Cox, D.R. and Oakes, D. (2018). Analysis of survival data. Chapman and Hall/CRC.
Daley, D.J. and Vere-Jones, D. (2003). An introduction to the theory of point processes: volume I: elementary theory and methods. Springer.
De Jong, F. and Nijman, T. (1997). High frequency analysis of lead-lag relationships between financial markets. J. Empir. Finance 4, 259–277.
Deo, R., Hsieh, M. and Hurvich, C.M. (2010). Long memory in intertrade durations, counts and realized volatility of NYSE stocks. J. Stat. Plan. Inference 140, 3715–3733.
Diamond, D.W. and Verrecchia, R.E. (1987). Constraints on short-selling and asset price adjustment to private information. J. Financ. Econ. 18, 277–311.
Diebold, F.X. and Yılmaz, K. (2014). On the network topology of variance decompositions: measuring the connectedness of financial firms. J. Econ. 182, 119–134.
Dionne, G., Duchesne, P. and Pacurar, M. (2009). Intraday value at risk (IVaR) using tick-by-tick data with application to the Toronto Stock Exchange. J. Empir. Finance 16, 777–792.
Dobrev, D. and Schaumburg, E. (2017). High-frequency cross-market trading: model free measurement and applications. Perspectives.
Duffie, D., Pan, J. and Singleton, K. (2000). Transform analysis and asset pricing for affine jump-diffusions. Econometrica 68, 1343–1376.
Dufour, A. and Engle, R.F. (2000). Time and the price impact of a trade. J. Financ. 55, 2467–2498.
Easley, D. and O’Hara, M. (1992). Time and the process of security price adjustment. J. Financ. 47, 577–605.
Easley, D., Kiefer, N.M., O’Hara, M. and Paperman, J.B. (1996). Liquidity, information, and infrequently traded stocks. J. Financ. 51, 1405–1436.
Easley, D., Hvidkjaer, S. and O’Hara, M. (2002). Is information risk a determinant of asset returns? J. Financ. 57, 2185–2221.
Easley, D., de Prado, M.M.L. and O’Hara, M. (2012a). The volume clock: insights into the high-frequency paradigm. J. Portfolio Manag. 39, 19–29.
Easley, D., López de Prado, M.M. and O’Hara, M. (2012b). Flow toxicity and liquidity in a high frequency world. Rev. Financ. Stud. 25, 1457–1493.
Easley, D., de Prado, M.L. and O’Hara, M. (2016). Discerning information from trade data. J. Financ. Econ. 120, 269–285.
Easley, D., López de Prado, M., O’Hara, M. and Zhang, Z. (2021). Microstructure in the machine age. Rev. Financ. Stud. 34, 3316–3363.
Eberlein, E. and Keller, U. (1995). Hyperbolic distributions in finance. Bernoulli 281–299.
Efron, B. (1986). Double exponential families and their use in generalized linear regression. J. Am. Stat. Assoc. 81, 709–721.
Embrechts, P., Klüppelberg, C. and Mikosch, T. (2013). Modelling extremal events: for insurance and finance. Springer Science & Business Media.
Engle, R. (2002a). Dynamic conditional correlation: a simple class of multivariate generalized autoregressive conditional heteroskedasticity models. J. Bus. Econ. Stat. 20, 339–350.
Engle, R. (2002b). New frontiers for ARCH models. J. Appl. Econom. 17, 425–446.
Engle, R.F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econom.: J. Econom. Soc. 50, 987–1007.
Engle, R.F. and Russell, J.R. (1998). Autoregressive conditional duration: a new model for irregularly spaced transaction data. Econometrica 66, 1127–1162.
Epps, T.W. (1979). Comovements in stock prices in the very short run. J. Am. Stat. Assoc. 74, 291–298.
Evans, K.P. (2011). Intraday jumps and US macroeconomic news announcements. J. Bank. Finance 35, 2511–2527.
Fan, J., Li, Y. and Yu, K. (2012). Vast volatility matrix estimation using high-frequency data for portfolio selection. J. Am. Stat. Assoc. 107, 412–428.
Feng, Y. and Zhou, C. (2015). Forecasting financial market activity using a semiparametric fractionally integrated log-ACD. Int. J. Forecast. 31, 349–363.
Fernandes, M. and Grammig, J. (2006). A family of autoregressive conditional duration models. J. Econ. 130, 1–23.
Fissler, T. and Ziegel, J.F. (2016). Higher order elicitability and Osband’s principle. Ann. Stat. 44, 1680–1707.
Fleming, T.R. and Harrington, D.P. (2011). Counting processes and survival analysis. Wiley, New York.
Gerlach, R. and Chen, C.W. (2015). Bayesian expected shortfall forecasting incorporating the intraday range. J. Financ. Econom. 14, 128–158.
Gerlach, R. and Wang, C. (2016). Forecasting risk via realized GARCH, incorporating the realized range. Quant. Finance 16, 501–511.
Ghalanos, A. (2020). rugarch: univariate GARCH models. R package version 1.4-4.
Giot, P. (2005). Market risk models for intraday data. Eur. J. Finance 11, 309–324.
Goldstein, M.A., Kumar, P. and Graves, F.C. (2014). Computerized and high-frequency trading. Financ. Rev. 49, 177–202.
Gordy, M.B. and Juneja, S. (2010). Nested simulation in portfolio risk measurement. Manag. Sci. 56, 1833–1848.
Granger, C.W. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econom. J. Econom. Soc., 424–438.
Hansen, P.R. and Lunde, A. (2005). A forecast comparison of volatility models: does anything beat a GARCH(1,1)? J. Appl. Econ. 20, 873–889.
Hansen, P.R., Lunde, A. and Nason, J.M. (2003). Choosing the best volatility models: the model confidence set approach. Oxf. Bull. Econ. Stat. 65, 839–861.
Hansen, P.R., Huang, Z. and Shek, H.H. (2012). Realized GARCH: a joint model for returns and realized measures of volatility. J. Appl. Econ. 27, 877–906.
Harris, L. (2003). Trading and exchanges: market microstructure for practitioners. Oxford University Press, Oxford.
Harvey, A., Ruiz, E. and Shephard, N. (1994). Multivariate stochastic variance models. Rev. Econ. Stud. 61, 247–264.
Harvey, A.C. and Shephard, N. (1996). Estimation of an asymmetric stochastic volatility model for asset returns. J. Bus. Econ. Stat. 14, 429–434.
Hasbrouck, J. (2007). Empirical market microstructure: the institutions, economics, and econometrics of securities trading. Oxford University Press, Oxford.
Hautsch, N. (2011). Econometrics of financial high-frequency data. Springer Science & Business Media.
Hautsch, N., Klausurtagung, S. and Risiko, O. (2006). Generalized autoregressive conditional intensity models with long range dependence.
Hawkes, A.G. (1971). Spectra of some self-exciting and mutually exciting point processes. Biometrika 58, 83–90.
Hayashi, T. and Koike, Y. (2017). Multi-scale analysis of lead-lag relationships in high-frequency financial markets. arXiv:1708.03992.
Hayashi, T., Yoshida, N. et al (2005). On covariance estimation of non-synchronously observed diffusion processes. Bernoulli 11, 359–379.
Heckman, J.J. and Singer, B. (1984). Econometric duration analysis. J. Econ. 24, 63–132.
Heinen, A. (2003). Modelling time series count data: an autoregressive conditional Poisson model. Available at SSRN 1117187.
Hoffmann, M., Rosenbaum, M. and Yoshida, N. (2013). Estimation of the lead-lag parameter from non-synchronous data. Bernoulli 19, 426–461.
Hosszejni, D. and Kastner, G. (2019). Modeling univariate and multivariate stochastic volatility in R with stochvol and factorstochvol. arXiv:1906.12123.
Hsieh, M.-C., Hurvich, C. and Soulier, P. (2019). Modeling leverage and long memory in volatility in a pure-jump process. High Frequency 2, 124–141.
Huang, D., Zhu, S., Fabozzi, F.J. and Fukushima, M. (2010). Portfolio selection under distributional uncertainty: a relative robust CVaR approach. Eur. J. Oper. Res. 203, 185–194.
Iacus, S.M. and Yoshida, N. (2018). Simulation and inference for stochastic processes with yuima. A comprehensive R framework for SDEs and other stochastic processes. Use R.
Jacod, J., Li, Y., Mykland, P.A., Podolskij, M. and Vetter, M. (2009). Microstructure noise in the continuous case: the pre-averaging approach. Stoch. Process. Appl. 119, 2249–2276.
Jacquier, E., Polson, N.G. and Rossi, P.E. (2004). Bayesian analysis of stochastic volatility models with fat-tails and correlated errors. J. Econ. 122, 185–212.
Jasiak, J. (1999). Persistence in intertrade durations. Available at SSRN: https://ssrn.com/abstract=162008.
Jiang, G.J. and Oomen, R. (2005). A new test for jumps in asset prices. Preprint.
Jiang, G.J. and Oomen, R.C. (2008). Testing for jumps when asset prices are observed with noise—a “swap variance” approach. J. Econ. 144, 352–370.
Kalbfleisch, J.D. and Prentice, R.L. (2011). The statistical analysis of failure time data. Wiley, New York.
Keim, D.B. and Madhavan, A. (1996). The upstairs market for large-block transactions: analysis and measurement of price effects. Rev. Financ. Stud. 9, 1–36.
Kleinbaum, D.G. and Klein, M. (2010). Survival analysis. Springer, Berlin.
Kwok, S.S.M., Li, W.K. and Yu, P.L.H. (2009). The autoregressive conditional marked duration model: statistical inference to market microstructure. J. Data Sci.
Lancaster, T. (1979). Econometric methods for the duration of unemployment. Econom. J. Econom. Soc. 47, 939–956.
Lane, W.R., Looney, S.W. and Wansley, J.W. (1986). An application of the Cox proportional hazards model to bank failure. J. Bank. Finance 10, 511–531.
Lee, S.S. and Mykland, P.A. (2008). Jumps in financial markets: a new nonparametric test and jump dynamics. Rev. Financ. Stud. 21, 2535–2563.
Li, J., Todorov, V., Tauchen, G. and Lin, H. (2019). Rank tests at jump events. J. Bus. Econ. Stat. 37, 312–321.
Liboschik, T., Fokianos, K. and Fried, R. (2017). tscount: an R package for analysis of count time series following generalized linear models. J. Stat. Softw. 82, 1–51.
Liu, H., Zou, J. and Ravishanker, N. (2018). Multiple day biclustering of high-frequency financial time series. Stat 7, e176.
Liu, H., Zou, J. and Ravishanker, N. (2021). Clustering high-frequency financial time series based on information theory, forthcoming. Appl. Stoch. Models Bus. Ind.
Liu, S. and Tse, Y.-K. (2015). Intraday value-at-risk: an asymmetric autoregressive conditional duration approach. J. Econ. 189, 437–446.
Madan, D.B. and Seneta, E. (1990). The variance gamma (VG) model for share market returns. J. Bus. 511–524.
Mancino, M.E. and Sanfelici, S. (2011). Estimating covariance via Fourier method in the presence of asynchronous trading and microstructure noise. J. Financ. Econom. 9, 367–408.
Manganelli, S. (2005). Duration, volume and volatility impact of trades. J. Financ. Mark. 8, 377–399.
Martens, M. and Van Dijk, D. (2007). Measuring volatility with the realized range. J. Econ. 138, 181–207.
Meng, X. and Taylor, J.W. (2020). Estimating value-at-risk and expected shortfall using the intraday low and range data. Eur. J. Oper. Res. 280, 191–202.
Mies, F., Bibinger, M., Steland, A. and Podolskij, M. (2020). High-frequency inference for stochastic processes with jumps of infinite activity. PhD thesis, RWTH Aachen University.
Mukherjee, A., Peng, W., Swanson, N.R. and Yang, X. (2020). Financial econometrics and big data: a survey of volatility estimators and tests for the presence of jumps and co-jumps. In: Handbook of statistics, vol 42, pp 3–59. Elsevier.
Nelson, D.B. (1991). Conditional heteroskedasticity in asset returns: a new approach. Econom. J. Econom. Soc. 59, 347–370.
N.Y.S.E. Trade and Quote Database (2019). Retrieved from Wharton Research Data Services.
O’Hara, M. (1997). Market microstructure theory. Wiley, New York.
Pacurar, M. (2008). Autoregressive conditional duration models in finance: a survey of the theoretical and empirical literature. J. Econ. Surv. 22, 711–751.
Palma, W. (2007). Long-memory time series: theory and methods. Wiley, New York.
Parkinson, M. (1980). The extreme value method for estimating the variance of the rate of return. J. Bus. 53, 61–65.
Peluso, S., Corsi, F. and Mira, A. (2014). A Bayesian high-frequency estimator of the multivariate covariance of noisy and asynchronous returns. J. Financ. Econom. 13, 665–697.
Renò, R. (2003). A closer look at the Epps effect. Int. J. Theor. Appl. Finance 6, 87–102.
Robinson, P.M. (2003). Time series with long memory. Advanced Texts in Econometrics.
Rocco, M. (2014). Extreme value theory in finance: a survey. J. Econ. Surv. 28, 82–108.
Russell, J.R. (1999). Econometric modeling of multivariate irregularly-spaced high-frequency data. Working Paper, University of Chicago.
Rydberg, T.H. and Shephard, N. (2000). BIN models for trade-by-trade data. Modelling the number of trades in a fixed interval of time. Econometric Society World Congress 2000 Contributed Papers 0740, Econometric Society. https://ideas.repec.org/p/ecm/wc2000/0740.html.
Sen, R. (2009). Jumps and microstructure noise in stock price volatility. Volatility, 163.
Shirota, S., Hizu, T. and Omori, Y. (2014). Realized stochastic volatility with leverage and long memory. Comput. Stat. Data Anal. 76, 618–641.
So, M.K. and Xu, R. (2013). Forecasting intraday volatility and value-at-risk with high-frequency data. Asia-Pac. Finan. Markets 20, 83–111.
So, M.K., Chu, A.M., Lo, C.C. and Ip, C.Y. (2021). Volatility and dynamic dependence modeling: review, applications, and financial risk management. Wiley Interdiscip. Rev.: Comput. Stat., e1567.
Song, X., Kim, D., Yuan, H., Cui, X., Lu, Z., Zhou, Y. and Wang, Y. (2021). Volatility analysis with realized GARCH-Itô models. J. Econ. 222, 393–410.
Stroud, J.R. and Johannes, M.S. (2014). Bayesian modeling and forecasting of 24-hour high-frequency volatility. J. Am. Stat. Assoc. 109, 1368–1384.
Sun, W., Rachev, S., Fabozzi, F.J. and Kalev, P.S. (2008). Fractals in trade duration: capturing long-range dependence and heavy tailedness in modeling trade duration. Ann. Finance 4, 217–241.
Swishchuk, A. and Huffman, A. (2020). General compound Hawkes processes in limit order books. Risks 8, 28.
Takahashi, M., Omori, Y. and Watanabe, T. (2009). Estimating stochastic volatility models using daily returns and realized volatility simultaneously. Comput. Stat. Data Anal. 53, 2404–2426.
Takahashi, M., Watanabe, T. and Omori, Y. (2016). Volatility and quantile forecasts by realized stochastic volatility models with generalized hyperbolic distribution. Int. J. Forecast. 32, 437–457.
Tay, A.S., Ting, C., Tse, Y.K. and Warachka, M. (2004). Transaction-data analysis of marked durations and their implications for market microstructure.
Tay, A.S., Ting, C., Kuen Tse, Y. and Warachka, M. (2011). The impact of transaction duration, volume and direction on price dynamics and volatility. Quant. Finance 11, 447–457.
Taylor, S.J. (1982). Financial returns modelled by the product of two stochastic processes-a study of the daily sugar prices 1961–75. Time Ser. Anal. Theory Pract. 1, 203–226.
Taylor, S.J. (1994). Modeling stochastic volatility: a review and comparative study. Math. Financ. 4, 183–204.
Thavaneswaran, A., Ravishanker, N. and Liang, Y. (2015). Generalized duration models and optimal estimation using estimating functions. Ann. Inst. Stat. Math. 67, 129–156.
Therneau, T.M. (2021). Survival: a package for survival analysis in R. R package version 3.2-13.
Tsai, P.-C. and Shackleton, M.B. (2016). Detecting jumps in high-frequency prices under stochastic volatility: a review and a data-driven approach. In: Handbook of high-frequency trading and modeling in finance, pp 137–181.
Tsay, R.S. (2005). Analysis of financial time series. Wiley, New York.
Vasileios, S. (2015). acp: autoregressive conditional poisson (R package version 2.1).
Wang, Q., Figueroa-López, J.E. and Kuffner, T.A. (2021). Bayesian inference on volatility in the presence of infinite jump activity and microstructure noise. Electron. J. Stat. 15, 506–553.
Wang, Y. and Zou, J. (2014). Volatility analysis in high-frequency financial data. Wiley Interdiscip. Rev. Comput. Stat. 6, 393–404.
Yan, B. and Zivot, E. (2003). Analysis of high-frequency financial data with S-PLUS. Working paper, UWEC-2005-03. http://ideas.repec.org/p/udb/wpaper/uwec-2005-03.html.
Yu, J. and Meyer, R. (2006). Multivariate stochastic volatility models: Bayesian estimation and model comparison. Econ. Rev. 25, 361–384.
Zaatour, R. (2014). Hawkes: Hawkes process simulation and calibration toolkit (R package version 0.0-4).
Zhang, L. (2011). Estimating covariation: Epps effect, microstructure noise. J. Econ. 160, 33–47.
Zhang, Y., Zou, J., Ravishanker, N. and Thavaneswaran, A. (2019). Modeling financial durations using penalized estimating functions. Comput. Stat. Data Anal. 131, 145–158.
Zheng, Y., Li, Y. and Li, G. (2016). On Fréchet autoregressive conditional duration models. J. Stat. Plan. Inference 175, 51–66.
Žikeš, F., Baruník, J. and Shenai, N. (2017). Modeling and forecasting persistent financial durations. Econom. Rev. 36, 1081–1110.
Zivot, E. and Wang, J. (2007). Modeling financial time series with S-PLUS, vol 191. Springer Science & Business Media.

Acknowledgements

The authors are very grateful to the reviewers and editors for their helpful suggestions for improving the paper.

Funding

This paper was based upon work partially supported by the National Science Foundation under Grant DMS-1638521 to the Statistical and Applied Mathematical Sciences Institute. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. In addition, the work of SB was supported in part by an NSF award (DMS-1812128).

Author information

Authors and Affiliations

Department of Statistics, University of Connecticut, Storrs, CT, USA
Chiranjit Dutta & Nalini Ravishanker
Department of Mathematics, Middlebury College, Middlebury, VT, USA
Kara Karpman
Department of Statistics and Data Science, Cornell University, Ithaca, NY, USA
Sumanta Basu

Corresponding author

Correspondence to Chiranjit Dutta.

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dutta, C., Karpman, K., Basu, S. et al. Review of Statistical Approaches for Modeling High-Frequency Trading Data. Sankhya B 85 (Suppl 1), 1–48 (2023). https://doi.org/10.1007/s13571-022-00280-7

Received: 29 September 2021
Accepted: 22 March 2022
Published: 28 April 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s13571-022-00280-7