Abstract
Due to technological advancements over the last two decades, algorithmic trading strategies are now widely used in financial markets. In turn, these strategies have generated high-frequency (HF) data sets, which provide information at an extremely fine scale and are useful for understanding market behaviors, dynamics, and microstructures. In this paper, we discuss how information flow impacts the behavior of HF traders and how certain high-frequency trading (HFT) strategies significantly impact market dynamics (e.g., asset prices). The paper also reviews four popular statistical modeling approaches for analyzing HFT data: (i) aggregating data into regularly spaced bins and then applying regular time series models, (ii) modeling jumps in price processes, (iii) using point processes to model the occurrence of events of interest, and (iv) modeling sequences of inter-event durations. We discuss two methods for defining events, one based on the asset price and the other based on both the price and the volume of the asset. We construct durations based on these two definitions and apply models to tick-by-tick data for assets traded on the New York Stock Exchange (NYSE). We discuss some open challenges arising in HFT data analysis, present some empirical analysis, and review applications of HFT data in finance and economics, outlining several research directions.
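As a rough illustration of approach (iv), the sketch below builds price-based durations from tick data. The event rule used here (an event occurs once the price has moved by at least a fixed threshold from the price at the previous event) is an assumption made for illustration and is not necessarily the definition adopted in the paper; the function name `price_durations` and its parameters are hypothetical.

```python
from typing import List, Sequence


def price_durations(
    times: Sequence[float],
    prices: Sequence[float],
    threshold: float,
) -> List[float]:
    """Durations between successive price events.

    Assumed event rule (illustrative only): an event occurs at the first
    tick whose price differs from the price at the previous event by at
    least `threshold`. The first tick serves as the initial reference.
    """
    if len(times) != len(prices):
        raise ValueError("times and prices must have the same length")
    durations: List[float] = []
    if not times:
        return durations
    ref_time, ref_price = times[0], prices[0]
    for t, p in zip(times[1:], prices[1:]):
        if abs(p - ref_price) >= threshold:
            # Record the elapsed time and reset the reference point.
            durations.append(t - ref_time)
            ref_time, ref_price = t, p
    return durations


# Toy tick data: timestamps in seconds, prices in dollars.
ticks_t = [0, 1, 2, 5, 9]
ticks_p = [100.00, 100.02, 100.06, 100.07, 100.00]
durations = price_durations(ticks_t, ticks_p, threshold=0.05)
```

The resulting sequence of durations is the kind of input that duration models, such as those in the autoregressive conditional duration family, are fitted to.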
Notes
Indeed, this is one of the criticisms of HFT. The May 6, 2010 “Flash Crash,” in which the Dow Jones Industrial Average dropped by almost 1,000 points in 30 min, was the result of an execution algorithm that considered only volume, not time. As a result, $4.1 billion of E-Mini S&P 500 futures contracts were sold on the Chicago Mercantile Exchange in a mere 20-min interval (Goldstein et al. 2014).
A number of approaches can be used to classify trades as buyer- or seller-initiated, including the Lee-Ready algorithm, the tick rule, and bulk volume classification (see Easley et al. 2016 and references therein).
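Of the classification approaches named in this note, the tick rule is the simplest to state: a trade on an uptick is labeled buyer-initiated, a trade on a downtick is labeled seller-initiated, and a trade at an unchanged price inherits the label of the previous trade. The following is a minimal sketch under that reading; the function name `tick_rule` and the convention of returning 0 for trades that cannot yet be classified are assumptions made for illustration.

```python
from typing import List, Sequence


def tick_rule(prices: Sequence[float]) -> List[int]:
    """Classify trades by the tick rule.

    Returns +1 (buyer-initiated), -1 (seller-initiated), or 0 when no
    earlier price change exists to classify against (an assumed convention).
    """
    signs: List[int] = []
    last_sign = 0
    prev_price = None
    for price in prices:
        if prev_price is not None:
            if price > prev_price:
                last_sign = 1      # uptick
            elif price < prev_price:
                last_sign = -1     # downtick
            # zero tick: keep the previous classification
        signs.append(last_sign)
        prev_price = price
    return signs


trade_signs = tick_rule([10.0, 10.1, 10.1, 10.0, 10.0, 10.2])
```

The Lee-Ready algorithm and bulk volume classification use additional inputs (quotes and aggregated volume, respectively) and are not covered by this sketch.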
References
Aït-Sahalia, Y. and Jacod, J. (2009). Testing for jumps in a discretely observed process. The Annals of Statistics 184–222.
Aït-Sahalia, Y., Fan, J. and Xiu, D. (2010). High-frequency covariance estimates with noisy and asynchronous financial data. Journal of the American Statistical Association 105, 492, 1504–1517.
Aït-Sahalia, Y., Jacod, J. and Li, J. (2012). Testing for jumps in noisy high frequency data. Journal of Econometrics 168, 2, 207–222.
Alizadeh, S., Brandt, M.W. and Diebold, F.X. (2002). Range-based estimation of stochastic volatility models. Journal of Finance 57, 3, 1047–1091.
Andersen, T.G. and Bollerslev, T. (1997). Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance, 4, 2-3, 115–158.
Andersen, T.G. and Bollerslev, T. (1998). Answering the skeptics: yes, standard volatility models do provide accurate forecasts. International Economic Review 39, 4, 885–905.
Andersen, T.G., Benzoni, L. and Lund, J. (2002). An empirical investigation of continuous-time equity return models. The Journal of Finance, 57, 3, 1239–1284.
Andersen, T.G., Bollerslev, T. and Diebold, F.X. (2007). Roughing it up: including jump components in the measurement, modeling, and forecasting of return volatility. The Review of Economics and Statistics 89, 4, 701–720.
Ardia, D., Bluteau, K., Boudt, K., Catania, L. and Trottier, D.-A. (2019). Markov-switching GARCH models in R: the MSGARCH package. Journal of Statistical Software 91(4).
Asai, M., Chang, C.-L. and McAleer, M. (2017). Realized stochastic volatility with general asymmetry and long memory. Journal of Econometrics 199, 2, 202–212.
Baillie, R.T. (1996). Long memory processes and fractional integration in econometrics. Journal of Econometrics, 73, 1, 5–59.
Baillie, R.T., Bollerslev, T. and Mikkelsen, H.O. (1996). Fractionally integrated generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 74, 1, 3–30.
Barndorff-Nielsen, O.E. and Shephard, N. (2002). Econometric analysis of realized volatility and its use in estimating stochastic volatility models. Journal of the Royal Statistical Society Series B (Statistical Methodology) 64, 2, 253–280.
Barndorff-Nielsen, O.E. and Shephard, N. (2004). Power and bipower variation with stochastic volatility and jumps. Journal of Financial Econometrics 2, 1, 1–37.
Barndorff-Nielsen, O.E. and Shephard, N. (2005). Variation, jumps market frictions and high frequency data in financial econometrics.
Barndorff-Nielsen, O.E. and Shephard, N. (2006). Econometrics of testing for jumps in financial economics using bipower variation. Journal of Financial Econometrics 4, 1, 1–30.
Barndorff-Nielsen, O.E., Hansen, P.R., Lunde, A. and Shephard, N. (2011). Multivariate realised kernels: consistent positive semi-definite estimators of the covariation of equity prices with noise and non-synchronous trading. Journal of Econometrics, 162, 2, 149–169.
Bauwens, L. and Giot, P. (2000). The logarithmic ACD model: an application to the bid-ask quote process of three NYSE stocks. Annales d’Economie et de Statistique, (60):117–149.
Bauwens, L. and Hautsch, N. (2006). Stochastic conditional intensity processes. Journal of Financial Econometrics 4, 3, 450–493.
Bauwens, L. and Veredas, D. (2004). The stochastic conditional duration model: a latent variable model for the analysis of financial durations. Journal of Econometrics 119, 2, 381–412.
Belfrage, M. (2016). ACDM: tools for autoregressive conditional duration models. (R package version 1.0.4).
Beran, J. (1994). Statistics for long-memory processes. CRC Press.
Bibinger, M. (2011). Efficient covariance estimation for asynchronous noisy high-frequency data. Scandinavian Journal of Statistics 38, 1, 23–45.
Billio, M., Getmansky, M., Lo, A.W. and Pelizzon, L. (2012). Econometric measures of connectedness and systemic risk in the finance and insurance sectors. Journal of Financial Economics 104, 3, 535–559.
Bjursell, J. and Gentle, J.E. (2012). Identifying jumps in asset prices. In: Handbook of computational finance, pp. 371–399. Springer.
Black, F. (1976). Studies of stock market volatility changes. In: Proceedings of the American statistical association business and economic statistics section, pp. 177–181.
Black, F. (1986). Noise. The Journal of Finance 41, 3, 528–543.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 3, 307–327.
Boudt, K., Cornelissen, J., Payseur, S., Kleen, O. and Sjoerup, E. (2021a). Highfrequency: tools for highfrequency data analysis. https://CRAN.R-project.org/package=highfrequency. R package version 0.9.0.
Boudt, K., Kleen, O. and Sjørup, E. (2021b). Analyzing intraday financial data in R: the highfrequency package. Available at SSRN 3917548.
Brouste, A., Fukasawa, M., Hino, H., Iacus, S., Kamatani, K., Koike, Y., Masuda, H., Nomura, R., Ogihara, T., Shimizu, Y. et al. (2014). The yuima project: a computational framework for simulation and inference of stochastic differential equations. Journal of Statistical Software 57, 1–51.
Buccheri, G., Bormetti, G., Corsi, F. and Lillo, F. (2021a). A score-driven conditional correlation model for noisy and asynchronous data: an application to high-frequency covariance dynamics. Journal of Business & Economic Statistics, 39, 4, 920–936.
Buccheri, G., Corsi, F. and Peluso, S. (2021b). High-frequency lead-lag effects and cross-asset linkages: a multi-asset lagged adjustment model. Journal of Business & Economic Statistics, 39, 605–621.
Cameron, A.C. and Trivedi, P.K. (2013). Regression analysis of count data. Cambridge University Press, Cambridge.
Cao, W., Hurvich, C. and Soulier, P. (2017). Drift in transaction-level asset price models. Journal of Time Series Analysis, 38, 5, 769–790.
Carr, P. and Wu, L. (2003). The finite moment log stable process and option pricing. The Journal of Finance 58, 2, 753–777.
Carr, P., Madan, D. and Chang, E. (1998). The variance gamma process and option pricing. European Finance Review 2, 1, 79–105.
Carr, P., Geman, H., Madan, D.B. and Yor, M. (2002). The fine structure of asset returns: an empirical investigation. The Journal of Business 75, 2, 305–332.
Cartea, A. and Jaimungal, S. (2013). Modelling asset prices for algorithmic and high-frequency trading. Applied Mathematical Finance, 20, 6, 512–547.
Chakrabarti, A. and Sen, R. (2019). Copula estimation for nonsynchronous financial data. arXiv:1904.10182.
Chan, K. (1992). A further analysis of the lead–lag relationship between the cash market and stock index futures market. The Review of Financial Studies 5, 1, 123–152.
Chen, C.W.S., Gerlach, R., Hwang, B.B.K. and McAleer, M. (2012). Forecasting Value-at-Risk using nonlinear regression quantiles and the intra-day range. International Journal of Forecasting 28, 3, 557–574.
Chen, F., Diebold, F.X. and Schorfheide, F. (2013). A Markov-switching multifractal inter-trade duration model, with application to US equities. Journal of Econometrics 177, 2, 320–342.
Chib, S., Nardari, F. and Shephard, N. (2002). Markov chain Monte Carlo methods for stochastic volatility models. Journal of Econometrics 108, 2, 281–316.
Chib, S., Omori, Y. and Asai, M. (2009). Multivariate stochastic volatility. In: Handbook of financial time series, pp. 365–400. Springer.
Christensen, K., Kinnebrock, S. and Podolskij, M. (2010). Pre-averaging estimators of the ex-post covariance matrix in noisy diffusion models with non-synchronous data. Journal of Econometrics 159, 1, 116–133.
Christensen, K., Oomen, R.C. and Podolskij, M. (2014). Fact or friction: jumps at ultra high frequency. Journal of Financial Economics 114, 3, 576–599.
Cont, R. (2011). Statistical modeling of high-frequency financial data. IEEE Signal Processing Magazine, 28, 5, 16–25.
Cont, R. and Tankov, P. (2004). Financial modeling with jump processes. Chapman & Hall/CRC, Boca Raton.
Coroneo, L. and Veredas, D. (2012). A simple two-component model for the distribution of intraday returns. The European Journal of Finance 18, 9, 775–797.
Corsi, F. and Audrino, F. (2012). Realized covariance tick-by-tick in presence of rounded time stamps and general microstructure effects. J. Financ. Econom. 10, 591–616.
Cox, D.R. (1972). Regression models and life-tables. J. R. Stat. Soc. Ser. B (Methodol.) 34, 187–202.
Cox, D.R. and Oakes, D. (2018). Analysis of survival data. Chapman and Hall/CRC.
Daley, D.J. and Vere-Jones, D. (2003). An introduction to the theory of point processes: Volume I: elementary theory and methods. Springer.
De Jong, F. and Nijman, T. (1997). High frequency analysis of lead-lag relationships between financial markets. J. Empir. Finance 4, 259–277.
Deo, R., Hsieh, M. and Hurvich, C.M. (2010). Long memory in intertrade durations, counts and realized volatility of NYSE stocks. J. Stat. Plan. Inference 140, 3715–3733.
Diamond, D.W. and Verrecchia, R.E. (1987). Constraints on short-selling and asset price adjustment to private information. J. Financ. Econ. 18, 277–311.
Diebold, F.X. and Yılmaz, K. (2014). On the network topology of variance decompositions: measuring the connectedness of financial firms. J. Econ. 182, 119–134.
Dionne, G., Duchesne, P. and Pacurar, M. (2009). Intraday value at risk (IVaR) using tick-by-tick data with application to the Toronto Stock Exchange. J. Empir. Finance 16, 777–792.
Dobrev, D. and Schaumburg, E. (2017). High-frequency cross-market trading: model free measurement and applications. Perspectives.
Duffie, D., Pan, J. and Singleton, K. (2000). Transform analysis and asset pricing for affine jump-diffusions. Econometrica 68, 1343–1376.
Dufour, A. and Engle, R.F. (2000). Time and the price impact of a trade. J. Financ. 55, 2467–2498.
Easley, D. and O’Hara, M. (1992). Time and the process of security price adjustment. J. Financ. 47, 577–605.
Easley, D., Kiefer, N.M., O’Hara, M. and Paperman, J.B. (1996). Liquidity, information, and infrequently traded stocks. J. Financ. 51, 1405–1436.
Easley, D., Hvidkjaer, S. and O’Hara, M. (2002). Is information risk a determinant of asset returns? J. Financ. 57, 2185–2221.
Easley, D., de Prado, M.M.L. and O’Hara, M. (2012a). The volume clock: insights into the high-frequency paradigm. J. Portfolio Manag. 39, 19–29.
Easley, D., López de Prado, M.M. and O’Hara, M. (2012b). Flow toxicity and liquidity in a high frequency world. Rev. Financ. Stud. 25, 1457–1493.
Easley, D., de Prado, M.L. and O’Hara, M. (2016). Discerning information from trade data. J. Financ. Econ. 120, 269–285.
Easley, D., López de Prado, M., O’Hara, M. and Zhang, Z. (2021). Microstructure in the machine age. Rev. Financ. Stud. 34, 3316–3363.
Eberlein, E. and Keller, U. (1995). Hyperbolic distributions in finance. Bernoulli 281–299.
Efron, B. (1986). Double exponential families and their use in generalized linear regression. J. Am. Stat. Assoc. 81, 709–721.
Embrechts, P., Klüppelberg, C. and Mikosch, T. (2013). Modelling extremal events: for insurance and finance. Springer Science & Business Media.
Engle, R. (2002a). Dynamic conditional correlation: a simple class of multivariate generalized autoregressive conditional heteroskedasticity models. J. Bus. Econ. Stat. 20, 339–350.
Engle, R. (2002b). New frontiers for ARCH models. J. Appl. Econom. 17, 425–446.
Engle, R.F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econom.: J. Econom. Soc. 50, 987–1007.
Engle, R.F. and Russell, J.R. (1998). Autoregressive conditional duration: a new model for irregularly spaced transaction data. Econometrica 66, 1127–1162.
Epps, T.W. (1979). Comovements in stock prices in the very short run. J. Am. Stat. Assoc. 74, 291–298.
Evans, K.P. (2011). Intraday jumps and US macroeconomic news announcements. J. Bank. Finance 35, 2511–2527.
Fan, J., Li, Y. and Yu, K. (2012). Vast volatility matrix estimation using high-frequency data for portfolio selection. J. Am. Stat. Assoc. 107, 412–428.
Feng, Y. and Zhou, C. (2015). Forecasting financial market activity using a semiparametric fractionally integrated log-ACD. Int. J. Forecast. 31, 349–363.
Fernandes, M. and Grammig, J. (2006). A family of autoregressive conditional duration models. J. Econ. 130, 1–23.
Fissler, T. and Ziegel, J.F. (2016). Higher order elicitability and Osband’s principle. Ann. Stat. 44, 1680–1707.
Fleming, T.R. and Harrington, D.P. (2011). Counting processes and survival analysis. Wiley, New York.
Gerlach, R. and Chen, C.W. (2015). Bayesian expected shortfall forecasting incorporating the intraday range. J. Financ. Econom. 14, 128–158.
Gerlach, R. and Wang, C. (2016). Forecasting risk via realized GARCH, incorporating the realized range. Quant. Finance 16, 501–511.
Ghalanos, A. (2020). Rugarch: univariate GARCH models. R package version 1.4-4.
Giot, P. (2005). Market risk models for intraday data. Eur. J. Finance 11, 309–324.
Goldstein, M.A., Kumar, P. and Graves, F.C. (2014). Computerized and high-frequency trading. Financ. Rev. 49, 177–202.
Gordy, M.B. and Juneja, S. (2010). Nested simulation in portfolio risk measurement. Manag. Sci. 56, 1833–1848.
Granger, C.W. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econom. J. Econom. Soc., 424–438.
Hansen, P.R. and Lunde, A. (2005). A forecast comparison of volatility models: does anything beat a GARCH(1,1)? J. Appl. Econ. 20, 873–889.
Hansen, P.R., Lunde, A. and Nason, J.M. (2003). Choosing the best volatility models: the model confidence set approach. Oxf. Bull. Econ. Stat. 65, 839–861.
Hansen, P.R., Huang, Z. and Shek, H.H. (2012). Realized GARCH: a joint model for returns and realized measures of volatility. J. Appl. Econ. 27, 877–906.
Harris, L. (2003). Trading and exchanges: market microstructure for practitioners. Oxford University Press, Oxford.
Harvey, A., Ruiz, E. and Shephard, N. (1994). Multivariate stochastic variance models. Rev. Econ. Stud. 61, 247–264.
Harvey, A.C. and Shephard, N. (1996). Estimation of an asymmetric stochastic volatility model for asset returns. J. Bus. Econ. Stat. 14, 429–434.
Hasbrouck, J. (2007). Empirical market microstructure: the institutions, economics, and econometrics of securities trading. Oxford University Press, Oxford.
Hautsch, N. (2011). Econometrics of financial high-frequency data. Springer Science & Business Media.
Hautsch, N., Klausurtagung, S. and Risiko, O. (2006). Generalized autoregressive conditional intensity models with long range dependence.
Hawkes, A.G. (1971). Spectra of some self-exciting and mutually exciting point processes. Biometrika 58, 83–90.
Hayashi, T. and Koike, Y. (2017). Multi-scale analysis of lead-lag relationships in high-frequency financial markets. arXiv:1708.03992.
Hayashi, T. and Yoshida, N. (2005). On covariance estimation of non-synchronously observed diffusion processes. Bernoulli 11, 359–379.
Heckman, J.J. and Singer, B. (1984). Econometric duration analysis. J. Econ. 24, 63–132.
Heinen, A. (2003). Modelling time series count data: an autoregressive conditional Poisson model. Available at SSRN 1117187.
Hoffmann, M., Rosenbaum, M. and Yoshida, N. (2013). Estimation of the lead-lag parameter from non-synchronous data. Bernoulli 19, 426–461.
Hosszejni, D. and Kastner, G. (2019). Modeling univariate and multivariate stochastic volatility in R with stochvol and factorstochvol. arXiv:1906.12123.
Hsieh, M.-C., Hurvich, C. and Soulier, P. (2019). Modeling leverage and long memory in volatility in a pure-jump process. High Frequency 2, 124–141.
Huang, D., Zhu, S., Fabozzi, F.J. and Fukushima, M. (2010). Portfolio selection under distributional uncertainty: a relative robust CVaR approach. Eur. J. Oper. Res. 203, 185–194.
Iacus, S.M. and Yoshida, N. (2018). Simulation and inference for stochastic processes with yuima. A comprehensive R framework for SDEs and other stochastic processes. Use R.
Jacod, J., Li, Y., Mykland, P.A., Podolskij, M. and Vetter, M. (2009). Microstructure noise in the continuous case: the pre-averaging approach. Stoch. Process. Appl. 119, 2249–2276.
Jacquier, E., Polson, N.G. and Rossi, P.E. (2004). Bayesian analysis of stochastic volatility models with fat-tails and correlated errors. J. Econ. 122, 185–212.
Jasiak, J. (1999). Persistence in intertrade durations. Available at SSRN: https://ssrn.com/abstract=162008.
Jiang, G.J. and Oomen, R. (2005). A new test for jumps in asset prices. Preprint.
Jiang, G.J. and Oomen, R.C. (2008). Testing for jumps when asset prices are observed with noise—a “swap variance” approach. J. Econ. 144, 352–370.
Kalbfleisch, J.D. and Prentice, R.L. (2011). The statistical analysis of failure time data. Wiley, New York.
Keim, D.B. and Madhavan, A. (1996). The upstairs market for large-block transactions: analysis and measurement of price effects. Rev. Financ. Stud. 9, 1–36.
Kleinbaum, D.G. and Klein, M. (2010). Survival analysis. Springer, Berlin.
Kwok, S.S.M., Li, W.K. and Yu, P.L.H. (2009). The autoregressive conditional marked duration model: statistical inference to market microstructure. J. Data Sci.
Lancaster, T. (1979). Econometric methods for the duration of unemployment. Econom. J. Econom. Soc. 47, 939–956.
Lane, W.R., Looney, S.W. and Wansley, J.W. (1986). An application of the Cox proportional hazards model to bank failure. J. Bank. Finance 10, 511–531.
Lee, S.S. and Mykland, P.A. (2008). Jumps in financial markets: a new nonparametric test and jump dynamics. Rev. Financ. Stud. 21, 2535–2563.
Li, J., Todorov, V., Tauchen, G. and Lin, H. (2019). Rank tests at jump events. J. Bus. Econ. Stat. 37, 312–321.
Liboschik, T., Fokianos, K. and Fried, R. (2017). tscount: an R package for analysis of count time series following generalized linear models. J. Stat. Softw. 82, 1–51.
Liu, H., Zou, J. and Ravishanker, N. (2018). Multiple day biclustering of high-frequency financial time series. Stat 7, e176.
Liu, H., Zou, J. and Ravishanker, N. (2021). Clustering high-frequency financial time series based on information theory, forthcoming. Appl. Stoch. Models Bus. Ind.
Liu, S. and Tse, Y.-K. (2015). Intraday Value-at-Risk: an asymmetric autoregressive conditional duration approach. J. Econ. 189, 437–446.
Madan, D.B. and Seneta, E. (1990). The variance gamma (VG) model for share market returns. J. Bus. 511–524.
Mancino, M.E. and Sanfelici, S. (2011). Estimating covariance via Fourier method in the presence of asynchronous trading and microstructure noise. J. Financ. Econom. 9, 367–408.
Manganelli, S. (2005). Duration, volume and volatility impact of trades. J. Financ. Mark. 8, 377–399.
Martens, M. and Van Dijk, D. (2007). Measuring volatility with the realized range. J. Econ. 138, 181–207.
Meng, X. and Taylor, J.W. (2020). Estimating value-at-risk and expected shortfall using the intraday low and range data. Eur. J. Oper. Res. 280, 191–202.
Mies, F., Bibinger, M., Steland, A. and Podolskij, M. (2020). High-frequency inference for stochastic processes with jumps of infinite activity. PhD thesis, RWTH Aachen University.
Mukherjee, A., Peng, W., Swanson, N.R. and Yang, X. (2020). Financial econometrics and big data: a survey of volatility estimators and tests for the presence of jumps and co-jumps. In: Handbook of statistics, vol 42, pp 3–59. Elsevier.
Nelson, D.B. (1991). Conditional heteroskedasticity in asset returns: a new approach. Econom. J. Econom. Soc. 59, 347–370.
N.Y.S.E. Trade and Quote Database (2019). Retrieved from Wharton Research Data Services.
O’Hara, M. (1997). Market microstructure theory. Wiley, New York.
Pacurar, M. (2008). Autoregressive conditional duration models in finance: a survey of the theoretical and empirical literature. J. Econ. Surv. 22, 711–751.
Palma, W. (2007). Long-memory time series: theory and methods. Wiley, New York.
Parkinson, M. (1980). The extreme value method for estimating the variance of the rate of return. J. Bus. 53, 61–65.
Peluso, S., Corsi, F. and Mira, A. (2014). A Bayesian high-frequency estimator of the multivariate covariance of noisy and asynchronous returns. J. Financ. Econom. 13, 665–697.
Renò, R. (2003). A closer look at the Epps effect. Int. J. Theor. Appl. Finance 6, 87–102.
Robinson, P.M. (2003). Time series with long memory. Advanced Texts in Econometrics.
Rocco, M. (2014). Extreme value theory in finance: a survey. J. Econ. Surv. 28, 82–108.
Russell, J.R. (1999). Econometric modeling of multivariate irregularly-spaced high-frequency data. Working Paper, University of Chicago.
Rydberg, T.H. and Shephard, N. (2000). BIN models for trade-by-trade data. Modelling the number of trades in a fixed interval of time. Econometric Society World Congress 2000 Contributed Papers 0740, Econometric Society. https://ideas.repec.org/p/ecm/wc2000/0740.html.
Sen, R. (2009). Jumps and microstructure noise in stock price volatility. Volatility, 163.
Shirota, S., Hizu, T. and Omori, Y. (2014). Realized stochastic volatility with leverage and long memory. Comput. Stat. Data Anal. 76, 618–641.
So, M.K. and Xu, R. (2013). Forecasting intraday volatility and value-at-risk with high-frequency data. Asia-Pac. Finan. Markets 20, 83–111.
So, M.K., Chu, A.M., Lo, C.C. and Ip, C.Y. (2021). Volatility and dynamic dependence modeling: review, applications, and financial risk management. Wiley Interdiscip. Rev.: Comput. Stat., e1567.
Song, X., Kim, D., Yuan, H., Cui, X., Lu, Z., Zhou, Y. and Wang, Y. (2021). Volatility analysis with realized GARCH-Itô models. J. Econ. 222, 393–410.
Stroud, J.R. and Johannes, M.S. (2014). Bayesian modeling and forecasting of 24-hour high-frequency volatility. J. Am. Stat. Assoc. 109, 1368–1384.
Sun, W., Rachev, S., Fabozzi, F.J. and Kalev, P.S. (2008). Fractals in trade duration: capturing long-range dependence and heavy tailedness in modeling trade duration. Ann. Finance 4, 217–241.
Swishchuk, A. and Huffman, A. (2020). General compound Hawkes processes in limit order books. Risks 8, 28.
Takahashi, M., Omori, Y. and Watanabe, T. (2009). Estimating stochastic volatility models using daily returns and realized volatility simultaneously. Comput. Stat. Data Anal. 53, 2404–2426.
Takahashi, M., Watanabe, T. and Omori, Y. (2016). Volatility and quantile forecasts by realized stochastic volatility models with generalized hyperbolic distribution. Int. J. Forecast. 32, 437–457.
Tay, A.S., Ting, C., Tse, Y.K. and Warachka, M. (2004). Transaction-data analysis of marked durations and their implications for market microstructure.
Tay, A.S., Ting, C., Kuen Tse, Y. and Warachka, M. (2011). The impact of transaction duration, volume and direction on price dynamics and volatility. Quant. Finance 11, 447–457.
Taylor, S.J. (1982). Financial returns modelled by the product of two stochastic processes-a study of the daily sugar prices 1961–75. Time Ser. Anal. Theory Pract. 1, 203–226.
Taylor, S.J. (1994). Modeling stochastic volatility: a review and comparative study. Math. Financ. 4, 183–204.
Thavaneswaran, A., Ravishanker, N. and Liang, Y. (2015). Generalized duration models and optimal estimation using estimating functions. Ann. Inst. Stat. Math. 67, 129–156.
Therneau, T.M. (2021). Survival: a package for survival analysis in R. R package version 3.2-13.
Tsai, P.-C. and Shackleton, M.B. (2016). Detecting jumps in high-frequency prices under stochastic volatility: a review and a data-driven approach. In: Handbook of high-frequency trading and modeling in finance, pp 137–181.
Tsay, R.S. (2005). Analysis of financial time series. Wiley, New York.
Vasileios, S. (2015). acp: autoregressive conditional poisson (R package version 2.1).
Wang, Q., Figueroa-López, J.E. and Kuffner, T.A. (2021). Bayesian inference on volatility in the presence of infinite jump activity and microstructure noise. Electron. J. Stat. 15, 506–553.
Wang, Y. and Zou, J. (2014). Volatility analysis in high-frequency financial data. Wiley Interdiscip. Rev. Comput. Stat. 6, 393–404.
Yan, B. and Zivot, E. (2003). Analysis of high-frequency financial data with S-PLUS. Working paper, UWEC-2005-03. http://ideas.repec.org/p/udb/wpaper/uwec-2005-03.html.
Yu, J. and Meyer, R. (2006). Multivariate stochastic volatility models: Bayesian estimation and model comparison. Econ. Rev. 25, 361–384.
Zaatour, R. (2014). Hawkes: Hawkes process simulation and calibration toolkit (R package version 0.0-4).
Zhang, L. (2011). Estimating covariation: Epps effect, microstructure noise. J. Econ. 160, 33–47.
Zhang, Y., Zou, J., Ravishanker, N. and Thavaneswaran, A. (2019). Modeling financial durations using penalized estimating functions. Comput. Stat. Data Anal. 131, 145–158.
Zheng, Y., Li, Y. and Li, G. (2016). On Fréchet autoregressive conditional duration models. J. Stat. Plan. Inference 175, 51–66.
Žikeš, F., Baruník, J. and Shenai, N. (2017). Modeling and forecasting persistent financial durations. Econom. Rev. 36, 1081–1110.
Zivot, E. and Wang, J. (2007). Modeling financial time series with S-PLUS®, vol 191. Springer Science & Business Media.
Acknowledgements
The authors are very grateful to the reviewers and editors for their helpful suggestions for improving the paper.
Funding
This paper was based upon work partially supported by the National Science Foundation under Grant DMS-1638521 to the Statistical and Applied Mathematical Sciences Institute. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. In addition, the work of SB was supported in part by an NSF award (DMS-1812128).
Ethics declarations
Conflict of Interest
The authors declare no conflict of interest.
Cite this article
Dutta, C., Karpman, K., Basu, S. et al. Review of Statistical Approaches for Modeling High-Frequency Trading Data. Sankhya B 85 (Suppl 1), 1–48 (2023). https://doi.org/10.1007/s13571-022-00280-7
DOI: https://doi.org/10.1007/s13571-022-00280-7