Learning Financial Networks with High-frequency Trade Data

Data Sci Sci. 2023;2(1):2166624. doi: 10.1080/26941899.2023.2166624. Epub 2023 Feb 28.

Abstract

Financial networks are typically estimated by applying standard time series analyses to price-based economic variables collected at low frequency (e.g., daily or monthly stock returns or realized volatility). These networks are used for risk monitoring and for studying information flows in financial markets. High-frequency intraday trade data sets may provide additional insights into network linkages by leveraging high-resolution information. However, such data sets pose significant modeling challenges due to their asynchronous nature, complex dynamics, and nonstationarity. To tackle these challenges, we estimate financial networks using random forests, a state-of-the-art machine learning algorithm that offers excellent prediction accuracy without expensive hyperparameter optimization. The edges in our network are determined by using microstructure measures of one firm to forecast the sign of the change in a market measure, such as the realized volatility, of another firm. We first investigate the evolution of network connectivity in the period leading up to the U.S. financial crisis of 2007-09. We find that the networks have the highest density in 2007, with high degree connectivity associated with Lehman Brothers in 2006. A second analysis of the nature of linkages among firms suggests that larger firms tend to offer better predictive power than smaller firms, a finding qualitatively consistent with prior work in the market microstructure literature.

Keywords: high-frequency trading; market microstructure; random forests.
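
The edge construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `estimate_edges` and `make_toy_data`, the chronological train/test split, the accuracy threshold used to declare an edge, and the synthetic data are all assumptions made for illustration. The sketch fits a scikit-learn `RandomForestClassifier` on the (hypothetical) microstructure measures of a source firm to predict the sign of the change in a market measure of a target firm, and draws a directed edge when out-of-sample accuracy exceeds the assumed threshold.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def estimate_edges(features, targets, train_frac=0.7, threshold=0.6, seed=0):
    """Return {(source, target): accuracy} for pairs that clear the threshold.

    features: dict firm -> (T, p) array of microstructure measures (assumed
              already lagged so that row t only uses information before t).
    targets:  dict firm -> (T,) array with 1 if the market measure rose, else 0.
    The threshold-on-accuracy rule is an assumption, not the paper's criterion.
    """
    firms = sorted(features)
    edges = {}
    for src in firms:
        for dst in firms:
            if src == dst:
                continue
            X, y = features[src], targets[dst]
            split = int(train_frac * len(y))  # chronological split, no shuffling
            model = RandomForestClassifier(n_estimators=200, random_state=seed)
            model.fit(X[:split], y[:split])
            accuracy = model.score(X[split:], y[split:])
            if accuracy > threshold:
                edges[(src, dst)] = accuracy
    return edges


def make_toy_data(n_obs=600, seed=0):
    """Synthetic data for three hypothetical firms with one planted link A -> B."""
    rng = np.random.default_rng(seed)
    firms = ["A", "B", "C"]
    features = {f: rng.normal(size=(n_obs, 3)) for f in firms}
    targets = {f: rng.integers(0, 2, size=n_obs) for f in firms}
    # Planted dependence: the first measure of firm A drives the sign for firm B.
    noise = 0.3 * rng.normal(size=n_obs)
    targets["B"] = (features["A"][:, 0] + noise > 0).astype(int)
    return features, targets


if __name__ == "__main__":
    toy_features, toy_targets = make_toy_data()
    for (src, dst), acc in estimate_edges(toy_features, toy_targets).items():
        print(f"{src} -> {dst}: out-of-sample accuracy {acc:.2f}")
```

The split is chronological rather than random so that the forest is never evaluated on observations that precede its training data, which matters for nonstationary intraday series. Repeating the estimation over rolling windows would give a sequence of networks whose density can be compared over time.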