-
Adaptive Agents and Data Quality in Agent-Based Financial Markets
Authors:
Colin M. Van Oort,
Ethan Ratliff-Crain,
Brian F. Tivnan,
Safwan Wshah
Abstract:
We present our Agent-Based Market Microstructure Simulation (ABMMS), an Agent-Based Financial Market (ABFM) that captures much of the complexity present in the US National Market System for equities (NMS). Agent-Based models are a natural choice for understanding financial markets. Financial markets feature a constrained action space that should simplify model creation, produce a wealth of data that should aid model validation, and a successful ABFM could strongly impact system design and policy development processes. Despite these advantages, ABFMs have largely remained an academic novelty. We hypothesize that two factors limit the usefulness of ABFMs. First, many ABFMs fail to capture relevant microstructure mechanisms, leading to differences in the mechanics of trading. Second, the simple agents that commonly populate ABFMs do not display the breadth of behaviors observed in human traders or the trading systems that they create. We investigate these issues through the development of ABMMS, which features a fragmented market structure, communication infrastructure with propagation delays, realistic auction mechanisms, and more. As a baseline, we populate ABMMS with simple trading agents and investigate properties of the generated data. We then compare the baseline with experimental conditions that explore the impacts of market topology or meta-reinforcement learning agents. The combination of detailed market mechanisms and adaptive agents leads to models whose generated data more accurately reproduce stylized facts observed in actual markets. These improvements increase the utility of ABFMs as tools to inform design and policy decisions.
Submitted 27 November, 2023;
originally announced November 2023.
-
Revisiting Cont's Stylized Facts for Modern Stock Markets
Authors:
Ethan Ratliff-Crain,
Colin M. Van Oort,
James Bagrow,
Matthew T. K. Koehler,
Brian F. Tivnan
Abstract:
In 2001, Rama Cont introduced a now-widely used set of 'stylized facts' to synthesize empirical studies of financial price changes (returns), resulting in 11 statistical properties common to a large set of assets and markets. These properties are viewed as constraints a model should be able to reproduce in order to accurately represent returns in a market. It has not been established whether the characteristics Cont noted in 2001 still hold for modern markets following significant regulatory shifts and technological advances. It is also not clear whether a given time series of financial returns for an asset will express all 11 stylized facts. We test both of these propositions by attempting to replicate each of Cont's 11 stylized facts for intraday returns of the individual stocks in the Dow 30, using the same authoritative data as that used by the U.S. regulator from October 2018 - March 2019. We find conclusive evidence for eight of Cont's original facts and no support for the remaining three. Our study represents the first test of Cont's 11 stylized facts against a consistent set of stocks, therefore providing insight into how these stylized facts should be viewed in the context of modern stock markets.
Submitted 20 May, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Scaling of inefficiencies in the U.S. equity markets: Evidence from three market indices and more than 2900 securities
Authors:
John H. Ring IV,
Colin M. Van Oort,
David R. Dewhurst,
Tyler J. Gray,
Christopher M. Danforth,
Brian F. Tivnan
Abstract:
Using the most comprehensive, commercially-available dataset of trading activity in U.S. equity markets, we catalog and analyze quote dislocations between the SIP National Best Bid and Offer (NBBO) and a synthetic BBO constructed from direct feeds. We observe a total of over 3.1 billion dislocation segments in the Russell 3000 during trading in 2016, roughly 525 per second of trading. However, these dislocations do not occur uniformly throughout the trading day. We identify a characteristic structure that features more dislocations near the open and close. Additionally, around 23% of observed trades executed during dislocations. These trades may have been impacted by stale information, leading to estimated opportunity costs on the order of $2 billion USD. A subset of the constituents of the S&P 500 index experience the greatest amount of opportunity cost and appear to drive inefficiencies in other stocks. These results quantify impacts of the physical structure of the U.S. National Market System.
Submitted 8 October, 2020; v1 submitted 12 February, 2019;
originally announced February 2019.
-
Fragmentation and inefficiencies in US equity markets: Evidence from the Dow 30
Authors:
Brian F. Tivnan,
David Rushing Dewhurst,
Colin M. Van Oort,
John H. Ring IV,
Tyler J. Gray,
Brendan F. Tivnan,
Matthew T. K. Koehler,
Matthew T. McMahon,
David Slater,
Jason Veneman,
Christopher M. Danforth
Abstract:
Using the most comprehensive source of commercially available data on the US National Market System, we analyze all quotes and trades associated with Dow 30 stocks in 2016 from the vantage point of a single and fixed frame of reference. We find that inefficiencies created in part by the fragmentation of the equity marketplace are relatively common and persist for longer than what physical constraints may suggest. Information feeds reported different prices for the same equity more than 120 million times, with almost 64 million dislocation segments featuring meaningfully longer duration and higher magnitude. During this period, roughly 22% of all trades occurred while the SIP and aggregated direct feeds were dislocated. The current market configuration resulted in a realized opportunity cost totaling over $160 million when compared with a single feed, single exchange alternative---a conservative estimate that does not take into account intra-day offsetting events.
Submitted 18 November, 2019; v1 submitted 12 February, 2019;
originally announced February 2019.
-
Price Discovery and the Accuracy of Consolidated Data Feeds in the U.S. Equity Markets
Authors:
Brian F. Tivnan,
David Slater,
James R. Thompson,
Tobin A. Bergen-Hill,
Carl D. Burke,
Shaun M. Brady,
Matthew T. K. Koehler,
Matthew T. McMahon,
Brendan F. Tivnan,
Jason Veneman
Abstract:
Both the scientific community and the popular press have paid much attention to the speed of the Securities Information Processor, the data feed consolidating all trades and quotes across the US stock market. Rather than the speed of the Securities Information Processor, or SIP, we focus here on its accuracy. Relying on Trade and Quote data, we provide various measures of SIP latency relative to high-speed data feeds between exchanges, known as direct feeds. We use first differences to highlight not only the divergence between the direct feeds and the SIP, but also the fundamental inaccuracy of the SIP. We find that as many as 60 percent or more of trades are reported out of sequence for stocks with high trade volume, therefore skewing simple measures such as returns. While not yet definitive, this analysis supports our preliminary conclusion that the underlying infrastructure of the SIP is currently unable to keep pace with the trading activity in today's stock market.
Submitted 25 October, 2018;
originally announced October 2018.
-
Reply to Garcia et al.: Common mistakes in measuring frequency dependent word characteristics
Authors:
P. S. Dodds,
E. M. Clark,
S. Desu,
M. R. Frank,
A. J. Reagan,
J. R. Williams,
L. Mitchell,
K. D. Harris,
I. M. Kloumann,
J. P. Bagrow,
K. Megerdoomian,
M. T. McMahon,
B. F. Tivnan,
C. M. Danforth
Abstract:
We demonstrate that the concerns expressed by Garcia et al. are misplaced, due to (1) a misreading of our findings in [1]; (2) a widespread failure to examine and present words in support of asserted summary quantities based on word usage frequencies; and (3) a range of misconceptions about word usage frequency, word rank, and expert-constructed word lists. In particular, we show that the English component of our study compares well statistically with two related surveys, that no survey design influence is apparent, and that estimates of measurement error do not explain the positivity biases reported in our work and that of others. We further demonstrate that for the frequency dependence of positivity---of which we explored the nuances in great detail in [1]---Garcia et al. did not perform a reanalysis of our data---they instead carried out an analysis of a different, statistically improper data set and introduced a nonlinearity before performing linear regression.
Submitted 28 May, 2015; v1 submitted 25 May, 2015;
originally announced May 2015.
-
Reducing Cascading Failure Risk by Increasing Infrastructure Network Interdependency
Authors:
Mert Korkali,
Jason G. Veneman,
Brian F. Tivnan,
Paul D. H. Hines
Abstract:
Increased coupling between critical infrastructure networks, such as power and communication systems, will have important implications for the reliability and security of these systems. To understand the effects of power-communication coupling, several have studied interdependent network models and reported that increased coupling can increase system vulnerability. However, these results come from models that have substantially different mechanisms of cascading, relative to those found in actual power and communication networks. This paper reports on two sets of experiments that compare the network vulnerability implications resulting from simple topological models and models that more accurately capture the dynamics of cascading in power systems. First, we compare a simple model of topological contagion to a model of cascading in power systems and find that the power grid shows a much higher level of vulnerability, relative to the contagion model. Second, we compare a model of topological cascades in coupled networks to three different physics-based models of power grids coupled to communication networks. Again, the more accurate models suggest very different conclusions. In all but the most extreme case, the physics-based power grid models indicate that increased power-communication coupling decreases vulnerability. This is opposite from what one would conclude from the coupled topological model, in which zero coupling is optimal. Finally, an extreme case in which communication failures immediately cause grid failures, suggests that if systems are poorly designed, increased coupling can be harmful. Together these results suggest design strategies for reducing the risk of cascades in interdependent infrastructure systems.
Submitted 23 June, 2015; v1 submitted 24 October, 2014;
originally announced October 2014.
-
Human language reveals a universal positivity bias
Authors:
Peter Sheridan Dodds,
Eric M. Clark,
Suma Desu,
Morgan R. Frank,
Andrew J. Reagan,
Jake Ryland Williams,
Lewis Mitchell,
Kameron Decker Harris,
Isabel M. Kloumann,
James P. Bagrow,
Karine Megerdoomian,
Matthew T. McMahon,
Brian F. Tivnan,
Christopher M. Danforth
Abstract:
Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (1) the words of natural human language possess a universal positivity bias; (2) the estimated emotional content of words is consistent between languages under translation; and (3) this positivity bias is strongly independent of frequency of word usage. Alongside these general regularities, we describe inter-language variations in the emotional spectrum of languages which allow us to rank corpora. We also show how our word evaluations can be used to construct physical-like instruments for both real-time and offline measurement of the emotional content of large-scale texts.
Submitted 15 June, 2014;
originally announced June 2014.
-
Shadow networks: Discovering hidden nodes with models of information flow
Authors:
James P. Bagrow,
Suma Desu,
Morgan R. Frank,
Narine Manukyan,
Lewis Mitchell,
Andrew Reagan,
Eric E. Bloedorn,
Lashon B. Booker,
Luther K. Branting,
Michael J. Smith,
Brian F. Tivnan,
Christopher M. Danforth,
Peter S. Dodds,
Joshua C. Bongard
Abstract:
Complex, dynamic networks underlie many systems, and understanding these networks is the concern of a great span of important scientific and engineering problems. Quantitative description is crucial for this understanding yet, due to a range of measurement problems, many real network datasets are incomplete. Here we explore how accidentally missing or deliberately hidden nodes may be detected in networks by the effect of their absence on predictions of the speed with which information flows through the network. We use Symbolic Regression (SR) to learn models relating information flow to network topology. These models show localized, systematic, and non-random discrepancies when applied to test networks with intentionally masked nodes, demonstrating the ability to detect the presence of missing nodes and where in the network those nodes are likely to reside.
Submitted 20 December, 2013;
originally announced December 2013.