-
Characterising Interventions in Causal Games
Authors:
Manuj Mishra,
James Fox,
Michael Wooldridge
Abstract:
Causal games are probabilistic graphical models that enable causal queries to be answered in multi-agent settings. They extend causal Bayesian networks by specifying decision and utility variables to represent the agents' degrees of freedom and objectives. In multi-agent settings, whether each agent decides on their policy before or after knowing the causal intervention is important as this affects whether they can respond to the intervention by adapting their policy. Consequently, previous work in causal games imposed chronological constraints on permissible interventions. We relax this by outlining a sound and complete set of primitive causal interventions so the effect of any arbitrarily complex interventional query can be studied in multi-agent settings. We also demonstrate applications to the design of safe AI systems by considering causal mechanism design and commitment.
Submitted 13 June, 2024;
originally announced June 2024.
-
Data-driven low-dimensional model of a sedimenting flexible fiber
Authors:
Andrew J Fox,
Michael D. Graham
Abstract:
The dynamics of flexible filaments entrained in flow, important for understanding many biological and industrial processes, are computationally expensive to model with full-physics simulations. This work describes a data-driven technique to create high-fidelity low-dimensional models of flexible fiber dynamics using machine learning; the technique is applied to sedimentation in a quiescent, viscous Newtonian fluid, using results from detailed simulations as the data set. The approach combines an autoencoder neural network architecture to learn a low-dimensional latent representation of the filament shape, with a neural ODE that learns the evolution of the particle in the latent state. The model was designed to model filaments of varying flexibility, characterized by an elasto-gravitational number $\mathcal{B}$, and was trained on a data set containing the evolution of fibers beginning at set angles of inclination. For the range of $\mathcal{B}$ considered here (100-10000), the filament shape dynamics can be represented with high accuracy with only four degrees of freedom, in contrast to the 93 present in the original bead-spring model used to generate the dynamic trajectories. We predict the evolution of fibers set at arbitrary angles and demonstrate that our data-driven model can accurately forecast the evolution of a fiber at both trained and untrained elasto-gravitational numbers.
Submitted 16 May, 2024;
originally announced May 2024.
-
Interpreting Time Series Transformer Models and Sensitivity Analysis of Population Age Groups to COVID-19 Infections
Authors:
Md Khairul Islam,
Tyler Valentine,
Timothy Joowon Sue,
Ayush Karmacharya,
Luke Neil Benham,
Zhengguang Wang,
Kingsley Kim,
Judy Fox
Abstract:
Interpreting deep learning time series models is crucial in understanding the model's behavior and learning patterns from raw data for real-time decision-making. However, the complexity inherent in transformer-based time series models poses challenges in explaining the impact of individual features on predictions. In this study, we leverage recent local interpretation methods to interpret state-of-the-art time series models. To use real-world datasets, we collected three years of daily case data for 3,142 US counties. Firstly, we compare six transformer-based models and choose the best prediction model for COVID-19 infection. Using 13 input features from the last two weeks, we can predict the cases for the next two weeks. Secondly, we present an innovative way to evaluate the prediction sensitivity to 8 population age groups over highly dynamic multivariate infection data. Thirdly, we compare our proposed perturbation-based interpretation method with related work, including a total of eight local interpretation methods. Finally, we apply our framework to traffic and electricity datasets, demonstrating that our approach is generic and can be applied to other time-series domains.
Submitted 25 January, 2024;
originally announced January 2024.
-
A multipartite analogue of Dilworth's Theorem
Authors:
Jacob Fox,
Huy Tuan Pham
Abstract:
We prove that every partially ordered set on $n$ elements contains $k$ subsets $A_{1},A_{2},\dots,A_{k}$ such that either each of these subsets has size $\Omega(n/k^{5})$ and, for every $i<j$, every element in $A_{i}$ is less than or equal to every element in $A_{j}$, or each of these subsets has size $\Omega(n/(k^{2}\log n))$ and, for every $i \ne j$, every element in $A_{i}$ is incomparable with every element in $A_{j}$. This answers a question of the first author from 2006. As a corollary, we prove for each positive integer $h$ there is $C_h$ such that for any $h$ partial orders $<_{1},<_{2},\dots,<_{h}$ on a set of $n$ elements, there exist $k$ subsets $A_{1},A_{2},\dots,A_{k}$ each of size at least $n/(k\log n)^{C_{h}}$ such that for each partial order $<_{\ell}$, either $a_{1}<_{\ell}a_{2}<_{\ell}\dots<_{\ell}a_{k}$ for any tuple of elements $(a_1,a_2,\dots,a_k) \in A_1\times A_2\times \dots \times A_k$, or $a_{1}>_{\ell}a_{2}>_{\ell}\dots>_{\ell}a_{k}$ for any $(a_1,a_2,\dots,a_k) \in A_1\times A_2\times \dots \times A_k$, or $a_i$ is incomparable with $a_j$ for any $i\ne j$, $a_i\in A_i$ and $a_j\in A_j$. This improves on a 2009 result of Pach and the first author motivated by problems in discrete geometry.
Submitted 1 January, 2024;
originally announced January 2024.
-
A structure theorem for pseudo-segments and its applications
Authors:
Jacob Fox,
Janos Pach,
Andrew Suk
Abstract:
We prove a far-reaching strengthening of Szemerédi's regularity lemma for intersection graphs of pseudo-segments. It shows that the vertex set of such a graph can be partitioned into a bounded number of parts of roughly the same size such that almost all bipartite graphs between different pairs of parts are complete or empty. We use this to get an improved bound on disjoint edges in simple topological graphs, showing that every $n$-vertex simple topological graph with no $k$ pairwise disjoint edges has at most $n(\log n)^{O(\log k)}$ edges.
Submitted 1 December, 2023;
originally announced December 2023.
-
An International Consortium for Evaluations of Societal-Scale Risks from Advanced AI
Authors:
Ross Gruetzemacher,
Alan Chan,
Kevin Frazier,
Christy Manning,
Štěpán Los,
James Fox,
José Hernández-Orallo,
John Burden,
Matija Franklin,
Clíodhna Ní Ghuidhir,
Mark Bailey,
Daniel Eth,
Toby Pilditch,
Kyle Kilian
Abstract:
Given rapid progress toward advanced AI and risks from frontier AI systems (advanced AI systems pushing the boundaries of the AI capabilities frontier), the creation and implementation of AI governance and regulatory schemes deserves prioritization and substantial investment. However, the status quo is untenable and, frankly, dangerous. A regulatory gap has permitted AI labs to conduct research, development, and deployment activities with minimal oversight. In response, frontier AI system evaluations have been proposed as a way of assessing risks from the development and deployment of frontier AI systems. Yet, the budding AI risk evaluation ecosystem faces significant coordination challenges, such as a limited diversity of evaluators, suboptimal allocation of effort, and perverse incentives. This paper proposes a solution in the form of an international consortium for AI risk evaluations, comprising both AI developers and third-party AI risk evaluators. Such a consortium could play a critical role in international efforts to mitigate societal-scale risks from advanced AI, including in managing responsible scaling policies and coordinated evaluation-based risk response. In this paper, we discuss the current evaluation ecosystem and its shortcomings, propose an international consortium for advanced AI risk evaluations, discuss issues regarding its implementation, discuss lessons that can be learnt from previous international institutions and existing proposals for international AI governance institutions, and, finally, we recommend concrete steps to advance the establishment of the proposed consortium: (i) solicit feedback from stakeholders, (ii) conduct additional research, (iii) conduct a workshop(s) for stakeholders, (iv) analyze feedback and create final proposal, (v) solicit funding, and (vi) create a consortium.
Submitted 6 November, 2023; v1 submitted 22 October, 2023;
originally announced October 2023.
-
Updated Corpora and Benchmarks for Long-Form Speech Recognition
Authors:
Jennifer Drexler Fox,
Desh Raj,
Natalie Delworth,
Quinn McNamara,
Corey Miller,
Migüel Jetté
Abstract:
The vast majority of ASR research uses corpora in which both the training and test data have been pre-segmented into utterances. In most real-world ASR use-cases, however, test audio is not segmented, leading to a mismatch between inference-time conditions and models trained on segmented utterances. In this paper, we re-release three standard ASR corpora - TED-LIUM 3, GigaSpeech, and VoxPopuli-en - with updated transcription and alignments to enable their use for long-form ASR research. We use these reconstituted corpora to study the train-test mismatch problem for transducers and attention-based encoder-decoders (AEDs), confirming that AEDs are more susceptible to this issue. Finally, we benchmark a simple long-form training for these models, showing its efficacy for model robustness under this domain shift.
Submitted 26 September, 2023;
originally announced September 2023.
-
Fairness and Privacy in Federated Learning and Their Implications in Healthcare
Authors:
Navya Annapareddy,
Jade Preston,
Judy Fox
Abstract:
Currently, many contexts exist where distributed learning is difficult or otherwise constrained by security and communication limitations. One common such domain is healthcare, where data is often governed by data-use ordinances such as HIPAA. On the other hand, larger sample sizes and shared data models are necessary to allow models to better generalize on account of the potential for more variability and balancing underrepresented classes. Federated learning is a type of distributed learning model that allows data to be trained in a decentralized manner. This, in turn, addresses data security, privacy, and vulnerability considerations as the data itself is not shared across the nodes of a given learning network. Three main challenges to federated learning are that node data is not independent and identically distributed (iid), that clients require high levels of communication overhead between peers, and that clients within a network are heterogeneous with respect to dataset bias and size. As the field has grown, the notion of fairness in federated learning has also been introduced through novel implementations. Fairness approaches differ from the standard form of federated learning and also have distinct challenges and considerations for the healthcare domain. This paper endeavors to outline the typical lifecycle of fair federated learning in research as well as provide an updated taxonomy to account for the current state of fairness in implementations. Lastly, this paper provides added insight into the implications and challenges of implementing and supporting fairness in federated learning in the healthcare domain.
Submitted 15 August, 2023;
originally announced August 2023.
-
Towards Fair and Privacy Preserving Federated Learning for the Healthcare Domain
Authors:
Navya Annapareddy,
Yingzheng Liu,
Judy Fox
Abstract:
Federated learning enables data sharing in healthcare contexts where it might otherwise be difficult due to data-use-ordinances or security and communication constraints. Distributed and shared data models allow models to become generalizable and learn from heterogeneous clients. While addressing data security, privacy, and vulnerability considerations, data itself is not shared across nodes in a given learning network. On the other hand, FL models often struggle with variable client data distributions and operate on an assumption of independent and identically distributed data. As the field has grown, the notion of fairness-aware federated learning mechanisms has also been introduced and is of distinct significance to the healthcare domain where many sensitive groups and protected classes exist. In this paper, we create a benchmark methodology for FAFL mechanisms under various heterogeneous conditions on datasets in the healthcare domain typically outside the scope of current federated learning benchmarks, such as medical imaging and waveform data formats. Our results indicate considerable variation in how various FAFL schemes respond to high levels of data heterogeneity. Additionally, doing so under privacy-preserving conditions can create significant increases in network communication cost and latency compared to the typical federated learning scheme.
Submitted 3 August, 2023;
originally announced August 2023.
-
On Imperfect Recall in Multi-Agent Influence Diagrams
Authors:
James Fox,
Matt MacDermott,
Lewis Hammond,
Paul Harrenstein,
Alessandro Abate,
Michael Wooldridge
Abstract:
Multi-agent influence diagrams (MAIDs) are a popular game-theoretic model based on Bayesian networks. In some settings, MAIDs offer significant advantages over extensive-form game representations. Previous work on MAIDs has assumed that agents employ behavioural policies, which set independent conditional probability distributions over actions for each of their decisions. In settings with imperfect recall, however, a Nash equilibrium in behavioural policies may not exist. We overcome this by showing how to solve MAIDs with forgetful and absent-minded agents using mixed policies and two types of correlated equilibrium. We also analyse the computational complexity of key decision problems in MAIDs, and explore tractable cases. Finally, we describe applications of MAIDs to Markov games and team situations, where imperfect recall is often unavoidable.
Submitted 11 July, 2023;
originally announced July 2023.
-
Population Age Group Sensitivity for COVID-19 Infections with Deep Learning
Authors:
Md Khairul Islam,
Tyler Valentine,
Royal Wang,
Levi Davis,
Matt Manner,
Judy Fox
Abstract:
The COVID-19 pandemic has created unprecedented challenges for governments and healthcare systems worldwide, highlighting the critical importance of understanding the factors that contribute to virus transmission. This study aimed to identify the most influential age groups in COVID-19 infection rates at the US county level using the Modified Morris Method and deep learning for time series. Our approach involved training the state-of-the-art time-series model Temporal Fusion Transformer on different age groups as a static feature and the population vaccination status as the dynamic feature. We analyzed the impact of those age groups on COVID-19 infection rates by perturbing individual input features and ranked them based on their Morris sensitivity scores, which quantify their contribution to COVID-19 transmission rates. The findings are verified using ground truth data from the CDC and US Census, which provide the true infection rates for each age group. The results suggest that young adults were the most influential age group in COVID-19 transmission at the county level between March 1, 2020, and November 27, 2021. These results can inform public health policies and interventions, such as targeted vaccination strategies, to better control the spread of the virus. Our approach demonstrates the utility of feature sensitivity analysis in identifying critical factors contributing to COVID-19 transmission and can be applied in other public health domains.
Submitted 3 July, 2023;
originally announced July 2023.
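The perturb-and-rank idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' Modified Morris Method or their Temporal Fusion Transformer pipeline; it is a simplified one-at-a-time variant, and `toy_model`, `morris_sensitivity`, `delta`, and `n_trajectories` are hypothetical names chosen for illustration.

```python
import random


def morris_sensitivity(model, baseline, delta=0.1, n_trajectories=20, seed=0):
    """Mean absolute elementary effect per input feature.

    Simplified one-at-a-time scheme (an assumption, not the paper's method):
    jitter the baseline, perturb each feature by `delta`, and average the
    absolute change in model output divided by `delta`.
    """
    rng = random.Random(seed)
    n_features = len(baseline)
    totals = [0.0] * n_features
    for _ in range(n_trajectories):
        # Random starting point near the baseline.
        x = [v + rng.uniform(-0.5, 0.5) for v in baseline]
        y0 = model(x)
        for i in range(n_features):
            x_perturbed = list(x)
            x_perturbed[i] += delta
            totals[i] += abs(model(x_perturbed) - y0) / delta
    return [t / n_trajectories for t in totals]


def toy_model(x):
    # Hypothetical stand-in for a trained forecaster: feature 0 matters most,
    # feature 2 has no influence on the output.
    return 3.0 * x[0] + 0.5 * x[1] + 0.0 * x[2]


scores = morris_sensitivity(toy_model, [1.0, 1.0, 1.0])
# Feature indices ordered from most to least influential.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

For a real forecaster, `model` would wrap the trained network's prediction call, and each feature index would correspond to one age-group input.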
-
Designing explainable artificial intelligence with active inference: A framework for transparent introspection and decision-making
Authors:
Mahault Albarracin,
Inês Hipólito,
Safae Essafi Tremblay,
Jason G. Fox,
Gabriel René,
Karl Friston,
Maxwell J. D. Ramstead
Abstract:
This paper investigates the prospect of developing human-interpretable, explainable artificial intelligence (AI) systems based on active inference and the free energy principle. We first provide a brief overview of active inference, and in particular, of how it applies to the modeling of decision-making, introspection, as well as the generation of overt and covert actions. We then discuss how active inference can be leveraged to design explainable AI systems, namely, by allowing us to model core features of ``introspective'' processes and by generating useful, human-interpretable models of the processes involved in decision-making. We propose an architecture for explainable AI systems using active inference. This architecture foregrounds the role of an explicit hierarchical generative model, the operation of which enables the AI system to track and explain the factors that contribute to its own decisions, and whose structure is designed to be interpretable and auditable by human users. We outline how this architecture can integrate diverse sources of information to make informed decisions in an auditable manner, mimicking or reproducing aspects of human-like consciousness and introspection. Finally, we discuss the implications of our findings for future research in AI, and the potential ethical considerations of developing AI systems with (the appearance of) introspective capabilities.
Submitted 6 June, 2023;
originally announced June 2023.
-
Set-coloring Ramsey numbers and error-correcting codes near the zero-rate threshold
Authors:
David Conlon,
Jacob Fox,
Huy Tuan Pham,
Yufei Zhao
Abstract:
For positive integers $n,r,s$ with $r > s$, the set-coloring Ramsey number $R(n;r,s)$ is the minimum $N$ such that if every edge of the complete graph $K_N$ receives a set of $s$ colors from a palette of $r$ colors, then there is a subset of $n$ vertices where all of the edges between them receive a common color. If $n$ is fixed and $\frac{s}{r}$ is less than and bounded away from $1-\frac{1}{n-1}$, then $R(n;r,s)$ is known to grow exponentially in $r$, while if $\frac{s}{r}$ is greater than and bounded away from $1-\frac{1}{n-1}$, then $R(n;r,s)$ is bounded. Here we prove bounds for $R(n;r,s)$ in the intermediate range where $\frac{s}{r}$ is close to $1 - \frac{1}{n-1}$ by establishing a connection to the maximum size of error-correcting codes near the zero-rate threshold.
Submitted 14 August, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems
Authors:
Kevin Zeng,
Carlos E. Pérez De Jesús,
Andrew J. Fox,
Michael D. Graham
Abstract:
While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and $L_2$ regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal manifold coordinate system, and provide the mapping functions between the ambient space and manifold space, allowing for out-of-sample projections. We validate our framework's ability to estimate the manifold dimension for a series of datasets from dynamical systems of varying complexities and compare to other state-of-the-art estimators. We analyze the training dynamics of the network to glean insight into the mechanism of low-rank learning and find that collectively each of the implicit regularizing layers compound the low-rank representation and even self-correct during training. Analysis of gradient descent dynamics for this architecture in the linear case reveals the role of the internal linear layers in leading to faster decay of a "collective weight variable" incorporating all layers, and the role of weight decay in breaking degeneracies and thus driving convergence along directions in which no decay would occur in its absence. We show that this framework can be naturally extended for applications of state-space modeling and forecasting by generating a data-driven dynamic model of a spatiotemporally chaotic partial differential equation using only the manifold coordinates. Finally, we demonstrate that our framework is robust to hyperparameter choices.
Submitted 6 December, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Real-Time Traffic End-of-Queue Detection and Tracking in UAV Video
Authors:
Russ Messenger,
Md Zobaer Islam,
Matthew Whitlock,
Erik Spong,
Nate Morton,
Layne Claggett,
Chris Matthews,
Jordan Fox,
Leland Palmer,
Dane C. Johnson,
John F. O'Hara,
Christopher J. Crick,
Jamey D. Jacob,
Sabit Ekin
Abstract:
Highway work zones are susceptible to undue accumulation of motorized vehicles which calls for dynamic work zone warning signs to prevent accidents. The work zone signs are placed according to the location of the end-of-queue of vehicles which usually changes rapidly. The detection of moving objects in video captured by Unmanned Aerial Vehicles (UAV) has been extensively researched so far, and is used in a wide array of applications including traffic monitoring. Unlike the fixed traffic cameras, UAVs can be used to monitor the traffic at work zones in real-time and also in a more cost-effective way. This study presents a method as a proof of concept for detecting End-of-Queue (EOQ) of traffic by processing the real-time video footage of a highway work zone captured by UAV. EOQ is detected in the video by image processing which includes background subtraction and blob detection methods. This dynamic localization of EOQ of vehicles will enable faster and more accurate relocation of work zone warning signs for drivers and thus will reduce work zone fatalities. The method can be applied to detect EOQ of vehicles and notify drivers in any other roads or intersections too where vehicles are rapidly accumulating due to special events, traffic jams, construction, or accidents.
Submitted 31 October, 2023; v1 submitted 9 January, 2023;
originally announced February 2023.
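The abstract above names background subtraction and blob detection as the image-processing steps. The following is a minimal pure-Python sketch of those two steps on a tiny grayscale grid, not the authors' implementation. The threshold, the `min_area` noise filter, and the convention that traffic flows toward row 0 (so the queue's tail is the vehicle blob with the largest row index) are all assumptions made for illustration.

```python
from collections import deque


def foreground_mask(frame, background, threshold=30):
    """Background subtraction: mark pixels that differ strongly from the background."""
    return [[abs(f - b) > threshold for f, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


def find_blobs(mask):
    """Blob detection via 4-connected component labelling (breadth-first search)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue = deque([(r, c)])
                pixels = []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                blobs.append(pixels)
    return blobs


def end_of_queue_row(blobs, min_area=2):
    """Largest row index covered by any blob of at least `min_area` pixels.

    Assumes traffic flows toward row 0, so the tail of the queue is the
    vehicle blob furthest down the image. Returns None if no blob qualifies.
    """
    vehicles = [b for b in blobs if len(b) >= min_area]
    if not vehicles:
        return None
    return max(max(y for y, _ in blob) for blob in vehicles)


# Synthetic 6x4 scene: uniform road, two 2-pixel "vehicles", one noise pixel.
background = [[100] * 4 for _ in range(6)]
frame = [row[:] for row in background]
for r in (0, 1, 3, 4):
    frame[r][1] = 200
frame[5][3] = 180  # isolated noise pixel, removed by the min_area filter

blobs = find_blobs(foreground_mask(frame, background))
eoq = end_of_queue_row(blobs)
print(len(blobs), eoq)
```

In practice a library such as OpenCV would replace both helper functions; the sketch only shows how the two steps compose into an end-of-queue estimate.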
-
Reasoning about Causality in Games
Authors:
Lewis Hammond,
James Fox,
Tom Everitt,
Ryan Carey,
Alessandro Abate,
Michael Wooldridge
Abstract:
Causal reasoning and game-theoretic reasoning are fundamental topics in artificial intelligence, among many other disciplines: this paper is concerned with their intersection. Despite their importance, a formal framework that supports both these forms of reasoning has, until now, been lacking. We offer a solution in the form of (structural) causal games, which can be seen as extending Pearl's causal hierarchy to the game-theoretic domain, or as extending Koller and Milch's multi-agent influence diagrams to the causal domain. We then consider three key questions: i) How can the (causal) dependencies in games - either between variables, or between strategies - be modelled in a uniform, principled manner? ii) How may causal queries be computed in causal games, and what assumptions does this require? iii) How do causal games compare to existing formalisms? To address question i), we introduce mechanised games, which encode dependencies between agents' decision rules and the distributions governing the game. In response to question ii), we present definitions of predictions, interventions, and counterfactuals, and discuss the assumptions required for each. Regarding question iii), we describe correspondences between causal games and other formalisms, and explain how causal games can be used to answer queries that other causal or game-theoretic models do not support. Finally, we highlight possible applications of causal games, aided by an extensive open-source Python library.
Submitted 17 April, 2023; v1 submitted 5 January, 2023;
originally announced January 2023.
-
Designing Ecosystems of Intelligence from First Principles
Authors:
Karl J Friston,
Maxwell J D Ramstead,
Alex B Kiefer,
Alexander Tschantz,
Christopher L Buckley,
Mahault Albarracin,
Riddhi J Pitliya,
Conor Heins,
Brennan Klein,
Beren Millidge,
Dalton A R Sakthivadivel,
Toby St Clere Smithe,
Magnus Koudahl,
Safae Essafi Tremblay,
Capm Petersen,
Kaiser Fung,
Jason G Fox,
Steven Swanson,
Dan Mapes,
Gabriel René
Abstract:
This white paper lays out a vision of research and development in the field of artificial intelligence for the next decade (and beyond). Its denouement is a cyber-physical ecosystem of natural and synthetic sense-making, in which humans are integral participants -- what we call ''shared intelligence''. This vision is premised on active inference, a formulation of adaptive behavior that can be read as a physics of intelligence, and which inherits from the physics of self-organization. In this context, we understand intelligence as the capacity to accumulate evidence for a generative model of one's sensed world -- also known as self-evidencing. Formally, this corresponds to maximizing (Bayesian) model evidence, via belief updating over several scales: i.e., inference, learning, and model selection. Operationally, this self-evidencing can be realized via (variational) message passing or belief propagation on a factor graph. Crucially, active inference foregrounds an existential imperative of intelligent systems; namely, curiosity or the resolution of uncertainty. This same imperative underwrites belief sharing in ensembles of agents, in which certain aspects (i.e., factors) of each agent's generative world model provide a common ground or frame of reference. Active inference plays a foundational role in this ecology of belief sharing -- leading to a formal account of collective intelligence that rests on shared narratives and goals. We also consider the kinds of communication protocols that must be developed to enable such an ecosystem of intelligences and motivate the development of a shared hyper-spatial modeling language and transaction protocol, as a first -- and key -- step towards such an ecology.
Submitted 11 January, 2024; v1 submitted 2 December, 2022;
originally announced December 2022.
-
Interpreting County Level COVID-19 Infection and Feature Sensitivity using Deep Learning Time Series Models
Authors:
Md Khairul Islam,
Di Zhu,
Yingzheng Liu,
Andrej Erkelens,
Nick Daniello,
Judy Fox
Abstract:
Interpretable machine learning plays a key role in healthcare because understanding feature importance in deep learning model predictions is challenging. We propose a novel framework that uses deep learning to study feature sensitivity for model predictions. This work combines sensitivity analysis with heterogeneous time-series deep learning model prediction, which corresponds to the interpretations of spatio-temporal features. We forecast county-level COVID-19 infection using the Temporal Fusion Transformer. We then use sensitivity analysis extending the Morris method to see how sensitive the outputs are to perturbations of our static and dynamic input features. The significance of the work is grounded in a real-world COVID-19 infection prediction with highly non-stationary, finely granular, and heterogeneous data. 1) Our model can capture the detailed daily changes of temporal and spatial model behaviors and achieves high prediction performance compared to a PyTorch baseline. 2) By analyzing the Morris sensitivity indices and attention patterns, we decipher the meaning of feature importance with observational population and dynamic model changes. 3) We have collected 2.5 years of socioeconomic and health features over 3142 US counties, such as observed cases and deaths, and a number of static (age distribution, health disparity, and industry) and dynamic features (vaccination, disease spread, transmissible cases, and social distancing). Using the proposed framework, we conduct extensive experiments and show our model can learn complex interactions and perform predictions for daily infection at the county level. Being able to model the disease infection with a hybrid prediction and description accuracy measurement with the Morris index at the county level is a central idea that sheds light on individual feature interpretation via sensitivity analysis.
Submitted 6 October, 2022;
originally announced October 2022.
-
Improving Contextual Recognition of Rare Words with an Alternate Spelling Prediction Model
Authors:
Jennifer Drexler Fox,
Natalie Delworth
Abstract:
Contextual ASR, which takes a list of bias terms as input along with audio, has drawn recent interest as ASR use becomes more widespread. We are releasing contextual biasing lists to accompany the Earnings21 dataset, creating a public benchmark for this task. We present baseline results on this benchmark using a pretrained end-to-end ASR model from the WeNet toolkit. We show results for shallow fusion contextual biasing applied to two different decoding algorithms. Our baseline results confirm observations that end-to-end models struggle in particular with words that are rarely or never seen during training, and that existing shallow fusion techniques do not adequately address this problem. We propose an alternate spelling prediction model that improves recall of rare words by 34.7% relative and of out-of-vocabulary words by 97.2% relative, compared to contextual biasing without alternate spellings. This model is conceptually similar to ones used in prior work, but is simpler to implement as it does not rely on either a pronunciation dictionary or an existing text-to-speech system.
Submitted 2 September, 2022;
originally announced September 2022.
-
Learning Task Automata for Reinforcement Learning using Hidden Markov Models
Authors:
Alessandro Abate,
Yousif Almulla,
James Fox,
David Hyland,
Michael Wooldridge
Abstract:
Training reinforcement learning (RL) agents using scalar reward signals is often infeasible when an environment has sparse and non-Markovian rewards. Moreover, handcrafting these reward functions before training is prone to misspecification, especially when the environment's dynamics are only partially known. This paper proposes a novel pipeline for learning non-Markovian task specifications as succinct finite-state `task automata' from episodes of agent experience within unknown environments. We leverage two key algorithmic insights. First, we learn a product MDP, a model composed of the specification's automaton and the environment's MDP (both initially unknown), by treating the product MDP as a partially observable MDP and using the well-known Baum-Welch algorithm for learning hidden Markov models. Second, we propose a novel method for distilling the task automaton (assumed to be a deterministic finite automaton) from the learnt product MDP. Our learnt task automaton enables the decomposition of a task into its constituent sub-tasks, which improves the rate at which an RL agent can later synthesise an optimal policy. It also provides an interpretable encoding of high-level environmental and task features, so a human can readily verify that the agent has learnt coherent tasks with no misspecifications. In addition, we take steps towards ensuring that the learnt automaton is environment-agnostic, making it well-suited for use in transfer learning. Finally, we provide experimental results compared with two baselines to illustrate our algorithm's performance in different environments and tasks.
Submitted 3 October, 2023; v1 submitted 24 August, 2022;
originally announced August 2022.
-
Extremal results on feedback arc sets in digraphs
Authors:
Jacob Fox,
Zoe Himwich,
Nitya Mani
Abstract:
A directed graph is oriented if it can be obtained by orienting the edges of a simple, undirected graph. For an oriented graph $G$, let $β(G)$ denote the size of a minimum feedback arc set, a smallest subset of edges whose deletion leaves an acyclic subgraph. A simple consequence of a result of Berger and Shor is that any oriented graph $G$ with $m$ edges satisfies $β(G) = m/2 - Ω(m^{3/4})$.
We observe that if an oriented graph $G$ has a fixed forbidden subgraph $B$, the upper bound of $β(G) = m/2 - Ω(m^{3/4})$ is best possible as a function of the number of edges if $B$ is not bipartite, but the exponent $3/4$ in the lower order term can be improved if $B$ is bipartite. We also show that for every rational number $r$ between $3/4$ and $1$, there is a finite collection of digraphs $\mathcal{B}$ such that every $\mathcal{B}$-free digraph $G$ with $m$ edges satisfies $β(G) = m/2 - Ω(m^r)$, and this bound is best possible up to the implied constant factor. The proof uses a connection to Turán numbers and a result of Bukh and Conlon. Both of our upper bounds come equipped with randomized linear-time algorithms that construct feedback arc sets achieving those bounds. Finally, we give a characterization of quasirandom directed graphs via minimum feedback arc sets.
Submitted 19 April, 2022; v1 submitted 4 April, 2022;
originally announced April 2022.
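A hedged aside on the entry above: the $m/2$ baseline that these bounds improve on follows from a simple argument. In an oriented graph, each edge points backward in exactly one of a vertex ordering and its reverse, so one of the two sets of backward edges has size at most $m/2$, and deleting it leaves an acyclic graph. The Python sketch below illustrates only this baseline, not the authors' algorithms; all function names are hypothetical.

```python
import random
from collections import defaultdict, deque


def backward_edges(edges, order):
    """Edges (u, v) that point from a later vertex to an earlier one in `order`."""
    pos = {v: i for i, v in enumerate(order)}
    return [(u, v) for (u, v) in edges if pos[u] > pos[v]]


def baseline_feedback_arc_set(vertices, edges, seed=0):
    """Return a feedback arc set of size at most m/2 (trivial baseline only).

    Assumes an oriented graph: no loops, at most one edge per vertex pair.
    """
    order = list(vertices)
    random.Random(seed).shuffle(order)
    a = backward_edges(edges, order)
    b = backward_edges(edges, order[::-1])
    # Every edge is in exactly one of a and b, so the smaller has size <= m/2.
    return a if len(a) <= len(b) else b


def is_acyclic(vertices, edges):
    """Kahn's algorithm: True if the directed graph has no directed cycle."""
    indeg = {v: 0 for v in vertices}
    out = defaultdict(list)
    for u, v in edges:
        out[u].append(v)
        indeg[v] += 1
    queue = deque(v for v in indeg if indeg[v] == 0)
    seen = 0
    while queue:
        u = queue.popleft()
        seen += 1
        for w in out[u]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return seen == len(indeg)
```

Removing the returned edges leaves only edges that point forward in the chosen ordering, which is why the remaining graph is acyclic.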
-
Curriculum optimization for low-resource speech recognition
Authors:
Anastasia Kuznetsova,
Anurag Kumar,
Jennifer Drexler Fox,
Francis Tyers
Abstract:
Modern end-to-end speech recognition models show astonishing results in transcribing audio signals into written text. However, conventional data feeding pipelines may be sub-optimal for low-resource speech recognition, which remains a challenging task. We propose an automated curriculum learning approach to optimize the sequence of training examples based on both the progress of the model while training and prior knowledge about the difficulty of the training examples. We introduce a new difficulty measure called compression ratio that can be used as a scoring function for raw audio in various noise conditions. The proposed method improves speech recognition Word Error Rate performance by up to 33% relative over the baseline system.
Submitted 17 February, 2022;
originally announced February 2022.
-
Quasiplanar Graphs, String Graphs, and the Erdos-Gallai Problem
Authors:
Jacob Fox,
Janos Pach,
Andrew Suk
Abstract:
An $r$-quasiplanar graph is a graph drawn in the plane with no $r$ pairwise crossing edges. Let $s \geq 3$ be an integer and $r=2^s$. We prove that there is a constant $C$ such that every $r$-quasiplanar graph with $n \geq r$ vertices has at most $n\left(Cs^{-1}\log n\right)^{2s-4}$ edges.
A graph whose vertices are continuous curves in the plane, two being connected by an edge if and only if they intersect, is called a string graph. We show that for every $ε>0$, there exists $δ>0$ such that every string graph with $n$ vertices, whose chromatic number is at least $n^ε$ contains a clique of size at least $n^δ$. A clique of this size or a coloring using fewer than $n^ε$ colors can be found by a polynomial time algorithm in terms of the size of the geometric representation of the set of strings.
In the process, we use, generalize, and strengthen previous results of Lee, Tomon, and others. All of our theorems are related to geometric variants of the following classical graph-theoretic problem of Erdos, Gallai, and Rogers. Given a $K_r$-free graph on $n$ vertices and an integer $s<r$, at least how many vertices can we find such that the subgraph induced by them is $K_s$-free?
Submitted 25 October, 2022; v1 submitted 4 December, 2021;
originally announced December 2021.
-
Intel Optane DCPMM and Serverless Computing
Authors:
Ahmet Uyar,
Selahattin Akkas,
Jiayu Li,
Judy Fox
Abstract:
This report describes 1) how we use Intel's Optane DCPMM in Memory Mode. We investigate the scalability of applications on a single Optane machine, using subgraph counting as a memory-intensive graph problem. We test with various input graph and subtemplate sizes to determine its performance for different memory and CPU loads, and compare performance on a single-node Optane machine with a distributed set of nodes in a cluster using MPI. 2) We investigate the end-to-end execution delays in serverless computing and study concurrent function executions with cold starts. In future work, we will show that persistent memory machines may significantly improve concurrent function invocations in serverless computing including Amazon Lambda, Microsoft Azure Functions, Google Cloud Functions and IBM Cloud Functions (Apache OpenWhisk).
Submitted 22 September, 2021;
originally announced September 2021.
-
On the number of edges of separated multigraphs
Authors:
Jacob Fox,
Janos Pach,
Andrew Suk
Abstract:
We prove that the number of edges of a multigraph $G$ with $n$ vertices is at most $O(n^2\log n)$, provided that any two edges cross at most once, parallel edges are noncrossing, and the lens enclosed by every pair of parallel edges in $G$ contains at least one vertex. As a consequence, we prove the following extension of the Crossing Lemma of Ajtai, Chvátal, Newborn, Szemerédi and Leighton: if $G$ has $e \geq 4n$ edges, then in any drawing of $G$ with the above property, the number of crossings is $Ω\left(\frac{e^3}{n^2\log(e/n)}\right)$. This answers a question of Kaufmann et al. and is tight up to the logarithmic factor.
Submitted 22 February, 2022; v1 submitted 25 August, 2021;
originally announced August 2021.
-
HySec-Flow: Privacy-Preserving Genomic Computing with SGX-based Big-Data Analytics Framework
Authors:
Chathura Widanage,
Weijie Liu,
Jiayu Li,
Hongbo Chen,
XiaoFeng Wang,
Haixu Tang,
Judy Fox
Abstract:
Trusted execution environments (TEE) such as Intel's Software Guard Extension (SGX) have been widely studied to boost security and privacy protection for the computation of sensitive data such as human genomics. However, a performance hurdle is often generated by SGX, especially from the small enclave memory. In this paper, we propose a new Hybrid Secured Flow framework (called "HySec-Flow") for large-scale genomic data analysis using SGX platforms. Here, the data-intensive computing tasks can be partitioned into independent subtasks to be deployed into distinct secured and non-secured containers, therefore allowing for parallel execution while alleviating the limited size of Enclave Page Cache (EPC) memory in each enclave. We illustrate our contributions using a workflow supporting indexing, alignment, dispatching, and merging the execution of SGX-enabled containers. We provide details regarding the architecture of the trusted and untrusted components and the underlying Scorn and Graphene support as generic shielding execution frameworks to port legacy code. We thoroughly evaluate the performance of our privacy-preserving reads mapping algorithm using real human genome sequencing data. The results demonstrate that the performance is enhanced by partitioning the time-consuming genomic computation into subtasks compared to the conventional execution of the data-intensive reads mapping algorithm in an enclave. The proposed HySec-Flow framework is made available as open source and can be adapted to the data-parallel computation of other large-scale genomic tasks requiring security and scalable computational resources.
Submitted 26 July, 2021;
originally announced July 2021.
-
Concentric Spherical GNN for 3D Representation Learning
Authors:
James Fox,
Bo Zhao,
Sivasankaran Rajamanickam,
Rampi Ramprasad,
Le Song
Abstract:
Learning 3D representations that generalize well to arbitrarily oriented inputs is a challenge of practical importance in applications varying from computer vision to physics and chemistry. We propose a novel multi-resolution convolutional architecture for learning over concentric spherical feature maps, of which the single sphere representation is a special case. Our hierarchical architecture is based on alternatively learning to incorporate both intra-sphere and inter-sphere information. We show the applicability of our method for two different types of 3D inputs, mesh objects, which can be regularly sampled, and point clouds, which are irregularly distributed. We also propose an efficient mapping of point clouds to concentric spherical images, thereby bridging spherical convolutions on grids with general point clouds. We demonstrate the effectiveness of our approach in improving state-of-the-art performance on 3D classification tasks with rotated data.
Submitted 18 March, 2021;
originally announced March 2021.
-
Making an $H$-Free Graph $k$-Colorable
Authors:
Jacob Fox,
Zoe Himwich,
Nitya Mani
Abstract:
We study the following question: how few edges can we delete from any $H$-free graph on $n$ vertices in order to make the resulting graph $k$-colorable? It turns out that various classical problems in extremal graph theory are special cases of this question. For $H$ any fixed odd cycle, we determine the answer up to a constant factor when $n$ is sufficiently large. We also prove an upper bound when $H$ is a fixed clique that we conjecture is tight up to a constant factor, and prove upper bounds for more general families of graphs. We apply our results to get a new bound on the maximum cut of graphs with a forbidden odd cycle in terms of the number of edges.
Submitted 19 March, 2021; v1 submitted 19 February, 2021;
originally announced February 2021.
-
Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice
Authors:
Lewis Hammond,
James Fox,
Tom Everitt,
Alessandro Abate,
Michael Wooldridge
Abstract:
Multi-agent influence diagrams (MAIDs) are a popular form of graphical model that, for certain classes of games, have been shown to offer key complexity and explainability advantages over traditional extensive form game (EFG) representations. In this paper, we extend previous work on MAIDs by introducing the concept of a MAID subgame, as well as subgame perfect and trembling hand perfect equilibrium refinements. We then prove several equivalence results between MAIDs and EFGs. Finally, we describe an open source implementation for reasoning about MAIDs and computing their equilibria.
Submitted 9 February, 2021;
originally announced February 2021.
-
Achieving a quantum smart workforce
Authors:
Clarice D. Aiello,
D. D. Awschalom,
Hannes Bernien,
Tina Brower-Thomas,
Kenneth R. Brown,
Todd A. Brun,
Justin R. Caram,
Eric Chitambar,
Rosa Di Felice,
Michael F. J. Fox,
Stephan Haas,
Alexander W. Holleitner,
Eric R. Hudson,
Jeffrey H. Hunt,
Robert Joynt,
Scott Koziol,
H. J. Lewandowski,
Douglas T. McClure,
Jens Palsberg,
Gina Passante,
Kristen L. Pudenz,
Christopher J. K. Richardson,
Jessica L. Rosenberg,
R. S. Ross,
Mark Saffman
, et al. (7 additional authors not shown)
Abstract:
Interest in building dedicated Quantum Information Science and Engineering (QISE) education programs has greatly expanded in recent years. These programs are inherently convergent, complex, often resource intensive and likely require collaboration with a broad variety of stakeholders. In order to address this combination of challenges, we have captured ideas from many members in the community. This manuscript not only addresses policy makers and funding agencies (both public and private and from the regional to the international level) but also contains needs identified by industry leaders and discusses the difficulties inherent in creating an inclusive QISE curriculum. We report on the status of eighteen post-secondary education programs in QISE and provide guidance for building new programs. Lastly, we encourage the development of a comprehensive strategic plan for quantum education and workforce development as a means to make the most of the ongoing substantial investments being made in QISE.
Submitted 23 October, 2020;
originally announced October 2020.
-
How Robust Are Graph Neural Networks to Structural Noise?
Authors:
James Fox,
Sivasankaran Rajamanickam
Abstract:
Graph neural networks (GNNs) are an emerging model for learning graph embeddings and making predictions on graph structured data. However, robustness of graph neural networks is not yet well-understood. In this work, we focus on node structural identity predictions, where a representative GNN model is able to achieve near-perfect accuracy. We also show that the same GNN model is not robust to addition of structural noise, through a controlled dataset and set of experiments. Finally, we show that under the right conditions, graph-augmented training is capable of significantly improving robustness to structural noise.
Submitted 21 December, 2019;
originally announced December 2019.
-
Exploring AI Futures Through Role Play
Authors:
Shahar Avin,
Ross Gruetzemacher,
James Fox
Abstract:
We present an innovative methodology for studying and teaching the impacts of AI through a role play game. The game serves two primary purposes: 1) training AI developers and AI policy professionals to reflect on and prepare for future social and ethical challenges related to AI and 2) exploring possible futures involving AI technology development, deployment, social impacts, and governance. While the game currently focuses on the interrelations between short-, mid- and long-term impacts of AI, it has potential to be adapted for a broad range of scenarios, exploring in greater depth issues of AI policy research and affording training within organizations. The game presented here has undergone two years of development and has been tested through over 30 events involving between 3 and 70 participants. The game is under active development, but preliminary findings suggest that role play is a promising methodology for both exploring AI futures and training individuals and organizations in thinking about, and reflecting on, the impacts of AI and strategic mistakes that can be avoided today.
Submitted 18 December, 2019;
originally announced December 2019.
-
Induced arithmetic removal: complexity 1 patterns over finite fields
Authors:
Jacob Fox,
Jonathan Tidor,
Yufei Zhao
Abstract:
We prove an arithmetic analog of the induced graph removal lemma for complexity 1 patterns over finite fields. Informally speaking, we show that given a fixed collection of $r$-colored complexity 1 arithmetic patterns over $\mathbb F_q$, every coloring $φ\colon \mathbb F_q^n \setminus\{0\} \to [r]$ with $o(1)$ density of every such pattern can be recolored on an $o(1)$-fraction of the space so that no such pattern remains.
Submitted 8 November, 2019;
originally announced November 2019.
-
Performance Impact of Memory Channels on Sparse and Irregular Algorithms
Authors:
Oded Green,
James Fox,
Jeffrey Young,
Jun Shirako,
David Bader
Abstract:
Graph processing is typically considered to be a memory-bound rather than compute-bound problem. One common line of thought is that more available memory bandwidth corresponds to better graph processing performance. However, in this work we demonstrate that the key factor in the utilization of the memory system for graph algorithms is not necessarily the raw bandwidth or even the latency of memory requests. Instead, we show that performance is proportional to the number of memory channels available to handle small data transfers with limited spatial locality.
Using several widely used graph frameworks, including Gunrock (on the GPU) and GAPBS & Ligra (for CPUs), we evaluate key graph analytics kernels using two unique memory hierarchies, DDR-based and HBM/MCDRAM. Our results show that the differences in the peak bandwidths of several Pascal-generation GPU memory subsystems aren't reflected in the performance of various analytics. Furthermore, our experiments on CPU and Xeon Phi systems demonstrate that the number of memory channels utilized can be a decisive factor in performance across several different applications. For CPU systems with smaller thread counts, the memory channels can be underutilized while systems with high thread counts can oversaturate the memory subsystem, which leads to limited performance. Finally, we model the potential performance improvements of adding more memory channels with narrower access widths than are found in current platforms, and we analyze performance trade-offs for the two most prominent types of memory accesses found in graph algorithms, streaming and random accesses.
Submitted 8 October, 2019;
originally announced October 2019.
-
On Ramsey numbers of hedgehogs
Authors:
Jacob Fox,
Ray Li
Abstract:
The hedgehog $H_t$ is a 3-uniform hypergraph on vertices $1,\dots,t+\binom{t}{2}$ such that, for any pair $(i,j)$ with $1\le i<j\le t$, there exists a unique vertex $k>t$ such that $\{i,j,k\}$ is an edge. Conlon, Fox, and Rödl proved that the two-color Ramsey number of the hedgehog grows polynomially in the number of its vertices, while the four-color Ramsey number grows exponentially in the number of its vertices. They asked whether the two-color Ramsey number of the hedgehog $H_t$ is nearly linear in the number of its vertices. We answer this question affirmatively, proving that $r(H_t) = O(t^2\ln t)$.
△ Less
Submitted 26 February, 2019;
originally announced February 2019.
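To make the hedgehog definition in the entry above concrete, here is a minimal Python sketch that lists the edges of $H_t$. The function name and the particular way the vertices $k>t$ are assigned to pairs are assumptions chosen for illustration; the definition only requires that each pair $(i,j)$ gets its own vertex $k$.

```python
from itertools import combinations


def hedgehog_edges(t):
    """Edges of the 3-uniform hedgehog H_t on vertices 1..t + C(t, 2).

    Each pair (i, j) with 1 <= i < j <= t is assigned its own vertex k > t.
    The order in which the k are handed out is an arbitrary choice.
    """
    edges = []
    next_vertex = t + 1
    for i, j in combinations(range(1, t + 1), 2):
        edges.append((i, j, next_vertex))
        next_vertex += 1
    return edges
```

For example, `hedgehog_edges(3)` gives the three edges `(1, 2, 4)`, `(1, 3, 5)` and `(2, 3, 6)` on six vertices.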
-
Periodicity in Movement Patterns Shapes Epidemic Risk in Urban Environments
Authors:
Zhanwei Du,
Spencer J Fox,
Petter Holme,
Jiming Liu,
Alison P. Galvani,
Lauren Ancel Meyers
Abstract:
Daily variation in human mobility modulates the speed and severity of emerging outbreaks, yet most epidemiological studies assume static contact patterns. With a highly mobile population exceeding 24 million people, Shanghai, China is a transportation hub at high risk for the importation and subsequent global propagation of infectious diseases. Here, we use a dynamic metapopulation model informed by hourly transit data for Shanghai to estimate epidemic risks across thousands of outbreak scenarios. We find that the rate of initial epidemic growth varies by more than twenty-fold, depending on the hour and neighborhood of disease introduction. The riskiest introductions are those occurring close to the city center and on Fridays--which bridge weekday and weekend transit patterns and thereby connect otherwise disconnected portions of the population. The identification of these spatio-temporal hotspots can inform more efficient targets for sentinel surveillance and strategies for mitigating transmission.
Submitted 13 September, 2018;
originally announced September 2018.
-
Towards the linear arboricity conjecture
Authors:
Asaf Ferber,
Jacob Fox,
Vishesh Jain
Abstract:
The linear arboricity of a graph $G$, denoted by $\text{la}(G)$, is the minimum number of edge-disjoint linear forests (i.e. forests in which every connected component is a path) in $G$ whose union covers all the edges of $G$. A famous conjecture due to Akiyama, Exoo, and Harary from 1980 asserts that $\text{la}(G)\leq \lceil (Δ(G)+1)/2 \rceil$, where $Δ(G)$ denotes the maximum degree of $G$. This conjectured upper bound would be best possible, as is easily seen by taking $G$ to be a regular graph. In this paper, we show that for every graph $G$, $\text{la}(G)\leq \frac{Δ}{2}+O(Δ^{2/3-α})$ for some $α> 0$, thereby improving the previously best known bound due to Alon and Spencer from 1992. For graphs which are sufficiently good spectral expanders, we give even better bounds. Our proofs of these results further give probabilistic polynomial time algorithms for finding such decompositions into linear forests.
Submitted 12 September, 2018;
originally announced September 2018.
-
A Completion of the Proof of the Edge-statistics Conjecture
Authors:
Jacob Fox,
Lisa Sauermann
Abstract:
For given integers $k$ and $\ell$ with $0<\ell< {k \choose 2}$, Alon, Hefetz, Krivelevich and Tyomkyn formulated the following conjecture: When sampling a $k$-vertex subset uniformly at random from a very large graph $G$, then the probability to have exactly $\ell$ edges within the sampled $k$-vertex subset is at most $e^{-1}+o_k(1)$. This conjecture was proved in the case $Ω(k)\leq \ell\leq {k \choose 2}-Ω(k)$ by Kwan, Sudakov and Tran. In this paper, we complete the proof of the conjecture by resolving the remaining cases. We furthermore give nearly tight upper bounds for the probability described above in the case $ω(1)\leq \ell\leq o(k)$. We also extend some of our results to hypergraphs with bounded edge size.
Submitted 25 February, 2020; v1 submitted 5 September, 2018;
originally announced September 2018.
-
Finding Cliques in Social Networks: A New Distribution-Free Model
Authors:
Jacob Fox,
Tim Roughgarden,
C. Seshadhri,
Fan Wei,
Nicole Wein
Abstract:
We propose a new distribution-free model of social networks. Our definitions are motivated by one of the most universal signatures of social networks, triadic closure---the property that pairs of vertices with common neighbors tend to be adjacent. Our most basic definition is that of a "$c$-closed" graph, where for every pair of vertices $u,v$ with at least $c$ common neighbors, $u$ and $v$ are adjacent. We study the classic problem of enumerating all maximal cliques, an important task in social network analysis. We prove that this problem is fixed-parameter tractable with respect to $c$ on $c$-closed graphs. Our results carry over to "weakly $c$-closed graphs", which only require a vertex deletion ordering that avoids pairs of non-adjacent vertices with $c$ common neighbors. Numerical experiments show that well-studied social networks tend to be weakly $c$-closed for modest values of $c$.
Submitted 19 April, 2018;
originally announced April 2018.
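The "$c$-closed" condition from the abstract above can be checked directly from its definition. The following is a brute-force sketch, not the paper's algorithm; the function name `is_c_closed` and the adjacency-dictionary representation are assumptions made for illustration.

```python
from itertools import combinations


def is_c_closed(adj, c):
    """Check the c-closed condition on an undirected simple graph.

    adj maps each vertex to the set of its neighbours. The graph is
    c-closed if every pair of vertices with at least c common
    neighbours is adjacent.
    """
    for u, v in combinations(adj, 2):
        if v not in adj[u] and len(adj[u] & adj[v]) >= c:
            return False
    return True


# A 4-cycle 0-1-2-3-0: opposite vertices share two neighbours
# but are not adjacent.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_c_closed(cycle4, 2), is_c_closed(cycle4, 3))
```

On the 4-cycle the check fails for $c=2$ and passes for $c=3$, since no non-adjacent pair has three common neighbours.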
-
A fast new algorithm for weak graph regularity
Authors:
Jacob Fox,
László Miklós Lovász,
Yufei Zhao
Abstract:
We provide a deterministic algorithm that finds, in $ε^{-O(1)} n^2$ time, an $ε$-regular Frieze-Kannan partition of a graph on $n$ vertices. The algorithm outputs an approximation of a given graph as a weighted sum of $ε^{-O(1)}$ many complete bipartite graphs.
As a corollary, we give a deterministic algorithm for estimating the number of copies of $H$ in an $n$-vertex graph $G$ up to an additive error of at most $εn^{v(H)}$, in time $ε^{-O_H(1)}n^2$.
Submitted 12 January, 2018;
originally announced January 2018.
-
Fast property testing and metrics for permutations
Authors:
Jacob Fox,
Fan Wei
Abstract:
The goal of property testing is to quickly distinguish between objects which satisfy a property and objects that are $ε$-far from satisfying the property. There are now several general results in this area which show that natural properties of combinatorial objects can be tested with "constant" query complexity, depending only on $ε$ and the property, and not on the size of the object being tested. The upper bounds on the query complexity coming from the proof techniques are often enormous and impractical. It remains a major open problem if better bounds hold.
Maybe surprisingly, for testing with respect to the rectangular distance, we prove there is a universal (not depending on the property), polynomial in $1/ε$ query complexity bound for two-sided testing hereditary properties of sufficiently large permutations. We further give a nearly linear bound with respect to a closely related metric which also depends on the smallest forbidden subpermutation for the property. Finally, we show that several different permutation metrics of interest are related to the rectangular distance, yielding similar results for testing with respect to these metrics.
Submitted 4 April, 2018; v1 submitted 4 November, 2016;
originally announced November 2016.
-
Approximating the rectilinear crossing number
Authors:
Jacob Fox,
Janos Pach,
Andrew Suk
Abstract:
A straight-line drawing of a graph $G$ is a mapping which assigns to each vertex a point in the plane and to each edge a straight-line segment connecting the corresponding two points. The rectilinear crossing number of a graph $G$, $\overline{cr}(G)$, is the minimum number of crossing edges in any straight-line drawing of $G$. Determining or estimating $\overline{cr}(G)$ appears to be a difficult problem, and deciding if $\overline{cr}(G)\leq k$ is known to be NP-hard. In fact, the asymptotic behavior of $\overline{cr}(K_n)$ is still unknown.
In this paper, we present a deterministic $n^{2+o(1)}$-time algorithm that finds a straight-line drawing of any $n$-vertex graph $G$ with $\overline{cr}(G) + o(n^4)$ crossing edges. Together with the well-known Crossing Lemma due to Ajtai et al. and Leighton, this result implies that for any dense $n$-vertex graph $G$, one can efficiently find a straight-line drawing of $G$ with $(1 + o(1))\overline{cr}(G)$ crossing edges.
Submitted 7 September, 2016; v1 submitted 12 June, 2016;
originally announced June 2016.
-
On the number of cliques in graphs with a forbidden minor
Authors:
Jacob Fox,
Fan Wei
Abstract:
Reed and Wood and independently Norine, Seymour, Thomas, and Wollan proved that for each positive integer $t$ there is a constant $c(t)$ such that every graph on $n$ vertices with no $K_t$-minor has at most $c(t)n$ cliques. Wood asked in 2007 if we can take $c(t) = c^t$ for some absolute constant $c$. This question was recently answered affirmatively by Lee and Oum. In this paper, we determine the exponential constant. We prove that every graph on $n$ vertices with no $K_t$-minor has at most $3^{2t/3+o(t)}n$ cliques. This bound is tight for $n \geq 4t/3$. More generally, let $H$ be a connected graph on $t$ vertices, and $x$ denote the size (i.e., the number of edges) of the largest matching in the complement of $H$. We prove that every graph on $n$ vertices with no $H$-minor has at most $\max(3^{2t/3-x/3+o(t)}n,2^{t+o(t)}n)$ cliques, and this bound is tight for $n \geq \max (4t/3-2x/3,t)$ by a simple construction. Even more generally, we determine explicitly the exponential constant for the maximum number of cliques an $n$-vertex graph can have in a minor-closed family of graphs which is closed under disjoint union.
Submitted 22 March, 2016;
originally announced March 2016.
-
Semi-algebraic colorings of complete graphs
Authors:
Jacob Fox,
Janos Pach,
Andrew Suk
Abstract:
We consider $m$-colorings of the edges of a complete graph, where each color class is defined semi-algebraically with bounded complexity. The case $m = 2$ was first studied by Alon et al., who applied this framework to obtain surprisingly strong Ramsey-type results for intersection graphs of geometric objects and for other graphs arising in computational geometry. Considering larger values of $m$ is relevant, e.g., to problems concerning the number of distinct distances determined by a point set.
For $p\ge 3$ and $m\ge 2$, the classical Ramsey number $R(p;m)$ is the smallest positive integer $n$ such that any $m$-coloring of the edges of $K_n$, the complete graph on $n$ vertices, contains a monochromatic $K_p$. It is a longstanding open problem that goes back to Schur (1916) to decide whether $R(p;m)=2^{O(m)}$, for a fixed $p$. We prove that this is true if each color class is defined semi-algebraically with bounded complexity. The order of magnitude of this bound is tight. Our proof is based on the Cutting Lemma of Chazelle et al., and on a Szemerédi-type regularity lemma for multicolored semi-algebraic graphs, which is of independent interest. The same technique is used to address the semi-algebraic variant of a more general Ramsey-type problem of Erdős and Shelah.
Submitted 5 December, 2018; v1 submitted 27 May, 2015;
originally announced May 2015.
-
A polynomial regularity lemma for semi-algebraic hypergraphs and its applications in geometry and property testing
Authors:
Jacob Fox,
Janos Pach,
Andrew Suk
Abstract:
Fox, Gromov, Lafforgue, Naor, and Pach proved a regularity lemma for semi-algebraic $k$-uniform hypergraphs of bounded complexity, showing that for each $ε>0$ the vertex set can be equitably partitioned into a bounded number of parts (in terms of $ε$ and the complexity) so that all but an $ε$-fraction of the $k$-tuples of parts are homogeneous. We prove that the number of parts can be taken to be polynomial in $1/ε$. Our improved regularity lemma can be applied to geometric problems and to the following general question on property testing: is it possible to decide, with query complexity polynomial in the reciprocal of the approximation parameter, whether a hypergraph has a given hereditary property? We give an affirmative answer for testing typical hereditary properties for semi-algebraic hypergraphs of bounded complexity.
Submitted 14 October, 2016; v1 submitted 5 February, 2015;
originally announced February 2015.
-
A tight lower bound for Szemerédi's regularity lemma
Authors:
Jacob Fox,
László Miklós Lovász
Abstract:
Addressing a question of Gowers, we determine the order of the tower height for the partition size in a version of Szemerédi's regularity lemma.
Submitted 7 March, 2014;
originally announced March 2014.
-
Distinct volume subsets
Authors:
David Conlon,
Jacob Fox,
William Gasarch,
David G. Harris,
Douglas Ulrich,
Samuel Zbarsky
Abstract:
Suppose that $a$ and $d$ are positive integers with $a \geq 2$. Let $h_{a,d}(n)$ be the largest integer $t$ such that any set of $n$ points in $\mathbb{R}^d$ contains a subset of $t$ points for which all the non-zero volumes of the ${t \choose a}$ subsets of order $a$ are distinct. Beginning with Erdős in 1957, the function $h_{2,d}(n)$ has been closely studied and is known to be at least a power of $n$. We improve the best known bound for $h_{2,d}(n)$ and show that $h_{a,d}(n)$ is at least a power of $n$ for all $a$ and $d$.
Submitted 10 May, 2015; v1 submitted 26 January, 2014;
originally announced January 2014.
-
Stanley-Wilf limits are typically exponential
Authors:
Jacob Fox
Abstract:
For a permutation $π$, let $S_{n}(π)$ be the number of permutations on $n$ letters avoiding $π$. Marcus and Tardos proved the celebrated Stanley-Wilf conjecture that $L(π)= \lim_{n \to \infty} S_n(π)^{1/n}$ exists and is finite. Backed by numerical evidence, it has been conjectured by many researchers over the years that $L(π)=Θ(k^2)$ for every permutation $π$ on $k$ letters. We disprove this conjecture, showing that $L(π)=2^{k^{Θ(1)}}$ for almost all permutations $π$ on $k$ letters.
Submitted 31 October, 2013;
originally announced October 2013.
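The quantity $S_n(π)$ from the abstract above can be computed by exhaustive search for small $n$. This is an illustrative brute-force sketch only, feasible for tiny $n$; the function names `contains_pattern` and `count_avoiders` are hypothetical and not from the paper.

```python
from itertools import combinations, permutations


def contains_pattern(perm, pattern):
    """Return True if perm has a subsequence order-isomorphic to pattern."""
    k = len(pattern)
    for idxs in combinations(range(len(perm)), k):
        vals = [perm[i] for i in idxs]
        # The subsequence matches if every pair compares the same
        # way as the corresponding pair of pattern entries.
        if all((vals[a] < vals[b]) == (pattern[a] < pattern[b])
               for a in range(k) for b in range(a + 1, k)):
            return True
    return False


def count_avoiders(n, pattern):
    """S_n(pattern): number of permutations of 1..n avoiding pattern."""
    return sum(1 for p in permutations(range(1, n + 1))
               if not contains_pattern(p, pattern))


print([count_avoiders(n, (1, 2, 3)) for n in range(1, 6)])
```

For patterns of length 3 these counts are the Catalan numbers 1, 2, 5, 14, 42, so $S_n(π)^{1/n}$ tends to 4 in that case; the abstract concerns how this limit grows with the pattern length $k$.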
-
Strong & Weak Methods: A Logical View of Uncertainty
Authors:
John Fox
Abstract:
The last few years have seen a growing debate about techniques for managing uncertainty in AI systems. Unfortunately this debate has been cast as a rivalry between AI methods and classical probability based ones. Three arguments for extending the probability framework of uncertainty are presented, none of which imply a challenge to classical methods. These are (1) explicit representation of several types of uncertainty, specifically possibility and plausibility, as well as probability, (2) the use of weak methods for uncertainty management in problems which are poorly defined, and (3) symbolic representation of different uncertainty calculi and methods for choosing between them.
Submitted 27 March, 2013;
originally announced April 2013.
-
Symbolic Decision Theory and Autonomous Systems
Authors:
John Fox,
Paul J. Krause
Abstract:
The ability to reason under uncertainty and with incomplete information is a fundamental requirement of decision support technology. In this paper we argue that the concentration on theoretical techniques for the evaluation and selection of decision options has distracted attention from many of the wider issues in decision making. Although numerical methods of reasoning under uncertainty have strong theoretical foundations, they are representationally weak and only deal with a small part of the decision process. Knowledge based systems, on the other hand, offer greater flexibility but have not been accompanied by a clear decision theory. We describe here work which is under way towards providing a theoretical framework for symbolic decision procedures. A central proposal is an extended form of inference which we call argumentation; reasoning for and against decision options from generalised domain theories. The approach has been successfully used in several decision support applications, but it is argued that a comprehensive decision theory must cover autonomous decision making, where the agent can formulate questions as well as take decisions. A major theoretical challenge for this theory is to capture the idea of reflection to permit decision agents to reason about their goals, what they believe and why, and what they need to know or do in order to achieve their goals.
Submitted 20 March, 2013;
originally announced March 2013.