
Characterising Interventions in Causal Games

Authors: Manuj Mishra, James Fox, Michael Wooldridge

Abstract: Causal games are probabilistic graphical models that enable causal queries to be answered in multi-agent settings. They extend causal Bayesian networks by specifying decision and utility variables to represent the agents' degrees of freedom and objectives. In multi-agent settings, whether each agent decides on their policy before or after knowing the causal intervention is important as this affects whether they can respond to the intervention by adapting their policy. Consequently, previous work in causal games imposed chronological constraints on permissible interventions. We relax this by outlining a sound and complete set of primitive causal interventions so the effect of any arbitrarily complex interventional query can be studied in multi-agent settings. We also demonstrate applications to the design of safe AI systems by considering causal mechanism design and commitment.

Submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted to the 40th Conference on Uncertainty in Artificial Intelligence (UAI-2024)

arXiv:2405.10442 [pdf, ps, other]

Data-driven low-dimensional model of a sedimenting flexible fiber

Authors: Andrew J Fox, Michael D. Graham

Abstract: The dynamics of flexible filaments entrained in flow, important for understanding many biological and industrial processes, are computationally expensive to model with full-physics simulations. This work describes a data-driven technique to create high-fidelity low-dimensional models of flexible fiber dynamics using machine learning; the technique is applied to sedimentation in a quiescent, viscous Newtonian fluid, using results from detailed simulations as the data set. The approach combines an autoencoder neural network architecture to learn a low-dimensional latent representation of the filament shape, with a neural ODE that learns the evolution of the particle in the latent state. The model was designed to model filaments of varying flexibility, characterized by an elasto-gravitational number $\mathcal{B}$, and was trained on a data set containing the evolution of fibers beginning at set angles of inclination. For the range of $\mathcal{B}$ considered here (100-10000), the filament shape dynamics can be represented with high accuracy with only four degrees of freedom, in contrast to the 93 present in the original bead-spring model used to generate the dynamic trajectories. We predict the evolution of fibers set at arbitrary angles and demonstrate that our data-driven model can accurately forecast the evolution of a fiber at both trained and untrained elasto-gravitational numbers.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2401.15119 [pdf, other]

Interpreting Time Series Transformer Models and Sensitivity Analysis of Population Age Groups to COVID-19 Infections

Authors: Md Khairul Islam, Tyler Valentine, Timothy Joowon Sue, Ayush Karmacharya, Luke Neil Benham, Zhengguang Wang, Kingsley Kim, Judy Fox

Abstract: Interpreting deep learning time series models is crucial in understanding the model's behavior and learning patterns from raw data for real-time decision-making. However, the complexity inherent in transformer-based time series models poses challenges in explaining the impact of individual features on predictions. In this study, we leverage recent local interpretation methods to interpret state-of-the-art time series models. To use real-world datasets, we collected three years of daily case data for 3,142 US counties. Firstly, we compare six transformer-based models and choose the best prediction model for COVID-19 infection. Using 13 input features from the last two weeks, we can predict the cases for the next two weeks. Secondly, we present an innovative way to evaluate the prediction sensitivity to 8 population age groups over highly dynamic multivariate infection data. Thirdly, we compare our proposed perturbation-based interpretation method with related work, including a total of eight local interpretation methods. Finally, we apply our framework to traffic and electricity datasets, demonstrating that our approach is generic and can be applied to other time-series domains.

Submitted 25 January, 2024; originally announced January 2024.

arXiv:2401.00827 [pdf, ps, other]

A multipartite analogue of Dilworth's Theorem

Authors: Jacob Fox, Huy Tuan Pham

Abstract: We prove that every partially ordered set on $n$ elements contains $k$ subsets $A_{1},A_{2},\dots,A_{k}$ such that either each of these subsets has size $\Omega(n/k^{5})$ and, for every $i<j$, every element in $A_{i}$ is less than or equal to every element in $A_{j}$, or each of these subsets has size $\Omega(n/(k^{2}\log n))$ and, for every $i \ne j$, every element in $A_{i}$ is incomparable with every element in $A_{j}$. This answers a question of the first author from 2006. As a corollary, we prove for each positive integer $h$ there is $C_h$ such that for any $h$ partial orders $<_{1},<_{2},\dots,<_{h}$ on a set of $n$ elements, there exist $k$ subsets $A_{1},A_{2},\dots,A_{k}$ each of size at least $n/(k\log n)^{C_{h}}$ such that for each partial order $<_{\ell}$, either $a_{1}<_{\ell}a_{2}<_{\ell}\dots<_{\ell}a_{k}$ for any tuple of elements $(a_1,a_2,\dots,a_k) \in A_1\times A_2\times \dots \times A_k$, or $a_{1}>_{\ell}a_{2}>_{\ell}\dots>_{\ell}a_{k}$ for any $(a_1,a_2,\dots,a_k) \in A_1\times A_2\times \dots \times A_k$, or $a_i$ is incomparable with $a_j$ for any $i\ne j$, $a_i\in A_i$ and $a_j\in A_j$. This improves on a 2009 result of Pach and the first author motivated by problems in discrete geometry.

Submitted 1 January, 2024; originally announced January 2024.

arXiv:2312.01028 [pdf, ps, other]

A structure theorem for pseudo-segments and its applications

Authors: Jacob Fox, Janos Pach, Andrew Suk

Abstract: We prove a far-reaching strengthening of Szemerédi's regularity lemma for intersection graphs of pseudo-segments. It shows that the vertex set of such a graph can be partitioned into a bounded number of parts of roughly the same size such that almost all bipartite graphs between different pairs of parts are complete or empty. We use this to get an improved bound on disjoint edges in simple topological graphs, showing that every $n$-vertex simple topological graph with no $k$ pairwise disjoint edges has at most $n(\log n)^{O(\log k)}$ edges.

Submitted 1 December, 2023; originally announced December 2023.

arXiv:2310.14455 [pdf]

An International Consortium for Evaluations of Societal-Scale Risks from Advanced AI

Authors: Ross Gruetzemacher, Alan Chan, Kevin Frazier, Christy Manning, Štěpán Los, James Fox, José Hernández-Orallo, John Burden, Matija Franklin, Clíodhna Ní Ghuidhir, Mark Bailey, Daniel Eth, Toby Pilditch, Kyle Kilian

Abstract: Given rapid progress toward advanced AI and risks from frontier AI systems (advanced AI systems pushing the boundaries of the AI capabilities frontier), the creation and implementation of AI governance and regulatory schemes deserves prioritization and substantial investment. However, the status quo is untenable and, frankly, dangerous. A regulatory gap has permitted AI labs to conduct research, development, and deployment activities with minimal oversight. In response, frontier AI system evaluations have been proposed as a way of assessing risks from the development and deployment of frontier AI systems. Yet, the budding AI risk evaluation ecosystem faces significant coordination challenges, such as a limited diversity of evaluators, suboptimal allocation of effort, and perverse incentives. This paper proposes a solution in the form of an international consortium for AI risk evaluations, comprising both AI developers and third-party AI risk evaluators. Such a consortium could play a critical role in international efforts to mitigate societal-scale risks from advanced AI, including in managing responsible scaling policies and coordinated evaluation-based risk response.
In this paper, we discuss the current evaluation ecosystem and its shortcomings, propose an international consortium for advanced AI risk evaluations, discuss issues regarding its implementation, discuss lessons that can be learnt from previous international institutions and existing proposals for international AI governance institutions, and, finally, we recommend concrete steps to advance the establishment of the proposed consortium: (i) solicit feedback from stakeholders, (ii) conduct additional research, (iii) conduct a workshop(s) for stakeholders, (iv) analyze feedback and create final proposal, (v) solicit funding, and (vi) create a consortium.

Submitted 6 November, 2023; v1 submitted 22 October, 2023; originally announced October 2023.

Comments: 50 pages, 2 figures; updated w/ a few minor revisions based on feedback from SoLaR Workshop reviewers (on 5 page version)

arXiv:2309.15013 [pdf, other]

Updated Corpora and Benchmarks for Long-Form Speech Recognition

Authors: Jennifer Drexler Fox, Desh Raj, Natalie Delworth, Quinn McNamara, Corey Miller, Migüel Jetté

Abstract: The vast majority of ASR research uses corpora in which both the training and test data have been pre-segmented into utterances. In most real-world ASR use-cases, however, test audio is not segmented, leading to a mismatch between inference-time conditions and models trained on segmented utterances. In this paper, we re-release three standard ASR corpora - TED-LIUM 3, GigaSpeech, and VoxPopuli-en - with updated transcription and alignments to enable their use for long-form ASR research. We use these reconstituted corpora to study the train-test mismatch problem for transducers and attention-based encoder-decoders (AEDs), confirming that AEDs are more susceptible to this issue. Finally, we benchmark a simple long-form training for these models, showing its efficacy for model robustness under this domain shift.

Submitted 26 September, 2023; originally announced September 2023.

Comments: Submitted to ICASSP 2024

arXiv:2308.07805 [pdf, other]

Fairness and Privacy in Federated Learning and Their Implications in Healthcare

Authors: Navya Annapareddy, Jade Preston, Judy Fox

Abstract: Currently, many contexts exist where distributed learning is difficult or otherwise constrained by security and communication limitations. One common domain where this is a consideration is healthcare, where data is often governed by data-use-ordinances like HIPAA. On the other hand, larger sample sizes and shared data models are necessary to allow models to better generalize on account of the potential for more variability and balancing underrepresented classes. Federated learning is a type of distributed learning model that allows data to be trained in a decentralized manner. This, in turn, addresses data security, privacy, and vulnerability considerations as data itself is not shared across a given learning network's nodes. Three main challenges to federated learning are that node data is not independent and identically distributed (iid), that clients require high levels of communication overhead between peers, and that clients within a network are heterogeneous with respect to dataset bias and size. As the field has grown, the notion of fairness in federated learning has also been introduced through novel implementations. Fairness approaches differ from the standard form of federated learning and also have distinct challenges and considerations for the healthcare domain. This paper endeavors to outline the typical lifecycle of fair federated learning in research as well as provide an updated taxonomy to account for the current state of fairness in implementations.
Lastly, this paper provides added insight into the implications and challenges of implementing and supporting fairness in federated learning in the healthcare domain.

Submitted 15 August, 2023; originally announced August 2023.

arXiv:2308.01529 [pdf, other]

Towards Fair and Privacy Preserving Federated Learning for the Healthcare Domain

Authors: Navya Annapareddy, Yingzheng Liu, Judy Fox

Abstract: Federated learning (FL) enables data sharing in healthcare contexts where it might otherwise be difficult due to data-use-ordinances or security and communication constraints. Distributed and shared data models allow models to become generalizable and learn from heterogeneous clients. While addressing data security, privacy, and vulnerability considerations, data itself is not shared across nodes in a given learning network. On the other hand, FL models often struggle with variable client data distributions and operate on an assumption of independent and identically distributed data. As the field has grown, the notion of fairness-aware federated learning (FAFL) mechanisms has also been introduced and is of distinct significance to the healthcare domain where many sensitive groups and protected classes exist. In this paper, we create a benchmark methodology for FAFL mechanisms under various heterogeneous conditions on datasets in the healthcare domain typically outside the scope of current federated learning benchmarks, such as medical imaging and waveform data formats. Our results indicate considerable variation in how various FAFL schemes respond to high levels of data heterogeneity. Additionally, doing so under privacy-preserving conditions can create significant increases in network communication cost and latency compared to the typical federated learning scheme.

Submitted 3 August, 2023; originally announced August 2023.

arXiv:2307.05059 [pdf, ps, other]

doi 10.4204/EPTCS.379.17

On Imperfect Recall in Multi-Agent Influence Diagrams

Authors: James Fox, Matt MacDermott, Lewis Hammond, Paul Harrenstein, Alessandro Abate, Michael Wooldridge

Abstract: Multi-agent influence diagrams (MAIDs) are a popular game-theoretic model based on Bayesian networks. In some settings, MAIDs offer significant advantages over extensive-form game representations. Previous work on MAIDs has assumed that agents employ behavioural policies, which set independent conditional probability distributions over actions for each of their decisions. In settings with imperfect recall, however, a Nash equilibrium in behavioural policies may not exist. We overcome this by showing how to solve MAIDs with forgetful and absent-minded agents using mixed policies and two types of correlated equilibrium. We also analyse the computational complexity of key decision problems in MAIDs, and explore tractable cases. Finally, we describe applications of MAIDs to Markov games and team situations, where imperfect recall is often unavoidable.

Submitted 11 July, 2023; originally announced July 2023.

Comments: In Proceedings TARK 2023, arXiv:2307.04005

Journal ref: EPTCS 379, 2023, pp. 201-220

arXiv:2307.00751 [pdf, other]

Population Age Group Sensitivity for COVID-19 Infections with Deep Learning

Authors: Md Khairul Islam, Tyler Valentine, Royal Wang, Levi Davis, Matt Manner, Judy Fox

Abstract: The COVID-19 pandemic has created unprecedented challenges for governments and healthcare systems worldwide, highlighting the critical importance of understanding the factors that contribute to virus transmission. This study aimed to identify the most influential age groups in COVID-19 infection rates at the US county level using the Modified Morris Method and deep learning for time series. Our approach involved training the state-of-the-art time-series model Temporal Fusion Transformer on different age groups as a static feature and the population vaccination status as the dynamic feature. We analyzed the impact of those age groups on COVID-19 infection rates by perturbing individual input features and ranked them based on their Morris sensitivity scores, which quantify their contribution to COVID-19 transmission rates. The findings are verified using ground truth data from the CDC and US Census, which provide the true infection rates for each age group. The results suggest that young adults were the most influential age group in COVID-19 transmission at the county level between March 1, 2020, and November 27, 2021. These results can inform public health policies and interventions, such as targeted vaccination strategies, to better control the spread of the virus. Our approach demonstrates the utility of feature sensitivity analysis in identifying critical factors contributing to COVID-19 transmission and can be applied in other public health domains.

Submitted 3 July, 2023; originally announced July 2023.

arXiv:2306.04025 [pdf, other]

Designing explainable artificial intelligence with active inference: A framework for transparent introspection and decision-making

Authors: Mahault Albarracin, Inês Hipólito, Safae Essafi Tremblay, Jason G. Fox, Gabriel René, Karl Friston, Maxwell J. D. Ramstead

Abstract: This paper investigates the prospect of developing human-interpretable, explainable artificial intelligence (AI) systems based on active inference and the free energy principle. We first provide a brief overview of active inference, and in particular, of how it applies to the modeling of decision-making, introspection, as well as the generation of overt and covert actions. We then discuss how active inference can be leveraged to design explainable AI systems, namely, by allowing us to model core features of "introspective" processes and by generating useful, human-interpretable models of the processes involved in decision-making. We propose an architecture for explainable AI systems using active inference. This architecture foregrounds the role of an explicit hierarchical generative model, the operation of which enables the AI system to track and explain the factors that contribute to its own decisions, and whose structure is designed to be interpretable and auditable by human users. We outline how this architecture can integrate diverse sources of information to make informed decisions in an auditable manner, mimicking or reproducing aspects of human-like consciousness and introspection. Finally, we discuss the implications of our findings for future research in AI, and the potential ethical considerations of developing AI systems with (the appearance of) introspective capabilities.

Submitted 6 June, 2023; originally announced June 2023.

arXiv:2305.14132 [pdf, ps, other]

Set-coloring Ramsey numbers and error-correcting codes near the zero-rate threshold

Authors: David Conlon, Jacob Fox, Huy Tuan Pham, Yufei Zhao

Abstract: For positive integers $n,r,s$ with $r > s$, the set-coloring Ramsey number $R(n;r,s)$ is the minimum $N$ such that if every edge of the complete graph $K_N$ receives a set of $s$ colors from a palette of $r$ colors, then there is a subset of $n$ vertices where all of the edges between them receive a common color. If $n$ is fixed and $\frac{s}{r}$ is less than and bounded away from $1-\frac{1}{n-1}$, then $R(n;r,s)$ is known to grow exponentially in $r$, while if $\frac{s}{r}$ is greater than and bounded away from $1-\frac{1}{n-1}$, then $R(n;r,s)$ is bounded. Here we prove bounds for $R(n;r,s)$ in the intermediate range where $\frac{s}{r}$ is close to $1 - \frac{1}{n-1}$ by establishing a connection to the maximum size of error-correcting codes near the zero-rate threshold.

Submitted 14 August, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

arXiv:2305.01090 [pdf, ps, other]

Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems

Authors: Kevin Zeng, Carlos E. Pérez De Jesús, Andrew J. Fox, Michael D. Graham

Abstract: While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and $L_2$ regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal manifold coordinate system, and provide the mapping functions between the ambient space and manifold space, allowing for out-of-sample projections. We validate our framework's ability to estimate the manifold dimension for a series of datasets from dynamical systems of varying complexities and compare to other state-of-the-art estimators. We analyze the training dynamics of the network to glean insight into the mechanism of low-rank learning and find that collectively each of the implicit regularizing layers compounds the low-rank representation and even self-corrects during training. Analysis of gradient descent dynamics for this architecture in the linear case reveals the role of the internal linear layers in leading to faster decay of a "collective weight variable" incorporating all layers, and the role of weight decay in breaking degeneracies and thus driving convergence along directions in which no decay would occur in its absence.
We show that this framework can be naturally extended for applications of state-space modeling and forecasting by generating a data-driven dynamic model of a spatiotemporally chaotic partial differential equation using only the manifold coordinates. Finally, we demonstrate that our framework is robust to hyperparameter choices.

Submitted 6 December, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

arXiv:2302.01923 [pdf, other]

Real-Time Traffic End-of-Queue Detection and Tracking in UAV Video

Authors: Russ Messenger, Md Zobaer Islam, Matthew Whitlock, Erik Spong, Nate Morton, Layne Claggett, Chris Matthews, Jordan Fox, Leland Palmer, Dane C. Johnson, John F. O'Hara, Christopher J. Crick, Jamey D. Jacob, Sabit Ekin

Abstract: Highway work zones are susceptible to undue accumulation of motorized vehicles which calls for dynamic work zone warning signs to prevent accidents. The work zone signs are placed according to the location of the end-of-queue of vehicles which usually changes rapidly. The detection of moving objects in video captured by Unmanned Aerial Vehicles (UAV) has been extensively researched so far, and is used in a wide array of applications including traffic monitoring. Unlike the fixed traffic cameras, UAVs can be used to monitor the traffic at work zones in real-time and also in a more cost-effective way. This study presents a method as a proof of concept for detecting End-of-Queue (EOQ) of traffic by processing the real-time video footage of a highway work zone captured by UAV. EOQ is detected in the video by image processing which includes background subtraction and blob detection methods. This dynamic localization of EOQ of vehicles will enable faster and more accurate relocation of work zone warning signs for drivers and thus will reduce work zone fatalities. The method can be applied to detect EOQ of vehicles and notify drivers in any other roads or intersections too where vehicles are rapidly accumulating due to special events, traffic jams, construction, or accidents.

Submitted 31 October, 2023; v1 submitted 9 January, 2023; originally announced February 2023.

Comments: 13 pages, 7 figures excluding photos of authors, Published in International Journal of Intelligent Transportation Systems Research. Link to the published version: https://link.springer.com/article/10.1007/s13177-023-00374-0

arXiv:2301.02324 [pdf, other]

doi 10.1016/j.artint.2023.103919

Reasoning about Causality in Games

Authors: Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, Michael Wooldridge

Abstract: Causal reasoning and game-theoretic reasoning are fundamental topics in artificial intelligence, among many other disciplines: this paper is concerned with their intersection. Despite their importance, a formal framework that supports both these forms of reasoning has, until now, been lacking. We offer a solution in the form of (structural) causal games, which can be seen as extending Pearl's causal hierarchy to the game-theoretic domain, or as extending Koller and Milch's multi-agent influence diagrams to the causal domain. We then consider three key questions: i) How can the (causal) dependencies in games - either between variables, or between strategies - be modelled in a uniform, principled manner? ii) How may causal queries be computed in causal games, and what assumptions does this require? iii) How do causal games compare to existing formalisms? To address question i), we introduce mechanised games, which encode dependencies between agents' decision rules and the distributions governing the game. In response to question ii), we present definitions of predictions, interventions, and counterfactuals, and discuss the assumptions required for each. Regarding question iii), we describe correspondences between causal games and other formalisms, and explain how causal games can be used to answer queries that other causal or game-theoretic models do not support. Finally, we highlight possible applications of causal games, aided by an extensive open-source Python library.

Submitted 17 April, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

Comments: Published in Artificial Intelligence (2023)

arXiv:2212.01354 [pdf, other]

doi 10.1177/26339137231222481

Designing Ecosystems of Intelligence from First Principles

Authors: Karl J Friston, Maxwell J D Ramstead, Alex B Kiefer, Alexander Tschantz, Christopher L Buckley, Mahault Albarracin, Riddhi J Pitliya, Conor Heins, Brennan Klein, Beren Millidge, Dalton A R Sakthivadivel, Toby St Clere Smithe, Magnus Koudahl, Safae Essafi Tremblay, Capm Petersen, Kaiser Fung, Jason G Fox, Steven Swanson, Dan Mapes, Gabriel René

Abstract: This white paper lays out a vision of research and development in the field of artificial intelligence for the next decade (and beyond). Its denouement is a cyber-physical ecosystem of natural and synthetic sense-making, in which humans are integral participants -- what we call ''shared intelligence''. This vision is premised on active inference, a formulation of adaptive behavior that can be read as a physics of intelligence, and which inherits from the physics of self-organization. In this context, we understand intelligence as the capacity to accumulate evidence for a generative model of one's sensed world -- also known as self-evidencing. Formally, this corresponds to maximizing (Bayesian) model evidence, via belief updating over several scales: i.e., inference, learning, and model selection. Operationally, this self-evidencing can be realized via (variational) message passing or belief propagation on a factor graph. Crucially, active inference foregrounds an existential imperative of intelligent systems; namely, curiosity or the resolution of uncertainty. This same imperative underwrites belief sharing in ensembles of agents, in which certain aspects (i.e., factors) of each agent's generative world model provide a common ground or frame of reference. Active inference plays a foundational role in this ecology of belief sharing -- leading to a formal account of collective intelligence that rests on shared narratives and goals. We also consider the kinds of communication protocols that must be developed to enable such an ecosystem of intelligences and motivate the development of a shared hyper-spatial modeling language and transaction protocol, as a first -- and key -- step towards such an ecology.

Submitted 11 January, 2024; v1 submitted 2 December, 2022; originally announced December 2022.

Comments: 23+18 pages, one figure, one six page appendix

Journal ref: Collective Intelligence, 3(1), 2024

arXiv:2210.03258 [pdf, other]

Interpreting County Level COVID-19 Infection and Feature Sensitivity using Deep Learning Time Series Models

Authors: Md Khairul Islam, Di Zhu, Yingzheng Liu, Andrej Erkelens, Nick Daniello, Judy Fox

Abstract: Interpretable machine learning plays a key role in healthcare because it is challenging in understanding feature importance in deep learning model predictions. We propose a novel framework that uses deep learning to study feature sensitivity for model predictions. This work combines sensitivity analysis with heterogeneous time-series deep learning model prediction, which corresponds to the interpretations of spatio-temporal features. We forecast county-level COVID-19 infection using the Temporal Fusion Transformer. We then use the sensitivity analysis extending Morris Method to see how sensitive the outputs are with respect to perturbation to our static and dynamic input features. The significance of the work is grounded in a real-world COVID-19 infection prediction with highly non-stationary, finely granular, and heterogeneous data. 1) Our model can capture the detailed daily changes of temporal and spatial model behaviors and achieves high prediction performance compared to a PyTorch baseline. 2) By analyzing the Morris sensitivity indices and attention patterns, we decipher the meaning of feature importance with observational population and dynamic model changes. 3) We have collected 2.5 years of socioeconomic and health features over 3142 US counties, such as observed cases and deaths, and a number of static (age distribution, health disparity, and industry) and dynamic features (vaccination, disease spread, transmissible cases, and social distancing). Using the proposed framework, we conduct extensive experiments and show our model can learn complex interactions and perform predictions for daily infection at the county level. Being able to model the disease infection with a hybrid prediction and description accuracy measurement with Morris index at the county level is a central idea that sheds light on individual feature interpretation via sensitivity analysis.

Submitted 6 October, 2022; originally announced October 2022.

arXiv:2209.01250 [pdf, other]

Improving Contextual Recognition of Rare Words with an Alternate Spelling Prediction Model

Authors: Jennifer Drexler Fox, Natalie Delworth

Abstract: Contextual ASR, which takes a list of bias terms as input along with audio, has drawn recent interest as ASR use becomes more widespread. We are releasing contextual biasing lists to accompany the Earnings21 dataset, creating a public benchmark for this task. We present baseline results on this benchmark using a pretrained end-to-end ASR model from the WeNet toolkit. We show results for shallow fusion contextual biasing applied to two different decoding algorithms. Our baseline results confirm observations that end-to-end models struggle in particular with words that are rarely or never seen during training, and that existing shallow fusion techniques do not adequately address this problem. We propose an alternate spelling prediction model that improves recall of rare words by 34.7% relative and of out-of-vocabulary words by 97.2% relative, compared to contextual biasing without alternate spellings. This model is conceptually similar to ones used in prior work, but is simpler to implement as it does not rely on either a pronunciation dictionary or an existing text-to-speech system.

Submitted 2 September, 2022; originally announced September 2022.

arXiv:2208.11838 [pdf, other]

Learning Task Automata for Reinforcement Learning using Hidden Markov Models

Authors: Alessandro Abate, Yousif Almulla, James Fox, David Hyland, Michael Wooldridge

Abstract: Training reinforcement learning (RL) agents using scalar reward signals is often infeasible when an environment has sparse and non-Markovian rewards. Moreover, handcrafting these reward functions before training is prone to misspecification, especially when the environment's dynamics are only partially known. This paper proposes a novel pipeline for learning non-Markovian task specifications as succinct finite-state `task automata' from episodes of agent experience within unknown environments. We leverage two key algorithmic insights. First, we learn a product MDP, a model composed of the specification's automaton and the environment's MDP (both initially unknown), by treating the product MDP as a partially observable MDP and using the well-known Baum-Welch algorithm for learning hidden Markov models. Second, we propose a novel method for distilling the task automaton (assumed to be a deterministic finite automaton) from the learnt product MDP. Our learnt task automaton enables the decomposition of a task into its constituent sub-tasks, which improves the rate at which an RL agent can later synthesise an optimal policy. It also provides an interpretable encoding of high-level environmental and task features, so a human can readily verify that the agent has learnt coherent tasks with no misspecifications. In addition, we take steps towards ensuring that the learnt automaton is environment-agnostic, making it well-suited for use in transfer learning. Finally, we provide experimental results compared with two baselines to illustrate our algorithm's performance in different environments and tasks.

Submitted 3 October, 2023; v1 submitted 24 August, 2022; originally announced August 2022.

Comments: 14 pages, 7 figures, Accepted to the 26th European Conference on Artificial Intelligence (ECAI 2023)

arXiv:2204.01938 [pdf, ps, other]

Extremal results on feedback arc sets in digraphs

Authors: Jacob Fox, Zoe Himwich, Nitya Mani

Abstract: A directed graph is oriented if it can be obtained by orienting the edges of a simple, undirected graph. For an oriented graph $G$, let $β(G)$ denote the size of a minimum feedback arc set, a smallest subset of edges whose deletion leaves an acyclic subgraph. A simple consequence of a result of Berger and Shor is that any oriented graph $G$ with $m$ edges satisfies $β(G) = m/2 - Ω(m^{3/4})$. We observe that if an oriented graph $G$ has a fixed forbidden subgraph $B$, the upper bound of $β(G) = m/2 - Ω(m^{3/4})$ is best possible as a function of the number of edges if $B$ is not bipartite, but the exponent $3/4$ in the lower order term can be improved if $B$ is bipartite. We also show that for every rational number $r$ between $3/4$ and $1$, there is a finite collection of digraphs $\mathcal{B}$ such that every $\mathcal{B}$-free digraph $G$ with $m$ edges satisfies $β(G) = m/2 - Ω(m^r)$, and this bound is best possible up to the implied constant factor. The proof uses a connection to Turán numbers and a result of Bukh and Conlon. Both of our upper bounds come equipped with randomized linear-time algorithms that construct feedback arc sets achieving those bounds. Finally, we give a characterization of quasirandom directed graphs via minimum feedback arc sets.

Submitted 19 April, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

Comments: 23 pages

MSC Class: 05D40
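
The $m/2$ baseline in the abstract above comes from a standard observation: for any vertex ordering, either the edges pointing backward or the edges pointing forward form a feedback arc set, and the smaller of the two has at most $m/2$ edges. The sketch below illustrates only this trivial bound, not the paper's randomized algorithms; the function names `feedback_arc_set_from_order` and `is_acyclic` are hypothetical and chosen here for illustration.

```python
from collections import defaultdict, deque


def feedback_arc_set_from_order(edges, order):
    """Return a feedback arc set of size at most len(edges) / 2.

    `edges` is a list of directed pairs (u, v); `order` lists every vertex once.
    Deleting all backward edges leaves a graph consistent with `order`;
    deleting all forward edges leaves one consistent with the reversed order.
    """
    pos = {v: i for i, v in enumerate(order)}
    backward = [(u, v) for u, v in edges if pos[u] > pos[v]]
    forward = [(u, v) for u, v in edges if pos[u] < pos[v]]
    return backward if len(backward) <= len(forward) else forward


def is_acyclic(vertices, edges):
    """Check acyclicity with Kahn's algorithm (repeatedly remove sources)."""
    indegree = {v: 0 for v in vertices}
    successors = defaultdict(list)
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    queue = deque(v for v in vertices if indegree[v] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for w in successors[u]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    return removed == len(vertices)
```

For a directed triangle with one extra outgoing edge, `feedback_arc_set_from_order([(0, 1), (1, 2), (2, 0), (0, 3)], [0, 1, 2, 3])` picks the single backward edge, and removing it leaves an acyclic graph.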

arXiv:2202.08883 [pdf, other]

Curriculum optimization for low-resource speech recognition

Authors: Anastasia Kuznetsova, Anurag Kumar, Jennifer Drexler Fox, Francis Tyers

Abstract: Modern end-to-end speech recognition models show astonishing results in transcribing audio signals into written text. However, conventional data feeding pipelines may be sub-optimal for low-resource speech recognition, which still remains a challenging task. We propose an automated curriculum learning approach to optimize the sequence of training examples based on both the progress of the model while training and prior knowledge about the difficulty of the training examples. We introduce a new difficulty measure called compression ratio that can be used as a scoring function for raw audio in various noise conditions. The proposed method improves speech recognition Word Error Rate performance by up to 33% relative over the baseline system.

Submitted 17 February, 2022; originally announced February 2022.

arXiv:2112.02378 [pdf, ps, other]

Quasiplanar Graphs, String Graphs, and the Erdos-Gallai Problem

Authors: Jacob Fox, Janos Pach, Andrew Suk

Abstract: An $r$-quasiplanar graph is a graph drawn in the plane with no $r$ pairwise crossing edges. Let $s \geq 3$ be an integer and $r=2^s$. We prove that there is a constant $C$ such that every $r$-quasiplanar graph with $n \geq r$ vertices has at most $n\left(Cs^{-1}\log n\right)^{2s-4}$ edges. A graph whose vertices are continuous curves in the plane, two being connected by an edge if and only if they intersect, is called a string graph. We show that for every $ε>0$, there exists $δ>0$ such that every string graph with $n$ vertices, whose chromatic number is at least $n^ε$ contains a clique of size at least $n^δ$. A clique of this size or a coloring using fewer than $n^ε$ colors can be found by a polynomial time algorithm in terms of the size of the geometric representation of the set of strings. In the process, we use, generalize, and strengthen previous results of Lee, Tomon, and others. All of our theorems are related to geometric variants of the following classical graph-theoretic problem of Erdos, Gallai, and Rogers. Given a $K_r$-free graph on $n$ vertices and an integer $s<r$, at least how many vertices can we find such that the subgraph induced by them is $K_s$-free?

Submitted 25 October, 2022; v1 submitted 4 December, 2021; originally announced December 2021.

Comments: Appears in the Proceedings of the 30th International Symposium on Graph Drawing and Network Visualization (GD 2022)

arXiv:2109.11021 [pdf, other]

Intel Optane DCPMM and Serverless Computing

Authors: Ahmet Uyar, Selahattin Akkas, Jiayu Li, Judy Fox

Abstract: This report describes 1) how we use Intel's Optane DCPMM in the memory Mode. We investigate the scalability of applications on a single Optane machine, using Subgraph counting as memory-intensive graph problem. We test with various input graph and subtemplate sizes to determine its performance for different memory and CPU loads, as well as a comparison of performance on a single node Optane with a distributed set of nodes in a cluster using MPI. 2) We investigate the end-to-end execution delays in serverless computing and study concurrent function executions with cold starts. In future work, we will show that persistent memory machines may significantly improve concurrent function invocations in serverless computing including Amazon Lambda, Microsoft Azure Functions, Google Cloud Functions and IBM Cloud Functions (Apache OpenWhisk).

Submitted 22 September, 2021; originally announced September 2021.

Comments: 13pages

arXiv:2108.11290 [pdf, ps, other]

On the number of edges of separated multigraphs

Authors: Jacob Fox, Janos Pach, Andrew Suk

Abstract: We prove that the number of edges of a multigraph $G$ with $n$ vertices is at most $O(n^2\log n)$, provided that any two edges cross at most once, parallel edges are noncrossing, and the lens enclosed by every pair of parallel edges in $G$ contains at least one vertex. As a consequence, we prove the following extension of the Crossing Lemma of Ajtai, Chvátal, Newborn, Szemerédi and Leighton, if $G$ has $e \geq 4n$ edges, in any drawing of $G$ with the above property, the number of crossings is $Ω\left(\frac{e^3}{n^2\log(e/n)}\right)$. This answers a question of Kaufmann et al. and is tight up to the logarithmic factor.

Submitted 22 February, 2022; v1 submitted 25 August, 2021; originally announced August 2021.

Comments: Appears in the Proceedings of the 29th International Symposium on Graph Drawing and Network Visualization (GD 2021)

arXiv:2107.12423 [pdf, other]

HySec-Flow: Privacy-Preserving Genomic Computing with SGX-based Big-Data Analytics Framework

Authors: Chathura Widanage, Weijie Liu, Jiayu Li, Hongbo Chen, XiaoFeng Wang, Haixu Tang, Judy Fox

Abstract: Trusted execution environments (TEE) such as Intel's Software Guard Extension (SGX) have been widely studied to boost security and privacy protection for the computation of sensitive data such as human genomics. However, a performance hurdle is often generated by SGX, especially from the small enclave memory. In this paper, we propose a new Hybrid Secured Flow framework (called "HySec-Flow") for large-scale genomic data analysis using SGX platforms. Here, the data-intensive computing tasks can be partitioned into independent subtasks to be deployed into distinct secured and non-secured containers, therefore allowing for parallel execution while alleviating the limited size of Page Cache (EPC) memory in each enclave. We illustrate our contributions using a workflow supporting indexing, alignment, dispatching, and merging the execution of SGX-enabled containers. We provide details regarding the architecture of the trusted and untrusted components and the underlying Scorn and Graphene support as generic shielding execution frameworks to port legacy code. We thoroughly evaluate the performance of our privacy-preserving reads mapping algorithm using real human genome sequencing data. The results demonstrate that the performance is enhanced by partitioning the time-consuming genomic computation into subtasks compared to the conventional execution of the data-intensive reads mapping algorithm in an enclave. The proposed HySec-Flow framework is made available as an open-source and adapted to the data-parallel computation of other large-scale genomic tasks requiring security and scalable computational resources.

Submitted 26 July, 2021; originally announced July 2021.

arXiv:2103.10484 [pdf, other]

Concentric Spherical GNN for 3D Representation Learning

Authors: James Fox, Bo Zhao, Sivasankaran Rajamanickam, Rampi Ramprasad, Le Song

Abstract: Learning 3D representations that generalize well to arbitrarily oriented inputs is a challenge of practical importance in applications varying from computer vision to physics and chemistry. We propose a novel multi-resolution convolutional architecture for learning over concentric spherical feature maps, of which the single sphere representation is a special case. Our hierarchical architecture is based on alternatively learning to incorporate both intra-sphere and inter-sphere information. We show the applicability of our method for two different types of 3D inputs, mesh objects, which can be regularly sampled, and point clouds, which are irregularly distributed. We also propose an efficient mapping of point clouds to concentric spherical images, thereby bridging spherical convolutions on grids with general point clouds. We demonstrate the effectiveness of our approach in improving state-of-the-art performance on 3D classification tasks with rotated data.

Submitted 18 March, 2021; originally announced March 2021.

Comments: This paper has been submitted for conference review

arXiv:2102.10220 [pdf, ps, other]

Making an $H$-Free Graph $k$-Colorable

Authors: Jacob Fox, Zoe Himwich, Nitya Mani

Abstract: We study the following question: how few edges can we delete from any $H$-free graph on $n$ vertices in order to make the resulting graph $k$-colorable? It turns out that various classical problems in extremal graph theory are special cases of this question. For $H$ any fixed odd cycle, we determine the answer up to a constant factor when $n$ is sufficiently large. We also prove an upper bound when $H$ is a fixed clique that we conjecture is tight up to a constant factor, and prove upper bounds for more general families of graphs. We apply our results to get a new bound on the maximum cut of graphs with a forbidden odd cycle in terms of the number of edges.

Submitted 19 March, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

Comments: 21 pages

MSC Class: 05C35; 05C38; 05D40

arXiv:2102.05008 [pdf, other]

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

Authors: Lewis Hammond, James Fox, Tom Everitt, Alessandro Abate, Michael Wooldridge

Abstract: Multi-agent influence diagrams (MAIDs) are a popular form of graphical model that, for certain classes of games, have been shown to offer key complexity and explainability advantages over traditional extensive form game (EFG) representations. In this paper, we extend previous work on MAIDs by introducing the concept of a MAID subgame, as well as subgame perfect and trembling hand perfect equilibrium refinements. We then prove several equivalence results between MAIDs and EFGs. Finally, we describe an open source implementation for reasoning about MAIDs and computing their equilibria.

Submitted 9 February, 2021; originally announced February 2021.

Comments: Accepted to the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-21)

arXiv:2010.13778 [pdf]

doi 10.1088/2058-9565/abfa64

Achieving a quantum smart workforce

Authors: Clarice D. Aiello, D. D. Awschalom, Hannes Bernien, Tina Brower-Thomas, Kenneth R. Brown, Todd A. Brun, Justin R. Caram, Eric Chitambar, Rosa Di Felice, Michael F. J. Fox, Stephan Haas, Alexander W. Holleitner, Eric R. Hudson, Jeffrey H. Hunt, Robert Joynt, Scott Koziol, H. J. Lewandowski, Douglas T. McClure, Jens Palsberg, Gina Passante, Kristen L. Pudenz, Christopher J. K. Richardson, Jessica L. Rosenberg, R. S. Ross, Mark Saffman, et al. (7 additional authors not shown)

Abstract: Interest in building dedicated Quantum Information Science and Engineering (QISE) education programs has greatly expanded in recent years. These programs are inherently convergent, complex, often resource intensive and likely require collaboration with a broad variety of stakeholders. In order to address this combination of challenges, we have captured ideas from many members in the community. This manuscript not only addresses policy makers and funding agencies (both public and private and from the regional to the international level) but also contains needs identified by industry leaders and discusses the difficulties inherent in creating an inclusive QISE curriculum. We report on the status of eighteen post-secondary education programs in QISE and provide guidance for building new programs. Lastly, we encourage the development of a comprehensive strategic plan for quantum education and workforce development as a means to make the most of the ongoing substantial investments being made in QISE.

Submitted 23 October, 2020; originally announced October 2020.

Comments: 18 pages, 2 figures, 1 table

Journal ref: Quantum Sci. Technol. 6 030501 (2021)

arXiv:1912.10206 [pdf, other]

How Robust Are Graph Neural Networks to Structural Noise?

Authors: James Fox, Sivasankaran Rajamanickam

Abstract: Graph neural networks (GNNs) are an emerging model for learning graph embeddings and making predictions on graph structured data. However, robustness of graph neural networks is not yet well-understood. In this work, we focus on node structural identity predictions, where a representative GNN model is able to achieve near-perfect accuracy. We also show that the same GNN model is not robust to addition of structural noise, through a controlled dataset and set of experiments. Finally, we show that under the right conditions, graph-augmented training is capable of significantly improving robustness to structural noise.

Submitted 21 December, 2019; originally announced December 2019.

Comments: Accepted workshop paper at Deep Learning on Graphs: Methodologies and Applications (DLGMA'20)

arXiv:1912.08964 [pdf]

doi 10.1145/3375627.3375817

Exploring AI Futures Through Role Play

Authors: Shahar Avin, Ross Gruetzemacher, James Fox

Abstract: We present an innovative methodology for studying and teaching the impacts of AI through a role play game. The game serves two primary purposes: 1) training AI developers and AI policy professionals to reflect on and prepare for future social and ethical challenges related to AI and 2) exploring possible futures involving AI technology development, deployment, social impacts, and governance. While the game currently focuses on the inter relations between short --, mid and long term impacts of AI, it has potential to be adapted for a broad range of scenarios, exploring in greater depths issues of AI policy research and affording training within organizations. The game presented here has undergone two years of development and has been tested through over 30 events involving between 3 and 70 participants. The game is under active development, but preliminary findings suggest that role play is a promising methodology for both exploring AI futures and training individuals and organizations in thinking about, and reflecting on, the impacts of AI and strategic mistakes that can be avoided today.

Submitted 18 December, 2019; originally announced December 2019.

Comments: Accepted to AIES

arXiv:1911.03427 [pdf, ps, other]

doi 10.1007/s11856-022-2290-x

Induced arithmetic removal: complexity 1 patterns over finite fields

Authors: Jacob Fox, Jonathan Tidor, Yufei Zhao

Abstract: We prove an arithmetic analog of the induced graph removal lemma for complexity 1 patterns over finite fields. Informally speaking, we show that given a fixed collection of $r$-colored complexity 1 arithmetic patterns over $\mathbb F_q$, every coloring $φ\colon \mathbb F_q^n \setminus\{0\} \to [r]$ with $o(1)$ density of every such pattern can be recolored on an $o(1)$-fraction of the space so that no such pattern remains.

Submitted 8 November, 2019; originally announced November 2019.

Comments: 22 pages

Journal ref: Israel J. Math. 248 (2022), 1--38

arXiv:1910.03679 [pdf, other]

Performance Impact of Memory Channels on Sparse and Irregular Algorithms

Authors: Oded Green, James Fox, Jeffrey Young, Jun Shirako, David Bader

Abstract: Graph processing is typically considered to be a memory-bound rather than compute-bound problem. One common line of thought is that more available memory bandwidth corresponds to better graph processing performance. However, in this work we demonstrate that the key factor in the utilization of the memory system for graph algorithms is not necessarily the raw bandwidth or even the latency of memory requests. Instead, we show that performance is proportional to the number of memory channels available to handle small data transfers with limited spatial locality. Using several widely used graph frameworks, including Gunrock (on the GPU) and GAPBS & Ligra (for CPUs), we evaluate key graph analytics kernels using two unique memory hierarchies, DDR-based and HBM/MCDRAM. Our results show that the differences in the peak bandwidths of several Pascal-generation GPU memory subsystems aren't reflected in the performance of various analytics. Furthermore, our experiments on CPU and Xeon Phi systems demonstrate that the number of memory channels utilized can be a decisive factor in performance across several different applications. For CPU systems with smaller thread counts, the memory channels can be underutilized while systems with high thread counts can oversaturate the memory subsystem, which leads to limited performance. Finally, we model the potential performance improvements of adding more memory channels with narrower access widths than are found in current platforms, and we analyze performance trade-offs for the two most prominent types of memory accesses found in graph algorithms, streaming and random accesses.

Submitted 8 October, 2019; originally announced October 2019.

arXiv:1902.10221 [pdf, other]

doi 10.1017/S0963548319000312

On Ramsey numbers of hedgehogs

Authors: Jacob Fox, Ray Li

Abstract: The hedgehog $H_t$ is a 3-uniform hypergraph on vertices $1,\dots,t+\binom{t}{2}$ such that, for any pair $(i,j)$ with $1\le i<j\le t$, there exists a unique vertex $k>t$ such that $\{i,j,k\}$ is an edge. Conlon, Fox, and Rödl proved that the two-color Ramsey number of the hedgehog grows polynomially in the number of its vertices, while the four-color Ramsey number grows exponentially in the number of its vertices. They asked whether the two-color Ramsey number of the hedgehog $H_t$ is nearly linear in the number of its vertices. We answer this question affirmatively, proving that $r(H_t) = O(t^2\ln t)$.

Submitted 26 February, 2019; originally announced February 2019.

Comments: 13 pages

MSC Class: 05D10; 05D40; 05C65

Journal ref: Combinator. Probab. Comp. 29 (2020) 101-112

arXiv:1809.05203 [pdf, other]

Periodicity in Movement Patterns Shapes Epidemic Risk in Urban Environments

Authors: Zhanwei Du, Spencer J Fox, Petter Holme, Jiming Liu, Alison P. Galvani, Lauren Ancel Meyers

Abstract: Daily variation in human mobility modulates the speed and severity of emerging outbreaks, yet most epidemiological studies assume static contact patterns. With a highly mobile population exceeding 24 million people, Shanghai, China is a transportation hub at high risk for the importation and subsequent global propagation of infectious diseases. Here, we use a dynamic metapopulation model informed by hourly transit data for Shanghai to estimate epidemic risks across thousands of outbreak scenarios. We find that the rate of initial epidemic growth varies by more than twenty-fold, depending on the hour and neighborhood of disease introduction. The riskiest introductions are those occurring close to the city center and on Fridays--which bridge weekday and weekend transit patterns and thereby connect otherwise disconnected portions of the population. The identification of these spatio-temporal hotspots can inform more efficient targets for sentinel surveillance and strategies for mitigating transmission.

Submitted 13 September, 2018; originally announced September 2018.

arXiv:1809.04716 [pdf, ps, other]

Towards the linear arboricity conjecture

Authors: Asaf Ferber, Jacob Fox, Vishesh Jain

Abstract: The linear arboricity of a graph $G$, denoted by $\text{la}(G)$, is the minimum number of edge-disjoint linear forests (i.e. forests in which every connected component is a path) in $G$ whose union covers all the edges of $G$. A famous conjecture due to Akiyama, Exoo, and Harary from 1980 asserts that $\text{la}(G)\leq \lceil (Δ(G)+1)/2 \rceil$, where $Δ(G)$ denotes the maximum degree of $G$. This conjectured upper bound would be best possible, as is easily seen by taking $G$ to be a regular graph. In this paper, we show that for every graph $G$, $\text{la}(G)\leq \frac{Δ}{2}+O(Δ^{2/3-α})$ for some $α> 0$, thereby improving the previously best known bound due to Alon and Spencer from 1992. For graphs which are sufficiently good spectral expanders, we give even better bounds. Our proofs of these results further give probabilistic polynomial time algorithms for finding such decompositions into linear forests.

Submitted 12 September, 2018; originally announced September 2018.

arXiv:1809.01352 [pdf, other]

A Completion of the Proof of the Edge-statistics Conjecture

Authors: Jacob Fox, Lisa Sauermann

Abstract: For given integers $k$ and $\ell$ with $0<\ell< {k \choose 2}$, Alon, Hefetz, Krivelevich and Tyomkyn formulated the following conjecture: When sampling a $k$-vertex subset uniformly at random from a very large graph $G$, then the probability to have exactly $\ell$ edges within the sampled $k$-vertex subset is at most $e^{-1}+o_k(1)$. This conjecture was proved in the case $Ω(k)\leq \ell\leq {k \choose 2}-Ω(k)$ by Kwan, Sudakov and Tran. In this paper, we complete the proof of the conjecture by resolving the remaining cases. We furthermore give nearly tight upper bounds for the probability described above in the case $ω(1)\leq \ell\leq o(k)$. We also extend some of our results to hypergraphs with bounded edge size.

Submitted 25 February, 2020; v1 submitted 5 September, 2018; originally announced September 2018.

Comments: 52 pages

Journal ref: Advances in Combinatorics, 2020:4, 52 pp

arXiv:1804.07431 [pdf, other]

Finding Cliques in Social Networks: A New Distribution-Free Model

Authors: Jacob Fox, Tim Roughgarden, C. Seshadhri, Fan Wei, Nicole Wein

Abstract: We propose a new distribution-free model of social networks. Our definitions are motivated by one of the most universal signatures of social networks, triadic closure---the property that pairs of vertices with common neighbors tend to be adjacent. Our most basic definition is that of a "$c$-closed" graph, where for every pair of vertices $u,v$ with at least $c$ common neighbors, $u$ and $v$ are adjacent. We study the classic problem of enumerating all maximal cliques, an important task in social network analysis. We prove that this problem is fixed-parameter tractable with respect to $c$ on $c$-closed graphs. Our results carry over to "weakly $c$-closed graphs", which only require a vertex deletion ordering that avoids pairs of non-adjacent vertices with $c$ common neighbors. Numerical experiments show that well-studied social networks tend to be weakly $c$-closed for modest values of $c$.

Submitted 19 April, 2018; originally announced April 2018.

Comments: main text 13 pages; 2 figures; appendix 9 pages

MSC Class: 68W01; 68R10; 05C85; 05D99

arXiv:1801.05037 [pdf, ps, other]

doi 10.1017/S0963548319000075

A fast new algorithm for weak graph regularity

Authors: Jacob Fox, László Miklós Lovász, Yufei Zhao

Abstract: We provide a deterministic algorithm that finds, in $ε^{-O(1)} n^2$ time, an $ε$-regular Frieze-Kannan partition of a graph on $n$ vertices. The algorithm outputs an approximation of a given graph as a weighted sum of $ε^{-O(1)}$ many complete bipartite graphs. As a corollary, we give a deterministic algorithm for estimating the number of copies of $H$ in an $n$-vertex graph $G$ up to an additive error of at most $εn^{v(H)}$, in time $ε^{-O_H(1)}n^2$.

Submitted 12 January, 2018; originally announced January 2018.

Comments: 12 pages, not including references. arXiv admin note: text overlap with arXiv:1604.00733

Journal ref: Combinator. Probab. Comp. 28 (2019) 777-790

arXiv:1611.01270 [pdf, other]

Fast property testing and metrics for permutations

Authors: Jacob Fox, Fan Wei

Abstract: The goal of property testing is to quickly distinguish between objects which satisfy a property and objects that are $ε$-far from satisfying the property. There are now several general results in this area which show that natural properties of combinatorial objects can be tested with "constant" query complexity, depending only on $ε$ and the property, and not on the size of the object being tested. The upper bounds on the query complexity coming from the proof techniques are often enormous and impractical. It remains a major open problem whether better bounds hold. Perhaps surprisingly, for testing with respect to the rectangular distance, we prove there is a universal (not depending on the property), polynomial in $1/ε$ query complexity bound for two-sided testing of hereditary properties of sufficiently large permutations. We further give a nearly linear bound with respect to a closely related metric which also depends on the smallest forbidden subpermutation for the property. Finally, we show that several different permutation metrics of interest are related to the rectangular distance, yielding similar results for testing with respect to these metrics.

Submitted 4 April, 2018; v1 submitted 4 November, 2016; originally announced November 2016.

Comments: 32 pages, 12 figures. The second version fixes some typos and uses the term "earth mover's distance" in place of the term "planar footrule distance" used in v1

MSC Class: 05; 60; 68

arXiv:1606.03753 [pdf, ps, other]

Approximating the rectilinear crossing number

Authors: Jacob Fox, Janos Pach, Andrew Suk

Abstract: A straight-line drawing of a graph $G$ is a mapping which assigns to each vertex a point in the plane and to each edge a straight-line segment connecting the corresponding two points. The rectilinear crossing number of a graph $G$, $\overline{cr}(G)$, is the minimum number of crossing edges in any straight-line drawing of $G$. Determining or estimating $\overline{cr}(G)$ appears to be a difficult problem, and deciding if $\overline{cr}(G)\leq k$ is known to be NP-hard. In fact, the asymptotic behavior of $\overline{cr}(K_n)$ is still unknown. In this paper, we present a deterministic $n^{2+o(1)}$-time algorithm that finds a straight-line drawing of any $n$-vertex graph $G$ with $\overline{cr}(G) + o(n^4)$ crossing edges. Together with the well-known Crossing Lemma due to Ajtai et al. and Leighton, this result implies that for any dense $n$-vertex graph $G$, one can efficiently find a straight-line drawing of $G$ with $(1 + o(1))\overline{cr}(G)$ crossing edges.

Submitted 7 September, 2016; v1 submitted 12 June, 2016; originally announced June 2016.

Comments: Appears in the Proceedings of the 24th International Symposium on Graph Drawing and Network Visualization (GD 2016)

arXiv:1603.07056 [pdf, ps, other]

On the number of cliques in graphs with a forbidden minor

Authors: Jacob Fox, Fan Wei

Abstract: Reed and Wood and, independently, Norine, Seymour, Thomas, and Wollan proved that for each positive integer $t$ there is a constant $c(t)$ such that every graph on $n$ vertices with no $K_t$-minor has at most $c(t)n$ cliques. Wood asked in 2007 if we can take $c(t) = c^t$ for some absolute constant $c$. This question was recently answered affirmatively by Lee and Oum. In this paper, we determine the exponential constant. We prove that every graph on $n$ vertices with no $K_t$-minor has at most $3^{2t/3+o(t)}n$ cliques. This bound is tight for $n \geq 4t/3$. More generally, let $H$ be a connected graph on $t$ vertices, and let $x$ denote the size (i.e., the number of edges) of the largest matching in the complement of $H$. We prove that every graph on $n$ vertices with no $H$-minor has at most $\max(3^{2t/3-x/3+o(t)}n,2^{t+o(t)}n)$ cliques, and this bound is tight for $n \geq \max (4t/3-2x/3,t)$ by a simple construction. Even more generally, we determine explicitly the exponential constant for the maximum number of cliques an $n$-vertex graph can have in a minor-closed family of graphs which is closed under disjoint union.

Submitted 22 March, 2016; originally announced March 2016.

Comments: 20 pages

arXiv:1505.07429 [pdf, ps, other]

Semi-algebraic colorings of complete graphs

Authors: Jacob Fox, Janos Pach, Andrew Suk

Abstract: We consider $m$-colorings of the edges of a complete graph, where each color class is defined semi-algebraically with bounded complexity. The case $m = 2$ was first studied by Alon et al., who applied this framework to obtain surprisingly strong Ramsey-type results for intersection graphs of geometric objects and for other graphs arising in computational geometry. Considering larger values of $m$ is relevant, e.g., to problems concerning the number of distinct distances determined by a point set. For $p\ge 3$ and $m\ge 2$, the classical Ramsey number $R(p;m)$ is the smallest positive integer $n$ such that any $m$-coloring of the edges of $K_n$, the complete graph on $n$ vertices, contains a monochromatic $K_p$. It is a longstanding open problem that goes back to Schur (1916) to decide whether $R(p;m)=2^{O(m)}$, for a fixed $p$. We prove that this is true if each color class is defined semi-algebraically with bounded complexity. The order of magnitude of this bound is tight. Our proof is based on the Cutting Lemma of Chazelle et al., and on a Szemerédi-type regularity lemma for multicolored semi-algebraic graphs, which is of independent interest. The same technique is used to address the semi-algebraic variant of a more general Ramsey-type problem of Erdős and Shelah.

Submitted 5 December, 2018; v1 submitted 27 May, 2015; originally announced May 2015.

arXiv:1502.01730 [pdf, ps, other]

A polynomial regularity lemma for semi-algebraic hypergraphs and its applications in geometry and property testing

Authors: Jacob Fox, Janos Pach, Andrew Suk

Abstract: Fox, Gromov, Lafforgue, Naor, and Pach proved a regularity lemma for semi-algebraic $k$-uniform hypergraphs of bounded complexity, showing that for each $ε>0$ the vertex set can be equitably partitioned into a bounded number of parts (in terms of $ε$ and the complexity) so that all but an $ε$-fraction of the $k$-tuples of parts are homogeneous. We prove that the number of parts can be taken to be polynomial in $1/ε$. Our improved regularity lemma can be applied to geometric problems and to the following general question on property testing: is it possible to decide, with query complexity polynomial in the reciprocal of the approximation parameter, whether a hypergraph has a given hereditary property? We give an affirmative answer for testing typical hereditary properties for semi-algebraic hypergraphs of bounded complexity.

Submitted 14 October, 2016; v1 submitted 5 February, 2015; originally announced February 2015.

arXiv:1403.1768 [pdf, ps, other]

A tight lower bound for Szemerédi's regularity lemma

Authors: Jacob Fox, László Miklós Lovász

Abstract: Addressing a question of Gowers, we determine the order of the tower height for the partition size in a version of Szemerédi's regularity lemma.

Submitted 7 March, 2014; originally announced March 2014.

Comments: 31 pages

arXiv:1401.6734 [pdf, ps, other]

Distinct volume subsets

Authors: David Conlon, Jacob Fox, William Gasarch, David G. Harris, Douglas Ulrich, Samuel Zbarsky

Abstract: Suppose that $a$ and $d$ are positive integers with $a \geq 2$. Let $h_{a,d}(n)$ be the largest integer $t$ such that any set of $n$ points in $\mathbb{R}^d$ contains a subset of $t$ points for which all the non-zero volumes of the ${t \choose a}$ subsets of order $a$ are distinct. Beginning with Erdős in 1957, the function $h_{2,d}(n)$ has been closely studied and is known to be at least a power of $n$. We improve the best known bound for $h_{2,d}(n)$ and show that $h_{a,d}(n)$ is at least a power of $n$ for all $a$ and $d$.

Submitted 10 May, 2015; v1 submitted 26 January, 2014; originally announced January 2014.

Comments: 10 pages

Journal ref: SIAM Journal on Discrete Math 29(1), pp. 472-480 (2014)

arXiv:1310.8378 [pdf, ps, other]

Stanley-Wilf limits are typically exponential

Authors: Jacob Fox

Abstract: For a permutation $π$, let $S_{n}(π)$ be the number of permutations on $n$ letters avoiding $π$. Marcus and Tardos proved the celebrated Stanley-Wilf conjecture that $L(π)= \lim_{n \to \infty} S_n(π)^{1/n}$ exists and is finite. Backed by numerical evidence, it has been conjectured by many researchers over the years that $L(π)=Θ(k^2)$ for every permutation $π$ on $k$ letters. We disprove this conjecture, showing that $L(π)=2^{k^{Θ(1)}}$ for almost all permutations $π$ on $k$ letters.

Submitted 31 October, 2013; originally announced October 2013.

Comments: 13 pages

arXiv:1304.3448 [pdf]

Strong & Weak Methods: A Logical View of Uncertainty

Authors: John Fox

Abstract: The last few years have seen a growing debate about techniques for managing uncertainty in AI systems. Unfortunately this debate has been cast as a rivalry between AI methods and classical probability-based ones. Three arguments for extending the probability framework of uncertainty are presented, none of which implies a challenge to classical methods. These are (1) explicit representation of several types of uncertainty, specifically possibility and plausibility, as well as probability, (2) the use of weak methods for uncertainty management in problems which are poorly defined, and (3) symbolic representation of different uncertainty calculi and methods for choosing between them.

Submitted 27 March, 2013; originally announced April 2013.

Comments: Appears in Proceedings of the First Conference on Uncertainty in Artificial Intelligence (UAI1985)

Report number: UAI-P-1985-PG-253-257

arXiv:1303.5716 [pdf]

Symbolic Decision Theory and Autonomous Systems

Authors: John Fox, Paul J. Krause

Abstract: The ability to reason under uncertainty and with incomplete information is a fundamental requirement of decision support technology. In this paper we argue that the concentration on theoretical techniques for the evaluation and selection of decision options has distracted attention from many of the wider issues in decision making. Although numerical methods of reasoning under uncertainty have strong theoretical foundations, they are representationally weak and only deal with a small part of the decision process. Knowledge-based systems, on the other hand, offer greater flexibility but have not been accompanied by a clear decision theory. We describe here work which is under way towards providing a theoretical framework for symbolic decision procedures. A central proposal is an extended form of inference which we call argumentation: reasoning for and against decision options from generalised domain theories. The approach has been successfully used in several decision support applications, but it is argued that a comprehensive decision theory must cover autonomous decision making, where the agent can formulate questions as well as take decisions. A major theoretical challenge for this theory is to capture the idea of reflection to permit decision agents to reason about their goals, what they believe and why, and what they need to know or do in order to achieve their goals.

Submitted 20 March, 2013; originally announced March 2013.

Comments: Appears in Proceedings of the Seventh Conference on Uncertainty in Artificial Intelligence (UAI1991)

Report number: UAI-P-1991-PG-103-110

Showing 1–50 of 66 results for author: Fox, J