Search | arXiv e-print repository

Transport of Algebraic Structure to Latent Embeddings

Authors: Samuel Pfrommer, Brendon G. Anderson, Somayeh Sojoudi

Abstract: Machine learning often aims to produce latent embeddings of inputs which lie in a larger, abstract mathematical space. For example, in the field of 3D modeling, subsets of Euclidean space can be embedded as vectors using implicit neural representations. Such subsets also have a natural algebraic structure including operations (e.g., union) and corresponding laws (e.g., associativity). How can we learn to "union" two sets using only their latent embeddings while respecting associativity? We propose a general procedure for parameterizing latent space operations that are provably consistent with the laws on the input space. This is achieved by learning a bijection from the latent space to a carefully designed mirrored algebra which is constructed on Euclidean space in accordance with desired laws. We evaluate these structural transport nets for a range of mirrored algebras against baselines that operate directly on the latent space. Our experiments provide strong evidence that respecting the underlying algebraic structure of the input space is key for learning accurate and self-consistent operations.

Submitted 26 May, 2024; originally announced May 2024.

Comments: Proceedings of the 41st International Conference on Machine Learning (2024)
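
The idea in the abstract above, defining a latent operation by mapping through a bijection to a space where the laws already hold, can be illustrated with a toy sketch. It assumes a hand-written affine bijection `phi` in place of a learned one and uses elementwise maximum as the mirrored operation; both choices and all names are illustrative assumptions, not the authors' implementation.

```python
from typing import Tuple

Vec = Tuple[float, ...]


def phi(z: Vec) -> Vec:
    """Hypothetical bijection from the latent space to the mirrored space (affine here)."""
    return tuple(2.0 * v + 1.0 for v in z)


def phi_inv(y: Vec) -> Vec:
    """Inverse of phi, mapping the mirrored space back to the latent space."""
    return tuple((v - 1.0) / 2.0 for v in y)


def mirrored_op(a: Vec, b: Vec) -> Vec:
    """Elementwise max: associative, commutative, and idempotent, like set union."""
    return tuple(max(x, y) for x, y in zip(a, b))


def latent_op(a: Vec, b: Vec) -> Vec:
    """Latent operation obtained by transporting mirrored_op through phi."""
    return phi_inv(mirrored_op(phi(a), phi(b)))


a, b, c = (0.5, -1.0), (2.0, 0.25), (-3.0, 4.0)
left = latent_op(latent_op(a, b), c)   # (a op b) op c
right = latent_op(a, latent_op(b, c))  # a op (b op c)
```

Because `latent_op` is `phi_inv(mirrored_op(phi(a), phi(b)))`, any law satisfied by `mirrored_op` carries over to the latent space. With a learned, numerically inverted bijection the laws would hold only up to inversion error.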

arXiv:2402.01536 [pdf, other]

doi 10.1145/3635636.3656204

Homogenization Effects of Large Language Models on Human Creative Ideation

Authors: Barrett R. Anderson, Jash Hemant Shah, Max Kreminski

Abstract: Large language models (LLMs) are now being used in a wide variety of contexts, including as creativity support tools (CSTs) intended to help their users come up with new ideas. But do LLMs actually support user creativity? We hypothesized that the use of an LLM as a CST might make the LLM's users feel more creative, and even broaden the range of ideas suggested by each individual user, but also homogenize the ideas suggested by different users. We conducted a 36-participant comparative user study and found, in accordance with the homogenization hypothesis, that different users tended to produce less semantically distinct ideas with ChatGPT than with an alternative CST. Additionally, ChatGPT users generated a greater number of more detailed ideas, but felt less responsible for the ideas they generated. We discuss potential implications of these findings for users, designers, and developers of LLM-based CSTs.

Submitted 10 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted to C&C 2024

arXiv:2312.08286 [pdf, other]

Evolutionary Games on Infinite Strategy Sets: Convergence to Nash Equilibria via Dissipativity

Authors: Brendon G. Anderson, Somayeh Sojoudi, Murat Arcak

Abstract: We consider evolutionary dynamics for population games in which players have a continuum of strategies at their disposal. Models in this setting amount to infinite-dimensional differential equations evolving on the manifold of probability measures. We generalize dissipativity theory for evolutionary games from finite to infinite strategy sets that are compact metric spaces, and derive sufficient conditions for the stability of Nash equilibria under the infinite-dimensional dynamics. The resulting analysis is applicable to a broad class of evolutionary games, and is modular in the sense that the pertinent conditions on the dynamics and the game's payoff structure can be verified independently. By specializing our theory to the class of monotone games, we recover as special cases existing stability results for the Brown-von Neumann-Nash and impartial pairwise comparison dynamics. We also extend our theory to models with dynamic payoffs, further broadening the applicability of our framework. We illustrate our theory using a variety of case studies, including a novel, continuous variant of the war of attrition game.

Submitted 22 December, 2023; v1 submitted 13 December, 2023; originally announced December 2023.

arXiv:2312.04733 [pdf, other]

Neighboring Extremal Optimal Control Theory for Parameter-Dependent Closed-loop Laws

Authors: Ayush Rai, Shaoshuai Mou, Brian D. O. Anderson

Abstract: This study introduces an approach to obtain a neighboring extremal optimal control (NEOC) solution for a closed-loop optimal control problem, applicable to a wide array of nonlinear systems and not necessarily quadratic performance indices. The approach involves investigating the variation incurred in the functional form of a known closed-loop optimal control law due to small, known parameter variations in the system equations or the performance index. The NEOC solution can formally be obtained by solving a linear partial differential equation, akin to those encountered in the iterative solution of a nonlinear Hamilton-Jacobi equation. Motivated by numerical procedures for solving these latter equations, we also propose a numerical algorithm based on the Galerkin algorithm, leveraging the use of basis functions to solve the underlying Hamilton-Jacobi equation of the original optimal control problem. The proposed approach simplifies the NEOC problem by reducing it to the solution of a simple set of linear equations, thereby eliminating the need for a full re-solution of the adjusted optimal control problem. Furthermore, the variation to the optimal performance index can be obtained as a function of both the system state and small changes in parameters, allowing the determination of the adjustment to an optimal control law given a small adjustment of parameters in the system or the performance index. Moreover, in order to handle large known parameter perturbations, we propose a homotopic approach that breaks down the single calculation of NEOC into a finite set of multiple steps. Finally, the validity of the claims and theory is supported by theoretical analysis and numerical simulations.

Submitted 7 December, 2023; originally announced December 2023.

arXiv:2311.15165 [pdf, other]

Mixing Classifiers to Alleviate the Accuracy-Robustness Trade-Off

Authors: Yatong Bai, Brendon G. Anderson, Somayeh Sojoudi

Abstract: Deep neural classifiers have recently found tremendous success in data-driven control systems. However, existing models suffer from a trade-off between accuracy and adversarial robustness. This limitation must be overcome in the control of safety-critical systems that require both high performance and rigorous robustness guarantees. In this work, we develop classifiers that simultaneously inherit high robustness from robust models and high accuracy from standard models. Specifically, we propose a theoretically motivated formulation that mixes the output probabilities of a standard neural network and a robust neural network. Both base classifiers are pre-trained, and thus our method does not require additional training. Our numerical experiments verify that the mixed classifier noticeably improves the accuracy-robustness trade-off and identify the confidence property of the robust base classifier as the key leverage of this more benign trade-off. Our theoretical results prove that under mild assumptions, when the robustness of the robust base model is certifiable, no alteration or attack within a closed-form $\ell_p$ radius on an input can result in the misclassification of the mixed classifier.

Submitted 3 June, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

Comments: arXiv admin note: text overlap with arXiv:2301.12554

MSC Class: 68T07
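
As a rough illustration of "mixing the output probabilities" of two pre-trained classifiers, the sketch below forms a convex combination of two probability vectors. The convex-combination form, the weight name `alpha`, and the function names are assumptions made for illustration; the paper's exact formulation may differ.

```python
from typing import List


def mix_probabilities(p_std: List[float], p_rob: List[float], alpha: float) -> List[float]:
    """Convex combination of standard and robust class probabilities.

    alpha = 0 returns the standard model's output, alpha = 1 the robust model's.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(p_std) != len(p_rob):
        raise ValueError("probability vectors must have the same length")
    return [(1.0 - alpha) * s + alpha * r for s, r in zip(p_std, p_rob)]


def predict(probs: List[float]) -> int:
    """Index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical outputs of the two base classifiers on one input.
mixed = mix_probabilities([0.9, 0.1], [0.4, 0.6], alpha=0.5)
label = predict(mixed)
```

Since both inputs are probability vectors, the mixture is again a probability vector, and no retraining of either base model is involved.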

arXiv:2310.04916 [pdf, other]

Tight Certified Robustness via Min-Max Representations of ReLU Neural Networks

Authors: Brendon G. Anderson, Samuel Pfrommer, Somayeh Sojoudi

Abstract: The reliable deployment of neural networks in control systems requires rigorous robustness guarantees. In this paper, we obtain tight robustness certificates over convex attack sets for min-max representations of ReLU neural networks by developing a convex reformulation of the nonconvex certification problem. This is done by "lifting" the problem to an infinite-dimensional optimization over probability measures, leveraging recent results in distributionally robust optimization to solve for an optimal discrete distribution, and proving that solutions of the original nonconvex problem are generated by the discrete distribution under mild boundedness, nonredundancy, and Slater conditions. As a consequence, optimal (worst-case) attacks against the model may be solved for exactly. This contrasts prior state-of-the-art that either requires expensive branch-and-bound schemes or loose relaxation techniques. Experiments on robust control and MNIST image classification examples highlight the benefits of our approach.

Submitted 7 October, 2023; originally announced October 2023.

Comments: IEEE Conference on Decision and Control, 2023

arXiv:2309.13794 [pdf, other]

Projected Randomized Smoothing for Certified Adversarial Robustness

Authors: Samuel Pfrommer, Brendon G. Anderson, Somayeh Sojoudi

Abstract: Randomized smoothing is the current state-of-the-art method for producing provably robust classifiers. While randomized smoothing typically yields robust $\ell_2$-ball certificates, recent research has generalized provable robustness to different norm balls as well as anisotropic regions. This work considers a classifier architecture that first projects onto a low-dimensional approximation of the data manifold and then applies a standard classifier. By performing randomized smoothing in the low-dimensional projected space, we characterize the certified region of our smoothed composite classifier back in the high-dimensional input space and prove a tractable lower bound on its volume. We show experimentally on CIFAR-10 and SVHN that classifiers without the initial projection are vulnerable to perturbations that are normal to the data manifold and yet are captured by the certified regions of our method. We compare the volume of our certified regions against various baselines and show that our method improves on the state-of-the-art by many orders of magnitude.

Submitted 24 September, 2023; originally announced September 2023.

Comments: Transactions on Machine Learning Research (TMLR) 2023
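
For readers unfamiliar with randomized smoothing, the following minimal sketch shows only the standard prediction step: a majority vote of a base classifier over Gaussian perturbations of the input. It does not include the projection step or the certification procedure from the entry above, and the function name, sample count, and toy base classifier are all illustrative assumptions.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence


def smoothed_predict(
    base_classifier: Callable[[List[float]], int],
    x: Sequence[float],
    sigma: float,
    n_samples: int = 1000,
    seed: int = 0,
) -> int:
    """Monte Carlo majority vote of base_classifier under isotropic Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    votes: Counter = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    return votes.most_common(1)[0][0]


def toy_classifier(v: List[float]) -> int:
    """Hypothetical base classifier: class 1 if the first coordinate is positive."""
    return int(v[0] > 0)


label_pos = smoothed_predict(toy_classifier, [3.0, 0.0], sigma=0.5)
label_neg = smoothed_predict(toy_classifier, [-3.0, 0.0], sigma=0.5)
```

A certificate would additionally require a statistical lower bound on the top-class vote probability, which is omitted here.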

arXiv:2307.07610 [pdf, other]

Assessing and Exploiting Domain Name Misinformation

Authors: Blake Anderson, David McGrew

Abstract: Cloud providers' support for network evasion techniques that misrepresent the server's domain name is more prevalent than previously believed, which has serious implications for security and privacy due to the reliance on domain names in common security architectures. Domain fronting is one such evasive technique used by privacy enhancing technologies and malware to hide the domains they visit, and it uses shared hosting and HTTPS to present a benign domain to observers while signaling the target domain in the encrypted HTTP request. In this paper, we construct an ontology of domain name misinformation and detail a novel measurement methodology to identify support among cloud infrastructure providers. Despite several of the largest cloud providers having publicly stated that they no longer support domain fronting, our findings demonstrate a more complex environment with many exceptions. We also present a novel and straightforward attack that allows an adversary to man-in-the-middle all the victim's encrypted traffic bound to a content delivery network that supports domain fronting, breaking the authenticity, confidentiality, and integrity guarantees expected by the victim when using HTTPS. By using dynamic linker hijacking to rewrite the HTTP Host field, our attack does not generate any artifacts that are visible to the victim or passive network monitoring solutions, and the attacker does not need a separate channel to exfiltrate data or perform command-and-control, which can be achieved by rewriting HTTP headers.

Submitted 14 July, 2023; originally announced July 2023.

Comments: Presented at the 8th International Workshop on Traffic Measurements for Cybersecurity (WTMC 2023)

arXiv:2304.04640 [pdf, other]

NeuroBench: A Framework for Benchmarking Neuromorphic Computing Algorithms and Systems

Authors: Jason Yik, Korneel Van den Berghe, Douwe den Blanken, Younes Bouhadjar, Maxime Fabre, Paul Hueber, Denis Kleyko, Noah Pacik-Nelson, Pao-Sheng Vincent Sun, Guangzhi Tang, Shenqi Wang, Biyan Zhou, Soikat Hasan Ahmed, George Vathakkattil Joseph, Benedetto Leto, Aurora Micheli, Anurag Kumar Mishra, Gregor Lenz, Tao Sun, Zergham Ahmed, Mahmoud Akl, Brian Anderson, Andreas G. Andreou, Chiara Bartolozzi, Arindam Basu, et al. (73 additional authors not shown)

Abstract: Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. Prior neuromorphic computing benchmark efforts have not seen widespread adoption due to a lack of inclusive, actionable, and iterative benchmark design and guidelines. To address these shortcomings, we present NeuroBench: a benchmark framework for neuromorphic computing algorithms and systems. NeuroBench is a collaboratively-designed effort from an open community of nearly 100 co-authors across over 50 institutions in industry and academia, aiming to provide a representative structure for standardizing the evaluation of neuromorphic approaches. The NeuroBench framework introduces a common set of tools and systematic methodology for inclusive benchmark measurement, delivering an objective reference framework for quantifying neuromorphic approaches in both hardware-independent (algorithm track) and hardware-dependent (system track) settings. In this article, we present initial performance baselines across various model architectures on the algorithm track and outline the system track benchmark tasks and guidelines. NeuroBench is intended to continually expand its benchmarks and features to foster and track the progress made by the research community.

Submitted 17 January, 2024; v1 submitted 10 April, 2023; originally announced April 2023.

Comments: Updated from whitepaper to full perspective article preprint

arXiv:2302.01961 [pdf, other]

Asymmetric Certified Robustness via Feature-Convex Neural Networks

Authors: Samuel Pfrommer, Brendon G. Anderson, Julien Piet, Somayeh Sojoudi

Abstract: Recent works have introduced input-convex neural networks (ICNNs) as learning models with advantageous training, inference, and generalization properties linked to their convex structure. In this paper, we propose a novel feature-convex neural network architecture as the composition of an ICNN with a Lipschitz feature map in order to achieve adversarial robustness. We consider the asymmetric binary classification setting with one "sensitive" class, and for this class we prove deterministic, closed-form, and easily-computable certified robust radii for arbitrary $\ell_p$-norms. We theoretically justify the use of these models by characterizing their decision region geometry, extending the universal approximation theorem for ICNN regression to the classification setting, and proving a lower bound on the probability that such models perfectly fit even unstructured uniformly distributed data in sufficiently high dimensions. Experiments on Malimg malware classification and subsets of MNIST, Fashion-MNIST, and CIFAR-10 datasets show that feature-convex classifiers attain state-of-the-art certified $\ell_1$-radii as well as substantial $\ell_2$- and $\ell_{\infty}$-radii while being far more computationally efficient than any competitive baseline.

Submitted 10 October, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
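
To see why convexity can yield a closed-form certified radius of the kind mentioned in the entry above, consider the following standard argument. It is a sketch under simplifying assumptions that are ours, not the paper's: the feature map is taken to be the identity, the classifier $g$ is convex and differentiable at $x$, the sensitive class is predicted when $g > 0$, and $q$ denotes the dual exponent of $p$ with $1/p + 1/q = 1$.

```latex
% First-order convexity bound, followed by Hoelder's inequality:
\begin{align*}
g(y) &\ge g(x) + \nabla g(x)^\top (y - x) \\
     &\ge g(x) - \|\nabla g(x)\|_q \, \|y - x\|_p .
\end{align*}
% Hence the prediction g(y) > 0 is preserved whenever
\[
\|y - x\|_p < r(x) := \frac{g(x)}{\|\nabla g(x)\|_q},
\qquad g(x) > 0,\ \nabla g(x) \neq 0 .
\]
```

The radius requires only one function value and one gradient, which is in line with the abstract's description of the certificates as deterministic and easily computable. The paper's precise statement, including the role of the Lipschitz feature map, may differ from this simplified form.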

arXiv:2301.12554 [pdf, other]

Improving the Accuracy-Robustness Trade-Off of Classifiers via Adaptive Smoothing

Authors: Yatong Bai, Brendon G. Anderson, Aerin Kim, Somayeh Sojoudi

Abstract: While prior research has proposed a plethora of methods that build neural classifiers robust against adversarial attacks, practitioners are still reluctant to adopt them due to their unacceptably severe clean accuracy penalties. This paper significantly alleviates this accuracy-robustness trade-off by mixing the output probabilities of a standard classifier and a robust classifier, where the standard network is optimized for clean accuracy and is not robust in general. We show that the robust base classifier's confidence difference for correct and incorrect examples is the key to this improvement. In addition to providing intuitions and empirical evidence, we theoretically certify the robustness of the mixed classifier under realistic assumptions. Furthermore, we adapt an adversarial input detector into a mixing network that adaptively adjusts the mixture of the two base models, further reducing the accuracy penalty of achieving robustness. The proposed flexible method, termed "adaptive smoothing", can work in conjunction with existing or even future methods that improve clean accuracy, robustness, or adversary detection. Our empirical evaluation considers strong attack methods, including AutoAttack and adaptive attack. On the CIFAR-100 dataset, our method achieves an 85.21% clean accuracy while maintaining a 38.72% $\ell_\infty$-AutoAttacked ($\epsilon = 8/255$) accuracy, becoming the second most robust method on the RobustBench CIFAR-100 benchmark as of submission, while improving the clean accuracy by ten percentage points compared with all listed models. The code that implements our method is available at https://github.com/Bai-YT/AdaptiveSmoothing.

Submitted 8 April, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

MSC Class: 68T07

arXiv:2211.04468 [pdf, other]

An efficient graph generative model for navigating ultra-large combinatorial synthesis libraries

Authors: Aryan Pedawi, Pawel Gniewek, Chaoyi Chang, Brandon M. Anderson, Henry van den Bedem

Abstract: Virtual, make-on-demand chemical libraries have transformed early-stage drug discovery by unlocking vast, synthetically accessible regions of chemical space. Recent years have witnessed rapid growth in these libraries from millions to trillions of compounds, hiding undiscovered, potent hits for a variety of therapeutic targets. However, they are quickly approaching a size beyond that which permits explicit enumeration, presenting new challenges for virtual screening. To overcome these challenges, we propose the Combinatorial Synthesis Library Variational Auto-Encoder (CSLVAE). The proposed generative model represents such libraries as a differentiable, hierarchically-organized database. Given a compound from the library, the molecular encoder constructs a query for retrieval, which is utilized by the molecular decoder to reconstruct the compound by first decoding its chemical reaction and subsequently decoding its reactants. Our design minimizes autoregression in the decoder, facilitating the generation of large, valid molecular graphs. Our method performs fast and parallel batch inference for ultra-large synthesis libraries, enabling a number of important applications in early-stage drug discovery. Compounds proposed by our method are guaranteed to be in the library, and thus synthetically and cost-effectively accessible. Importantly, CSLVAE can encode out-of-library compounds and search for in-library analogues. In experiments, we demonstrate the capabilities of the proposed method in the navigation of massive combinatorial synthesis libraries.

Submitted 19 October, 2022; originally announced November 2022.

Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2210.03786 [pdf, ps, other]

doi 10.1007/978-3-031-16980-9_14

Evaluating the Performance of StyleGAN2-ADA on Medical Images

Authors: McKell Woodland, John Wood, Brian M. Anderson, Suprateek Kundu, Ethan Lin, Eugene Koay, Bruno Odisio, Caroline Chung, Hyunseon Christine Kang, Aradhana M. Venkatesan, Sireesha Yedururi, Brian De, Yuan-Mao Lin, Ankit B. Patel, Kristy K. Brock

Abstract: Although generative adversarial networks (GANs) have shown promise in medical imaging, they have four main limitations that impeded their utility: computational cost, data requirements, reliable evaluation measures, and training complexity. Our work investigates each of these obstacles in a novel application of StyleGAN2-ADA to high-resolution medical imaging datasets. Our dataset is comprised of liver-containing axial slices from non-contrast and contrast-enhanced computed tomography (CT) scans. Additionally, we utilized four public datasets composed of various imaging modalities. We trained a StyleGAN2 network with transfer learning (from the Flickr-Faces-HQ dataset) and data augmentation (horizontal flipping and adaptive discriminator augmentation). The network's generative quality was measured quantitatively with the Fréchet Inception Distance (FID) and qualitatively with a visual Turing test given to seven radiologists and radiation oncologists. The StyleGAN2-ADA network achieved a FID of 5.22 ($\pm$ 0.17) on our liver CT dataset. It also set new record FIDs of 10.78, 3.52, 21.17, and 5.39 on the publicly available SLIVER07, ChestX-ray14, ACDC, and Medical Segmentation Decathlon (brain tumors) datasets. In the visual Turing test, the clinicians rated generated images as real 42% of the time, approaching random guessing. Our computational ablation study revealed that transfer learning and data augmentation stabilize training and improve the perceptual quality of the generated images. We observed the FID to be consistent with human perceptual evaluation of medical images. Finally, our work found that StyleGAN2-ADA consistently produces high-quality results without hyperparameter searches or retraining.

Submitted 7 October, 2022; originally announced October 2022.

Comments: This preprint has not undergone post-submission improvements or corrections. The Version of Record of this contribution is published in LNCS, volume 13570, and is available online at https://doi.org/10.1007/978-3-031-16980-9_14

Journal ref: Lecture Notes in Computer Science 13570 (2022)
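
For reference, the Fréchet Inception Distance reported in the entry above is commonly defined as the Fréchet distance between two Gaussians fitted to Inception-network features of real and generated images. The symbols below ($\mu_r, \Sigma_r$ for real features and $\mu_g, \Sigma_g$ for generated features) are our notation, not taken from the abstract.

```latex
\[
\mathrm{FID}
  = \|\mu_r - \mu_g\|_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g
      - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
\]
```

Lower values indicate that the generated feature distribution is closer to the real one, and the distance is zero when the two fitted Gaussians coincide.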

arXiv:2209.12017 [pdf, other]

Cooperative Tuning of Multi-Agent Optimal Control Systems

Authors: Zehui Lu, Wanxin Jin, Shaoshuai Mou, Brian D. O. Anderson

Abstract: This paper investigates the problem of cooperative tuning of multi-agent optimal control systems, where a network of agents (i.e. multiple coupled optimal control systems) adjusts parameters in their dynamics, objective functions, or controllers in a coordinated way to minimize the sum of their loss functions. Different from classical techniques for tuning parameters in a controller, we allow tunable parameters appearing in both the system dynamics and the objective functions of each agent. A framework is developed to allow all agents to reach a consensus on the tunable parameter, which minimizes team loss. The key idea of the proposed algorithm rests on the integration of consensus-based distributed optimization for a multi-agent system and a gradient generator capturing the optimal performance as a function of the parameter in the feedback loop tuning the parameter for each agent. Both theoretical results and simulations for a synchronous multi-agent rendezvous problem are provided to validate the proposed method for cooperative tuning of multi-agent optimal control.

Submitted 24 September, 2022; originally announced September 2022.

arXiv:2208.07464 [pdf, other]

An Overview and Prospective Outlook on Robust Training and Certification of Machine Learning Models

Authors: Brendon G. Anderson, Tanmay Gautam, Somayeh Sojoudi

Abstract: In this discussion paper, we survey recent research surrounding robustness of machine learning models. As learning algorithms become increasingly more popular in data-driven control systems, their robustness to data uncertainty must be ensured in order to maintain reliable safety-critical operations. We begin by reviewing common formalisms for such robustness, and then move on to discuss popular and state-of-the-art techniques for training robust machine learning models as well as methods for provably certifying such robustness. From this unification of robust machine learning, we identify and discuss pressing directions for future research in the area.

Submitted 27 September, 2022; v1 submitted 15 August, 2022; originally announced August 2022.

arXiv:2207.03003 [pdf, other]

A scaling model for measuring the morphology of African cities: Implications for future energy needs

Authors: Rafael Prieto Curiel, Jorge E. Patino, Brilé Anderson

Abstract: A large proportion of Africa's infrastructure is yet to be built. Where and how these new buildings are constructed matters since today's decisions will last for decades. The resulting morphology of cities has lasting implications for a city's energy needs. Estimating and projecting these needs has always been challenging in Africa due to the lack of data. Yet, given the sweeping urbanisation expected in Africa over the next three decades, this obstacle must be overcome to guide cities towards a trajectory of sustainability and resilience. Based on the location and surface of nearly 200 million buildings on the continent, we estimate the inter-building distance of almost six thousand cities. Buildings' footprint data enables the construction of urban form indicators to compare African cities' elongation, sprawl and emptiness. We establish the BASE model, where the mean distance between buildings is a functional relation to the number of Buildings and their average Area, as well as the Sprawl and the Elongation of its spatial arrangement. The mean distance between structures in cities -- our proxy for its energy demands related to mobility -- grows faster than the square root of its population, resulting from the combined impact of a sublinear growth in the number of buildings and a sublinear increase in building size and sprawl. We show that when a city doubles its population, it triples the energy demand related to commutes.

Submitted 11 August, 2022; v1 submitted 6 July, 2022; originally announced July 2022.

Comments: 20 pages

arXiv:2204.11910 [pdf, other]

Integrating Reward Maximization and Population Estimation: Sequential Decision-Making for Internal Revenue Service Audit Selection

Authors: Peter Henderson, Ben Chugg, Brandon Anderson, Kristen Altenburger, Alex Turk, John Guyton, Jacob Goldin, Daniel E. Ho

Abstract: We introduce a new setting, optimize-and-estimate structured bandits. Here, a policy must select a batch of arms, each characterized by its own context, that would allow it to both maximize reward and maintain an accurate (ideally unbiased) population estimate of the reward. This setting is inherent to many public and private sector applications and often requires handling delayed feedback, small data, and distribution shifts. We demonstrate its importance on real data from the United States Internal Revenue Service (IRS). The IRS performs yearly audits of the tax base. Two of its most important objectives are to identify suspected misreporting and to estimate the "tax gap" -- the global difference between the amount paid and true amount owed. Based on a unique collaboration with the IRS, we cast these two processes as a unified optimize-and-estimate structured bandit. We analyze optimize-and-estimate approaches to the IRS problem and propose a novel mechanism for unbiased population estimation that achieves rewards comparable to baseline approaches. This approach has the potential to improve audit efficacy, while maintaining policy-relevant estimates of the tax gap. This has important social consequences given that the current tax gap is estimated at nearly half a trillion dollars. We suggest that this problem setting is fertile ground for further research and we highlight its interesting challenges. The results of this and related research are currently being incorporated into the continual improvement of the IRS audit selection methods.

Submitted 24 January, 2023; v1 submitted 25 April, 2022; originally announced April 2022.

Comments: Accepted to the Thirty-Seventh AAAI Conference On Artificial Intelligence (AAAI), 2023

arXiv:2202.00557 [pdf]

Finding the optimal human strategy for Wordle using maximum correct letter probabilities and reinforcement learning

Authors: Benton J. Anderson, Jesse G. Meyer

Abstract: Wordle is an online word puzzle game that gained viral popularity in January 2022. The goal is to guess a hidden five letter word. After each guess, the player gains information about whether the letters they guessed are present in the word, and whether they are in the correct position. Numerous blogs have suggested guessing strategies and starting word lists that improve the chance of winning. Optimized algorithms can win 100% of games within five of the six allowed trials. However, it is infeasible for human players to use these algorithms due to an inability to perfectly recall all known 5-letter words and perform complex calculations that optimize information gain. Here, we present two different methods for choosing starting words along with a framework for discovering the optimal human strategy based on reinforcement learning. Human Wordle players can use the rules we discover to optimize their chance of winning.

Submitted 1 February, 2022; originally announced February 2022.

arXiv:2201.06399 [pdf, other]

Cooperative constrained motion coordination of networked heterogeneous vehicles

Authors: Zhiyong Sun, Marcus Greiff, Anders Robertsson, Rolf Johansson, Brian D. O. Anderson

Abstract: We consider the problem of cooperative motion coordination for multiple heterogeneous mobile vehicles subject to various constraints. These include nonholonomic motion constraints, constant speed constraints, holonomic coordination constraints, and equality/inequality geometric constraints. We develop a general framework involving differential-algebraic equations and viability theory to determine coordination feasibility for a coordinated motion control under heterogeneous vehicle dynamics and different types of coordination task constraints. If a coordinated motion solution exists for the derived differential-algebraic equations and/or inequalities, a constructive algorithm is proposed to derive an equivalent dynamical system that generates a set of feasible coordinated motions for each individual vehicle. In case studies on coordinating two vehicles, we derive analytical solutions to motion generation for two-vehicle groups consisting of car-like vehicles, unicycle vehicles, or vehicles with constant speeds, which serve as benchmark coordination tasks for more complex vehicle groups. The motion generation algorithm is well-backed by simulation data for a wide variety of coordination situations involving heterogeneous vehicles. We then extend the vehicle control framework to deal with the cooperative coordination problem with time-varying coordination tasks and leader-follower structure. We show several simulation experiments on multi-vehicle coordination under various constraints to validate the theory and the effectiveness of the proposed schemes.

Submitted 17 January, 2022; originally announced January 2022.

Comments: 23 pages, 4 figures. Extended version of the paper at IEEE ICRA. Text overlap with arXiv:1809.05509. Submitted to an IEEE journal for publication

arXiv:2112.10988 [pdf, other]

Mapping industrial poultry operations at scale with deep learning and aerial imagery

Authors: Caleb Robinson, Ben Chugg, Brandon Anderson, Juan M. Lavista Ferres, Daniel E. Ho

Abstract: Concentrated Animal Feeding Operations (CAFOs) pose serious risks to air, water, and public health, but have proven to be challenging to regulate. The U.S. Government Accountability Office notes that a basic challenge is the lack of comprehensive location information on CAFOs. We use the USDA's National Agricultural Imagery Program (NAIP) 1m/pixel aerial imagery to detect poultry CAFOs across the continental United States. We train convolutional neural network (CNN) models to identify individual poultry barns and apply the best performing model to over 42 TB of imagery to create the first national, open-source dataset of poultry CAFOs. We validate the model predictions against held-out validation set on poultry CAFO facility locations from 10 hand-labeled counties in California and demonstrate that this approach has significant potential to fill gaps in environmental monitoring.

Submitted 21 December, 2021; originally announced December 2021.

arXiv:2112.06833 [pdf, other]

doi 10.1145/3511265.3550439

Beyond Ads: Sequential Decision-Making Algorithms in Law and Public Policy

Authors: Peter Henderson, Ben Chugg, Brandon Anderson, Daniel E. Ho

Abstract: We explore the promises and challenges of employing sequential decision-making algorithms -- such as bandits, reinforcement learning, and active learning -- in law and public policy. While such algorithms have well-characterized performance in the private sector (e.g., online advertising), the tendency to naively apply algorithms motivated by one domain, often online advertisements, can be called the "advertisement fallacy." Our main thesis is that law and public policy pose distinct methodological challenges that the machine learning community has not yet addressed. Machine learning will need to address these methodological problems to move "beyond ads." Public law, for instance, can pose multiple objectives, necessitate batched and delayed feedback, and require systems to learn rational, causal decision-making policies, each of which presents novel questions at the research frontier. We discuss a wide range of potential applications of sequential decision-making algorithms in regulation and governance, including public health, environmental protection, tax administration, occupational safety, and benefits adjudication. We use these examples to highlight research needed to render sequential decision making policy-compliant, adaptable, and effective in the public sector. We also note the potential risks of such deployments and describe how sequential decision systems can also facilitate the discovery of harms. We hope our work inspires more investigation of sequential decision making in law and public policy, which provide unique challenges for machine learning researchers with potential for significant social benefit.

Submitted 29 November, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

Comments: Version 1 presented at Causal Inference Challenges in Sequential Decision Making: Bridging Theory and Practice (2021), a NeurIPS 2021 Workshop; Version 2 presented at the 2nd ACM Symposium on Computer Science and Law (2022) (DOI: https://dl.acm.org/doi/10.1145/3511265.3550439)

arXiv:2107.11477 [pdf, other]

Plinko: Eliciting beliefs to build better models of statistical learning and mental model updating

Authors: Peter A. V. DiBerardino, Alexandre L. S. Filipowicz, James Danckert, Britt Anderson

Abstract: Prior beliefs are central to Bayesian accounts of cognition, but many of these accounts do not directly measure priors. More specifically, initial states of belief heavily influence how new information is assumed to be utilized when updating a particular model. Despite this, prior and posterior beliefs are either inferred from sequential participant actions or elicited through impoverished means. We had participants play a version of the game "Plinko", to first elicit individual participant priors in a theoretically agnostic manner. Subsequent learning and updating of participant beliefs was then directly measured. We show that participants hold a variety of priors that cluster around prototypical probability distributions that in turn influence learning. In follow-up experiments we show that participant priors are stable over time and that the ability to update beliefs is influenced by a simple environmental manipulation (i.e. a short break). This data reveals the importance of directly measuring participant beliefs rather than assuming or inferring them as has been widely done in the literature to date. The Plinko game provides a flexible and fecund means for examining statistical learning and mental model updating.

Submitted 7 January, 2022; v1 submitted 23 July, 2021; originally announced July 2021.

Comments: Partial rewrite. Added references and further discussion of background and results. Results unchanged

arXiv:2106.15919 [pdf, other]

On joint training with interfaces for spoken language understanding

Authors: Anirudh Raju, Milind Rao, Gautam Tiwari, Pranav Dheram, Bryan Anderson, Zhe Zhang, Chul Lee, Bach Bui, Ariya Rastrow

Abstract: Spoken language understanding (SLU) systems extract both text transcripts and semantics associated with intents and slots from input speech utterances. SLU systems usually consist of (1) an automatic speech recognition (ASR) module, (2) an interface module that exposes relevant outputs from ASR, and (3) a natural language understanding (NLU) module. Interfaces in SLU systems carry information on text transcriptions or richer information like neural embeddings from ASR to NLU. In this paper, we study how interfaces affect joint-training for spoken language understanding. Most notably, we obtain the state-of-the-art results on the publicly available 50-hr SLURP dataset. We first leverage large-size pretrained ASR and NLU models that are connected by a text interface, and then jointly train both models via a sequence loss function. For scenarios where pretrained models are not utilized, the best results are obtained through a joint sequence loss training using richer neural interfaces. Finally, we show the overall diminishing impact of leveraging pretrained models with increased training data size.

Submitted 25 July, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

Comments: Proc. Interspeech 2022

arXiv:2106.00089 [pdf, other]

Node-Variant Graph Filters in Graph Neural Networks

Authors: Fernando Gama, Brendon G. Anderson, Somayeh Sojoudi

Abstract: Graph neural networks (GNNs) have been successfully employed in a myriad of applications involving graph signals. Theoretical findings establish that GNNs use nonlinear activation functions to create low-eigenvalue frequency content that can be processed in a stable manner by subsequent graph convolutional filters. However, the exact shape of the frequency content created by nonlinear functions is not known and cannot be learned. In this work, we use node-variant graph filters (NVGFs) -- which are linear filters capable of creating frequencies -- as a means of investigating the role that frequency creation plays in GNNs. We show that, by replacing nonlinear activation functions by NVGFs, frequency creation mechanisms can be designed or learned. By doing so, the role of frequency creation is separated from the nonlinear nature of traditional GNNs. Simulations on graph signal processing problems are carried out to pinpoint the role of frequency creation.

Submitted 4 March, 2022; v1 submitted 31 May, 2021; originally announced June 2021.

arXiv:2105.14159 [pdf, other]

doi 10.1016/j.jag.2021.102463

Enhancing Environmental Enforcement with Near Real-Time Monitoring: Likelihood-Based Detection of Structural Expansion of Intensive Livestock Farms

Authors: Ben Chugg, Brandon Anderson, Seiji Eicher, Sandy Lee, Daniel E. Ho

Abstract: Much environmental enforcement in the United States has historically relied on either self-reported data or physical, resource-intensive, infrequent inspections. Advances in remote sensing and computer vision, however, have the potential to augment compliance monitoring by detecting early warning signs of noncompliance. We demonstrate a process for rapid identification of significant structural expansion using Planet's 3m/pixel satellite imagery products and focusing on Concentrated Animal Feeding Operations (CAFOs) in the US as a test case. Unpermitted building expansion has been a particular challenge with CAFOs, which pose significant health and environmental risks. Using new hand-labeled dataset of 145,053 images of 1,513 CAFOs, we combine state-of-the-art building segmentation with a likelihood-based change-point detection model to provide a robust signal of building expansion (AUC = 0.86). A major advantage of this approach is that it can work with higher cadence (daily to weekly), but lower resolution (3m/pixel), satellite imagery than previously used in similar environmental settings. It is also highly generalizable and thus provides a near real-time monitoring tool to prioritize enforcement resources in other settings where unpermitted construction poses environmental risk, e.g. zoning, habitat modification, or wetland protection.

Submitted 2 August, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

Journal ref: International Journal of Applied Earth Observation and Geoinformation, Volume 103, 2021, 102463, ISSN 0303-2434

arXiv:2104.08671 [pdf, other]

When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset

Authors: Lucia Zheng, Neel Guha, Brandon R. Anderson, Peter Henderson, Daniel E. Ho

Abstract: While self-supervised learning has made rapid advances in natural language processing, it remains unclear when researchers should engage in resource-intensive domain-specific pretraining (domain pretraining). The law, puzzlingly, has yielded few documented instances of substantial gains to domain pretraining in spite of the fact that legal language is widely seen to be unique. We hypothesize that these existing results stem from the fact that existing legal NLP tasks are too easy and fail to meet conditions for when domain pretraining can help. To address this, we first present CaseHOLD (Case Holdings On Legal Decisions), a new dataset comprised of over 53,000+ multiple choice questions to identify the relevant holding of a cited case. This dataset presents a fundamental task to lawyers and is both legally meaningful and difficult from an NLP perspective (F1 of 0.4 with a BiLSTM baseline). Second, we assess performance gains on CaseHOLD and existing legal NLP datasets. While a Transformer architecture (BERT) pretrained on a general corpus (Google Books and Wikipedia) improves performance, domain pretraining (using corpus of approximately 3.5M decisions across all courts in the U.S. that is larger than BERT's) with a custom legal vocabulary exhibits the most substantial performance gains with CaseHOLD (gain of 7.2% on F1, representing a 12% improvement on BERT) and consistent performance gains across two other legal tasks. Third, we show that domain pretraining may be warranted when the task exhibits sufficient similarity to the pretraining corpus: the level of performance increase in three legal tasks was directly tied to the domain specificity of the task. Our findings inform when researchers should engage resource-intensive pretraining and show that Transformer-based architectures, too, learn embeddings suggestive of distinct legal language.

Submitted 5 July, 2021; v1 submitted 17 April, 2021; originally announced April 2021.

Comments: ICAIL 2021. Code & data available at https://github.com/reglab/casehold

arXiv:2103.09787 [pdf, other]

Temporal Cluster Matching for Change Detection of Structures from Satellite Imagery

Authors: Caleb Robinson, Anthony Ortiz, Juan M. Lavista Ferres, Brandon Anderson, Daniel E. Ho

Abstract: Longitudinal studies are vital to understanding dynamic changes of the planet, but labels (e.g., buildings, facilities, roads) are often available only for a single point in time. We propose a general model, Temporal Cluster Matching (TCM), for detecting building changes in time series of remotely sensed imagery when footprint labels are observed only once. The intuition behind the model is that the relationship between spectral values inside and outside of building's footprint will change when a building is constructed (or demolished). For instance, in rural settings, the pre-construction area may look similar to the surrounding environment until the building is constructed. Similarly, in urban settings, the pre-construction areas will look different from the surrounding environment until construction. We further propose a heuristic method for selecting the parameters of our model which allows it to be applied in novel settings without requiring data labeling efforts (to fit the parameters). We apply our model over a dataset of poultry barns from 2016/2017 high-resolution aerial imagery in the Delmarva Peninsula and a dataset of solar farms from a 2020 mosaic of Sentinel 2 imagery in India. Our results show that our model performs as well when fit using the proposed heuristic as it does when fit with labeled data, and further, that supervised versions of our model perform the best among all the baselines we test against. Finally, we show that our proposed approach can act as an effective data augmentation strategy -- it enables researchers to augment existing structure footprint labels along the time dimension and thus use imagery from multiple points in time to train deep learning models. We show that this improves the spatial generalization of such models when evaluated on the same change detection task.

Submitted 29 June, 2021; v1 submitted 17 March, 2021; originally announced March 2021.

Comments: Published in ACM COMPASS 2021

arXiv:2101.09306 [pdf, other]

Towards Optimal Branching of Linear and Semidefinite Relaxations for Neural Network Robustness Certification

Authors: Brendon G. Anderson, Ziye Ma, Jingqi Li, Somayeh Sojoudi

Abstract: In this paper, we study certifying the robustness of ReLU neural networks against adversarial input perturbations. To diminish the relaxation error suffered by the popular linear programming (LP) and semidefinite programming (SDP) certification methods, we take a branch-and-bound approach to propose partitioning the input uncertainty set and solving the relaxations on each part separately. We show that this approach reduces relaxation error, and that the error is eliminated entirely upon performing an LP relaxation with a partition intelligently designed to exploit the nature of the ReLU activations. To scale this approach to large networks, we consider using a coarser partition whereby the number of parts in the partition is reduced. We prove that computing such a coarse partition that directly minimizes the LP relaxation error is NP-hard. By instead minimizing the worst-case LP relaxation error, we develop a closed-form branching scheme. We extend the analysis to the SDP, where the feasible set geometry is exploited to design a branching scheme that minimizes the worst-case SDP relaxation error. Experiments on MNIST, CIFAR-10, and Wisconsin breast cancer diagnosis classifiers demonstrate significant increases in the percentages of test samples certified. By independently increasing the input size and the number of layers, we empirically illustrate under which regimes the branched LP and branched SDP are best applied.

Submitted 2 February, 2023; v1 submitted 22 January, 2021; originally announced January 2021.

Comments: This is an extension of our IEEE CDC 2020 conference paper arXiv:2004.00570

arXiv:2012.04035 [pdf, other]

ATOM3D: Tasks On Molecules in Three Dimensions

Authors: Raphael J. L. Townshend, Martin Vögele, Patricia Suriana, Alexander Derry, Alexander Powers, Yianni Laloudakis, Sidhika Balachandar, Bowen Jing, Brandon Anderson, Stephan Eismann, Risi Kondor, Russ B. Altman, Ron O. Dror

Abstract: Computational methods that operate on three-dimensional molecular structure have the potential to solve important questions in biology and chemistry. In particular, deep neural networks have gained significant attention, but their widespread adoption in the biomolecular domain has been limited by a lack of either systematic performance benchmarks or a unified toolkit for interacting with molecular data. To address this, we present ATOM3D, a collection of both novel and existing benchmark datasets spanning several key classes of biomolecules. We implement several classes of three-dimensional molecular learning methods for each of these tasks and show that they consistently improve performance relative to methods based on one- and two-dimensional representations. The specific choice of architecture proves to be critical for performance, with three-dimensional convolutional networks excelling at tasks involving complex geometries, graph networks performing well on systems requiring detailed positional information, and the more recently developed equivariant networks showing significant promise. Our results indicate that many molecular problems stand to gain from three-dimensional molecular learning, and that there is potential for improvement on many tasks which remain underexplored. To lower the barrier to entry and facilitate further developments in the field, we also provide a comprehensive suite of tools for dataset processing, model training, and evaluation in our open-source atom3d Python package. All datasets are available for download from https://www.atom3d.ai .

Submitted 15 January, 2022; v1 submitted 7 December, 2020; originally announced December 2020.

Comments: NeurIPS 2021 Datasets and Benchmarks Track

arXiv:2010.07532

Certifying Neural Network Robustness to Random Input Noise from Samples

Authors: Brendon G. Anderson, Somayeh Sojoudi

Abstract: Methods to certify the robustness of neural networks in the presence of input uncertainty are vital in safety-critical settings. Most certification methods in the literature are designed for adversarial input uncertainty, but researchers have recently shown a need for methods that consider random uncertainty. In this paper, we propose a novel robustness certification method that upper bounds the probability of misclassification when the input noise follows an arbitrary probability distribution. This bound is cast as a chance-constrained optimization problem, which is then reformulated using input-output samples to replace the optimization constraints. The resulting optimization reduces to a linear program with an analytical solution. Furthermore, we develop a sufficient condition on the number of samples needed to make the misclassification bound hold with overwhelming probability. Our case studies on MNIST classifiers show that this method is able to certify a uniform infinity-norm uncertainty region with a radius of nearly 50 times larger than what the current state-of-the-art method can certify.

Submitted 25 January, 2023; v1 submitted 15 October, 2020; originally announced October 2020.

Comments: This paper has been superseded by arXiv:2010.01171 (merged from arXiv:2010.01171v1 and arXiv:2010.07532)

arXiv:2010.01171 [pdf, other]

Data-Driven Certification of Neural Networks with Random Input Noise

Authors: Brendon G. Anderson, Somayeh Sojoudi

Abstract: Methods to certify the robustness of neural networks in the presence of input uncertainty are vital in safety-critical settings. Most certification methods in the literature are designed for adversarial or worst-case inputs, but researchers have recently shown a need for methods that consider random input noise. In this paper, we examine the setting where inputs are subject to random noise coming from an arbitrary probability distribution. We propose a robustness certification method that lower-bounds the probability that network outputs are safe. This bound is cast as a chance-constrained optimization problem, which is then reformulated using input-output samples to make the optimization constraints tractable. We develop sufficient conditions for the resulting optimization to be convex, as well as on the number of samples needed to make the robustness bound hold with overwhelming probability. We show for a special case that the proposed optimization reduces to an intuitive closed-form solution. Case studies on synthetic, MNIST, and CIFAR-10 networks experimentally demonstrate that this method is able to certify robustness against various input noise regimes over larger uncertainty regions than prior state-of-the-art techniques.

Submitted 25 January, 2023; v1 submitted 2 October, 2020; originally announced October 2020.

Comments: IEEE Transactions on Control of Network Systems, 2022. This work is a merge of arXiv:2010.01171v1 and arXiv:2010.07532

arXiv:2009.01939 [pdf, other]

Accurate TLS Fingerprinting using Destination Context and Knowledge Bases

Authors: Blake Anderson, David McGrew

Abstract: Network fingerprinting is used to identify applications, provide insight into network traffic, and detect malicious activity. With the broad adoption of TLS, traditional fingerprinting techniques that rely on clear-text data are no longer viable. TLS-specific techniques have been introduced that create a fingerprint string from carefully selected data features in the client_hello to facilitate process identification before data is exchanged. Unfortunately, this approach fails in practice because hundreds of processes can map to the same fingerprint string. We solve this problem by presenting a TLS fingerprinting system that makes use of the destination address, port, and server name in addition to a carefully constructed fingerprint string. The destination context is used to disambiguate the set of processes that match a fingerprint string by applying a weighted naive Bayes classifier, resulting in far greater performance.

Submitted 3 September, 2020; originally announced September 2020.

arXiv:2008.05994 [pdf]

doi 10.1371/journal.pone.0253612

A community-powered search of machine learning strategy space to find NMR property prediction models

Authors: Lars A. Bratholm, Will Gerrard, Brandon Anderson, Shaojie Bai, Sunghwan Choi, Lam Dang, Pavel Hanchar, Addison Howard, Guillaume Huard, Sanghoon Kim, Zico Kolter, Risi Kondor, Mordechai Kornbluth, Youhan Lee, Youngsoo Lee, Jonathan P. Mailoa, Thanh Tu Nguyen, Milos Popovic, Goran Rakocevic, Walter Reade, Wonho Song, Luka Stojanovic, Erik H. Thiede, Nebojsa Tijanic, Andres Torrubia, et al. (4 additional authors not shown)

Abstract: The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of possibilities. Here we outline the results of an online community-powered effort to swarm search the space of ML strategies and develop algorithms for predicting atomic-pairwise nuclear magnetic resonance (NMR) properties in molecules. Using an open-source dataset, we worked with Kaggle to design and host a 3-month competition which received 47,800 ML model predictions from 2,700 teams in 84 countries. Within 3 weeks, the Kaggle community produced models with comparable accuracy to our best previously published "in-house" efforts. A meta-ensemble model constructed as a linear combination of the top predictions has a prediction accuracy which exceeds that of any individual model, 7-19x better than our previous state-of-the-art. The results highlight the potential of transformer architectures for predicting quantum mechanical (QM) molecular properties.

Submitted 13 August, 2020; originally announced August 2020.

arXiv:2006.04780 [pdf, other]

Lorentz Group Equivariant Neural Network for Particle Physics

Authors: Alexander Bogatskiy, Brandon Anderson, Jan T. Offermann, Marwah Roussi, David W. Miller, Risi Kondor

Abstract: We present a neural network architecture that is fully equivariant with respect to transformations under the Lorentz group, a fundamental symmetry of space and time in physics. The architecture is based on the theory of the finite-dimensional representations of the Lorentz group and the equivariant nonlinearity involves the tensor product. For classification tasks in particle physics, we demonstrate that such an equivariant architecture leads to drastically simpler models that have relatively few learnable parameters and are much more physically interpretable than leading approaches that use CNNs and point cloud approaches. The competitive performance of the network is demonstrated on a public classification dataset [27] for tagging top quark decays given energy-momenta of jet constituents produced in proton-proton collisions.

Submitted 8 June, 2020; originally announced June 2020.

arXiv:2004.00570 [pdf, ps, other]

Tightened Convex Relaxations for Neural Network Robustness Certification

Authors: Brendon G. Anderson, Ziye Ma, Jingqi Li, Somayeh Sojoudi

Abstract: In this paper, we consider the problem of certifying the robustness of neural networks to perturbed and adversarial input data. Such certification is imperative for the application of neural networks in safety-critical decision-making and control systems. Certification techniques using convex optimization have been proposed, but they often suffer from relaxation errors that void the certificate. Our work exploits the structure of ReLU networks to improve relaxation errors through a novel partition-based certification procedure. The proposed method is proven to tighten existing linear programming relaxations, and asymptotically achieves zero relaxation error as the partition is made finer. We develop a finite partition that attains zero relaxation error and use the result to derive a tractable partitioning scheme that minimizes the worst-case relaxation error. Experiments using real data show that the partitioning procedure is able to issue robustness certificates in cases where prior methods fail. Consequently, partition-based certification procedures are found to provide an intuitive, effective, and theoretically justified method for tightening existing convex relaxation techniques.

Submitted 17 September, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

Comments: Proceedings of the 59th IEEE Conference on Decision and Control, 2020

arXiv:1911.02549 [pdf, other]

MLPerf Inference Benchmark

Authors: Vijay Janapa Reddi, Christine Cheng, David Kanter, Peter Mattson, Guenther Schmuelling, Carole-Jean Wu, Brian Anderson, Maximilien Breughe, Mark Charlebois, William Chou, Ramesh Chukka, Cody Coleman, Sam Davis, Pan Deng, Greg Diamos, Jared Duke, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Idgunji, Thomas B. Jablin, Jeff Jiao, Tom St. John, Pankaj Kanwar, David Lee, et al. (22 additional authors not shown)

Abstract: Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability.

Submitted 9 May, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

Comments: ISCA 2020

arXiv:1907.04409 [pdf, other]

Global Optimality Guarantees for Nonconvex Unsupervised Video Segmentation

Authors: Brendon G. Anderson, Somayeh Sojoudi

Abstract: In this paper, we consider the problem of unsupervised video object segmentation via background subtraction. Specifically, we pose the nonsemantic extraction of a video's moving objects as a nonconvex optimization problem via a sum of sparse and low-rank matrices. The resulting formulation, a nonnegative variant of robust principal component analysis, is more computationally tractable than its commonly employed convex relaxation, although not generally solvable to global optimality. In spite of this limitation, we derive intuitive and interpretable conditions on the video data under which the uniqueness and global optimality of the object segmentation are guaranteed using local search methods. We illustrate these novel optimality criteria through example segmentations using real video data.

Submitted 22 February, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

Comments: Proceedings of the 57th Annual Allerton Conference on Communication, Control, and Computing, 2019; added funding source information and notation definitions

Journal ref: Proceedings of the 57th Annual Allerton Conference on Communication, Control, and Computing, pp. 965--972, 2019

arXiv:1906.04015 [pdf, ps, other]

Cormorant: Covariant Molecular Neural Networks

Authors: Brandon Anderson, Truong-Son Hy, Risi Kondor

Abstract: We propose Cormorant, a rotationally covariant neural network architecture for learning the behavior and properties of complex many-body physical systems. We apply these networks to molecular systems with two goals: learning atomic potential energy surfaces for use in Molecular Dynamics simulations, and learning ground state properties of molecules calculated by Density Functional Theory. Some of the key features of our network are that (a) each neuron explicitly corresponds to a subset of atoms; (b) the activation of each neuron is covariant to rotations, ensuring that overall the network is fully rotationally invariant. Furthermore, the non-linearity in our network is based upon tensor products and the Clebsch-Gordan decomposition, allowing the network to operate entirely in Fourier space. Cormorant significantly outperforms competing algorithms in learning molecular Potential Energy Surfaces from conformational geometries in the MD-17 dataset, and is competitive with other methods at learning geometric, energetic, electronic, and thermodynamic properties of molecules on the GDB-9 dataset.

Submitted 25 November, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

arXiv:1901.03656 [pdf, other]

Cooperative event-based rigid formation control

Authors: Zhiyong Sun, Qingchen Liu, Na Huang, Changbin Yu, Brian D. O. Anderson

Abstract: This paper discusses cooperative stabilization control of rigid formations via an event-based approach. We first design a centralized event-based formation control system, in which a central event controller determines the next triggering time and broadcasts the event signal to all the agents for control input update. We then build on this approach to propose a distributed event control strategy, in which each agent can use its local event trigger and local information to update the control input at its own event time. For both cases, the triggering condition, event function and triggering behavior are discussed in detail, and the exponential convergence of the event-based formation system is guaranteed.

Submitted 11 January, 2019; originally announced January 2019.

arXiv:1812.05138 [pdf, other]

Consensus and Disagreement of Heterogeneous Belief Systems in Influence Networks

Authors: Mengbin Ye, Ji Liu, Lili Wang, Brian D. O. Anderson, Ming Cao

Abstract: Recently, an opinion dynamics model has been proposed to describe a network of individuals discussing a set of logically interdependent topics. For each individual, the set of topics and the logical interdependencies between the topics (captured by a logic matrix) form a belief system. We investigate the role the logic matrix and its structure play in determining the final opinions, including existence of the limiting opinions, of a strongly connected network of individuals. We provide a set of results that, given a set of individuals' belief systems, allow a systematic determination of which topics will reach a consensus, and in which topics disagreement will arise. For irreducible logic matrices, each topic reaches a consensus. For reducible logic matrices, which indicate a cascade interdependence relationship, conditions are given on whether a topic will reach a consensus or not. It turns out that heterogeneity among the individuals' logic matrices, including especially differences in the signs of the off-diagonal entries, can be a key determining factor. This paper thus attributes, for the first time, a strong diversity of limiting opinions to heterogeneity of belief systems in influence networks, in addition to the more typical explanation that strong diversity arises from individual stubbornness.

Submitted 12 December, 2018; originally announced December 2018.

Comments: Submitted journal paper

arXiv:1810.00182 [pdf, other]

Collaborative target-tracking control using multiple autonomous fixed-wing UAVs with constant speeds

Authors: Zhiyong Sun, Hector Garcia de Marina, Brian D. O. Anderson, Changbin Yu

Abstract: This paper considers a collaborative tracking control problem using a group of fixed-wing unmanned aerial vehicles (UAVs) with constant and non-identical speeds. The dynamics of fixed-wing UAVs are modelled by unicycle-type equations with nonholonomic constraints, assuming that UAVs fly at constant altitudes in the nominal operation mode. The controller is designed such that all fixed-wing UAVs as a group can collaboratively track a desired target's position and velocity. We first present conditions on the relative speeds of tracking UAVs and the target to ensure that the tracking objective can be achieved when UAVs are subject to constant speed constraints. We construct a reference velocity that includes both the target's velocity and position as feedback, which is to be tracked by the group centroid. In this way, all vehicles' headings are controlled such that the group centroid follows a reference trajectory that successfully tracks the target's trajectory. A spacing controller is further devised to ensure that all vehicles stay close to the group centroid trajectory. Trade-offs in the controller design and performance limitations of the target tracking control due to the constant-speed constraint are also discussed in detail. Experimental results with three fixed-wing UAVs tracking a target rotorcraft are provided.

Submitted 2 September, 2020; v1 submitted 29 September, 2018; originally announced October 2018.

Comments: 33 pages (single column). To be published in the AIAA Journal of Guidance, Dynamics, and Control

arXiv:1806.11236 [pdf, other]

An Influence Network Model to Study Discrepancies in Expressed and Private Opinions

Authors: Mengbin Ye, Yuzhen Qin, Alain Govaert, Brian D. O. Anderson, Ming Cao

Abstract: In many social situations, a discrepancy arises between an individual's private and expressed opinions on a given topic. Motivated by Solomon Asch's seminal experiments on social conformity and other related socio-psychological works, we propose a novel opinion dynamics model to study how such a discrepancy can arise in general social networks of interpersonal influence. Each individual in the network has both a private and an expressed opinion: an individual's private opinion evolves under social influence from the expressed opinions of the individual's neighbours, while the individual determines his or her expressed opinion under a pressure to conform to the average expressed opinion of his or her neighbours, termed the local public opinion. General conditions on the network that guarantee exponentially fast convergence of the opinions to a limit are obtained. Further analysis of the limit yields several semi-quantitative conclusions, which have insightful social interpretations, including the establishing of conditions that ensure every individual in the network has such a discrepancy. Last, we show the generality and validity of the model by using it to explain and predict the results of Solomon Asch's seminal experiments.

Submitted 22 February, 2019; v1 submitted 28 June, 2018; originally announced June 2018.

Comments: An extended version of a provisionally accepted Automatica regular paper, with extra simulations and additional commentary on Asch's experiments

arXiv:1805.11544 [pdf, other]

Limitless HTTP in an HTTPS World: Inferring the Semantics of the HTTPS Protocol without Decryption

Authors: Blake Anderson, Andrew Chi, Scott Dunlop, David McGrew

Abstract: We present new analytic techniques for inferring HTTP semantics from passive observations of HTTPS that can infer the value of important fields including the status-code, Content-Type, and Server, and the presence or absence of several additional HTTP header fields, e.g., Cookie and Referer. Our goals are twofold: to better understand the limitations of the confidentiality of HTTPS, and to explore benign uses of traffic analysis such as application troubleshooting and malware detection that could replace HTTPS interception and static private keys in some scenarios. We found that our techniques improve the efficacy of malware detection, but they do not enable more powerful website fingerprinting attacks against Tor. Our broader set of results raises concerns about the confidentiality goals of TLS relative to a user's expectation of privacy, warranting future research. We apply our methods to the semantics of both HTTP/1.1 and HTTP/2 on data collected from automated runs of Firefox 58.0, Chrome 63.0, and Tor Browser 7.0.11 in a lab setting, and from applications running in a malware sandbox. We obtain ground truth plaintext for a diverse set of applications from the malware sandbox by extracting the key material needed for decryption from RAM post-execution. We developed an iterative approach to simultaneously solve several multi-class (field values) and binary (field presence) classification problems, and we show that our inference algorithm achieves an unweighted $F_1$ score greater than 0.900 for most HTTP fields examined.

Submitted 29 May, 2018; originally announced May 2018.

arXiv:1805.02836 [pdf, other]

Continuous-time Opinion Dynamics on Multiple Interdependent Topics

Authors: Mengbin Ye, Minh Hoang Trinh, Young-Hun Lim, Brian D. O. Anderson, Hyo-Sung Ahn

Abstract: In this paper, and inspired by the recent discrete-time model in [1,2], we study two continuous-time opinion dynamics models (Model 1 and Model 2) where the individuals discuss opinions on multiple logically interdependent topics. The logical interdependence between the different topics is captured by a `logic' matrix, which is distinct from the Laplacian matrix capturing interactions between individuals. For each of Model 1 and Model 2, we obtain a necessary and sufficient condition for the network to reach a consensus on each separate topic. The condition on Model 1 involves a combination of the eigenvalues of the logic matrix and Laplacian matrix, whereas the condition on Model 2 requires only separate conditions on the logic matrix and Laplacian matrix. Further investigation of Model 1 yields two sufficient conditions for consensus, and allows us to conclude that one way to guarantee a consensus is to reduce the rate of interaction between individuals exchanging opinions. By placing further restrictions on the logic matrix, we also establish a set of Laplacian matrices which guarantee consensus for Model 1. The two models are also expanded to include stubborn individuals, who remain attached to their initial opinions. Sufficient conditions are obtained for guaranteeing convergence of the opinion dynamics system, with the final opinions generally being at a persistent disagreement. Simulations are provided to illustrate the results.

Submitted 11 January, 2020; v1 submitted 8 May, 2018; originally announced May 2018.

Comments: Extended version of a journal paper submission, with detailed proofs and additional discussion and simulations

arXiv:1804.04317 [pdf, other]

Cooperative Localisation of a GPS-Denied UAV using Direction of Arrival Measurements

Authors: James S. Russell, Mengbin Ye, Brian D. O. Anderson, Hatem Hmam, Peter Sarunic

Abstract: A GPS-denied UAV (Agent B) is localised through INS alignment with the aid of a nearby GPS-equipped UAV (Agent A), which broadcasts its position at several time instants. Agent B measures the signals' direction of arrival with respect to Agent B's inertial navigation frame. Semidefinite programming and the Orthogonal Procrustes algorithm are employed, and accuracy is improved through maximum likelihood estimation. The method is validated using flight data and simulations. A three-agent extension is explored.

Submitted 20 November, 2018; v1 submitted 12 April, 2018; originally announced April 2018.

Comments: 13 pages, 11 figures, this is an extended version to an imminent submission to IEEE Transactions on Aerospace and Electronic Systems. arXiv admin note: text overlap with arXiv:1703.06261

arXiv:1802.08751 [pdf, ps, other]

A Generalized Discrete-Time Altafini Model

Authors: L. Wang, J. Liu, A. S. Morse, B. D. O. Anderson, D. Fullmer

Abstract: A discrete-time modulus consensus model is considered in which the interaction among a family of networked agents is described by a time-dependent gain graph whose vertices correspond to agents and whose arcs are assigned complex numbers from a cyclic group. Limiting behavior of the model is studied using a graphical approach. It is shown that, under appropriate connectedness, a certain type of clustering will be reached exponentially fast for almost all initial conditions if and only if the sequence of gain graphs is "repeatedly jointly structurally balanced" corresponding to that type of clustering, where the number of clusters is at most the order of a cyclic group. It is also shown that the model will reach a consensus asymptotically at zero if the sequence of gain graphs is repeatedly jointly strongly connected and structurally unbalanced. In the special case when the cyclic group is of order two, the model simplifies to the so-called Altafini model whose gain graph is simply a signed graph.

Submitted 23 February, 2018; originally announced February 2018.

Comments: 7 pages, 3 figures, ECC paper

arXiv:1801.02144 [pdf, other]

Covariant Compositional Networks For Learning Graphs

Authors: Risi Kondor, Hy Truong Son, Horace Pan, Brandon Anderson, Shubhendu Trivedi

Abstract: Most existing neural networks for learning graphs address permutation invariance by conceiving of the network as a message passing scheme, where each node sums the feature vectors coming from its neighbors. We argue that this imposes a limitation on their representation power, and instead propose a new general architecture for representing objects consisting of a hierarchy of parts, which we call Covariant Compositional Networks (CCNs). Here, covariance means that the activation of each neuron must transform in a specific way under permutations, similarly to steerability in CNNs. We achieve covariance by making each activation transform according to a tensor representation of the permutation group, and derive the corresponding tensor aggregation rules that each neuron must implement. Experiments show that CCNs can outperform competing methods on standard graph learning benchmarks.

Submitted 7 January, 2018; originally announced January 2018.

arXiv:1711.00793 [pdf, other]

3D Mobile Localization Using Distance-only Measurements

Authors: Bomin Jiang, Brian D. O. Anderson, Hatem Hmam

Abstract: For a group of cooperating UAVs, localizing each other is often a key task. This paper studies the localization problem for a group of UAVs flying in 3D space with very limited information, i.e., when noisy distance measurements are the only type of inter-agent sensing that is available, and when only one UAV knows a global coordinate basis, the others being GPS-denied. Initially for a two-agent problem, but easily generalized to some multi-agent problems, constraints are established on the minimum number of distance measurements required to achieve the localization. The paper also proposes an algorithm based on semidefinite programming (SDP), followed by maximum likelihood estimation using a gradient descent initialized from the SDP calculation. The efficacy of the algorithm is verified with experimental noisy flight data.

Submitted 2 November, 2017; originally announced November 2017.

Comments: Submitted to IEEE Transactions on Aerospace and Electronic Systems

arXiv:1709.10154 [pdf, other]

Finite-Time Distributed Linear Equation Solver for Minimum $l_1$ Norm Solutions

Authors: Jingqiu Zhou, Wang Xuan, Shaoshuai Mou, Brian D. O. Anderson

Abstract: This paper proposes distributed algorithms for multi-agent networks to achieve a solution in finite time to a linear equation $Ax=b$ where $A$ has full row rank, and with the minimum $l_1$-norm in the underdetermined case (where $A$ has more columns than rows). The underlying network is assumed to be undirected and fixed, and an analytical proof is provided for the proposed algorithm to drive all agents' individual states to converge to a common value, viz a solution of $Ax=b$, which is the minimum $l_1$-norm solution in the underdetermined case. Numerical simulations are also provided as validation of the proposed algorithms.

Submitted 28 September, 2017; originally announced September 2017.

arXiv:1709.08840 [pdf, other]

Nonlinear Mapping Convergence and Application to Social Networks

Authors: Brian D. O. Anderson, Mengbin Ye

Abstract: This paper discusses discrete-time maps of the form $x(k + 1) = F(x(k))$, focussing on equilibrium points of such maps. Under some circumstances, Lefschetz fixed-point theory can be used to establish the existence of a single locally attractive equilibrium (which is sometimes globally attractive) when a general property of local attractivity is known for any equilibrium. Problems in social networks often involve such discrete-time systems, and we make an application to one such problem.

Submitted 30 September, 2017; v1 submitted 26 September, 2017; originally announced September 2017.

Comments: Submission to European Control Conference 2018

Showing 1–50 of 72 results for author: Anderson, B