Search | arXiv e-print repository

arXiv:2211.12035 [pdf, other]

FastFlow: AI for Fast Urban Wind Velocity Prediction

Authors: Shi Jer Low, Venugopalan, S. G. Raghavan, Harish Gopalan, Jian Cheng Wong, Justin Yeoh, Chin Chun Ooi

Abstract: Data-driven approaches, including deep learning, have shown great promise as surrogate models across many domains. These extend to various areas in sustainability. An interesting direction for which data-driven methods have not been applied much yet is in the quick quantitative evaluation of urban layouts for planning and design. In particular, urban designs typically involve complex trade-offs be… ▽ More Data-driven approaches, including deep learning, have shown great promise as surrogate models across many domains. These extend to various areas in sustainability. An interesting direction for which data-driven methods have not been applied much yet is in the quick quantitative evaluation of urban layouts for planning and design. In particular, urban designs typically involve complex trade-offs between multiple objectives, including limits on urban build-up and/or consideration of urban heat island effect. Hence, it can be beneficial to urban planners to have a fast surrogate model to predict urban characteristics of a hypothetical layout, e.g. pedestrian-level wind velocity, without having to run computationally expensive and time-consuming high-fidelity numerical simulations. This fast surrogate can then be potentially integrated into other design optimization frameworks, including generative models or other gradient-based methods. Here we present the use of CNNs for urban layout characterization that is typically done via high-fidelity numerical simulation. We further apply this model towards a first demonstration of its utility for data-driven pedestrian-level wind velocity prediction. The data set in this work comprises results from high-fidelity numerical simulations of wind velocities for a diverse set of realistic urban layouts, based on randomized samples from a real-world, highly built-up urban city. We then provide prediction results obtained from the trained CNN, demonstrating test errors of under 0.1 m/s for previously unseen urban layouts. We further illustrate how this can be useful for purposes such as rapid evaluation of pedestrian wind velocity for a potential new layout. It is hoped that this data set will further accelerate research in data-driven urban AI, even as our baseline model facilitates quantitative comparison to future methods. △ Less

Submitted 22 November, 2022; originally announced November 2022.

arXiv:2205.07110 [pdf, other]

SystemMatch: optimizing preclinical drug models to human clinical outcomes via generative latent-space matching

Authors: Scott Gigante, Varsha G. Raghavan, Amanda M. Robinson, Robert A. Barton, Adeeb H. Rahman, Drausin F. Wulsin, Jacques Banchereau, Noam Solomon, Luis F. Voloch, Fabian J. Theis

Abstract: Translating the relevance of preclinical models ($\textit{in vitro}$, animal models, or organoids) to their relevance in humans presents an important challenge during drug development. The rising abundance of single-cell genomic data from human tumors and tissue offers a new opportunity to optimize model systems by their similarity to targeted human cell types in disease. In this work, we introduc… ▽ More Translating the relevance of preclinical models ($\textit{in vitro}$, animal models, or organoids) to their relevance in humans presents an important challenge during drug development. The rising abundance of single-cell genomic data from human tumors and tissue offers a new opportunity to optimize model systems by their similarity to targeted human cell types in disease. In this work, we introduce SystemMatch to assess the fit of preclinical model systems to an $\textit{in sapiens}$ target population and to recommend experimental changes to further optimize these systems. We demonstrate this through an application to develo** $\textit{in vitro}$ systems to model human tumor-derived suppressive macrophages. We show with held-out $\textit{in vivo}$ controls that our pipeline successfully ranks macrophage subpopulations by their biological similarity to the target population, and apply this analysis to rank a series of 18 $\textit{in vitro}$ macrophage systems perturbed with a variety of cytokine stimulations. We extend this analysis to predict the behavior of 66 $\textit{in silico}$ model systems generated using a perturbational autoencoder and apply a $k$-medoids approach to recommend a subset of these model systems for further experimental development in order to fully explore the space of possible perturbations. Through this use case, we demonstrate a novel approach to model system development to generate a system more similar to human biology. △ Less

Submitted 14 May, 2022; originally announced May 2022.

Comments: Published at the MLDD workshop, ICLR 2022

arXiv:2205.00334 [pdf, other]

Engineering flexible machine learning systems by traversing functionally-invariant paths

Authors: Guruprasad Raghavan, Bahey Tharwat, Surya Narayanan Hari, Dhruvil Satani, Matt Thomson

Abstract: Transformers have emerged as the state of the art neural network architecture for natural language processing and computer vision. In the foundation model paradigm, large transformer models (BERT, GPT3/4, Bloom, ViT) are pre-trained on self-supervised tasks such as word or image masking, and then, adapted through fine-tuning for downstream user applications including instruction following and Ques… ▽ More Transformers have emerged as the state of the art neural network architecture for natural language processing and computer vision. In the foundation model paradigm, large transformer models (BERT, GPT3/4, Bloom, ViT) are pre-trained on self-supervised tasks such as word or image masking, and then, adapted through fine-tuning for downstream user applications including instruction following and Question Answering. While many approaches have been developed for model fine-tuning including low-rank weight update strategies (eg. LoRA), underlying mathematical principles that enable network adaptation without knowledge loss remain poorly understood. Here, we introduce a differential geometry framework, functionally invariant paths (FIP), that provides flexible and continuous adaptation of neural networks for a range of machine learning goals and network sparsification objectives. We conceptualize the weight space of a neural network as a curved Riemannian manifold equipped with a metric tensor whose spectrum defines low rank subspaces in weight space that accommodate network adaptation without loss of prior knowledge. We formalize adaptation as movement along a geodesic path in weight space while searching for networks that accommodate secondary objectives. With modest computational resources, the FIP algorithm achieves comparable to state of the art performance on continual learning and sparsification tasks for language models (BERT), vision transformers (ViT, DeIT), and the CNNs. Broadly, we conceptualize a neural network as a mathematical object that can be iteratively transformed into distinct configurations by the path-sampling algorithm to define a sub-manifold of weight space that can be harnessed to achieve user goals. △ Less

Submitted 3 September, 2023; v1 submitted 30 April, 2022; originally announced May 2022.

Comments: 22 pages

arXiv:2112.04963 [pdf, other]

Model-Agnostic Hybrid Numerical Weather Prediction and Machine Learning Paradigm for Solar Forecasting in the Tropics

Authors: Nigel Yuan Yun Ng, Harish Gopalan, Venugopalan S. G. Raghavan, Chin Chun Ooi

Abstract: Numerical weather prediction (NWP) and machine learning (ML) methods are popular for solar forecasting. However, NWP models have multiple possible physical parameterizations, which requires site-specific NWP optimization. This is further complicated when regional NWP models are used with global climate models with different possible parameterizations. In this study, an alternative approach is prop… ▽ More Numerical weather prediction (NWP) and machine learning (ML) methods are popular for solar forecasting. However, NWP models have multiple possible physical parameterizations, which requires site-specific NWP optimization. This is further complicated when regional NWP models are used with global climate models with different possible parameterizations. In this study, an alternative approach is proposed and evaluated for four radiation models. Weather Research and Forecasting (WRF) model is run in both global and regional mode to provide an estimate for solar irradiance. This estimate is then post-processed using ML to provide a final prediction. Normalized root-mean-square error from WRF is reduced by up to 40-50% with this ML error correction model. Results obtained using CAM, GFDL, New Goddard and RRTMG radiation models were comparable after this correction, negating the need for WRF parameterization tuning. Other models incorporating nearby locations and sensor data are also evaluated, with the latter being particularly promising. △ Less

Submitted 9 December, 2021; originally announced December 2021.

arXiv:2106.02793 [pdf, other]

Solving hybrid machine learning tasks by traversing weight space geodesics

Authors: Guruprasad Raghavan, Matt Thomson

Abstract: Machine learning problems have an intrinsic geometric structure as central objects including a neural network's weight space and the loss function associated with a particular task can be viewed as encoding the intrinsic geometry of a given machine learning problem. Therefore, geometric concepts can be applied to analyze and understand theoretical properties of machine learning strategies as well… ▽ More Machine learning problems have an intrinsic geometric structure as central objects including a neural network's weight space and the loss function associated with a particular task can be viewed as encoding the intrinsic geometry of a given machine learning problem. Therefore, geometric concepts can be applied to analyze and understand theoretical properties of machine learning strategies as well as to develop new algorithms. In this paper, we address three seemingly unrelated open questions in machine learning by viewing them through a unified framework grounded in differential geometry. Specifically, we view the weight space of a neural network as a manifold endowed with a Riemannian metric that encodes performance on specific tasks. By defining a metric, we can construct geodesic, minimum length, paths in weight space that represent sets of networks of equivalent or near equivalent functional performance on a specific task. We, then, traverse geodesic paths while identifying networks that satisfy a second objective. Inspired by the geometric insight, we apply our geodesic framework to 3 major applications: (i) Network sparsification (ii) Mitigating catastrophic forgetting by constructing networks with high performance on a series of objectives and (iii) Finding high-accuracy paths connecting distinct local optima of deep networks in the non-convex loss landscape. Our results are obtained on a wide range of network architectures (MLP, VGG11/16) trained on MNIST, CIFAR-10/100. Broadly, we introduce a geometric framework that unifies a range of machine learning objectives and that can be applied to multiple classes of neural network architectures. △ Less

Submitted 5 June, 2021; originally announced June 2021.

Comments: 11 pages, 7 figures

arXiv:2105.05495 [pdf, other]

LipBaB: Computing exact Lipschitz constant of ReLU networks

Authors: Aritra Bhowmick, Meenakshi D'Souza, G. Srinivasa Raghavan

Abstract: The Lipschitz constant of neural networks plays an important role in several contexts of deep learning ranging from robustness certification and regularization to stability analysis of systems with neural network controllers. Obtaining tight bounds of the Lipschitz constant is therefore important. We introduce LipBaB, a branch and bound framework to compute certified bounds of the local Lipschitz… ▽ More The Lipschitz constant of neural networks plays an important role in several contexts of deep learning ranging from robustness certification and regularization to stability analysis of systems with neural network controllers. Obtaining tight bounds of the Lipschitz constant is therefore important. We introduce LipBaB, a branch and bound framework to compute certified bounds of the local Lipschitz constant of deep neural networks with ReLU activation functions up to any desired precision. We achieve this by bounding the norm of the Jacobians, corresponding to different activation patterns of the network caused within the input domain. Our algorithm can provide provably exact computation of the Lipschitz constant for any p-norm. △ Less

Submitted 6 July, 2021; v1 submitted 12 May, 2021; originally announced May 2021.

arXiv:2012.09605 [pdf, other]

Sparsifying networks by traversing Geodesics

Authors: Guruprasad Raghavan, Matt Thomson

Abstract: The geometry of weight spaces and functional manifolds of neural networks play an important role towards 'understanding' the intricacies of ML. In this paper, we attempt to solve certain open questions in ML, by viewing them through the lens of geometry, ultimately relating it to the discovery of points or paths of equivalent function in these spaces. We propose a mathematical framework to evaluat… ▽ More The geometry of weight spaces and functional manifolds of neural networks play an important role towards 'understanding' the intricacies of ML. In this paper, we attempt to solve certain open questions in ML, by viewing them through the lens of geometry, ultimately relating it to the discovery of points or paths of equivalent function in these spaces. We propose a mathematical framework to evaluate geodesics in the functional space, to find high-performance paths from a dense network to its sparser counterpart. Our results are obtained on VGG-11 trained on CIFAR-10 and MLP's trained on MNIST. Broadly, we demonstrate that the framework is general, and can be applied to a wide variety of problems, ranging from sparsification to alleviating catastrophic forgetting. △ Less

Submitted 12 December, 2020; originally announced December 2020.

Comments: 5 pages; Presented work at NeurIPS 2020 Workshop (DiffGeo4DL). arXiv admin note: text overlap with arXiv:2005.11603

arXiv:2011.02712 [pdf, other]

Architecture Agnostic Neural Networks

Authors: Sabera Talukder, Guruprasad Raghavan, Yisong Yue

Abstract: In this paper, we explore an alternate method for synthesizing neural network architectures, inspired by the brain's stochastic synaptic pruning. During a person's lifetime, numerous distinct neuronal architectures are responsible for performing the same tasks. This indicates that biological neural networks are, to some degree, architecture agnostic. However, artificial networks rely on their fine… ▽ More In this paper, we explore an alternate method for synthesizing neural network architectures, inspired by the brain's stochastic synaptic pruning. During a person's lifetime, numerous distinct neuronal architectures are responsible for performing the same tasks. This indicates that biological neural networks are, to some degree, architecture agnostic. However, artificial networks rely on their fine-tuned weights and hand-crafted architectures for their remarkable performance. This contrast begs the question: Can we build artificial architecture agnostic neural networks? To ground this study we utilize sparse, binary neural networks that parallel the brain's circuits. Within this sparse, binary paradigm we sample many binary architectures to create families of architecture agnostic neural networks not trained via backpropagation. These high-performing network families share the same sparsity, distribution of binary weights, and succeed in both static and dynamic tasks. In summation, we create an architecture manifold search procedure to discover families or architecture agnostic neural networks. △ Less

Submitted 11 December, 2020; v1 submitted 5 November, 2020; originally announced November 2020.

arXiv:2006.06902 [pdf, other]

Self-organization of multi-layer spiking neural networks

Authors: Guruprasad Raghavan, Cong Lin, Matt Thomson

Abstract: Living neural networks in our brains autonomously self-organize into large, complex architectures during early development to result in an organized and functional organic computational device. A key mechanism that enables the formation of complex architecture in the develo** brain is the emergence of traveling spatio-temporal waves of neuronal activity across the growing brain. Inspired by this… ▽ More Living neural networks in our brains autonomously self-organize into large, complex architectures during early development to result in an organized and functional organic computational device. A key mechanism that enables the formation of complex architecture in the develo** brain is the emergence of traveling spatio-temporal waves of neuronal activity across the growing brain. Inspired by this strategy, we attempt to efficiently self-organize large neural networks with an arbitrary number of layers into a wide variety of architectures. To achieve this, we propose a modular tool-kit in the form of a dynamical system that can be seamlessly stacked to assemble multi-layer neural networks. The dynamical system encapsulates the dynamics of spiking units, their inter/intra layer interactions as well as the plasticity rules that control the flow of information between layers. The key features of our tool-kit are (1) autonomous spatio-temporal waves across multiple layers triggered by activity in the preceding layer and (2) Spike-timing dependent plasticity (STDP) learning rules that update the inter-layer connectivity based on wave activity in the connecting layers. Our framework leads to the self-organization of a wide variety of architectures, ranging from multi-layer perceptrons to autoencoders. We also demonstrate that emergent waves can self-organize spiking network architecture to perform unsupervised learning, and networks can be coupled with a linear classifier to perform classification on classic image datasets like MNIST. Broadly, our work shows that a dynamical systems framework for learning can be used to self-organize large computational devices. △ Less

Submitted 11 June, 2020; originally announced June 2020.

Comments: 11 pages, 4 figures

arXiv:2005.11603 [pdf, other]

Geometric algorithms for predicting resilience and recovering damage in neural networks

Authors: Guruprasad Raghavan, Jiayi Li, Matt Thomson

Abstract: Biological neural networks have evolved to maintain performance despite significant circuit damage. To survive damage, biological network architectures have both intrinsic resilience to component loss and also activate recovery programs that adjust network weights through plasticity to stabilize performance. Despite the importance of resilience in technology applications, the resilience of artific… ▽ More Biological neural networks have evolved to maintain performance despite significant circuit damage. To survive damage, biological network architectures have both intrinsic resilience to component loss and also activate recovery programs that adjust network weights through plasticity to stabilize performance. Despite the importance of resilience in technology applications, the resilience of artificial neural networks is poorly understood, and autonomous recovery algorithms have yet to be developed. In this paper, we establish a mathematical framework to analyze the resilience of artificial neural networks through the lens of differential geometry. Our geometric language provides natural algorithms that identify local vulnerabilities in trained networks as well as recovery algorithms that dynamically adjust networks to compensate for damage. We reveal striking vulnerabilities in commonly used image analysis networks, like MLP's and CNN's trained on MNIST and CIFAR10 respectively. We also uncover high-performance recovery paths that enable the same networks to dynamically re-adjust their parameters to compensate for damage. Broadly, our work provides procedures that endow artificial systems with resilience and rapid-recovery routines to enhance their integration with IoT devices as well as enable their deployment for critical applications. △ Less

Submitted 2 June, 2020; v1 submitted 23 May, 2020; originally announced May 2020.

Comments: 10 pages and 4 figures

arXiv:1906.01039 [pdf, other]

Neural networks grown and self-organized by noise

Authors: Guruprasad Raghavan, Matt Thomson

Abstract: Living neural networks emerge through a process of growth and self-organization that begins with a single cell and results in a brain, an organized and functional computational device. Artificial neural networks, however, rely on human-designed, hand-programmed architectures for their remarkable performance. Can we develop artificial computational devices that can grow and self-organize without hu… ▽ More Living neural networks emerge through a process of growth and self-organization that begins with a single cell and results in a brain, an organized and functional computational device. Artificial neural networks, however, rely on human-designed, hand-programmed architectures for their remarkable performance. Can we develop artificial computational devices that can grow and self-organize without human intervention? In this paper, we propose a biologically inspired developmental algorithm that can 'grow' a functional, layered neural network from a single initial cell. The algorithm organizes inter-layer connections to construct a convolutional pooling layer, a key constituent of convolutional neural networks (CNN's). Our approach is inspired by the mechanisms employed by the early visual system to wire the retina to the lateral geniculate nucleus (LGN), days before animals open their eyes. The key ingredients for robust self-organization are an emergent spontaneous spatiotemporal activity wave in the first layer and a local learning rule in the second layer that 'learns' the underlying activity pattern in the first layer. The algorithm is adaptable to a wide-range of input-layer geometries, robust to malfunctioning units in the first layer, and so can be used to successfully grow and self-organize pooling architectures of different pool-sizes and shapes. The algorithm provides a primitive procedure for constructing layered neural networks through growth and self-organization. Broadly, our work shows that biologically inspired developmental algorithms can be applied to autonomously grow functional 'brains' in-silico. △ Less

Submitted 3 June, 2019; originally announced June 2019.

Comments: 21 pages (including 11 pages of appendix)

Showing 1–11 of 11 results for author: Raghavan, G