-
Metric Flow Matching for Smooth Interpolations on the Data Manifold
Authors:
Kacper Kapusniak,
Peter Potaptchik,
Teodora Reu,
Leo Zhang,
Alexander Tong,
Michael Bronstein,
Avishek Joey Bose,
Francesco Di Giovanni
Abstract:
Matching objectives underpin the success of modern generative models and rely on constructing conditional paths that transform a source distribution into a target distribution. Despite being a fundamental building block, conditional paths have been designed principally under the assumption of Euclidean geometry, resulting in straight interpolations. However, this can be particularly restrictive for tasks such as trajectory inference, where straight paths might lie outside the data manifold, thus failing to capture the underlying dynamics giving rise to the observed marginals. In this paper, we propose Metric Flow Matching (MFM), a novel simulation-free framework for conditional flow matching where interpolants are approximate geodesics learned by minimizing the kinetic energy of a data-induced Riemannian metric. This way, the generative model matches vector fields on the data manifold, which corresponds to lower uncertainty and more meaningful interpolations. We prescribe general metrics to instantiate MFM, independent of the task, and test it on a suite of challenging problems including LiDAR navigation, unpaired image translation, and modeling cellular dynamics. We observe that MFM outperforms the Euclidean baselines, particularly achieving SOTA on single-cell trajectory prediction.
Submitted 23 May, 2024;
originally announced May 2024.
-
Spatio-temporal patterns of diurnal temperature: a random matrix approach I-case of India
Authors:
Madhuchhanda Bhattacharjee,
Arup Bose
Abstract:
We consider the spatio-temporal gridded daily diurnal temperature range (DTR) data across India during the 72-year period 1951--2022. We augment this data with information on the El Nino-Southern Oscillation (ENSO) and on the climatic regions (Stamp's and Koeppen's classification) and four seasons of India.
We use various matrix theory approaches to trim out strong but routine signals, random matrix theory to remove noise, and novel empirical generalised singular-value distributions to establish retention of essential signals in the trimmed data. We make use of the spatial Bergsma statistics to measure spatial association and identify temporal change points in the spatial-association.
In particular, our investigation captures a yet unknown change-point over the 72 years under study with drastic changes in spatial-association of DTR in India. It also brings out changes in spatial association with regard to ENSO.
We conclude that while studying/modelling Indian DTR data, due consideration should be granted to the strong spatial association that is being persistently exhibited over decades, and provision should be kept for potential change points in the temporal behaviour, which in turn can bring moderate to dramatic changes in the spatial association pattern.
Some of our analysis also reaffirms the conclusions made by other authors, regarding spatial and temporal behavior of DTR, adding our own insights. We consider the data from the yearly, seasonal and climatic zones points of view, and discover several new and interesting statistical structures which should be of interest, especially to climatologists and statisticians. Our methods are not country specific and could be used profitably for DTR data from other geographical areas.
Submitted 17 April, 2024;
originally announced April 2024.
-
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
Authors:
Tara Akhound-Sadegh,
Jarrid Rector-Brooks,
Avishek Joey Bose,
Sarthak Mittal,
Pablo Lemos,
Cheng-Hao Liu,
Marcin Sendera,
Siamak Ravanbakhsh,
Gauthier Gidel,
Yoshua Bengio,
Nikolay Malkin,
Alexander Tong
Abstract:
Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient -- and no data samples -- to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions as the inner matching objective is simulation-free and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant $n$-body particle systems. We show that the proposed approach achieves state-of-the-art performance on all metrics and trains $2-5\times$ faster, which allows it to be the first method to train using energy on the challenging $55$-particle Lennard-Jones system.
Submitted 26 June, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
On the Stability of Iterative Retraining of Generative Models on their own Data
Authors:
Quentin Bertrand,
Avishek Joey Bose,
Alexandre Duplessis,
Marco Jiralerspong,
Gauthier Gidel
Abstract:
Deep generative models have made tremendous progress in modeling complex data, often exhibiting generation quality that surpasses a typical human's ability to discern the authenticity of samples. Undeniably, a key driver of this success is enabled by the massive amounts of web-scale data consumed by these models. Due to these models' striking performance and ease of availability, the web will inevitably be increasingly populated with synthetic content. Such a fact directly implies that future iterations of generative models will be trained on both clean and artificially generated data from past models. In this paper, we develop a framework to rigorously study the impact of training generative models on mixed datasets -- from classical training on real data to self-consuming generative models trained on purely synthetic data. We first prove the stability of iterative training under the condition that the initial generative models approximate the data distribution well enough and the proportion of clean training data (w.r.t. synthetic data) is large enough. We empirically validate our theory on both synthetic and natural images by iteratively training normalizing flows and state-of-the-art diffusion models on CIFAR10 and FFHQ.
Submitted 2 April, 2024; v1 submitted 30 September, 2023;
originally announced October 2023.
-
Measuring spatial association and testing spatial independence based on short time course data
Authors:
Divya Kappara,
Arup Bose,
Madhuchhanda Bhattacharjee
Abstract:
Spatial association measures for univariate static spatial data are widely used. When the data is in the form of a collection of spatial vectors with the same temporal domain of interest, we construct a measure of similarity between the regions' series, using Bergsma's correlation coefficient $ρ$. Due to the special properties of $ρ$, unlike other spatial association measures which test for spatial randomness, our statistic can account for spatial pairwise independence. We have derived the asymptotic behavior of our statistic under null (independence of the regions) and alternate cases (the regions are dependent). We explore the alternate scenario of spatial dependence further, using simulations for the SAR and SMA dependence models. Finally, we provide application to modelling and testing for the presence of spatial association in COVID-19 incidence data, by using our statistic on the residuals obtained after model fitting.
Submitted 25 September, 2023; v1 submitted 29 March, 2023;
originally announced March 2023.
-
Assessing bivariate independence: Revisiting Bergsma's covariance
Authors:
Divya Kappara,
Arup Bose,
Madhuchhanda Bhattacharjee
Abstract:
Bergsma (2006) proposed a covariance $κ$(X,Y) between random variables X and Y. He derived their asymptotic distributions under the null hypothesis of independence between X and Y. The non-null (dependent) case does not seem to have been studied in the literature. We derive several alternate expressions for $κ$. One of them leads us to a very intuitive estimator of $κ$(X,Y) that is a nice function of four naturally arising U-statistics. We derive the exact finite sample relation between all three estimates. The asymptotic distribution of our estimator, and hence also of the other two estimators, in the non-null (dependence) case, is then obtained by using the U-statistics central limit theorem. For specific parametric bivariate distributions, the value of $κ$ can be derived in terms of the natural dependence parameters of these distributions. In particular, we derive the formula for $κ$ when (X,Y) are distributed as Gumbel's bivariate exponential. We bring out various aspects of these estimators through extensive simulations from several prominent bivariate distributions. In particular, we investigate the empirical relationship between $κ$ and the dependence parameters, the distributional properties of the estimators, and the accuracy of these estimators. We also investigate the powers of these measures for testing independence, compare these among themselves, and with other well known such measures. Based on these exercises, the proposed estimator seems as good or better than its competitors both in terms of power and computing efficiency.
Submitted 29 May, 2023; v1 submitted 17 December, 2022;
originally announced December 2022.
-
Modelling COVID-19-III: endemic spread in India
Authors:
Madhuchhanda Bhattacharjee,
Arup Bose
Abstract:
A disease in a given population is termed endemic when it exhibits a steady prevalence. We address the pertinent question as to what extent COVID-19 has turned endemic in India. There are several existing models for studying endemic behaviour, such as the extensions of the traditional temporal SIR model or the spatio-temporal endemic-epidemic model of Held et al. (2005) and its extensions. We propose a "spatio-temporal Gravity model" in a state of the art generalised linear model set up that can be deployed at various spatial resolutions. In absence of routine and quality covariates in the context of COVID-19 at finer spatial scales, we make use of extraneous covariates like air-traffic passenger count that enables us to capture the local mobility and social interactions effectively. This makes the proposed model different from the existing models. The proposed gravity model not only produces consistent estimators, but also outperforms the other models when applied to Indian COVID-19 data.
Submitted 11 November, 2022;
originally announced November 2022.
-
Multiscale Generative Models: Improving Performance of a Generative Model Using Feedback from Other Dependent Generative Models
Authors:
Changyu Chen,
Avinandan Bose,
Shih-Fen Cheng,
Arunesh Sinha
Abstract:
Realistic fine-grained multi-agent simulation of real-world complex systems is crucial for many downstream tasks such as reinforcement learning. Recent work has used generative models (GANs in particular) for providing high-fidelity simulation of real-world systems. However, such generative models are often monolithic and miss out on modeling the interaction in multi-agent systems. In this work, we take a first step towards building multiple interacting generative models (GANs) that reflect the interaction in the real world. We build and analyze a hierarchical set-up where a higher-level GAN is conditioned on the output of multiple lower-level GANs. We present a technique of using feedback from the higher-level GAN to improve performance of lower-level GANs. We mathematically characterize the conditions under which our technique is impactful, including understanding the transfer learning nature of our set-up. We present three distinct experiments on synthetic data, time series data, and image domain, revealing the wide applicability of our technique.
Submitted 24 February, 2022; v1 submitted 24 January, 2022;
originally announced January 2022.
-
Changepoint Analysis of Topic Proportions in Temporal Text Data
Authors:
Avinandan Bose,
Soumendu Sundar Mukherjee
Abstract:
Changepoint analysis deals with unsupervised detection and/or estimation of time-points in time-series data, when the distribution generating the data changes. In this article, we consider offline changepoint detection in the context of large scale textual data. We build a specialised temporal topic model with provisions for changepoints in the distribution of topic proportions. As full likelihood based inference in this model is computationally intractable, we develop a computationally tractable approximate inference procedure. More specifically, we use sample splitting to estimate topic polytopes first and then apply a likelihood ratio statistic together with a modified version of the wild binary segmentation algorithm of Fryzlewicz et al. (2014). Our methodology facilitates automated detection of structural changes in large corpora without the need of manual processing by domain experts. As changepoints under our model correspond to changes in topic structure, the estimated changepoints are often highly interpretable as marking the surge or decline in popularity of a fashionable topic. We apply our procedure on two large datasets: (i) a corpus of English literature from the period 1800-1922 (Underwood et al., 2015); (ii) abstracts from the High Energy Physics arXiv repository (Clement et al., 2019). We obtain some historically well-known changepoints and discover some new ones.
Submitted 29 November, 2021;
originally announced December 2021.
-
Structure Aware Negative Sampling in Knowledge Graphs
Authors:
Kian Ahrabian,
Aarash Feizi,
Yasmin Salehi,
William L. Hamilton,
Avishek Joey Bose
Abstract:
Learning low-dimensional representations for entities and relations in knowledge graphs using contrastive estimation represents a scalable and effective method for inferring connectivity patterns. A crucial aspect of contrastive learning approaches is the choice of corruption distribution that generates hard negative samples, which force the embedding model to learn discriminative representations and find critical characteristics of observed data. While earlier methods either employ too simple corruption distributions, i.e. uniform, yielding easy uninformative negatives or sophisticated adversarial distributions with challenging optimization schemes, they do not explicitly incorporate known graph structure resulting in suboptimal negatives. In this paper, we propose Structure Aware Negative Sampling (SANS), an inexpensive negative sampling strategy that utilizes the rich graph structure by selecting negative samples from a node's k-hop neighborhood. Empirically, we demonstrate that SANS finds semantically meaningful negatives and is competitive with SOTA approaches while requiring neither additional parameters nor difficult adversarial optimization.
Submitted 6 October, 2020; v1 submitted 23 September, 2020;
originally announced September 2020.
-
Modelling COVID-19 -- I A dynamic SIR(D) with application to Indian data
Authors:
Madhuchhanda Bhattacharjee,
Arup Bose
Abstract:
We propose an epidemiological model using an adaptive dynamic three compartment (with four states) SIR(D) model. Our approach is similar to non-parametric curve fitting in spirit and automatically adapts to key external factors, such as interventions, while retaining the parsimonious nature of the standard SIR(D) model. Initial dynamic temporal estimates of the model parameters are obtained by minimising the aggregate residual sum of squares across the number of infections, recoveries, and fatalities, over a chosen lag period. Then a geometric smoother is applied to obtain the final time series of estimates. These estimates are used to obtain dynamic temporal robust estimates of the key feature of this pandemic, namely the "reproduction number". We illustrate our method on the Indian COVID-19 data for the period March 14 - August 31, 2020. The time series data plots of the 36 states and union territories show a clear presence of inter-regional variation in the prognosis of the epidemic. This is also borne out by the estimates of the underlying parameters, including the reproduction numbers for the 36 regions. Due to this, an SIR(D) model, dynamic or otherwise, on the national aggregate data is not suited for robust local predictions. The time series of estimates of the model enables us to carry out daily, weekly and also long term predictions, including construction of predictive bands. We obtain an excellent agreement between the actual data and the model predicted data at the regional level. Our estimates of the current reproduction number turn out to be more than 2 in three regions (Andhra Pradesh, Maharashtra and Uttar Pradesh) and between 1.5 and 2 in 13 regions. Each of these regions has experienced an individual trajectory, which typically involves an initial phase of shock(s) followed by a relatively steady lower level of the reproduction number.
Submitted 10 September, 2020;
originally announced September 2020.
-
Adversarial Example Games
Authors:
Avishek Joey Bose,
Gauthier Gidel,
Hugo Berard,
Andre Cianflone,
Pascal Vincent,
Simon Lacoste-Julien,
William L. Hamilton
Abstract:
The existence of adversarial examples capable of fooling trained neural network classifiers calls for a much better understanding of possible attacks to guide the development of safeguards against them. This includes attack methods in the challenging non-interactive blackbox setting, where adversarial attacks are generated without any access, including queries, to the target model. Prior attacks in this setting have relied mainly on algorithmic innovations derived from empirical observations (e.g., that momentum helps), lacking principled transferability guarantees. In this work, we provide a theoretical foundation for crafting transferable adversarial examples to entire hypothesis classes. We introduce Adversarial Example Games (AEG), a framework that models the crafting of adversarial examples as a min-max game between a generator of attacks and a classifier. AEG provides a new way to design adversarial examples by adversarially training a generator and a classifier from a given hypothesis class (e.g., architecture). We prove that this game has an equilibrium, and that the optimal generator is able to craft adversarial examples that can attack any classifier from the corresponding hypothesis class. We demonstrate the efficacy of AEG on the MNIST and CIFAR-10 datasets, outperforming prior state-of-the-art approaches with an average relative improvement of $29.9\%$ and $47.2\%$ against undefended and robust models (Table 2 & 3) respectively.
Submitted 8 January, 2021; v1 submitted 1 July, 2020;
originally announced July 2020.
-
Latent Variable Modelling with Hyperbolic Normalizing Flows
Authors:
Avishek Joey Bose,
Ariella Smofsky,
Renjie Liao,
Prakash Panangaden,
William L. Hamilton
Abstract:
The choice of approximate posterior distributions plays a central role in stochastic variational inference (SVI). One effective solution is the use of normalizing flows to construct flexible posterior distributions. However, one key limitation of existing normalizing flows is that they are restricted to the Euclidean space and are ill-equipped to model data with an underlying hierarchical structure. To address this fundamental limitation, we present the first extension of normalizing flows to hyperbolic spaces. We first elevate normalizing flows to hyperbolic spaces using coupling transforms defined on the tangent bundle, termed Tangent Coupling ($\mathcal{TC}$). We further introduce Wrapped Hyperboloid Coupling ($\mathcal{W}\mathbb{H}C$), a fully invertible and learnable transformation that explicitly utilizes the geometric structure of hyperbolic spaces, allowing for expressive posteriors while being efficient to sample from. We demonstrate the efficacy of our novel normalizing flow over hyperbolic VAEs and Euclidean normalizing flows. Our approach achieves improved performance on density estimation, as well as reconstruction of real-world graph data, which exhibit a hierarchical structure. Finally, we show that our approach can be used to power a generative model over hierarchical data using hyperbolic latent variables.
Submitted 13 August, 2020; v1 submitted 15 February, 2020;
originally announced February 2020.
-
Meta-Graph: Few Shot Link Prediction via Meta Learning
Authors:
Avishek Joey Bose,
Ankit Jain,
Piero Molino,
William L. Hamilton
Abstract:
We consider the task of few shot link prediction on graphs. The goal is to learn from a distribution over graphs so that a model is able to quickly infer missing edges in a new graph after a small amount of training. We show that current link prediction methods are generally ill-equipped to handle this task. They cannot effectively transfer learned knowledge from one graph to another and are unable to effectively learn from sparse samples of edges. To address this challenge, we introduce a new gradient-based meta learning framework, Meta-Graph. Our framework leverages higher-order gradients along with a learned graph signature function that conditionally generates a graph neural network initialization. Using a novel set of few shot link prediction benchmarks, we show that Meta-Graph can learn to quickly adapt to a new graph using only a small sample of true edges, enabling not only fast adaptation but also improved results at convergence.
Submitted 1 March, 2020; v1 submitted 20 December, 2019;
originally announced December 2019.
-
Deep Radar Waveform Design for Efficient Automotive Radar Sensing
Authors:
Shahin Khobahi,
Arindam Bose,
Mojtaba Soltanalian
Abstract:
In radar systems, unimodular (or constant-modulus) waveform design plays an important role in achieving better clutter/interference rejection, as well as a more accurate estimation of the target parameters. The design of such sequences has been studied widely in the last few decades, with most design algorithms requiring sophisticated a priori knowledge of environmental parameters which may be difficult to obtain in real-time scenarios. In this paper, we propose a novel hybrid model-driven and data-driven architecture that adapts to the ever changing environment and allows for adaptive unimodular waveform design. In particular, the approach lays the groundwork for developing extremely low-cost waveform design and processing frameworks for radar systems deployed in autonomous vehicles. The proposed model-based deep architecture imitates a well-known unimodular signal design algorithm in its structure, and can quickly infer statistical information from the environment using the observed data. Our numerical experiments portray the advantages of using the proposed method for efficient radar waveform design in time-varying environments.
Submitted 19 December, 2019; v1 submitted 17 December, 2019;
originally announced December 2019.
-
Deep One-bit Compressive Autoencoding
Authors:
Shahin Khobahi,
Arindam Bose,
Mojtaba Soltanalian
Abstract:
Parameterized mathematical models play a central role in understanding and design of complex information systems. However, they often cannot take into account the intricate interactions innate to such systems. On the contrary, purely data-driven approaches do not need explicit mathematical models for data generation and have a wider applicability at the cost of interpretability. In this paper, we consider the design of a one-bit compressive autoencoder, and propose a novel hybrid model-based and data-driven methodology that allows us to not only design the sensing matrix for one-bit data acquisition, but also allows for learning the latent-parameters of an iterative optimization algorithm specifically designed for the problem of one-bit sparse signal recovery. Our results demonstrate a significant improvement compared to state-of-the-art model-based algorithms.
Submitted 10 December, 2019;
originally announced December 2019.
-
A Novel Approach for Detection and Ranking of Trendy and Emerging Cyber Threat Events in Twitter Streams
Authors:
Avishek Bose,
Vahid Behzadan,
Carlos Aguirre,
William H. Hsu
Abstract:
We present a new machine learning and text information extraction approach to detection of cyber threat events in Twitter that are novel (previously non-extant) and developing (marked by significance with respect to similarity with a previously detected event). While some existing approaches to event detection measure novelty and trendiness, typically as independent criteria and occasionally as a holistic measure, this work focuses on detecting both novel and developing events using an unsupervised machine learning approach. Furthermore, our proposed approach enables the ranking of cyber threat events based on an importance score by extracting the tweet terms that are characterized as named entities, keywords, or both. We also impute influence to users in order to assign a weighted score to noun phrases in proportion to user influence and the corresponding event scores for named entities and keywords. To evaluate the performance of our proposed approach, we measure the efficiency and detection error rate for events over a specified time interval, relative to human annotator ground truth.
Submitted 12 July, 2019;
originally announced July 2019.
-
Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies
Authors:
Patrick Nadeem Ward,
Ariella Smofsky,
Avishek Joey Bose
Abstract:
Deep Reinforcement Learning (DRL) algorithms for continuous action spaces are known to be brittle toward hyperparameters as well as sample inefficient. Soft Actor Critic (SAC) proposes an off-policy deep actor critic algorithm within the maximum entropy RL framework which offers greater stability and empirical gains. The choice of policy distribution, a factored Gaussian, is motivated by its easy re-parametrization rather than its modeling power. We introduce Normalizing Flow policies within the SAC framework that learn more expressive classes of policies than simple factored Gaussians. We show empirically on continuous grid world tasks that our approach increases stability and is better suited to difficult exploration in sparse reward settings.
Submitted 6 June, 2019;
originally announced June 2019.
-
Generalizable Adversarial Attacks with Latent Variable Perturbation Modelling
Authors:
Avishek Joey Bose,
Andre Cianflone,
William L. Hamilton
Abstract:
Adversarial attacks on deep neural networks traditionally rely on a constrained optimization paradigm, where an optimization procedure is used to obtain a single adversarial perturbation for a given input example. In this work we frame the problem as learning a distribution of adversarial perturbations, enabling us to generate diverse adversarial distributions given an unperturbed input. We show that this framework is domain-agnostic in that the same framework can be employed to attack different input domains with minimal modification. Across three diverse domains (images, text, and graphs), our approach generates whitebox attacks with success rates that are competitive with or superior to existing approaches, with a new state-of-the-art achieved in the graph domain. Finally, we demonstrate that our framework can efficiently generate a diverse set of attacks for a single given input, and is even capable of attacking unseen test instances in a zero-shot manner, exhibiting attack generalization.
Submitted 20 January, 2020; v1 submitted 26 May, 2019;
originally announced May 2019.
-
Compositional Fairness Constraints for Graph Embeddings
Authors:
Avishek Joey Bose,
William L. Hamilton
Abstract:
Learning high-quality node embeddings is a key building block for machine learning models that operate on graph data, such as social networks and recommender systems. However, existing graph embedding techniques are unable to cope with fairness constraints, e.g., ensuring that the learned representations do not correlate with certain attributes, such as age or gender. Here, we introduce an adversarial framework to enforce fairness constraints on graph embeddings. Our approach is compositional, meaning that it can flexibly accommodate different combinations of fairness constraints during inference. For instance, in the context of social recommendations, our framework would allow one user to request that their recommendations are invariant to both their age and gender, while also allowing another user to request invariance to just their age. Experiments on standard knowledge graph and recommender system benchmarks highlight the utility of our proposed framework.
Submitted 16 July, 2019; v1 submitted 25 May, 2019;
originally announced May 2019.
-
PyTorch-BigGraph: A Large-scale Graph Embedding System
Authors:
Adam Lerer,
Ledell Wu,
Jiajun Shen,
Timothee Lacroix,
Luca Wehrstedt,
Abhijit Bose,
Alex Peysakhovich
Abstract:
Graph embedding methods produce unsupervised node features from graphs that can then be used for a variety of machine learning tasks. Modern graphs, particularly in industrial applications, contain billions of nodes and trillions of edges, which exceeds the capability of existing embedding systems. We present PyTorch-BigGraph (PBG), an embedding system that incorporates several modifications to traditional multi-relation embedding systems that allow it to scale to graphs with billions of nodes and trillions of edges. PBG uses graph partitioning to train arbitrarily large embeddings on either a single machine or in a distributed environment. We demonstrate comparable performance with existing embedding systems on common benchmarks, while allowing for scaling to arbitrarily large graphs and parallelization on multiple machines. We train and evaluate embeddings on several large social network graphs as well as the full Freebase dataset, which contains over 100 million nodes and 2 billion edges.
Submitted 9 April, 2019; v1 submitted 28 March, 2019;
originally announced March 2019.
-
Bi-Linear Modeling of Data Manifolds for Dynamic-MRI Recovery
Authors:
Gaurav N. Shetty,
Konstantinos Slavakis,
Abhishek Bose,
Ukash Nakarmi,
Gesualdo Scutari,
Leslie Ying
Abstract:
This paper puts forth a novel bi-linear modeling framework for data recovery via manifold-learning and sparse-approximation arguments and considers its application to dynamic magnetic-resonance imaging (dMRI). Each temporal-domain MR image is viewed as a point that lies on or close to a smooth manifold, and landmark points are identified to describe the point cloud concisely. To facilitate computations, a dimensionality reduction module generates low-dimensional/compressed renditions of the landmark points. Recovery of the high-fidelity MRI data is realized by solving a non-convex minimization task for the linear decompression operator and those affine combinations of landmark points which locally approximate the latent manifold geometry. An algorithm with guaranteed convergence to stationary solutions of the non-convex minimization task is also provided. The aforementioned framework exploits the underlying spatio-temporal patterns and geometry of the acquired data without any prior training on external data or information. Extensive numerical results on simulated as well as real cardiac-cine and perfusion MRI data illustrate noteworthy improvements of the advocated machine-learning framework over state-of-the-art reconstruction techniques.
Submitted 11 June, 2019; v1 submitted 26 December, 2018;
originally announced December 2018.