-
Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation
Authors:
Guillaume Huguet,
James Vuckovic,
Kilian Fatras,
Eric Thibodeau-Laufer,
Pablo Lemos,
Riashat Islam,
Cheng-Hao Liu,
Jarrid Rector-Brooks,
Tara Akhound-Sadegh,
Michael Bronstein,
Alexander Tong,
Avishek Joey Bose
Abstract:
Proteins are essential for almost all biological processes and derive their diverse functions from complex 3D structures, which are in turn determined by their amino acid sequences. In this paper, we exploit the rich biological inductive bias of amino acid sequences and introduce FoldFlow-2, a novel sequence-conditioned SE(3)-equivariant flow matching model for protein structure generation. FoldFlow-2 presents substantial new architectural features over the previous FoldFlow family of models including a protein large language model to encode sequence, a new multi-modal fusion trunk that combines structure and sequence representations, and a geometric transformer based decoder. To increase diversity and novelty of generated samples -- crucial for de-novo drug design -- we train FoldFlow-2 at scale on a new dataset that is an order of magnitude larger than PDB datasets of prior works, containing both known proteins in PDB and high-quality synthetic structures achieved through filtering. We further demonstrate the ability to align FoldFlow-2 to arbitrary rewards, e.g. increasing secondary structures diversity, by introducing a Reinforced Finetuning (ReFT) objective. We empirically observe that FoldFlow-2 outperforms previous state-of-the-art protein structure-based generative models, improving over RFDiffusion in terms of unconditional generation across all metrics including designability, diversity, and novelty across all protein lengths, as well as exhibiting generalization on the task of equilibrium conformation sampling. Finally, we demonstrate that a fine-tuned FoldFlow-2 makes progress on challenging conditional design tasks such as designing scaffolds for the VHH nanobody.
Submitted 30 May, 2024;
originally announced May 2024.
-
Metric Flow Matching for Smooth Interpolations on the Data Manifold
Authors:
Kacper Kapusniak,
Peter Potaptchik,
Teodora Reu,
Leo Zhang,
Alexander Tong,
Michael Bronstein,
Avishek Joey Bose,
Francesco Di Giovanni
Abstract:
Matching objectives underpin the success of modern generative models and rely on constructing conditional paths that transform a source distribution into a target distribution. Despite being a fundamental building block, conditional paths have been designed principally under the assumption of Euclidean geometry, resulting in straight interpolations. However, this can be particularly restrictive for tasks such as trajectory inference, where straight paths might lie outside the data manifold, thus failing to capture the underlying dynamics giving rise to the observed marginals. In this paper, we propose Metric Flow Matching (MFM), a novel simulation-free framework for conditional flow matching where interpolants are approximate geodesics learned by minimizing the kinetic energy of a data-induced Riemannian metric. This way, the generative model matches vector fields on the data manifold, which corresponds to lower uncertainty and more meaningful interpolations. We prescribe general metrics to instantiate MFM, independent of the task, and test it on a suite of challenging problems including LiDAR navigation, unpaired image translation, and modeling cellular dynamics. We observe that MFM outperforms the Euclidean baselines, particularly achieving SOTA on single-cell trajectory prediction.
Submitted 23 May, 2024;
originally announced May 2024.
-
Fisher Flow Matching for Generative Modeling over Discrete Data
Authors:
Oscar Davis,
Samuel Kessler,
Mircea Petrache,
İsmail İlkan Ceylan,
Michael Bronstein,
Avishek Joey Bose
Abstract:
Generative modeling over discrete data has recently seen numerous success stories, with applications spanning language modeling, biological sequence design, and graph-structured molecular data. The predominant generative modeling paradigm for discrete data is still autoregressive, with more recent alternatives based on diffusion or flow-matching falling short of their impressive performance in continuous data settings, such as image or video generation. In this work, we introduce Fisher-Flow, a novel flow-matching model for discrete data. Fisher-Flow takes a manifestly geometric perspective by considering categorical distributions over discrete data as points residing on a statistical manifold equipped with its natural Riemannian metric: the $\textit{Fisher-Rao metric}$. As a result, we demonstrate discrete data itself can be continuously reparameterised to points on the positive orthant of the $d$-hypersphere $\mathbb{S}^d_+$, which allows us to define flows that map any source distribution to target in a principled manner by transporting mass along (closed-form) geodesics of $\mathbb{S}^d_+$. Furthermore, the learned flows in Fisher-Flow can be further bootstrapped by leveraging Riemannian optimal transport leading to improved training dynamics. We prove that the gradient flow induced by Fisher-Flow is optimal in reducing the forward KL divergence.
We evaluate Fisher-Flow on an array of synthetic and diverse real-world benchmarks, including designing DNA Promoter, and DNA Enhancer sequences. Empirically, we find that Fisher-Flow improves over prior diffusion and flow-matching models on these benchmarks.
Submitted 28 May, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
Authors:
Tara Akhound-Sadegh,
Jarrid Rector-Brooks,
Avishek Joey Bose,
Sarthak Mittal,
Pablo Lemos,
Cheng-Hao Liu,
Marcin Sendera,
Siamak Ravanbakhsh,
Gauthier Gidel,
Yoshua Bengio,
Nikolay Malkin,
Alexander Tong
Abstract:
Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient -- and no data samples -- to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions, as the inner matching objective is simulation-free and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape, enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant $n$-body particle systems. We show that the proposed approach achieves state-of-the-art performance on all metrics and trains $2-5\times$ faster, which allows it to be the first method to train using energy on the challenging $55$-particle Lennard-Jones system.
Submitted 26 June, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
SE(3)-Stochastic Flow Matching for Protein Backbone Generation
Authors:
Avishek Joey Bose,
Tara Akhound-Sadegh,
Guillaume Huguet,
Kilian Fatras,
Jarrid Rector-Brooks,
Cheng-Hao Liu,
Andrei Cristian Nica,
Maksym Korablyov,
Michael Bronstein,
Alexander Tong
Abstract:
The computational design of novel protein structures has the potential to impact numerous scientific disciplines greatly. Toward this goal, we introduce FoldFlow, a series of novel generative models of increasing modeling power based on the flow-matching paradigm over $3\mathrm{D}$ rigid motions -- i.e. the group $\text{SE}(3)$ -- enabling accurate modeling of protein backbones. We first introduce FoldFlow-Base, a simulation-free approach to learning deterministic continuous-time dynamics and matching invariant target distributions on $\text{SE}(3)$. We next accelerate training by incorporating Riemannian optimal transport to create FoldFlow-OT, leading to the construction of both simpler and more stable flows. Finally, we design FoldFlow-SFM, coupling both Riemannian OT and simulation-free training to learn stochastic continuous-time dynamics over $\text{SE}(3)$. Our family of FoldFlow generative models offers several key advantages over previous approaches to the generative modeling of proteins: they are more stable and faster to train than diffusion-based approaches, and our models enjoy the ability to map any invariant source distribution to any invariant target distribution over $\text{SE}(3)$. Empirically, we validate FoldFlow on protein backbone generation of up to $300$ amino acids, leading to high-quality, designable, diverse, and novel samples.
Submitted 11 April, 2024; v1 submitted 3 October, 2023;
originally announced October 2023.
-
On the Stability of Iterative Retraining of Generative Models on their own Data
Authors:
Quentin Bertrand,
Avishek Joey Bose,
Alexandre Duplessis,
Marco Jiralerspong,
Gauthier Gidel
Abstract:
Deep generative models have made tremendous progress in modeling complex data, often exhibiting generation quality that surpasses a typical human's ability to discern the authenticity of samples. Undeniably, a key driver of this success is enabled by the massive amounts of web-scale data consumed by these models. Due to these models' striking performance and ease of availability, the web will inevitably be increasingly populated with synthetic content. Such a fact directly implies that future iterations of generative models will be trained on both clean and artificially generated data from past models. In this paper, we develop a framework to rigorously study the impact of training generative models on mixed datasets -- from classical training on real data to self-consuming generative models trained on purely synthetic data. We first prove the stability of iterative training under the condition that the initial generative models approximate the data distribution well enough and the proportion of clean training data (w.r.t. synthetic data) is large enough. We empirically validate our theory on both synthetic and natural images by iteratively training normalizing flows and state-of-the-art diffusion models on CIFAR10 and FFHQ.
Submitted 2 April, 2024; v1 submitted 30 September, 2023;
originally announced October 2023.
-
Effect of Mindfulness and Mindful Art on Beginners and Experienced Meditators
Authors:
Koonlin Eunice Chan,
Joy Bose
Abstract:
Mindfulness meditation has been proven to be effective in treating a range of mental and physical conditions. Mindful Art is a type of mindfulness meditation that comprises sessions of drawing, painting and sculpturing with mindfulness for a given length of time. To date, the efficacy of mindful art has not been systematically studied. In this paper, we describe an experimental pilot study on two groups of participants, a beginner group of 21 participants and an experienced meditation group of 9 participants, who had previously practiced mindfulness meditation for more than one year. The beginner group was instructed in mindfulness sitting and moving meditation, while the experienced group was instructed in mindful art making in addition to mindfulness meditation. The instructions were delivered remotely over Tencent Conference and WeChat. The sessions were of 90 minutes duration each, twice per week, with 45 minutes of home practice daily and the length of the study was 21 days. The blood pressure, pulse rate and breathing rates, as well as the subjective degree of relaxation were recorded at every session. At the end of the study, the experienced group reported higher average difference in breath rate and relaxation within each session, while the beginner group reported a greater degree of improvement in breath rate and relaxation over the period of the study, although their scores were lower on average than the experienced group.
Submitted 24 August, 2023;
originally announced August 2023.
-
EDGI: Equivariant Diffusion for Planning with Embodied Agents
Authors:
Johann Brehmer,
Joey Bose,
Pim de Haan,
Taco Cohen
Abstract:
Embodied agents operate in a structured world, often solving tasks with spatial, temporal, and permutation symmetries. Most algorithms for planning and model-based reinforcement learning (MBRL) do not take this rich geometric structure into account, leading to sample inefficiency and poor generalization. We introduce the Equivariant Diffuser for Generating Interactions (EDGI), an algorithm for MBRL and planning that is equivariant with respect to the product of the spatial symmetry group SE(3), the discrete-time translation group Z, and the object permutation group Sn. EDGI follows the Diffuser framework (Janner et al., 2022) in treating both learning a world model and planning in it as a conditional generative modeling problem, training a diffusion model on an offline trajectory dataset. We introduce a new SE(3)xZxSn-equivariant diffusion model that supports multiple representations. We integrate this model in a planning loop, where conditioning and classifier guidance let us softly break the symmetry for specific tasks as needed. On object manipulation and navigation tasks, EDGI is substantially more sample efficient and generalizes better across the symmetry group than non-equivariant models.
Submitted 19 October, 2023; v1 submitted 22 March, 2023;
originally announced March 2023.
-
Feature Likelihood Divergence: Evaluating the Generalization of Generative Models Using Samples
Authors:
Marco Jiralerspong,
Avishek Joey Bose,
Ian Gemp,
Chongli Qin,
Yoram Bachrach,
Gauthier Gidel
Abstract:
The past few years have seen impressive progress in the development of deep generative models capable of producing high-dimensional, complex, and photo-realistic data. However, current methods for evaluating such models remain incomplete: standard likelihood-based metrics do not always apply and rarely correlate with perceptual fidelity, while sample-based metrics, such as FID, are insensitive to overfitting, i.e., inability to generalize beyond the training set. To address these limitations, we propose a new metric called the Feature Likelihood Divergence (FLD), a parametric sample-based metric that uses density estimation to provide a comprehensive trichotomic evaluation accounting for novelty (i.e., different from the training samples), fidelity, and diversity of generated samples. We empirically demonstrate the ability of FLD to identify overfitting problem cases, even when previously proposed metrics fail. We also extensively evaluate FLD on various image datasets and model classes, demonstrating its ability to match intuitions of previous metrics like FID while offering a more comprehensive evaluation of generative models. Code is available at https://github.com/marcojira/fld.
Submitted 12 March, 2024; v1 submitted 8 February, 2023;
originally announced February 2023.
-
Riemannian Diffusion Models
Authors:
Chin-Wei Huang,
Milad Aghajohari,
Avishek Joey Bose,
Prakash Panangaden,
Aaron Courville
Abstract:
Diffusion models are recent state-of-the-art methods for image generation and likelihood estimation. In this work, we generalize continuous-time diffusion models to arbitrary Riemannian manifolds and derive a variational framework for likelihood estimation. Computationally, we propose new methods for computing the Riemannian divergence which is needed in the likelihood estimation. Moreover, in generalizing the Euclidean case, we prove that maximizing this variational lower-bound is equivalent to Riemannian score matching. Empirically, we demonstrate the expressive power of Riemannian diffusion models on a wide spectrum of smooth manifolds, such as spheres, tori, hyperboloids, and orthogonal groups. Our proposed method achieves new state-of-the-art likelihoods on all benchmarks.
Submitted 16 August, 2022;
originally announced August 2022.
-
Matching Normalizing Flows and Probability Paths on Manifolds
Authors:
Heli Ben-Hamu,
Samuel Cohen,
Joey Bose,
Brandon Amos,
Aditya Grover,
Maximilian Nickel,
Ricky T. Q. Chen,
Yaron Lipman
Abstract:
Continuous Normalizing Flows (CNFs) are a class of generative models that transform a prior distribution to a model distribution by solving an ordinary differential equation (ODE). We propose to train CNFs on manifolds by minimizing probability path divergence (PPD), a novel family of divergences between the probability density path generated by the CNF and a target probability density path. PPD is formulated using a logarithmic mass conservation formula, which is a linear first-order partial differential equation relating the log target probabilities and the CNF's defining vector field. PPD has several key benefits over existing methods: it sidesteps the need to solve an ODE per iteration, readily applies to manifold data, scales to high dimensions, and is compatible with a large family of target paths interpolating pure noise and data in finite time. Theoretically, PPD is shown to bound classical probability divergences. Empirically, we show that CNFs learned by minimizing PPD achieve state-of-the-art results in likelihoods and sample quality on existing low-dimensional manifold benchmarks, and are the first example of a generative model to scale to moderately high dimensional manifolds.
Submitted 11 July, 2022;
originally announced July 2022.
-
Equivariant Finite Normalizing Flows
Authors:
Avishek Joey Bose,
Marcus Brubaker,
Ivan Kobyzev
Abstract:
Generative modeling seeks to uncover the underlying factors that give rise to observed data that can often be modeled as the natural symmetries that manifest themselves through invariances and equivariances to certain transformation laws. However, current approaches to representing these symmetries are couched in the formalism of continuous normalizing flows that require the construction of equivariant vector fields -- inhibiting their simple application to conventional higher dimensional generative modelling domains like natural images. In this paper, we focus on building equivariant normalizing flows using discrete layers. We first theoretically prove the existence of an equivariant map for compact groups whose actions are on compact spaces. We further introduce three new equivariant flows: $G$-Residual Flows, $G$-Coupling Flows, and $G$-Inverse Autoregressive Flows that elevate classical Residual, Coupling, and Inverse Autoregressive Flows with equivariant maps to a prescribed group $G$. Our construction of $G$-Residual Flows is also universal, in the sense that we prove a $G$-equivariant diffeomorphism can be exactly mapped by a $G$-residual flow. Finally, we complement our theoretical insights with demonstrative experiments -- for the first time -- on image datasets like CIFAR-10 and show $G$-Equivariant Finite Normalizing flows lead to increased data efficiency, faster convergence, and improved likelihood estimates.
Submitted 12 August, 2022; v1 submitted 16 October, 2021;
originally announced October 2021.
-
Modeling Effect of Lockdowns and Other Effects on India Covid-19 Infections Using SEIR Model and Machine Learning
Authors:
Sathiyanarayanan Sampath,
Joy Bose
Abstract:
The SEIR model is a widely used epidemiological model for predicting the rise in infections. This model has been widely used in different countries to predict the number of Covid-19 cases. But the original SEIR model does not take into account the effect of factors such as lockdowns, vaccines, and re-infections. In India, the first wave of Covid started in March 2020 and the second wave in April 2021. In this paper, we modify the SEIR model equations to model the effect of lockdowns and other influencers, and fit the model on data of the daily Covid-19 infections in India using lmfit, a Python library for least squares minimization for curve fitting. We modify the R0 parameter in the standard SEIR model as a rectangle in order to account for the effect of lockdowns. Our modified SEIR model accurately fits the available data of infections.
Submitted 4 October, 2021;
originally announced October 2021.
-
Sparse Distributed Memory using Spiking Neural Networks on Nengo
Authors:
Rohan Deepak Ajwani,
Arshika Lalan,
Basabdatta Sen Bhattacharya,
Joy Bose
Abstract:
We present a Spiking Neural Network (SNN) based Sparse Distributed Memory (SDM) implemented on the Nengo framework. We have based our work on previous work by Furber et al., 2004, implementing SDM using N-of-M codes. As an integral part of the SDM design, we have implemented Correlation Matrix Memory (CMM) using SNN on Nengo. Our SNN implementation uses Leaky Integrate and Fire (LIF) spiking neuron models on Nengo. Our objective is to understand how well SNN-based SDMs perform in comparison to conventional SDMs. Towards this, we have simulated both conventional and SNN-based SDM and CMM on Nengo. We observe that SNN-based models perform similarly to the conventional ones. In order to evaluate the performance of different SNNs, we repeated the experiment using Adaptive-LIF, Spiking Rectified Linear Unit, and Izhikevich models and obtained similar results. We conclude that it is indeed feasible to develop some types of associative memories using spiking neurons whose memory capacity and other features are similar to the performance without SNNs. Finally, we have implemented an application where MNIST images, encoded with N-of-M codes, are associated with their labels and stored in the SNN-based SDM.
Submitted 3 December, 2021; v1 submitted 7 September, 2021;
originally announced September 2021.
-
Neural Path Hunter: Reducing Hallucination in Dialogue Systems via Path Grounding
Authors:
Nouha Dziri,
Andrea Madotto,
Osmar Zaiane,
Avishek Joey Bose
Abstract:
Dialogue systems powered by large pre-trained language models (LM) exhibit an innate ability to deliver fluent and natural-looking responses. Despite their impressive generation performance, these models can often generate factually incorrect statements, impeding their widespread adoption. In this paper, we focus on the task of improving the faithfulness -- and thus reducing hallucination -- of Neural Dialogue Systems to known facts supplied by a Knowledge Graph (KG). We propose Neural Path Hunter, which follows a generate-then-refine strategy whereby a generated response is amended using the k-hop subgraph of a KG. Neural Path Hunter leverages a separate token-level fact critic to identify plausible sources of hallucination, followed by a refinement stage consisting of a chain of two neural LMs that retrieve correct entities by crafting a query signal that is propagated over the k-hop subgraph. Our proposed model can easily be applied to any dialogue generated responses without retraining the model. We empirically validate our proposed approach on the OpenDialKG dataset against a suite of metrics and report a relative improvement of faithfulness over dialogue responses by 20.35% based on FeQA (Durmus et al., 2020).
Submitted 14 September, 2021; v1 submitted 17 April, 2021;
originally announced April 2021.
-
Online Adversarial Attacks
Authors:
Andjela Mladenovic,
Avishek Joey Bose,
Hugo Berard,
William L. Hamilton,
Simon Lacoste-Julien,
Pascal Vincent,
Gauthier Gidel
Abstract:
Adversarial attacks expose important vulnerabilities of deep learning models, yet little attention has been paid to settings where data arrives as a stream. In this paper, we formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases: attackers must operate under partial knowledge of the target model, and the decisions made by the attacker are irrevocable since they operate on a transient data stream. We first rigorously analyze a deterministic variant of the online threat model by drawing parallels to the well-studied $k$-secretary problem in theoretical computer science and propose Virtual+, a simple yet practical online algorithm. Our main theoretical result shows Virtual+ yields provably the best competitive ratio over all single-threshold algorithms for $k<5$ -- extending the previous analysis of the $k$-secretary problem. We also introduce the \textit{stochastic $k$-secretary} -- effectively reducing online blackbox transfer attacks to a $k$-secretary problem under noise -- and prove theoretical bounds on the performance of Virtual+ adapted to this setting. Finally, we complement our theoretical results by conducting experiments on MNIST, CIFAR-10, and Imagenet classifiers, revealing the necessity of online algorithms in achieving near-optimal performance and also the rich interplay between attack strategies and online attack selection, enabling simple strategies like FGSM to outperform stronger adversaries.
Submitted 22 March, 2022; v1 submitted 2 March, 2021;
originally announced March 2021.
-
Structure Aware Negative Sampling in Knowledge Graphs
Authors:
Kian Ahrabian,
Aarash Feizi,
Yasmin Salehi,
William L. Hamilton,
Avishek Joey Bose
Abstract:
Learning low-dimensional representations for entities and relations in knowledge graphs using contrastive estimation represents a scalable and effective method for inferring connectivity patterns. A crucial aspect of contrastive learning approaches is the choice of corruption distribution that generates hard negative samples, which force the embedding model to learn discriminative representations and find critical characteristics of observed data. Earlier methods either employ overly simple corruption distributions (e.g., uniform), which yield easy, uninformative negatives, or sophisticated adversarial distributions with challenging optimization schemes; neither explicitly incorporates known graph structure, resulting in suboptimal negatives. In this paper, we propose Structure Aware Negative Sampling (SANS), an inexpensive negative sampling strategy that utilizes the rich graph structure by selecting negative samples from a node's k-hop neighborhood. Empirically, we demonstrate that SANS finds semantically meaningful negatives and is competitive with SOTA approaches while requiring no additional parameters or difficult adversarial optimization.
Submitted 6 October, 2020; v1 submitted 23 September, 2020;
originally announced September 2020.
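The k-hop selection idea described in the SANS abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `k_hop_neighborhood` and `sample_negatives`, the adjacency-dict graph representation, and the choice to exclude direct neighbors from the candidate set are all assumptions made for illustration.

```python
import random
from collections import deque


def k_hop_neighborhood(adj, source, k):
    """Return the set of nodes within k hops of `source` (source excluded).

    `adj` is a hypothetical adjacency dict: node -> iterable of neighbors.
    """
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond k hops
        for nbr in adj.get(node, ()):
            if nbr not in depth:
                depth[nbr] = depth[node] + 1
                queue.append(nbr)
    depth.pop(source)
    return set(depth)


def sample_negatives(adj, source, k, num_samples, rng=None):
    """Draw negatives from the k-hop neighborhood of `source`.

    Assumption: nodes directly connected to `source` are treated as
    positives and removed from the candidate pool.
    """
    rng = rng or random.Random()
    candidates = sorted(k_hop_neighborhood(adj, source, k) - set(adj.get(source, ())))
    if not candidates:
        return []
    return [rng.choice(candidates) for _ in range(num_samples)]


# Toy path graph 0 - 1 - 2 - 3 - 4 (illustrative data only).
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
negatives = sample_negatives(path_graph, source=0, k=3, num_samples=4, rng=random.Random(0))
```

On the toy path graph, the candidates for node 0 with k=3 are nodes 2 and 3: close enough in the graph to be plausible, but not directly connected.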
-
Adversarial Example Games
Authors:
Avishek Joey Bose,
Gauthier Gidel,
Hugo Berard,
Andre Cianflone,
Pascal Vincent,
Simon Lacoste-Julien,
William L. Hamilton
Abstract:
The existence of adversarial examples capable of fooling trained neural network classifiers calls for a much better understanding of possible attacks to guide the development of safeguards against them. This includes attack methods in the challenging non-interactive blackbox setting, where adversarial attacks are generated without any access, including queries, to the target model. Prior attacks in this setting have relied mainly on algorithmic innovations derived from empirical observations (e.g., that momentum helps), lacking principled transferability guarantees. In this work, we provide a theoretical foundation for crafting transferable adversarial examples to entire hypothesis classes. We introduce Adversarial Example Games (AEG), a framework that models the crafting of adversarial examples as a min-max game between a generator of attacks and a classifier. AEG provides a new way to design adversarial examples by adversarially training a generator and a classifier from a given hypothesis class (e.g., architecture). We prove that this game has an equilibrium, and that the optimal generator is able to craft adversarial examples that can attack any classifier from the corresponding hypothesis class. We demonstrate the efficacy of AEG on the MNIST and CIFAR-10 datasets, outperforming prior state-of-the-art approaches with an average relative improvement of $29.9\%$ and $47.2\%$ against undefended and robust models, respectively (Tables 2 and 3).
Submitted 8 January, 2021; v1 submitted 1 July, 2020;
originally announced July 2020.
-
Latent Variable Modelling with Hyperbolic Normalizing Flows
Authors:
Avishek Joey Bose,
Ariella Smofsky,
Renjie Liao,
Prakash Panangaden,
William L. Hamilton
Abstract:
The choice of approximate posterior distributions plays a central role in stochastic variational inference (SVI). One effective solution is the use of normalizing flows to construct flexible posterior distributions. However, one key limitation of existing normalizing flows is that they are restricted to Euclidean space and are ill-equipped to model data with an underlying hierarchical structure. To address this fundamental limitation, we present the first extension of normalizing flows to hyperbolic spaces. We first elevate normalizing flows to hyperbolic spaces using coupling transforms defined on the tangent bundle, termed Tangent Coupling ($\mathcal{TC}$). We further introduce Wrapped Hyperboloid Coupling ($\mathcal{W}\mathbb{H}C$), a fully invertible and learnable transformation that explicitly utilizes the geometric structure of hyperbolic spaces, allowing for expressive posteriors while being efficient to sample from. We demonstrate the efficacy of our novel normalizing flow over hyperbolic VAEs and Euclidean normalizing flows. Our approach achieves improved performance on density estimation, as well as reconstruction of real-world graph data, which exhibit a hierarchical structure. Finally, we show that our approach can be used to power a generative model over hierarchical data using hyperbolic latent variables.
Submitted 13 August, 2020; v1 submitted 15 February, 2020;
originally announced February 2020.
-
Extraction of Relevant Images for Boilerplate Removal in Web Browsers
Authors:
Joy Bose
Abstract:
Boilerplate refers to unwanted and repeated parts of a webpage (such as ads or tables of contents) that distract the user from reading the core content of the webpage, such as a news article. Accurate detection and removal of boilerplate content from a webpage can give users a clutter-free view of the webpage or news article. This can be useful in features like reader mode in web browsers. Current implementations of reader mode in web browsers such as Firefox, Chrome and Edge perform reasonably well for textual content in webpages. However, they are mostly heuristic based and not flexible when the webpage content is dynamic. They also often perform poorly at removing boilerplate content in the form of images and multimedia. For detection of boilerplate images, one needs to know the actual layout of the images in the webpage, which is only available once the webpage is rendered. In this paper we discuss some of the issues in relevant image extraction. We also present the design of a testing framework to measure accuracy and a classifier to extract relevant images by leveraging a headless browser solution that gives the rendering information for images.
Submitted 13 January, 2020; v1 submitted 17 December, 2019;
originally announced January 2020.
-
Evaluating Usage of Images for App Classification
Authors:
Kushal Singla,
Niloy Mukherjee,
Hari Manassery Koduvely,
Joy Bose
Abstract:
App classification is useful in a number of applications such as adding apps to an app store or building a user model based on the installed apps. A number of existing methods classify apps according to a given taxonomy on the basis of their text metadata. However, text based methods for app classification may not work in all cases, such as when the text descriptions are in a different language, or missing, or inadequate to classify the app. One solution in such cases is to utilize the app images to supplement the text description. In this paper, we evaluate a number of approaches in which app images can be used to classify the apps. In one approach, we use optical character recognition (OCR) to extract text from images, which is then used to supplement the text description of the app. In another, we use pic2vec to convert the app images into vectors, then train an SVM to classify the vectors to the correct app label. In another, we use the captionbot.ai tool to generate natural language descriptions from the app images. Finally, we use a method to detect and label objects in the app images and use a voting technique to determine the category of the app based on all the images. We compare the performance of our image-based techniques on a number of apps in our dataset. Using a text based SVM app classifier as our baseline, we obtain an improved classification accuracy of 96% for some classes when app images are added.
Submitted 16 December, 2019;
originally announced December 2019.
-
Meta-Graph: Few Shot Link Prediction via Meta Learning
Authors:
Avishek Joey Bose,
Ankit Jain,
Piero Molino,
William L. Hamilton
Abstract:
We consider the task of few shot link prediction on graphs. The goal is to learn from a distribution over graphs so that a model is able to quickly infer missing edges in a new graph after a small amount of training. We show that current link prediction methods are generally ill-equipped to handle this task. They cannot effectively transfer learned knowledge from one graph to another and are unable to effectively learn from sparse samples of edges. To address this challenge, we introduce a new gradient-based meta learning framework, Meta-Graph. Our framework leverages higher-order gradients along with a learned graph signature function that conditionally generates a graph neural network initialization. Using a novel set of few shot link prediction benchmarks, we show that Meta-Graph can learn to quickly adapt to a new graph using only a small sample of true edges, enabling not only fast adaptation but also improved results at convergence.
Submitted 1 March, 2020; v1 submitted 20 December, 2019;
originally announced December 2019.
-
Field Label Prediction for Autofill in Web Browsers
Authors:
Joy Bose
Abstract:
Automatic form fill is an important productivity related feature present in major web browsers, which predicts the field labels of a web form and automatically fills values in a new form based on the values previously filled for the same field in other forms. This feature increases the convenience and efficiency of users who have to fill similar information in fields in multiple forms. In this paper we describe a machine learning solution for predicting the form field labels, implemented as a web service using Azure ML Studio.
Submitted 17 December, 2019;
originally announced December 2019.
-
Analysis of Software Engineering for Agile Machine Learning Projects
Authors:
Kushal Singla,
Joy Bose,
Chetan Naik
Abstract:
The number of machine learning, artificial intelligence or data science related software engineering projects using Agile methodology is increasing. However, there are very few studies on how such projects work in practice. In this paper, we analyze project issue tracking data taken from Scrum (a popular tool for Agile) for several machine learning projects. We compare this data with corresponding data from non-machine learning projects, in an attempt to analyze how machine learning projects are executed differently from normal software engineering projects. On analysis, we find that machine learning project issues use different kinds of words to describe issues, have a higher number of exploratory or research oriented tasks as compared to implementation tasks, and have a higher number of issues in the product backlog after each sprint, indicating that it is more difficult to estimate the duration of machine learning project related tasks in advance. After analyzing this data, we propose a few ways in which Agile machine learning projects can be better logged and executed, given their differences from normal software engineering projects.
Submitted 16 December, 2019;
originally announced December 2019.
-
Semi-Supervised Method using Gaussian Random Fields for Boilerplate Removal in Web Browsers
Authors:
Joy Bose,
Sumanta Mukherjee
Abstract:
Boilerplate removal refers to the problem of removing noisy content from a webpage such as ads and extracting relevant content that can be used by various services. This can be useful in several features in web browsers such as ad blocking, accessibility tools such as read out loud, translation, summarization etc. In order to create a training dataset to train a model for boilerplate detection and removal, labeling or tagging webpage data manually can be tedious and time consuming. Hence, a semi-supervised model, in which some of the webpage elements are labeled manually and labels for others are inferred based on some parameters, can be useful. In this paper we present a solution for extraction of relevant content from a webpage that relies on semi-supervised learning using Gaussian Random Fields. We first represent the webpage as a graph, with text elements as nodes and the edge weights representing similarity between nodes. After this, we label a few nodes in the graph using heuristics and label the remaining nodes by a weighted measure of similarity to the already labeled nodes. We describe the system architecture and a few preliminary results on a dataset of webpages.
Submitted 7 November, 2019;
originally announced November 2019.
-
Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies
Authors:
Patrick Nadeem Ward,
Ariella Smofsky,
Avishek Joey Bose
Abstract:
Deep Reinforcement Learning (DRL) algorithms for continuous action spaces are known to be brittle toward hyperparameters as well as sample inefficient. Soft Actor Critic (SAC) proposes an off-policy deep actor critic algorithm within the maximum entropy RL framework which offers greater stability and empirical gains. The choice of policy distribution, a factored Gaussian, is motivated by its easy re-parametrization rather than its modeling power. We introduce Normalizing Flow policies within the SAC framework that learn more expressive classes of policies than simple factored Gaussians. We show empirically on continuous grid world tasks that our approach increases stability and is better suited to difficult exploration in sparse reward settings.
Submitted 6 June, 2019;
originally announced June 2019.
-
A Cross-Domain Transferable Neural Coherence Model
Authors:
Peng Xu,
Hamidreza Saghir,
Jin Sung Kang,
Teng Long,
Avishek Joey Bose,
Yanshuai Cao,
Jackie Chi Kit Cheung
Abstract:
Coherence is an important aspect of text quality and is crucial for ensuring its readability. One important limitation of existing coherence models is that training on one domain does not easily generalize to unseen categories of text. Previous work advocates for generative models for cross-domain generalization, because for discriminative models, the space of incoherent sentence orderings to discriminate against during training is prohibitively large. In this work, we propose a local discriminative neural model with a much smaller negative sampling space that can efficiently learn against incorrect orderings. The proposed coherence model is simple in structure, yet it significantly outperforms previous state-of-the-art methods on a standard benchmark dataset on the Wall Street Journal corpus, as well as in multiple new challenging settings of transfer to unseen categories of discourse on Wikipedia articles.
Submitted 9 July, 2019; v1 submitted 28 May, 2019;
originally announced May 2019.
-
Generalizable Adversarial Attacks with Latent Variable Perturbation Modelling
Authors:
Avishek Joey Bose,
Andre Cianflone,
William L. Hamilton
Abstract:
Adversarial attacks on deep neural networks traditionally rely on a constrained optimization paradigm, where an optimization procedure is used to obtain a single adversarial perturbation for a given input example. In this work we frame the problem as learning a distribution of adversarial perturbations, enabling us to generate diverse adversarial distributions given an unperturbed input. We show that this framework is domain-agnostic in that the same framework can be employed to attack different input domains with minimal modification. Across three diverse domains (images, text, and graphs), our approach generates whitebox attacks with success rates that are competitive with or superior to existing approaches, with a new state-of-the-art achieved in the graph domain. Finally, we demonstrate that our framework can efficiently generate a diverse set of attacks for a single given input, and is even capable of attacking unseen test instances in a zero-shot manner, exhibiting attack generalization.
Submitted 20 January, 2020; v1 submitted 26 May, 2019;
originally announced May 2019.
-
Compositional Fairness Constraints for Graph Embeddings
Authors:
Avishek Joey Bose,
William L. Hamilton
Abstract:
Learning high-quality node embeddings is a key building block for machine learning models that operate on graph data, such as social networks and recommender systems. However, existing graph embedding techniques are unable to cope with fairness constraints, e.g., ensuring that the learned representations do not correlate with certain attributes, such as age or gender. Here, we introduce an adversarial framework to enforce fairness constraints on graph embeddings. Our approach is compositional, meaning that it can flexibly accommodate different combinations of fairness constraints during inference. For instance, in the context of social recommendations, our framework would allow one user to request that their recommendations be invariant to both their age and gender, while also allowing another user to request invariance to just their age. Experiments on standard knowledge graph and recommender system benchmarks highlight the utility of our proposed framework.
△ Less
Submitted 16 July, 2019; v1 submitted 25 May, 2019;
originally announced May 2019.
-
Identifying Implementation Bugs in Machine Learning based Image Classifiers using Metamorphic Testing
Authors:
Anurag Dwarakanath,
Manish Ahuja,
Samarth Sikand,
Raghotham M. Rao,
R. P. Jagadeesh Chandra Bose,
Neville Dubash,
Sanjay Podder
Abstract:
We have recently witnessed tremendous success of Machine Learning (ML) in practical applications. Computer vision, speech recognition and language translation have all reached near-human-level performance. We expect that, in the near future, most business applications will have some form of ML. However, testing such applications is extremely challenging and would be very expensive if we follow today's methodologies. In this work, we present an articulation of the challenges in testing ML based applications. We then present our solution approach, based on the concept of Metamorphic Testing, which aims to identify implementation bugs in ML based image classifiers. We have developed metamorphic relations for an application based on Support Vector Machine and a Deep Learning based application. Empirical validation showed that our approach was able to catch 71% of the implementation bugs in the ML applications.
Submitted 16 August, 2018;
originally announced August 2018.
-
Adversarial Attacks on Face Detectors using Neural Net based Constrained Optimization
Authors:
Avishek Joey Bose,
Parham Aarabi
Abstract:
Adversarial attacks involve adding small, often imperceptible perturbations to inputs with the goal of getting a machine learning model to misclassify them. While many different adversarial attack strategies have been proposed on image classification models, object detection pipelines have been much harder to break. In this paper, we propose a novel strategy to craft adversarial examples by solving a constrained optimization problem using an adversarial generator network. Our approach is fast and scalable, requiring only a forward pass through our trained generator network to craft an adversarial sample. Unlike in many attack strategies, we show that the same trained generator is capable of attacking new images without explicitly optimizing on them. We evaluate our attack on a trained Faster R-CNN face detector on the cropped 300-W face dataset where we manage to reduce the number of detected faces to $0.5\%$ of all originally detected faces. In a different experiment, also on 300-W, we demonstrate the robustness of our attack to a JPEG compression based defense: a typical JPEG compression level of $75\%$ reduces the effectiveness of our attack from only $0.5\%$ of detected faces to a modest $5.0\%$.
Submitted 30 May, 2018;
originally announced May 2018.
-
IoT2Vec: Identification of Similar IoT Devices via Activity Footprints
Authors:
Kushal Singla,
Joy Bose
Abstract:
We consider a smart home or smart office environment with a number of IoT devices connected and passing data between one another. The footprints of the data transferred can provide valuable information about the devices, which can be used to (a) identify the IoT devices and (b) in case of failure, to identify the correct replacements for these devices. In this paper, we generate the embeddings for IoT devices in a smart home using Word2Vec, and explore the possibility of having a similar concept for IoT devices, aka IoT2Vec. These embeddings can be used in a number of ways, such as to find similar devices in an IoT device store, or as a signature of each type of IoT device. We show results of a feasibility study on the CASAS dataset of IoT device activity logs, using our method to identify the patterns in embeddings of various types of IoT devices in a household.
△ Less
Submitted 21 May, 2018;
originally announced May 2018.
-
Adversarial Contrastive Estimation
Authors:
Avishek Joey Bose,
Huan Ling,
Yanshuai Cao
Abstract:
Learning by contrasting positive and negative samples is a general strategy adopted by many methods. Noise contrastive estimation (NCE) for word embeddings and translating embeddings for knowledge graphs are examples in NLP employing this approach. In this work, we view contrastive learning as an abstraction of all such methods and augment the negative sampler into a mixture distribution containing an adversarially learned sampler. The resulting adaptive sampler finds harder negative examples, which forces the main model to learn a better representation of the data. We evaluate our proposal on learning word embeddings, order embeddings and knowledge graph embeddings and observe both faster convergence and improved results on multiple metrics.
Submitted 2 August, 2018; v1 submitted 9 May, 2018;
originally announced May 2018.
-
VR Content Capture using Aligned Smartphones
Authors:
Ramanujam R Srinivasa,
Joy Bose,
Dipin KP
Abstract:
There are a number of dedicated 3D capture devices on the market, but generally they are unaffordable and do not make use of existing smartphone cameras, which are generally of decent quality. Due to this, while there are several means to consume 3D or VR content, there is currently a lack of means to capture 3D content, resulting in very few 3D videos being publicly available. Some mobile applications such as Camerada enable 3D or VR content capture by combining the output of two existing smartphones, but users would have to hold the cameras in their hands, making it difficult to align them properly. In this paper we present the design of a system to enable 3D content capture using one or more smartphones, taking care of alignment issues so as to get optimal alignment of the smartphone cameras. We aim to keep the distance between the cameras constant and equal to the inter-pupillary distance of about 6.5 cm. Our solution is applicable for one, two and three smartphones. We have a mobile app to generate a template given the dimensions of the smartphones, camera positions and other specifications. The template can be printed by the user and cut out on 2D cardboard, similar to Google Cardboard. Alternatively, it can be printed using a 3D printer. During video capture, with the smartphones aligned using our printed template, we capture videos which are then combined to get the optimal 3D content. We present the details of a small proof of concept implementation. Our solution would make it easier for people to use existing smartphones to generate 3D content.
Submitted 9 March, 2018;
originally announced March 2018.
-
A Bias Aware News Recommendation System
Authors:
Anish Anil Patankar,
Joy Bose,
Harshit Khanna
Abstract:
In this era of fake news and political polarization, it is desirable to have a system to enable users to access balanced news content. Current solutions focus on top down, server based approaches to decide whether a news article is fake or biased, and display only trusted news to the end users. In this paper, we follow a different approach to help the users make informed choices about which news they want to read, making users aware in real time of the bias in news articles they are browsing and recommending news articles from other sources on the same topic with different levels of bias. We use a recent Pew Research report to collect news sources that readers with varying political inclinations prefer to read. We then scrape news articles on a variety of topics from these varied news sources. After this, we perform clustering to find similar topics of the articles, as well as calculate a bias score for each article. For a news article the user is currently reading, we display the bias score and also display other articles on the same topic, out of the previously collected articles, from different news sources. This approach, we hope, would make it possible for users to access more balanced articles on given news topics. We present the implementation details of the system along with some preliminary results on news articles.
Submitted 9 March, 2018;
originally announced March 2018.
-
Attention Sensitive Web Browsing
Authors:
Joy Bose,
Amit Singhai,
Anish Patankar,
Ankit Kumar
Abstract:
With a number of cheap commercial dry EEG kits available today, it is possible to look at user-attention-driven scenarios for interaction with the web browser. Using EEG to determine the user's attention level is preferable to using methods such as gaze tracking or time spent on the webpage. In this paper we use the attention level in three different ways. First, as a control mechanism, to control user interface elements such as menus or buttons. Second, to make the web browser responsive to the current attention level. Third, as a means for the web developer to control the user experience based on the level of attention paid by the user, thus creating attention-sensitive websites. We present implementation details for each of these, using the NeuroSky MindWave sensor. We also explore issues in the system and the possibility of an EEG-based web standard.
Submitted 6 January, 2016;
originally announced January 2016.
-
An associative memory for the on-line recognition and prediction of temporal sequences
Authors:
J. Bose,
S. B. Furber,
J. L. Shapiro
Abstract:
This paper presents the design of an associative memory with feedback that is capable of on-line temporal sequence learning. A framework for on-line sequence learning has been proposed, and different sequence learning models have been analysed according to this framework. The network model is an associative memory with a separate store for the sequence context of a symbol. A sparse distributed memory is used to gain scalability. The context store combines the functionality of a neural layer with a shift register. The sensitivity of the machine to the sequence context is controllable, resulting in different characteristic behaviours. The model can store and predict on-line sequences of various types and lengths. Numerical simulations on the model have been carried out to determine its properties.
Submitted 4 November, 2006;
originally announced November 2006.