-
Hybrid Approach to Parallel Stochastic Gradient Descent
Authors:
Aakash Sudhirbhai Vora,
Dhrumil Chetankumar Joshi,
Aksh Kantibhai Patel
Abstract:
Stochastic Gradient Descent is used for large datasets to train models to reduce the training time. On top of that data parallelism is widely used as a method to efficiently train neural networks using multiple worker nodes in parallel. Synchronous and asynchronous approach to data parallelism is used by most systems to train the model in parallel. However, both of them have their drawbacks. We propose a third approach to data parallelism which is a hybrid between synchronous and asynchronous approaches, using both approaches to train the neural network. When the threshold function is selected appropriately to gradually shift all parameter aggregation from asynchronous to synchronous, we show that in a given time period our hybrid approach outperforms both asynchronous and synchronous approaches.
Submitted 27 June, 2024;
originally announced July 2024.
-
Evaluating representation learning on the protein structure universe
Authors:
Arian R. Jamasb,
Alex Morehead,
Chaitanya K. Joshi,
Zuobai Zhang,
Kieran Didi,
Simon V. Mathis,
Charles Harris,
Jian Tang,
Jianlin Cheng,
Pietro Lio,
Tom L. Blundell
Abstract:
We introduce ProteinWorkshop, a comprehensive benchmark suite for representation learning on protein structures with Geometric Graph Neural Networks. We consider large-scale pre-training and downstream tasks on both experimental and predicted structures to enable the systematic evaluation of the quality of the learned structural representation and their usefulness in capturing functional relationships for downstream tasks. We find that: (1) large-scale pretraining on AlphaFold structures and auxiliary tasks consistently improve the performance of both rotation-invariant and equivariant GNNs, and (2) more expressive equivariant GNNs benefit from pretraining to a greater extent compared to invariant models. We aim to establish a common ground for the machine learning and computational biology communities to rigorously compare and advance protein structure representation learning. Our open-source codebase reduces the barrier to entry for working with large protein structure datasets by providing: (1) storage-efficient dataloaders for large-scale structural databases including AlphaFoldDB and ESM Atlas, as well as (2) utilities for constructing new tasks from the entire PDB. ProteinWorkshop is available at: github.com/a-r-j/ProteinWorkshop.
Submitted 19 June, 2024;
originally announced June 2024.
-
RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design
Authors:
Rishabh Anand,
Chaitanya K. Joshi,
Alex Morehead,
Arian R. Jamasb,
Charles Harris,
Simon V. Mathis,
Kieran Didi,
Bryan Hooi,
Pietro Liò
Abstract:
We introduce RNA-FrameFlow, the first generative model for 3D RNA backbone design. We build upon SE(3) flow matching for protein backbone generation and establish protocols for data preparation and evaluation to address unique challenges posed by RNA modeling. We formulate RNA structures as a set of rigid-body frames and associated loss functions which account for larger, more conformationally flexible RNA backbones (13 atoms per nucleotide) vs. proteins (4 atoms per residue). Toward tackling the lack of diversity in 3D RNA datasets, we explore training with structural clustering and cropping augmentations. Additionally, we define a suite of evaluation metrics to measure whether the generated RNA structures are globally self-consistent (via inverse folding followed by forward folding) and locally recover RNA-specific structural descriptors. The most performant version of RNA-FrameFlow generates locally realistic RNA backbones of 40-150 nucleotides, over 40% of which pass our validity criteria as measured by a self-consistency TM-score >= 0.45, at which two RNAs have the same global fold. Open-source code: https://github.com/rish-16/rna-backbone-design
Submitted 19 June, 2024;
originally announced June 2024.
-
Understanding Biology in the Age of Artificial Intelligence
Authors:
Elsa Lawrence,
Adham El-Shazly,
Srijit Seal,
Chaitanya K Joshi,
Pietro Liò,
Shantanu Singh,
Andreas Bender,
Pietro Sormanni,
Matthew Greenig
Abstract:
Modern life sciences research is increasingly relying on artificial intelligence approaches to model biological systems, primarily centered around the use of machine learning (ML) models. Although ML is undeniably useful for identifying patterns in large, complex data sets, its widespread application in biological sciences represents a significant deviation from traditional methods of scientific inquiry. As such, the interplay between these models and scientific understanding in biology is a topic with important implications for the future of scientific research, yet it is a subject that has received little attention. Here, we draw from an epistemological toolkit to contextualize recent applications of ML in biological sciences under modern philosophical theories of understanding, identifying general principles that can guide the design and application of ML systems to model biological phenomena and advance scientific knowledge. We propose that conceptions of scientific understanding as information compression, qualitative intelligibility, and dependency relation modelling provide a useful framework for interpreting ML-mediated understanding of biological systems. Through a detailed analysis of two key application areas of ML in modern biological research - protein structure prediction and single cell RNA-sequencing - we explore how these features have thus far enabled ML systems to advance scientific understanding of their target phenomena, how they may guide the development of future ML models, and the key obstacles that remain in preventing ML from achieving its potential as a tool for biological discovery. Consideration of the epistemological features of ML applications in biology will improve the prospects of these methods to solve important problems and advance scientific understanding of living systems.
Submitted 6 March, 2024;
originally announced March 2024.
-
A Hitchhiker's Guide to Geometric GNNs for 3D Atomic Systems
Authors:
Alexandre Duval,
Simon V. Mathis,
Chaitanya K. Joshi,
Victor Schmidt,
Santiago Miret,
Fragkiskos D. Malliaros,
Taco Cohen,
Pietro Liò,
Yoshua Bengio,
Michael Bronstein
Abstract:
Recent advances in computational modelling of atomic systems, spanning molecules, proteins, and materials, represent them as geometric graphs with atoms embedded as nodes in 3D Euclidean space. In these graphs, the geometric attributes transform according to the inherent physical symmetries of 3D atomic systems, including rotations and translations in Euclidean space, as well as node permutations. In recent years, Geometric Graph Neural Networks have emerged as the preferred machine learning architecture powering applications ranging from protein structure prediction to molecular simulations and material generation. Their specificity lies in the inductive biases they leverage - such as physical symmetries and chemical properties - to learn informative representations of these geometric graphs.
In this opinionated paper, we provide a comprehensive and self-contained overview of the field of Geometric GNNs for 3D atomic systems. We cover fundamental background material and introduce a pedagogical taxonomy of Geometric GNN architectures: (1) invariant networks, (2) equivariant networks in Cartesian basis, (3) equivariant networks in spherical basis, and (4) unconstrained networks. Additionally, we outline key datasets and application areas and suggest future research directions. The objective of this work is to present a structured perspective on the field, making it accessible to newcomers and aiding practitioners in gaining an intuition for its mathematical abstractions.
Submitted 13 March, 2024; v1 submitted 12 December, 2023;
originally announced December 2023.
-
Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems
Authors:
Xuan Zhang,
Limei Wang,
Jacob Helwig,
Youzhi Luo,
Cong Fu,
Yaochen Xie,
Meng Liu,
Yuchao Lin,
Zhao Xu,
Keqiang Yan,
Keir Adams,
Maurice Weiler,
Xiner Li,
Tianfan Fu,
Yucheng Wang,
Haiyang Yu,
YuQing Xie,
Xiang Fu,
Alex Strasser,
Shenglong Xu,
Yi Liu,
Yuanqi Du,
Alexandra Saxton,
Hongyi Ling,
Hannah Lawrence
, et al. (38 additional authors not shown)
Abstract:
Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural sciences. Today, AI has started to advance natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). Being an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed yet challenging. This work aims to provide a technically thorough account of a subarea of AI4Science; namely, AI for quantum, atomistic, and continuum systems. These areas aim at understanding the physical world from the subatomic (wavefunctions and electron density), atomic (molecules, proteins, materials, and interactions), to macro (fluids, climate, and subsurface) scales and form an important subarea of AI4Science. A unique advantage of focusing on these areas is that they largely share a common set of challenges, thereby allowing a unified and foundational treatment. A key common challenge is how to capture physics first principles, especially symmetries, in natural systems by deep learning methods. We provide an in-depth yet intuitive account of techniques to achieve equivariance to symmetry transformations. We also discuss other common technical challenges, including explainability, out-of-distribution generalization, knowledge transfer with foundation and large language models, and uncertainty quantification. To facilitate learning and education, we provide categorized lists of resources that we found to be useful. We strive to be thorough and unified and hope this initial effort may trigger more community interests and efforts to further advance AI4Science.
Submitted 15 November, 2023; v1 submitted 17 July, 2023;
originally announced July 2023.
-
Group Invariant Global Pooling
Authors:
Kamil Bujel,
Yonatan Gideoni,
Chaitanya K. Joshi,
Pietro Liò
Abstract:
Much work has been devoted to devising architectures that build group-equivariant representations, while invariance is often induced using simple global pooling mechanisms. Little work has been done on creating expressive layers that are invariant to given symmetries, despite the success of permutation invariant pooling in various molecular tasks. In this work, we present Group Invariant Global Pooling (GIGP), an invariant pooling layer that is provably sufficiently expressive to represent a large class of invariant functions. We validate GIGP on rotated MNIST and QM9, showing improvements for the latter while attaining identical results for the former. By making the pooling process group orbit-aware, this invariant aggregation method leads to improved performance, while performing well-principled group aggregation.
Submitted 30 May, 2023;
originally announced May 2023.
-
CENSUS-HWR: a large training dataset for offline handwriting recognition
Authors:
Chetan Joshi,
Lawry Sorenson,
Ammon Wolfert,
Dr. Mark Clement,
Dr. Joseph Price,
Dr. Kasey Buckles
Abstract:
Progress in Automated Handwriting Recognition has been hampered by the lack of large training datasets. Nearly all research uses a set of small datasets that often cause models to overfit. We present CENSUS-HWR, a new dataset consisting of full English handwritten words in 1,812,014 gray scale images. A total of 1,865,134 handwritten texts from a vocabulary of 10,711 words in the English language are present in this collection. This dataset is intended to serve handwriting models as a benchmark for deep learning algorithms. This huge English handwriting recognition dataset has been extracted from the US 1930 and 1940 censuses taken by approximately 70,000 enumerators each year. The dataset and the trained model with their weights are freely available to download at https://censustree.org/data.html.
Submitted 25 May, 2023;
originally announced May 2023.
-
gRNAde: Geometric Deep Learning for 3D RNA inverse design
Authors:
Chaitanya K. Joshi,
Arian R. Jamasb,
Ramon Viñas,
Charles Harris,
Simon V. Mathis,
Alex Morehead,
Rishabh Anand,
Pietro Liò
Abstract:
Computational RNA design tasks are often posed as inverse problems, where sequences are designed based on adopting a single desired secondary structure without considering 3D geometry and conformational diversity. We introduce gRNAde, a geometric RNA design pipeline operating on 3D RNA backbones to design sequences that explicitly account for structure and dynamics. Under the hood, gRNAde is a multi-state Graph Neural Network that generates candidate RNA sequences conditioned on one or more 3D backbone structures where the identities of the bases are unknown. On a single-state fixed backbone re-design benchmark of 14 RNA structures from the PDB identified by Das et al. [2010], gRNAde obtains higher native sequence recovery rates (56% on average) compared to Rosetta (45% on average), taking under a second to produce designs compared to the reported hours for Rosetta. We further demonstrate the utility of gRNAde on a new benchmark of multi-state design for structurally flexible RNAs, as well as zero-shot ranking of mutational fitness landscapes in a retrospective analysis of a recent RNA polymerase ribozyme structure. Open source code: https://github.com/chaitjo/geometric-rna-design
Submitted 25 May, 2024; v1 submitted 24 May, 2023;
originally announced May 2023.
-
On the Expressive Power of Geometric Graph Neural Networks
Authors:
Chaitanya K. Joshi,
Cristian Bodnar,
Simon V. Mathis,
Taco Cohen,
Pietro Liò
Abstract:
The expressive power of Graph Neural Networks (GNNs) has been studied extensively through the Weisfeiler-Leman (WL) graph isomorphism test. However, standard GNNs and the WL framework are inapplicable for geometric graphs embedded in Euclidean space, such as biomolecules, materials, and other physical systems. In this work, we propose a geometric version of the WL test (GWL) for discriminating geometric graphs while respecting the underlying physical symmetries: permutations, rotation, reflection, and translation. We use GWL to characterise the expressive power of geometric GNNs that are invariant or equivariant to physical symmetries in terms of distinguishing geometric graphs. GWL unpacks how key design choices influence geometric GNN expressivity: (1) Invariant layers have limited expressivity as they cannot distinguish one-hop identical geometric graphs; (2) Equivariant layers distinguish a larger class of graphs by propagating geometric information beyond local neighbourhoods; (3) Higher order tensors and scalarisation enable maximally powerful geometric GNNs; and (4) GWL's discrimination-based perspective is equivalent to universal approximation. Synthetic experiments supplementing our results are available at https://github.com/chaitjo/geometric-gnn-dojo
Submitted 3 March, 2024; v1 submitted 23 January, 2023;
originally announced January 2023.
-
On Representation Knowledge Distillation for Graph Neural Networks
Authors:
Chaitanya K. Joshi,
Fayao Liu,
Xu Xun,
Jie Lin,
Chuan-Sheng Foo
Abstract:
Knowledge distillation is a learning paradigm for boosting resource-efficient graph neural networks (GNNs) using more expressive yet cumbersome teacher models. Past work on distillation for GNNs proposed the Local Structure Preserving loss (LSP), which matches local structural relationships defined over edges across the student and teacher's node embeddings. This paper studies whether preserving the global topology of how the teacher embeds graph data can be a more effective distillation objective for GNNs, as real-world graphs often contain latent interactions and noisy edges. We propose Graph Contrastive Representation Distillation (G-CRD), which uses contrastive learning to implicitly preserve global topology by aligning the student node embeddings to those of the teacher in a shared representation space. Additionally, we introduce an expanded set of benchmarks on large-scale real-world datasets where the performance gap between teacher and student GNNs is non-negligible. Experiments across 4 datasets and 14 heterogeneous GNN architectures show that G-CRD consistently boosts the performance and robustness of lightweight GNNs, outperforming LSP (and a global structure preserving variant of LSP) as well as baselines from 2D computer vision. An analysis of the representational similarity among teacher and student embedding spaces reveals that G-CRD balances preserving local and global relationships, while structure preserving approaches are best at preserving one or the other. Our code is available at https://github.com/chaitjo/efficient-gnns
Submitted 4 February, 2023; v1 submitted 9 November, 2021;
originally announced November 2021.
-
Integrated Construction of Multimodal Atlases with Structural Connectomes in the Space of Riemannian Metrics
Authors:
Kristen M. Campbell,
Haocheng Dai,
Zhe Su,
Martin Bauer,
P. Thomas Fletcher,
Sarang C. Joshi
Abstract:
The structural network of the brain, or structural connectome, can be represented by fiber bundles generated by a variety of tractography methods. While such methods give qualitative insights into brain structure, there is controversy over whether they can provide quantitative information, especially at the population level. In order to enable population-level statistical analysis of the structural connectome, we propose representing a connectome as a Riemannian metric, which is a point on an infinite-dimensional manifold. We equip this manifold with the Ebin metric, a natural metric structure for this space, to get a Riemannian manifold along with its associated geometric properties. We then use this Riemannian framework to apply object-oriented statistical analysis to define an atlas as the Fréchet mean of a population of Riemannian metrics. This formulation ties into the existing framework for diffeomorphic construction of image atlases, allowing us to construct a multimodal atlas by simultaneously integrating complementary white matter structure details from DWMRI and cortical details from T1-weighted MRI. We illustrate our framework with 2D data examples of connectome registration and atlas formation. Finally, we build an example 3D multimodal atlas using T1 images and connectomes derived from diffusion tensors estimated from a subset of subjects from the Human Connectome Project.
Submitted 13 June, 2022; v1 submitted 20 September, 2021;
originally announced September 2021.
-
Point Discriminative Learning for Data-efficient 3D Point Cloud Analysis
Authors:
Fayao Liu,
Guosheng Lin,
Chuan-Sheng Foo,
Chaitanya K. Joshi,
Jie Lin
Abstract:
3D point cloud analysis has drawn a lot of research attention due to its wide applications. However, collecting massive labelled 3D point cloud data is both time-consuming and labor-intensive. This calls for data-efficient learning methods. In this work we propose PointDisc, a point discriminative learning method to leverage self-supervisions for data-efficient 3D point cloud classification and segmentation. PointDisc imposes a novel point discrimination loss on the middle and global level features produced by the backbone network. This point discrimination loss enforces learned features to be consistent with points belonging to the corresponding local shape region and inconsistent with randomly sampled noisy points. We conduct extensive experiments on 3D object classification, 3D semantic and part segmentation, showing the benefits of PointDisc for data-efficient learning. Detailed analysis demonstrate that PointDisc learns unsupervised features that well capture local and global geometry.
Submitted 20 January, 2023; v1 submitted 4 August, 2021;
originally announced August 2021.
-
Mathematical Modeling of Heat Conduction
Authors:
Abdul Aziz Momin,
Nikhil Shende,
Abhijna Anamtatmakula,
Emily Ganguly,
Ashwin Gurbani,
Chaitanya A Joshi,
Yogesh Y Mahajan
Abstract:
This report describes a mathematical model of heat conduction. The differential equation for heat conduction in one dimensional rod has been derived. The explicit finite difference numerical method is used to solve this differential equation. Then for simulation, a code was written in using python libraries via Jupyter notebook. The simulation carried out for Aluminum, Copper and Mild Steel rods and results were discussed.
Submitted 25 July, 2021;
originally announced July 2021.
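The explicit finite difference scheme named in this abstract can be sketched as follows. This is a minimal illustration only, assuming the standard one-dimensional heat equation u_t = alpha * u_xx with both rod ends held at fixed temperatures; the function name `heat_ftcs`, the grid, and the diffusivity value are hypothetical choices for the example and are not taken from the report.

```python
import numpy as np

def heat_ftcs(u0, alpha, dx, dt, steps):
    """Explicit (forward-time, centred-space) update for u_t = alpha * u_xx.

    The first and last entries of u0 are treated as fixed boundary
    temperatures (an assumption of this sketch).
    """
    r = alpha * dt / dx**2
    # The explicit scheme is only stable when r <= 1/2.
    if r > 0.5:
        raise ValueError("unstable step: alpha*dt/dx**2 must be <= 0.5")
    u = np.asarray(u0, dtype=float).copy()
    for _ in range(steps):
        # Interior points move toward the average of their neighbours.
        u[1:-1] = u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return u

# Illustrative rod: 1 m long, 11 grid points, one end held at 100, the other at 0.
# alpha = 1e-4 m^2/s is a placeholder diffusivity, not a value from the report.
rod = np.zeros(11)
rod[0] = 100.0
profile = heat_ftcs(rod, alpha=1e-4, dx=0.1, dt=25.0, steps=5000)
```

The stability check matters in practice: halving the grid spacing forces the time step down by a factor of four, which is the usual cost of choosing an explicit scheme over an implicit one.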
-
Structural Connectome Atlas Construction in the Space of Riemannian Metrics
Authors:
Kristen M. Campbell,
Haocheng Dai,
Zhe Su,
Martin Bauer,
P. Thomas Fletcher,
Sarang C. Joshi
Abstract:
The structural connectome is often represented by fiber bundles generated from various types of tractography. We propose a method of analyzing connectomes by representing them as a Riemannian metric, thereby viewing them as points in an infinite-dimensional manifold. After equipping this space with a natural metric structure, the Ebin metric, we apply object-oriented statistical analysis to define an atlas as the Fréchet mean of a population of Riemannian metrics. We demonstrate connectome registration and atlas formation using connectomes derived from diffusion tensors estimated from a subset of subjects from the Human Connectome Project.
Submitted 9 March, 2021;
originally announced March 2021.
-
Performance of Intelligent Reconfigurable Surface-Based Wireless Communications Using QAM Signaling
Authors:
Dharmendra Dixit,
Kishor Chandra Joshi,
Sanjeev Sharma
Abstract:
Intelligent reconfigurable surface (IRS) is being seen as a promising technology for 6G wireless networks. The IRS can reconfigure the wireless propagation environment, which results in significant performance improvement of wireless communications. In this paper, we analyze the performance of bandwidth-efficient quadrature amplitude modulation (QAM) techniques for IRS-assisted wireless communications over Rayleigh fading channels. New closed-form expressions of the generic average symbol error rate (ASER) for rectangular QAM, square QAM and cross QAM schemes are derived. Moreover, simplified expressions of the ASER for low signal-to-noise-ratio (SNR) and high SNR regions are also presented, which are useful to provide insights analytically. We comprehensively analyze the impact of modulation parameters and the number of IRS elements employed. We also verify our theoretical results through simulations. Our results demonstrate that employing IRS significantly enhances the ASER performance in comparison to additive white Gaussian noise channel at a low SNR regime. Thus, IRS-assisted wireless communications can be a promising candidate for various low powered communication applications such as internet-of-things (IoT).
Submitted 26 September, 2020;
originally announced October 2020.
-
Learning the Travelling Salesperson Problem Requires Rethinking Generalization
Authors:
Chaitanya K. Joshi,
Quentin Cappart,
Louis-Martin Rousseau,
Thomas Laurent
Abstract:
End-to-end training of neural network solvers for graph combinatorial optimization problems such as the Travelling Salesperson Problem (TSP) have seen a surge of interest recently, but remain intractable and inefficient beyond graphs with few hundreds of nodes. While state-of-the-art learning-driven approaches for TSP perform closely to classical solvers when trained on trivially small sizes, they are unable to generalize the learnt policy to larger instances at practical scales. This work presents an end-to-end neural combinatorial optimization pipeline that unifies several recent papers in order to identify the inductive biases, model architectures and learning algorithms that promote generalization to instances larger than those seen in training. Our controlled experiments provide the first principled investigation into such zero-shot generalization, revealing that extrapolating beyond training data requires rethinking the neural combinatorial optimization pipeline, from network layers and learning paradigms to evaluation protocols. Additionally, we analyze recent advances in deep learning for routing problems through the lens of our pipeline and provide new directions to stimulate future research.
Submitted 25 May, 2022; v1 submitted 12 June, 2020;
originally announced June 2020.
-
Various Secure Routing Schemes for MANETs: A Survey
Authors:
Priya R. Soni,
Charmi A. Joshi,
Dhwani R. Bhadra,
Nikita P. Vyas,
Rutvij H. Jhaveri
Abstract:
A MANET is an infrastructure-less, self-configuring network of mobile nodes that communicate with each other over a radio medium. Its distinctive properties, such as dynamic topology, decentralization, and a wireless medium, set MANETs apart from traditional networks and make security a major challenge. In this paper, we survey various security approaches for Mobile Ad hoc Networks and provide a comprehensive study of them. We focus on three approaches: Bayesian watchdog, trust-based systems, and ant colony optimization. Security is a crucial concern in wireless settings, and it therefore becomes essential when working with Mobile Ad hoc Networks.
Submitted 14 April, 2020;
originally announced April 2020.
-
Fast and Accurate Retrieval of Methane Concentration from Imaging Spectrometer Data Using Sparsity Prior
Authors:
Markus D. Foote,
Philip E. Dennison,
Andrew K. Thorpe,
David R. Thompson,
Siraput Jongaramrungruang,
Christian Frankenberg,
Sarang C. Joshi
Abstract:
The strong radiative forcing by atmospheric methane has stimulated interest in identifying natural and anthropogenic sources of this potent greenhouse gas. Point sources are important targets for quantification, and anthropogenic targets have potential for emissions reduction. Methane point source plume detection and concentration retrieval have been previously demonstrated using data from the Airborne Visible InfraRed Imaging Spectrometer Next Generation (AVIRIS-NG). Current quantitative methods have tradeoffs between computational requirements and retrieval accuracy, creating obstacles for processing real-time data or large datasets from flight campaigns. We present a new computationally efficient algorithm that applies sparsity and an albedo correction to matched filter retrieval of trace gas concentration-pathlength. The new algorithm was tested using AVIRIS-NG data acquired over several point source plumes in Ahmedabad, India. The algorithm was validated using simulated AVIRIS-NG data including synthetic plumes of known methane concentration. Sparsity and albedo correction together reduced the root mean squared error of retrieved methane concentration-pathlength enhancement by 60.7% compared with a previous robust matched filter method. Background noise was reduced by a factor of 2.64. The new algorithm was able to process the entire 300 flightline 2016 AVIRIS-NG India campaign in just over 8 hours on a desktop computer with GPU acceleration.
Submitted 5 March, 2020;
originally announced March 2020.
-
Benchmarking Graph Neural Networks
Authors:
Vijay Prakash Dwivedi,
Chaitanya K. Joshi,
Anh Tuan Luu,
Thomas Laurent,
Yoshua Bengio,
Xavier Bresson
Abstract:
In the last few years, graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs. This emerging field has witnessed an extensive growth of promising techniques that have been applied with success to computer science, mathematics, biology, physics and chemistry. But for any successful field to become mainstream and reliable, benchmarks must be developed to quantify progress. This led us in March 2020 to release a benchmark framework that i) comprises a diverse collection of mathematical and real-world graphs, ii) enables fair model comparison with the same parameter budget to identify key architectures, iii) has an open-source, easy-to-use and reproducible code infrastructure, and iv) is flexible for researchers to experiment with new theoretical ideas. As of December 2022, the GitHub repository has reached 2,000 stars and 380 forks, which demonstrates the utility of the proposed open-source framework through its wide usage by the GNN community. In this paper, we present an updated version of our benchmark with a concise presentation of the aforementioned framework characteristics, an additional medium-sized molecular dataset AQSOL, similar to the popular ZINC, but with a real-world measured chemical target, and discuss how this framework can be leveraged to explore new GNN designs and insights. As a proof of value of our benchmark, we study the case of graph positional encoding (PE) in GNNs, which was introduced with this benchmark and has since spurred interest in exploring more powerful PE for Transformers and GNNs in a robust experimental setting.
Submitted 27 December, 2022; v1 submitted 2 March, 2020;
originally announced March 2020.
-
Multi-Graph Transformer for Free-Hand Sketch Recognition
Authors:
Peng Xu,
Chaitanya K. Joshi,
Xavier Bresson
Abstract:
Learning meaningful representations of free-hand sketches remains a challenging task given the signal sparsity and the high-level abstraction of sketches. Existing techniques have focused on exploiting either the static nature of sketches with Convolutional Neural Networks (CNNs) or the temporal sequential property with Recurrent Neural Networks (RNNs). In this work, we propose a new representation of sketches as multiple sparsely connected graphs. We design a novel Graph Neural Network (GNN), the Multi-Graph Transformer (MGT), for learning representations of sketches from multiple graphs which simultaneously capture global and local geometric stroke structures, as well as temporal information. We report extensive numerical experiments on a sketch recognition task to demonstrate the performance of the proposed approach. In particular, MGT applied to 414k sketches from Google QuickDraw: (i) achieves a small recognition gap to the CNN-based performance upper bound (72.80% vs. 74.22%), and (ii) outperforms all RNN-based models by a significant margin. To the best of our knowledge, this is the first work proposing to represent sketches as graphs and apply GNNs for sketch recognition. Code and trained models are available at https://github.com/PengBoXiangShang/multigraph_transformer.
Submitted 25 March, 2021; v1 submitted 24 December, 2019;
originally announced December 2019.
-
Insider threat modeling: An adversarial risk analysis approach
Authors:
Chaitanya Joshi,
David Rios Insua,
Jesus Rios
Abstract:
Insider threats entail major security issues in geopolitics, cyber risk management and business organization. The game theoretic models proposed so far do not take into account some important factors such as the organisational culture and whether the attacker was detected or not. They also fail to model the defensive mechanisms already put in place by an organisation to mitigate an insider attack. We propose two new models which incorporate these settings and hence are more realistic. We use the adversarial risk analysis (ARA) approach to find the solution to our models. ARA does not assume common knowledge and solves the problem from the point of view of one of the players, taking into account their knowledge and uncertainties regarding the choices available to them, to their adversaries, the possible outcomes, their utilities and their opponents' utilities. Our models and the ARA solutions are general and can be applied to most insider threat scenarios. A data security example illustrates the discussion.
Submitted 22 November, 2019;
originally announced November 2019.
-
Adversarial Risk Analysis for First-Price Sealed-Bid Auctions
Authors:
Muhammad Ejaz,
Chaitanya Joshi,
Stephen Joe
Abstract:
Adversarial Risk Analysis (ARA) is an upcoming methodology that is considered to have advantages over the traditional decision theoretic and game theoretic approaches. ARA solutions for first-price sealed-bid (FPSB) auctions have been found but only under strong assumptions which make the model somewhat unrealistic. In this paper, we use ARA methodology to model FPSB auctions using more realistic assumptions. We define a new utility function that considers bidders' wealth, we assume a reserve price and find solutions not only for risk-neutral but also for risk-averse as well as risk-seeking bidders. We model the problem using ARA for non-strategic play and level-k thinking solution concepts.
Submitted 18 March, 2020; v1 submitted 21 November, 2019;
originally announced November 2019.
-
Learning Multiparametric Biomarkers for Assessing MR-Guided Focused Ultrasound Treatment of Malignant Tumors
Authors:
Blake E. Zimmerman,
Sara Johnson,
Henrik Odéen,
Jill Shea,
Markus D. Foote,
Nicole Winkler,
Sarang C. Joshi,
Allison Payne
Abstract:
Noninvasive MR-guided focused ultrasound (MRgFUS) treatments are promising alternatives to the surgical removal of malignant tumors. A significant challenge is assessing the viability of treated tissue during and immediately after MRgFUS procedures. Current clinical assessment uses the nonperfused volume (NPV) biomarker immediately after treatment from contrast-enhanced MRI. The NPV has variable accuracy, and the use of contrast agent prevents continuing MRgFUS treatment if tumor coverage is inadequate. This work presents a novel, noncontrast, learned multiparametric MR biomarker that can be used during treatment for intratreatment assessment, validated in a VX2 rabbit tumor model. A deep convolutional neural network was trained on noncontrast multiparametric MR images using the NPV biomarker from follow-up MR imaging (3-5 days after MRgFUS treatment) as the accurate label of nonviable tissue. A novel volume-conserving registration algorithm yielded a voxel-wise correlation between treatment and follow-up NPV, providing a rigorous validation of the biomarker. The learned noncontrast multiparametric MR biomarker predicted the follow-up NPV with an average DICE coefficient of 0.71, substantially outperforming the current clinical standard (DICE coefficient = 0.53). Noncontrast multiparametric MR imaging integrated with a deep convolutional neural network provides a more accurate prediction of MRgFUS treatment outcome than current contrast-based techniques.
Submitted 29 September, 2020; v1 submitted 23 October, 2019;
originally announced October 2019.
-
On Learning Paradigms for the Travelling Salesman Problem
Authors:
Chaitanya K. Joshi,
Thomas Laurent,
Xavier Bresson
Abstract:
We explore the impact of learning paradigms on training deep neural networks for the Travelling Salesman Problem. We design controlled experiments to train supervised learning (SL) and reinforcement learning (RL) models on fixed graph sizes up to 100 nodes, and evaluate them on variable sized graphs up to 500 nodes. Beyond not needing labelled data, our results reveal favorable properties of RL over SL: RL training leads to better emergent generalization to variable graph sizes and is a key component for learning scale-invariant solvers for novel combinatorial problems.
Submitted 31 October, 2019; v1 submitted 16 October, 2019;
originally announced October 2019.
-
Reinforcing Edge Computing with Multipath TCP Enabled Mobile Device Clouds
Authors:
Venkatraman Balasubramanian,
Kees Kroep,
Kishor Chandra Joshi,
R. Venkatesha Prasad
Abstract:
In recent years, enormous growth has been witnessed in the computational and storage capabilities of mobile devices. However, many of these computational and storage capabilities are not always fully used. On the other hand, the popularity of mobile edge computing, which aims to replace the traditional centralized powerful cloud with multiple edge servers, is rapidly growing. In particular, applications having strict latency requirements can be best served by the mobile edge clouds due to a reduced round-trip delay. In this paper, we propose a Multi-Path TCP (MPTCP) enabled mobile device cloud (MDC) as a replacement for the existing TCP-based or D2D device cloud techniques, as it makes effective use of the available bandwidth by providing much higher throughput and ensures robust wireless connectivity. We investigate the congestion in mobile device cloud formation that results mainly from the message passing for service-providing nodes at the time of discovery, service continuity and formation of cloud composition. We propose a user-space agent called congestion handler that enables offloading of packets from one sub-flow to another under link quality constraints. Further, we discuss the benefits of this design and perform a preliminary analysis of the system.
Submitted 30 October, 2019; v1 submitted 12 September, 2019;
originally announced September 2019.
-
Association, Blockage and Handoffs in IEEE 802.11ad based 60GHz Picocells- A Closer Look
Authors:
Kishor Chandra Joshi,
Rizqi Hersyandika,
R. Venkatesha Prasad
Abstract:
Link misalignment and high susceptibility to blockages are the biggest hurdles in realizing 60GHz based wireless local area networks (WLANs). However, most previous studies investigating 60GHz alignment and blockage issues do not provide an accurate quantitative evaluation from the perspective of WLANs. In this paper, we present an in-depth quantitative evaluation of commodity IEEE 802.11ad devices by forming a 60GHz WLAN with two docking stations acting as access points (APs). Through extensive experiments, we provide important insights about the directional coverage pattern of antennas, communication range, co-channel interference and blockages. We are able to measure the IEEE 802.11ad link alignment and association overheads in absolute time units. With a very high accuracy (96-97%), our blockage characterization can differentiate between temporary and permanent blockages caused by humans in the indoor environment, which is a key insight. Utilizing our blockage characterization, we also demonstrate intelligent handoff to alternate APs using consumer-grade IEEE 802.11ad devices. Our blockage-induced handoff experiments provide important insights that would be helpful in integrating millimeter wave based WLANs into future wireless networks.
Submitted 9 September, 2019;
originally announced September 2019.
-
Analyzing the Trade-offs in Using Millimeter Wave Directional Links for High Data Rate Tactile Internet Applications
Authors:
Kishor Chandra Joshi,
Solmaz Niknam,
R. Venkatesha Prasad,
Balasubramaniam Natarajan
Abstract:
Ultra-low latency and high reliability communications are the two defining characteristics of Tactile Internet (TI). Nevertheless, some TI applications would also require high data-rate transfer of audio-visual information to complement the haptic data. Using Millimeter wave (mmWave) communications is an attractive choice for high data-rate TI applications due to the availability of large bandwidth in the mmWave bands. Moreover, mmWave radio access is also advantageous for attaining the air-interface diversity required for high reliability in TI systems, as mmWave signal propagation differs significantly from sub-6GHz propagation. However, the use of narrow beamwidth in mmWave systems makes them susceptible to link misalignment-induced unreliability and high access latency. In this paper, we analyze the trade-offs between the high gain of narrow beamwidth antennas and the corresponding susceptibility to misalignment in mmWave links. To alleviate the effects of random antenna misalignment, we propose a beamwidth-adaptation scheme that significantly stabilizes the link throughput performance.
Submitted 9 September, 2019;
originally announced September 2019.
-
An Efficient Graph Convolutional Network Technique for the Travelling Salesman Problem
Authors:
Chaitanya K. Joshi,
Thomas Laurent,
Xavier Bresson
Abstract:
This paper introduces a new learning-based approach for approximately solving the Travelling Salesman Problem on 2D Euclidean graphs. We use deep Graph Convolutional Networks to build efficient TSP graph representations and output tours in a non-autoregressive manner via highly parallelized beam search. Our approach outperforms all recently proposed autoregressive deep learning techniques in terms of solution quality, inference speed and sample efficiency for problem instances of fixed graph sizes. In particular, we reduce the average optimality gap from 0.52% to 0.01% for 50 nodes, and from 2.26% to 1.39% for 100 nodes. Finally, despite improving upon other learning-based approaches for TSP, our approach falls short of standard Operations Research solvers.
Submitted 14 October, 2019; v1 submitted 4 June, 2019;
originally announced June 2019.
-
Working women and caste in India: A study of social disadvantage using feature attribution
Authors:
Kuhu Joshi,
Chaitanya K. Joshi
Abstract:
Women belonging to the socially disadvantaged caste-groups in India have historically been engaged in labour-intensive, blue-collar work. We study whether there has been any change in the ability to predict a woman's work-status and work-type based on her caste by interpreting machine learning models using feature attribution. We find that caste is now a less important determinant of work for the younger generation of women compared to the older generation. Moreover, younger women from disadvantaged castes are now more likely to be working in white-collar jobs.
Submitted 3 January, 2020; v1 submitted 27 April, 2019;
originally announced May 2019.
-
High Quality Prediction of Protein Q8 Secondary Structure by Diverse Neural Network Architectures
Authors:
Iddo Drori,
Isht Dwivedi,
Pranav Shrestha,
Jeffrey Wan,
Yueqi Wang,
Yunchu He,
Anthony Mazza,
Hugh Krogh-Freeman,
Dimitri Leggas,
Kendal Sandridge,
Linyong Nan,
Kaveri Thakoor,
Chinmay Joshi,
Sonam Goenka,
Chen Keasar,
Itsik Pe'er
Abstract:
We tackle the problem of protein secondary structure prediction using a common task framework. This led to the introduction of multiple ideas for neural architectures based on state-of-the-art building blocks, used in this task for the first time. We take a principled machine learning approach, which provides genuine, unbiased performance measures, correcting longstanding errors in the application domain. We focus on the Q8 resolution of secondary structure, an active area for continuously improving methods. We use an ensemble of strong predictors to achieve accuracy of 70.7% (on the CB513 test set using the CB6133filtered training set). These results are statistically indistinguishable from those of the top existing predictors. In the spirit of reproducible research we make our data, models and code available, aiming to set a gold standard for purity of training and testing sets. Such good practices lower entry barriers to this domain and facilitate reproducible, extendable research.
Submitted 17 November, 2018;
originally announced November 2018.
-
CLINIQA: A Machine Intelligence Based Clinical Question Answering System
Authors:
M A H Zahid,
Ankush Mittal,
R. C. Joshi,
G. Atluri
Abstract:
Recent developments in the field of biomedicine have made large volumes of biomedical literature available to medical practitioners. Due to the large size of this literature and the lack of efficient search strategies, medical practitioners struggle to obtain the necessary information available in it. Moreover, the most sophisticated search engines of the age are not intelligent enough to interpret clinicians' questions. These facts reflect the urgent need for an information retrieval system that accepts queries from medical practitioners in natural language and returns the answers quickly and efficiently. In this paper, we present an implementation of a machine intelligence based CLINIcal Question Answering system (CLINIQA) to answer medical practitioners' questions. The system was rigorously evaluated on different text mining algorithms and the best components for the system were selected. The system makes use of the Unified Medical Language System for semantic analysis of both questions and medical documents. In addition, the system employs supervised machine learning algorithms for classification of the documents, identifying the focus of the question and answer selection. Effective domain-specific heuristics are designed for answer ranking. The performance evaluation on one hundred clinical questions shows the effectiveness of our approach.
Submitted 15 May, 2018;
originally announced May 2018.
-
Personalization in Goal-Oriented Dialog
Authors:
Chaitanya K. Joshi,
Fei Mi,
Boi Faltings
Abstract:
The main goal of modeling human conversation is to create agents which can interact with people in both open-ended and goal-oriented scenarios. End-to-end trained neural dialog systems are an important line of research for such generalized dialog models as they do not resort to any situation-specific handcrafting of rules. However, incorporating personalization into such systems is a largely unexplored topic as there are no existing corpora to facilitate such work. In this paper, we present a new dataset of goal-oriented dialogs which are influenced by speaker profiles attached to them. We analyze the shortcomings of an existing end-to-end dialog system based on Memory Networks and propose modifications to the architecture which enable personalization. We also investigate personalization in dialog as a multi-task learning problem, and show that a single model which shares features among various profiles outperforms separate models for each profile.
Submitted 15 December, 2017; v1 submitted 22 June, 2017;
originally announced June 2017.
-
Distributed Denial of Service Prevention Techniques
Authors:
B. B. Gupta,
R. C. Joshi,
Manoj Misra
Abstract:
The significance of the DDoS problem and the increased occurrence, sophistication and strength of attacks have led to the dawn of numerous prevention mechanisms. Each proposed prevention mechanism has some unique advantages and disadvantages over the others. In this paper, we present a classification of available mechanisms that are proposed in the literature on preventing Internet services from possible DDoS attacks and discuss the strengths and weaknesses of each mechanism. This provides a better understanding of the problem and enables a security administrator to effectively equip his arsenal with proper prevention mechanisms for fighting against the DDoS threat.
Submitted 17 August, 2012;
originally announced August 2012.
-
Dynamic and Auto Responsive Solution for Distributed Denial-of-Service Attacks Detection in ISP Network
Authors:
B. B. Gupta,
R. C. Joshi,
Manoj Misra
Abstract:
Denial of service (DoS) attacks, and more particularly the distributed ones (DDoS), are among the latest threats and pose a grave danger to users, organizations and infrastructures of the Internet. Several schemes have been proposed on how to detect some of these attacks, but they suffer from a range of problems, some of them being impractical and others not being effective against these attacks. This paper reports the design principles and evaluation results of our proposed framework that autonomously detects and accurately characterizes a wide range of flooding DDoS attacks in an ISP network. Attacks are detected by the constant monitoring of the propagation of abrupt traffic changes inside the ISP network. For this, a newly designed flow-volume based approach (FVBA) is used to construct a profile of the traffic normally seen in the network, and identify anomalies whenever traffic goes out of profile. Consideration of varying tolerance factors makes the proposed detection system scalable to varying network conditions and attack loads in real time. The six-sigma method is used to identify threshold values accurately for malicious flow characterization. FVBA has been extensively evaluated in a controlled test-bed environment. Detection thresholds and efficiency are justified using the receiver operating characteristic (ROC) curve. For validation, KDD 99, a publicly available benchmark dataset, is used. The results show that our proposed system gives a drastic improvement in terms of detection and false alarm rate.
Submitted 25 April, 2012;
originally announced April 2012.
-
An Efficient Analytical Solution to Thwart DDoS Attacks in Public Domain
Authors:
B. B. Gupta,
R. C. Joshi,
Manoj Misra
Abstract:
In this paper, an analytical model for DDoS attack detection is proposed, in which the propagation of abrupt traffic changes inside a public domain is monitored to detect a wide range of DDoS attacks. Although various statistical measures can be used to construct a profile of the traffic normally seen in the network and to identify anomalies whenever traffic goes out of profile, we have selected volume and flow measures. Consideration of varying tolerance factors makes the proposed detection system scalable to varying network conditions and attack loads in real time. The NS-2 network simulator on a Linux platform is used as the simulation testbed. Simulation results show that our proposed solution gives a drastic improvement in terms of detection rate and false positive rate. However, the mammoth volume generated by DDoS attacks poses the biggest challenge in terms of memory and computational overheads as far as monitoring and analysis of traffic at a single point connecting the victim is concerned. To address this problem, a distributed cooperative technique is proposed that distributes memory and computational overheads to all edge routers for detecting a wide range of DDoS attacks at an early stage.
Submitted 25 April, 2012;
originally announced April 2012.
-
An ISP Level Solution to Combat DDoS Attacks using Combined Statistical Based Approach
Authors:
B. B. Gupta,
Manoj Misra,
R. C. Joshi
Abstract:
Disruption of service caused by DDoS attacks is an immense threat to the Internet today. These attacks can disrupt the availability of Internet services completely by consuming either computational or communication resources through the sheer volume of packets sent from distributed locations in a coordinated manner, or cause graceful degradation of network performance by sending attack traffic at a low rate. In this paper, we describe a novel framework that detects a variety of DDoS attacks by monitoring the propagation of abrupt traffic changes inside the ISP domain and then characterizes the flows that carry attack traffic. Two statistical metrics, namely volume and flow, are used as parameters to detect DDoS attacks. The effectiveness of an anomaly-based detection and characterization system depends heavily on the accuracy of its threshold value settings. Inaccurate threshold values cause a large number of false positives and negatives. Therefore, in our scheme, Six-Sigma and varying tolerance factor methods are used to identify threshold values accurately and dynamically for the various statistical metrics. The NS-2 network simulator on a Linux platform is used as the simulation testbed to validate the effectiveness of the proposed approach. Different attack scenarios are implemented by varying the total number of zombie machines and the attack strength. The comparison with a volume-based approach clearly indicates the superiority of our proposed system.
Submitted 12 March, 2012;
originally announced March 2012.
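The abstract above describes thresholds derived from a normal-traffic profile with a varying tolerance factor. The following is a minimal sketch of that general idea, not the authors' implementation: the abstract does not give the exact formula, so the rule "flag a window when it deviates from the profile mean by more than tolerance × sigma" is an assumption, and the names `build_profile`, `is_anomalous`, `tolerance`, and the sample values are all hypothetical.

```python
from statistics import mean, pstdev


def build_profile(normal_samples):
    """Return (mean, standard deviation) of attack-free traffic samples."""
    return mean(normal_samples), pstdev(normal_samples)


def is_anomalous(observed, profile, tolerance=3.0):
    """Flag a window whose value lies more than tolerance * sigma from the mean.

    ASSUMPTION: a symmetric mean +/- k*sigma rule stands in for the paper's
    threshold method, which the abstract does not spell out.
    """
    mu, sigma = profile
    return abs(observed - mu) > tolerance * sigma


# Hypothetical per-window packet volumes observed without an attack.
normal_volume = [100, 104, 98, 101, 97, 103, 99, 102]
profile = build_profile(normal_volume)

# A large surge is flagged; a small fluctuation is not.
surge_flagged = is_anomalous(180, profile)
small_flagged = is_anomalous(105, profile)
# Lowering the tolerance factor makes the detector more sensitive.
small_flagged_strict = is_anomalous(105, profile, tolerance=1.5)
```

Under this assumption, a smaller tolerance raises the detection rate at the cost of more false positives, which is the trade-off the abstract attributes to threshold accuracy.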
-
Estimating strength of DDoS attack using various regression models
Authors:
B. B. Gupta,
R. C. Joshi,
Manoj Misra
Abstract:
Anomaly-based DDoS detection systems construct a profile of the traffic normally seen in the network and identify anomalies whenever traffic deviates from the normal profile beyond a threshold. This extent of deviation is normally not utilised. This paper reports the evaluation results of a proposed approach that utilises this extent of deviation from the detection threshold to estimate the strength of a DDoS attack using various regression models. A relationship is established between the number of zombies and the observed deviation in sample entropy. Various statistical performance measures, such as coefficient of determination (R^2), coefficient of correlation (CC), sum of square error (SSE), mean square error (MSE), root mean square error (RMSE), normalised mean square error (NMSE), Nash-Sutcliffe efficiency index (η) and mean absolute error (MAE), are used to measure the performance of the regression models. Internet-type topologies used for simulation are generated using the transit-stub model of the GT-ITM topology generator. The NS-2 network simulator on a Linux platform is used as the simulation test bed for launching DDoS attacks with varied attack strength. A comparative study is performed using different regression models for estimating the strength of a DDoS attack. The simulation results are promising, as we are able to estimate the strength of a DDoS attack efficiently with a very low error rate using various regression models.
Submitted 12 March, 2012;
originally announced March 2012.
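Several of the error measures named in the abstract above have standard textbook definitions, and a short sketch may help show how they relate. This is an illustration only: the function name `regression_metrics` and the zombie counts are hypothetical, not data from the paper, and R^2 is computed here as 1 - SSE/SST, which is one common convention. CC, NMSE and η are left out because the abstract does not state which definitions the authors use.

```python
import math


def regression_metrics(actual, predicted):
    """Compute SSE, MSE, RMSE, MAE and R^2 for paired observations."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    sse = sum(r * r for r in residuals)           # sum of square error
    mse = sse / n                                 # mean square error
    rmse = math.sqrt(mse)                         # root mean square error
    mae = sum(abs(r) for r in residuals) / n      # mean absolute error
    mean_actual = sum(actual) / n
    sst = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1 - sse / sst                            # coefficient of determination
    return {"SSE": sse, "MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}


# HYPOTHETICAL values: true vs. estimated number of zombies per scenario.
actual_zombies = [10, 20, 30, 40]
predicted_zombies = [12, 18, 33, 39]
metrics = regression_metrics(actual_zombies, predicted_zombies)
```

Lower SSE, MSE, RMSE and MAE, and an R^2 closer to 1, indicate a regression model whose estimates track the true attack strength more closely.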