Search | arXiv e-print repository

CSIS: compressed sensing-based enhanced-embedding capacity image steganography scheme

Abstract: Image steganography plays a vital role in securing secret data by embedding it in the cover images. Usually, these images are communicated in a compressed format. Existing techniques achieve this but have low embedding capacity. Enhancing this capacity causes a deterioration in the visual quality of the stego-image. Hence, our goal here is to enhance the embedding capacity while preserving the visual quality of the stego-image. We also intend to ensure that our scheme is resistant to steganalysis attacks. This paper proposes a Compressed Sensing Image Steganography (CSIS) scheme to achieve our goal while embedding binary data in images. The novelty of our scheme is the combination of three components in attaining the above-listed goals. First, we use compressed sensing to sparsify cover image block-wise, obtain its linear measurements, and then uniquely select permissible measurements. Further, before embedding the secret data, we encrypt it using the Data Encryption Standard (DES) algorithm, and finally, we embed two bits of encrypted data into each permissible measurement. Second, we propose a novel data extraction technique, which is lossless and completely recovers our secret data. Third, for the reconstruction of the stego-image, we use the least absolute shrinkage and selection operator (LASSO) for the resultant optimization problem. We perform experiments on several standard grayscale images and a color image, and evaluate embedding capacity, PSNR value, mean SSIM index, NCC coefficients, and entropy.
We achieve 1.53 times more embedding capacity as compared to the most recent scheme. We obtain an average of 37.92 dB PSNR value, and average values close to 1 for both the mean SSIM index and the NCC coefficients, which are considered good. Moreover, the entropy of cover images and their corresponding stego-images are nearly the same.

Submitted 3 January, 2021; originally announced January 2021.

Comments: 12 pages double-column, 7 tables, and 11 figures

ACM Class: I.4.2; I.4.5; I.4.9; E.3

arXiv:2012.12141 [pdf, other]

Learning to Initialize Gradient Descent Using Gradient Descent

Authors: Kartik Ahuja, Amit Dhurandhar, Kush R. Varshney

Abstract: Non-convex optimization problems are challenging to solve; the success and computational expense of a gradient descent algorithm or variant depend heavily on the initialization strategy. Often, either random initialization is used or initialization rules are carefully designed by exploiting the nature of the problem class. As a simple alternative to hand-crafted initialization rules, we propose an approach for learning "good" initialization rules from previous solutions. We provide theoretical guarantees that establish conditions that are sufficient in all cases and also necessary in some under which our approach performs better than random initialization. We apply our methodology to various non-convex problems such as generating adversarial examples, generating post hoc explanations for black-box machine learning models, and allocating communication spectrum, and show consistent gains over other initialization techniques.

Submitted 22 December, 2020; originally announced December 2020.

arXiv:2011.03965 [pdf, other]

On the Practical Ability of Recurrent Neural Networks to Recognize Hierarchical Languages

Authors: Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

Abstract: While recurrent models have been effective in NLP tasks, their performance on context-free languages (CFLs) has been found to be quite weak. Given that CFLs are believed to capture important phenomena such as hierarchical structure in natural languages, this discrepancy in performance calls for an explanation. We study the performance of recurrent models on Dyck-n languages, a particularly important and well-studied class of CFLs. We find that while recurrent models generalize nearly perfectly if the lengths of the training and test strings are from the same range, they perform poorly if the test strings are longer. At the same time, we observe that recurrent models are expressive enough to recognize Dyck words of arbitrary lengths in finite precision if their depths are bounded. Hence, we evaluate our models on samples generated from Dyck languages with bounded depth and find that they are indeed able to generalize to much higher lengths. Since natural language datasets have nested dependencies of bounded depth, this may help explain why they perform well in modeling hierarchical dependencies in natural language data despite prior works indicating poor generalization performance on Dyck languages. We perform probing studies to support our results and provide comparisons with Transformers.

Submitted 8 November, 2020; originally announced November 2020.

Comments: COLING 2020
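
For readers unfamiliar with the terms in the abstract above, the following is a minimal sketch of what "Dyck words with bounded depth" means. It is not the authors' code or their data generator; the function names `dyck_depth` and `in_bounded_dyck` and the default bracket alphabet are assumptions made for illustration only.

```python
def dyck_depth(word, pairs=None):
    """Return the maximum nesting depth of `word` if it is a well-formed
    Dyck word over the bracket alphabet `pairs`, or None otherwise."""
    if pairs is None:
        # Hypothetical default: two bracket types, i.e. Dyck-2.
        pairs = {"(": ")", "[": "]"}
    closers = set(pairs.values())
    stack = []          # closing brackets we still expect, innermost last
    max_depth = 0
    for ch in word:
        if ch in pairs:
            stack.append(pairs[ch])
            max_depth = max(max_depth, len(stack))
        elif ch in closers:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != ch:
                return None
        else:
            return None  # symbol outside the alphabet
    # Well-formed only if every opened bracket was closed.
    return max_depth if not stack else None


def in_bounded_dyck(word, max_depth, pairs=None):
    """True if `word` is a Dyck word whose nesting depth is at most `max_depth`."""
    depth = dyck_depth(word, pairs)
    return depth is not None and depth <= max_depth
```

With a depth bound, the stack above never grows past `max_depth`, so only a finite amount of state is needed regardless of the word's length, which is the intuition behind the bounded-depth setting the abstract describes.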

arXiv:2010.16412 [pdf, other]

Empirical or Invariant Risk Minimization? A Sample Complexity Perspective

Authors: Kartik Ahuja, Jun Wang, Amit Dhurandhar, Karthikeyan Shanmugam, Kush R. Varshney

Abstract: Recently, invariant risk minimization (IRM) was proposed as a promising solution to address out-of-distribution (OOD) generalization. However, it is unclear when IRM should be preferred over the widely-employed empirical risk minimization (ERM) framework. In this work, we analyze both these frameworks from the perspective of sample complexity, thus taking a firm step towards answering this important question. We find that depending on the type of data generation mechanism, the two approaches might have very different finite sample and asymptotic behavior. For example, in the covariate shift setting we see that the two approaches not only arrive at the same asymptotic solution, but also have similar finite sample behavior with no clear winner. For other distribution shifts such as those involving confounders or anti-causal variables, however, the two approaches arrive at different asymptotic solutions where IRM is guaranteed to be close to the desired OOD solutions in the finite sample regime, while ERM is biased even asymptotically. We further investigate how different factors -- the number of environments, complexity of the model, and IRM penalty weight -- impact the sample complexity of IRM in relation to its distance from the OOD solutions.

Submitted 19 August, 2022; v1 submitted 30 October, 2020; originally announced October 2020.

arXiv:2010.15234 [pdf, other]

Linear Regression Games: Convergence Guarantees to Approximate Out-of-Distribution Solutions

Authors: Kartik Ahuja, Karthikeyan Shanmugam, Amit Dhurandhar

Abstract: Recently, invariant risk minimization (IRM) (Arjovsky et al.) was proposed as a promising solution to address out-of-distribution (OOD) generalization. In Ahuja et al., it was shown that solving for the Nash equilibria of a new class of "ensemble-games" is equivalent to solving IRM. In this work, we extend the framework in Ahuja et al. for linear regressions by projecting the ensemble-game on an $\ell_{\infty}$ ball. We show that such projections help achieve non-trivial OOD guarantees despite not achieving perfect invariance. For linear models with confounders, we prove that Nash equilibria of these games are closer to the ideal OOD solutions than the standard empirical risk minimization (ERM) and we also provide learning algorithms that provably converge to these Nash Equilibria. Empirical comparisons of the proposed approach with the state-of-the-art show consistent gains in achieving OOD solutions in several settings involving anti-causal variables and confounders.

Submitted 28 October, 2020; originally announced October 2020.
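
The abstract above mentions projecting onto an $\ell_{\infty}$ ball. As a minimal sketch of that one operation only: the Euclidean projection onto an origin-centred $\ell_{\infty}$ ball reduces to coordinate-wise clipping. How the paper applies the projection inside the ensemble-game is not stated in this listing, and the function name `project_linf_ball` and its parameters are hypothetical.

```python
def project_linf_ball(w, radius):
    """Project the vector `w` (a list of floats) onto the set
    {v : max_i |v_i| <= radius} by clipping each coordinate."""
    if radius < 0:
        raise ValueError("radius must be non-negative")
    # Each coordinate is handled independently: values inside
    # [-radius, radius] are unchanged, the rest move to the nearest bound.
    return [max(-radius, min(radius, wi)) for wi in w]
```

For example, `project_linf_ball([0.5, -2.0, 3.0], 1.0)` leaves the first coordinate alone and clips the other two to the boundary of the ball.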

arXiv:2010.11893 [pdf, other]

ParaLarH: Parallel FPGA Router based upon Lagrange Heuristics

Authors: Rohit Agrawal, Kapil Ahuja, Dhaarna Maheshwari, Akash Kumar

Abstract: Routing of the nets in Field Programmable Gate Array (FPGA) design flow is one of the most time-consuming steps. Although Versatile Place and Route (VPR), which is a commonly used algorithm for this purpose, routes effectively, it is slow in execution. One way to accelerate this design flow is to use parallelization. Since VPR is intrinsically sequential, a set of parallel algorithms have been recently proposed for this purpose (ParaLaR and ParaLarPD). These algorithms formulate the routing process as a Linear Program (LP) and solve it using the Lagrange relaxation, the sub-gradient method, and the Steiner tree algorithm. Out of the many metrics available to check the effectiveness of routing, ParaLarPD, which is an improved version of ParaLaR, suffers from large violations in the constraints of the LP problem (which is related to the minimum channel width metric) as well as an easily measurable critical path delay metric that can be improved further. In this paper, we introduce a set of novel Lagrange heuristics that improve the Lagrange relaxation process. When tested on the MCNC benchmark circuits, on average, this leads to halving of the constraints violation, up to 10% improvement in the minimum channel width, and up to 8% reduction in the critical path delay as obtained from ParaLarPD. We term our new algorithm ParaLarH.
Due to the increased work in the Lagrange relaxation process, as compared to ParaLarPD, ParaLarH does slightly deteriorate the speedup obtained because of parallelization; however, this is easily compensated for by using more threads.

Submitted 22 October, 2020; originally announced October 2020.

Comments: 10 pages, 3 Figures, and 5 Tables

MSC Class: 90C05; 90C10; ACM Class: G.1.6; C.1.4

arXiv:2009.11264 [pdf, other]

On the Ability and Limitations of Transformers to Recognize Formal Languages

Authors: Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

Abstract: Transformers have supplanted recurrent models in a large number of NLP tasks. However, the differences in their abilities to model different syntactic properties remain largely unknown. Past works suggest that LSTMs generalize very well on regular languages and have close connections with counter languages. In this work, we systematically study the ability of Transformers to model such languages as well as the role of its individual components in doing so. We first provide a construction of Transformers for a subclass of counter languages, including well-studied languages such as n-ary Boolean Expressions, Dyck-1, and its generalizations. In experiments, we find that Transformers do well on this subclass, and their learned mechanism strongly correlates with our construction. Perhaps surprisingly, in contrast to LSTMs, Transformers do well only on a subset of regular languages with degrading performance as we make languages more complex according to a well-known measure of complexity. Our analysis also provides insights on the role of self-attention mechanism in modeling certain behaviors and the influence of positional encoding schemes on the learning and generalization abilities of the model.

Submitted 8 October, 2020; v1 submitted 23 September, 2020; originally announced September 2020.

Comments: EMNLP 2020

arXiv:2009.09028 [pdf, ps, other]

Probabilistically Sampled and Spectrally Clustered Plant Genotypes using Phenotypic Characteristics

Authors: Aditya A. Shastri, Kapil Ahuja, Milind B. Ratnaparkhe, Yann Busnel

Abstract: Clustering genotypes based upon their phenotypic characteristics is used to obtain diverse sets of parents that are useful in their breeding programs. The Hierarchical Clustering (HC) algorithm is the current standard in clustering of phenotypic data. This algorithm suffers from low accuracy and high computational complexity issues. To address the accuracy challenge, we propose the use of the Spectral Clustering (SC) algorithm. To make the algorithm computationally cheap, we propose using sampling, specifically, Pivotal Sampling that is probability based. Since application of samplings to phenotypic data has not been explored much, for effective comparison, another sampling technique called Vector Quantization (VQ) is adapted for this data as well. VQ has recently given promising results for genome data. The novelty of our SC with Pivotal Sampling algorithm is in constructing the crucial similarity matrix for the clustering algorithm and defining probabilities for the sampling technique. Although our algorithm can be applied to any plant genotypes, we test it on the phenotypic data obtained from about 2400 Soybean genotypes. SC with Pivotal Sampling achieves substantially more accuracy (in terms of Silhouette Values) than all the other proposed competitive clustering with sampling algorithms (i.e. SC with VQ, HC with Pivotal Sampling, and HC with VQ). The complexities of our SC with Pivotal Sampling algorithm and these three variants are almost the same because of the involved sampling.
In addition to this, SC with Pivotal Sampling outperforms the standard HC algorithm in both accuracy and computational complexity. We experimentally show that we are up to 45% more accurate than HC in terms of clustering accuracy. The computational complexity of our algorithm is more than an order of magnitude lower than that of HC.

Submitted 18 September, 2020; originally announced September 2020.

Comments: 16 Pages, 3 Figures, and 6 Tables

MSC Class: 92B05; 68T09; ACM Class: I.2.1; J.3

arXiv:2007.07768 [pdf, other]

Opening the Software Engineering Toolbox for the Assessment of Trustworthy AI

Authors: Mohit Kumar Ahuja, Mohamed-Bachir Belaid, Pierre Bernabé, Mathieu Collet, Arnaud Gotlieb, Chhagan Lal, Dusica Marijan, Sagar Sen, Aizaz Sharif, Helge Spieker

Abstract: Trustworthiness is a central requirement for the acceptance and success of human-centered artificial intelligence (AI). To deem an AI system as trustworthy, it is crucial to assess its behaviour and characteristics against a gold standard of Trustworthy AI, consisting of guidelines, requirements, or only expectations. While AI systems are highly complex, their implementations are still based on software. The software engineering community has a long-established toolbox for the assessment of software systems, especially in the context of software testing. In this paper, we argue for the application of software engineering and testing practices for the assessment of trustworthy AI. We make the connection between the seven key requirements as defined by the European Commission's AI high-level expert group and established procedures from software engineering and raise questions for future work.

Submitted 30 August, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: 1st International Workshop on New Foundations for Human-Centered AI @ ECAI 2020

arXiv:2006.04621 [pdf, other]

Adversarial Feature Desensitization

Authors: Pouya Bashivan, Reza Bayat, Adam Ibrahim, Kartik Ahuja, Mojtaba Faramarzi, Touraj Laleh, Blake Aaron Richards, Irina Rish

Abstract: Neural networks are known to be vulnerable to adversarial attacks -- slight but carefully constructed perturbations of the inputs which can drastically impair the network's performance. Many defense methods have been proposed for improving robustness of deep networks by training them on adversarially perturbed inputs. However, these models often remain vulnerable to new types of attacks not seen during training, and even to slightly stronger versions of previously seen attacks. In this work, we propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field. Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant towards adversarial perturbations of the inputs. This is achieved through a game where we learn features that are both predictive and robust (insensitive to adversarial attacks), i.e. cannot be used to discriminate between natural and adversarial data. Empirical results on several benchmarks demonstrate the effectiveness of the proposed approach against a wide range of attack types and attack strengths. Our code is available at https://github.com/BashivanLab/afd.

Submitted 4 January, 2022; v1 submitted 8 June, 2020; originally announced June 2020.

Comments: Accepted at NeurIPS 2021

arXiv:2005.12951 [pdf]

Gaze-based Autism Detection for Adolescents and Young Adults using Prosaic Videos

Authors: Karan Ahuja, Abhishek Bose, Mohit Jain, Kuntal Dey, Anil Joshi, Krishnaveni Achary, Blessin Varkey, Chris Harrison, Mayank Goel

Abstract: Autism often remains undiagnosed in adolescents and adults. Prior research has indicated that an autistic individual often shows atypical fixation and gaze patterns. In this short paper, we demonstrate that by monitoring a user's gaze as they watch commonplace (i.e., not specialized, structured or coded) video, we can identify individuals with autism spectrum disorder. We recruited 35 autistic and 25 non-autistic individuals, and captured their gaze using an off-the-shelf eye tracker connected to a laptop. Within 15 seconds, our approach was 92.5% accurate at identifying individuals with an autism diagnosis. We envision such automatic detection being applied during e.g., the consumption of web media, which could allow for passive screening and adaptation of user interfaces.

Submitted 26 May, 2020; originally announced May 2020.

arXiv:2005.08417 [pdf, other]

Syntax-guided Controlled Generation of Paraphrases

Authors: Ashutosh Kumar, Kabir Ahuja, Raghuram Vadapalli, Partha Talukdar

Abstract: Given a sentence (e.g., "I like mangoes") and a constraint (e.g., sentiment flip), the goal of controlled text generation is to produce a sentence that adapts the input sentence to meet the requirements of the constraint (e.g., "I hate mangoes"). Going beyond such simple constraints, recent works have started exploring the incorporation of complex syntactic-guidance as constraints in the task of controlled paraphrase generation. In these methods, syntactic-guidance is sourced from a separate exemplar sentence. However, these prior works have only utilized limited syntactic information available in the parse tree of the exemplar sentence. We address this limitation in the paper and propose Syntax Guided Controlled Paraphraser (SGCP), an end-to-end framework for syntactic paraphrase generation. We find that SGCP can generate syntax conforming sentences while not compromising on relevance. We perform extensive automated and human evaluations over multiple real-world English language datasets to demonstrate the efficacy of SGCP over state-of-the-art baselines. To drive future research, we have made SGCP's source code available.

Submitted 17 May, 2020; originally announced May 2020.

Comments: 16 pages, 3 figures, Accepted to TACL 2020

arXiv:2003.12759 [pdf, other]

Reusing Preconditioners in Projection based Model Order Reduction Algorithms

Authors: Navneet Pratap Singh, Kapil Ahuja

Abstract: Dynamical systems are pervasive in almost all engineering and scientific applications. Simulating such systems is computationally very intensive. Hence, Model Order Reduction (MOR) is used to reduce them to a lower dimension. Most of the MOR algorithms require solving large sparse sequences of linear systems. Since using direct methods for solving such systems does not scale well in time with respect to the increase in the input dimension, efficient preconditioned iterative methods are commonly used. In one of our previous works, we have shown substantial improvements by reusing preconditioners for the parametric MOR (Singh et al. 2019). Here, we had proposed techniques for both the non-parametric and the parametric cases, but had applied them only to the latter. We have four main contributions here. First, we demonstrate that preconditioners can be reused more effectively in the non-parametric case as compared to the parametric one because of the lack of parameters in the former. Second, we show that reusing preconditioners is an art and it needs to be fine-tuned for the underlying MOR algorithm. Third, we describe the pitfalls in the algorithmic implementation of reusing preconditioners. Fourth and finally, we demonstrate this theory on a real life industrial problem (of size 1.2 million), where savings of up to 64% in the total computation time are obtained by reusing preconditioners. In absolute terms, this leads to a saving of 5 days.

Submitted 28 March, 2020; originally announced March 2020.

Comments: 12 Pages, 3 Figures, and 10 Tables

MSC Class: 65F10; 65F08; 15-04; 9305; 93C10

arXiv:2002.04692 [pdf, other]

Invariant Risk Minimization Games

Authors: Kartik Ahuja, Karthikeyan Shanmugam, Kush R. Varshney, Amit Dhurandhar

Abstract: The standard risk minimization paradigm of machine learning is brittle when operating in environments whose test distributions are different from the training distribution due to spurious correlations. Training on data from many environments and finding invariant predictors reduces the effect of spurious features by concentrating models on features that have a causal relationship with the outcome. In this work, we pose such invariant risk minimization as finding the Nash equilibrium of an ensemble game among several environments. By doing so, we develop a simple training algorithm that uses best response dynamics and, in our experiments, yields similar or better empirical accuracy with much lower variance than the challenging bi-level optimization problem of Arjovsky et al. (2019). One key theoretical contribution is showing that the set of Nash equilibria for the proposed game is equivalent to the set of invariant predictors for any finite number of environments, even with nonlinear classifiers and transformations. As a result, our method also retains the generalization guarantees to a large set of environments shown in Arjovsky et al. (2019). The proposed algorithm adds to the collection of successful game-theoretic machine learning algorithms such as generative adversarial networks.

Submitted 18 March, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

arXiv:1905.00586 [pdf, other]

Estimating Kullback-Leibler Divergence Using Kernel Machines

Authors: Kartik Ahuja

Abstract: Recently, a method called the Mutual Information Neural Estimator (MINE) that uses neural networks has been proposed to estimate mutual information and more generally the Kullback-Leibler (KL) divergence between two distributions. The method uses the Donsker-Varadhan representation to arrive at the estimate of the KL divergence and is better than the existing estimators in terms of scalability and flexibility. The output of the MINE algorithm is not guaranteed to be a consistent estimator. We propose a new estimator that, instead of searching among functions characterized by neural networks, searches the functions in a Reproducing Kernel Hilbert Space. We prove that the proposed estimator is consistent. We carry out simulations and show that when the datasets are small the proposed estimator is more reliable than the MINE estimator, and when the datasets are large the performance of the two methods is close.

Submitted 16 August, 2019; v1 submitted 2 May, 2019; originally announced May 2019.
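
For reference, the Donsker-Varadhan representation named in the abstract above can be written as follows. This is the standard textbook form, not a formula quoted from the paper; the symbols $P$, $Q$, and $T$ are notation assumed here for the two distributions and the test function.

```latex
$$
D_{\mathrm{KL}}(P \,\|\, Q)
  \;=\; \sup_{T}\; \Big\{ \mathbb{E}_{x \sim P}\big[T(x)\big]
  \;-\; \log \mathbb{E}_{x \sim Q}\big[e^{T(x)}\big] \Big\}
$$
```

The supremum runs over functions $T$ for which both expectations are finite. Restricting $T$ to a smaller family, such as neural networks or functions in a Reproducing Kernel Hilbert Space as described in the abstract, yields a lower bound on the divergence that is then maximized from samples.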

arXiv:1903.11649 [pdf, other]

Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment

Authors: Samyak Datta, Karan Sikka, Anirban Roy, Karuna Ahuja, Devi Parikh, Ajay Divakaran

Abstract: We address the problem of grounding free-form textual phrases by using weak supervision from image-caption pairs. We propose a novel end-to-end model that uses caption-to-image retrieval as a `downstream' task to guide the process of phrase localization. Our method, as a first step, infers the latent correspondences between regions-of-interest (RoIs) and phrases in the caption and creates a discriminative image representation using these matched RoIs. In a subsequent step, this (learned) representation is aligned with the caption. Our key contribution lies in building this `caption-conditioned' image encoding which tightly couples both the tasks and allows the weak supervision to effectively guide visual grounding. We provide an extensive empirical and qualitative analysis to investigate the different components of our proposed model and compare it with competitive baselines. For phrase localization, we report an improvement of 4.9% (absolute) over the prior state-of-the-art on the VisualGenome dataset. We also report results that are at par with the state-of-the-art on the downstream caption-to-image retrieval task on COCO and Flickr30k datasets.

Submitted 15 October, 2019; v1 submitted 27 March, 2019; originally announced March 2019.

Comments: v2 contains phrase localization results on Flickr30k Entities. Accepted for publication at ICCV 2019

arXiv:1903.01143 [pdf, ps, other]

Inexact Linear Solves In Model Reduction of Bilinear Dynamical Systems

Authors: Rajendra Choudhary, Kapil Ahuja

Abstract: Bilinear dynamical systems are commonly used in science and engineering because they form a bridge between linear and non-linear systems. However, simulating them is still a challenge because of their large size. Hence, a lot of research is currently being done for reducing such bilinear dynamical systems (termed as bilinear model order reduction or bilinear MOR). Bilinear iterative rational Krylov algorithm (BIRKA) is a very popular, standard and mathematically sound algorithm for bilinear MOR, which is based upon interpolatory projection technique. An efficient variant of BIRKA, Truncated BIRKA (or TBIRKA) has also been recently proposed. As with any MOR algorithm, these two algorithms also require solving multiple linear systems as part of the model reduction process. For reducing very large dynamical systems, which is nowadays becoming the norm, scaling of such linear systems with respect to input dynamical system size is a bottleneck. For efficiency, these linear systems are often solved by an iterative solver, which introduces approximation errors. Hence, stability analysis of MOR algorithms with respect to inexact linear solves is important. In our past work, we have shown that under mild conditions, BIRKA is stable (in the sense as discussed above). Here, we look at stability of TBIRKA in the same context. Besides deriving the conditions for a stable TBIRKA, our other novel contribution is the more intuitive methodology for achieving this.
This approach exploits the fact that in TBIRKA a bilinear dynamical system can be represented by a finite set of functions, which was not possible in BIRKA (because infinitely many such functions were needed there). The stability analysis techniques that we propose here can be extended to many other methods for doing MOR of bilinear dynamical systems, e.g., using balanced truncation or the ADI methods.

Submitted 12 March, 2019; v1 submitted 4 March, 2019; originally announced March 2019.

Comments: 25 Pages

MSC Class: 34D10; 34D20; 41A05; 65F10

arXiv:1811.09992 [pdf, other]

Externalities in Endogenous Sharing Economy Networks

Authors: Pramod C. Mane, Kapil Ahuja, Nagarajan Krishnamurthy

Abstract: This paper investigates the impact of link formation between a pair of agents on the resource availability of other agents (that is, externalities) in a social cloud network, a special case of endogenous sharing economy networks. Specifically, we study how the closeness between agents and the network size affect externalities. We conjecture, and experimentally support, that for an agent to experience positive externalities, an increase in its closeness is necessary. The condition is not sufficient though. We, then, show that for populated ring networks, one or more agents experience positive externalities due to an increase in the closeness of agents. Further, the initial distance between agents forming a link has a direct bearing on the number of beneficiaries, and the number of beneficiaries is always less than that of non-beneficiaries.

Submitted 18 October, 2019; v1 submitted 25 November, 2018; originally announced November 2018.

Comments: 7 Pages, 1 Table, 2 Figures

MSC Class: 91-08; 91B02; 91B74

arXiv:1811.00753 [pdf, other]

Risk-Stratify: Confident Stratification Of Patients Based On Risk

Authors: Kartik Ahuja, Mihaela van der Schaar

Abstract: A clinician desires to use a risk-stratification method that achieves confident risk-stratification - the risk estimates of the different patients reflect the true risks with a high probability. This allows him/her to use these risks to make accurate predictions about prognosis and decisions about screening, treatments for the current patient. We develop Risk-stratify - a two phase algorithm that is designed to achieve confident risk-stratification. In the first phase, we grow a tree to partition the covariate space. Each node in the tree is split using statistical tests that determine if the risks of the child nodes are different or not. The choice of the statistical tests depends on whether the data is censored (Log-rank test) or not (U-test). The set of the leaves of the tree form a partition. The risk distribution of patients that belong to a leaf is different from the sibling leaf but not the rest of the leaves. Therefore, some of the leaves that have similar underlying risks are incorrectly specified to have different risks. In the second phase, we develop a novel recursive graph decomposition approach to address this problem. We merge the leaves of the tree that have similar risks to form new leaves that form the final output. We apply Risk-stratify on a cohort of patients (with no history of cardiovascular disease) from UK Biobank and assess their risk for cardiovascular disease. Risk-stratify significantly improves risk-stratification, i.e., a lower fraction of the groups have over/under estimated risks (measured in terms of false discovery rate; 33% reduction) in comparison to state-of-the-art methods for cardiovascular prediction (Random forests, Cox model, etc.). We find that the Cox model significantly over estimates the risk of 21,621 patients out of 216,211 patients. Risk-stratify can accurately categorize 2,987 of these 21,621 patients as low-risk individuals.

Submitted 2 November, 2018; originally announced November 2018.

arXiv:1810.11207 [pdf, other]

Joint Concordance Index

Authors: Kartik Ahuja, Mihaela van der Schaar

Abstract: Existing metrics in competing risks survival analysis such as concordance and accuracy do not evaluate a model's ability to jointly predict the event type and the event time. To address these limitations, we propose a new metric, which we call the joint concordance. The joint concordance measures a model's ability to predict the overall risk profile, i.e., risk of death from different event types. We develop a consistent estimator for the new metric that accounts for the censoring bias. We use the new metric to develop a variable importance ranking approach. Using the real and synthetic data experiments, we show that models selected using the existing metrics are worse than those selected using joint concordance at jointly predicting the event type and event time. We show that the existing approaches for variable importance ranking often fail to recognize the importance of the event-specific risk factors, whereas, the proposed approach does not, since it compares risk factors based on their contribution to the prediction of the different event-types. To summarize, joint concordance is helpful for model comparisons and variable importance ranking and has the potential to impact applications such as risk-stratification and treatment planning in multimorbid populations.

Submitted 17 August, 2019; v1 submitted 26 October, 2018; originally announced October 2018.

arXiv:1810.00398 [pdf]

Vector Quantized Spectral Clustering applied to Soybean Whole Genome Sequences

Authors: Aditya A. Shastri, Kapil Ahuja, Milind B. Ratnaparkhe, Aditya Shah, Aishwary Gagrani, Anant Lal

Abstract: We develop a Vector Quantized Spectral Clustering (VQSC) algorithm that is a combination of Spectral Clustering (SC) and Vector Quantization (VQ) sampling for grouping Soybean genomes. The inspiration here is to use SC for its accuracy and VQ to make the algorithm computationally cheap (the complexity of SC is cubic in terms of the input size). Although the combination of SC and VQ is not new, the novelty of our work is in developing the crucial similarity matrix in SC as well as use of k-medoids in VQ, both adapted for the Soybean genome data. We compare our approach with commonly used techniques like UPGMA (Unweighted Pair Group Method with Arithmetic Mean) and NJ (Neighbour Joining). Experimental results show that our approach outperforms both these techniques significantly in terms of cluster quality (up to 25% better cluster quality) and time complexity (order of magnitude faster).

Submitted 30 September, 2018; originally announced October 2018.

Comments: 10 Pages, 3 Tables, 2 Figures

MSC Class: 68T01; 68T10; 68W40

arXiv:1809.06574 [pdf, ps, other]

Preconditioned Linear Solves for Parametric Model Order Reduction

Authors: Navneet Pratap Singh, Kapil Ahuja

Abstract: The main computational cost of algorithms for computing reduced-order models of parametric dynamical systems is in solving sequences of very large and sparse linear systems. We focus on efficiently solving these linear systems, arising while reducing second-order linear dynamical systems, by iterative methods with appropriate preconditioners. We propose that the choice of underlying iterative solver is problem dependent. We propose the use of block variant of the underlying iterative method because often all right-hand-side are available together. Since, Sparse Approximate Inverse (SPAI) preconditioner is a general preconditioner that can be naturally parallelized, we propose its use. Our most novel contribution is a technique to cheaply update the SPAI preconditioner, while solving the parametrically changing linear systems. We support our proposed theory by numerical experiments where we first show benefit of 80% in time by using a block iterative method, and a benefit of 70% in time by using SPAI updates.

Submitted 18 September, 2018; originally announced September 2018.

Comments: 15 Pages, 5 Tables

MSC Class: 34C20; 65F10

arXiv:1807.01448 [pdf, other]

Understanding Visual Ads by Aligning Symbols and Objects using Co-Attention

Authors: Karuna Ahuja, Karan Sikka, Anirban Roy, Ajay Divakaran

Abstract: We tackle the problem of understanding visual ads where given an ad image, our goal is to rank appropriate human generated statements describing the purpose of the ad. This problem is generally addressed by jointly embedding images and candidate statements to establish correspondence. Decoding a visual ad requires inference of both semantic and symbolic nuances referenced in an image and prior methods may fail to capture such associations especially with weakly annotated symbols. In order to create better embeddings, we leverage an attention mechanism to associate image proposals with symbols and thus effectively aggregate information from aligned multimodal representations. We propose a multihop co-attention mechanism that iteratively refines the attention map to ensure accurate attention estimation. Our attention based embedding model is learned end-to-end guided by a max-margin loss function. We show that our model outperforms other baselines on the benchmark Ad dataset and also show qualitative results to highlight the advantages of using multihop co-attention.

Submitted 4 July, 2018; originally announced July 2018.

Comments: Accepted at CVPR 2018 workshop- Towards Automatic Understanding of Visual Advertisements

arXiv:1806.10270 [pdf, other]

Optimal Piecewise Local-Linear Approximations

Authors: Kartik Ahuja, William Zame, Mihaela van der Schaar

Abstract: Existing works on "black-box" model interpretation use local-linear approximations to explain the predictions made for each data instance in terms of the importance assigned to the different features for arriving at the prediction. These works provide instancewise explanations and thus give a local view of the model. To be able to trust the model it is important to understand the global model behavior and there are relatively fewer works which do the same. Piecewise local-linear models provide a natural way to extend local-linear models to explain the global behavior of the model. In this work, we provide a dynamic programming based framework to obtain piecewise approximations of the black-box model. We also provide provable fidelity, i.e., how well the explanations reflect the black-box model, guarantees. We carry out simulations on synthetic and real datasets to show the utility of the proposed approach. At the end, we show that the ideas developed for our framework can also be used to address the problem of clustering for one-dimensional data. We give a polynomial time algorithm and prove that it achieves optimal clustering.

Submitted 27 August, 2019; v1 submitted 26 June, 2018; originally announced June 2018.

arXiv:1805.07892 [pdf, other]

Localized Multiple Kernel Learning for Anomaly Detection: One-class Classification

Authors: Chandan Gautam, Ramesh Balaji, K Sudharsan, Aruna Tiwari, Kapil Ahuja

Abstract: Multi-kernel learning has been well explored in the recent past and has exhibited promising outcomes for multi-class classification and regression tasks. In this paper, we present a multiple kernel learning approach for the One-class Classification (OCC) task and employ it for anomaly detection. Recently, the basic multi-kernel approach has been proposed to solve the OCC problem, which is simply a convex combination of different kernels with equal weights. This paper proposes a Localized Multiple Kernel learning approach for Anomaly Detection (LMKAD) using OCC, where the weight for each kernel is assigned locally. Proposed LMKAD approach adapts the weight for each kernel using a gating function. The parameters of the gating function and one-class classifier are optimized simultaneously through a two-step optimization process. We present the empirical results of the performance of LMKAD on 25 benchmark datasets from various disciplines. This performance is evaluated against existing Multi Kernel Anomaly Detection (MKAD) algorithm, and four other existing kernel-based one-class classifiers to showcase the credibility of our approach. Our algorithm achieves significantly better Gmean scores while using a lesser number of support vectors compared to MKAD. Friedman test is also performed to verify the statistical significance of the results claimed in this paper.

Submitted 17 July, 2018; v1 submitted 21 May, 2018; originally announced May 2018.

Comments: 21 pages, 9 Tables and 2 Figures

arXiv:1803.09283 [pdf, ps, other]

Stability Analysis of Inexact Solves in Moment Matching based Model Reduction

Authors: Navneet Pratap Singh, Kapil Ahuja

Abstract: Recently a new algorithm for model reduction of second order linear dynamical systems with proportional damping, the Adaptive Iterative Rational Global Arnoldi (AIRGA) algorithm (Bonin et al., 2016), has been proposed. The main computational cost of the AIRGA algorithm is solving a sequence of linear systems. Usually, direct methods (e.g., LU) are used for solving these systems. As model sizes grow, direct methods become prohibitively expensive. Iterative methods (e.g., Krylov) scale well with size, and hence, are a good choice with an appropriate preconditioner. Preconditioned iterative methods introduce errors in linear solves because they are not exact. They solve linear systems up to a certain tolerance. We prove that, under mild conditions, the AIRGA algorithm is backward stable with respect to the errors introduced by these inexact linear solves. Our first assumption is use of a Ritz-Galerkin based solver that satisfies few extra orthogonality conditions. Since Conjugate Gradient (CG) is the most popular method based upon the Ritz-Galerkin theory, we use it. We show how to modify CG to achieve these extra orthogonalities. Modifying CG with the suggested changes is non-trivial. Hence, we demonstrate that using Recycling CG (RCG) helps us achieve these orthogonalities with no code changes. The extra cost of orthogonalizations is often offset by savings because of recycling. Our second and third assumptions involve existence, invertibility and boundedness of two matrices, which are easy to satisfy. While satisfying the backward stability assumptions, by numerical experiments we show that as we iteratively solve the linear systems arising in the AIRGA algorithm more accurately, we obtain a more accurate reduced system. We also show that recycling Krylov subspaces helps satisfy the backward stability assumptions (extra-orthogonalities) at almost no extra cost.

Submitted 25 March, 2018; originally announced March 2018.

Comments: 24 Pages and 7 Tables

MSC Class: 34C20; 41A05; 65F10; 93A15; 93C05; 65L20

arXiv:1803.03885 [pdf, ps, other]

Parallel FPGA Router using Sub-Gradient method and Steiner tree

Authors: Rohit Agrawal, Chin Hao Hoo, Kapil Ahuja, Akash Kumar

Abstract: In the FPGA (Field Programmable Gate Arrays) design flow, one of the most time-consuming steps is the routing of nets. Therefore, there is a need to accelerate it. In a recent paper by Hoo et al., the authors have developed a Linear Programming based framework that parallelizes this routing process to achieve significant speedups (the algorithm is termed as ParaLaR). However, this approach has certain weaknesses. Namely, the constraints violation by the solution and a local minima that could be improved. We address these two issues here. In our paper, we use this framework and solve it using the Primal-Dual sub-gradient method that better exploits the problem properties. We also propose a better way to update the size of the step taken by this iterative algorithm. We perform experiments on a set of standard benchmarks, where we show that our algorithm outperforms the standard existing algorithms (VPR and ParaLaR). We achieve up to 22% improvement in the constraints violation and the standard metric of the minimum channel width when compared with ParaLaR (which is same as in VPR). We achieve about 20% savings in another standard metric of the total wire length (when compared with VPR), which is the same as for ParaLaR. Hence, our algorithm achieves minimum value for all the three parameters. Also, the critical path delay for our algorithm is almost same as compared to VPR and ParaLaR. We achieve relative speedups of 3 times when we run a parallel version of our algorithm using 4 threads.

Submitted 19 August, 2018; v1 submitted 10 March, 2018; originally announced March 2018.

Comments: 5 pages, double column, 1 figure, and 2 tables

arXiv:1711.10283 [pdf, other]

Data Backup Network Formation with Heterogeneous Agents

Authors: Harshit Jain, Guduru Sai Teja, Pramod Mane, Kapil Ahuja, Nagarajan Krishnamurthy

Abstract: Social storage systems are becoming increasingly popular compared to the existing data backup systems like local, centralized and P2P systems. An endogenously built symmetric social storage model and its aspects like the utility of each agent, bilateral stability, contentment, and efficiency have been extensively discussed in Mane et al. (2017). We include heterogeneity in this model by using the concept of Social Range Matrix from Kuznetsov et al. (2010). Now, each agent is concerned about its perceived utility, which is a linear combination of its utility as well as others' utilities (depending upon whether the pair are friends, enemies or do not care about each other). We derive conditions when two agents may want to add or delete a link, and provide an algorithm that checks if a bilaterally stable network is possible or not. Finally, we take some special Social Range Matrices and prove that under certain conditions on network parameters, a bilaterally stable network is unique.

Submitted 28 November, 2017; originally announced November 2017.

Comments: 3 Pages, double columns, 1 figure, extended abstract

MSC Class: 91

arXiv:1706.04119 [pdf, other]

doi 10.1186/s13040-018-0164-x

Investigating the Parameter Space of Evolutionary Algorithms

Authors: Moshe Sipper, Weixuan Fu, Karuna Ahuja, Jason H. Moore

Abstract: The practice of evolutionary algorithms involves the tuning of many parameters. How big should the population be? How many generations should the algorithm run? What is the (tournament selection) tournament size? What probabilities should one assign to crossover and mutation? Through an extensive series of experiments over multiple evolutionary algorithm implementations and problems we show that parameter space tends to be rife with viable parameters, at least for the problems studied herein. We discuss the implications of this finding in practice.

Submitted 10 October, 2017; v1 submitted 13 June, 2017; originally announced June 2017.

Journal ref: BioData Mining, 2018, 11:2
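
The parameters named in the evolutionary-algorithms abstract above (population size, number of generations, tournament size, crossover and mutation probabilities) can be illustrated with a minimal sketch. This is not the paper's experimental setup: the bit-string genome, the one-point crossover, the OneMax fitness (`sum`), all default values, and the names `tournament_select` and `evolve` are assumptions made purely for illustration.

```python
import random


def tournament_select(population, fitness, tournament_size, rng):
    """Draw `tournament_size` individuals at random and return the fittest."""
    contenders = rng.sample(population, tournament_size)
    return max(contenders, key=fitness)


def evolve(fitness, genome_length=20, population_size=30, generations=40,
           tournament_size=3, crossover_prob=0.8, mutation_prob=0.02, seed=0):
    """Toy generational GA over bit strings (illustrative defaults only)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        next_population = []
        while len(next_population) < population_size:
            parent_a = tournament_select(population, fitness, tournament_size, rng)
            parent_b = tournament_select(population, fitness, tournament_size, rng)
            child = list(parent_a)
            # One-point crossover, applied with probability crossover_prob.
            if rng.random() < crossover_prob:
                cut = rng.randrange(1, genome_length)
                child = parent_a[:cut] + parent_b[cut:]
            # Flip each bit independently with probability mutation_prob.
            child = [1 - gene if rng.random() < mutation_prob else gene
                     for gene in child]
            next_population.append(child)
        population = next_population
    return max(population, key=fitness)


# OneMax: fitness is the number of ones in the genome (hypothetical test problem).
best = evolve(fitness=sum)
```

Each keyword argument of `evolve` corresponds to one of the tuning questions the abstract raises, so a parameter sweep amounts to calling it over a grid of values.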

arXiv:1701.04508 [pdf, other]

Online Learning with Regularized Kernel for One-class Classification

Authors: Chandan Gautam, Aruna Tiwari, Sundaram Suresh, Kapil Ahuja

Abstract: This paper presents an online learning with regularized kernel based one-class extreme learning machine (ELM) classifier and is referred as online RK-OC-ELM. The baseline kernel hyperplane model considers whole data in a single chunk with regularized ELM approach for offline learning in case of one-class classification (OCC). Further, the basic hyperplane model is adapted in an online fashion from stream of training samples in this paper. Two frameworks viz., boundary and reconstruction are presented to detect the target class in online RK-OC-ELM. Boundary framework based one-class classifier consists of single node output architecture and classifier endeavors to approximate all data to any real number. However, one-class classifier based on reconstruction framework is an autoencoder architecture, where output nodes are identical to input nodes and classifier endeavor to reconstruct input layer at the output layer. Both these frameworks employ regularized kernel ELM based online learning and consistency based model selection has been employed to select learning algorithm parameters. The performance of online RK-OC-ELM has been evaluated on standard benchmark datasets as well as on artificial datasets and the results are compared with existing state-of-the art one-class classifiers. The results indicate that the online learning one-class classifier is slightly better or same as batch learning based approaches. As, base classifier used for the proposed classifiers are based on the ELM, hence, proposed classifiers would also inherit the benefit of the base classifier i.e. it will perform faster computation compared to traditional autoencoder based one-class classifier.

Submitted 9 April, 2018; v1 submitted 16 January, 2017; originally announced January 2017.

Comments: Paper has been submitted to special issue of IEEE Transactions on Systems, Man and Cybernetics: Systems with Manuscript ID: SMCA-16-09-1033, 3rd submission ID: SMCA-18-03-0322

arXiv:1701.04010 [pdf, ps, other]

Density-Wise Two Stage Mammogram Classification using Texture Exploiting Descriptors

Authors: Aditya A. Shastri, Deepti Tamrakar, Kapil Ahuja

Abstract: Breast cancer is becoming pervasive with each passing day. Hence, its early detection is a big step in saving the life of any patient. Mammography is a common tool in breast cancer diagnosis. The most important step here is classification of mammogram patches as normal-abnormal and benign-malignant. Texture of a breast in a mammogram patch plays a significant role in these classifications. We propose a variation of Histogram of Gradients (HOG) and Gabor filter combination called Histogram of Oriented Texture (HOT) that exploits this fact. We also revisit the Pass Band - Discrete Cosine Transform (PB-DCT) descriptor that captures texture information well. All features of a mammogram patch may not be useful. Hence, we apply a feature selection technique called Discrimination Potentiality (DP). Our resulting descriptors, DP-HOT and DP-PB-DCT, are compared with the standard descriptors. Density of a mammogram patch is important for classification, and has not been studied exhaustively. The Image Retrieval in Medical Application (IRMA) database from RWTH Aachen, Germany is a standard database that provides mammogram patches, and most researchers have tested their frameworks only on a subset of patches from this database. We apply our two new descriptors on all images of the IRMA database for density wise classification, and compare with the standard descriptors. We achieve higher accuracy than all of the existing standard descriptors (more than 92%).

Submitted 2 January, 2018; v1 submitted 15 January, 2017; originally announced January 2017.

Comments: 28 Pages, 8 Figures, and 7 Tables

arXiv:1608.04489 [pdf, other]

SenTion: A framework for Sensing Facial Expressions

Authors: Rahul Islam, Karan Ahuja, Sandip Karmakar, Ferdous Barbhuiya

Abstract: Facial expressions are an integral part of human cognition and communication, and can be applied in various real life applications. A vital precursor to accurate expression recognition is feature extraction. In this paper, we propose SenTion: A framework for sensing facial expressions. We propose a novel person independent and scale invariant method of extracting Inter Vector Angles (IVA) as geometric features, which proves to be robust and reliable across databases. SenTion employs a novel framework of combining geometric (IVAs) and appearance based features (Histogram of Gradients) to create a hybrid model that achieves state of the art recognition accuracy. We evaluate the performance of SenTion on two famous face expression data sets, namely CK+ and JAFFE, and subsequently evaluate the viability of facial expression systems by a user study. Extensive experiments showed that SenTion framework yielded dramatic improvements in facial expression recognition and could be employed in real-world applications with low resolution imaging and minimal computational resources in real-time, achieving 15-18 fps on a 2.4 GHz CPU with no GPU.

Submitted 16 August, 2016; originally announced August 2016.

arXiv:1606.01216 [pdf, ps, other]

Preconditioned Iterative Solves in Model Reduction of Second Order Linear Dynamical Systems

Authors: Navneet Pratap Singh, Kapil Ahuja, Heike Fassbender

Abstract: Recently a new algorithm for model reduction of second order linear dynamical systems with proportional damping, the Adaptive Iterative Rational Global Arnoldi (AIRGA) algorithm, has been proposed. The main computational cost of the AIRGA algorithm is in solving a sequence of linear systems. These linear systems do change only slightly from one iteration step to the next. Here we focus on efficiently solving these systems by iterative methods and the choice of an appropriate preconditioner. We propose the use of relevant iterative algorithm and the Sparse Approximate Inverse (SPAI) preconditioner. A technique to cheaply update the SPAI preconditioner in each iteration step of the model order reduction process is given. Moreover, it is shown that under certain conditions the AIRGA algorithm is stable with respect to the error introduced by iterative methods. Our theory is illustrated by experiments. It is demonstrated that SPAI preconditioned Conjugate Gradient (CG) works well for model reduction of a one dimensional beam model with AIRGA algorithm. Moreover, the computation time of preconditioner with update is on an average 2/3 rd of the computation time of preconditioner without update. With average timings running into hours for very large systems, such savings are substantial.

Submitted 14 February, 2017; v1 submitted 3 June, 2016; originally announced June 2016.

Comments: 21 pages, 9 tables, and 5 algorithms

MSC Class: 34C20; 65F10; 65L20

arXiv:1603.07689 [pdf, other]

Stability, Efficiency, and Contentedness of Social Storage Networks

Authors: Pramod Mane, Kapil Ahuja, Nagarajan Krishnamurthy

Abstract: Social storage systems are a good alternative to existing data backup systems of local, centralized, and P2P backup. In this paper, we look at two untouched aspects of social storage systems. One aspect involves modelling social storage as an endogenous social network, where agents themselves decide with whom they want to build a data backup relation. The second aspect involves studying the stability of social storage systems, which would help reduce maintenance costs and, further, help build efficient as well as contented networks. We have a four-fold contribution that covers the above two aspects. First, we model the social storage system as a strategic network formation game. We define the utility of each agent in the network under two different frameworks. Second, we propose the concept of bilateral stability, which refines the pairwise stability concept defined by Jackson et al. 1996 by requiring mutual consent for both addition and deletion of links, as compared to mutual consent just for link addition. Mutual consent for link deletion is especially important in the social storage setting. Third, we prove necessary and sufficient conditions for bilateral stability of social storage networks. For symmetric social storage networks, we prove that there exists a unique neighborhood size, independent of the number of agents (for all non-trivial cases), where no pair of agents has any incentive to increase or decrease their neighborhood size. We call this neighborhood size the stability point. Fourth, given the number of agents and other parameters, we discuss which bilaterally stable networks would evolve and also discuss which of these stable networks are efficient, that is, stable networks with the maximum sum of utilities of all agents. We also discuss ways to build contented networks, where each agent achieves the maximum possible utility.

Submitted 10 April, 2018; v1 submitted 24 March, 2016; originally announced March 2016.

Comments: 38 Pages, 7 Figures, and 4 Tables

arXiv:1603.06254 [pdf, ps, other]

Stability Analysis of Bilinear Iterative Rational Krylov Algorithm

Authors: Rajendra Choudhary, Kapil Ahuja

Abstract: Models coming from different physical applications are very large in size. Simulation with such systems is expensive so one usually obtains a reduced model (by model reduction) that replicates the input-output behaviour of the original full model. A recently proposed algorithm for model reduction of bilinear dynamical systems, Bilinear Iterative Rational Krylov Algorithm (BIRKA), does so in a locally optimal way. This algorithm requires solving very large linear systems of equations. Usually these systems are solved by direct methods (e.g., LU), which are very expensive. A better choice is iterative methods (e.g., Krylov). However, iterative methods introduce errors in linear solves because they are not exact. They solve the given linear system up to a certain tolerance. We prove that under some mild assumptions BIRKA is stable with respect to the error introduced by the inexact linear solves. We also analyze the accuracy of the reduced system obtained from using these inexact solves and support all our results by numerical experiments.

Submitted 3 September, 2017; v1 submitted 20 March, 2016; originally announced March 2016.

Comments: 29 pages, 6 figures, and 4 tables

MSC Class: 34C20; 41A05; 65F10; 65G99

arXiv:1602.02439 [pdf, other]

Dynamic Matching and Allocation of Tasks

Authors: Kartik Ahuja, Mihaela van der Schaar

Abstract: In many two-sided markets, the parties to be matched have incomplete information about their characteristics. We consider the settings where the parties engaged are extremely patient and are interested in long-term partnerships. Hence, once the final matches are determined, they persist for a long time. Each side has an opportunity to learn (some) relevant information about the other before final matches are made. For instance, clients seeking workers to perform tasks often conduct interviews that require the workers to perform some tasks and thereby provide information to both sides. The performance of a worker in such an interview, and hence the information revealed, depends both on the inherent characteristics of the worker and the task and also on the actions taken by the worker (e.g. the effort expended), which are not observed by the client. Thus there is moral hazard. Our goal is to derive a dynamic matching mechanism that facilitates learning on both sides before final matches are achieved and ensures that the worker side does not have an incentive to obscure learning of their characteristics through their actions. We derive such a mechanism that leads to a final matching that achieves optimal performance (revenue) in equilibrium. We show that the equilibrium strategy is long-run coalitionally stable, which means there is no subset of workers and clients that can gain by deviating from the equilibrium strategy. We derive all the results under the modeling assumption that the utilities of the agents are defined as the limit of means of the utility obtained in each interaction.

Submitted 28 August, 2019; v1 submitted 7 February, 2016; originally announced February 2016.

arXiv:1512.01230 [pdf, other]

A Theory of Individualism, Collectivism and Economic Outcomes

Authors: Kartik Ahuja, Mihaela van der Schaar, William R. Zame

Abstract: This paper presents a dynamic model to study the impact on the economic outcomes in different societies during the Malthusian Era of individualism (time spent working alone) and collectivism (complementary time spent working with others). The model is driven by opposing forces: a greater degree of collectivism provides a higher safety net for low quality workers but a greater degree of individualism allows high quality workers to leave larger bequests. The model suggests that more individualistic societies display smaller populations, greater per capita income and greater income inequality. Some (limited) historical evidence is consistent with these predictions.

Submitted 1 July, 2016; v1 submitted 3 December, 2015; originally announced December 2015.

arXiv:1511.02429 [pdf, ps, other]

A Micro-foundation of Social Capital in Evolving Social Networks

Authors: Ahmed M. Alaa, Kartik Ahuja, Mihaela van der Schaar

Abstract: A social network confers benefits and advantages on individuals (and on groups); the literature refers to these advantages as social capital. This paper presents a micro-founded mathematical model of the evolution of a social network and of the social capital of individuals within the network. The evolution of the network is influenced by the extent to which individuals are homophilic, structurally opportunistic, socially gregarious and by the distribution of types in the society. In the analysis, we identify different kinds of social capital: bonding capital, popularity capital, and bridging capital. Bonding capital is created by forming a circle of connections; homophily increases bonding capital because it makes this circle of connections more homogeneous. Popularity capital leads to preferential attachment: individuals who become popular tend to become more popular because others are more likely to link to them. Homophily creates asymmetries in the levels of popularity attained by different social groups; more gregarious types of agents are more likely to become popular. However, in homophilic societies, individuals who belong to less gregarious, less opportunistic, or major types are likely to be more central in the network and thus acquire bridging capital.

Submitted 7 November, 2015; originally announced November 2015.

Comments: Centrality, homophily, network formation, popularity, preferential attachment, social capital, social networks

arXiv:1509.06097 [pdf, ps, other]

The user base dynamics of websites

Authors: Kartik Ahuja, Simpson Zhang, Mihaela van der Schaar

Abstract: In this work we study for the first time the interaction between marketing and network effects. We build a model in which the online firm starts with an initial user base and controls the growth of the user base by choosing the intensity of advertisements and referrals to potential users. A large user base provides more profits to the online firm, but building a large user base through advertisements and referrals is costly; therefore, the optimal policy must balance the marginal benefits of adding users against the marginal costs of sending advertisements and referrals. Our work offers three main insights: (1) The optimal policy prescribes that a new online firm should offer many advertisements and referrals initially, but then it should decrease advertisements and referrals over time. (2) If the network effects decrease, then the change in the optimal policy depends heavily on two factors: (i) the level of patience of the online firm, where patient online firms are oriented towards long term profits and impatient online firms are oriented towards short term profits, and (ii) the size of the user base. If the online firm is very patient (impatient) and if the network effects decrease, then the optimal policy prescribes it to be more (less) aggressive in posting advertisements and referrals at low user base levels and less (more) aggressive in posting advertisements and referrals at high user base levels. (3) The change in the optimal policy when network effects decrease also depends heavily on the heterogeneity in the user base, as measured in terms of the revenue generated by each user. An online firm that generates most of its revenue from a core group of users should be more aggressive and protective of its user base than a firm that generates revenue uniformly from its users.

Submitted 29 October, 2015; v1 submitted 20 September, 2015; originally announced September 2015.

arXiv:1508.00205 [pdf, ps, other]

Evolution of Social Networks: A Microfounded Model

Authors: Ahmed M. Alaa, Kartik Ahuja, Mihaela van der Schaar

Abstract: Many societies are organized in networks that are formed by people who meet and interact over time. In this paper, we present a first model to capture the micro-foundations of social networks evolution, where boundedly rational agents of different types join the network; meet other agents stochastically over time; and consequently decide to form social ties. A basic premise of our model is that in real-world networks, agents form links by reasoning about the benefits that agents they meet over time can bestow. We study the evolution of the emerging networks in terms of friendship and popularity acquisition given the following exogenous parameters: structural opportunism, type distribution, homophily, and social gregariousness. We show that the time needed for an agent to find "friends" is influenced by the exogenous parameters: agents who are more gregarious, more homophilic, less opportunistic, or belong to a type "minority" spend a longer time on average searching for friendships. Moreover, we show that preferential attachment is a consequence of an emerging doubly preferential meeting process: a process that guides agents of a certain type to meet more popular similar-type agents with a higher probability, thereby creating asymmetries in the popularity evolution of different types of agents.

Submitted 14 August, 2015; v1 submitted 2 August, 2015; originally announced August 2015.

arXiv:1504.07009 [pdf, other]

Efficient Interference Management Policies for Femtocell Networks

Authors: Kartik Ahuja, Yuanzhang Xiao, Mihaela van der Schaar

Abstract: Managing interference in a network of macrocells underlaid with femtocells presents an important, yet challenging problem. A majority of spatial (frequency/time) reuse based approaches partition the users based on coloring the interference graph, which is shown to be suboptimal. Some spatial time reuse based approaches schedule the maximal independent sets (MISs) in a cyclic, (weighted) round-robin fashion, which is inefficient for delay-sensitive applications. Our proposed policies schedule the MISs in a non-cyclic fashion and aim to optimize any given network performance criterion for delay-sensitive applications while fulfilling minimum throughput requirements of the users. Importantly, we do not take the interference graph as given as in existing works; we propose an optimal construction of the interference graph. We prove that under certain conditions, the proposed policy achieves the optimal network performance. For large networks, we propose a low-complexity algorithm for computing the proposed policy. We show that the policy computed achieves a constant competitive ratio (with respect to the optimal network performance), which is independent of the network size, under a wide range of deployment scenarios. The policy can be implemented in a decentralized manner by the users. Compared to the existing policies, our proposed policies can achieve an improvement of up to 130% in large-scale deployments.

Submitted 27 April, 2015; originally announced April 2015.

arXiv:1503.04768 [pdf, ps, other]

Self-organizing Networks of Information Gathering Cognitive Agents

Authors: Ahmed M. Alaa, Kartik Ahuja, Mihaela Van der Schaar

Abstract: In many scenarios, networks emerge endogenously as cognitive agents establish links in order to exchange information. Network formation has been widely studied in economics, but only on the basis of simplistic models that assume that the value of each additional piece of information is constant. In this paper we present a first model and associated analysis for network formation under the much more realistic assumption that the value of each additional piece of information depends on the type of that piece of information and on the information already possessed: information may be complementary or redundant. We model the formation of a network as a non-cooperative game in which the actions are the formation of links and the benefit of forming a link is the value of the information exchanged minus the cost of forming the link. We characterize the topologies of the networks emerging at a Nash equilibrium (NE) of this game and compare the efficiency of equilibrium networks with the efficiency of centrally designed networks. To quantify the impact of information redundancy and linking cost on social information loss, we provide estimates for the Price of Anarchy (PoA); to quantify the impact on individual information loss we introduce and provide estimates for a measure we call Maximum Information Loss (MIL). Finally, we consider the setting in which agents are not endowed with information, but must produce it. We show that the validity of the well-known "law of the few" depends on how information aggregates; in particular, the "law of the few" fails when information displays complementarities.

Submitted 12 August, 2015; v1 submitted 16 March, 2015; originally announced March 2015.

arXiv:1501.03358 [pdf, other]

doi 10.1016/j.jcp.2015.09.040

Recycling Krylov subspaces for CFD applications and a new hybrid recycling solver

Authors: Amit Amritkar, Eric de Sturler, Katarzyna Świrydowicz, Danesh Tafti, Kapil Ahuja

Abstract: We focus on robust and efficient iterative solvers for the pressure Poisson equation in incompressible Navier-Stokes problems. Preconditioned Krylov subspace methods are popular for these problems, with BiCGStab and GMRES(m) most frequently used for nonsymmetric systems. BiCGStab is popular because it has cheap iterations, but it may fail for stiff problems, especially early on as the initial guess is far from the solution. Restarted GMRES is better, more robust, in this phase, but restarting may lead to very slow convergence. Therefore, we evaluate the rGCROT method for these systems. This method recycles a selected subspace of the search space (called recycle space) after a restart. This generally improves the convergence drastically compared with GMRES(m). Recycling subspaces is also advantageous for subsequent linear systems, if the matrix changes slowly or is constant. However, rGCROT iterations are still expensive in memory and computation time compared with those of BiCGStab. Hence, we propose a new, hybrid approach that combines the cheap iterations of BiCGStab with the robustness of rGCROT. For the first few time steps the algorithm uses rGCROT and builds an effective recycle space, and then it recycles that space in the rBiCGStab solver. We evaluate rGCROT on a turbulent channel flow problem, and we evaluate both rGCROT and the new, hybrid combination of rGCROT and rBiCGStab on a porous medium flow problem. We see substantial performance gains on both problems.

Submitted 25 September, 2015; v1 submitted 1 January, 2015; originally announced January 2015.

Comments: 26 pages, 7 figures

arXiv:1411.5107 [pdf, other]

Towards a Theory of Societal Co-Evolution: Individualism versus Collectivism

Authors: Kartik Ahuja, Simpson Zhang, Mihaela van der Schaar

Abstract: Substantial empirical research has shown that the level of individualism vs. collectivism is one of the most critical and important determinants of societal traits, such as economic growth, economic institutions and health conditions. But the exact nature of this impact has thus far not been well understood in an analytical setting. In this work, we develop one of the first theoretical models that analytically studies the impact of individualism-collectivism on the society. We model the growth of an individual's welfare (wealth, resources and health) as depending not only on himself, but also on the level of collectivism, i.e. the level of dependence on the rest of the individuals in the society, which leads to a co-evolutionary setting. Based on our model, we are able to predict the impact of individualism-collectivism on various societal metrics, such as average welfare, average life-time, total population, cumulative welfare and average inequality. We analytically show that individualism has a positive impact on average welfare and cumulative welfare, but comes with the drawbacks of lower average life-time, lower total population and higher average inequality.

Submitted 18 November, 2014; originally announced November 2014.

arXiv:1411.5102 [pdf, other]

Distributed Interference Management Policies for Heterogeneous Small Cell Networks

Authors: Kartik Ahuja, Yuanzhang Xiao, Mihaela van der Schaar

Abstract: We study the problem of interference management in large-scale small cell networks, where each user equipment (UE) needs to determine in a distributed manner when and at what power level it should transmit to its serving small cell base station (SBS) such that a given network performance criterion is maximized subject to minimum quality of service (QoS) requirements by the UEs. We first propose a distributed algorithm for the UE-SBS pairs to find a subset of weakly interfering UE-SBS pairs, namely the maximal independent sets (MISs) of the interference graph in logarithmic time (with respect to the number of UEs). Then we propose a novel problem formulation which enables UE-SBS pairs to determine the optimal fractions of time occupied by each MIS in a distributed manner. We analytically bound the performance of our distributed policy in terms of the competitive ratio with respect to the optimal network performance, which is obtained in a centralized manner with NP (non-deterministic polynomial time) complexity. Remarkably, the competitive ratio is independent of the network size, which guarantees scalability in terms of performance for arbitrarily large networks. Through simulations, we show that our proposed policies achieve significant performance improvements (from 150% to 700%) over the existing policies.

Submitted 7 March, 2015; v1 submitted 18 November, 2014; originally announced November 2014.

arXiv:1411.2702

Social Cloud: Concept, Current Trends and Future Scope

Authors: Pramod Mane, Monalisa Sarma, Debasis Samanta, Kapil Ahuja

Abstract: In recent years, various kinds of distributed resource sharing setups have been proposed by taking social relationships into consideration. These dissimilar resource sharing setups are tagged as Social Cloud. These setups have appeared in various distributed computing forms such as community cloud, grid, volunteer computing and network services. Such setups are discrete in nature, and hence, do not conceptualize the totality of the Social Cloud concept. In fact, it is difficult to conceptualize Social Cloud without a general framework. There are three main objectives of this work. First, to present a general framework of Social Cloud. Second, to report various Social Cloud setups with corresponding architectural prototypes and current trends. Third, to discuss research challenges.

Submitted 30 August, 2015; v1 submitted 11 November, 2014; originally announced November 2014.

Comments: This paper has been withdrawn by the author due to a crucial sign error in equation 1

arXiv:1406.2831 [pdf, ps, other]

Recycling BiCGSTAB with an Application to Parametric Model Order Reduction

Authors: Kapil Ahuja, Peter Benner, Eric de Sturler, Lihong Feng

Abstract: Krylov subspace recycling is a process for accelerating the convergence of sequences of linear systems. Based on this technique, the recycling BiCG algorithm has been developed recently. Here, we now generalize and extend this recycling theory to BiCGSTAB. Recycling BiCG focuses on efficiently solving sequences of dual linear systems, while the focus here is on efficiently solving sequences of single linear systems (assuming non-symmetric matrices for both recycling BiCG and recycling BiCGSTAB). As compared with other methods for solving sequences of single linear systems with non-symmetric matrices (e.g., recycling variants of GMRES), BiCG based recycling algorithms, like recycling BiCGSTAB, have the advantage that they involve a short-term recurrence, and hence, do not suffer from storage issues and are also cheaper with respect to the orthogonalizations. We modify the BiCGSTAB algorithm to use a recycle space, which is built from left and right approximate invariant subspaces. Using our algorithm for a parametric model order reduction example gives good results. We show about 40% savings in the number of matrix-vector products and about 35% savings in runtime.

Submitted 25 January, 2015; v1 submitted 11 June, 2014; originally announced June 2014.

Comments: 18 pages, 5 figures, Extended version of Max Planck Institute report (MPIMD/13-21)

MSC Class: 65F10; 65N22; 93A15; 93C05

arXiv:1107.3671 [pdf]

Impact of Mobility On QoS of Mobile WiMax Network With CBR Application

Authors: Kranti Bala, Kiran Ahuja

Abstract: The issue of mobility is important in wireless networks because internet connectivity can only be effective if it is available while a node is moving. To enhance mobility, wireless access systems such as IEEE 802.16e are designed to operate on the move without any disruption of services. In this paper we analyze the impact of mobility on the QoS parameters (Throughput, Average Jitter and Average end to end Delay) of a mobile WiMAX network (IEEE 802.16e) with CBR application.

Submitted 19 July, 2011; originally announced July 2011.

Comments: Total 7 Pages, 5 Figures and 1 table

Journal ref: International Journal of Advancements in Technology , Vol. 2, No. 3, July 2011

arXiv:1010.0762 [pdf, ps, other]

Recycling BiCG with an Application to Model Reduction

Authors: Kapil Ahuja, Eric de Sturler, Serkan Gugercin, Eun R. Chang

Abstract: Science and engineering problems frequently require solving a sequence of dual linear systems. Besides having to store only a few Lanczos vectors, using the BiConjugate Gradient method (BiCG) to solve dual linear systems has advantages for specific applications. For example, using BiCG to solve the dual linear systems arising in interpolatory model reduction provides a backward error formulation in the model reduction framework. Using BiCG to evaluate bilinear forms -- for example, in quantum Monte Carlo (QMC) methods for electronic structure calculations -- leads to a quadratic error bound. Since our focus is on sequences of dual linear systems, we introduce recycling BiCG, a BiCG method that recycles two Krylov subspaces from one pair of dual linear systems to the next pair. The derivation of recycling BiCG also builds the foundation for developing recycling variants of other bi-Lanczos based methods, such as CGS, BiCGSTAB, QMR, and TFQMR. We develop an augmented bi-Lanczos algorithm and a modified two-term recurrence to include recycling in the iteration. The recycle spaces are approximate left and right invariant subspaces corresponding to the eigenvalues closest to the origin. These recycle spaces are found by solving a small generalized eigenvalue problem alongside the dual linear systems being solved in the sequence. We test our algorithm in two application areas. First, we solve a discretized partial differential equation (PDE) of convection-diffusion type. Such a problem provides well-known test cases that are easy to test and analyze further. Second, we use recycling BiCG in the Iterative Rational Krylov Algorithm (IRKA) for interpolatory model reduction. IRKA requires solving sequences of slowly changing dual linear systems. We show up to 70% savings in iterations, and also demonstrate that for a model reduction problem BiCG takes (about) 50% more time than recycling BiCG.

Submitted 16 October, 2011; v1 submitted 5 October, 2010; originally announced October 2010.

Comments: 25 pages, 6 figures, 3 tables

MSC Class: 65F10; 65N22; 93A15; 93C05

arXiv:1008.5113 [pdf, ps, other]

Improved Scaling for Quantum Monte Carlo on Insulators

Authors: Kapil Ahuja, Bryan K. Clark, Eric de Sturler, David M. Ceperley, Jeongnim Kim

Abstract: Quantum Monte Carlo (QMC) methods are often used to calculate properties of many-body quantum systems. The main cost of many QMC methods, for example the variational Monte Carlo (VMC) method, is in constructing a sequence of Slater matrices and computing the ratios of determinants for successive Slater matrices. Recent work has improved the scaling of constructing Slater matrices for insulators so that the cost of constructing Slater matrices in these systems is now linear in the number of particles, whereas computing determinant ratios remains cubic in the number of particles. With the long-term aim of simulating much larger systems, we improve the scaling of computing the determinant ratios in the VMC method for simulating insulators by using preconditioned iterative solvers. The main contribution of this paper is the development of a method to efficiently compute for the Slater matrices a sequence of preconditioners that make the iterative solver converge rapidly. This involves cheap preconditioner updates, an effective reordering strategy, and a cheap method to monitor instability of ILUTP preconditioners. Using the resulting preconditioned iterative solvers to compute determinant ratios of consecutive Slater matrices reduces the scaling of QMC algorithms from O(n^3) per sweep to roughly O(n^2), where n is the number of particles, and a sweep is a sequence of n steps, each attempting to move a distinct particle. We demonstrate experimentally that we can achieve the improved scaling without increasing statistical errors. Our results show that preconditioned iterative solvers can dramatically reduce the cost of VMC for large(r) systems.

Submitted 6 May, 2011; v1 submitted 30 August, 2010; originally announced August 2010.

Comments: 24 pages, 10 figures
