
Complementing Semi-Supervised Learning with Uncertainty Quantification

Abstract: The problem of fully supervised classification is that it requires a tremendous amount of annotated data; however, in many datasets a large portion of data is unlabeled. To alleviate this problem, semi-supervised learning (SSL) leverages the knowledge of the classifier on the labeled domain and extrapolates it to the unlabeled domain, which has a supposedly similar distribution to the annotated data. Recent success of SSL methods crucially hinges on thresholded pseudo labeling and thereby consistency regularization for the unlabeled domain. However, the existing methods do not incorporate the uncertainty of the pseudo labels or unlabeled samples in the training process, which is due to noisy labels or out-of-distribution samples owing to strong augmentations. Inspired by the recent developments in SSL, our goal in this paper is to propose a novel unsupervised uncertainty-aware objective that relies on aleatoric and epistemic uncertainty quantification. Complementing the recent techniques in SSL with the proposed uncertainty-aware loss function, our approach outperforms or is on par with the state-of-the-art over standard SSL benchmarks while being computationally lightweight. Our results outperform the state-of-the-art results on complex datasets such as CIFAR-100 and Mini-ImageNet.

Submitted 21 July, 2022; originally announced July 2022.
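
The abstract above refers to thresholded pseudo labeling. As a rough illustration of that generic step only (not the paper's uncertainty-aware objective), the sketch below assigns each unlabeled sample its most likely class and keeps it only if the predicted probability passes a confidence threshold. The function name `pseudo_labels` and the default threshold of 0.95 are assumptions made for this example.

```python
def pseudo_labels(probs, threshold=0.95):
    """For each row of class probabilities, return (label, keep).

    label: index of the most probable class.
    keep:  True if that probability reaches the (assumed) threshold,
           i.e. the sample would contribute to the unlabeled loss.
    """
    results = []
    for p in probs:
        label = max(range(len(p)), key=lambda c: p[c])
        results.append((label, p[label] >= threshold))
    return results


if __name__ == "__main__":
    # Hypothetical softmax outputs for two unlabeled samples.
    batch = [[0.97, 0.02, 0.01], [0.50, 0.30, 0.20]]
    print(pseudo_labels(batch))
```

Only the first sample in the usage example is confident enough to receive a pseudo label; the second is masked out.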

arXiv:2112.02143 [pdf, other]

CTIN: Robust Contextual Transformer Network for Inertial Navigation

Authors: Bingbing Rao, Ehsan Kazemi, Yifan Ding, Devu M Shila, Frank M. Tucker, Liqiang Wang

Abstract: Recently, data-driven inertial navigation approaches have demonstrated their capability of using well-trained neural networks to obtain accurate position estimates from inertial measurement unit (IMU) measurements. In this paper, we propose a novel robust Contextual Transformer-based network for Inertial Navigation (CTIN) to accurately predict velocity and trajectory. To this end, we first design a ResNet-based encoder enhanced by local and global multi-head self-attention to capture spatial contextual information from IMU measurements. Then we fuse these spatial representations with temporal knowledge by leveraging multi-head attention in the Transformer decoder. Finally, multi-task learning with uncertainty reduction is leveraged to improve learning efficiency and prediction accuracy of velocity and trajectory. Extensive experiments over a wide range of inertial datasets (e.g. RIDI, OxIOD, RoNIN, IDOL, and our own) show that CTIN is very robust and outperforms state-of-the-art models.

Submitted 20 December, 2021; v1 submitted 3 December, 2021; originally announced December 2021.

Comments: Accepted as technical research paper in 36th AAAI Conference on Artificial Intelligence, 2022

arXiv:2104.02772 [pdf, other]

The Power of Subsampling in Submodular Maximization

Authors: Christopher Harshaw, Ehsan Kazemi, Moran Feldman, Amin Karbasi

Abstract: We propose subsampling as a unified algorithmic technique for submodular maximization in centralized and online settings. The idea is simple: independently sample elements from the ground set, and use simple combinatorial techniques (such as greedy or local search) on these sampled elements. We show that this approach leads to optimal/state-of-the-art results despite being much simpler than existing methods. In the usual offline setting, we present SampleGreedy, which obtains a $(p + 2 + o(1))$-approximation for maximizing a submodular function subject to a $p$-extendible system using $O(n + nk/p)$ evaluation and feasibility queries, where $k$ is the size of the largest feasible set. The approximation ratio improves to $p+1$ and $p$ for monotone submodular and linear objectives, respectively. In the streaming setting, we present SampleStreaming, which obtains a $(4p +2 - o(1))$-approximation for maximizing a submodular function subject to a $p$-matchoid using $O(k)$ memory and $O(km/p)$ evaluation and feasibility queries per element, where $m$ is the number of matroids defining the $p$-matchoid. The approximation ratio improves to $4p$ for monotone submodular objectives. We empirically demonstrate the effectiveness of our algorithms on video summarization, location summarization, and movie recommendation tasks.

Submitted 6 April, 2021; originally announced April 2021.

Comments: arXiv admin note: text overlap with arXiv:1802.07098
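
The "sample, then run greedy" idea described in the abstract above can be sketched in a few lines. This is a simplified illustration, not the paper's SampleGreedy: it assumes a plain cardinality constraint (at most `k` elements) instead of a $p$-extendible system, a toy set-coverage objective, and a sampling probability `q` supplied by the caller. The names `coverage`, `sample_then_greedy`, `q` and `seed` are all hypothetical.

```python
import random


def coverage(sets, chosen):
    """Toy monotone submodular objective: number of items covered by the chosen sets."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def sample_then_greedy(sets, k, q, seed=0):
    """Keep each element independently with probability q, then run plain
    greedy on the survivors under the (assumed) constraint |S| <= k."""
    rng = random.Random(seed)
    sampled = [i for i in range(len(sets)) if rng.random() < q]
    chosen = []
    while len(chosen) < k:
        base = coverage(sets, chosen)
        best, best_gain = None, 0
        for i in sampled:
            if i in chosen:
                continue
            gain = coverage(sets, chosen + [i]) - base
            if gain > best_gain:  # strict: ties keep the earlier element
                best, best_gain = i, gain
        if best is None:  # no sampled element adds value
            break
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    ground = [{1, 2, 3}, {3, 4}, {5}, {1, 2}]
    picked = sample_then_greedy(ground, k=2, q=0.5, seed=1)
    print(picked, coverage(ground, picked))
```

Greedy only ever evaluates the sampled elements, so a smaller `q` means fewer objective evaluations at the cost of ignoring part of the ground set.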

arXiv:2102.07360 [pdf, other]

Generating Structured Adversarial Attacks Using Frank-Wolfe Method

Authors: Ehsan Kazemi, Thomas Kerdreux, Liquang Wang

Abstract: White box adversarial perturbations are generated via iterative optimization algorithms most often by minimizing an adversarial loss on an $\ell_p$ neighborhood of the original image, the so-called distortion set. Constraining the adversarial search with different norms results in disparately structured adversarial examples. Here we explore several distortion sets with structure-enhancing algorithms. These new structures for adversarial examples might provide challenges for provable and empirical robust mechanisms. Because adversarial robustness is still an empirical field, defense mechanisms should also reasonably be evaluated against differently structured attacks. Besides, these structured adversarial perturbations may allow for larger distortion sizes than their $\ell_p$ counterpart while remaining imperceptible or perceptible as natural distortions of the image. We will demonstrate in this work that the proposed structured adversarial examples can significantly bring down the classification accuracy of adversarially trained classifiers while showing low $\ell_2$ distortion rate. For instance, on the ImageNet dataset the structured attacks drop the accuracy of the adversarial model to near zero with only 50% of the $\ell_2$ distortion generated using white-box attacks like PGD. As a byproduct, our findings on structured adversarial examples can be used for adversarial regularization of models to make models more robust or improve their generalization performance on datasets which are structurally different.

Submitted 15 February, 2021; originally announced February 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2007.01855

arXiv:2007.01855 [pdf, other]

Trace-Norm Adversarial Examples

Authors: Ehsan Kazemi, Thomas Kerdreux, Liqiang Wang

Abstract: White box adversarial perturbations are sought via iterative optimization algorithms most often minimizing an adversarial loss on an $l_p$ neighborhood of the original image, the so-called distortion set. Constraining the adversarial search with different norms results in disparately structured adversarial examples. Here we explore several distortion sets with structure-enhancing algorithms. These new structures for adversarial examples, yet pervasive in optimization, are for instance a challenge for adversarial theoretical certification which again provides only $l_p$ certificates. Because adversarial robustness is still an empirical field, defense mechanisms should also reasonably be evaluated against differently structured attacks. Besides, these structured adversarial perturbations may allow for larger distortion sizes than their $l_p$ counterpart while remaining imperceptible or perceptible as natural slight distortions of the image. Finally, they allow some control on the generation of the adversarial perturbation, like (localized) blurriness.

Submitted 2 July, 2020; originally announced July 2020.

arXiv:2006.09327 [pdf, other]

Submodular Maximization in Clean Linear Time

Authors: Wenxin Li, Moran Feldman, Ehsan Kazemi, Amin Karbasi

Abstract: In this paper, we provide the first deterministic algorithm that achieves the tight $1-1/e$ approximation guarantee for submodular maximization under a cardinality (size) constraint while making a number of queries that scales only linearly with the size of the ground set $n$. To complement our result, we also show strong information-theoretic lower bounds. More specifically, we show that when the maximum cardinality allowed for a solution is constant, no algorithm making a sub-linear number of function evaluations can guarantee any constant approximation ratio. Furthermore, when the constraint allows the selection of a constant fraction of the ground set, we show that any algorithm making fewer than $Ω(n/\log(n))$ function evaluations cannot perform better than an algorithm that simply outputs a uniformly random subset of the ground set of the right size. We then provide a variant of our deterministic algorithm for the more general knapsack constraint, which is the first linear-time algorithm that achieves $1/2$-approximation guarantee for this constraint. Finally, we extend our results to the general case of maximizing a monotone submodular function subject to the intersection of a $p$-set system and multiple knapsack constraints. We extensively evaluate the performance of our algorithms on multiple real-life machine learning applications, including movie recommendation, location summarization, Twitter text summarization and video summarization.

Submitted 12 April, 2022; v1 submitted 16 June, 2020; originally announced June 2020.

Comments: 48 pages, 4 figures

MSC Class: 90C27 (Primary) 68Q32; 68R05 (Secondary) ACM Class: F.2.2; G.2.1; I.2.6

arXiv:2006.08669 [pdf, other]

On Adversarial Bias and the Robustness of Fair Machine Learning

Authors: Hongyan Chang, Ta Duy Nguyen, Sasi Kumar Murakonda, Ehsan Kazemi, Reza Shokri

Abstract: Optimizing prediction accuracy can come at the expense of fairness. Towards minimizing discrimination against a group, fair machine learning algorithms strive to equalize the behavior of a model across different groups, by imposing a fairness constraint on models. However, we show that giving the same importance to groups of different sizes and distributions, to counteract the effect of bias in training data, can be in conflict with robustness. We analyze data poisoning attacks against group-based fair machine learning, with the focus on equalized odds. An adversary who can control sampling or labeling for a fraction of training data, can reduce the test accuracy significantly beyond what he can achieve on unconstrained models. Adversarial sampling and adversarial labeling attacks can also worsen the model's fairness gap on test data, even though the model satisfies the fairness constraint on training data. We analyze the robustness of fair machine learning through an empirical evaluation of attacks on multiple algorithms and benchmark datasets.

Submitted 15 June, 2020; originally announced June 2020.

arXiv:2002.03523 [pdf, other]

Submodular Maximization Through Barrier Functions

Authors: Ashwinkumar Badanidiyuru, Amin Karbasi, Ehsan Kazemi, Jan Vondrak

Abstract: In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular maximization but also provides the state of the art guarantee. More precisely, for maximizing a monotone submodular function subject to the combination of a $k$-matchoid and $\ell$-knapsack constraint (for $\ell\leq k$), we propose a potential function that can be approximately minimized. Once we minimize the potential function up to an $ε$ error it is guaranteed that we have found a feasible set with a $2(k+1+ε)$-approximation factor which can indeed be further improved to $(k+1+ε)$ by an enumeration technique. We extensively evaluate the performance of our proposed algorithm over several real-world applications, including a movie recommendation system, summarization tasks for YouTube videos, Twitter feeds and Yelp business locations, and a set cover problem.

Submitted 9 February, 2020; originally announced February 2020.

arXiv:2002.03503 [pdf, other]

Regularized Submodular Maximization at Scale

Authors: Ehsan Kazemi, Shervin Minaee, Moran Feldman, Amin Karbasi

Abstract: In this paper, we propose scalable methods for maximizing a regularized submodular function $f = g - \ell$ expressed as the difference between a monotone submodular function $g$ and a modular function $\ell$. Indeed, submodularity is inherently related to the notions of diversity, coverage, and representativeness. In particular, finding the mode of many popular probabilistic models of diversity, such as determinantal point processes, submodular probabilistic models, and strongly log-concave distributions, involves maximization of (regularized) submodular functions. Since a regularized function $f$ can potentially take on negative values, the classic theory of submodular maximization, which heavily relies on the non-negativity assumption of submodular functions, may not be applicable. To circumvent this challenge, we develop the first one-pass streaming algorithm for maximizing a regularized submodular function subject to a $k$-cardinality constraint. It returns a solution $S$ with the guarantee that $f(S)\geq(φ^{-2}-ε) \cdot g(OPT)-\ell (OPT)$, where $φ$ is the golden ratio. Furthermore, we develop the first distributed algorithm that returns a solution $S$ with the guarantee that $\mathbb{E}[f(S)] \geq (1-ε) [(1-e^{-1}) \cdot g(OPT)-\ell(OPT)]$ in $O(1/ ε)$ rounds of MapReduce computation, without keeping multiple copies of the entire dataset in each round (as it is usually done). We should highlight that our result, even for the unregularized case where the modular term $\ell$ is zero, improves the memory and communication complexity of the existing work by a factor of $O(1/ ε)$ while arguably providing a simpler distributed algorithm and a unifying analysis. We also empirically study the performance of our scalable methods on a set of real-life applications, including finding the mode of distributions, data summarization, and product recommendation.

Submitted 9 February, 2020; originally announced February 2020.

arXiv:2002.03352 [pdf, other]

Streaming Submodular Maximization under a $k$-Set System Constraint

Authors: Ran Haba, Ehsan Kazemi, Moran Feldman, Amin Karbasi

Abstract: In this paper, we propose a novel framework that converts streaming algorithms for monotone submodular maximization into streaming algorithms for non-monotone submodular maximization. This reduction readily leads to the currently tightest deterministic approximation ratio for submodular maximization subject to a $k$-matchoid constraint. Moreover, we propose the first streaming algorithm for monotone submodular maximization subject to $k$-extendible and $k$-set system constraints. Together with our proposed reduction, we obtain $O(k\log k)$ and $O(k^2\log k)$ approximation ratio for submodular maximization subject to the above constraints, respectively. We extensively evaluate the empirical performance of our algorithm against the existing work in a series of experiments including finding the maximum independent set in randomly generated graphs, maximizing linear functions over social networks, movie recommendation, Yelp location summarization, and Twitter data summarization.

Submitted 9 February, 2020; originally announced February 2020.

Comments: 28 pages; 8 figures. This paper subsumes arXiv:1906.04449, which was previously posted on arXiv and considered only the case of linear objective functions

MSC Class: 68W25 (Primary) 68R05 (Secondary) ACM Class: F.2.2; G.2.1; I.2.6

arXiv:1906.03489 [pdf, other]

doi 10.1016/j.cpc.2019.107110

Nektar++: enhancing the capability and application of high-fidelity spectral/$hp$ element methods

Authors: David Moxey, Chris D. Cantwell, Yan Bao, Andrea Cassinelli, Giacomo Castiglioni, Sehun Chun, Emilia Juda, Ehsan Kazemi, Kilian Lackhove, Julian Marcon, Gianmarco Mengaldo, Douglas Serson, Michael Turner, Hui Xu, Joaquim Peiró, Robert M. Kirby, Spencer J. Sherwin

Abstract: Nektar++ is an open-source framework that provides a flexible, high-performance and scalable platform for the development of solvers for partial differential equations using the high-order spectral/$hp$ element method. In particular, Nektar++ aims to overcome the complex implementation challenges that are often associated with high-order methods, thereby allowing them to be more readily used in a wide range of application areas. In this paper, we present the algorithmic, implementation and application developments associated with our Nektar++ version 5.0 release. We describe some of the key software and performance developments, including our strategies on parallel I/O, on in situ processing, the use of collective operations for exploiting current and emerging hardware, and interfaces to enable multi-solver coupling. Furthermore, we provide details on a newly developed Python interface that enables a more rapid introduction for new users unfamiliar with spectral/$hp$ element methods, C++ and/or Nektar++. This release also incorporates a number of numerical method developments, in particular: the method of moving frames, which provides an additional approach for the simulation of equations on embedded curvilinear manifolds and domains; a means of handling spatially variable polynomial order; and a novel technique for quasi-3D simulations to permit spatially-varying perturbations to the geometry in the homogeneous direction. Finally, we demonstrate the new application-level features provided in this release, namely: a facility for generating high-order curvilinear meshes called NekMesh; a new AcousticSolver for aeroacoustic problems; and our development of a 'thick' strip model for the modelling of fluid-structure interaction problems in the context of vortex-induced vibrations. We conclude by commenting on some directions for future code development and expansion.

Submitted 26 November, 2019; v1 submitted 8 June, 2019; originally announced June 2019.

Comments: 21 pages, 14 figures

Journal ref: Computer Physics Communications 249 (2020) 107110

arXiv:1905.00948 [pdf, other]

Submodular Streaming in All its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity

Authors: Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi, Amin Karbasi

Abstract: Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose Sieve-Streaming++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight $(1/2)$-approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal $(1/4)$-approximation with $Θ(k)$ memory or the optimal $(1/2)$-approximation with $O(k\log k)$ memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering procedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of Sieve-Streaming++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight $(1/2)$-approximation guarantee with $O(k)$ shared memory while minimizing not only the required rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos.

Submitted 13 May, 2019; v1 submitted 2 May, 2019; originally announced May 2019.

Comments: Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019
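
For readers unfamiliar with one-pass submodular streaming, the sketch below shows the basic thresholding step that sieve-style algorithms build on: an arriving element is kept only if its marginal gain reaches a threshold and the budget `k` is not exhausted. This is not Sieve-Streaming++ itself. It assumes a single threshold `tau` that is already known, whereas the algorithms in the abstract above have to manage thresholds without knowing the optimum. The names `threshold_stream`, `cover` and `tau` are hypothetical.

```python
def threshold_stream(stream, f, k, tau):
    """One pass over `stream`: keep an element if adding it to the current
    solution S increases f by at least tau, until S holds k elements."""
    S = []
    for e in stream:
        if len(S) >= k:
            break  # budget exhausted, remaining elements are ignored
        if f(S + [e]) - f(S) >= tau:
            S.append(e)
    return S


def cover(S):
    """Toy monotone submodular objective: size of the union of the chosen sets."""
    return len(set().union(*S))


if __name__ == "__main__":
    # Hypothetical stream of small sets; each set is one stream element.
    stream = [{1}, {1, 2, 3}, {3}, {4, 5}]
    print(threshold_stream(stream, cover, k=2, tau=2))
```

Memory stays at most `k` elements because nothing is stored unless it passes the threshold test at arrival time.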

arXiv:1902.05981 [pdf, other]

Adaptive Sequence Submodularity

Authors: Marko Mitrovic, Ehsan Kazemi, Moran Feldman, Andreas Krause, Amin Karbasi

Abstract: In many machine learning applications, one needs to interactively select a sequence of items (e.g., recommending movies based on a user's feedback) or make sequential decisions in a certain order (e.g., guiding an agent through a series of states). Not only do sequences already pose a dauntingly large search space, but we must also take into account past observations, as well as the uncertainty of future outcomes. Without further structure, finding an optimal sequence is notoriously challenging, if not completely intractable. In this paper, we view the problem of adaptive and sequential decision making through the lens of submodularity and propose an adaptive greedy policy with strong theoretical guarantees. Additionally, to demonstrate the practical utility of our results, we run experiments on Amazon product recommendation and Wikipedia link prediction tasks.

Submitted 20 June, 2019; v1 submitted 15 February, 2019; originally announced February 2019.

arXiv:1902.01856 [pdf, other]

Asynchronous Delay-Aware Accelerated Proximal Coordinate Descent for Nonconvex Nonsmooth Problems

Authors: Ehsan Kazemi, Liqiang Wang

Abstract: Nonconvex and nonsmooth problems have recently attracted considerable attention in machine learning. However, developing efficient methods for nonconvex and nonsmooth optimization problems with certain performance guarantees remains a challenge. Proximal coordinate descent (PCD) has been widely used for solving optimization problems, but the knowledge of PCD methods in the nonconvex setting is very limited. On the other hand, asynchronous proximal coordinate descent (APCD) has recently received much attention for solving large-scale problems. However, the accelerated variants of APCD algorithms are rarely studied. In this paper, we extend the APCD method to an accelerated algorithm (AAPCD) for nonsmooth and nonconvex problems that satisfies the sufficient descent property, by comparing the function values at the proximal update and at a linearly extrapolated point using a delay-aware momentum value. To the best of our knowledge, we are the first to provide stochastic and deterministic accelerated extensions of APCD algorithms for general nonconvex and nonsmooth problems, ensuring that for both bounded delays and unbounded delays every limit point is a critical point. By leveraging the Kurdyka-Lojasiewicz property, we show linear and sublinear convergence rates for the deterministic AAPCD with bounded delays. Numerical results demonstrate the practical efficiency of our algorithm in speed.

Submitted 4 February, 2019; originally announced February 2019.

arXiv:1811.04973 [pdf, other]

Eliminating Latent Discrimination: Train Then Mask

Authors: Soheil Ghili, Ehsan Kazemi, Amin Karbasi

Abstract: How can we control for latent discrimination in predictive models? How can we provably remove it? Such questions are at the heart of algorithmic fairness and its impacts on society. In this paper, we define a new operational fairness criterion, inspired by the well-understood notion of omitted-variable bias in statistics and econometrics. Our notion of fairness effectively controls for sensitive features and provides diagnostics for deviations from fair decision making. We then establish analytical and algorithmic results about the existence of a fair classifier in the context of supervised learning. Our results readily imply a simple, but rather counter-intuitive, strategy for eliminating latent discrimination. In order to prevent other features proxying for sensitive features, we need to include sensitive features in the training phase, but exclude them in the test/evaluation phase while controlling for their effects. We evaluate the performance of our algorithm on several real-world datasets and show how fairness for these datasets can be improved with a very small loss in accuracy.

Submitted 21 February, 2019; v1 submitted 12 November, 2018; originally announced November 2018.

arXiv:1810.10085 [pdf, ps, other]

A Proximal Zeroth-Order Algorithm for Nonconvex Nonsmooth Problems

Authors: Ehsan Kazemi, Liqiang Wang

Abstract: In this paper, we focus on solving an important class of nonconvex optimization problems that includes, for example, signal processing over a networked multi-agent system and distributed learning over networks. Motivated by many applications in which the local objective function is the sum of a smooth but possibly nonconvex part and a non-smooth but convex part subject to a linear equality constraint, this paper proposes a proximal zeroth-order primal dual algorithm (PZO-PDA) that accounts for the information structure of the problem. This algorithm utilizes only the zeroth-order information (i.e., the functional values) of smooth functions, which gives it the flexibility to handle applications in which only noisy information about the objective function is accessible and classical methods cannot be applied. We prove convergence and rate of convergence for PZO-PDA. Numerical experiments are provided to validate the theoretical results.

Submitted 17 October, 2018; originally announced October 2018.

arXiv:1806.02815 [pdf, other]

Data Summarization at Scale: A Two-Stage Submodular Approach

Authors: Marko Mitrovic, Ehsan Kazemi, Morteza Zadimoghaddam, Amin Karbasi

Abstract: The sheer scale of modern datasets has resulted in a dire need for summarization techniques that identify representative elements in a dataset. Fortunately, the vast majority of data summarization tasks satisfy an intuitive diminishing returns condition known as submodularity, which allows us to find nearly-optimal solutions in linear time. We focus on a two-stage submodular framework where the goal is to use some given training functions to reduce the ground set so that optimizing new functions (drawn from the same distribution) over the reduced set provides almost as much value as optimizing them over the entire ground set. In this paper, we develop the first streaming and distributed solutions to this problem. In addition to providing strong theoretical guarantees, we demonstrate both the utility and efficiency of our algorithms on real-world tasks including image summarization and ride-share optimization.

Submitted 7 June, 2018; originally announced June 2018.

arXiv:1804.10029 [pdf, other]

doi 10.1109/TCBB.2019.2914050

MPGM: Scalable and Accurate Multiple Network Alignment

Authors: Ehsan Kazemi, Matthias Grossglauser

Abstract: Protein-protein interaction (PPI) network alignment is a canonical operation to transfer biological knowledge among species. The alignment of PPI-networks has many applications, such as the prediction of protein function, detection of conserved network motifs, and the reconstruction of species' phylogenetic relationships. A good multiple-network alignment (MNA), by considering the data related to several species, provides a deep understanding of biological networks and system-level cellular processes. With the massive amounts of available PPI data and the increasing number of known PPI networks, the problem of MNA is gaining more attention in the systems-biology studies. In this paper, we introduce a new scalable and accurate algorithm, called MPGM, for aligning multiple networks. The MPGM algorithm has two main steps: (i) SEEDGENERATION and (ii) MULTIPLEPERCOLATION. In the first step, to generate an initial set of seed tuples, the SEEDGENERATION algorithm uses only protein sequence similarities. In the second step, to align remaining unmatched nodes, the MULTIPLEPERCOLATION algorithm uses network structures and the seed tuples generated from the first step. We show that, with respect to different evaluation criteria, MPGM outperforms the other state-of-the-art algorithms. In addition, we guarantee the performance of MPGM under certain classes of network models. We introduce a sampling-based stochastic model for generating k correlated networks. We prove that for this model, if a sufficient number of seed tuples are available, the MULTIPLEPERCOLATION algorithm correctly aligns almost all the nodes. Our theoretical results are supported by experimental evaluations over synthetic networks.

Submitted 13 May, 2019; v1 submitted 26 April, 2018; originally announced April 2018.

Journal ref: IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2019

arXiv:1802.07098 [pdf, other]

Do Less, Get More: Streaming Submodular Maximization with Subsampling

Authors: Moran Feldman, Amin Karbasi, Ehsan Kazemi

Abstract: In this paper, we develop the first one-pass streaming algorithm for submodular maximization that does not evaluate the entire stream even once. By carefully subsampling each element of the data stream, our algorithm enjoys the tightest approximation guarantees in various settings while having the smallest memory footprint and requiring the lowest number of function evaluations. More specifically, for a monotone submodular function and a $p$-matchoid constraint, our randomized algorithm achieves a $4p$ approximation ratio (in expectation) with $O(k)$ memory and $O(km/p)$ queries per element ($k$ is the size of the largest feasible solution and $m$ is the number of matroids used to define the constraint). For the non-monotone case, our approximation ratio increases only slightly to $4p+2-o(1)$. To the best of our knowledge, our algorithm is the first that combines the benefits of streaming and subsampling in a novel way in order to truly scale submodular maximization to massive machine learning problems. To showcase its practicality, we empirically evaluated the performance of our algorithm on a video summarization application and observed that it outperforms the state-of-the-art algorithm by up to fifty fold, while maintaining practically the same utility.

Submitted 20 February, 2018; originally announced February 2018.

arXiv:1802.06942 [pdf, other]

Comparison Based Learning from Weak Oracles

Authors: Ehsan Kazemi, Lin Chen, Sanjoy Dasgupta, Amin Karbasi

Abstract: There is increasing interest in learning algorithms that involve interaction between human and machine. Comparison-based queries are among the most natural ways to get feedback from humans. A challenge in designing comparison-based interactive learning algorithms is coping with noisy answers. The most common fix is to submit a query several times, but this is not applicable in many situations due to its prohibitive cost and due to the unrealistic assumption of independent noise in different repetitions of the same query. In this paper, we introduce a new weak oracle model, where a non-malicious user responds to a pairwise comparison query only when she is quite sure about the answer. This model is able to mimic the behavior of a human in noise-prone regions. We also consider the application of this weak oracle model to the problem of content search (a variant of the nearest neighbor search problem) through comparisons. More specifically, we aim at devising efficient algorithms to locate a target object in a database equipped with a dissimilarity metric via invocation of the weak comparison oracle. We propose two algorithms termed WORCS-I and WORCS-II (Weak-Oracle Comparison-based Search), which provably locate the target object in a number of comparisons close to the entropy of the target distribution. While WORCS-I provides better theoretical guarantees, WORCS-II is applicable to more technically challenging scenarios where the algorithm has limited access to the ranking dissimilarity between objects. A series of experiments validate the performance of our proposed algorithms.

Submitted 19 February, 2018; originally announced February 2018.

arXiv:1711.07112 [pdf, other]

Deletion-Robust Submodular Maximization at Scale

Authors: Ehsan Kazemi, Morteza Zadimoghaddam, Amin Karbasi

Abstract: Can we efficiently extract useful information from a large user-generated dataset while protecting the privacy of the users and/or ensuring fairness in representation? We cast this problem as an instance of deletion-robust submodular maximization, where part of the data may be deleted due to privacy concerns or fairness criteria. We propose the first memory-efficient centralized, streaming, and distributed methods with constant-factor approximation guarantees against any number of adversarial deletions. We extensively evaluate the performance of our algorithms against prior state-of-the-art on real-world applications, including (i) Uber pick-up locations with location privacy constraints; (ii) feature selection with fairness constraints for income prediction and crime rate prediction; and (iii) deletion-robust summarization of census data, consisting of 2,458,285 feature vectors.

Submitted 20 November, 2017; v1 submitted 19 November, 2017; originally announced November 2017.

Comments: 27 pages, 3 figures

arXiv:1602.00668 [pdf, other]

On the Structure and Efficient Computation of IsoRank Node Similarities

Authors: Ehsan Kazemi, Matthias Grossglauser

Abstract: The alignment of protein-protein interaction (PPI) networks has many applications, such as the detection of conserved biological network motifs, the prediction of protein interactions, and the reconstruction of phylogenetic trees [1, 2, 3]. IsoRank is one of the first global network alignment algorithms [4, 5, 6], where the goal is to match all (or most) of the nodes of two PPI networks. The IsoRank algorithm first computes a pairwise node similarity metric, and then generates a matching between the two node sets based on this metric. The metric is a convex combination of a structural similarity score (with weight $α$) and an extraneous amino-acid sequence similarity score for two proteins (with weight $1 - α$). In this short paper, we make two contributions. First, we show that when IsoRank similarity depends only on network structure ($α = 1$), the similarity of two nodes is only a function of their degrees. In other words, IsoRank similarity is invariant to any network rewiring that does not affect the node degrees. This result suggests a reason for the poor performance of IsoRank in structure-only ($α = 1$) alignment. Second, using ideas from [7, 8], we develop an approximation algorithm that outperforms IsoRank (including recent versions with better scaling, e.g., [9]) by several orders of magnitude in time and memory complexity, despite only a negligible loss in precision.

Submitted 24 February, 2016; v1 submitted 1 February, 2016; originally announced February 2016.

Comments: 8 pages and 1 figure

arXiv:1307.2084 [pdf, other]

Mitigating Epidemics through Mobile Micro-measures

Authors: Mohamed Kafsi, Ehsan Kazemi, Lucas Maystre, Lyudmila Yartseva, Matthias Grossglauser, Patrick Thiran

Abstract: Epidemics of infectious diseases are among the largest threats to the quality of life and the economic and social well-being of developing countries. The arsenal of measures against such epidemics is well-established, but costly and insufficient to mitigate their impact. In this paper, we argue that mobile technology adds a powerful weapon to this arsenal, because (a) mobile devices endow us with the unprecedented ability to measure and model the detailed behavioral patterns of the affected population, and (b) they enable the delivery of personalized behavioral recommendations to individuals in real time. We combine these two ideas and propose several strategies to generate such recommendations from mobility patterns. The goal of each strategy is a large reduction in infections, with a small impact on the normal course of daily life. We evaluate these strategies over the Orange D4D dataset and show the benefit of mobile micro-measures, even if only a fraction of the population participates. These preliminary results demonstrate the potential of mobile technology to complement other measures like vaccination and quarantines against disease epidemics.

Submitted 8 July, 2013; originally announced July 2013.

Comments: Presented at NetMob 2013, Boston

Showing 1–23 of 23 results for author: Kazemi, E