Showing 1–14 of 14 results for author: Haddadpour, F

Searching in archive cs.
  1. arXiv:2204.12446  [pdf, ps, other]

    stat.ML cs.LG

    Beyond Lipschitz: Sharp Generalization and Excess Risk Bounds for Full-Batch GD

    Authors: Konstantinos E. Nikolakakis, Farzin Haddadpour, Amin Karbasi, Dionysios S. Kalogerias

    Abstract: We provide sharp path-dependent generalization and excess risk guarantees for the full-batch Gradient Descent (GD) algorithm on smooth losses (possibly non-Lipschitz, possibly nonconvex). At the heart of our analysis is an upper bound on the generalization error, which implies that average output stability and a bounded expected optimization error at termination lead to generalization. This result…

    Submitted 9 February, 2023; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: 35 pages

  2. arXiv:2203.09607  [pdf, other]

    cs.LG stat.ML

    Learning Distributionally Robust Models at Scale via Composite Optimization

    Authors: Farzin Haddadpour, Mohammad Mahdi Kamani, Mehrdad Mahdavi, Amin Karbasi

    Abstract: To train machine learning models that are robust to distribution shifts in the data, distributionally robust optimization (DRO) has been proven very effective. However, the existing approaches to learning a distributionally robust model either require solving complex optimization problems such as semidefinite programming or a first-order method whose convergence scales linearly with the number of…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: Accepted to ICLR2022 as a conference paper. International Conference on Learning Representations (2022)

  3. arXiv:2202.06880  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Black-Box Generalization: Stability of Zeroth-Order Learning

    Authors: Konstantinos E. Nikolakakis, Farzin Haddadpour, Dionysios S. Kalogerias, Amin Karbasi

    Abstract: We provide the first generalization error analysis for black-box learning through derivative-free optimization. Under the assumption of a Lipschitz and smooth unknown loss, we consider the Zeroth-order Stochastic Search (ZoSS) algorithm, that updates a $d$-dimensional model by replacing stochastic gradient directions with stochastic differences of $K+1$ perturbed loss evaluations per dataset (exam…

    Submitted 9 February, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: 32 pages

  4. arXiv:2008.04975  [pdf, ps, other]

    stat.ML cs.DS cs.LG

    FedSKETCH: Communication-Efficient and Private Federated Learning via Sketching

    Authors: Farzin Haddadpour, Belhal Karimi, Ping Li, Xiaoyun Li

    Abstract: Communication complexity and privacy are the two key challenges in Federated Learning where the goal is to perform a distributed learning through a large volume of devices. In this work, we introduce FedSKETCH and FedSKETCHGATE algorithms to address both challenges in Federated learning jointly, where these algorithms are intended to be used for homogeneous and heterogeneous data distribution sett…

    Submitted 11 August, 2020; originally announced August 2020.

  5. arXiv:2007.01154  [pdf, other]

    cs.LG cs.DC stat.ML

    Federated Learning with Compression: Unified Analysis and Sharp Guarantees

    Authors: Farzin Haddadpour, Mohammad Mahdi Kamani, Aryan Mokhtari, Mehrdad Mahdavi

    Abstract: In federated learning, communication cost is often a critical bottleneck to scale up distributed optimization algorithms to collaboratively learn a model from millions of devices with potentially unreliable or limited communication and heterogeneous data distributions. Two notable trends to deal with the communication overhead of federated algorithms are gradient compression and local computation…

    Submitted 20 November, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

    Comments: version 2. more experiments and comparisons

  6. arXiv:1911.04931  [pdf, other]

    cs.LG cs.DS math.OC stat.ML

    Efficient Fair Principal Component Analysis

    Authors: Mohammad Mahdi Kamani, Farzin Haddadpour, Rana Forsati, Mehrdad Mahdavi

    Abstract: It has been shown that dimension reduction methods such as PCA may be inherently prone to unfairness and treat data from different sensitive groups such as race, color, sex, etc., unfairly. In pursuit of fairness-enhancing dimensionality reduction, using the notion of Pareto optimality, we propose an adaptive first-order algorithm to learn a subspace that preserves fairness, while slightly comprom…

    Submitted 7 March, 2020; v1 submitted 12 November, 2019; originally announced November 2019.

  7. arXiv:1910.14425  [pdf, other]

    cs.LG cs.DC stat.ML

    On the Convergence of Local Descent Methods in Federated Learning

    Authors: Farzin Haddadpour, Mehrdad Mahdavi

    Abstract: In federated distributed learning, the goal is to optimize a global training objective defined over distributed devices, where the data shard at each device is sampled from a possibly different distribution (a.k.a., heterogeneous or non i.i.d. data samples). In this paper, we generalize the local stochastic and full gradient descent with periodic averaging-- originally designed for homogeneous dis…

    Submitted 6 December, 2019; v1 submitted 31 October, 2019; originally announced October 2019.

    Comments: 47 pages, "Updates from v1: A technical error in Lemma B3 is corrected"

  8. arXiv:1910.13598  [pdf, other]

    cs.LG cs.DC stat.ML

    Local SGD with Periodic Averaging: Tighter Analysis and Adaptive Synchronization

    Authors: Farzin Haddadpour, Mohammad Mahdi Kamani, Mehrdad Mahdavi, Viveck R. Cadambe

    Abstract: Communication overhead is one of the key challenges that hinders the scalability of distributed optimization algorithms. In this paper, we study local distributed SGD, where data is partitioned among computation nodes, and the computation nodes perform local updates with periodically exchanging the model among the workers to perform averaging. While local SGD is empirically shown to provide promis…

    Submitted 14 May, 2020; v1 submitted 29 October, 2019; originally announced October 2019.

    Comments: Paper accepted to NeurIPS 2019 - We fixed a flaw in the earlier version regarding the dependency on constants but this change does not affect the communication complexity

  9. arXiv:1806.06140  [pdf, other]

    cs.IT

    Straggler-Resilient and Communication-Efficient Distributed Iterative Linear Solver

    Authors: Farzin Haddadpour, Yaoqing Yang, Malhar Chaudhari, Viveck R Cadambe, Pulkit Grover

    Abstract: We propose a novel distributed iterative linear inverse solver method. Our method, PolyLin, has significantly lower communication cost, both in terms of number of rounds as well as number of bits, in comparison with the state of the art at the cost of higher computational complexity and storage. Our algorithm also has a built-in resilience to straggling and faulty computation nodes. We develop a n…

    Submitted 15 June, 2018; originally announced June 2018.

    Comments: 15 pages, 3 figures and 2 tables

  10. arXiv:1801.10292  [pdf, other]

    cs.IT cs.DC

    On the Optimal Recovery Threshold of Coded Matrix Multiplication

    Authors: Sanghamitra Dutta, Mohammad Fahim, Farzin Haddadpour, Haewon Jeong, Viveck Cadambe, Pulkit Grover

    Abstract: We provide novel coded computation strategies for distributed matrix-matrix products that outperform the recent "Polynomial code" constructions in recovery threshold, i.e., the required number of successful workers. When $m$-th fraction of each matrix can be stored in each worker node, Polynomial codes require $m^2$ successful workers, while our MatDot codes only require $2m-1$ successful workers,…

    Submitted 16 May, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

    Comments: Extended version of the paper that appeared at Allerton 2017 (October 2017), including full proofs and further results. Submitted to IEEE Transactions on Information Theory

  11. arXiv:1605.02046  [pdf, other]

    cs.LG cs.AI cs.IT

    Low-Complexity Stochastic Generalized Belief Propagation

    Authors: Farzin Haddadpour, Mahdi Jafari Siavoshani, Morteza Noshad

    Abstract: The generalized belief propagation (GBP), introduced by Yedidia et al., is an extension of the belief propagation (BP) algorithm, which is widely used in different problems involved in calculating exact or approximate marginals of probability distributions. In many problems, it has been observed that the accuracy of GBP considerably outperforms that of BP. However, because in general the computati…

    Submitted 6 May, 2016; originally announced May 2016.

    Comments: 18 pages, 11 figures, a shorter version of this paper was accepted in ISIT'16

  12. arXiv:1305.5901  [pdf, other]

    cs.IT

    Simulation of a Channel with Another Channel

    Authors: Farzin Haddadpour, Mohammad Hossein Yassaee, Salman Beigi, Amin Gohari, Mohammad Reza Aref

    Abstract: In this paper, we study the problem of simulating a DMC channel from another DMC channel under an average-case and an exact model. We present several achievability and infeasibility results, with tight characterizations in special cases. In particular for the exact model, we fully characterize when a BSC channel can be simulated from a BEC channel when there is no shared randomness. We also provid…

    Submitted 1 December, 2016; v1 submitted 25 May, 2013; originally announced May 2013.

    Comments: 31 pages, 10 figures, and some parts of this work were published at ITW 2013

  13. arXiv:1301.6345  [pdf, other]

    cs.IT

    On AVCs with Quadratic Constraints

    Authors: Farzin Haddadpour, Mahdi Jafari Siavoshani, Mayank Bakshi, Sidharth Jaggi

    Abstract: In this work we study an Arbitrarily Varying Channel (AVC) with quadratic power constraints on the transmitter and a so-called "oblivious" jammer (along with additional AWGN) under a maximum probability of error criterion, and no private randomness between the transmitter and the receiver. This is in contrast to similar AVC models under the average probability of error criterion considered in [1],…

    Submitted 27 January, 2013; originally announced January 2013.

    Comments: A shorter version of this work will be sent to ISIT13, Istanbul. 8 pages, 3 figures

  14. arXiv:1203.0731  [pdf, ps, other]

    cs.IT

    Coordination via a relay

    Authors: Farzin Haddadpour, Mohammad Hossein Yassaee, Amin Gohari, Mohammad Reza Aref

    Abstract: In this paper, we study the problem of coordinating two nodes which can only exchange information via a relay at limited rates. The nodes are allowed to do a two-round interactive two-way communication with the relay, after which they should be able to generate i.i.d. copies of two random variables with a given joint distribution within a vanishing total variation distance. We prove inner and oute…

    Submitted 4 March, 2012; originally announced March 2012.

    Comments: Submitted to ISIT 2012