Showing 1–10 of 10 results for author: Sanjabi, M

Searching in archive math.
  1. arXiv:2204.13169  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    FedShuffle: Recipes for Better Use of Local Work in Federated Learning

    Authors: Samuel Horváth, Maziar Sanjabi, Lin Xiao, Peter Richtárik, Michael Rabbat

    Abstract: The practice of applying several local updates before aggregation across clients has been empirically shown to be a successful approach to overcoming the communication bottleneck in Federated Learning (FL). Such methods are usually implemented by having clients perform one or more epochs of local training per round while randomly reshuffling their finite dataset in each epoch. Data imbalance, wher…

    Submitted 27 September, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: Published in Transactions on Machine Learning Research (09/2022)

  2. arXiv:2204.03809  [pdf, other]

    cs.LG cs.DC math.OC

    Federated Learning with Partial Model Personalization

    Authors: Krishna Pillutla, Kshitiz Malik, Abdelrahman Mohamed, Michael Rabbat, Maziar Sanjabi, Lin Xiao

    Abstract: We consider two federated learning algorithms for training partially personalized models, where the shared and personal parameters are updated either simultaneously or alternately on the devices. Both algorithms have been proposed in the literature, but their convergence properties are not fully understood, especially for the alternating variant. We provide convergence analyses of both algorithms…

    Submitted 15 August, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Journal ref: ICML 2022: 17716-17758

  3. arXiv:2109.06141  [pdf, other]

    cs.LG cs.IT math.OC stat.ML

    On Tilted Losses in Machine Learning: Theory and Applications

    Authors: Tian Li, Ahmad Beirami, Maziar Sanjabi, Virginia Smith

    Abstract: Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this work, we aim to bridge this gap by exploring the use of tilting in risk minimization. We study a simple extension to ERM -…

    Submitted 1 June, 2023; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2007.01162

  4. arXiv:2009.03482  [pdf, ps, other]

    math.OC cs.LG

    Alternating Direction Method of Multipliers for Quantization

    Authors: Tianjian Huang, Prajwal Singhania, Maziar Sanjabi, Pabitra Mitra, Meisam Razaviyayn

    Abstract: Quantization of the parameters of machine learning models, such as deep neural networks, requires solving constrained optimization problems, where the constraint set is formed by the Cartesian product of many simple discrete sets. For such optimization problems, we study the performance of the Alternating Direction Method of Multipliers for Quantization ($\texttt{ADMM-Q}$) algorithm, which is a va…

    Submitted 1 March, 2021; v1 submitted 7 September, 2020; originally announced September 2020.

  5. arXiv:2006.08141  [pdf, other]

    math.OC cs.LG stat.ML

    Non-convex Min-Max Optimization: Applications, Challenges, and Recent Theoretical Advances

    Authors: Meisam Razaviyayn, Tianjian Huang, Songtao Lu, Maher Nouiehed, Maziar Sanjabi, Mingyi Hong

    Abstract: The min-max optimization problem, also known as the saddle point problem, is a classical optimization problem which is also studied in the context of zero-sum games. Given a class of objective functions, the goal is to find a value for the argument which leads to a small objective value even for the worst case function in the given class. Min-max optimization problems have recently become very pop…

    Submitted 18 August, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Journal ref: IEEE Signal Processing Magazine (Volume: 37, Issue: 5, Sept. 2020)

  6. arXiv:1902.08297  [pdf, other]

    math.OC cs.LG stat.ML

    Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods

    Authors: Maher Nouiehed, Maziar Sanjabi, Tianjian Huang, Jason D. Lee, Meisam Razaviyayn

    Abstract: Recent applications that arise in machine learning have surged significant interest in solving min-max saddle point games. This problem has been extensively studied in the convex-concave regime for which a global equilibrium solution can be computed efficiently. In this paper, we study the problem in the non-convex regime and show that an $\varepsilon$-first order stationary point of the game can b…

    Submitted 30 October, 2019; v1 submitted 21 February, 2019; originally announced February 2019.

  7. arXiv:1812.02878  [pdf, ps, other]

    math.OC cs.GT cs.LG

    Solving Non-Convex Non-Concave Min-Max Games Under Polyak-Łojasiewicz Condition

    Authors: Maziar Sanjabi, Meisam Razaviyayn, Jason D. Lee

    Abstract: In this short note, we consider the problem of solving a min-max zero-sum game. This problem has been extensively studied in the convex-concave regime where the global solution can be computed efficiently. Recently, there have also been developments for finding the first order stationary points of the game when one of the player's objective is concave or (weakly) concave. This work focuses on the…

    Submitted 6 December, 2018; originally announced December 2018.

  8. arXiv:1802.08249  [pdf, other]

    cs.LG math.OC stat.ML

    On the Convergence and Robustness of Training GANs with Regularized Optimal Transport

    Authors: Maziar Sanjabi, Jimmy Ba, Meisam Razaviyayn, Jason D. Lee

    Abstract: Generative Adversarial Networks (GANs) are one of the most practical methods for learning data distributions. A popular GAN formulation is based on the use of Wasserstein distance as a metric between probability distributions. Unfortunately, minimizing the Wasserstein distance between the data distribution and the generative model distribution is a computationally challenging problem as its object…

    Submitted 22 May, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

  9. arXiv:1404.5350  [pdf, ps, other]

    math.OC

    On the Linear Convergence of the Approximate Proximal Splitting Method for Non-Smooth Convex Optimization

    Authors: Mojtaba Kadkhodaie, Maziar Sanjabi, Zhi-Quan Luo

    Abstract: Consider the problem of minimizing the sum of two convex functions, one being smooth and the other non-smooth. In this paper, we introduce a general class of approximate proximal splitting (APS) methods for solving such minimization problems. Methods in the APS class include many well-known algorithms such as the proximal splitting method (PSM), the block coordinate descent method (BCD) and the ap…

    Submitted 21 April, 2014; originally announced April 2014.

    Comments: 21 pages, no figures

  10. arXiv:1307.4457  [pdf, ps, other]

    math.OC math.NA

    A Stochastic Successive Minimization Method for Nonsmooth Nonconvex Optimization with Applications to Transceiver Design in Wireless Communication Networks

    Authors: Meisam Razaviyayn, Maziar Sanjabi, Zhi-Quan Luo

    Abstract: Consider the problem of minimizing the expected value of a cost function parameterized by a random variable. The classical sample average approximation (SAA) method for solving this problem requires minimization of an ensemble average of the objective at each step, which can be expensive. In this paper, we propose a stochastic successive upper-bound minimization method (SSUM) which minimizes an ap…

    Submitted 22 July, 2013; v1 submitted 16 July, 2013; originally announced July 2013.