Showing 1–6 of 6 results for author: Pethick, T

  1. arXiv:2310.13459

    cs.LG math.OC

    Stable Nonconvex-Nonconcave Training via Linear Interpolation

    Authors: Thomas Pethick, Wanyun Xie, Volkan Cevher

    Abstract: This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training. We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators. We construct a new optimization scheme cal…

    Submitted 14 March, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

  2. arXiv:2306.05325

    cs.LG

    Federated Learning under Covariate Shifts with Generalization Guarantees

    Authors: Ali Ramezani-Kebrya, Fanghui Liu, Thomas Pethick, Grigorios Chrysos, Volkan Cevher

    Abstract: This paper addresses intra-client and inter-client covariate shifts in federated learning (FL) with a focus on the overall generalization performance. To handle covariate shifts, we formulate a new global model training paradigm and propose Federated Importance-Weighted Empirical Risk Minimization (FTW-ERM) along with improving density ratio matching methods without requiring perfect knowledge of…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Published in Transactions on Machine Learning Research (TMLR)

  3. arXiv:2302.09831

    math.OC cs.LG

    Escaping limit cycles: Global convergence for constrained nonconvex-nonconcave minimax problems

    Authors: Thomas Pethick, Puya Latafat, Panagiotis Patrinos, Olivier Fercoq, Volkan Cevher

    Abstract: This paper introduces a new extragradient-type algorithm for a class of nonconvex-nonconcave minimax problems. It is well-known that finding a local solution for general minimax problems is computationally intractable. This observation has recently motivated the study of structures sufficient for convergence of first order methods in the more general setting of variational inequalities when the so…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: Code accessible at: https://github.com/LIONS-EPFL/weak-minty-code/

  4. arXiv:2302.09029

    math.OC cs.LG

    Solving stochastic weak Minty variational inequalities without increasing batch size

    Authors: Thomas Pethick, Olivier Fercoq, Puya Latafat, Panagiotis Patrinos, Volkan Cevher

    Abstract: This paper introduces a family of stochastic extragradient-type algorithms for a class of nonconvex-nonconcave problems characterized by the weak Minty variational inequality (MVI). Unlike existing results on extragradient methods in the monotone setting, employing diminishing stepsizes is no longer possible in the weak MVI setting. This has led to approaches such as increasing batch sizes per ite…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: Code accessible at: https://github.com/LIONS-EPFL/stochastic-weak-minty-code

  5. arXiv:2302.08872

    cs.LG

    Revisiting adversarial training for the worst-performing class

    Authors: Thomas Pethick, Grigorios G. Chrysos, Volkan Cevher

    Abstract: Despite progress in adversarial training (AT), there is a substantial gap between the top-performing and worst-performing classes in many datasets. For example, on CIFAR10, the accuracies for the best and worst classes are 74% and 23%, respectively. We argue that this gap can be reduced by explicitly optimizing for the worst-performing class, resulting in a min-max-max optimization formulation. Ou…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: Code accessible at: https://github.com/LIONS-EPFL/class-focused-online-learning-code

  6. arXiv:2111.01875

    cs.LG stat.ML

    Subquadratic Overparameterization for Shallow Neural Networks

    Authors: Chaehwan Song, Ali Ramezani-Kebrya, Thomas Pethick, Armin Eftekhari, Volkan Cevher

    Abstract: Overparameterization refers to the important phenomenon where the width of a neural network is chosen such that learning algorithms can provably attain zero loss in nonconvex training. The existing theory establishes such global convergence using various initialization strategies, training modifications, and width scalings. In particular, the state-of-the-art results require the width to scale qua…

    Submitted 2 November, 2021; originally announced November 2021.

    Comments: To appear at the conference on Neural Information Processing Systems (NeurIPS 2021)