Showing 1–4 of 4 results for author: Masiha, S

  1. arXiv:2210.00483  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Learning Algorithm Generalization Error Bounds via Auxiliary Distributions

    Authors: Gholamali Aminian, Saeed Masiha, Laura Toni, Miguel R. D. Rodrigues

    Abstract: Generalization error bounds are essential for comprehending how well machine learning models work. In this work, we suggest a novel method, i.e., the Auxiliary Distribution Method, that leads to new upper bounds on expected generalization errors that are appropriate for supervised learning scenarios. We show that our general upper bounds can be specialized under some conditions to new bounds invol…

    Submitted 16 April, 2024; v1 submitted 2 October, 2022; originally announced October 2022.

    Comments: Accepted in IEEE Journal on Selected Areas in Information Theory

  2. arXiv:2206.11042  [pdf, other]

    cs.IT cs.LG

    f-divergences and their applications in lossy compression and bounding generalization error

    Authors: Saeed Masiha, Amin Gohari, Mohammad Hossein Yassaee

    Abstract: In this paper, we provide three applications for $f$-divergences: (i) we introduce Sanov's upper bound on the tail probability of the sum of independent random variables based on super-modular $f$-divergence and show that our generalized Sanov's bound strictly improves over the ordinary one, (ii) we consider the lossy compression problem which studies the set of achievable rates for a given distortion…

    Submitted 26 January, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

  3. arXiv:2205.12856  [pdf, other]

    cs.LG math.OC

    Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function

    Authors: Saeed Masiha, Saber Salehkaleybar, Niao He, Negar Kiyavash, Patrick Thiran

    Abstract: We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of functions satisfying the gradient dominance property with $1\le\alpha\le 2$, which holds in a wide range of applications in machine learning and signal processing. This condition ensures that any first-order stationary point is a global optimum. We prove that the total sample complexity of SCRN in achieving $\epsilon$-global optimu…

    Submitted 20 January, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

  4. arXiv:2102.05695  [pdf, other]

    cs.IT

    Learning under Distribution Mismatch and Model Misspecification

    Authors: Saeed Masiha, Amin Gohari, Mohammad Hossein Yassaee, Mohammad Reza Aref

    Abstract: We study learning algorithms when there is a mismatch between the distributions of the training and test datasets of a learning algorithm. The effect of this mismatch on the generalization error and model misspecification are quantified. Moreover, we provide a connection between the generalization error and the rate-distortion theory, which allows one to utilize bounds from the rate-distortion the…

    Submitted 10 August, 2022; v1 submitted 10 February, 2021; originally announced February 2021.

    Comments: 25 pages, 4 figures