Showing 1–3 of 3 results for author: Amani, M H

Searching in archive cs.
  1. arXiv:2402.10575  [pdf, other]

    cs.LG cs.AI

    Symbolic Autoencoding for Self-Supervised Sequence Learning

    Authors: Mohammad Hossein Amani, Nicolas Mario Baldwin, Amin Mansouri, Martin Josifoski, Maxime Peyrard, Robert West

    Abstract: Traditional language models, adept at next-token prediction in text sequences, often struggle with transduction tasks between distinct symbolic systems, particularly when parallel data is scarce. Addressing this issue, we introduce symbolic autoencoding (ΣAE), a self-supervised framework that harnesses the power of abundant unparallel data alongside limited parallel data. ΣAE connects…

    Submitted 16 February, 2024; originally announced February 2024.

  2. arXiv:2205.10217  [pdf, other]

    stat.ML cs.IT cs.LG

    Memorization and Optimization in Deep Neural Networks with Minimum Over-parameterization

    Authors: Simone Bombari, Mohammad Hossein Amani, Marco Mondelli

    Abstract: The Neural Tangent Kernel (NTK) has emerged as a powerful tool to provide memorization, optimization and generalization guarantees in deep neural networks. A line of work has studied the NTK spectrum for two-layer and deep networks with at least a layer with $Ω(N)$ neurons, $N$ being the number of training samples. Furthermore, there is increasing evidence suggesting that deep networks with sub-li…

    Submitted 21 May, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: Uniformed with the published NeurIPS 2022 version

  3. arXiv:2205.08199  [pdf, ps, other]

    cs.IT cs.LG stat.ML

    Sharp asymptotics on the compression of two-layer neural networks

    Authors: Mohammad Hossein Amani, Simone Bombari, Marco Mondelli, Rattana Pukdee, Stefano Rini

    Abstract: In this paper, we study the compression of a target two-layer neural network with N nodes into a compressed network with M<N nodes. More precisely, we consider the setting in which the weights of the target network are i.i.d. sub-Gaussian, and we minimize the population L_2 loss between the outputs of the target and of the compressed network, under the assumption of Gaussian inputs. By using tools…

    Submitted 16 August, 2022; v1 submitted 17 May, 2022; originally announced May 2022.