Showing 1–12 of 12 results for author: Lotfi, S

  1. arXiv:2312.17173  [pdf, other]

    stat.ML cs.LG

    Non-Vacuous Generalization Bounds for Large Language Models

    Authors: Sanae Lotfi, Marc Finzi, Yilun Kuang, Tim G. J. Rudner, Micah Goldblum, Andrew Gordon Wilson

    Abstract: Modern language models can contain billions of parameters, raising the question of whether they can generalize beyond the training data or simply regurgitate their training corpora. We provide the first non-vacuous generalization bounds for pretrained large language models (LLMs), indicating that language models are capable of discovering regularities that generalize to unseen data. In particular,…

    Submitted 12 February, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  2. arXiv:2211.13609  [pdf, other]

    cs.LG stat.ML

    PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization

    Authors: Sanae Lotfi, Marc Finzi, Sanyam Kapoor, Andres Potapczynski, Micah Goldblum, Andrew Gordon Wilson

    Abstract: While there has been progress in developing non-vacuous generalization bounds for deep neural networks, these bounds tend to be uninformative about why deep learning works. In this paper, we develop a compression approach based on quantizing neural network parameters in a linear subspace, profoundly improving on previous results to provide state-of-the-art generalization bounds on a variety of tas…

    Submitted 24 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022. Code is available at https://github.com/activatedgeek/tight-pac-bayes

  3. arXiv:2209.03147  [pdf, other]

    cs.CR

    Network Intrusion Detection with Limited Labeled Data Using Self-supervision

    Authors: S. Lotfi, M. Modirrousta, S. Shashaani, M. Aliyari Shoorehdeli

    Abstract: With the increasing dependency of daily life over computer networks, the importance of these networks security becomes prominent. Different intrusion attacks to networks have been designed and the attackers are working on improving them. Thus the ability to detect intrusion with limited number of labeled data is desirable to provide networks with higher level of security. In this paper we design a…

    Submitted 31 March, 2023; v1 submitted 1 September, 2022; originally announced September 2022.

  4. arXiv:2204.12314  [pdf, other]

    quant-ph gr-qc hep-ph

    Gravitational Effects on Quantum Coherence in Neutrino Oscillation

    Authors: M. M. Ettefaghi, R. Ramezani Arani, Z. S. Tabatabaei Lotfi

    Abstract: In this paper, we investigate the quantum coherence for two flavor neutrinos propagating in a Schwarzschild metric. In fact, this issue is explored both qualitatively via calculating the parameter $K_{3}$ in Leggett-Garg inequality (LGI) and also quantitatively by evaluating the $l_{1}$-norm, ${\cal C}(ρ)$. Using the weak field approximations, we show that the gravitational effects decrease the ma…

    Submitted 4 May, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: 21 pages, 2 figures, to appear in Phys. Rev. D

  5. arXiv:2202.11678  [pdf, other]

    cs.LG stat.ML

    Bayesian Model Selection, the Marginal Likelihood, and Generalization

    Authors: Sanae Lotfi, Pavel Izmailov, Gregory Benton, Micah Goldblum, Andrew Gordon Wilson

    Abstract: How do we compare between hypotheses that are entirely consistent with observations? The marginal likelihood (aka Bayesian evidence), which represents the probability of generating our observations from a prior, provides a distinctive approach to this foundational question, automatically encoding Occam's razor. Although it has been observed that the marginal likelihood can overfit and is sensitive…

    Submitted 1 May, 2023; v1 submitted 23 February, 2022; originally announced February 2022.

    Comments: Extended version. Shorter ICML version available at arXiv:2202.11678v2

  6. arXiv:2111.14761  [pdf, other]

    cs.LG math.NA math.OC

    Adaptive First- and Second-Order Algorithms for Large-Scale Machine Learning

    Authors: Sanae Lotfi, Tiphaine Bonniot de Ruisselet, Dominique Orban, Andrea Lodi

    Abstract: In this paper, we consider both first- and second-order techniques to address continuous optimization problems arising in machine learning. In the first-order case, we propose a framework of transition from deterministic or semi-deterministic to stochastic quadratic regularization methods. We leverage the two-phase nature of stochastic optimization to propose a novel first-order algorithm with ada…

    Submitted 29 November, 2021; originally announced November 2021.

    Comments: 29 pages, 8 figures. arXiv admin note: text overlap with arXiv:2012.05783

    MSC Class: 68T07; 90C15; 90C30; 90C53

    ACM Class: G.1.6; G.3; G.4; I.2.6

  7. arXiv:2109.10430  [pdf, other]

    cs.NE cs.DC

    GAP2WSS: A Genetic Algorithm based on the Pareto Principle for Web Service Selection

    Authors: SayedHassan Khatoonabadi, Shahriar Lotfi, Ayaz Isazadeh

    Abstract: Despite all the progress in Web service selection, the need for an approach with a better optimality and performance still remains. This paper presents a genetic algorithm by adopting the Pareto principle that is called GAP2WSS for selecting a Web service for each task of a composite Web service from a pool of candidate Web services. In contrast to the existing approaches, all global QoS constrain…

    Submitted 21 September, 2021; originally announced September 2021.

  8. arXiv:2106.11905  [pdf, other]

    cs.LG stat.ML

    Dangers of Bayesian Model Averaging under Covariate Shift

    Authors: Pavel Izmailov, Patrick Nicholson, Sanae Lotfi, Andrew Gordon Wilson

    Abstract: Approximate Bayesian inference for neural networks is considered a robust alternative to standard training, often providing good performance on out-of-distribution data. However, Bayesian neural networks (BNNs) with high-fidelity approximate inference via full-batch Hamiltonian Monte Carlo achieve poor generalization under covariate shift, even underperforming classical estimation. We explain this…

    Submitted 6 December, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021. Code is available at https://github.com/izmailovpavel/bnn_covariate_shift

  9. arXiv:2102.13042  [pdf, other]

    cs.LG cs.CV stat.ML

    Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling

    Authors: Gregory W. Benton, Wesley J. Maddox, Sanae Lotfi, Andrew Gordon Wilson

    Abstract: With a better understanding of the loss surfaces for multilayer networks, we can build more robust and accurate training procedures. Recently it was discovered that independently trained SGD solutions can be connected along one-dimensional paths of near-constant training loss. In this paper, we show that there are mode-connecting simplicial complexes that form multi-dimensional manifolds of low lo…

    Submitted 15 November, 2021; v1 submitted 25 February, 2021; originally announced February 2021.

    Comments: ICML 2021

  10. arXiv:2012.05783  [pdf, other]

    cs.LG math.NA math.OC

    Stochastic Damped L-BFGS with Controlled Norm of the Hessian Approximation

    Authors: Sanae Lotfi, Tiphaine Bonniot de Ruisselet, Dominique Orban, Andrea Lodi

    Abstract: We propose a new stochastic variance-reduced damped L-BFGS algorithm, where we leverage estimates of bounds on the largest and smallest eigenvalues of the Hessian approximation to balance its quality and conditioning. Our algorithm, VARCHEN, draws from previous work that proposed a novel stochastic damped L-BFGS algorithm called SdLBFGS. We establish almost sure convergence to a stationary point a…

    Submitted 10 December, 2020; originally announced December 2020.

    Comments: 14 pages, 4 figures

    Report number: Cahier du GERAD G-2020-52

    MSC Class: 68T07; 90C15; 90C30; 90C53

    ACM Class: G.1.6; G.3; G.4; I.2.6

  11. Quantum Correlations in Neutrino Oscillation: Coherence and Entanglement

    Authors: M. M. Ettefaghi, Z. S. Tabatabaei Lotfi, R. Ramezani Arani

    Abstract: In this paper, we consider the quantum correlations, coherence and entanglement, in neutrino oscillation. We find that the $l_{1}$-norm as a coherence measure is equal to sum of the three possible concurrences for measuring the entanglement among different flavor modes which were calculated in the paper by (M. Blasone et al., Europhys. Lett., 112, 20007). Our result shows that the origin of…

    Submitted 25 November, 2020; originally announced November 2020.

    Comments: 12 pages; 1 figure

  12. Graph Colouring Problem Based on Discrete Imperialist Competitive Algorithm

    Authors: Hojjat Emami, Shahriar Lotfi

    Abstract: In graph theory, Graph Colouring Problem (GCP) is an assignment of colours to vertices of any given graph such that the colours on adjacent vertices are different. The GCP is known to be an optimization and NP-hard problem. Imperialist Competitive Algorithm (ICA) is a meta-heuristic optimization and stochastic search strategy which is inspired from socio-political phenomenon of imperialistic compe…

    Submitted 17 August, 2013; originally announced August 2013.

    Comments: 12 pages

    Journal ref: International Journal in Foundations of Computer Science & Technology (IJFCST), Vol. 3, No.4, July 2013, pp. 1-12