Showing 1–25 of 25 results for author: Bahri, Y

  1. arXiv:2403.03154  [pdf, other]

    physics.comp-ph cond-mat.other cs.AI

    Quantum Many-Body Physics Calculations with Large Language Models

    Authors: Haining Pan, Nayantara Mudur, Will Taranto, Maria Tikhanovskaya, Subhashini Venugopalan, Yasaman Bahri, Michael P. Brenner, Eun-Ah Kim

    Abstract: Large language models (LLMs) have demonstrated an unprecedented ability to perform complex tasks in multiple domains, including mathematical and scientific reasoning. We demonstrate that with carefully designed prompts, LLMs can accurately carry out key calculations in research papers in theoretical physics. We focus on a broadly used approximation method in quantum physics: the Hartree-Fock metho…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 9 pages, 4 figures. Supplemental material in the source file

  2. arXiv:2309.01592  [pdf, other]

    stat.ML cs.AI cs.LG hep-th math.PR

    Les Houches Lectures on Deep Learning at Large & Infinite Width

    Authors: Yasaman Bahri, Boris Hanin, Antonin Brossollet, Vittorio Erba, Christian Keup, Rosalba Pacelli, James B. Simon

    Abstract: These lectures, presented at the 2022 Les Houches Summer School on Statistical Physics and Machine Learning, focus on the infinite-width limit and large-width regime of deep neural networks. Topics covered include various statistical and dynamical properties of these networks. In particular, the lecturers discuss properties of random deep neural networks; connections between trained deep neural ne…

    Submitted 12 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: These are notes from lectures delivered by Yasaman Bahri and Boris Hanin at the 2022 Les Houches Summer School on Statistical Physics and Machine Learning; a first version of them was transcribed by Antonin Brossollet, Vittorio Erba, Christian Keup, Rosalba Pacelli, James B. Simon

  3. arXiv:2208.07896  [pdf, ps, other]

    math.AP

    Normal form for transverse instability of gZK equation for the line soliton with nearly critical speed

    Authors: Yakine Bahri, Hichem Hajaiej

    Abstract: In this paper, we study the transverse instability of generalized Zakharov-Kuznetsov equation for the line soliton with critical speed. We derive and justify a normal form reduction for a bifurcation problem of the stationary nonlinear KdV equation on the product space $\mathbb{R} \times \mathbb{T}$.

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: arXiv admin note: text overlap with arXiv:1706.00064 by other authors

  4. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  5. arXiv:2201.06764  [pdf, ps, other]

    math.AP

    Infinitely many positive solutions of a Gross-Pitaevskii equation in the presence of a harmonic potential and combined nonlinearities

    Authors: Yakine Bahri, Hichem Hajaiej

    Abstract: The main goal of this paper is to address an important conjecture in the field of differential equations in the presence of a harmonic potential. While in the subcritical case, the uniqueness of positive solution has been addressed by Hirose and Ohta in 2007, the problem has remained open for years in the supercritical case. In Hadj Selem et al., the authors obtained interesting numerical computat…

    Submitted 6 March, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

  6. arXiv:2106.15831  [pdf, other]

    cs.LG cs.AI cs.CV

    The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning

    Authors: Anders Andreassen, Yasaman Bahri, Behnam Neyshabur, Rebecca Roelofs

    Abstract: Although machine learning models typically experience a drop in performance on out-of-distribution data, accuracies on in- versus out-of-distribution data are widely observed to follow a single linear trend when evaluated across a testbed of models. Models that are more accurate on the out-of-distribution data relative to this baseline exhibit "effective robustness" and are exceedingly rare. Ident…

    Submitted 30 June, 2021; originally announced June 2021.

    Comments: 27 pages, 25 figures

  7. arXiv:2103.05887  [pdf, ps, other]

    math.AP math-ph

    Pitchfork bifurcation at line solitons for nonlinear Schrödinger equations on the product space $\mathbb{R} \times \mathbb{T}$

    Authors: Takafumi Akahori, Yakine Bahri, Slim Ibrahim, Hiroaki Kikuchi

    Abstract: In this paper, we study the bifurcation problem from a line soliton for a stationary nonlinear Schrödinger equation on the product space $\mathbb{R} \times \mathbb{T}$. We extend earlier results to a larger class of the nonlinearity in the equation. The salient point of our analysis relies on a lower bound of solution to the ``auxiliary equation'' and then on the application of the Crandall-Rabino…

    Submitted 18 January, 2022; v1 submitted 10 March, 2021; originally announced March 2021.

  8. arXiv:2102.06701  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    Explaining Neural Scaling Laws

    Authors: Yasaman Bahri, Ethan Dyer, Jared Kaplan, Jaehoon Lee, Utkarsh Sharma

    Abstract: The population loss of trained deep neural networks often follows precise power-law scaling relations with either the size of the training dataset or the number of parameters in the network. We propose a theory that explains the origins of and connects these scaling laws. We identify variance-limited and resolution-limited scaling behavior for both dataset and model size, for a total of four scali…

    Submitted 28 April, 2024; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: 11 pages, 3 figures + Supplement (expanded). This version to appear in PNAS

    Journal ref: PNAS 121 (27) e2311878121 (2024)

  9. arXiv:2101.01314  [pdf, ps, other]

    math.AP

    Transverse stability of line soliton and characterization of ground state for wave guide Schrödinger equations

    Authors: Yakine Bahri, Slim Ibrahim, Hiroaki Kikuchi

    Abstract: In this paper, we study the transverse stability of the line Schrödinger soliton under a full wave guide Schrödinger flow on a cylindrical domain $\mathbb R\times\mathbb T$. When the nonlinearity is of power type $|ψ|^{p-1}ψ$ with $p>1$, we show that there exists a critical frequency $ω_{p} >0$ such that the line standing wave is stable for $0<ω< ω_{p}$ and unstable for $ω> ω_{p}$. Furthermore, we…

    Submitted 8 January, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

    Comments: To appear in JDDE

  10. arXiv:2006.10541  [pdf, other]

    stat.ML cs.LG

    Exact posterior distributions of wide Bayesian neural networks

    Authors: Jiri Hron, Yasaman Bahri, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein

    Abstract: Recent work has shown that the prior over functions induced by a deep Bayesian neural network (BNN) behaves as a Gaussian process (GP) as the width of all layers becomes large. However, many BNN applications are concerned with the BNN function space posterior. While some empirical evidence of the posterior convergence was provided in the original works of Neal (1996) and Matthews et al. (2018), it…

    Submitted 26 November, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

  11. arXiv:2006.10540  [pdf, other]

    stat.ML cs.LG

    Infinite attention: NNGP and NTK for deep attention networks

    Authors: Jiri Hron, Yasaman Bahri, Jascha Sohl-Dickstein, Roman Novak

    Abstract: There is a growing amount of literature on the relationship between wide neural networks (NNs) and Gaussian processes (GPs), identifying an equivalence between the two for a variety of NN architectures. This equivalence enables, for instance, accurate approximation of the behaviour of wide Bayesian NNs without MCMC or variational approximations, or characterisation of the distribution of randomly…

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: ICML 2020

  12. arXiv:2003.02218  [pdf, other]

    stat.ML cs.LG

    The large learning rate phase of deep learning: the catapult mechanism

    Authors: Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Sohl-Dickstein, Guy Gur-Ari

    Abstract: The choice of initial learning rate can have a profound effect on the performance of deep networks. We present a class of neural networks with solvable training dynamics, and confirm their predictions empirically in practical deep learning settings. The networks exhibit sharply distinct behaviors at small and large learning rates. The two regimes are separated by a phase transition. In the small l…

    Submitted 4 March, 2020; originally announced March 2020.

    Comments: 25 pages, 19 figures

  13. arXiv:1911.11457  [pdf, ps, other]

    math.AP

    Self-similar blow-up profiles for slightly supercritical nonlinear Schrödinger equations

    Authors: Yakine Bahri, Yvan Martel, Pierre Raphaël

    Abstract: We construct radially symmetric self-similar blow-up profiles for the mass supercritical nonlinear Schrödinger equation $i\partial_t u + Δu + |u|^{p-1}u=0$ on $\mathbf{R}^d$, close to the mass critical case and for any space dimension $d\ge 1$. These profiles bifurcate from the ground state solitary wave. The argument relies on the classical matched asymptotics method suggested in [Sulem, C.; Sule…

    Submitted 26 November, 2019; originally announced November 2019.

  14. Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent

    Authors: Jaehoon Lee, Lechao Xiao, Samuel S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Sohl-Dickstein, Jeffrey Pennington

    Abstract: A longstanding goal in deep learning research has been to precisely characterize training and generalization. However, the often complex loss landscapes of neural networks have made a theory of learning dynamics elusive. In this work, we show that for wide neural networks the learning dynamics simplify considerably and that, in the infinite width limit, they are governed by a linear model obtained…

    Submitted 8 December, 2019; v1 submitted 18 February, 2019; originally announced February 2019.

    Comments: 12+16 pages; open-source code available at https://github.com/google/neural-tangents; accepted to NeurIPS 2019

  15. arXiv:1810.05148  [pdf, other]

    stat.ML cs.AI cs.LG cs.NE

    Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes

    Authors: Roman Novak, Lechao Xiao, Jaehoon Lee, Yasaman Bahri, Greg Yang, Jiri Hron, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein

    Abstract: There is a previously identified equivalence between wide fully connected neural networks (FCNs) and Gaussian processes (GPs). This equivalence enables, for instance, test set predictions that would have resulted from a fully Bayesian, infinitely wide trained FCN to be computed without ever instantiating the FCN, but by instead evaluating the corresponding GP. In this work, we derive an analogous…

    Submitted 21 August, 2020; v1 submitted 11 October, 2018; originally announced October 2018.

    Comments: Published as a conference paper at ICLR 2019

  16. arXiv:1810.01385  [pdf, ps, other]

    math.AP

    Remarks on solitary waves and Cauchy problem for a Half-wave-Schrödinger equations

    Authors: Yakine Bahri, Slim Ibrahim, Hiroaki Kikuchi

    Abstract: In this paper, we study the solitary wave and the Cauchy problem for Half-wave-Schrödinger equations in the plane. First, we show the existence and orbital stability of the ground states. Secondly, we prove that traveling waves exist and converge to zero as the velocity tends to $1$. Finally, we solve the Cauchy problem for initial data in $L^{2}_{x}H^{s}_{y}(\mathbb{R}^{2})$, with…

    Submitted 2 October, 2018; originally announced October 2018.

  17. arXiv:1806.05393  [pdf, other]

    stat.ML cs.LG

    Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks

    Authors: Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington

    Abstract: In recent years, state-of-the-art methods in computer vision have utilized increasingly deep convolutional neural network architectures (CNNs), with some of the most successful models employing hundreds or even thousands of layers. A variety of pathologies such as vanishing/exploding gradients make training such deep networks challenging. While residual connections and batch normalization do enabl…

    Submitted 10 July, 2018; v1 submitted 14 June, 2018; originally announced June 2018.

    Comments: ICML 2018 Conference Proceedings

  18. arXiv:1802.08760  [pdf, other]

    stat.ML cs.AI cs.LG cs.NE

    Sensitivity and Generalization in Neural Networks: an Empirical Study

    Authors: Roman Novak, Yasaman Bahri, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein

    Abstract: In practice it is often found that large over-parameterized neural networks generalize better than their smaller counterparts, an observation that appears to conflict with classical notions of function complexity, which typically favor smaller models. In this work, we investigate this tension between complexity and generalization through an extensive empirical exploration of two natural metrics of…

    Submitted 18 June, 2018; v1 submitted 23 February, 2018; originally announced February 2018.

    Comments: Published as a conference paper at ICLR 2018

  19. arXiv:1711.00165  [pdf, other]

    stat.ML cs.LG

    Deep Neural Networks as Gaussian Processes

    Authors: Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein

    Abstract: It has long been known that a single-layer fully-connected neural network with an i.i.d. prior over its parameters is equivalent to a Gaussian process (GP), in the limit of infinite network width. This correspondence enables exact Bayesian inference for infinite width neural networks on regression tasks by means of evaluating the corresponding GP. Recently, kernel functions which mimic multi-layer…

    Submitted 2 March, 2018; v1 submitted 31 October, 2017; originally announced November 2017.

    Comments: Published version in ICLR 2018. 10 pages + appendix

  20. arXiv:1604.03715  [pdf, ps, other]

    math-ph

    On the asymptotic stability in the energy space for multi-solitons of the Landau-Lifshitz equation

    Authors: Yakine Bahri

    Abstract: We establish the asymptotic stability of multi-solitons for the one-dimensional Landau-Lifshitz equation with an easy-plane anisotropy. The solitons have non-zero speed, are ordered according to their speeds and have sufficiently separated initial positions. We provide the asymptotic stability around solitons and between solitons. More precisely, we show that for an initial datum close to a sum of…

    Submitted 13 April, 2016; originally announced April 2016.

  21. Asymptotic stability in the energy space for dark solitons of the Landau-Lifshitz equation

    Authors: Yakine Bahri

    Abstract: We prove the asymptotic stability in the energy space of non-zero speed solitons for the one-dimensional Landau-Lifshitz equation with an easy-plane anisotropy. More precisely, we show that any solution corresponding to an initial datum close to a soliton with non-zero speed, is weakly convergent in the energy space as time goes to infinity, to a soliton with a possible different non-zero speed, u…

    Submitted 1 December, 2015; originally announced December 2015.

    Comments: arXiv admin note: substantial text overlap with arXiv:1212.5027 by other authors

    Journal ref: Anal. PDE 9 (2016) 645-697

  22. arXiv:1410.1320  [pdf, other]

    cond-mat.mes-hall cond-mat.soft

    Phonon analogue of topological nodal semimetals

    Authors: Hoi Chun Po, Yasaman Bahri, Ashvin Vishwanath

    Abstract: Topological band structures in electronic systems like topological insulators and semimetals give rise to highly unusual physical properties. Analogous topological effects have also been discussed in bosonic systems, but the novel phenomena typically occur only when the system is excited by finite-frequency probes. A mapping recently proposed by Kane and Lubensky [Nat. Phys. 10, 39 (2014)], howeve…

    Submitted 3 March, 2017; v1 submitted 6 October, 2014; originally announced October 2014.

    Comments: 5 pages, 3 figures and supplementary materials; v2: 6+1 pages, 5+1 figures. Close to published version

    Journal ref: Phys. Rev. B 93, 205158 (2016)

  23. arXiv:1408.6826  [pdf, other]

    cond-mat.str-el cond-mat.mes-hall

    Stable non-Fermi liquid phase of itinerant spin-orbit coupled ferromagnets

    Authors: Yasaman Bahri, Andrew C. Potter

    Abstract: Direct coupling between gapless bosons and a Fermi surface results in the destruction of Landau quasiparticles and a breakdown of Fermi liquid theory. Such a non-Fermi liquid phase arises in spin-orbit coupled ferromagnets with spontaneously broken continuous symmetries due to strong coupling between rotational Goldstone modes and itinerant electrons. These systems provide an experimentally access…

    Submitted 8 September, 2014; v1 submitted 28 August, 2014; originally announced August 2014.

    Comments: 14 pages; typos fixed and transport/disorder sections revised

    Journal ref: Phys. Rev. B 92, 035131 (2015)

  24. arXiv:1307.4092  [pdf, other]

    cond-mat.dis-nn cond-mat.quant-gas cond-mat.str-el quant-ph

    Localization and topology protected quantum coherence at the edge of 'hot' matter

    Authors: Yasaman Bahri, Ronen Vosk, Ehud Altman, Ashvin Vishwanath

    Abstract: Topological phases are often characterized by special edge states confined near the boundaries by an energy gap in the bulk. On raising temperature, these edge states are lost in a clean system due to mobile thermal excitations. Recently however, it has been established that disorder can localize an isolated many body system, potentially allowing for a sharply defined topological phase even in a h…

    Submitted 4 October, 2013; v1 submitted 15 July, 2013; originally announced July 2013.

    Comments: Typos corrected and appendix E added

  25. Detecting Majorana fermions in quasi-one-dimensional topological phases using nonlocal order parameters

    Authors: Yasaman Bahri, Ashvin Vishwanath

    Abstract: Topological phases which host Majorana fermions can not be identified via local order parameters. We give simple nonlocal order parameters to distinguish quasi-one-dimensional (1D) topological superconductors of spinless fermions, for any interacting model in the absence of time reversal symmetry. These string or "brane" order parameters are natural for measurements in cold atom systems using quan…

    Submitted 25 July, 2014; v1 submitted 11 March, 2013; originally announced March 2013.

    Comments: Final version; 14 pages, 4 figures

    Journal ref: Phys. Rev. B 89, 155135 (2014)