
Showing 1–10 of 10 results for author: Dar, G

  1. arXiv:2311.07772  [pdf, other]

    cs.CL cs.LG

    In-context Learning and Gradient Descent Revisited

    Authors: Gilad Deutch, Nadav Magar, Tomer Bar Natan, Guy Dar

    Abstract: In-context learning (ICL) has shown impressive results in few-shot learning tasks, yet its underlying mechanism is still not fully understood. A recent line of work suggests that ICL performs gradient descent (GD)-based optimization implicitly. While appealing, much of the research focuses on simplified settings, where the parameters of a shallow model are optimized. In this work, we revisit evide…

    Submitted 31 March, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Accepted to NAACL 2024 main conference

  2. arXiv:2209.02535  [pdf, other]

    cs.CL cs.LG

    Analyzing Transformers in Embedding Space

    Authors: Guy Dar, Mor Geva, Ankit Gupta, Jonathan Berant

    Abstract: Understanding Transformer-based models has attracted significant attention, as they lie at the heart of recent technological advances across machine learning. While most interpretability methods rely on running models over inputs, recent work has shown that a zero-pass approach, where parameters are interpreted directly without a forward/backward pass, is feasible for some Transformer parameters, a…

    Submitted 24 December, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

  3. arXiv:2204.12130  [pdf, other]

    cs.CL

    LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models

    Authors: Mor Geva, Avi Caciularu, Guy Dar, Paul Roit, Shoval Sadde, Micah Shlain, Bar Tamir, Yoav Goldberg

    Abstract: The opaque nature and unexplained behavior of transformer-based language models (LMs) have spurred a wide interest in interpreting their predictions. However, current interpretation methods mostly focus on probing models from outside, executing behavioral tests, and analyzing salience input features, while the internal prediction construction process is largely not understood. In this work, we int…

    Submitted 12 October, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: EMNLP 2022 System Demonstrations

  4. arXiv:2106.06899  [pdf, other]

    cs.CL cs.LG

    Memory-efficient Transformers via Top-$k$ Attention

    Authors: Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonathan Berant

    Abstract: Following the success of dot-product attention in Transformers, numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length. While these variants are memory and compute efficient, it is not possible to directly use them with popular pre-trained language models trained using vanilla attention, without an expensive corrective pre-training…

    Submitted 12 June, 2021; originally announced June 2021.

  5. arXiv:nlin/0204026  [pdf, ps, other]

    nlin.CD astro-ph cond-mat

    Comment on "On two-dimensional magnetohydrodynamic turbulence" [Phys. Plasmas, 8, 3282 (2001)]

    Authors: Mahendra K. Verma, Gaurav Dar, V. Eswaran

    Abstract: Biskamp and Schwarz [Phys. Plasmas, 8, 3282 (2001)] have reported that the energy spectrum of two-dimensional magnetohydrodynamic turbulence is proportional to $k^{-3/2}$, which is a prediction of Iroshnikov-Kraichnan phenomenology. In this comment we report some earlier results which conclusively show that for two-dimensional magnetohydrodynamic turbulence, Kolmogorov-like phenomenology (spectr…

    Submitted 14 April, 2002; originally announced April 2002.

    Comments: 2 pages, Revtex, 1 figure (Phys. Plasmas, v9, p1484, 2002)

  6. Energy transfer in two-dimensional magnetohydrodynamic turbulence: formalism and numerical results

    Authors: Gaurav Dar, Mahendra K. Verma, V. Eswaran

    Abstract: The basic entity of nonlinear interaction in Navier-Stokes and the Magnetohydrodynamic (MHD) equations is a wavenumber triad ({\bf k,p,q}) satisfying ${\bf k+p+q=0}$. The expression for the combined energy transfer from two of these wavenumbers to the third wavenumber is known. In this paper we introduce the idea of an effective energy transfer between a pair of modes by the mediation of the thi…

    Submitted 3 September, 2001; originally announced September 2001.

    Comments: 27 pages REVTEX; 14 ps figures

    Journal ref: Physica D, 157, (2001) 207

  7. arXiv:physics/0006012  [pdf, ps, other]

    physics.flu-dyn nlin.CD physics.plasm-ph

    A new approach to study energy transfer in turbulence

    Authors: Gaurav Dar, Mahendra K. Verma, V. Eswaran

    Abstract: The unit of nonlinear interaction in Navier-Stokes and the Magnetohydrodynamic (MHD) equations is a wavenumber triad ({\bf k,p,q}) satisfying ${\bf k+p+q=0}$. The expression for the combined energy transfer from two of these wavenumbers to the third wavenumber is known. In this paper we introduce the idea of an effective energy transfer between a pair of modes through the mediation of the third…

    Submitted 7 June, 2000; originally announced June 2000.

    Comments: 50 pages of LaTeX (includes 25 pages of appendix), 12 postscript figures, submitted to Physics of Plasmas

  8. Energy transfer in two-dimensional magnetohydrodynamic turbulence

    Authors: Gaurav Dar, Mahendra K. Verma, V. Eswaran

    Abstract: In an earlier paper (physics/0006012) we had developed a method for computing the effective energy transfer between any two Fourier modes in fluid or magnetohydrodynamic (MHD) flows. This method is applied to a pseudo-spectral, direct numerical simulation (DNS) study of energy transfer in the quasi-steady state of 2-D MHD turbulence with large scale kinetic forcing. Two aspects of energy transfe…

    Submitted 14 July, 2000; v1 submitted 28 October, 1998; originally announced October 1998.

    Comments: 31 pages (22 pages LaTeX + 9 postscript figures), submitted to Physics of Plasmas, also see physics/0006012

  9. Initial Condition Sensitivity of Global Quantities in Magnetohydrodynamic Turbulence

    Authors: Gaurav Dar, Mahendra K. Verma, V. Eswaran

    Abstract: In this paper we study the effect of subtle changes in initial conditions on the evolution of global quantities in two-dimensional Magnetohydrodynamic (MHD) turbulence. We find that a change in the initial phases of complex Fourier modes of the Elsässer variables, while keeping the initial values of total energy, cross helicity and Alfvén ratio unchanged, has a significant effect on the evolutio…

    Submitted 16 March, 1998; originally announced March 1998.

    Comments: 12 pages LaTeX, 11 ps figures. Accepted for publication by Physics of Plasmas

  10. arXiv:chao-dyn/9803022  [pdf, ps, other]

    nlin.CD astro-ph cond-mat

    Probing Physics of Magnetohydrodynamic Turbulence Using Direct Numerical Simulation

    Authors: Mahendra K. Verma, Gaurav Dar

    Abstract: The energy spectrum and the nonlinear cascade rates of MHD turbulence are not clearly understood. We have addressed this problem using direct numerical simulation and analytical calculations. Our numerical simulations indicate that Kolmogorov-like phenomenology with $k^{-5/3}$ energy spectrum, rather than Kraichnan's $k^{-3/2}$, appears to be applicable in MHD turbulence. Here, we also construct a…

    Submitted 21 March, 1998; v1 submitted 16 March, 1998; originally announced March 1998.

    Comments: 10 pages LaTeX, 1 postscript figure. To appear in the proceedings of "Nonlinear Dynamics and Computational Physics"