
Showing 1–21 of 21 results for author: Yahav, E

  1. arXiv:2305.02582  [pdf, other]

    cs.LG

    On the Expressivity Role of LayerNorm in Transformers' Attention

    Authors: Shaked Brody, Uri Alon, Eran Yahav

    Abstract: Layer Normalization (LayerNorm) is an inherent component in all Transformer-based models. In this paper, we show that LayerNorm is crucial to the expressivity of the multi-head attention layer that follows it. This is in contrast to the common belief that LayerNorm's only role is to normalize the activations during the forward pass, and their gradients during the backward pass. We consider a geome…

    Submitted 11 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted as a short paper in Findings of ACL 2023

  2. arXiv:2303.00613  [pdf, ps, other]

    cs.LG

    Diffusing Graph Attention

    Authors: Daniel Glickman, Eran Yahav

    Abstract: The dominant paradigm for machine learning on graphs uses Message Passing Graph Neural Networks (MP-GNNs), in which node representations are updated by aggregating information in their local neighborhood. Recently, there have been increasingly more attempts to adapt the Transformer architecture to graphs in an effort to solve some known limitations of MP-GNN. A challenging aspect of designing Grap…

    Submitted 1 March, 2023; originally announced March 2023.

  3. arXiv:2106.06981  [pdf, other]

    cs.LG cs.CL

    Thinking Like Transformers

    Authors: Gail Weiss, Yoav Goldberg, Eran Yahav

    Abstract: What is the computational model behind a Transformer? Where recurrent neural networks have direct parallels in finite state machines, allowing clear discussion and thought around architecture variants or trained models, Transformers have no such familiar parallel. In this paper we aim to change that, proposing a computational model for the transformer-encoder in the form of a programming language.…

    Submitted 19 July, 2021; v1 submitted 13 June, 2021; originally announced June 2021.

    Comments: ICML 2021

  4. arXiv:2105.14491  [pdf, other]

    cs.LG

    How Attentive are Graph Attention Networks?

    Authors: Shaked Brody, Uri Alon, Eran Yahav

    Abstract: Graph Attention Networks (GATs) are one of the most popular GNN architectures and are considered as the state-of-the-art architecture for representation learning with graphs. In GAT, every node attends to its neighbors given its own representation as the query. However, in this paper we show that GAT computes a very limited kind of attention: the ranking of the attention scores is unconditioned on…

    Submitted 31 January, 2022; v1 submitted 30 May, 2021; originally announced May 2021.

    Comments: Published in ICLR 2022

  5. arXiv:2006.05205  [pdf, other]

    cs.LG stat.ML

    On the Bottleneck of Graph Neural Networks and its Practical Implications

    Authors: Uri Alon, Eran Yahav

    Abstract: Since the proposal of the graph neural network (GNN) by Gori et al. (2005) and Scarselli et al. (2008), one of the major problems in training GNNs was their struggle to propagate information between distant nodes in the graph. We propose a new explanation for this problem: GNNs are susceptible to a bottleneck when aggregating messages across a long path. This bottleneck causes the over-squashing o…

    Submitted 9 March, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: Accepted to ICLR'2021

  6. arXiv:2005.13209  [pdf, other]

    cs.PL cs.LG

    A Structural Model for Contextual Code Changes

    Authors: Shaked Brody, Uri Alon, Eran Yahav

    Abstract: We address the problem of predicting edit completions based on a learned model that was trained on past edits. Given a code snippet that is partially edited, our goal is to predict a completion of the edit for the rest of the snippet. We refer to this task as the EditCompletion task and present a novel approach for tackling it. The main idea is to directly represent structural edits. This allows u…

    Submitted 12 October, 2020; v1 submitted 27 May, 2020; originally announced May 2020.

    Comments: Accepted to OOPSLA 2020

  7. arXiv:2004.08500  [pdf, other]

    cs.CL cs.FL

    A Formal Hierarchy of RNN Architectures

    Authors: William Merrill, Gail Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, Eran Yahav

    Abstract: We develop a formal hierarchy of the expressive capacity of RNN architectures. The hierarchy is based on two formal properties: space complexity, which measures the RNN's memory, and rational recurrence, defined as whether the recurrent update can be described by a weighted finite-state machine. We place several RNN variants within this hierarchy. For example, we prove the LSTM is not rational, wh…

    Submitted 19 September, 2020; v1 submitted 17 April, 2020; originally announced April 2020.

    Comments: To appear at ACL 2020. Updated to include computational cost estimates and updated experimental results (in an erratum appendix)

  8. arXiv:1910.13895  [pdf, other]

    cs.LG cs.FL stat.ML

    Learning Deterministic Weighted Automata with Queries and Counterexamples

    Authors: Gail Weiss, Yoav Goldberg, Eran Yahav

    Abstract: We present an algorithm for extraction of a probabilistic deterministic finite automaton (PDFA) from a given black-box language model, such as a recurrent neural network (RNN). The algorithm is a variant of the exact-learning algorithm L*, adapted to a probabilistic setting with noise. The key insight is the use of conditional probabilities for observations, and the introduction of a local toleran…

    Submitted 29 December, 2019; v1 submitted 30 October, 2019; originally announced October 2019.

    Comments: Presented in NeurIPS 2019. Update: fix email address, add reference to github repo (available at https://github.com/tech-srl/weighted_lstar )

  9. arXiv:1910.07517  [pdf, other]

    cs.LG cs.PL

    Adversarial Examples for Models of Code

    Authors: Noam Yefet, Uri Alon, Eran Yahav

    Abstract: Neural models of code have shown impressive results when performing tasks such as predicting method names and identifying certain kinds of bugs. We show that these models are vulnerable to adversarial examples, and introduce a novel approach for attacking trained models of code using adversarial examples. The main idea of our approach is to force a given trained model to make an incorrect predicti…

    Submitted 12 October, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: Accepted to OOPSLA'2020

  10. arXiv:1910.00577  [pdf, other]

    cs.LG cs.PL stat.ML

    Structural Language Models of Code

    Authors: Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

    Abstract: We address the problem of any-code completion - generating a missing piece of source code in a given program without any restriction on the vocabulary or structure. We introduce a new approach to any-code completion that leverages the strict syntax of programming languages to model a code snippet as a tree - structural language modeling (SLM). SLM estimates the probability of the program's abstrac…

    Submitted 29 July, 2020; v1 submitted 30 September, 2019; originally announced October 2019.

    Comments: Appeared in ICML'2020

  11. arXiv:1905.08325  [pdf, other]

    cs.PL cs.LG

    Towards Neural Decompilation

    Authors: Omer Katz, Yuval Olshaker, Yoav Goldberg, Eran Yahav

    Abstract: We address the problem of automatic decompilation, converting a program in low-level representation back to a higher-level human-readable programming language. The problem of decompilation is extremely important for security researchers. Finding vulnerabilities and understanding how malware operates is much easier when done over source code. The importance of decompilation has motivated the cons…

    Submitted 20 May, 2019; originally announced May 2019.

  12. arXiv:1902.09122  [pdf, other]

    cs.LG cs.CR cs.PL stat.ML

    Neural Reverse Engineering of Stripped Binaries using Augmented Control Flow Graphs

    Authors: Yaniv David, Uri Alon, Eran Yahav

    Abstract: We address the problem of reverse engineering of stripped executables, which contain no debug information. This is a challenging problem because of the low amount of syntactic information available in stripped executables, and the diverse assembly code patterns arising from compiler optimizations. We present a novel approach for predicting procedure names in stripped executables. Our approach co…

    Submitted 16 October, 2020; v1 submitted 25 February, 2019; originally announced February 2019.

  13. arXiv:1808.01400  [pdf, other]

    cs.LG cs.PL stat.ML

    code2seq: Generating Sequences from Structured Representations of Code

    Authors: Uri Alon, Shaked Brody, Omer Levy, Eran Yahav

    Abstract: The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval. Sequence-to-sequence (seq2seq) models, adopted from neural machine translation (NMT), have achieved state-of-the-art performance on these tasks by treating source code as a sequence of tokens. We present ${\rm {\scriptsize CODE2SEQ}}$:…

    Submitted 21 February, 2019; v1 submitted 3 August, 2018; originally announced August 2018.

    Comments: Accepted to ICLR'2019

  14. arXiv:1805.04908  [pdf, other]

    cs.LG cs.CL stat.ML

    On the Practical Computational Power of Finite Precision RNNs for Language Recognition

    Authors: Gail Weiss, Yoav Goldberg, Eran Yahav

    Abstract: While Recurrent Neural Networks (RNNs) are famously known to be Turing complete, this relies on infinite precision in the states and unbounded computation time. We consider the case of RNNs with finite precision whose computation time is linear in the input length. Under these limitations, we show that different RNN variants have different computational power. In particular, we show that the LSTM…

    Submitted 13 May, 2018; originally announced May 2018.

    Comments: Accepted as a short paper in ACL 2018

  15. arXiv:1803.09544  [pdf, other]

    cs.PL cs.LG

    A General Path-Based Representation for Predicting Program Properties

    Authors: Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

    Abstract: Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming and increase programmer productivity. A major challenge when learning from programs is $\textit{how to represent programs in a way that facilitates effective learning}$. We present a $\textit{general path-based representation}$ for learning from programs. Our repr…

    Submitted 22 April, 2018; v1 submitted 26 March, 2018; originally announced March 2018.

    Comments: to appear in PLDI 2018

  16. arXiv:1803.09473  [pdf, other]

    cs.LG cs.AI cs.PL stat.ML

    code2vec: Learning Distributed Representations of Code

    Authors: Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

    Abstract: We present a neural model for representing snippets of code as continuous distributed vectors ("code embeddings"). The main idea is to represent a code snippet as a single fixed-length $\textit{code vector}$, which can be used to predict semantic properties of the snippet. This is performed by decomposing code to a collection of paths in its abstract syntax tree, and learning the atomic representa…

    Submitted 30 October, 2018; v1 submitted 26 March, 2018; originally announced March 2018.

    Comments: Accepted in POPL 2019

  17. arXiv:1711.09576  [pdf, other]

    cs.LG cs.FL

    Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples

    Authors: Gail Weiss, Yoav Goldberg, Eran Yahav

    Abstract: We present a novel algorithm that uses exact learning and abstraction to extract a deterministic finite automaton describing the state dynamics of a given trained RNN. We do this using Angluin's L* algorithm as a learner and the trained RNN as an oracle. Our technique efficiently extracts accurate automata from trained RNNs, even when the state vectors are large and require fine differentiation.

    Submitted 27 February, 2020; v1 submitted 27 November, 2017; originally announced November 2017.

    Comments: Accepted in ICML 2018, (Feb 2020: added link to code, at https://github.com/tech-srl/lstar_extraction )

    Journal ref: ICML 2018

  18. arXiv:1710.01291  [pdf, other]

    cs.PL

    Programming Not Only by Example

    Authors: Hila Peleg, Sharon Shoham, Eran Yahav

    Abstract: In recent years, there has been tremendous progress in automated synthesis techniques that are able to automatically generate code based on some intent expressed by the programmer. A major challenge for the adoption of synthesis remains in having the programmer communicate their intent. When the expressed intent is coarse-grained (for example, restriction on the expected type of an expression), th…

    Submitted 3 October, 2017; originally announced October 2017.

  19. arXiv:1706.05070  [pdf, other]

    cs.LG

    Learning Disjunctions of Predicates

    Authors: Nader H. Bshouty, Dana Drachsler-Cohen, Martin Vechev, Eran Yahav

    Abstract: Let $F$ be a set of boolean functions. We present an algorithm for learning $F_\vee := \{\vee_{f\in S} f \mid S \subseteq F\}$ from membership queries. Our algorithm asks at most $|F| \cdot OPT(F_\vee)$ membership queries where $OPT(F_\vee)$ is the minimum worst case number of membership queries for learning $F_\vee$. When $F$ is a set of halfspaces over a constant dimension space or a set of vari…

    Submitted 15 June, 2017; originally announced June 2017.

  20. arXiv:1608.00089  [pdf, other]

    cs.PL

    Optimal Learning of Specifications from Examples

    Authors: Dana Drachsler-Cohen, Martin Vechev, Eran Yahav

    Abstract: A fundamental challenge in synthesis from examples is designing a learning algorithm that poses the minimal number of questions to an end user while guaranteeing that the target hypothesis is discovered. Such guarantees are practically important because they ensure that end users will not be overburdened with unnecessary questions. We present SPEX -- a learning algorithm that addresses the above…

    Submitted 30 July, 2016; originally announced August 2016.

  21. arXiv:1410.0151  [pdf, other]

    cs.CR

    Exploiting Social Navigation

    Authors: Meital Ben Sinai, Nimrod Partush, Shir Yadid, Eran Yahav

    Abstract: We present an effective Sybil attack against social location based services. Our attack is based on creating a large number of reputed "bot drivers", and controlling their reported locations using fake GPS reports. We show how this attack can be used to influence social navigation systems by applying it to Waze - a prominent social navigation application used by over 50 million drivers. We show th…

    Submitted 1 October, 2014; originally announced October 2014.