Showing 1–50 of 215 results for author: Wolf, L

Searching in archive cs.
  1. arXiv:2406.14528  [pdf, other]

    cs.LG cs.AI

    DeciMamba: Exploring the Length Extrapolation Potential of Mamba

    Authors: Assaf Ben-Kish, Itamar Zimerman, Shady Abu-Hussein, Nadav Cohen, Amir Globerson, Lior Wolf, Raja Giryes

    Abstract: Long-range sequence processing poses a significant challenge for Transformers due to their quadratic complexity in input length. A promising alternative is Mamba, which demonstrates high performance and achieves Transformer-level capabilities while requiring substantially fewer computational resources. In this paper we explore the length-generalization capabilities of Mamba, which we find to be re…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Link To Official Implementation: https://github.com/assafbk/DeciMamba

  2. arXiv:2406.12900  [pdf, other]

    cs.IT cs.AI cs.LG

    Factor Graph Optimization of Error-Correcting Codes for Belief Propagation Decoding

    Authors: Yoni Choukroun, Lior Wolf

    Abstract: The design of optimal linear block codes capable of being efficiently decoded is of major concern, especially for short block lengths. As near capacity-approaching codes, Low-Density Parity-Check (LDPC) codes possess several advantages over other families of codes, the most notable being their efficient decoding via Belief Propagation. While many LDPC code design methods exist, the development of ef…

    Submitted 9 June, 2024; originally announced June 2024.

  3. arXiv:2406.09920  [pdf, other]

    cs.CL cs.AI

    Knowledge Editing in Language Models via Adapted Direct Preference Optimization

    Authors: Amit Rozner, Barak Battash, Lior Wolf, Ofir Lindenbaum

    Abstract: Large Language Models (LLMs) can become outdated over time as they may lack updated world knowledge, leading to factual knowledge errors and gaps. Knowledge Editing (KE) aims to overcome this challenge using weight updates that do not require expensive retraining. We propose treating KE as an LLM alignment problem. Toward this goal, we introduce Knowledge Direct Preference Optimization (KDPO), a v…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9 pages, 4 figures

  4. arXiv:2406.06636  [pdf, other]

    cs.CL cs.LG

    LLM Questionnaire Completion for Automatic Psychiatric Assessment

    Authors: Gony Rosenman, Lior Wolf, Talma Hendler

    Abstract: We employ a Large Language Model (LLM) to convert unstructured psychological interviews into structured questionnaires spanning various psychiatric and personality domains. The LLM is prompted to answer these questionnaires by impersonating the interviewee. The obtained answers are coded as features, which are used to predict standardized psychiatric measures of depression (PHQ-8) and PTSD (PCL-C)…

    Submitted 9 June, 2024; originally announced June 2024.

    MSC Class: 68T50 ACM Class: I.2.7

  5. arXiv:2405.16504  [pdf, other]

    cs.LG

    A Unified Implicit Attention Formulation for Gated-Linear Recurrent Sequence Models

    Authors: Itamar Zimerman, Ameen Ali, Lior Wolf

    Abstract: Recent advances in efficient sequence modeling have led to attention-free layers, such as Mamba, RWKV, and various gated RNNs, all featuring sub-quadratic complexity in sequence length and excellent scaling properties, enabling the construction of a new type of foundation models. In this paper, we present a unified view of these models, formulating such layers as implicit causal self-attention lay…

    Submitted 26 May, 2024; originally announced May 2024.

    ACM Class: F.2.2; I.2.7

  6. arXiv:2405.04050  [pdf, other]

    cs.IT cs.AI

    Learning Linear Block Error Correction Codes

    Authors: Yoni Choukroun, Lior Wolf

    Abstract: Error correction codes are a crucial part of the physical communication layer, ensuring the reliable transfer of data over noisy channels. The design of optimal linear block codes capable of being efficiently decoded is of major concern, especially for short block lengths. While neural decoders have recently demonstrated their advantage over classical decoding techniques, the neural design of the…

    Submitted 7 May, 2024; originally announced May 2024.

  7. arXiv:2405.00791  [pdf, other]

    cs.CV cs.AI

    Obtaining Favorable Layouts for Multiple Object Generation

    Authors: Barak Battash, Amit Rozner, Lior Wolf, Ofir Lindenbaum

    Abstract: Large-scale text-to-image models that can generate high-quality and diverse images based on textual prompts have shown remarkable success. These models aim ultimately to create complex scenes, and addressing the challenge of multi-subject generation is a critical step towards this goal. However, the existing state-of-the-art diffusion models face difficulty when generating images that involve mult…

    Submitted 1 May, 2024; originally announced May 2024.

    MSC Class: I.2; I.4

  8. arXiv:2403.05549  [pdf, ps, other]

    cs.CY

    A Scheduling Perspective on Modular Educational Systems in Europe

    Authors: Rubén Ruiz-Torrubiano, Sebastian Knopp, Lukas Matthias Wolf, Andreas Krystallidis

    Abstract: In modular educational systems, students are allowed to choose a part of their own curriculum themselves. This is typically done in the final class levels which lead to maturity for university access. The rationale behind letting students choose their courses themselves is to enhance self-responsibility, improve student motivation, and allow a focus on specific areas of interest. A central instrum…

    Submitted 7 February, 2024; originally announced March 2024.

    Comments: Preprint submitted to International Journal of Educational Research

    ACM Class: J.1; I.2.8

  9. arXiv:2403.01590  [pdf, other]

    cs.LG

    The Hidden Attention of Mamba Models

    Authors: Ameen Ali, Itamar Zimerman, Lior Wolf

    Abstract: The Mamba layer offers an efficient selective state space model (SSM) that is highly effective in modeling multiple domains, including NLP, long-range sequence processing, and computer vision. Selective SSMs are viewed as dual models, in which one trains in parallel on the entire sequence via an IO-aware parallel scan, and deploys in an autoregressive manner. We add a third view and show that such…

    Submitted 31 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    MSC Class: F.2.2; I.2.7 ACM Class: F.2.2; I.2.7

  10. arXiv:2402.12865  [pdf, other]

    cs.CL cs.AI cs.LG

    Backward Lens: Projecting Language Model Gradients into the Vocabulary Space

    Authors: Shahar Katz, Yonatan Belinkov, Mor Geva, Lior Wolf

    Abstract: Understanding how Transformer-based Language Models (LMs) learn and recall information is a key goal of the deep learning community. Recent interpretability methods project weights and hidden states obtained from the forward pass to the models' vocabularies, helping to uncover how information flows within LMs. In this work, we extend this methodology to LMs' backward pass and gradients. We first p…

    Submitted 20 February, 2024; originally announced February 2024.

  11. arXiv:2402.03286  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Training-Free Consistent Text-to-Image Generation

    Authors: Yoad Tewel, Omri Kaduri, Rinon Gal, Yoni Kasten, Lior Wolf, Gal Chechik, Yuval Atzmon

    Abstract: Text-to-image models offer a new level of creative flexibility by allowing users to guide the image generation process through natural language. However, using these models to consistently portray the same subject across diverse prompts remains challenging. Existing approaches fine-tune the model to teach it new words that describe specific user-provided subjects or add image conditioning to the m…

    Submitted 30 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted to journal track of SIGGRAPH 2024 (TOG). Project page is at https://consistory-paper.github.io

  12. arXiv:2401.12819  [pdf, other]

    cs.LG cs.AI

    Dynamic Layer Tying for Parameter-Efficient Transformers

    Authors: Tamir David Hay, Lior Wolf

    Abstract: In the pursuit of reducing the number of trainable parameters in deep transformer networks, we employ Reinforcement Learning to dynamically select layers during training and tie them together. Every few iterations, the RL agent is asked whether to train each layer $i$ independently or to copy the weights of a previous layer $j<i$. This facilitates weight sharing, reduces the number of trainable pa…

    Submitted 23 January, 2024; originally announced January 2024.

  13. arXiv:2401.12570  [pdf, other]

    eess.AS cs.AI cs.SD

    DiffMoog: a Differentiable Modular Synthesizer for Sound Matching

    Authors: Noy Uzrad, Oren Barkan, Almog Elharar, Shlomi Shvartzman, Moshe Laufer, Lior Wolf, Noam Koenigstein

    Abstract: This paper presents DiffMoog - a differentiable modular synthesizer with a comprehensive set of modules typically found in commercial instruments. Being differentiable, it allows integration into neural networks, enabling automated sound matching, to replicate a given audio input. Notably, DiffMoog facilitates modulation capabilities (FM/AM), low-frequency oscillators (LFOs), filters, envelope sha…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 5 pages, 7 figures, 1 table, Our code is released at https://github.com/aisynth/diffmoog

  14. arXiv:2401.11316  [pdf, other]

    cs.CL cs.AI

    PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation

    Authors: Nadav Benedek, Lior Wolf

    Abstract: With the proliferation of large pre-trained language models (PLMs), fine-tuning all model parameters becomes increasingly inefficient, particularly when dealing with numerous downstream tasks that entail substantial training and storage costs. Several approaches aimed at achieving parameter-efficient fine-tuning (PEFT) have been proposed. Among them, Low-Rank Adaptation (LoRA) stands out as an arc…

    Submitted 20 January, 2024; originally announced January 2024.

    Comments: EACL 2024

  15. arXiv:2401.06766  [pdf, other]

    cs.CL

    Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements

    Authors: Anton Voronov, Lena Wolf, Max Ryabinin

    Abstract: Large language models demonstrate a remarkable capability for learning to solve new tasks from a few examples. The prompt template, or the way the input examples are formatted to obtain the prompt, is an important yet often overlooked aspect of in-context learning. In this work, we conduct a comprehensive study of the template format's influence on the in-context learning performance. We evaluate…

    Submitted 6 June, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: Accepted to Findings of ACL 2024. 24 pages, 10 figures. Code: https://github.com/yandex-research/mind-your-format

  16. arXiv:2312.13240  [pdf, other]

    cs.CV

    Efficient Verification-Based Face Identification

    Authors: Amit Rozner, Barak Battash, Ofir Lindenbaum, Lior Wolf

    Abstract: We study the problem of performing face verification with an efficient neural model $f$. The efficiency of $f$ stems from simplifying the face verification problem from an embedding nearest neighbor search into a binary problem; each user has its own neural network $f$. To allow information sharing between different individuals in the training set, we do not train $f$ directly but instead generate…

    Submitted 25 May, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: 10 pages, 5 figures

    ACM Class: I.4

  17. arXiv:2312.10458  [pdf, other]

    cs.LG

    Degree-based stratification of nodes in Graph Neural Networks

    Authors: Ameen Ali, Hakan Cevikalp, Lior Wolf

    Abstract: Despite much research, Graph Neural Networks (GNNs) still do not display the favorable scaling properties of other deep neural networks such as Convolutional Neural Networks and Transformers. Previous work has identified issues such as oversmoothing of the latent representation and has suggested solutions such as skip connections and sophisticated normalization schemes. Here, we propose a differe…

    Submitted 16 December, 2023; originally announced December 2023.

  18. arXiv:2312.02931  [pdf, other]

    cs.CL cs.AI

    WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words

    Authors: Lukas Wolf, Greta Tuckute, Klemen Kotar, Eghbal Hosseini, Tamar Regev, Ethan Wilcox, Alex Warstadt

    Abstract: Training on multiple modalities of input can augment the capabilities of a language model. Here, we ask whether such a training regime can improve the quality and efficiency of these systems as well. We focus on text-audio and introduce WhisBERT, which is inspired by the text-image approach of FLAVA (Singh et al., 2022). In accordance with BabyLM guidelines (Warstadt et al., 2023), we pretrain W…

    Submitted 6 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: Published at the BabyLM Challenge, a shared task co-sponsored by CMCL 2023 and CoNLL 2023, hosted by EMNLP 2023

  19. arXiv:2311.17233  [pdf, other]

    cs.CL cs.AI cs.IT cs.LG

    Quantifying the redundancy between prosody and text

    Authors: Lukas Wolf, Tiago Pimentel, Evelina Fedorenko, Ryan Cotterell, Alex Warstadt, Ethan Wilcox, Tamar Regev

    Abstract: Prosody -- the suprasegmental component of speech, including pitch, loudness, and tempo -- carries critical aspects of meaning. However, the relationship between the information conveyed by prosody vs. by the words themselves remains poorly understood. We use large language models (LLMs) to estimate how much information is redundant between prosody and the words themselves. Using a large spoken co…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: Published at The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)

  20. arXiv:2311.16620  [pdf, other]

    cs.LG cs.CL

    On the Long Range Abilities of Transformers

    Authors: Itamar Zimerman, Lior Wolf

    Abstract: Despite their dominance in modern DL and, especially, NLP domains, transformer architectures exhibit sub-optimal performance on long-range tasks compared to recent layers that are specifically designed for this purpose. In this work, drawing inspiration from key attributes of long-range layers, such as state-space layers, linear RNN layers, and global convolution layers, we demonstrate that minima…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: 18 pages

    ACM Class: F.2.2; I.2.7

  21. arXiv:2311.08610  [pdf, other]

    cs.LG cs.CR

    Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption

    Authors: Itamar Zimerman, Moran Baruch, Nir Drucker, Gilad Ezov, Omri Soceanu, Lior Wolf

    Abstract: Designing privacy-preserving deep learning models is a major challenge within the deep learning community. Homomorphic Encryption (HE) has emerged as one of the most promising approaches in this realm, enabling the decoupling of knowledge between the model owner and the data owner. Despite extensive research and application of this technology, primarily in convolutional neural networks, incorporat…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 6 figures

    ACM Class: F.2.2; I.2.7

  22. arXiv:2309.16429  [pdf, other]

    cs.LG cs.AI

    Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation

    Authors: Guy Yariv, Itai Gat, Sagie Benaim, Lior Wolf, Idan Schwartz, Yossi Adi

    Abstract: We consider the task of generating diverse and realistic videos guided by natural audio samples from a wide variety of semantic classes. For this task, the videos are required to be aligned both globally and temporally with the input audio: globally, the input audio is semantically associated with the entire output video, and temporally, each segment of the input audio is associated with a corresp…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: 9 pages, 6 figures

  23. arXiv:2309.13600  [pdf, other]

    cs.CV cs.LG

    Multi-Dimensional Hyena for Spatial Inductive Bias

    Authors: Itamar Zimerman, Lior Wolf

    Abstract: In recent years, Vision Transformers have attracted increasing interest from computer vision researchers. However, the advantage of these transformers over CNNs is only fully manifested when trained over a large dataset, mainly due to the reduced inductive bias towards spatial locality within the transformer's self-attention mechanism. In this work, we present a data-efficient vision transformer t…

    Submitted 24 September, 2023; originally announced September 2023.

    Comments: 10 pages, 3 figures

    ACM Class: F.2.2; I.2.7

  24. arXiv:2309.03884  [pdf, other]

    cs.SD cs.CL eess.AS

    Zero-Shot Audio Captioning via Audibility Guidance

    Authors: Tal Shaharabany, Ariel Shaulov, Lior Wolf

    Abstract: The task of audio captioning is similar in essence to tasks such as image and video captioning. However, it has received much less attention. We propose three desiderata for captioning audio -- (i) fluency of the generated text, (ii) faithfulness of the generated text to the input audio, and the somewhat related (iii) audibility, which is the quality of being able to be perceived based only on aud…

    Submitted 7 September, 2023; originally announced September 2023.

  25. arXiv:2309.03874  [pdf, other]

    cs.CV

    Box-based Refinement for Weakly Supervised and Unsupervised Localization Tasks

    Authors: Eyal Gomel, Tal Shaharabany, Lior Wolf

    Abstract: It has been established that training a box-based detector network can enhance the localization performance of weakly supervised and unsupervised methods. Moreover, we extend this understanding by demonstrating that these detectors can be utilized to improve the original network, paving the way for further advancements. To accomplish this, we train the detectors on top of the network output instea…

    Submitted 7 September, 2023; originally announced September 2023.

  26. arXiv:2307.10159  [pdf, other]

    cs.CV

    FABRIC: Personalizing Diffusion Models with Iterative Feedback

    Authors: Dimitri von Rütte, Elisabetta Fedele, Jonathan Thomm, Lukas Wolf

    Abstract: In an era where visual content generation is increasingly driven by machine learning, the integration of human feedback into generative models presents significant opportunities for enhancing user experience and output quality. This study explores strategies for incorporating iterative human feedback into the generative process of diffusion-based text-to-image models. We propose FABRIC, a training…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: 14 pages, 7 figures

    MSC Class: I.2.10

  27. arXiv:2306.15971  [pdf, other]

    q-bio.NC cs.AI cs.LG

    Reconstructing the Hemodynamic Response Function via a Bimodal Transformer

    Authors: Yoni Choukroun, Lior Golgher, Pablo Blinder, Lior Wolf

    Abstract: The relationship between blood flow and neuronal activity is widely recognized, with blood flow frequently serving as a surrogate for neuronal activity in fMRI studies. At the microscopic level, neuronal activity has been shown to influence blood flow in nearby blood vessels. This study introduces the first predictive model that addresses this issue directly at the explicit neuronal population lev…

    Submitted 28 June, 2023; originally announced June 2023.

  28. arXiv:2306.09004  [pdf, other]

    eess.IV cs.CV

    Annotator Consensus Prediction for Medical Image Segmentation with Diffusion Models

    Authors: Tomer Amit, Shmuel Shichrur, Tal Shaharabany, Lior Wolf

    Abstract: A major challenge in the segmentation of medical images is the large inter- and intra-observer variability in annotations provided by multiple experts. To address this challenge, we propose a novel method for multi-expert prediction using diffusion models. Our method leverages the diffusion-based approach to incorporate information from multiple annotations and fuse it into a unified segmentation…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: text overlap with arXiv:2112.00390

  29. arXiv:2306.06635  [pdf, other]

    cs.CV cs.LG

    2-D SSM: A General Spatial Layer for Visual Transformers

    Authors: Ethan Baron, Itamar Zimerman, Lior Wolf

    Abstract: A central objective in computer vision is to design models with appropriate 2-D inductive bias. Desiderata for 2D inductive bias include two-dimensional position awareness, dynamic spatial locality, and translation and permutation invariance. To address these goals, we leverage an expressive variation of the multidimensional State Space Model (SSM). Our approach introduces efficient parameterizati…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: 16 pages, 5 figures

    MSC Class: F.2.2; I.2.7

  30. arXiv:2306.06370  [pdf, other]

    cs.CV

    AutoSAM: Adapting SAM to Medical Images by Overloading the Prompt Encoder

    Authors: Tal Shaharabany, Aviad Dahan, Raja Giryes, Lior Wolf

    Abstract: The recently introduced Segment Anything Model (SAM) combines a clever architecture and large quantities of training data to obtain remarkable image segmentation capabilities. However, it fails to reproduce such results for Out-Of-Distribution (OOD) domains such as medical images. Moreover, while SAM is conditioned on either a mask or a set of points, it may be desirable to have a fully automatic…

    Submitted 10 June, 2023; originally announced June 2023.

  31. arXiv:2306.05167  [pdf, other]

    cs.LG

    Decision S4: Efficient Sequence-Based RL via State Spaces Layers

    Authors: Shmuel Bar-David, Itamar Zimerman, Eliya Nachmani, Lior Wolf

    Abstract: Recently, sequence learning methods have been applied to the problem of off-policy Reinforcement Learning, including the seminal work on Decision Transformers, which employs transformers for this task. Since transformers are parameter-heavy, cannot benefit from history longer than a fixed window size, and are not computed using recurrence, we set out to investigate the suitability of the S4 family…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 21 pages, 13 figures

    MSC Class: 14J60 ACM Class: F.2.2; I.2.7

  32. arXiv:2306.01610  [pdf, other]

    cs.LG

    Centered Self-Attention Layers

    Authors: Ameen Ali, Tomer Galanti, Lior Wolf

    Abstract: The self-attention mechanism in transformers and the message-passing mechanism in graph neural networks are repeatedly applied within deep learning architectures. We show that this application inevitably leads to oversmoothing, i.e., to similar representations at the deeper layers for different tokens in transformers and different nodes in graph neural networks. Based on our analysis, we present a…

    Submitted 2 June, 2023; originally announced June 2023.

  33. arXiv:2306.01158  [pdf, other]

    cs.LG cs.AI

    Heterogeneous Knowledge for Augmented Modular Reinforcement Learning

    Authors: Lorenz Wolf, Mirco Musolesi

    Abstract: Existing modular Reinforcement Learning (RL) architectures are generally based on reusable components, also allowing for "plug-and-play" integration. However, these modules are homogeneous in nature - in fact, they essentially provide policies obtained via RL through the maximization of individual reward functions. Consequently, such solutions still lack the ability to integrate and process mult…

    Submitted 14 April, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: 16 pages, 4 figures

  34. arXiv:2306.00966  [pdf, other]

    cs.CV

    The Hidden Language of Diffusion Models

    Authors: Hila Chefer, Oran Lang, Mor Geva, Volodymyr Polosukhin, Assaf Shocher, Michal Irani, Inbar Mosseri, Lior Wolf

    Abstract: Text-to-image diffusion models have demonstrated an unparalleled ability to generate high-quality, diverse images from a textual prompt. However, the internal representations learned by these models remain an enigma. In this work, we present Conceptor, a novel method to interpret the internal representation of a textual concept by a diffusion model. This interpretation is obtained by decomposing t…

    Submitted 5 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

  35. arXiv:2306.00582  [pdf, other]

    cs.LG cs.AI

    Anomaly Detection with Variance Stabilized Density Estimation

    Authors: Amit Rozner, Barak Battash, Henry Li, Lior Wolf, Ofir Lindenbaum

    Abstract: We propose a modified density estimation problem that is highly effective for detecting anomalies in tabular data. Our approach assumes that the density function is relatively stable (with lower variance) around normal samples. We have verified this hypothesis empirically using a wide range of real-world data. Then, we present a variance-stabilized density estimation problem for maximizing the lik…

    Submitted 8 May, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: 9 pages, 7 figures

    ACM Class: I.2

  36. arXiv:2305.14952  [pdf, other]

    cs.LG eess.SP

    Focus Your Attention (with Adaptive IIR Filters)

    Authors: Shahar Lutati, Itamar Zimerman, Lior Wolf

    Abstract: We present a new layer in which dynamic (i.e., input-dependent) Infinite Impulse Response (IIR) filters of order two are used to process the input sequence prior to applying conventional attention. The input is split into chunks, and the coefficients of these filters are determined based on previous chunks to maintain causality. Despite their relatively low order, the causal adaptive filters are sh…

    Submitted 18 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023

    ACM Class: F.2.2; I.2.7

  37. arXiv:2305.13050  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS

    AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation

    Authors: Guy Yariv, Itai Gat, Lior Wolf, Yossi Adi, Idan Schwartz

    Abstract: In recent years, image generation has shown a great leap in performance, where diffusion models play a central role. Although generating high-quality images, such models are mainly conditioned on textual descriptions. This begs the question: "how can we adopt such models to be conditioned on other modalities?". In this paper, we propose a novel method utilizing latent diffusion models trained for…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to INTERSPEECH 2023

  38. arXiv:2303.17155  [pdf, other]

    cs.CV cs.AI

    Discriminative Class Tokens for Text-to-Image Diffusion Models

    Authors: Idan Schwartz, Vésteinn Snæbjarnarson, Hila Chefer, Ryan Cotterell, Serge Belongie, Lior Wolf, Sagie Benaim

    Abstract: Recent advances in text-to-image diffusion models have enabled the generation of diverse and high-quality images. While impressive, the images often fall short of depicting subtle details and are susceptible to errors due to ambiguity in the input text. One way of alleviating these issues is to train diffusion models on class-labeled datasets. This approach has two disadvantages: (i) supervised da…

    Submitted 10 September, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: ICCV 2023

  39. arXiv:2303.07166  [pdf, ps, other]

    cs.LG cs.PL cs.SE

    Improved Tree Search for Automatic Program Synthesis

    Authors: Aran Carmon, Lior Wolf

    Abstract: In the task of automatic program synthesis, one obtains pairs of matching inputs and outputs and generates a computer program, in a particular domain-specific language (DSL), which given each sample input returns the matching output. A key element is being able to perform an efficient search in the space of valid programs. Here, we suggest a variant of MCTS that leads to state of the art results o…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Proceedings of the 2nd Exploration in Reinforcement Learning Workshop at the 36th International Conference on Machine Learning, 2019

  40. arXiv:2303.06552  [pdf, other]

    cs.LG

    Energy Regularized RNNs for Solving Non-Stationary Bandit Problems

    Authors: Michael Rotman, Lior Wolf

    Abstract: We consider a Multi-Armed Bandit problem in which the rewards are non-stationary and are dependent on past actions and potentially on past contexts. At the heart of our method, we employ a recurrent neural network, which models these sequences. In order to balance between exploration and exploitation, we present an energy minimization term that prevents the neural network from becoming too confide…

    Submitted 28 March, 2023; v1 submitted 11 March, 2023; originally announced March 2023.

  41. arXiv:2302.11413  [pdf, other]

    cs.CV eess.IV

    Gradient Adjusting Networks for Domain Inversion

    Authors: Erez Sheffi, Michael Rotman, Lior Wolf

    Abstract: StyleGAN2 was demonstrated to be a powerful image generation engine that supports semantic editing. However, in order to manipulate a real-world image, one first needs to be able to retrieve its corresponding latent representation in StyleGAN's latent space that is decoded to an image as close as possible to the desired image. For many real-world images, a latent representation does not exist, whi…

    Submitted 22 February, 2023; originally announced February 2023.

  42. arXiv:2301.13826  [pdf, other]

    cs.CV cs.CL cs.GR cs.LG

    Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models

    Authors: Hila Chefer, Yuval Alaluf, Yael Vinker, Lior Wolf, Daniel Cohen-Or

    Abstract: Recent text-to-image generative models have demonstrated an unparalleled ability to generate diverse and creative imagery guided by a target text prompt. While revolutionary, current state-of-the-art diffusion models may still fail in generating images that fully convey the semantics in the given text prompt. We analyze the publicly available Stable Diffusion model and assess the existence of cata…

    Submitted 31 May, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: Accepted to SIGGRAPH 2023; Project page available at https://yuval-alaluf.github.io/Attend-and-Excite/

  43. arXiv:2301.13530  [pdf, other]

    cs.LG cs.CV

    Domain-Generalizable Multiple-Domain Clustering

    Authors: Amit Rozner, Barak Battash, Lior Wolf, Ofir Lindenbaum

    Abstract: This work generalizes the problem of unsupervised domain generalization to the case in which no labeled samples are available (completely unsupervised). We are given unlabeled samples from multiple source domains, and we aim to learn a shared predictor that assigns examples to semantically related clusters. Evaluation is done by predicting cluster assignments in previously unseen domains. Towards…

    Submitted 31 January, 2024; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: 13 pages, 3 figures

  44. arXiv:2301.11930  [pdf, other]

    quant-ph cs.AI cs.ET cs.IT

    Deep Quantum Error Correction

    Authors: Yoni Choukroun, Lior Wolf

    Abstract: Quantum error correction codes (QECC) are a key component for realizing the potential of quantum computing. QECC, as its classical counterpart (ECC), enables the reduction of error rates, by distributing quantum logical information across redundant physical qubits, such that errors can be detected and corrected. In this work, we efficiently train novel end-to-end deep quantum error decode…

    Submitted 10 December, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

  45. arXiv:2301.10752  [pdf, other]

    eess.AS cs.AI

    Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation

    Authors: Shahar Lutati, Eliya Nachmani, Lior Wolf

    Abstract: The problem of speech separation, also known as the cocktail party problem, refers to the task of isolating a single speech signal from a mixture of speech signals. Previous work on source separation derived an upper bound for the source separation task in the domain of human speech. This bound is derived for deterministic models. Recent advancements in generative models challenge this bound. We s…

    Submitted 24 June, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

  46. arXiv:2211.13964  [pdf, other]

    cs.CR cs.CV cs.LG cs.NE stat.ML

    Generating 2D and 3D Master Faces for Dictionary Attacks with a Network-Assisted Latent Space Evolution

    Authors: Tomer Friedlander, Ron Shmelkin, Lior Wolf

    Abstract: A master face is a face image that passes face-based identity authentication for a high percentage of the population. These faces can be used to impersonate, with a high probability of success, any user, without having access to any user information. We optimize these faces for 2D and 3D face verification models, by using an evolutionary algorithm in the latent embedding space of the StyleGAN face…

    Submitted 28 November, 2022; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: accepted for publication in IEEE Transactions on Biometrics, Behavior, and Identity Science (TBIOM). This paper extends arXiv:2108.01077 that was accepted to IEEE FG 2021

  47. arXiv:2210.12112  [pdf, other]

    cs.CV cs.AI cs.CL

    Describing Sets of Images with Textual-PCA

    Authors: Oded Hupert, Idan Schwartz, Lior Wolf

    Abstract: We seek to semantically describe a set of images, capturing both the attributes of single images and the variations within the set. Our procedure is analogous to Principal Component Analysis, in which the role of projection vectors is replaced with generated phrases. First, a centroid phrase that has the largest average semantic similarity to the images in the set is generated, where both the comp…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: Accepted to Findings of EMNLP'22

  48. arXiv:2210.00471  [pdf, other]

    cs.LG

    OCD: Learning to Overfit with Conditional Diffusion Models

    Authors: Shahar Lutati, Lior Wolf

    Abstract: We present a dynamic model in which the weights are conditioned on an input sample x and are learned to match those that would be obtained by finetuning a base model on x and its label y. This mapping between an input sample and network weights is approximated by a denoising diffusion model. The diffusion model we employ focuses on modifying a single layer of the base model and is conditioned on t…

    Submitted 9 June, 2023; v1 submitted 2 October, 2022; originally announced October 2022.

    Comments: Accepted to ICML 2023 (Oral & Poster)

  49. arXiv:2209.13533  [pdf, other]

    cs.IT cs.AI cs.LG

    Denoising Diffusion Error Correction Codes

    Authors: Yoni Choukroun, Lior Wolf

    Abstract: Error correction code (ECC) is an integral part of the physical communication layer, ensuring reliable data transfer over noisy channels. Recently, neural decoders have demonstrated their advantage over classical decoding techniques. However, recent state-of-the-art neural decoders suffer from high complexity and lack the important iterative scheme characteristic of many legacy decoders. In this w…

    Submitted 16 September, 2022; originally announced September 2022.

  50. arXiv:2208.14445  [pdf]

    q-bio.QM cs.CV eess.IV

    Artificial intelligence-based locoregional markers of brain peritumoral microenvironment

    Authors: Zahra Riahi Samani, Drew Parker, Hamed Akbari, Spyridon Bakas, Ronald L. Wolf, Steven Brem, Ragini Verma

    Abstract: In malignant primary brain tumors, cancer cells infiltrate into the peritumoral brain structures which results in inevitable recurrence. Quantitative assessment of infiltrative heterogeneity in the peritumoral region, the area where biopsy or resection can be hazardous, is important for clinical decision making. Previous work on characterizing the infiltrative heterogeneity in the peritumoral regi…

    Submitted 29 August, 2022; originally announced August 2022.