Search | arXiv e-print repository

Latent CLAP Loss for Better Foley Sound Synthesis

Authors: Tornike Karchkhadze, Hassan Salami Kavaki, Mohammad Rasool Izadi, Bryce Irvin, Mikolaj Kegler, Ari Hertz, Shuo Zhang, Marko Stamenovic

Abstract: Foley sound generation, the art of creating audio for multimedia, has recently seen notable advancements through text-conditioned latent diffusion models. These systems use multimodal text-audio representation models, such as Contrastive Language-Audio Pretraining (CLAP), whose objective is to map corresponding audio and text prompts into a joint embedding space. AudioLDM, a text-to-audio model, was the winner of the DCASE2023 task 7 Foley sound synthesis challenge. The winning system fine-tuned the model for specific audio classes and applied a post-filtering method using CLAP similarity scores between output audio and input text at inference time, requiring the generation of extra samples, thus reducing data generation efficiency. We introduce a new loss term to enhance Foley sound generation in AudioLDM without post-filtering. This loss term uses a new module based on the CLAP model (the Latent CLAP encoder) to align the latent diffusion output with real audio in a shared CLAP embedding space. Our experiments demonstrate that our method effectively reduces the Fréchet Audio Distance (FAD) score of the generated audio and eliminates the need for post-filtering, thus enhancing generation efficiency.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.05226 [pdf, other]

Extremal Chemical Graphs for the Arithmetic-Geometric Index

Authors: Alain Hertz, Sébastien Bonte, Gauvain Devillez, Valentin Dusollier, Hadrien Mélot, David Schindl

Abstract: The arithmetic-geometric index is a newly proposed degree-based graph invariant in mathematical chemistry. We give a sharp upper bound on the value of this invariant for connected chemical graphs of given order and size and characterize the connected chemical graphs that reach the bound. We also prove that removing the constraint that extremal chemical graphs must be connected does not allow the upper bound to be increased.

Submitted 8 March, 2024; originally announced March 2024.

Comments: 15 pages
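
For readers unfamiliar with the invariant in the entry above, here is a minimal sketch of how it can be computed. The abstract does not state the formula, so this assumes the definition commonly used in the literature, AG(G) = sum over edges uv of (d_u + d_v) / (2 * sqrt(d_u * d_v)), where d_u is the degree of vertex u, and it assumes that a chemical graph is one with maximum degree at most 4. The function names are illustrative and do not come from the paper.

```python
from collections import Counter
from math import sqrt


def degrees(edges):
    """Return a Counter mapping each vertex to its degree."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def arithmetic_geometric_index(edges):
    """Sum (d_u + d_v) / (2 * sqrt(d_u * d_v)) over all edges uv.

    Assumed definition: the abstract itself does not give the formula.
    """
    deg = degrees(edges)
    return sum(
        (deg[u] + deg[v]) / (2 * sqrt(deg[u] * deg[v])) for u, v in edges
    )


def is_chemical(edges):
    """A graph is treated as chemical if no vertex has degree above 4."""
    return all(d <= 4 for d in degrees(edges).values())


# Cycle on 4 vertices: every edge joins two degree-2 vertices.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
# Star with 4 leaves: every edge joins a degree-4 and a degree-1 vertex.
star4 = [(0, 1), (0, 2), (0, 3), (0, 4)]

print(arithmetic_geometric_index(cycle4), is_chemical(cycle4))
print(arithmetic_geometric_index(star4), is_chemical(star4))
```

Each edge contributes at least 1 to the sum, with equality exactly when its endpoints have equal degree, which is why unbalanced degree pairs such as (4, 1) push the index up.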

arXiv:2402.04404 [pdf, other]

doi 10.1103/PhysRevA.110.012408

Quadrature Coherence Scale of Linear Combinations of Gaussian Functions in Phase Space

Authors: Anaelle Hertz, Aaron Z. Goldberg, Khabat Heshami

Abstract: The quadrature coherence scale (QCS) is a recently introduced measure that was shown to be an efficient witness of nonclassicality. It takes a simple form for pure and Gaussian states, but a general expression for mixed states tends to be prohibitively unwieldy. In this paper, we introduce a method for computing the quadrature coherence scale of quantum states characterized by Wigner functions expressible as linear combinations of Gaussian functions. Notable examples within this framework include cat states, GKP states, and states resulting from Gaussian transformations, measurements, and breeding protocols. In particular, we show that the quadrature coherence scale serves as a valuable tool for examining the scalability of nonclassicality in the presence of loss. Our findings lead us to put forth a conjecture suggesting that, subject to 50% loss or more, all pure states lose any QCS-certifiable nonclassicality. We also consider the quadrature coherence scale as a measure of quality of the output state of the breeding protocol.

Submitted 2 July, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

Comments: Added a clarification in the abstract. Improved figures. One section added to compare with other nonclassicality measures

Journal ref: Phys. Rev. A 110, 012408 (2024)

arXiv:2401.06105 [pdf, other]

PALP: Prompt Aligned Personalization of Text-to-Image Models

Authors: Moab Arar, Andrey Voynov, Amir Hertz, Omri Avrahami, Shlomi Fruchter, Yael Pritch, Daniel Cohen-Or, Ariel Shamir

Abstract: Content creators often aim to create personalized images using personal subjects that go beyond the capabilities of conventional text-to-image models. Additionally, they may want the resulting image to encompass a specific location, style, ambiance, and more. Existing personalization methods may compromise personalization ability or the alignment to complex textual prompts. This trade-off can impede the fulfillment of user prompts and subject fidelity. We propose a new approach focusing on personalization methods for a single prompt to address this issue. We term our approach prompt-aligned personalization. While this may seem restrictive, our method excels in improving text alignment, enabling the creation of images with complex and intricate prompts, which may pose a challenge for current techniques. In particular, our method keeps the personalized model aligned with a target prompt using an additional score distillation sampling term. We demonstrate the versatility of our method in multi- and single-shot settings and further show that it can compose multiple subjects or use inspiration from reference images, such as artworks. We compare our approach quantitatively and qualitatively with existing baselines and state-of-the-art techniques.

Submitted 11 January, 2024; originally announced January 2024.

Comments: Project page available at https://prompt-aligned.github.io/

arXiv:2312.02133 [pdf, other]

Style Aligned Image Generation via Shared Attention

Authors: Amir Hertz, Andrey Voynov, Shlomi Fruchter, Daniel Cohen-Or

Abstract: Large-scale Text-to-Image (T2I) models have rapidly gained prominence across creative fields, generating visually compelling outputs from textual prompts. However, controlling these models to ensure consistent style remains challenging, with existing methods necessitating fine-tuning and manual intervention to disentangle content and style. In this paper, we introduce StyleAligned, a novel technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. This approach allows for the creation of style-consistent images using a reference style through a straightforward inversion operation. Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity, underscoring its efficacy in achieving consistent style across various inputs.

Submitted 11 January, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

Comments: Project page at style-aligned-gen.github.io

arXiv:2311.17609 [pdf, other]

AnyLens: A Generative Diffusion Model with Any Rendering Lens

Authors: Andrey Voynov, Amir Hertz, Moab Arar, Shlomi Fruchter, Daniel Cohen-Or

Abstract: State-of-the-art diffusion models can generate highly realistic images based on various conditioning like text, segmentation, and depth. However, an essential aspect often overlooked is the specific camera geometry used during image capture. The influence of different optical systems on the final scene appearance is frequently overlooked. This study introduces a framework that intimately integrates a text-to-image diffusion model with the particular lens geometry used in image rendering. Our method is based on a per-pixel coordinate conditioning method, enabling the control over the rendering geometry. Notably, we demonstrate the manipulation of curvature properties, achieving diverse visual effects, such as fish-eye, panoramic views, and spherical texturing using a single diffusion model.

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2311.10093 [pdf, other]

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Authors: Omri Avrahami, Amir Hertz, Yael Vinker, Moab Arar, Shlomi Fruchter, Ohad Fried, Daniel Cohen-Or, Dani Lischinski

Abstract: Recent advances in text-to-image generation models have unlocked vast potential for visual creativity. However, users of these models struggle with the generation of consistent characters, a crucial aspect for numerous real-world applications such as story visualization, game development, asset design, advertising, and more. Current methods typically rely on multiple pre-existing images of the target character or involve labor-intensive manual processes. In this work, we propose a fully automated solution for consistent character generation, with the sole input being a text prompt. We introduce an iterative procedure that, at each stage, identifies a coherent set of images sharing a similar identity and extracts a more consistent identity from this set. Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods, and these findings are reinforced by a user study. To conclude, we showcase several practical applications of our approach.

Submitted 5 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: Accepted to SIGGRAPH 2024. Project page is available at https://omriavrahami.com/the-chosen-one/

arXiv:2310.19296 [pdf, other]

Complex-valued Wigner entropy of a quantum state

Authors: Nicolas J. Cerf, Anaelle Hertz, Zacharie Van Herstraeten

Abstract: It is common knowledge that the Wigner function of a quantum state may admit negative values, so that it cannot be viewed as a genuine probability density. Here, we examine the difficulty in finding an entropy-like functional in phase space that extends to negative Wigner functions and then advocate the merits of defining a complex-valued entropy associated with any Wigner function. This quantity, which we call the complex Wigner entropy, is defined via the analytic continuation of Shannon's differential entropy of the Wigner function in the complex plane. We show that the complex Wigner entropy enjoys interesting properties; in particular, its real and imaginary parts are both invariant under Gaussian unitaries (displacements, rotations, and squeezing in phase space). Its real part is physically relevant when considering the evolution of the Wigner function under a Gaussian convolution, while its imaginary part is simply proportional to the negative volume of the Wigner function. Finally, we define the complex-valued Fisher information of any Wigner function, which is linked (via an extended de Bruijn's identity) to the time derivative of the complex Wigner entropy when the state undergoes Gaussian additive noise. Overall, it is anticipated that the complex plane yields a proper framework for analyzing the entropic properties of quasiprobability distributions in phase space.

Submitted 30 October, 2023; originally announced October 2023.

Comments: 14 pages + 10 pages of Appendix

arXiv:2310.09341 [pdf, other]

Addressing the cold start problem in privacy preserving content-based recommender systems using hypercube graphs

Authors: Noa Tuval, Alain Hertz, Tsvi Kuflik

Abstract: The initial interaction of a user with a recommender system is problematic because, in such a so-called cold start situation, the recommender system has very little information about the user, if any. Moreover, in collaborative filtering, users need to share their preferences with the service provider by rating items while in content-based filtering there is no need for such information sharing. We have recently shown that a content-based model that uses hypercube graphs can determine user preferences with a very limited number of ratings while better preserving user privacy. In this paper, we confirm these findings on the basis of experiments with more than 1,000 users in the restaurant and movie domains. We show that the proposed method outperforms standard machine learning algorithms when the number of available ratings is at most 10, which often happens, and is competitive with larger training sets. In addition, training is simple and does not require large computational efforts.

Submitted 13 October, 2023; originally announced October 2023.

Comments: 22 pages, 6 figures, 9 tables

arXiv:2306.06088 [pdf, other]

SENS: Part-Aware Sketch-based Implicit Neural Shape Modeling

Authors: Alexandre Binninger, Amir Hertz, Olga Sorkine-Hornung, Daniel Cohen-Or, Raja Giryes

Abstract: We present SENS, a novel method for generating and editing 3D models from hand-drawn sketches, including those of abstract nature. Our method allows users to quickly and easily sketch a shape, and then maps the sketch into the latent space of a part-aware neural implicit shape architecture. SENS analyzes the sketch and encodes its parts into ViT patch encoding, subsequently feeding them into a transformer decoder that converts them to shape embeddings suitable for editing 3D neural implicit shapes. SENS provides intuitive sketch-based generation and editing, and also succeeds in capturing the intent of the user's sketch to generate a variety of novel and expressive 3D shapes, even from abstract and imprecise sketches. Additionally, SENS supports refinement via part reconstruction, allowing for nuanced adjustments and artifact removal. It also offers part-based modeling capabilities, enabling the combination of features from multiple sketches to create more complex and customized 3D shapes. We demonstrate the effectiveness of our model compared to the state-of-the-art using objective metric evaluation criteria and a user study, both indicating strong performance on sketches with a medium level of abstraction. Furthermore, we showcase our method's intuitive sketch-based shape editing capabilities, and validate it through a usability study.

Submitted 21 February, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: 25 pages, 24 figures

arXiv:2304.07090 [pdf, other]

Delta Denoising Score

Authors: Amir Hertz, Kfir Aberman, Daniel Cohen-Or

Abstract: We introduce Delta Denoising Score (DDS), a novel scoring function for text-based image editing that guides minimal modifications of an input image towards the content described in a target prompt. DDS leverages the rich generative prior of text-to-image diffusion models and can be used as a loss term in an optimization problem to steer an image towards a desired direction dictated by a text. DDS utilizes the Score Distillation Sampling (SDS) mechanism for the purpose of image editing. We show that using only SDS often produces non-detailed and blurry outputs due to noisy gradients. To address this issue, DDS uses a prompt that matches the input image to identify and remove undesired erroneous directions of SDS. Our key premise is that SDS should be zero when calculated on pairs of matched prompts and images, meaning that if the score is non-zero, its gradients can be attributed to the erroneous component of SDS. Our analysis demonstrates the competence of DDS for text-based image-to-image translation. We further show that DDS can be used to train an effective zero-shot image translation model. Experimental results indicate that DDS outperforms existing methods in terms of stability and quality, highlighting its potential for real-world applications in text-based image editing.

Submitted 14 April, 2023; originally announced April 2023.

Comments: Project page: https://delta-denoising-score.github.io/

arXiv:2303.01818 [pdf, other]

Word-As-Image for Semantic Typography

Authors: Shir Iluz, Yael Vinker, Amir Hertz, Daniel Berio, Daniel Cohen-Or, Ariel Shamir

Abstract: A word-as-image is a semantic typography technique where a word illustration presents a visualization of the meaning of the word, while also preserving its readability. We present a method to create word-as-image illustrations automatically. This task is highly challenging as it requires semantic understanding of the word and a creative idea of where and how to depict these semantics in a visually pleasing and legible manner. We rely on the remarkable ability of recent large pretrained language-vision models to distill textual concepts visually. We target simple, concise, black-and-white designs that convey the semantics clearly. We deliberately do not change the color or texture of the letters and do not use embellishments. Our method optimizes the outline of each letter to convey the desired concept, guided by a pretrained Stable Diffusion model. We incorporate additional loss terms to ensure the legibility of the text and the preservation of the style of the font. We show high quality and engaging results on numerous examples and compare to alternative techniques.

Submitted 6 March, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

arXiv:2212.11451 [pdf, other]

A machine learning framework for neighbor generation in metaheuristic search

Authors: Defeng Liu, Vincent Perreault, Alain Hertz, Andrea Lodi

Abstract: This paper presents a methodology for integrating machine learning techniques into metaheuristics for solving combinatorial optimization problems. Namely, we propose a general machine learning framework for neighbor generation in metaheuristic search. We first define an efficient neighborhood structure constructed by applying a transformation to a selected subset of variables from the current solution. Then, the key of the proposed methodology is to generate promising neighbors by selecting a proper subset of variables that contains a descent of the objective in the solution space. To learn a good variable selection strategy, we formulate the problem as a classification task that exploits structural information from the characteristics of the problem and from high-quality solutions. We validate our methodology on two metaheuristic applications: a Tabu Search scheme for solving a Wireless Network Optimization problem and a Large Neighborhood Search heuristic for solving Mixed-Integer Programs. The experimental results show that our approach is able to achieve a satisfactory trade-off between the exploration of a larger solution space and the exploitation of high-quality solution regions on both applications.

Submitted 21 December, 2022; originally announced December 2022.

arXiv:2211.09794 [pdf, other]

Null-text Inversion for Editing Real Images using Guided Diffusion Models

Authors: Ron Mokady, Amir Hertz, Kfir Aberman, Yael Pritch, Daniel Cohen-Or

Abstract: Recent text-guided diffusion models provide powerful image generation capabilities. Currently, a massive effort is given to enable the modification of these images using text only as a means to offer intuitive and versatile editing. To edit a real image using these state-of-the-art tools, one must first invert the image with a meaningful text prompt into the pretrained model's domain. In this paper, we introduce an accurate inversion technique and thus facilitate an intuitive text-based modification of the image. Our proposed inversion consists of two novel key components: (i) Pivotal inversion for diffusion models. While current methods aim at mapping random noise samples to a single input image, we use a single pivotal noise vector for each timestamp and optimize around it. We demonstrate that a direct inversion is inadequate on its own, but does provide a good anchor for our optimization. (ii) NULL-text optimization, where we only modify the unconditional textual embedding that is used for classifier-free guidance, rather than the input text embedding. This allows for keeping both the model weights and the conditional embedding intact and hence enables applying prompt-based editing while avoiding the cumbersome tuning of the model's weights. Our Null-text inversion, based on the publicly available Stable Diffusion model, is extensively evaluated on a variety of images and prompt editing, showing high-fidelity editing of real images.

Submitted 17 November, 2022; originally announced November 2022.

arXiv:2208.01626 [pdf, other]

Prompt-to-Prompt Image Editing with Cross Attention Control

Authors: Amir Hertz, Ron Mokady, Jay Tenenbaum, Kfir Aberman, Yael Pritch, Daniel Cohen-Or

Abstract: Recent large-scale text-driven synthesis models have attracted much attention thanks to their remarkable capabilities of generating highly diverse images that follow given text prompts. Such text-based synthesis methods are particularly appealing to humans who are used to verbally describing their intent. Therefore, it is only natural to extend the text-driven image synthesis to text-driven image editing. Editing is challenging for these generative models, since an innate property of an editing technique is to preserve most of the original image, while in the text-based models, even a small modification of the text prompt often leads to a completely different outcome. State-of-the-art methods mitigate this by requiring the users to provide a spatial mask to localize the edit, hence, ignoring the original structure and content within the masked region. In this paper, we pursue an intuitive prompt-to-prompt editing framework, where the edits are controlled by text only. To this end, we analyze a text-conditioned model in depth and observe that the cross-attention layers are the key to controlling the relation between the spatial layout of the image and each word in the prompt. With this observation, we present several applications which monitor the image synthesis by editing the textual prompt only. This includes localized editing by replacing a word, global editing by adding a specification, and even delicately controlling the extent to which a word is reflected in the image. We present our results over diverse images and prompts, demonstrating high-quality synthesis and fidelity to the edited prompts.

Submitted 2 August, 2022; originally announced August 2022.

arXiv:2204.10236 [pdf, ps, other]

The average size of maximal matchings in graphs

Authors: Alain Hertz, Sébastien Bonte, Gauvain Devillez, Hadrien Mélot

Abstract: We investigate the ratio $\avM(G)$ of the average size of a maximal matching to the size of a maximum matching in a graph $G$. If many maximal matchings have a size close to $\maxM(G)$, this graph invariant has a value close to 1. Conversely, if many maximal matchings have a small size, $\avM(G)$ approaches $\frac{1}{2}$. We propose a general technique to determine the asymptotic behavior of $\avM(G)$ for various classes of graphs. To illustrate the use of this technique, we first show how it makes it possible to find known asymptotic values of $\avM(G)$ which were typically obtained using generating functions, and we then determine the asymptotic value of $\avM(G)$ for other families of graphs, highlighting the spectrum of possible values of this graph invariant between $\frac{1}{2}$ and $1$.

Submitted 8 March, 2024; v1 submitted 21 April, 2022; originally announced April 2022.

Comments: 28 pages
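
The ratio defined in the abstract above can be checked by brute force on small graphs. The following is a minimal sketch that enumerates every maximal matching of a graph given as an edge list, then divides the average size by the maximum size. It is only an illustration of the definition, not the asymptotic technique the paper proposes; the function names are hypothetical, and the graph is assumed to have at least one edge.

```python
from itertools import combinations


def is_matching(edges):
    """True if no two edges in the collection share a vertex."""
    seen = set()
    for u, v in edges:
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True


def maximal_matchings(graph_edges):
    """Enumerate all maximal matchings by checking every edge subset."""
    result = []
    for k in range(len(graph_edges) + 1):
        for subset in combinations(graph_edges, k):
            if not is_matching(subset):
                continue
            covered = {x for edge in subset for x in edge}
            # Maximal means no further edge can be added, i.e. every
            # edge of the graph already touches a covered vertex.
            if all(u in covered or v in covered for u, v in graph_edges):
                result.append(subset)
    return result


def matching_ratio(graph_edges):
    """Average maximal-matching size divided by maximum matching size.

    Assumes the graph has at least one edge. A maximum matching is
    itself maximal, so the largest enumerated size is the maximum.
    """
    sizes = [len(m) for m in maximal_matchings(graph_edges)]
    return (sum(sizes) / len(sizes)) / max(sizes)


# Path on 4 vertices: maximal matchings are {01, 23} and {12}.
path4 = [(0, 1), (1, 2), (2, 3)]
print(matching_ratio(path4))
```

The enumeration is exponential in the number of edges, so it is only practical for sanity checks on very small graphs.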

arXiv:2204.06358 [pdf, other]

doi 10.1103/PhysRevA.107.043713

Decoherence and nonclassicality of photon-added/subtracted multi-mode Gaussian states

Authors: Anaelle Hertz, Stephan De Bièvre

Abstract: Photon addition and subtraction render Gaussian states non-Gaussian. We provide a quantitative analysis of the change in nonclassicality produced by these processes by analyzing the Wigner negativity and quadrature coherence scale (QCS) of the resulting states. The QCS is a recently introduced measure of nonclassicality [PRL 122, 080402 (2019), PRL 124, 090402 (2020)], that we show to undergo a relative increase under photon addition/subtraction that can be as large as 200%. This implies that the degaussification and the concomitant increase of nonclassicality come at a cost. Indeed, the QCS is proportional to the decoherence rate of the state so that the resulting states are considerably more prone to environmental decoherence. Our results are quantitative and rely on explicit and general expressions for the characteristic and Wigner functions of photon added/subtracted single- and multi-mode Gaussian states for which we provide a simple and straightforward derivation. These expressions further allow us to certify the quantum non-Gaussianity of the photon-subtracted states with positive Wigner function.

Submitted 2 April, 2023; v1 submitted 13 April, 2022; originally announced April 2022.

Comments: Considerably expanded version with study of Wigner negative volume and of genuine non-Gaussianity of photon-added/subtracted Gaussian states

arXiv:2203.08063 [pdf, other]

MotionCLIP: Exposing Human Motion Generation to CLIP Space

Authors: Guy Tevet, Brian Gordon, Amir Hertz, Amit H. Bermano, Daniel Cohen-Or

Abstract: We introduce MotionCLIP, a 3D human motion auto-encoder featuring a latent embedding that is disentangled, well behaved, and supports highly semantic textual descriptions. MotionCLIP gains its unique power by aligning its latent space with that of the Contrastive Language-Image Pre-training (CLIP) model. Aligning the human motion manifold to CLIP space implicitly infuses the extremely rich semantic knowledge of CLIP into the manifold. In particular, it helps continuity by placing semantically similar motions close to one another, and disentanglement, which is inherited from the CLIP-space structure. MotionCLIP comprises a transformer-based motion auto-encoder, trained to reconstruct motion while being aligned to its text label's position in CLIP-space. We further leverage CLIP's unique visual understanding and inject an even stronger signal through aligning motion to rendered frames in a self-supervised manner. We show that although CLIP has never seen the motion domain, MotionCLIP offers unprecedented text-to-motion abilities, allowing out-of-domain actions, disentangled editing, and abstract language specification. For example, the text prompt "couch" is decoded into a sitting down motion, due to lingual similarity, and the prompt "Spiderman" results in a web-swinging-like solution that is far from seen during training. In addition, we show how the introduced latent space can be leveraged for motion interpolation, editing and recognition.

Submitted 15 March, 2022; originally announced March 2022.

arXiv:2201.13168 [pdf, other]

SPAGHETTI: Editing Implicit Shapes Through Part Aware Generation

Authors: Amir Hertz, Or Perel, Raja Giryes, Olga Sorkine-Hornung, Daniel Cohen-Or

Abstract: Neural implicit fields are quickly emerging as an attractive representation for learning based techniques. However, adopting them for 3D shape modeling and editing is challenging. We introduce a method for $\mathbf{E}$diting $\mathbf{I}$mplicit $\mathbf{S}$hapes $\mathbf{T}$hrough $\mathbf{P}$art $\mathbf{A}$ware $\mathbf{G}$enera$\mathbf{T}$ion, permuted in short as SPAGHETTI. Our architecture allows for manipulation of implicit shapes by means of transforming, interpolating and combining shape segments together, without requiring explicit part supervision. SPAGHETTI disentangles shape part representation into extrinsic and intrinsic geometric information. This characteristic enables a generative framework with part-level control. The modeling capabilities of SPAGHETTI are demonstrated using an interactive graphical interface, where users can directly edit neural implicit shapes.

Submitted 31 January, 2022; originally announced January 2022.

arXiv:2111.09734 [pdf, other]

ClipCap: CLIP Prefix for Image Captioning

Authors: Ron Mokady, Amir Hertz, Amit H. Bermano

Abstract: Image captioning is a fundamental task in vision-language understanding, where the model predicts a textual informative caption to a given input image. In this paper, we present a simple approach to address this task. We use CLIP encoding as a prefix to the caption, by employing a simple mapping network, and then fine-tunes a language model to generate the image captions. The recently proposed CLIP model contains rich semantic features which were trained with textual context, making it best for vision-language perception. Our key idea is that together with a pre-trained language model (GPT2), we obtain a wide understanding of both visual and textual data. Hence, our approach only requires rather quick training to produce a competent captioning model. Without additional annotations or pre-training, it efficiently generates meaningful captions for large-scale and diverse datasets. Surprisingly, our method works well even when only the mapping network is trained, while both CLIP and the language model remain frozen, allowing a lighter architecture with less trainable parameters. Through quantitative evaluation, we demonstrate our model achieves comparable results to state-of-the-art methods on the challenging Conceptual Captions and nocaps datasets, while it is simpler, faster, and lighter. Our code is available in https://github.com/rmokady/CLIP_prefix_caption.

Submitted 18 November, 2021; originally announced November 2021.

arXiv:2110.05433 [pdf, other]

Mesh Draping: Parametrization-Free Neural Mesh Transfer

Authors: Amir Hertz, Or Perel, Raja Giryes, Olga Sorkine-Hornung, Daniel Cohen-Or

Abstract: Despite recent advances in geometric modeling, 3D mesh modeling still involves a considerable amount of manual labor by experts. In this paper, we introduce Mesh Draping: a neural method for transferring existing mesh structure from one shape to another. The method drapes the source mesh over the target geometry and at the same time seeks to preserve the carefully designed characteristics of the source mesh. At its core, our method deforms the source mesh using progressive positional encoding. We show that by leveraging gradually increasing frequencies to guide the neural optimization, we are able to achieve stable and high quality mesh transfer. Our approach is simple and requires little user guidance, compared to contemporary surface mapping techniques which rely on parametrization or careful manual tuning. Most importantly, Mesh Draping is a parameterization-free method, and thus applicable to a variety of target shape representations, including point clouds, polygon soups, and non-manifold meshes. We demonstrate that the transferred meshing remains faithful to the source mesh design characteristics, and at the same time fits the target geometry well.

Submitted 11 October, 2021; originally announced October 2021.

Comments: 12 pages. Portions of this work previously appeared as arXiv:2104.09125v1 which has been split into two works: arXiv:2104.09125v2+ and this work

arXiv:2109.00951 [pdf, other]

GAM: Explainable Visual Similarity and Classification via Gradient Activation Maps

Authors: Oren Barkan, Omri Armstrong, Amir Hertz, Avi Caciularu, Ori Katz, Itzik Malkiel, Noam Koenigstein

Abstract: We present Gradient Activation Maps (GAM) - a machinery for explaining predictions made by visual similarity and classification models. By gleaning localized gradient and activation information from multiple network layers, GAM offers improved visual explanations, when compared to existing alternatives. The algorithmic advantages of GAM are explained in detail, and validated empirically, where it is shown that GAM outperforms its alternatives across various tasks and datasets.

Submitted 2 September, 2021; originally announced September 2021.

Comments: CIKM 2021

arXiv:2105.12703 [pdf, other]

Exploring dual information in distance metric learning for clustering

Authors: Rodrigo Randel, Daniel Aloise, Alain Hertz

Abstract: Distance metric learning algorithms aim to appropriately measure similarities and distances between data points. In the context of clustering, metric learning is typically applied with the assist of side-information provided by experts, most commonly expressed in the form of cannot-link and must-link constraints. In this setting, distance metric learning algorithms move closer pairs of data points involved in must-link constraints, while pairs of points involved in cannot-link constraints are moved away from each other. For these algorithms to be effective, it is important to use a distance metric that matches the expert knowledge, beliefs, and expectations, and the transformations made to stick to the side-information should preserve geometrical properties of the dataset. Also, it is interesting to filter the constraints provided by the experts to keep only the most useful and reject those that can harm the clustering process. To address these issues, we propose to exploit the dual information associated with the pairwise constraints of the semi-supervised clustering problem. Experiments clearly show that distance metric learning algorithms benefit from integrating this dual information.

Submitted 26 May, 2021; originally announced May 2021.

Comments: 25 pages, 6 figures

arXiv:2105.01120 [pdf, ps, other]

doi 10.1007/s00373-023-02637-9

Upper bounds on the average number of colors in the non-equivalent colorings of a graph

Authors: Alain Hertz, Hadrien Mélot, Sébastien Bonte, Gauvain Devillez, Pierre Hauweele

Abstract: A coloring of a graph is an assignment of colors to its vertices such that adjacent vertices have different colors. Two colorings are equivalent if they induce the same partition of the vertex set into color classes. Let $\mathcal{A}(G)$ be the average number of colors in the non-equivalent colorings of a graph $G$. We give a general upper bound on $\mathcal{A}(G)$ that is valid for all graphs $G$ and a more precise one for graphs $G$ of order $n$ and maximum degree $\Delta(G)\in \{1,2,n-2\}$.

Submitted 3 May, 2021; originally announced May 2021.

Comments: 21 pages, 1 figure. arXiv admin note: text overlap with arXiv:2104.14172

arXiv:2104.14172 [pdf, ps, other]

doi 10.1016/j.dam.2022.08.011

Lower Bounds and properties for the average number of colors in the non-equivalent colorings of a graph

Authors: Alain Hertz, Hadrien Mélot, Sébastien Bonte, Gauvain Devillez

Abstract: We study the average number $\mathcal{A}(G)$ of colors in the non-equivalent colorings of a graph $G$. We show some general properties of this graph invariant and determine its value for some classes of graphs. We then conjecture several lower bounds on $\mathcal{A}(G)$ and prove that these conjectures are true for specific classes of graphs such as triangulated graphs and graphs with maximum degree at most 2.

Submitted 29 April, 2021; originally announced April 2021.

Comments: 20 pages, 2 figures

arXiv:2104.09125 [pdf, other]

SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization

Authors: Amir Hertz, Or Perel, Raja Giryes, Olga Sorkine-Hornung, Daniel Cohen-Or

Abstract: Multilayer-perceptrons (MLP) are known to struggle with learning functions of high-frequencies, and in particular cases with wide frequency bands. We present a spatially adaptive progressive encoding (SAPE) scheme for input signals of MLP networks, which enables them to better fit a wide range of frequencies without sacrificing training stability or requiring any domain specific preprocessing. SAPE gradually unmasks signal components with increasing frequencies as a function of time and space. The progressive exposure of frequencies is monitored by a feedback loop throughout the neural optimization process, allowing changes to propagate at different rates among local spatial portions of the signal space. We demonstrate the advantage of SAPE on a variety of domains and applications, including regression of low dimensional signals and images, representation learning of occupancy networks, and a geometric task of mesh transfer between 3D shapes.

Submitted 28 May, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

arXiv:2104.07510 [pdf, other]

doi 10.1103/PhysRevA.104.022427

Realignment separability criterion assisted with filtration for detecting continuous-variable entanglement

Authors: Anaelle Hertz, Matthieu Arnhem, Ali Asadian, Nicolas J. Cerf

Abstract: We introduce a weak form of the realignment separability criterion which is particularly suited to detect continuous-variable entanglement and is physically implementable (it requires linear optics transformations and homodyne detection). Moreover, we define a family of states, called Schmidt-symmetric states, for which the weak realignment criterion reduces to the original formulation of the realignment criterion, making it even more valuable as it is easily computable especially in higher dimensions. Then, we focus in particular on Gaussian states and introduce a filtration procedure based on noiseless amplification or attenuation, which enhances the entanglement detection sensitivity. In some specific examples, it does even better than the original realignment criterion.

Submitted 23 August, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

Comments: Minor corrections in v2 to match the published version of the paper

Journal ref: Phys. Rev. A 104, 022427 (2021)

arXiv:2104.00552 [pdf, ps, other]

Using Graph Theory to Derive Inequalities for the Bell Numbers

Authors: Alain Hertz, Anaelle Hertz, Hadrien Mélot

Abstract: The Bell numbers count the number of different ways to partition a set of $n$ elements while the graphical Bell numbers count the number of non-equivalent partitions of the vertex set of a graph into stable sets. This relation between graph theory and integer sequences has motivated us to study properties on the average number of colors in the non-equivalent colorings of a graph to discover new non trivial inequalities for the Bell numbers. Example are given to illustrate our approach.

Submitted 19 October, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

Journal ref: Journal of Integer Sequences, 24 (2021), 21.10.6

arXiv:2007.00074 [pdf, other]

doi 10.1145/3386569.3392471

Deep Geometric Texture Synthesis

Authors: Amir Hertz, Rana Hanocka, Raja Giryes, Daniel Cohen-Or

Abstract: Recently, deep generative adversarial networks for image generation have advanced rapidly; yet, only a small amount of research has focused on generative models for irregular structures, particularly meshes. Nonetheless, mesh generation and synthesis remains a fundamental topic in computer graphics. In this work, we propose a novel framework for synthesizing geometric textures. It learns geometric texture statistics from local neighborhoods (i.e., local triangular patches) of a single reference 3D model. It learns deep features on the faces of the input triangulation, which is used to subdivide and generate offsets across multiple scales, without parameterization of the reference or target mesh. Our network displaces mesh vertices in any direction (i.e., in the normal and tangential direction), enabling synthesis of geometric textures, which cannot be expressed by a simple 2D displacement map. Learning and synthesizing on local geometric patches enables a genus-oblivious framework, facilitating texture transfer between shapes of different genus.

Submitted 30 June, 2020; originally announced July 2020.

Comments: SIGGRAPH 2020

arXiv:2004.11782 [pdf, other]

doi 10.1103/PhysRevA.102.032413

Relating the Entanglement and Optical Nonclassicality of Multimode States of a Bosonic Quantum Field

Authors: Anaelle Hertz, Nicolas J. Cerf, Stephan De Bièvre

Abstract: The quantum nature of the state of a bosonic quantum field manifests itself in its entanglement, coherence, or optical nonclassicality which are each known to be resources for quantum computing or metrology. We provide quantitative and computable bounds relating entanglement measures with optical nonclassicality measures. These bounds imply that strongly entangled states must necessarily be strongly optically nonclassical. As an application, we infer strong bounds on the entanglement that can be produced with an optically nonclassical state impinging on a beam splitter. For Gaussian states, we analyze the link between the logarithmic negativity and a specific nonclassicality witness called "quadrature coherence scale".

Submitted 23 September, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

Comments: 13 pages, 2 figures, change of notation in v2

Journal ref: Phys. Rev. A 102, 032413 (2020)

arXiv:2003.13326 [pdf, other]

PointGMM: a Neural GMM Network for Point Clouds

Authors: Amir Hertz, Rana Hanocka, Raja Giryes, Daniel Cohen-Or

Abstract: Point clouds are a popular representation for 3D shapes. However, they encode a particular sampling without accounting for shape priors or non-local information. We advocate for the use of a hierarchical Gaussian mixture model (hGMM), which is a compact, adaptive and lightweight representation that probabilistically defines the underlying 3D surface. We present PointGMM, a neural network that learns to generate hGMMs which are characteristic of the shape class, and also coincide with the input point cloud. PointGMM is trained over a collection of shapes to learn a class-specific prior. The hierarchical representation has two main advantages: (i) coarse-to-fine learning, which avoids converging to poor local-minima; and (ii) (an unsupervised) consistent partitioning of the input shape. We show that as a generative model, PointGMM learns a meaningful latent space which enables generating consistent interpolations between existing shapes, as well as synthesizing novel shapes. We also present a novel framework for rigid registration using PointGMM, that learns to disentangle orientation from structure of an input shape.

Submitted 30 March, 2020; originally announced March 2020.

Comments: CVPR 2020 -- final version

arXiv:1909.05025 [pdf, other]

doi 10.1103/PhysRevLett.124.090402

Quadrature coherence scale driven fast decoherence of bosonic quantum field states

Authors: Anaelle Hertz, Stephan De Bièvre

Abstract: We introduce, for each state of a bosonic quantum field, its quadrature coherence scale (QCS), a measure of the range of its quadrature coherences. Under coupling to a thermal bath, the purity and QCS are shown to decrease on a time scale inversely proportional to the QCS squared. The states most fragile to decoherence are therefore those with quadrature coherences far from the diagonal. We further show a large QCS is difficult to measure since it induces small scale variations in the state's Wigner function. These two observations imply a large QCS constitutes a mark of "macroscopic coherence". Finally, we link the QCS to optical classicality: optical classical states have a small QCS and a large QCS implies strong optical nonclassicality.

Submitted 11 March, 2020; v1 submitted 11 September, 2019; originally announced September 2019.

Comments: 12 pages, 5 figures. New version to match the published version. Minor errors were corrected

Journal ref: Phys. Rev. Lett. 124, 090402 (2020)

arXiv:1908.11300 [pdf, other]

On graceful difference labelings of disjoint unions of circuits

Authors: Alain Hertz, Christophe Picouleau

Abstract: A graceful difference labeling (gdl for short) of a directed graph G with vertex set V is a bijection f between V and {1,...,|V|} such that, when each arc uv is assigned the difference label f(v)-f(u), the resulting arc labels are distinct. We conjecture that all disjoint unions of circuits have a gdl, except in two particular cases. We prove partial results which support this conjecture.

Submitted 29 August, 2019; originally announced August 2019.

Comments: 21 pages,15 figures, 9 tables

arXiv:1907.09183 [pdf, other]

doi 10.1103/PhysRevA.100.052112

Multi-copy uncertainty observable inducing a symplectic-invariant uncertainty relation in position and momentum phase space

Authors: Anaelle Hertz, Ognyan Oreshkov, Nicolas J. Cerf

Abstract: We define an uncertainty observable, acting on several replicas of a continuous-variable bosonic state, whose trivial uncertainty lower bound induces nontrivial phase-space uncertainty relations for a single copy of the state. By exploiting the Schwinger representation of angular momenta in terms of bosonic operators, we construct such an observable that is invariant under symplectic transformations (rotation and squeezing in phase space). We first design a two-copy uncertainty observable, which is a discrete-spectrum operator vanishing with certainty if and only if it is applied on (two copies of) any pure Gaussian state centered at the origin. The non-negativity of its variance translates into the Schrödinger-Robertson uncertainty relation. We then extend our construction to a three-copy uncertainty observable, which exhibits additional invariance under displacements (translations in phase space) so that it vanishes on every pure Gaussian state. The resulting invariance under Gaussian unitaries makes this observable a natural tool to measure the phase-space uncertainty -- or the deviation from pure Gaussianity -- of continuous-variable bosonic states. In particular, it suggests that the Shannon entropy of this observable provides a symplectic-invariant entropic measure of uncertainty in position and momentum phase space.

Submitted 19 November, 2019; v1 submitted 22 July, 2019; originally announced July 2019.

Comments: 15 pages. 5 figures. Minor changes in V2 to match the published version of the paper

Journal ref: Phys. Rev. A 100, 052112 (2019)

arXiv:1904.02756 [pdf, other]

Blind Visual Motif Removal from a Single Image

Authors: Amir Hertz, Sharon Fogel, Rana Hanocka, Raja Giryes, Daniel Cohen-Or

Abstract: Many images shared over the web include overlaid objects, or visual motifs, such as text, symbols or drawings, which add a description or decoration to the image. For example, decorative text that specifies where the image was taken, repeatedly appears across a variety of different images. Often, the reoccurring visual motif, is semantically similar, yet, differs in location, style and content (e.g. text placement, font and letters). This work proposes a deep learning based technique for blind removal of such objects. In the blind setting, the location and exact geometry of the motif are unknown. Our approach simultaneously estimates which pixels contain the visual motif, and synthesizes the underlying latent image. It is applied to a single input image, without any user assistance in specifying the location of the motif, achieving state-of-the-art results for blind removal of both opaque and semi-transparent visual motifs.

Submitted 4 April, 2019; originally announced April 2019.

Comments: CVPR 2019

arXiv:1809.05910 [pdf, other]

doi 10.1145/3306346.3322959

MeshCNN: A Network with an Edge

Authors: Rana Hanocka, Amir Hertz, Noa Fish, Raja Giryes, Shachar Fleishman, Daniel Cohen-Or

Abstract: Polygonal meshes provide an efficient representation for 3D shapes. They explicitly capture both shape surface and topology, and leverage non-uniformity to represent large flat regions as well as sharp, intricate features. This non-uniformity and irregularity, however, inhibits mesh analysis efforts using neural networks that combine convolution and pooling operations. In this paper, we utilize the unique properties of the mesh for a direct analysis of 3D shapes using MeshCNN, a convolutional neural network designed specifically for triangular meshes. Analogous to classic CNNs, MeshCNN combines specialized convolution and pooling layers that operate on the mesh edges, by leveraging their intrinsic geodesic connections. Convolutions are applied on edges and the four edges of their incident triangles, and pooling is applied via an edge collapse operation that retains surface topology, thereby, generating new mesh connectivity for the subsequent convolutions. MeshCNN learns which edges to collapse, thus forming a task-driven process where the network exposes and expands the important features while discarding the redundant ones. We demonstrate the effectiveness of our task-driven pooling on various learning tasks applied to 3D meshes.

Submitted 13 February, 2019; v1 submitted 16 September, 2018; originally announced September 2018.

Comments: For a two-minute explanation video see https://bit.ly/meshcnnvideo

arXiv:1809.03158 [pdf, ps, other]

doi 10.2298/YJOR1811

Minimum Eccentric Connectivity Index for Graphs with Fixed Order and Fixed Number of Pending Vertices

Authors: Gauvain Devillez, Alain Hertz, Hadrien Mélot, Pierre Hauweele

Abstract: The eccentric connectivity index of a connected graph $G$ is the sum over all vertices $v$ of the product $d_{G}(v) e_{G}(v)$, where $d_{G}(v)$ is the degree of $v$ in $G$ and $e_{G}(v)$ is the maximum distance between $v$ and any other vertex of $G$. This index is helpful for the prediction of biological activities of diverse nature, a molecule being modeled as a graph where atoms are represented by vertices and chemical bonds by edges. We characterize those graphs which have the smallest eccentric connectivity index among all connected graphs of a given order $n$. Also, given two integers $n$ and $p$ with $p\leq n-1$, we characterize those graphs which have the smallest eccentric connectivity index among all connected graphs of order $n$ with $p$ pending vertices.

Submitted 10 September, 2018; originally announced September 2018.

Comments: 9 pages

arXiv:1809.01052 [pdf, other]

doi 10.1088/1751-8121/ab03f3

Continuous-variable entropic uncertainty relations

Authors: Anaelle Hertz, Nicolas J. Cerf

Abstract: Uncertainty relations are central to quantum physics. While they were originally formulated in terms of variances, they have later been successfully expressed with entropies following the advent of Shannon information theory. Here, we review recent results on entropic uncertainty relations involving continuous variables, such as position $x$ and momentum $p$. This includes the generalization to arbitrary (not necessarily canonically-conjugate) variables as well as entropic uncertainty relations that take $x$-$p$ correlations into account and admit all Gaussian pure states as minimum uncertainty states. We emphasize that these continuous-variable uncertainty relations can be conveniently reformulated in terms of entropy power, a central quantity in the information-theoretic description of random signals, which makes a bridge with variance-based uncertainty relations. In this review, we take the quantum optics viewpoint and consider uncertainties on the amplitude and phase quadratures of the electromagnetic field, which are isomorphic to $x$ and $p$, but the formalism applies to all such variables (and linear combinations thereof) regardless of their physical meaning. Then, in the second part of this paper, we move on to new results and introduce a tighter entropic uncertainty relation for two arbitrary vectors of intercommuting continuous variables that take correlations into account. It is proven conditionally on reasonable assumptions. Finally, we present some conjectures for new entropic uncertainty relations involving more than two continuous variables.

Submitted 29 April, 2019; v1 submitted 4 September, 2018; originally announced September 2018.

Comments: Review paper, 42 pages, 1 figure. We corrected some minor errors in V2

Journal ref: J. Phys. A: Math. Theor. 52(17) 2019. Shannon's Information Theory 70 years on: applications in classical and quantum physics

arXiv:1808.10203 [pdf, ps, other]

doi 10.1016/j.dam.2019.04.031

Maximum Eccentric Connectivity Index for Graphs with Given Diameter

Authors: Pierre Hauweele, Alain Hertz, Hadrien Mélot, Bernard Ries, Gauvain Devillez

Abstract: The eccentricity of a vertex $v$ in a graph $G$ is the maximum distance between $v$ and any other vertex of $G$. The diameter of a graph $G$ is the maximum eccentricity of a vertex in $G$. The eccentric connectivity index of a connected graph is the sum over all vertices of the product between eccentricity and degree. Given two integers $n$ and $D$ with $D\leq n-1$, we characterize those graphs which have the largest eccentric connectivity index among all connected graphs of order $n$ and diameter $D$. As a corollary, we also characterize those graphs which have the largest eccentric connectivity index among all connected graphs of a given order $n$.

Submitted 30 August, 2018; originally announced August 2018.

Comments: 13 pages
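
The definition given in the abstract above (sum over all vertices of eccentricity times degree) can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the adjacency-list representation, and the two small test graphs are assumptions chosen for illustration, and eccentricities are computed by a plain breadth-first search.

```python
from collections import deque


def eccentricities(adj):
    """Return {vertex: eccentricity} for a connected, unweighted graph.

    `adj` maps each vertex to a list of its neighbours (hypothetical
    representation chosen for this sketch).
    """
    ecc = {}
    for source in adj:
        # Breadth-first search gives shortest-path distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        ecc[source] = max(dist.values())
    return ecc


def eccentric_connectivity_index(adj):
    """Sum over all vertices of eccentricity(v) * degree(v)."""
    ecc = eccentricities(adj)
    return sum(ecc[v] * len(adj[v]) for v in adj)


# Path on 4 vertices: eccentricities 3, 2, 2, 3 and degrees 1, 2, 2, 1.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# Star with 3 leaves: centre has eccentricity 1, each leaf eccentricity 2.
star3 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(eccentric_connectivity_index(path4))  # 3*1 + 2*2 + 2*2 + 3*1 = 14
print(eccentric_connectivity_index(star3))  # 1*3 + 3 * (2*1) = 9
```

The diameter mentioned in the abstract is simply `max(eccentricities(adj).values())` in this representation.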

arXiv:1711.04566 [pdf, other]

doi 10.1103/PhysRevA.97.012111

Multidimensional entropic uncertainty relation based on a commutator matrix in position and momentum spaces

Authors: Anaelle Hertz, Luc Vanbever, Nicolas J. Cerf

Abstract: The uncertainty relation for continuous variables due to Bialynicki-Birula and Mycielski expresses the complementarity between two $n$-tuples of canonically conjugate variables $(x_1,x_2,\cdots x_n)$ and $(p_1,p_2,\cdots p_n)$ in terms of Shannon differential entropy. Here, we consider the generalization to variables that are not canonically conjugate and derive an entropic uncertainty relation expressing the balance between any two $n$-variable Gaussian projective measurements. The bound on entropies is expressed in terms of the determinant of a matrix of commutators between the measured variables. This uncertainty relation also captures the complementarity between any two incompatible linear canonical transforms, the bound being written in terms of the corresponding symplectic matrices in phase space. Finally, we extend this uncertainty relation to Rényi entropies and also prove a covariance-based uncertainty relation which generalizes the Robertson relation.

Submitted 12 January, 2018; v1 submitted 13 November, 2017; originally announced November 2017.

Comments: 8 pages, 1 figure

Journal ref: Phys. Rev. A 97, 012111 (2018)

arXiv:1702.07286 [pdf, other]

doi 10.1088/1751-8121/aa852f

Entropy-power uncertainty relations : towards a tight inequality for all Gaussian pure states

Authors: Anaelle Hertz, Michael G. Jabbour, Nicolas J. Cerf

Abstract: We show that a proper expression of the uncertainty relation for a pair of canonically-conjugate continuous variables relies on entropy power, a standard notion in Shannon information theory for real-valued signals. The resulting entropy-power uncertainty relation is equivalent to the entropic formulation of the uncertainty relation due to Bialynicki-Birula and Mycielski, but can be further extended to rotated variables. Hence, based on a reasonable assumption, we give a partial proof of a tighter form of the entropy-power uncertainty relation taking correlations into account and provide extensive numerical evidence of its validity. Interestingly, it implies the generalized (rotation-invariant) Schrödinger-Robertson uncertainty relation exactly as the original entropy-power uncertainty relation implies the Heisenberg relation. It is saturated for all Gaussian pure states, in contrast with hitherto known entropic formulations of the uncertainty principle.

Submitted 30 August, 2017; v1 submitted 23 February, 2017; originally announced February 2017.

Comments: 15 pages, 5 figures, the new version includes the n-mode case

Journal ref: J. Phys. A: Math. Theor. 50 385301 (2017)

arXiv:1607.07082 [pdf, other]

On the edge capacitated Steiner tree problem

Authors: Cedric Bentz, Marie-Christine Costa, Alain Hertz

Abstract: Given a graph G = (V,E) with a root r in V, positive capacities {c(e)|e in E}, and non-negative lengths {l(e)|e in E}, the minimum-length (rooted) edge capacitated Steiner tree problem is to find a tree in G of minimum total length, rooted at r, spanning a given subset T of vertices, and such that, for each e in E, there are at most c(e) paths, linking r to vertices in T, that contain e. We study the complexity and approximability of the problem, considering several relevant parameters such as the number of terminals, the edge lengths and the minimum and maximum edge capacities. For all but one combination of assumptions regarding these parameters, we settle the question, giving a complete characterization that separates tractable cases from hard ones. The only remaining open case is proved to be equivalent to a long-standing open problem. We also prove close relations between our problem and classical Steiner tree as well as vertex-disjoint paths problems.

Submitted 24 July, 2016; originally announced July 2016.

arXiv:1606.00107 [pdf, other]

doi 10.3390/sym8050036

Higher order nonclassicality from nonlinear coherent states for models with quadratic spectrum

Authors: Anaelle Hertz, Sanjib Dey, Véronique Hussin, Hichem Eleuch

Abstract: Harmonic oscillator coherent states are well known to be the analogue of classical states. On the other hand, nonlinear and generalised coherent states may possess nonclassical properties. In this article, we study the nonclassical behaviour of nonlinear coherent states for generalised classes of models corresponding to the generalised ladder operators. A comparative analysis among them indicates that the models with quadratic spectrum are more nonclassical than the others. Our central result is further underpinned by the comparison of the degree of nonclassicality of squeezed states of the corresponding models.

Submitted 1 June, 2016; originally announced June 2016.

Comments: 10 pages, 4 figures, Published in the Special Issue: "Harmonic Oscillators In Modern Physics"

Journal ref: Symmetry 2016, 8(5), 36

arXiv:1604.05775 [pdf, ps, other]

doi 10.1088/1751-8121/50/3/033001

Path integral methods for the dynamics of stochastic and disordered systems

Authors: John A. Hertz, Yasser Roudi, Peter Sollich

Abstract: We review some of the techniques used to study the dynamics of disordered systems subject to both quenched and fast (thermal) noise. Starting from the Martin-Siggia-Rose path integral formalism for a single variable stochastic dynamics, we provide a pedagogical survey of the perturbative, i.e. diagrammatic, approach to dynamics and how this formalism can be used for studying soft spin models. We review the supersymmetric formulation of the Langevin dynamics of these models and discuss the physical implications of the supersymmetry. We also describe the key steps involved in studying the disorder-averaged dynamics. Finally, we discuss the path integral approach for the case of hard Ising spins and review some recent developments in the dynamics of such kinetic Ising models.

Submitted 19 April, 2016; originally announced April 2016.

Comments: review article, 37 pages

Journal ref: J. Phys. A: Math. Theor. 50 (2017) 033001

arXiv:1511.06621 [pdf, other]

doi 10.1103/PhysRevA.93.032330

Detection of non-Gaussian entangled states with an improved continuous-variable separability criterion

Authors: Anaelle Hertz, Evgueni Karpov, Aikaterini Mandilara, Nicolas J. Cerf

Abstract: Currently available separability criteria for continuous-variable states are generally based on the covariance matrix of quadrature operators. The well-known separability criterion of Duan et al. [Phys. Rev. Lett. 84, 2722 (2000)] and Simon [Phys. Rev. Lett. 84, 2726 (2000)], for example, gives a necessary and sufficient condition for a two-mode Gaussian state to be separable, but leaves many entangled non-Gaussian states undetected. Here, we introduce an improvement of this criterion that enables a stronger entanglement detection. The improved condition is based on the knowledge of an additional parameter, namely the degree of Gaussianity, and exploits a connection with Gaussianity-bounded uncertainty relations [Phys. Rev. A 86, 030102 (2012)]. We exhibit families of non-Gaussian entangled states whose entanglement remains undetected by the Duan-Simon criterion.

Submitted 23 March, 2016; v1 submitted 20 November, 2015; originally announced November 2015.

Comments: Revised presentation, results unchanged. 10 pages, 6 figures

Journal ref: Phys. Rev. A 93, 032330 (2016)

arXiv:1505.02558 [pdf, other]

Dominating induced matchings in graphs containing no long claw

Authors: Alain Hertz, Vadim Lozin, Bernard Ries, Victor Zamaraev, Dominique de Werra

Abstract: An induced matching $M$ in a graph $G$ is dominating if every edge not in $M$ shares exactly one vertex with an edge in $M$. The dominating induced matching problem (also known as efficient edge domination) asks whether a graph $G$ contains a dominating induced matching. This problem is generally NP-complete, but polynomial-time solvable for graphs with some special properties. In particular, it is solvable in polynomial time for claw-free graphs. In the present paper, we study this problem for graphs containing no long claw, i.e. no induced subgraph obtained from the claw by subdividing each of its edges exactly once. To solve the problem in this class, we reduce it to the following question: given a graph $G$ and a subset of its vertices, does $G$ contain a matching saturating all vertices of the subset? We show that this question can be answered in polynomial time, thus providing a polynomial-time algorithm to solve the dominating induced matching problem for graphs containing no long claw.

Submitted 11 May, 2015; originally announced May 2015.

MSC Class: 05C85
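
The definition of a dominating induced matching in the abstract above can be checked mechanically for a given edge set. The following Python sketch only verifies a candidate matching; it is not the polynomial-time algorithm of the paper. The function name and the edge-list representation are assumptions made for illustration, and the graph is assumed to be simple (no loops, no parallel edges).

```python
def is_dominating_induced_matching(edges, matching):
    """Check whether `matching` is a dominating induced matching of the
    simple graph given by `edges` (both are iterables of vertex pairs).
    """
    all_edges = {frozenset(e) for e in edges}
    m_edges = {frozenset(e) for e in matching}

    # Every matching edge must be an edge of the graph.
    if not m_edges <= all_edges:
        return False

    # Matching: no vertex may be an endpoint of two edges of M.
    endpoints = [v for e in m_edges for v in e]
    if len(endpoints) != len(set(endpoints)):
        return False
    covered = set(endpoints)

    # Dominating: each edge outside M has exactly one matched endpoint.
    # In a simple graph this also rules out an edge joining two edges
    # of M, so M is induced whenever the check passes.
    return all(len(e & covered) == 1 for e in all_edges - m_edges)


# Path a-b-c-d: the middle edge dominates both outer edges.
path = [("a", "b"), ("b", "c"), ("c", "d")]
print(is_dominating_induced_matching(path, [("b", "c")]))              # True
# The two outer edges are joined by b-c, so the matching is not induced.
print(is_dominating_induced_matching(path, [("a", "b"), ("c", "d")]))  # False
# A single outer edge leaves c-d undominated.
print(is_dominating_induced_matching(path, [("a", "b")]))              # False
```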

arXiv:1305.2100 [pdf, other]

Beam splitter and entanglement created with the squeezed coherent states of the Morse potential

Authors: Anaelle Hertz, Véronique Hussin, Hichem Eleuch

Abstract: The Morse potential is relatively close to the harmonic oscillator quantum system. Thus, following the idea used for the latter, we study the possibility of creating entanglement using squeezed coherent states of the Morse potential as an input field of a beam splitter. We measure the entanglement with the linear entropy for two types of such states and we study the dependence on the coherence and squeezing parameters. The new results are linked with observations made on probability densities and uncertainty relations of those states. The dynamical evolution of the linear entropy is also explored.

Submitted 9 May, 2013; originally announced May 2013.

arXiv:1111.1974 [pdf, other]

doi 10.1088/1751-8113/45/24/244007

Squeezed coherent states and the one-dimensional Morse quantum system

Authors: M. Angelova, A. Hertz, V. Hussin

Abstract: The Morse potential one-dimensional quantum system is a realistic model for studying vibrations of atoms in a diatomic molecule. This system is very close to the harmonic oscillator one. We thus propose a construction of squeezed coherent states similar to that of the harmonic oscillator using ladder operators. Properties of these states are analysed with respect to the localization in position, minimal Heisenberg uncertainty relation, the statistical properties and illustrated with examples using the finite number of states in a well-known diatomic molecule.

Submitted 6 February, 2013; v1 submitted 8 November, 2011; originally announced November 2011.

Comments: 15 pages, 10 figures. Revised section 4, results unchanged. Correction of formulas 35 and 37. Results unchanged because all variables are real numbers. arXiv admin note: substantial text overlap with arXiv:1010.3277

Journal ref: (2012) J. Phys. A: Math. Theor. 45 244007

arXiv:1010.3372 [pdf, ps, other]

Maximizing measures for partially hyperbolic systems with compact center leaves

Authors: F. Rodriguez Hertz, M. A. Rodriguez Hertz, A. Tahzibi, R. Ures

Abstract: We obtain the following dichotomy for accessible partially hyperbolic diffeomorphisms of 3-dimensional manifolds having compact center leaves: either there is a unique entropy maximizing measure, this measure has the Bernoulli property and its center Lyapunov exponent is 0, or there is a finite number of entropy maximizing measures, all of them with nonzero center Lyapunov exponent (at least one with negative exponent and one with positive exponent), that are finite extensions of a Bernoulli system. In the first case of the dichotomy we obtain that the system is topologically conjugate to a rotation extension of a hyperbolic system. This implies that the second case of the dichotomy holds for an open and dense set of diffeomorphisms in the hypothesis of our result. As a consequence we obtain an open set of topologically mixing diffeomorphisms having more than one entropy maximizing measure.

Submitted 16 October, 2010; originally announced October 2010.

arXiv:1009.5946 [pdf, ps, other]

doi 10.1103/PhysRevLett.106.048702

Mean Field Theory For Non-Equilibrium Network Reconstruction

Authors: Yasser Roudi, John A. Hertz

Abstract: There has been recent progress on the problem of inferring the structure of interactions in complex networks when they are in stationary states satisfying detailed balance, but little has been done for non-equilibrium systems. Here we introduce an approach to this problem, considering, as an example, the question of recovering the interactions in an asymmetrically-coupled, synchronously-updated Sherrington-Kirkpatrick model. We derive an exact iterative inversion algorithm and develop efficient approximations based on dynamical mean-field and Thouless-Anderson-Palmer equations that express the interactions in terms of equal-time and one time step-delayed correlation functions.

Submitted 5 January, 2011; v1 submitted 29 September, 2010; originally announced September 2010.

Comments: new version, accepted in PRL. For the Supp. Mat. (ref. 11), please contact the authors

Report number: NORDITA-2010-79

Journal ref: Phys. Rev. Lett. 106, 048702 (2011)

Showing 1–50 of 56 results for author: Hertz, A