Search | arXiv e-print repository

DIFF-NST: Diffusion Interleaving For deFormable Neural Style Transfer

Authors: Dan Ruta, Gemma Canet Tarrés, Andrew Gilbert, Eli Shechtman, Nicholas Kolkin, John Collomosse

Abstract: Neural Style Transfer (NST) is the field of study applying neural techniques to modify the artistic appearance of a content image to match the style of a reference style image. Traditionally, NST methods have focused on texture-based image edits, affecting mostly low level information and keeping most image structures the same. However, style-based deformation of the content is desirable for some styles, especially in cases where the style is abstract or the primary concept of the style is in its deformed rendition of some content. With the recent introduction of diffusion models, such as Stable Diffusion, we can access far more powerful image generation techniques, enabling new possibilities. In our work, we propose using this new class of models to perform style transfer while enabling deformable style transfer, an elusive capability in previous models. We show how leveraging the priors of these models can expose new artistic controls at inference time, and we document our findings in exploring this new direction for the field of style transfer.

Submitted 11 July, 2023; v1 submitted 9 July, 2023; originally announced July 2023.

arXiv:2304.05755

ALADIN-NST: Self-supervised disentangled representation learning of artistic style through Neural Style Transfer

Authors: Dan Ruta, Gemma Canet Tarres, Alexander Black, Andrew Gilbert, John Collomosse

Abstract: Representation learning aims to discover individual salient features of a domain in a compact and descriptive form that strongly identifies the unique characteristics of a given sample respective to its domain. Existing works in visual style representation literature have tried to disentangle style from content during training explicitly. A complete separation between these has yet to be fully achieved. Our paper aims to learn a representation of visual artistic style more strongly disentangled from the semantic content depicted in an image. We use Neural Style Transfer (NST) to measure and drive the learning signal and achieve state-of-the-art representation learning on explicitly disentangled metrics. We show that strongly addressing the disentanglement of style and content leads to large gains in style-specific metrics, encoding far less semantic information and achieving state-of-the-art accuracy in downstream multimodal applications.

Submitted 17 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

arXiv:2304.05139

NeAT: Neural Artistic Tracing for Beautiful Style Transfer

Authors: Dan Ruta, Andrew Gilbert, John Collomosse, Eli Shechtman, Nicholas Kolkin

Abstract: Style transfer is the task of reproducing the semantic contents of a source image in the artistic style of a second target image. In this paper, we present NeAT, a new state-of-the-art feed-forward style transfer method. We re-formulate feed-forward style transfer as image editing, rather than image generation, resulting in a model which improves over the state-of-the-art in both preserving the source content and matching the target style. An important component of our model's success is identifying and fixing "style halos", a commonly occurring artefact across many style transfer techniques. In addition to training and testing on standard datasets, we introduce the BBST-4M dataset, a new, large scale, high resolution dataset of 4M images. As a component of curating this data, we present a novel model able to classify if an image is stylistic. We use BBST-4M to improve and measure the generalization of NeAT across a huge variety of styles. Not only does NeAT offer state-of-the-art quality and generalization, it is designed and trained for fast inference at high resolution.

Submitted 11 April, 2023; originally announced April 2023.

arXiv:2303.06464

PARASOL: Parametric Style Control for Diffusion Image Synthesis

Authors: Gemma Canet Tarrés, Dan Ruta, Tu Bui, John Collomosse

Abstract: We propose PARASOL, a multi-modal synthesis model that enables disentangled, parametric control of the visual style of the image by jointly conditioning synthesis on both content and a fine-grained visual style embedding. We train a latent diffusion model (LDM) using specific losses for each modality and adapt the classifier-free guidance for encouraging disentangled control over independent content and style modalities at inference time. We leverage auxiliary semantic and style-based search to create training triplets for supervision of the LDM, ensuring complementarity of content and style cues. PARASOL shows promise for enabling nuanced control over visual style in diffusion models for image creation and stylization, as well as generative search where text-based search results may be adapted to more closely match user intent by interpolating both content and style descriptors.

Submitted 1 May, 2024; v1 submitted 11 March, 2023; originally announced March 2023.

Comments: Camera-ready version

arXiv:2208.04807

HyperNST: Hyper-Networks for Neural Style Transfer

Authors: Dan Ruta, Andrew Gilbert, Saeid Motiian, Baldo Faieta, Zhe Lin, John Collomosse

Abstract: We present HyperNST; a neural style transfer (NST) technique for the artistic stylization of images, based on Hyper-networks and the StyleGAN2 architecture. Our contribution is a novel method for inducing style transfer parameterized by a metric space, pre-trained for style-based visual search (SBVS). We show for the first time that such space may be used to drive NST, enabling the application and interpolation of styles from an SBVS system. The technical contribution is a hyper-network that predicts weight updates to a StyleGAN2 pre-trained over a diverse gamut of artistic content (portraits), tailoring the style parameterization on a per-region basis using a semantic map of the facial regions. We show HyperNST to exceed state of the art in content preservation for our stylized content while retaining good style transfer performance.

Submitted 9 August, 2022; originally announced August 2022.

arXiv:2203.05321

StyleBabel: Artistic Style Tagging and Captioning

Authors: Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse

Abstract: We present StyleBabel, a unique open access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools. StyleBabel was collected via an iterative method, inspired by 'Grounded Theory': a qualitative approach that enables annotation while co-evolving a shared language for fine-grained artistic style attribute description. We demonstrate several downstream tasks for StyleBabel, adapting the recent ALADIN architecture for fine-grained style similarity, to train cross-modal embeddings for: 1) free-form tag generation; 2) natural language description of artistic style; 3) fine-grained text search of style. To do so, we extend ALADIN with recent advances in Visual Transformer (ViT) and cross-modal representation learning, achieving a state of the art accuracy in fine-grained style retrieval.

Submitted 11 March, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

arXiv:2111.08882

Search and Rescue in a Maze-like Environment with Ant and Dijkstra Algorithms

Authors: Z. Husain, A. Al Zaabi, H. Hildmann, F. Saffre, D. Ruta, A. F. Isakovic

Abstract: With the growing reliability of modern Ad Hoc Networks, it is encouraging to analyze potential involvement of autonomous Ad Hoc agents in critical situations where human involvement could be perilous. One such critical scenario is the Search and Rescue effort in the event of a disaster where timely discovery and help deployment is of utmost importance. This paper demonstrates the applicability of a bio-inspired technique, namely Ant Algorithms (AA), in optimizing the search time for a near optimal path to a trapped victim, followed by the application of Dijkstra's algorithm in the rescue phase. The inherent exploratory nature of AA is put to use for a faster mapping and coverage of the unknown search space. Four different AA are implemented, with different effects of the pheromone in play. An inverted AA, with repulsive pheromones, was found to be the best fit for this particular application. After considerable exploration, upon discovery of the victim, the autonomous agents further facilitate the rescue process by forming a relay network, using the already deployed resources. Hence, the paper discusses a detailed decision making model of the swarm, segmented into two primary phases, responsible for the search and rescue respectively. Different aspects of the performance of the agent swarm are analyzed, as a function of the spatial dimensions, the complexity of the search space, the deployed search group size, and the signal permeability of the obstacles in the area.

Submitted 16 November, 2021; originally announced November 2021.

arXiv:2103.09776

ALADIN: All Layer Adaptive Instance Normalization for Fine-grained Style Similarity

Authors: Dan Ruta, Saeid Motiian, Baldo Faieta, Zhe Lin, Hailin Jin, Alex Filipkowski, Andrew Gilbert, John Collomosse

Abstract: We present ALADIN (All Layer AdaIN); a novel architecture for searching images based on the similarity of their artistic style. Representation learning is critical to visual search, where distance in the learned search embedding reflects image similarity. Learning an embedding that discriminates fine-grained variations in style is hard, due to the difficulty of defining and labelling style. ALADIN takes a weakly supervised approach to learning a representation for fine-grained style similarity of digital artworks, leveraging BAM-FG, a novel large-scale dataset of user generated content groupings gathered from the web. ALADIN sets a new state of the art accuracy for style-based visual search over both coarse labelled style data (BAM) and BAM-FG; a new 2.62 million image dataset of 310,000 fine-grained style groupings also contributed by this work.

Submitted 17 March, 2021; originally announced March 2021.

arXiv:2009.04864

Coverage and Energy Analysis of Mobile Sensor Nodes in Obstructed Noisy Indoor Environment: A Voronoi Approach

Authors: K. Eledlebi, D. Ruta, H. Hildmann, F. Saffre, Y. Al Hammadi, A. F. Isakovic

Abstract: The rapid deployment of wireless sensor network (WSN) poses the challenge of finding optimal locations for the network nodes, especially so in (i) unknown and (ii) obstacle-rich environments. This paper addresses this challenge with BISON (Bio-Inspired Self-Organizing Network), a variant of the Voronoi algorithm. In line with the scenario challenges, BISON nodes are restricted to (i) locally sensed as well as (ii) noisy information on the basis of which they move, avoid obstacles and connect with neighboring nodes. Performance is measured as (i) the percentage of area covered, (ii) the total distance traveled by the nodes, (iii) the cumulative energy consumption and (iv) the uniformity of nodes distribution. Obstacle constellations and noise levels are studied systematically and a collision-free recovery strategy for failing nodes is proposed. Results obtained from extensive simulations show the algorithm outperforming previously reported approaches in both, convergence speed, as well as deployment cost.

Submitted 10 September, 2020; originally announced September 2020.

Comments: 17 pages, 24 figures, 1 table

arXiv:1901.11303

Hyperbox based machine learning algorithms: A comprehensive survey

Authors: Thanh Tung Khuat, Dymitr Ruta, Bogdan Gabrys

Abstract: With the rapid development of digital information, the data volume generated by humans and machines is growing exponentially. Along with this trend, machine learning algorithms have been formed and evolved continuously to discover new information and knowledge from different data sources. Learning algorithms using hyperboxes as fundamental representational and building blocks are a branch of machine learning methods. These algorithms have enormous potential for high scalability and online adaptation of predictors built using hyperbox data representations to the dynamically changing environments and streaming data. This paper aims to give a comprehensive survey of literature on hyperbox-based machine learning models. In general, according to the architecture and characteristic features of the resulting models, the existing hyperbox-based learning algorithms may be grouped into three major categories: fuzzy min-max neural networks, hyperbox-based hybrid models, and other algorithms based on hyperbox representations. Within each of these groups, this paper shows a brief description of the structure of models, associated learning algorithms, and an analysis of their advantages and drawbacks. Main applications of these hyperbox-based models to the real-world problems are also described in this paper. Finally, we discuss some open problems and identify potential future research directions in this field.

Submitted 21 March, 2019; v1 submitted 31 January, 2019; originally announced January 2019.

Comments: 7 figures

MSC Class: 68T30; 68T20; 68T37; 68W27

ACM Class: I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.3; I.5.4

Showing 1–10 of 10 results for author: Ruta, D