Showing 1–12 of 12 results for author: Alanov, A

  1. arXiv:2406.10601

    cs.CV

    The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing

    Authors: Denis Bobkov, Vadim Titov, Aibek Alanov, Dmitry Vetrov

    Abstract: The task of manipulating real image attributes through StyleGAN inversion has been extensively researched. This process involves searching latent variables from a well-trained StyleGAN generator that can synthesize a real image, modifying these latent variables, and then synthesizing an image with the desired edits. A balance must be struck between the quality of the reconstruction and the ability…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Accepted to CVPR 2024

  2. arXiv:2406.10019

    cs.LG cs.AI cs.CL cs.CV math.NA

    Group and Shuffle: Efficient Structured Orthogonal Parametrization

    Authors: Mikhail Gorbunov, Nikolay Yudin, Vera Soboleva, Aibek Alanov, Alexey Naumov, Maxim Rakhuba

    Abstract: The increasing size of neural networks has led to a growing demand for methods of efficient fine-tuning. Recently, an orthogonal fine-tuning paradigm was introduced that uses orthogonal matrices for adapting the weights of a pretrained model. In this paper, we introduce a new class of structured matrices, which unifies and generalizes structured classes from previous works. We examine properties o…

    Submitted 14 June, 2024; originally announced June 2024.

  3. arXiv:2404.01094

    cs.CV

    HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach

    Authors: Maxim Nikolaev, Mikhail Kuznetsov, Dmitry Vetrov, Aibek Alanov

    Abstract: Our paper addresses the complex task of transferring a hairstyle from a reference image to an input photo for virtual hair try-on. This task is challenging due to the need to adapt to various photo poses, the sensitivity of hairstyles, and the lack of objective metrics. The current state-of-the-art hairstyle transfer methods use an optimization process for different parts of the approach, making t…

    Submitted 25 May, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  4. arXiv:2302.05259

    stat.ML cs.LG

    Star-Shaped Denoising Diffusion Probabilistic Models

    Authors: Andrey Okhotin, Dmitry Molchanov, Vladimir Arkhipkin, Grigory Bartosh, Viktor Ohanesian, Aibek Alanov, Dmitry Vetrov

    Abstract: Denoising Diffusion Probabilistic Models (DDPMs) provide the foundation for the recent breakthroughs in generative modeling. Their Markovian structure makes it difficult to define DDPMs with distributions other than Gaussian or discrete. In this paper, we introduce Star-Shaped DDPM (SS-DDPM). Its star-shaped diffusion process allows us to bypass the need to define the transition probabilities or c…

    Submitted 28 October, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

    Comments: Accepted at NeurIPS 2023

  5. arXiv:2212.10229

    cs.CV cs.LG

    StyleDomain: Efficient and Lightweight Parameterizations of StyleGAN for One-shot and Few-shot Domain Adaptation

    Authors: Aibek Alanov, Vadim Titov, Maksim Nakhodnov, Dmitry Vetrov

    Abstract: Domain adaptation of GANs is a problem of fine-tuning GAN models pretrained on a large dataset (e.g. StyleGAN) to a specific domain with few samples (e.g. painting faces, sketches, etc.). While there are many methods that tackle this problem in different ways, there are still many important questions that remain unanswered. In this paper, we provide a systematic and in-depth analysis of the domain…

    Submitted 12 September, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Accepted to ICCV 2023

  6. arXiv:2211.01751

    cs.SD cs.AI eess.AS

    Iterative autoregression: a novel trick to improve your low-latency speech enhancement model

    Authors: Pavel Andreev, Nicholas Babaev, Azat Saginbaev, Ivan Shchekotov, Aibek Alanov

    Abstract: Streaming models are an essential component of real-time speech enhancement tools. The streaming regime constrains speech enhancement models to use only a tiny context of future information. As a result, the low-latency streaming setup is generally considered a challenging task and has a significant negative impact on the model's quality. However, the sequential nature of streaming generation offe…

    Submitted 5 December, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: Accepted to Interspeech 2023

  7. arXiv:2210.08884

    cs.CV

    HyperDomainNet: Universal Domain Adaptation for Generative Adversarial Networks

    Authors: Aibek Alanov, Vadim Titov, Dmitry Vetrov

    Abstract: The domain adaptation framework of GANs has achieved great progress in recent years as a main successful approach of training contemporary GANs in the case of very limited training data. In this work, we significantly improve this framework by proposing an extremely compact parameter space for fine-tuning the generator. We introduce a novel domain-modulation technique that allows optimizing only 6 th…

    Submitted 30 March, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022

  8. arXiv:2206.08303

    cs.LG math.OC

    On Scaled Methods for Saddle Point Problems

    Authors: Aleksandr Beznosikov, Aibek Alanov, Dmitry Kovalev, Martin Takáč, Alexander Gasnikov

    Abstract: Methods with adaptive scaling of different features play a key role in solving saddle point problems, primarily due to Adam's popularity for solving adversarial machine learning problems, including GANs training. This paper carries out a theoretical analysis of the following scaling techniques for solving SPPs: the well-known Adam and RmsProp scaling and the newer AdaHessian and OASIS based on Hut…

    Submitted 21 June, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: 54 pages, 2 algorithms with 4 options for each, 12 figures, 5 tables, 2 theorems

  9. arXiv:2204.03042

    cs.SD cs.AI eess.AS

    FFC-SE: Fast Fourier Convolution for Speech Enhancement

    Authors: Ivan Shchekotov, Pavel Andreev, Oleg Ivanov, Aibek Alanov, Dmitry Vetrov

    Abstract: Fast Fourier convolution (FFC) is the recently proposed neural operator showing promising performance in several computer vision problems. The FFC operator allows employing large receptive field operations within early layers of the neural network. It was shown to be especially helpful for inpainting of periodic structures which are common in audio processing. In this work, we design neural networ…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: Submitted to INTERSPEECH 2022

  10. HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement

    Authors: Pavel Andreev, Aibek Alanov, Oleg Ivanov, Dmitry Vetrov

    Abstract: Generative adversarial networks have recently demonstrated outstanding performance in neural vocoding outperforming best autoregressive and flow-based models. In this paper, we show that this success can be extended to other tasks of conditional audio generation. In particular, building upon HiFi vocoders, we propose a novel HiFi++ general framework for bandwidth extension and speech enhancement.…

    Submitted 10 December, 2023; v1 submitted 24 March, 2022; originally announced March 2022.

    Comments: Accepted to ICASSP 2023

  11. arXiv:1904.04751

    cs.CV cs.LG stat.ML

    User-Controllable Multi-Texture Synthesis with Generative Adversarial Networks

    Authors: Aibek Alanov, Max Kochurov, Denis Volkhonskiy, Daniil Yashkov, Evgeny Burnaev, Dmitry Vetrov

    Abstract: We propose a novel multi-texture synthesis model based on generative adversarial networks (GANs) with a user-controllable mechanism. The user control ability allows the user to explicitly specify the texture which should be generated by the model. This property follows from using an encoder part which learns a latent representation for each texture from the dataset. To ensure dataset coverage, we use an…

    Submitted 24 April, 2019; v1 submitted 9 April, 2019; originally announced April 2019.

    Comments: 8 pages paper, 17 pages supplementary material

  12. arXiv:1810.04920

    stat.ML cs.LG

    Pairwise Augmented GANs with Adversarial Reconstruction Loss

    Authors: Aibek Alanov, Max Kochurov, Daniil Yashkov, Dmitry Vetrov

    Abstract: We propose a novel autoencoding model called Pairwise Augmented GANs. We train a generator and an encoder jointly and in an adversarial manner. The generator network learns to sample realistic objects. In turn, the encoder network at the same time is trained to map the true data distribution to the prior in latent space. To ensure good reconstructions, we introduce an augmented adversarial reconst…

    Submitted 11 October, 2018; originally announced October 2018.