Showing 1–6 of 6 results for author: Arkhipkin, V

  1. arXiv:2312.03511  [pdf, other]

    cs.CV cs.LG cs.MM

    Kandinsky 3.0 Technical Report

    Authors: Vladimir Arkhipkin, Andrei Filatov, Viacheslav Vasilev, Anastasia Maltseva, Said Azizov, Igor Pavlov, Julia Agafonova, Andrey Kuznetsov, Denis Dimitrov

    Abstract: We present Kandinsky 3.0, a large-scale text-to-image generation model based on latent diffusion, continuing the series of text-to-image Kandinsky models and reflecting our progress to achieve higher quality and realism of image generation. In this report we describe the architecture of the model, the data collection procedure, the training technique, and the production system for user interaction…

    Submitted 28 June, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: Project page: https://ai-forever.github.io/Kandinsky-3

  2. arXiv:2311.13073  [pdf, other]

    cs.CV cs.LG cs.MM

    FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline

    Authors: Vladimir Arkhipkin, Zein Shaheen, Viacheslav Vasilev, Elizaveta Dakhova, Andrey Kuznetsov, Denis Dimitrov

    Abstract: Multimedia generation approaches occupy a prominent place in artificial intelligence research. Text-to-image models achieved high-quality results over the last few years. However, video synthesis methods recently started to develop. This paper presents a new two-stage latent diffusion text-to-video generation architecture based on the text-to-image diffusion model. The first stage concerns keyfram…

    Submitted 20 December, 2023; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Project page: https://ai-forever.github.io/kandinsky-video/

  3. arXiv:2310.03502  [pdf, other]

    cs.CV

    Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion

    Authors: Anton Razzhigaev, Arseniy Shakhmatov, Anastasia Maltseva, Vladimir Arkhipkin, Igor Pavlov, Ilya Ryabov, Angelina Kuts, Alexander Panchenko, Andrey Kuznetsov, Denis Dimitrov

    Abstract: Text-to-image generation is a significant domain in modern computer vision and has achieved substantial improvements through the evolution of generative architectures. Among these, there are diffusion-based models that have demonstrated essential quality enhancements. These models are generally split into two categories: pixel-level and latent-level approaches. We present Kandinsky1, a novel explo…

    Submitted 5 October, 2023; originally announced October 2023.

  4. arXiv:2302.05259  [pdf, other]

    stat.ML cs.LG

    Star-Shaped Denoising Diffusion Probabilistic Models

    Authors: Andrey Okhotin, Dmitry Molchanov, Vladimir Arkhipkin, Grigory Bartosh, Viktor Ohanesian, Aibek Alanov, Dmitry Vetrov

    Abstract: Denoising Diffusion Probabilistic Models (DDPMs) provide the foundation for the recent breakthroughs in generative modeling. Their Markovian structure makes it difficult to define DDPMs with distributions other than Gaussian or discrete. In this paper, we introduce Star-Shaped DDPM (SS-DDPM). Its star-shaped diffusion process allows us to bypass the need to define the transition probabilities or c…

    Submitted 28 October, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

    Comments: Accepted at NeurIPS 2023

  5. arXiv:2208.00406  [pdf, other]

    cs.LG cs.AI cs.CE cs.CY

    Eco2AI: carbon emissions tracking of machine learning models as the first step towards sustainable AI

    Authors: Semen Budennyy, Vladimir Lazarev, Nikita Zakharenko, Alexey Korovin, Olga Plosskaya, Denis Dimitrov, Vladimir Arkhipkin, Ivan Oseledets, Ivan Barsola, Ilya Egorov, Aleksandra Kosterina, Leonid Zhukov

    Abstract: The size and complexity of deep neural networks continue to grow exponentially, significantly increasing energy consumption for training and inference by these models. We introduce an open-source package eco2AI to help data scientists and researchers to track energy consumption and equivalent CO2 emissions of their models in a straightforward way. In eco2AI we put emphasis on accuracy of energy co…

    Submitted 3 August, 2022; v1 submitted 31 July, 2022; originally announced August 2022.

    Comments: Source code for the eco2AI package (an energy consumption and carbon emission tracker for Python code) is available at https://github.com/sb-ai-lab/Eco2AI; the package is also available on PyPI: https://pypi.org/project/eco2ai/

  6. arXiv:2111.10974  [pdf, other]

    cs.CV cs.AI cs.CL

    Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture

    Authors: Daria Bakshandaeva, Denis Dimitrov, Vladimir Arkhipkin, Alex Shonenkov, Mark Potanin, Denis Karachev, Andrey Kuznetsov, Anton Voronov, Vera Davydova, Elena Tutubalina, Aleksandr Petiushko

    Abstract: Supporting the current trend in the AI community, we present the AI Journey 2021 Challenge called Fusion Brain, the first competition which is targeted to make the universal architecture which could process different modalities (in this case, images, texts, and code) and solve multiple tasks for vision and language. The Fusion Brain Challenge combines the following specific tasks: Code2code Transl…

    Submitted 28 December, 2022; v1 submitted 21 November, 2021; originally announced November 2021.