Showing 51–88 of 88 results for author: Tulyakov, S

  1. arXiv:2206.07771  [pdf, other]

    cs.CV

    Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation

    Authors: Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, Yan Yan

    Abstract: Diffusion probabilistic models (DPMs) have become a popular approach to conditional generation, due to their promising results and support for cross-modal synthesis. A key desideratum in conditional synthesis is to achieve high correspondence between the conditioning input and generated output. Most existing methods learn such relationships implicitly, by incorporating the prior into the variation…

    Submitted 16 February, 2023; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: ICLR 2023. Project at https://github.com/L-YeZhu/CDCD

  2. arXiv:2206.01191  [pdf, other]

    cs.CV

    EfficientFormer: Vision Transformers at MobileNet Speed

    Authors: Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren

    Abstract: Vision Transformers (ViT) have shown rapid progress in computer vision tasks, achieving promising results on various benchmarks. However, due to the massive number of parameters and model design, e.g., attention mechanism, ViT-based models are generally times slower than lightweight convolutional networks. Therefore, the deployment of ViT for real-time applications is particularly challen…

    Submitted 10 October, 2022; v1 submitted 2 June, 2022; originally announced June 2022.

  3. arXiv:2204.10850  [pdf, other]

    cs.CV

    Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation

    Authors: Verica Lazova, Vladimir Guzov, Kyle Olszewski, Sergey Tulyakov, Gerard Pons-Moll

    Abstract: We present a novel method for performing flexible, 3D-aware image content manipulation while enabling high-quality novel view synthesis. While NeRF-based approaches are effective for novel view synthesis, such models memorize the radiance for every point in a scene within a neural network. Since these models are scene-specific and lack a 3D scene representation, classical editing such as shape man…

    Submitted 22 April, 2022; originally announced April 2022.

  4. arXiv:2204.00604  [pdf, other]

    cs.CV cs.SD eess.AS

    Quantized GAN for Complex Music Generation from Dance Videos

    Authors: Ye Zhu, Kyle Olszewski, Yu Wu, Panos Achlioptas, Menglei Chai, Yan Yan, Sergey Tulyakov

    Abstract: We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates complex musical samples conditioned on dance videos. Our proposed framework takes dance video frames and human body motions as input, and learns to generate music samples that plausibly accompany the corresponding input. Unlike most existing conditional music generation works that generate specific types…

    Submitted 19 July, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

    Comments: Dataset and code at https://github.com/L-YeZhu/D2M-GAN

  5. arXiv:2203.17261  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis

    Authors: Huan Wang, Jian Ren, Zeng Huang, Kyle Olszewski, Menglei Chai, Yun Fu, Sergey Tulyakov

    Abstract: Recent research explosion on Neural Radiance Field (NeRF) shows the encouraging potential to represent complex scenes with neural networks. One major drawback of NeRF is its prohibitive inference time: Rendering a single pixel requires querying the NeRF network hundreds of times. To resolve it, existing efforts mainly attempt to reduce the number of required sampled points. However, the problem of…

    Submitted 22 July, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Accepted by ECCV 2022. Code: https://github.com/snap-research/R2L

  6. arXiv:2203.17191  [pdf, other]

    cs.CV

    Time Lens++: Event-based Frame Interpolation with Parametric Non-linear Flow and Multi-scale Fusion

    Authors: Stepan Tulyakov, Alfredo Bochicchio, Daniel Gehrig, Stamatios Georgoulis, Yuanyou Li, Davide Scaramuzza

    Abstract: Recently, video frame interpolation using a combination of frame- and event-based cameras has surpassed traditional image-based methods both in terms of performance and memory efficiency. However, current methods still suffer from (i) brittle image-level fusion of complementary interpolation results, that fails in the presence of artifacts in the fused image, (ii) potentially temporally inconsiste…

    Submitted 25 April, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, 2022

  7. arXiv:2203.06622  [pdf, other]

    eess.IV cs.CV

    Multi-Bracket High Dynamic Range Imaging with Event Cameras

    Authors: Nico Messikommer, Stamatios Georgoulis, Daniel Gehrig, Stepan Tulyakov, Julius Erbach, Alfredo Bochicchio, Yuanyou Li, Davide Scaramuzza

    Abstract: Modern high dynamic range (HDR) imaging pipelines align and fuse multiple low dynamic range (LDR) images captured at different exposure times. While these methods work well in static scenes, dynamic scenes remain a challenge since the LDR images still suffer from saturation and noise. In such scenarios, event cameras would be a valid complement, thanks to their higher temporal resolution and dynam…

    Submitted 28 April, 2022; v1 submitted 13 March, 2022; originally announced March 2022.

    Journal ref: IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), New Orleans, 2022

  8. arXiv:2203.02573  [pdf, other]

    cs.CV cs.AI cs.LG

    Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning

    Authors: Ligong Han, Jian Ren, Hsin-Ying Lee, Francesco Barbieri, Kyle Olszewski, Shervin Minaee, Dimitris Metaxas, Sergey Tulyakov

    Abstract: Most methods for conditional video synthesis use a single modality as the condition. This comes with major limitations. For example, it is problematic for a model conditioned on an image to generate a specific motion trajectory desired by the user since there is no means to provide motion information. Conversely, language information can describe the desired motion, while not precisely defining th…

    Submitted 4 March, 2022; originally announced March 2022.

    Comments: Accepted to CVPR 2022

  9. arXiv:2203.01914  [pdf, other]

    cs.CV cs.AI

    Playable Environments: Video Manipulation in Space and Time

    Authors: Willi Menapace, Stéphane Lathuilière, Aliaksandr Siarohin, Christian Theobalt, Sergey Tulyakov, Vladislav Golyanik, Elisa Ricci

    Abstract: We present Playable Environments - a new representation for interactive video generation and manipulation in space and time. With a single image at inference time, our novel framework allows the user to move objects in 3D while generating a video by providing a sequence of desired actions. The actions are learnt in an unsupervised manner. The camera can be controlled to get the desired viewpoint.…

    Submitted 15 March, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  10. arXiv:2202.05239  [pdf, other]

    cs.CV cs.AI cs.AR cs.LG cs.NE

    F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

    Authors: Qing Jin, Jian Ren, Richard Zhuang, Sumant Hanumante, Zhengang Li, Zhiyu Chen, Yanzhi Wang, Kaiyuan Yang, Sergey Tulyakov

    Abstract: Neural network quantization is a promising compression technique to reduce memory footprint and save energy consumption, potentially leading to real-time inference. However, there is a performance gap between quantized and full-precision models. To reduce it, existing quantization approaches require high-precision INT32 or full-precision multiplication during inference for scaling or dequantizatio…

    Submitted 10 February, 2022; originally announced February 2022.

    Comments: ICLR 2022 (Oral)

  11. arXiv:2201.02533  [pdf, other]

    cs.CV

    NeROIC: Neural Rendering of Objects from Online Image Collections

    Authors: Zhengfei Kuang, Kyle Olszewski, Menglei Chai, Zeng Huang, Panos Achlioptas, Sergey Tulyakov

    Abstract: We present a novel method to acquire object representations from online image collections, capturing high-quality geometry and material properties of arbitrary objects from photographs with varying cameras, illumination, and backgrounds. This enables various object-centric rendering applications such as novel-view synthesis, relighting, and harmonized background composition from challenging in-the…

    Submitted 1 September, 2022; v1 submitted 7 January, 2022; originally announced January 2022.

    Comments: SIGGRAPH 2022 (Journal Track). Project page: https://formyfamily.github.io/NeROIC/ Code repository: https://github.com/snap-research/NeROIC/

  12. arXiv:2112.14683  [pdf, other]

    cs.CV cs.AI cs.LG

    StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2

    Authors: Ivan Skorokhodov, Sergey Tulyakov, Mohamed Elhoseiny

    Abstract: Videos show continuous events, yet most - if not all - video synthesis frameworks treat them discretely in time. In this work, we think of videos of what they should be - time-continuous signals, and extend the paradigm of neural representations to build a continuous-time video generator. For this, we first design continuous motion representations through the lens of positional embeddings. T…

    Submitted 31 May, 2022; v1 submitted 29 December, 2021; originally announced December 2021.

    Comments: CVPR 2022

  13. arXiv:2106.07771  [pdf, other]

    cs.CV

    Flow Guided Transformable Bottleneck Networks for Motion Retargeting

    Authors: Jian Ren, Menglei Chai, Oliver J. Woodford, Kyle Olszewski, Sergey Tulyakov

    Abstract: Human motion retargeting aims to transfer the motion of one person in a "driving" video or set of images to another person. Existing efforts leverage a long training video from each target person to train a subject-specific motion transfer model. However, the scalability of such methods is limited, as each model can only generate videos for the given target subject, and such training videos are la…

    Submitted 14 June, 2021; originally announced June 2021.

    Comments: CVPR 2021

  14. arXiv:2106.07286  [pdf, other]

    cs.CV

    TimeLens: Event-based Video Frame Interpolation

    Authors: Stepan Tulyakov, Daniel Gehrig, Stamatios Georgoulis, Julius Erbach, Mathias Gehrig, Yuanyou Li, Davide Scaramuzza

    Abstract: State-of-the-art frame interpolation methods generate intermediate frames by inferring object motions in the image from consecutive key-frames. In the absence of additional information, first-order approximations, i.e. optical flow, must be used, but this choice restricts the types of motions that can be modeled, leading to errors in highly dynamic scenarios. Event cameras are novel sensors that a…

    Submitted 14 June, 2021; originally announced June 2021.

    Journal ref: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

  15. arXiv:2104.15069  [pdf, other]

    cs.CV

    A Good Image Generator Is What You Need for High-Resolution Video Synthesis

    Authors: Yu Tian, Jian Ren, Menglei Chai, Kyle Olszewski, Xi Peng, Dimitris N. Metaxas, Sergey Tulyakov

    Abstract: Image and video synthesis are closely related areas aiming at generating content from noise. While rapid progress has been demonstrated in improving image-based models to handle large resolutions, high-quality renderings, and wide variations in image content, achieving comparable video generation results remains problematic. We present a framework that leverages contemporary image generators to re…

    Submitted 30 April, 2021; originally announced April 2021.

    Comments: Accepted to ICLR 2021

  16. arXiv:2104.11280  [pdf, other]

    cs.CV

    Motion Representations for Articulated Animation

    Authors: Aliaksandr Siarohin, Oliver J. Woodford, Jian Ren, Menglei Chai, Sergey Tulyakov

    Abstract: We propose novel motion representations for animating articulated objects consisting of distinct parts. In a completely unsupervised manner, our method identifies object parts, tracks them in a driving video, and infers their motions by considering their principal axes. In contrast to the previous keypoint-based works, our method extracts meaningful and consistent regions, describing locations, sh…

    Submitted 22 April, 2021; originally announced April 2021.

    Journal ref: CVPR 2021

  17. arXiv:2104.03963  [pdf, other]

    cs.CV

    InfinityGAN: Towards Infinite-Pixel Image Synthesis

    Authors: Chieh Hubert Lin, Hsin-Ying Lee, Yen-Chi Cheng, Sergey Tulyakov, Ming-Hsuan Yang

    Abstract: We present a novel framework, InfinityGAN, for arbitrary-sized image generation. The task is associated with several key challenges. First, scaling existing models to an arbitrarily large image size is resource-constrained, in terms of both computation and availability of large-field-of-view training data. InfinityGAN trains and infers in a seamless patch-by-patch manner with low computational res…

    Submitted 10 March, 2022; v1 submitted 8 April, 2021; originally announced April 2021.

    Comments: Accepted to ICLR 2022. Full Paper: https://openreview.net/forum?id=ufGMqIM0a4b ; Project page: https://hubert0527.github.io/infinityGAN/

  18. arXiv:2104.00675  [pdf, other]

    cs.CV

    In&Out : Diverse Image Outpainting via GAN Inversion

    Authors: Yen-Chi Cheng, Chieh Hubert Lin, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Ming-Hsuan Yang

    Abstract: Image outpainting seeks for a semantically consistent extension of the input image beyond its available content. Compared to inpainting -- filling in missing pixels in a way coherent with the neighboring pixels -- outpainting can be achieved in more diverse ways since the problem is less constrained by the surrounding pixels. Existing image outpainting methods pose the problem as a conditional ima…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: Project Page: https://yccyenchicheng.github.io/InOut/

  19. arXiv:2103.05677  [pdf, other]

    cs.CV

    SMIL: Multimodal Learning with Severely Missing Modality

    Authors: Mengmeng Ma, Jian Ren, Long Zhao, Sergey Tulyakov, Cathy Wu, Xi Peng

    Abstract: A common assumption in multimodal learning is the completeness of training data, i.e., full modalities are available in all training examples. Although there exists research endeavor in developing novel methods to tackle the incompleteness of testing data, e.g., modalities are partially missing in testing examples, few of them can handle incomplete training modalities. The problem becomes even mor…

    Submitted 9 March, 2021; originally announced March 2021.

    Comments: In AAAI 2021 (9 pages)

  20. arXiv:2103.03467  [pdf, other]

    cs.CV

    Teachers Do More Than Teach: Compressing Image-to-Image Models

    Authors: Qing Jin, Jian Ren, Oliver J. Woodford, Jiazhuo Wang, Geng Yuan, Yanzhi Wang, Sergey Tulyakov

    Abstract: Generative Adversarial Networks (GANs) have achieved huge success in generating high-fidelity images, however, they suffer from low efficiency due to tremendous computational cost and bulky memory usage. Recent efforts on compression GANs show noticeable progress in obtaining smaller generators by sacrificing image quality or involving a time-consuming searching process. In this work, we aim to ad…

    Submitted 18 August, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

    Comments: 18 pages, 10 figures, accepted by CVPR 2021

  21. arXiv:2101.12195  [pdf, other]

    cs.CV cs.AI

    Playable Video Generation

    Authors: Willi Menapace, Stéphane Lathuilière, Sergey Tulyakov, Aliaksandr Siarohin, Elisa Ricci

    Abstract: This paper introduces the unsupervised learning problem of playable video generation (PVG). In PVG, we aim at allowing a user to control the generated video by selecting a discrete action at every time step as when playing a video game. The difficulty of the task lies both in learning semantically consistent actions and in generating realistic videos conditioned on the user input. We propose a nov…

    Submitted 28 January, 2021; originally announced January 2021.

  22. arXiv:2010.16417  [pdf, other]

    cs.CV

    MichiGAN: Multi-Input-Conditioned Hair Image Generation for Portrait Editing

    Authors: Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Lu Yuan, Sergey Tulyakov, Nenghai Yu

    Abstract: Despite the recent success of face image generation with GANs, conditional hair editing remains challenging due to the under-explored complexity of its geometry and appearance. In this paper, we present MichiGAN (Multi-Input-Conditioned Hair Image GAN), a novel conditional image generation method for interactive portrait hair manipulation. To provide user control over every major hair visual facto…

    Submitted 30 October, 2020; originally announced October 2020.

    Comments: Siggraph 2020, code is available at https://github.com/tzt101/MichiGAN

  23. arXiv:2004.14489  [pdf, other]

    cs.GR cs.CV

    Interactive Video Stylization Using Few-Shot Patch-Based Training

    Authors: Ondřej Texler, David Futschik, Michal Kučera, Ondřej Jamriška, Šárka Sochorová, Menglei Chai, Sergey Tulyakov, Daniel Sýkora

    Abstract: In this paper, we present a learning-based method to the keyframe-based video stylization that allows an artist to propagate the style from a few selected keyframes to the rest of the sequence. Its key advantage is that the resulting stylization is semantically meaningful, i.e., specific parts of moving objects are stylized according to the artist's intention. In contrast to previous style transfe…

    Submitted 29 April, 2020; originally announced April 2020.

  24. arXiv:2004.13297  [pdf, other]

    cs.CV

    Neural Hair Rendering

    Authors: Menglei Chai, Jian Ren, Sergey Tulyakov

    Abstract: In this paper, we propose a generic neural-based hair rendering pipeline that can synthesize photo-realistic images from virtual 3D hair models. Unlike existing supervised translation methods that require model-level similarity to preserve consistent structure representation for both real images and fake renderings, our method adopts an unsupervised solution to work on arbitrary hair models. The k…

    Submitted 21 July, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

    Comments: ECCV 2020

  25. arXiv:2004.03234  [pdf, other]

    cs.CV

    Motion-supervised Co-Part Segmentation

    Authors: Aliaksandr Siarohin, Subhankar Roy, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe

    Abstract: Recent co-part segmentation methods mostly operate in a supervised learning setting, which requires a large amount of annotated data for training. To overcome this limitation, we propose a self-supervised deep learning method for co-part segmentation. Differently from previous works, our approach develops the idea that motion information inferred from videos can be leveraged to discover meaningful…

    Submitted 15 April, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

    Journal ref: ICPR 2021

  26. arXiv:2004.03142  [pdf, other]

    cs.CV

    Human Motion Transfer from Poses in the Wild

    Authors: Jian Ren, Menglei Chai, Sergey Tulyakov, Chen Fang, Xiaohui Shen, Jianchao Yang

    Abstract: In this paper, we tackle the problem of human motion transfer, where we synthesize novel motion video for a target person that imitates the movement from a reference video. It is a video-to-video translation task in which the estimated poses are used to bridge two domains. Despite substantial progress on the topic, there exist several problems with the previous methods. First, there is a domain ga…

    Submitted 7 April, 2020; originally announced April 2020.

  27. arXiv:2003.00196  [pdf, other]

    cs.CV cs.AI

    First Order Motion Model for Image Animation

    Authors: Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe

    Abstract: Image animation consists of generating a video sequence so that an object in a source image is animated according to the motion of a driving video. Our framework addresses this problem without using any annotation or prior information about the specific object to animate. Once trained on a set of videos depicting objects of the same category (e.g. faces, human bodies), our method can be applied to…

    Submitted 1 October, 2020; v1 submitted 29 February, 2020; originally announced March 2020.

    Comments: NeurIPS 2019

  28. arXiv:1908.06079  [pdf, other]

    cs.CV stat.ML

    Task-Assisted Domain Adaptation with Anchor Tasks

    Authors: Zhizhong Li, Linjie Luo, Sergey Tulyakov, Qieyun Dai, Derek Hoiem

    Abstract: Some tasks, such as surface normals or single-view depth estimation, require per-pixel ground truth that is difficult to obtain on real images but easy to obtain on synthetic. However, models learned on synthetic images often do not generalize well to real images due to the domain shift. Our key idea to improve domain adaptation is to introduce a separate anchor task (such as facial landmarks) who…

    Submitted 9 November, 2020; v1 submitted 16 August, 2019; originally announced August 2019.

    Comments: In WACV 2021

  29. arXiv:1904.06458  [pdf, other]

    cs.CV

    Transformable Bottleneck Networks

    Authors: Kyle Olszewski, Sergey Tulyakov, Oliver Woodford, Hao Li, Linjie Luo

    Abstract: We propose a novel approach to performing fine-grained 3D manipulation of image content via a convolutional neural network, which we call the Transformable Bottleneck Network (TBN). It applies given spatial transformations directly to a volumetric bottleneck within our encoder-bottleneck-decoder architecture. Multi-view supervision encourages the network to learn to spatially disentangle the featu…

    Submitted 26 August, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

    Comments: Project Page: https://kyleolsz.github.io/TB-Networks/

  30. arXiv:1903.12431  [pdf, other]

    cs.CL

    Train One Get One Free: Partially Supervised Neural Network for Bug Report Duplicate Detection and Clustering

    Authors: Lahari Poddar, Leonardo Neves, William Brendel, Luis Marujo, Sergey Tulyakov, Pradeep Karuturi

    Abstract: Tracking user reported bugs requires considerable engineering effort in going through many repetitive reports and assigning them to the correct teams. This paper proposes a neural architecture that can jointly (1) detect if two bug reports are duplicates, and (2) aggregate them into latent topics. Leveraging the assumption that learning the topic of a bug is a sub-task for detecting duplicates, we…

    Submitted 3 April, 2019; v1 submitted 29 March, 2019; originally announced March 2019.

    Comments: Accepted for publication in NAACL 2019

  31. arXiv:1903.11633  [pdf, other]

    cs.CV

    Laplace Landmark Localization

    Authors: Joseph P Robinson, Yuncheng Li, Ning Zhang, Yun Fu, Sergey Tulyakov

    Abstract: Landmark localization in images and videos is a classic problem solved in various ways. Nowadays, with deep networks prevailing throughout machine learning, there are revamped interests in pushing facial landmark detection technologies to handle more challenging data. Most efforts use network objectives based on L1 or L2 norms, which have several disadvantages. First of all, the locations of landm…

    Submitted 14 August, 2019; v1 submitted 27 March, 2019; originally announced March 2019.

    Comments: International Conference on Computer Vision (ICCV), 2019

  32. arXiv:1902.08900  [pdf, other]

    cs.CV

    3D Guided Fine-Grained Face Manipulation

    Authors: Zhenglin Geng, Chen Cao, Sergey Tulyakov

    Abstract: We present a method for fine-grained face manipulation. Given a face image with an arbitrary expression, our method can synthesize another arbitrary expression by the same person. This is achieved by first fitting a 3D face model and then disentangling the face into a texture and a shape. We then learn different networks in these two spaces. In the texture space, we use a conditional generative ne…

    Submitted 24 February, 2019; originally announced February 2019.

  33. arXiv:1812.08861  [pdf, other]

    cs.GR cs.CV cs.LG stat.ML

    Animating Arbitrary Objects via Deep Motion Transfer

    Authors: Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe

    Abstract: This paper introduces a novel deep learning framework for image animation. Given an input image with a target object and a driving video sequence depicting a moving object, our framework generates a video in which the target object is animated according to the driving sequence. This is achieved through a deep architecture that decouples appearance and motion information. Our framework consists of…

    Submitted 30 August, 2019; v1 submitted 20 December, 2018; originally announced December 2018.

    Comments: CVPR-2019 (oral)

  34. arXiv:1806.01677  [pdf, other]

    cs.CV cs.NE

    Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching

    Authors: Stepan Tulyakov, Anton Ivanov, Francois Fleuret

    Abstract: End-to-end deep-learning networks recently demonstrated extremely good performance for stereo matching. However, existing networks are difficult to use for practical applications since (1) they are memory-hungry and unable to process even modest-size images, (2) they have to be trained for a given disparity range. The Practical Deep Stereo (PDS) network that we propose addresses both issues: Fir…

    Submitted 5 June, 2018; originally announced June 2018.

  35. arXiv:1711.11566  [pdf, other]

    cs.LG cs.CV

    Hybrid VAE: Improving Deep Generative Models using Partial Observations

    Authors: Sergey Tulyakov, Andrew Fitzgibbon, Sebastian Nowozin

    Abstract: Deep neural network models trained on large labeled datasets are the state-of-the-art in a large variety of computer vision tasks. In many applications, however, labeled data is expensive to obtain or requires a time consuming manual annotation process. In contrast, unlabeled data is often abundant and available in large quantities. We present a principled framework to capitalize on unlabeled data…

    Submitted 30 November, 2017; originally announced November 2017.

  36. arXiv:1707.04993  [pdf, other]

    cs.CV

    MoCoGAN: Decomposing Motion and Content for Video Generation

    Authors: Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz

    Abstract: Visual signals in a video can be divided into content and motion. While content specifies which objects are in the video, motion describes their dynamics. Based on this prior, we propose the Motion and Content decomposed Generative Adversarial Network (MoCoGAN) framework for video generation. The proposed framework generates a video by mapping a sequence of random vectors to a sequence of video fr…

    Submitted 13 December, 2017; v1 submitted 16 July, 2017; originally announced July 2017.

  37. Geometric calibration of Colour and Stereo Surface Imaging System of ESA's Trace Gas Orbiter

    Authors: Stepan Tulyakov, Anton Ivanov, Nicolas Thomas, Victoria Roloff, Antoine Pommerol, Gabriele Cremonese, Thomas Weigel, Francois Fleuret

    Abstract: There are many geometric calibration methods for "standard" cameras. These methods, however, cannot be used for the calibration of telescopes with large focal lengths and complex off-axis optics. Moreover, specialized calibration methods for the telescopes are scarce in literature. We describe the calibration method that we developed for the Colour and Stereo Surface Imaging System (CaSSIS) telesc…

    Submitted 3 July, 2017; originally announced July 2017.

    Comments: Submitted to Advances in Space Research

  38. arXiv:1612.00979  [pdf, ps, other]

    cs.CV cs.NE

    Semi-supervised learning of deep metrics for stereo reconstruction

    Authors: Stepan Tulyakov, Anton Ivanov, Francois Fleuret

    Abstract: Deep-learning metrics have recently demonstrated extremely good performance to match image patches for stereo reconstruction. However, training such metrics requires large amount of labeled stereo images, which can be difficult or costly to collect for certain applications. The main contribution of our work is a new semi-supervised method for learning deep metrics from unlabeled stereo images, giv…

    Submitted 3 December, 2016; originally announced December 2016.

    Comments: 11 pages, 3 figures