
Showing 1–22 of 22 results for author: Perazzi, F

  1. arXiv:2203.14863 [pdf, other]

    cs.CV cs.MM

    HIME: Efficient Headshot Image Super-Resolution with Multiple Exemplars

    Authors: Xiaoyu Xiang, Jon Morton, Fitsum A Reda, Lucas Young, Federico Perazzi, Rakesh Ranjan, Amit Kumar, Andrea Colaco, Jan Allebach

    Abstract: A promising direction for recovering the lost information in low-resolution headshot images is utilizing a set of high-resolution exemplars from the same identity. Complementary images in the reference set can improve the generated headshot quality across many different views and poses. However, it is challenging to make the best use of multiple exemplars: the quality and alignment of each exempla…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Technical Report

  2. arXiv:2106.01667 [pdf, other]

    cs.CV

    APES: Audiovisual Person Search in Untrimmed Video

    Authors: Juan Leon Alcazar, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem, Fabian Caba Heilbron

    Abstract: Humans are arguably one of the most important subjects in video streams; many real-world applications, such as video summarization or video editing workflows, often require the automatic search and retrieval of a person of interest. Despite tremendous efforts in the person reidentification and retrieval domains, few works have developed audiovisual search strategies. In this paper, we present the Au…

    Submitted 3 June, 2021; originally announced June 2021.

  3. arXiv:2104.02244 [pdf, other]

    cs.CV

    Content-Aware GAN Compression

    Authors: Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Federico Perazzi, S. Y. Kung

    Abstract: Generative adversarial networks (GANs), e.g., StyleGAN2, play a vital role in various image generation and synthesis tasks, yet their notoriously high computational cost hinders their efficient deployment on edge devices. Directly applying generic compression approaches yields poor results on GANs, which motivates a number of recent GAN compression works. While prior works mainly accelerate condit…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: Published in CVPR2021

    ACM Class: I.4.0; I.2.6

  4. arXiv:2012.05116 [pdf, other]

    cs.CV

    Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments

    Authors: Zhihao Xia, Michaël Gharbi, Federico Perazzi, Kalyan Sunkavalli, Ayan Chakrabarti

    Abstract: We introduce a neural network-based method to denoise pairs of images taken in quick succession, with and without a flash, in low-light environments. Our goal is to produce a high-quality rendering of the scene that preserves the color and mood from the ambient illumination of the noisy no-flash image, while recovering surface texture and detail revealed by the flash. Our network outputs a gain ma…

    Submitted 14 April, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

    Comments: CVPR 2021. Project page at https://www.cse.wustl.edu/~zhihao.xia/deepfnf/

  5. arXiv:2011.02146 [pdf, other]

    cs.CV

    Deep Image Compositing

    Authors: He Zhang, Jianming Zhang, Federico Perazzi, Zhe Lin, Vishal M. Patel

    Abstract: Image compositing is a task of combining regions from different images to compose a new image. A common use case is background replacement of portrait images. To obtain high quality composites, professionals typically manually perform multiple editing steps such as segmentation, matting and foreground color decontamination, which is very time consuming even with sophisticated photo editing tools.…

    Submitted 4 November, 2020; originally announced November 2020.

    Comments: WACV-2021. A better portrait segmentation technology has been shipped in Photoshop 2020. Check this out if you are not sure how to use it. https://www.youtube.com/watch?v=v_kitSYKr3s&t=138s

  6. arXiv:2008.00892 [pdf, other]

    cs.LG cs.CV stat.ML

    Shape Adaptor: A Learnable Resizing Module

    Authors: Shikun Liu, Zhe Lin, Yilin Wang, Jianming Zhang, Federico Perazzi, Edward Johns

    Abstract: We present a novel resizing module for neural networks: shape adaptor, a drop-in enhancement built on top of traditional resizing layers, such as pooling, bilinear sampling, and strided convolution. Whilst traditional resizing layers have fixed and deterministic reshaping factors, our module allows for a learnable reshaping factor. Our implementation enables shape adaptors to be trained end-to-end…

    Submitted 10 August, 2020; v1 submitted 3 August, 2020; originally announced August 2020.

    Comments: Published at ECCV 2020

  7. arXiv:2007.09529 [pdf, other]

    cs.CV

    Single View Metrology in the Wild

    Authors: Rui Zhu, Xingyi Yang, Yannick Hold-Geoffroy, Federico Perazzi, Jonathan Eisenmann, Kalyan Sunkavalli, Manmohan Chandraker

    Abstract: Most 3D reconstruction methods may only recover scene properties up to a global scale ambiguity. We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground as well as camera parameters of orientation and field of view, using just a monocular image acquired in unconstrained condition. Our…

    Submitted 23 February, 2021; v1 submitted 18 July, 2020; originally announced July 2020.

    Comments: ECCV 2020, camera-ready version

  8. arXiv:2007.03815 [pdf, other]

    cs.CV cs.MM cs.RO

    Real-time Semantic Segmentation with Fast Attention

    Authors: Ping Hu, Federico Perazzi, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Kate Saenko, Stan Sclaroff

    Abstract: In deep CNN based models for semantic segmentation, high accuracy relies on rich spatial context (large receptive fields) and fine spatial details (high resolution), both of which incur high computational costs. In this paper, we propose a novel architecture that addresses both challenges and achieves state-of-the-art performance for semantic segmentation of high-resolution images and videos in re…

    Submitted 9 July, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

    Comments: Project page: https://cs-people.bu.edu/pinghu/FANet.html

  9. arXiv:2005.09812 [pdf, other]

    cs.CV cs.SD eess.AS

    Active Speakers in Context

    Authors: Juan Leon Alcazar, Fabian Caba Heilbron, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem

    Abstract: Current methods for active speaker detection focus on modeling short-term audiovisual information from a single speaker. Although this strategy can be enough for addressing single-speaker scenarios, it prevents accurate detection when the task is to identify who of many candidate speakers are talking. This paper introduces the Active Speaker Context, a novel representation that models relationshi…

    Submitted 19 May, 2020; originally announced May 2020.

  10. arXiv:2004.01800 [pdf, other]

    cs.CV cs.LG cs.MM eess.IV

    Temporally Distributed Networks for Fast Video Semantic Segmentation

    Authors: Ping Hu, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Stan Sclaroff, Federico Perazzi

    Abstract: We present TDNet, a temporally distributed network designed for fast and accurate video semantic segmentation. We observe that features extracted from a certain high-level layer of a deep CNN can be approximated by composing features extracted from several shallower sub-networks. Leveraging the inherent temporal continuity in videos, we distribute these sub-networks over sequential frames. Therefo…

    Submitted 6 April, 2020; v1 submitted 3 April, 2020; originally announced April 2020.

    Comments: [CVPR2020] Project: https://github.com/feinanshan/TDNet

  11. arXiv:2003.12649 [pdf, other]

    cs.CV

    Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement

    Authors: Sai Bi, Kalyan Sunkavalli, Federico Perazzi, Eli Shechtman, Vladimir Kim, Ravi Ramamoorthi

    Abstract: We present a method to improve the visual realism of low-quality, synthetic images, e.g. OpenGL renderings. Training an unpaired synthetic-to-real translation network in image space is severely under-constrained and produces visible artifacts. Instead, we propose a semi-supervised approach that operates on the disentangled shading and albedo layers of the image. Our two-stage pipeline first learns…

    Submitted 27 March, 2020; originally announced March 2020.

    Comments: Accepted to ICCV 2019

  12. arXiv:1912.04421 [pdf, other]

    cs.CV

    Basis Prediction Networks for Effective Burst Denoising with Large Kernels

    Authors: Zhihao Xia, Federico Perazzi, Michaël Gharbi, Kalyan Sunkavalli, Ayan Chakrabarti

    Abstract: Bursts of images exhibit significant self-similarity across both time and space. This motivates a representation of the kernels as linear combinations of a small set of basis elements. To this end, we introduce a novel basis prediction network that, given an input burst, predicts a set of global basis kernels -- shared within the image -- and the corresponding mixing coefficients -- which are spec…

    Submitted 2 December, 2020; v1 submitted 9 December, 2019; originally announced December 2019.

    Comments: CVPR 2020. Project website at https://www.cse.wustl.edu/~zhihao.xia/bpn/

  13. arXiv:1909.06804 [pdf, other]

    cs.CV cs.LG

    Scaling Object Detection by Transferring Classification Weights

    Authors: Jason Kuen, Federico Perazzi, Zhe Lin, Jianming Zhang, Yap-Peng Tan

    Abstract: Large scale object detection datasets are constantly increasing their size in terms of the number of classes and annotations count. Yet, the number of object-level categories annotated in detection datasets is an order of magnitude smaller than image-level classification labels. State-of-the-art object detection models are trained in a supervised fashion and this limits the number of object classe…

    Submitted 15 September, 2019; originally announced September 2019.

    Comments: ICCV 2019

  14. Deblurring Face Images using Uncertainty Guided Multi-Stream Semantic Networks

    Authors: Rajeev Yasarla, Federico Perazzi, Vishal M. Patel

    Abstract: We propose a novel multi-stream architecture and training methodology that exploits semantic labels for facial image deblurring. The proposed Uncertainty Guided Multi-Stream Semantic Network (UMSN) processes regions belonging to each semantic class independently and learns to combine their outputs into the final deblurred result. Pixel-wise semantic labels are obtained using a segmentation networ…

    Submitted 20 April, 2020; v1 submitted 30 July, 2019; originally announced July 2019.

    Comments: Accepted at TIP 2020

  15. arXiv:1905.00737 [pdf, other]

    cs.CV

    The 2019 DAVIS Challenge on VOS: Unsupervised Multi-Object Segmentation

    Authors: Sergi Caelles, Jordi Pont-Tuset, Federico Perazzi, Alberto Montes, Kevis-Kokitsi Maninis, Luc Van Gool

    Abstract: We present the 2019 DAVIS Challenge on Video Object Segmentation, the third edition of the DAVIS Challenge series, a public competition designed for the task of Video Object Segmentation (VOS). In addition to the original semi-supervised track and the interactive track introduced in the previous edition, a new unsupervised multi-object track will be featured this year. In the newly introduced trac…

    Submitted 2 May, 2019; originally announced May 2019.

    Comments: CVPR 2019 Workshop/Challenge

  16. arXiv:1904.11112 [pdf, other]

    cs.CV

    Web Stereo Video Supervision for Depth Prediction from Dynamic Scenes

    Authors: Chaoyang Wang, Simon Lucey, Federico Perazzi, Oliver Wang

    Abstract: We present a fully data-driven method to compute depth from diverse monocular video sequences that contain large amounts of non-rigid objects, e.g., people. In order to learn reconstruction cues for non-rigid scenes, we introduce a new dataset consisting of stereo videos scraped in-the-wild. This dataset has a wide variety of scene types, and features large amounts of nonrigid objects, especially…

    Submitted 24 April, 2019; originally announced April 2019.

  17. arXiv:1804.02900 [pdf, other]

    cs.CV

    A Fully Progressive Approach to Single-Image Super-Resolution

    Authors: Yifan Wang, Federico Perazzi, Brian McWilliams, Alexander Sorkine-Hornung, Olga Sorkine-Hornung, Christopher Schroers

    Abstract: Recent deep learning approaches to single image super-resolution have achieved impressive results in terms of traditional error measures and perceptual quality. However, in each case it remains challenging to achieve high quality results for large upsampling factors. To this end, we propose a method (ProSR) that is progressive both in architecture and training: the network upsamples an image in in…

    Submitted 10 April, 2018; v1 submitted 9 April, 2018; originally announced April 2018.

  18. arXiv:1804.01346 [pdf, other]

    cs.CV

    Normalized Cut Loss for Weakly-supervised CNN Segmentation

    Authors: Meng Tang, Abdelaziz Djelouah, Federico Perazzi, Yuri Boykov, Christopher Schroers

    Abstract: Most recent semantic segmentation methods train deep convolutional neural networks with fully annotated masks requiring pixel-accuracy for good quality training. Common weakly-supervised approaches generate full masks from partial input (e.g. scribbles or seeds) using standard interactive segmentation methods as preprocessing. But, errors in such masks result in poorer training since standard loss…

    Submitted 4 April, 2018; originally announced April 2018.

    Comments: Accepted at CVPR 2018

  19. arXiv:1803.09569 [pdf, other]

    cs.CV

    On Regularized Losses for Weakly-supervised CNN Segmentation

    Authors: Meng Tang, Federico Perazzi, Abdelaziz Djelouah, Ismail Ben Ayed, Christopher Schroers, Yuri Boykov

    Abstract: Minimization of regularized losses is a principled approach to weak supervision well-established in deep learning, in general. However, it is largely overlooked in semantic segmentation currently dominated by methods mimicking full supervision via "fake" fully-labeled training masks (proposals) generated from available partial input. To obtain such full masks the typical methods explicitly use sta…

    Submitted 10 April, 2018; v1 submitted 26 March, 2018; originally announced March 2018.

  20. arXiv:1803.00557 [pdf, other]

    cs.CV

    The 2018 DAVIS Challenge on Video Object Segmentation

    Authors: Sergi Caelles, Alberto Montes, Kevis-Kokitsi Maninis, Yuhua Chen, Luc Van Gool, Federico Perazzi, Jordi Pont-Tuset

    Abstract: We present the 2018 DAVIS Challenge on Video Object Segmentation, a public competition specifically designed for the task of video object segmentation. It builds upon the DAVIS 2017 dataset, which was presented in the previous edition of the DAVIS Challenge, and added 100 videos with multiple objects per sequence to the original DAVIS 2016 dataset. Motivated by the analysis of the results of the 2…

    Submitted 27 March, 2018; v1 submitted 1 March, 2018; originally announced March 2018.

    Comments: Challenge website: http://davischallenge.org/

  21. arXiv:1704.00675 [pdf, other]

    cs.CV

    The 2017 DAVIS Challenge on Video Object Segmentation

    Authors: Jordi Pont-Tuset, Federico Perazzi, Sergi Caelles, Pablo Arbeláez, Alex Sorkine-Hornung, Luc Van Gool

    Abstract: We present the 2017 DAVIS Challenge on Video Object Segmentation, a public dataset, benchmark, and competition specifically designed for the task of video object segmentation. Following the footsteps of other successful initiatives, such as ILSVRC and PASCAL VOC, which established the avenue of research in the fields of scene classification and semantic segmentation, the DAVIS Challenge comprises…

    Submitted 1 March, 2018; v1 submitted 3 April, 2017; originally announced April 2017.

    Comments: Challenge website: http://davischallenge.org

  22. arXiv:1612.02646 [pdf, other]

    cs.CV

    Learning Video Object Segmentation from Static Images

    Authors: Anna Khoreva, Federico Perazzi, Rodrigo Benenson, Bernt Schiele, Alexander Sorkine-Hornung

    Abstract: Inspired by recent advances of deep learning in instance segmentation and object tracking, we introduce video object segmentation problem as a concept of guided instance segmentation. Our model proceeds on a per-frame basis, guided by the output of the previous frame towards the object of interest in the next frame. We demonstrate that highly accurate object segmentation in videos can be enabled b…

    Submitted 8 December, 2016; originally announced December 2016.

    Comments: Submitted to CVPR 2017