Showing 1–37 of 37 results for author: Drozdzal, M

  1. arXiv:2406.10429

    cs.CV cs.AI

    Consistency-diversity-realism Pareto fronts of conditional image generative models

    Authors: Pietro Astolfi, Marlene Careil, Melissa Hall, Oscar Mañas, Matthew Muckley, Jakob Verbeek, Adriana Romero Soriano, Michal Drozdzal

    Abstract: Building world models that accurately and comprehensively represent the real world is the utmost aspiration for conditional image generative models as it would enable their use as world simulators. For these models to be successful world models, they should not only excel at image quality and prompt-image consistency but also ensure high representation diversity. However, current research in gener…

    Submitted 14 June, 2024; originally announced June 2024.

  2. arXiv:2406.04551

    cs.CV cs.AI cs.LG

    Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance

    Authors: Reyhane Askari Hemmat, Melissa Hall, Alicia Sun, Candace Ross, Michal Drozdzal, Adriana Romero-Soriano

    Abstract: With the growing popularity of text-to-image generative models, there has been increasing focus on understanding their risks and biases. Recent work has found that state-of-the-art models struggle to depict everyday objects with the true diversity of the real world and have notable gaps between geographic regions. In this work, we aim to increase the diversity of generated images of common objects…

    Submitted 6 June, 2024; originally announced June 2024.

  3. arXiv:2405.04457

    cs.CV cs.CY cs.HC

    Towards Geographic Inclusion in the Evaluation of Text-to-Image Models

    Authors: Melissa Hall, Samuel J. Bell, Candace Ross, Adina Williams, Michal Drozdzal, Adriana Romero Soriano

    Abstract: Rapid progress in text-to-image generative models coupled with their deployment for visual content creation has magnified the importance of thoroughly evaluating their performance and identifying potential biases. In pursuit of models that generate images that are realistic, diverse, visually appealing, and consistent with the given prompt, researchers and practitioners often turn to automated met…

    Submitted 7 May, 2024; originally announced May 2024.

  4. arXiv:2403.17804

    cs.CV cs.CL

    Improving Text-to-Image Consistency via Automatic Prompt Optimization

    Authors: Oscar Mañas, Pietro Astolfi, Melissa Hall, Candace Ross, Jack Urbanek, Adina Williams, Aishwarya Agrawal, Adriana Romero-Soriano, Michal Drozdzal

    Abstract: Impressive advances in text-to-image (T2I) generative models have yielded a plethora of high performing models which are able to generate aesthetically appealing, photorealistic images. Despite the progress, these models still struggle to produce images that are consistent with the input prompt, oftentimes failing to capture object quantities, relations and attributes properly. Existing solutions…

    Submitted 26 March, 2024; originally announced March 2024.

  5. arXiv:2402.05188

    cs.RO cs.AI cs.CL

    InCoRo: In-Context Learning for Robotics Control with Feedback Loops

    Authors: Jiaqiang Ye Zhu, Carla Gomez Cano, David Vazquez Bermudez, Michal Drozdzal

    Abstract: One of the challenges in robotics is to enable robotic units with the reasoning capability that would be robust enough to execute complex tasks in dynamic environments. Recent advances in LLMs have positioned them as go-to tools for simple reasoning tasks, motivating the pioneering work of Liang et al. [35] that uses an LLM to translate natural language commands into low-level static execution pla…

    Submitted 7 February, 2024; originally announced February 2024.

  6. arXiv:2310.00158

    cs.CV cs.AI cs.LG

    Feedback-guided Data Synthesis for Imbalanced Classification

    Authors: Reyhane Askari Hemmat, Mohammad Pezeshki, Florian Bordes, Michal Drozdzal, Adriana Romero-Soriano

    Abstract: The current status quo in machine learning is to use static datasets of real images for training, which often come from long-tailed distributions. With the recent advances in generative models, researchers have started augmenting these static datasets with synthetic data, reporting moderate performance improvements on classification tasks. We hypothesize that these performance gains are limited by the…

    Submitted 29 September, 2023; originally announced October 2023.

  7. arXiv:2308.06198

    cs.CV cs.HC

    DIG In: Evaluating Disparities in Image Generations with Indicators for Geographic Diversity

    Authors: Melissa Hall, Candace Ross, Adina Williams, Nicolas Carion, Michal Drozdzal, Adriana Romero Soriano

    Abstract: The unprecedented photorealistic results achieved by recent text-to-image generative systems and their increasing use as plug-and-play content creation solutions make it crucial to understand their potential biases. In this work, we introduce three indicators to evaluate the realism, diversity and prompt-generation consistency of text-to-image generative systems when prompted to generate objects f…

    Submitted 18 March, 2024; v1 submitted 11 August, 2023; originally announced August 2023.

  8. arXiv:2305.08675

    cs.CV

    Improved baselines for vision-language pre-training

    Authors: Enrico Fini, Pietro Astolfi, Adriana Romero-Soriano, Jakob Verbeek, Michal Drozdzal

    Abstract: Contrastive learning has emerged as an efficient framework to learn multimodal representations. CLIP, a seminal work in this area, achieved impressive results by training on paired image-text data using the contrastive loss. Recent work claims improvements over CLIP using additional non-contrastive losses inspired from self-supervised learning. However, it is sometimes hard to disentangle the cont…

    Submitted 4 November, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: TMLR, featured certification; changelog at https://openreview.net/forum?id=a7nvXxNmdV

    Journal ref: Transactions on Machine Learning Research, 10/2023, issn 2835-8856

  9. arXiv:2304.13722

    cs.CV

    Controllable Image Generation via Collage Representations

    Authors: Arantxa Casanova, Marlène Careil, Adriana Romero-Soriano, Christopher J. Pal, Jakob Verbeek, Michal Drozdzal

    Abstract: Recent advances in conditional generative image models have enabled impressive results. On the one hand, text-based conditional models have achieved remarkable generation quality, by leveraging large-scale datasets of image-text pairs. To enable fine-grained controllability, however, text-based models require long prompts, whose details may be ignored by the model. On the other hand, layout-based…

    Submitted 26 April, 2023; originally announced April 2023.

  10. arXiv:2303.09677

    cs.CV

    Instance-Conditioned GAN Data Augmentation for Representation Learning

    Authors: Pietro Astolfi, Arantxa Casanova, Jakob Verbeek, Pascal Vincent, Adriana Romero-Soriano, Michal Drozdzal

    Abstract: Data augmentation has become a crucial component to train state-of-the-art visual representation models. However, handcrafting combinations of transformations that lead to improved performances is a laborious task, which can result in visually unrealistic samples. To overcome these limitations, recent works have explored the use of generative models as learnable data augmentation tools, showing pr…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: TMLR reviews at https://openreview.net/forum?id=1n7q9mxG3T&referrer=%5BTMLR%5D(%2Fgroup%3Fid%3DTMLR)

  11. arXiv:2302.07960

    cs.LG cs.HC

    Learning to Substitute Ingredients in Recipes

    Authors: Bahare Fatemi, Quentin Duval, Rohit Girdhar, Michal Drozdzal, Adriana Romero-Soriano

    Abstract: Recipe personalization through ingredient substitution has the potential to help people meet their dietary needs and preferences, avoid potential allergens, and ease culinary exploration in everyone's kitchen. To address ingredient substitution, we build a benchmark, composed of a dataset of substitution pairs with standardized splits, evaluation metrics, and baselines. We further introduce Graph-…

    Submitted 15 February, 2023; originally announced February 2023.

  12. arXiv:2211.01866

    cs.CV cs.LG

    ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations

    Authors: Badr Youbi Idrissi, Diane Bouchacourt, Randall Balestriero, Ivan Evtimov, Caner Hazirbas, Nicolas Ballas, Pascal Vincent, Michal Drozdzal, David Lopez-Paz, Mark Ibrahim

    Abstract: Deep learning vision systems are widely deployed across applications where reliability is critical. However, even today's best models can fail to recognize an object when its pose, lighting, or background varies. While existing benchmarks surface examples challenging for models, they do not explain why such mistakes arise. To address this need, we introduce ImageNet-X, a set of sixteen human annot…

    Submitted 3 November, 2022; originally announced November 2022.

  13. arXiv:2210.00978

    cs.CV

    Uncertainty-Driven Active Vision for Implicit Scene Reconstruction

    Authors: Edward J. Smith, Michal Drozdzal, Derek Nowrouzezahrai, David Meger, Adriana Romero-Soriano

    Abstract: Multi-view implicit scene reconstruction methods have become increasingly popular due to their ability to represent complex scene details. Recent efforts have been devoted to improving the representation of input information and to reducing the number of views required to obtain high quality reconstructions. Yet, perhaps surprisingly, the study of which views to select to maximally improve scene u…

    Submitted 3 October, 2022; originally announced October 2022.

  14. arXiv:2203.16392

    eess.IV cs.CV

    On learning adaptive acquisition policies for undersampled multi-coil MRI reconstruction

    Authors: Tim Bakker, Matthew Muckley, Adriana Romero-Soriano, Michal Drozdzal, Luis Pineda

    Abstract: Most current approaches to undersampled multi-coil MRI reconstruction focus on learning the reconstruction model for a fixed, equidistant acquisition trajectory. In this paper, we study the problem of joint learning of the reconstruction model together with acquisition policies. To this end, we extend the End-to-End Variational Network with learnable acquisition policies that can adapt to differen…

    Submitted 30 March, 2022; originally announced March 2022.

    Comments: Accepted to MIDL 2022 as conference paper

  15. arXiv:2110.13100

    cs.LG cs.AI cs.CV stat.ML

    Parameter Prediction for Unseen Deep Architectures

    Authors: Boris Knyazev, Michal Drozdzal, Graham W. Taylor, Adriana Romero-Soriano

    Abstract: Deep learning has been successful in automating the design of features in machine learning pipelines. However, the algorithms optimizing neural network parameters remain largely hand-designed and computationally inefficient. We study if we can use deep learning to directly predict these parameters by exploiting the past knowledge of training other networks. We introduce a large-scale dataset of di…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021 camera ready, the code is available at https://github.com/facebookresearch/ppuda

  16. arXiv:2109.05070

    cs.CV cs.LG

    Instance-Conditioned GAN

    Authors: Arantxa Casanova, Marlène Careil, Jakob Verbeek, Michal Drozdzal, Adriana Romero-Soriano

    Abstract: Generative Adversarial Networks (GANs) can generate near photo realistic images in narrow domains such as human faces. Yet, modeling complex distributions of datasets such as ImageNet and COCO-Stuff remains challenging in unconditional settings. In this paper, we take inspiration from kernel density estimation techniques and introduce a non-parametric approach to modeling distributions of complex…

    Submitted 4 November, 2021; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: Accepted at NeurIPS2021

  17. arXiv:2107.09584

    cs.CV cs.RO

    Active 3D Shape Reconstruction from Vision and Touch

    Authors: Edward J. Smith, David Meger, Luis Pineda, Roberto Calandra, Jitendra Malik, Adriana Romero, Michal Drozdzal

    Abstract: Humans build 3D understandings of the world through active object exploration, using jointly their senses of vision and touch. However, in 3D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings, leaving the active exploration of the shape largely unexplored. In active touch sensing for 3D reconstruction,…

    Submitted 26 October, 2021; v1 submitted 20 July, 2021; originally announced July 2021.

    Journal ref: Published at Neurips 2021

  18. arXiv:2012.04027

    cs.CV cs.AI

    Generating unseen complex scenes: are we there yet?

    Authors: Arantxa Casanova, Michal Drozdzal, Adriana Romero-Soriano

    Abstract: Although recent complex scene conditional generation models generate increasingly appealing scenes, it is very hard to assess which models perform better and why. This is often due to models being trained to fit different data splits, and defining their own experimental setups. In this paper, we propose a methodology to compare complex scene conditional generation models, and provide an in-depth a…

    Submitted 7 December, 2020; originally announced December 2020.

  19. arXiv:2007.15255

    cs.CV cs.LG stat.ML

    Instance Selection for GANs

    Authors: Terrance DeVries, Michal Drozdzal, Graham W. Taylor

    Abstract: Recent advances in Generative Adversarial Networks (GANs) have led to their widespread adoption for the purposes of generating high quality synthetic imagery. While capable of generating photo-realistic images, these models often produce unrealistic samples which fall outside of the data manifold. Several recently proposed techniques attempt to avoid spurious samples, either by rejecting them afte…

    Submitted 23 October, 2020; v1 submitted 30 July, 2020; originally announced July 2020.

    Comments: Accepted to NeurIPS 2020

  20. Active MR k-space Sampling with Reinforcement Learning

    Authors: Luis Pineda, Sumana Basu, Adriana Romero, Roberto Calandra, Michal Drozdzal

    Abstract: Deep learning approaches have recently shown great promise in accelerating magnetic resonance image (MRI) acquisition. The majority of existing work has focused on designing better reconstruction models given a pre-determined acquisition trajectory, ignoring the question of trajectory optimization. In this paper, we focus on learning acquisition trajectories given a fixed image reconstruction mod…

    Submitted 7 October, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

    Comments: Presented at the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2020

    Journal ref: LNCS vol. 12262 (2020) 23-33

  21. arXiv:2007.03778

    cs.CV cs.RO

    3D Shape Reconstruction from Vision and Touch

    Authors: Edward J. Smith, Roberto Calandra, Adriana Romero, Georgia Gkioxari, David Meger, Jitendra Malik, Michal Drozdzal

    Abstract: When a toddler is presented a new toy, their instinctual behaviour is to pick it up and inspect it with their hand and eyes in tandem, clearly searching over its surface to properly understand what they are playing with. At any instance here, touch provides high fidelity localized information while vision provides complementary global context. However, in 3D shape reconstruction, the complementary…

    Submitted 2 November, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

    Comments: Accepted at Neurips 2020

  22. arXiv:2004.07041

    eess.IV cs.CV cs.LG

    Extending Unsupervised Neural Image Compression With Supervised Multitask Learning

    Authors: David Tellez, Diederik Hoppener, Cornelis Verhoef, Dirk Grunhagen, Pieter Nierop, Michal Drozdzal, Jeroen van der Laak, Francesco Ciompi

    Abstract: We focus on the problem of training convolutional neural networks on gigapixel histopathology images to predict image-level targets. For this purpose, we extend Neural Image Compression (NIC), an image compression framework that reduces the dimensionality of these images using an encoder network trained unsupervisedly. We propose to train this encoder using supervised multitask learning (MTL) inst…

    Submitted 15 April, 2020; originally announced April 2020.

    Comments: Medical Imaging with Deep Learning 2020 (MIDL20)

  23. arXiv:1908.06037

    cs.CV cs.LG

    Needles in Haystacks: On Classifying Tiny Objects in Large Images

    Authors: Nick Pawlowski, Suvrat Bhooshan, Nicolas Ballas, Francesco Ciompi, Ben Glocker, Michal Drozdzal

    Abstract: In some important computer vision domains, such as medical or hyperspectral imaging, we care about the classification of tiny objects in large images. However, most Convolutional Neural Networks (CNNs) for image classification were developed using biased datasets that contain large objects, in mostly central image positions. To assess whether classical CNN architectures work well for tiny object c…

    Submitted 6 January, 2020; v1 submitted 16 August, 2019; originally announced August 2019.

  24. arXiv:1907.08175

    cs.CV cs.LG eess.IV stat.ML

    On the Evaluation of Conditional GANs

    Authors: Terrance DeVries, Adriana Romero, Luis Pineda, Graham W. Taylor, Michal Drozdzal

    Abstract: Conditional Generative Adversarial Networks (cGANs) are finding increasingly widespread use in many application domains. Despite outstanding progress, quantitative evaluation of such models often involves multiple distinct metrics to assess different desirable properties, such as image quality, conditional consistency, and intra-conditioning diversity. In this setting, model benchmarking becomes a…

    Submitted 23 December, 2019; v1 submitted 11 July, 2019; originally announced July 2019.

  25. arXiv:1904.05709

    cs.CV

    Elucidating image-to-set prediction: An analysis of models, losses and datasets

    Authors: Luis Pineda, Amaia Salvador, Michal Drozdzal, Adriana Romero

    Abstract: In this paper, we identify an important reproducibility challenge in the image-to-set prediction literature that impedes proper comparisons among published methods, namely, researchers use different evaluation protocols to assess their contributions. To alleviate this issue, we introduce an image-to-set prediction benchmark suite built on top of five public datasets of increasing task complexity t…

    Submitted 27 May, 2020; v1 submitted 11 April, 2019; originally announced April 2019.

  26. arXiv:1902.03051

    cs.CV

    Reducing Uncertainty in Undersampled MRI Reconstruction with Active Acquisition

    Authors: Zizhao Zhang, Adriana Romero, Matthew J. Muckley, Pascal Vincent, Lin Yang, Michal Drozdzal

    Abstract: The goal of MRI reconstruction is to restore a high fidelity image from partially observed measurements. This partial view naturally induces reconstruction uncertainty that can only be reduced by acquiring additional measurements. In this paper, we present a novel method for MRI reconstruction that, at inference time, dynamically selects the measurements to take and iteratively refines the predict…

    Submitted 8 February, 2019; originally announced February 2019.

  27. The Liver Tumor Segmentation Benchmark (LiTS)

    Authors: Patrick Bilic, Patrick Christ, Hongwei Bran Li, Eugene Vorontsov, Avi Ben-Cohen, Georgios Kaissis, Adi Szeskin, Colin Jacobs, Gabriel Efrain Humpire Mamani, Gabriel Chartrand, Fabian Lohöfer, Julian Walter Holch, Wieland Sommer, Felix Hofmann, Alexandre Hostettler, Naama Lev-Cohain, Michal Drozdzal, Michal Marianne Amitai, Refael Vivantik, Jacob Sosna, Ivan Ezhov, Anjany Sekuboyina, Fernando Navarro, Florian Kofler, Johannes C. Paetzold, et al. (84 additional authors not shown)

    Abstract: In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with…

    Submitted 25 November, 2022; v1 submitted 13 January, 2019; originally announced January 2019.

    Comments: Patrick Bilic, Patrick Christ, Hongwei Bran Li, and Eugene Vorontsov made equal contributions to this work. Published in Medical Image Analysis

    Journal ref: Medical Image Analysis (2022) Pg. 102680

  28. arXiv:1812.06164

    cs.CV

    Inverse Cooking: Recipe Generation from Food Images

    Authors: Amaia Salvador, Michal Drozdzal, Xavier Giro-i-Nieto, Adriana Romero

    Abstract: People enjoy food photography because they appreciate food. Behind each meal there is a story described in a complex recipe and, unfortunately, by simply looking at a food image we do not have access to its preparation process. Therefore, in this paper we introduce an inverse cooking system that recreates cooking recipes given food images. Our system predicts ingredients as sets by means of a nove…

    Submitted 15 June, 2019; v1 submitted 14 December, 2018; originally announced December 2018.

    Comments: CVPR 2019

  29. arXiv:1811.08839

    cs.CV cs.LG eess.SP physics.med-ph stat.ML

    fastMRI: An Open Dataset and Benchmarks for Accelerated MRI

    Authors: Jure Zbontar, Florian Knoll, Anuroop Sriram, Tullie Murrell, Zhengnan Huang, Matthew J. Muckley, Aaron Defazio, Ruben Stern, Patricia Johnson, Mary Bruno, Marc Parente, Krzysztof J. Geras, Joe Katsnelson, Hersh Chandarana, Zizhao Zhang, Michal Drozdzal, Adriana Romero, Michael Rabbat, Pascal Vincent, Nafissa Yakubova, James Pinkerton, Duo Wang, Erich Owens, C. Lawrence Zitnick, Michael P. Recht, et al. (2 additional authors not shown)

    Abstract: Accelerating Magnetic Resonance Imaging (MRI) by taking fewer measurements has the potential to reduce medical costs, minimize stress to patients and make MRI possible in applications where it is currently prohibitively slow or expensive. We introduce the fastMRI dataset, a large-scale collection of both raw MR measurements and clinical MR images, that can be used for training and evaluation of ma…

    Submitted 11 December, 2019; v1 submitted 21 November, 2018; originally announced November 2018.

    Comments: 35 pages, 10 figures

  30. arXiv:1804.11332

    cs.CV

    On the iterative refinement of densely connected representation levels for semantic segmentation

    Authors: Arantxa Casanova, Guillem Cucurull, Michal Drozdzal, Adriana Romero, Yoshua Bengio

    Abstract: State-of-the-art semantic segmentation approaches increase the receptive field of their models by using either a downsampling path composed of poolings/strided convolutions or successive dilated convolutions. However, it is not clear which operation leads to best results. In this paper, we systematically study the differences introduced by distinct receptive field enlargement methods and their imp…

    Submitted 30 April, 2018; originally announced April 2018.

  31. arXiv:1710.02248

    cs.LG cs.AI stat.ML

    Learnable Explicit Density for Continuous Latent Space and Variational Inference

    Authors: Chin-Wei Huang, Ahmed Touati, Laurent Dinh, Michal Drozdzal, Mohammad Havaei, Laurent Charlin, Aaron Courville

    Abstract: In this paper, we study two aspects of the variational autoencoder (VAE): the prior distribution over the latent variables and its corresponding posterior. First, we decompose the learning of VAEs into layerwise density estimation, and argue that having a flexible prior is beneficial to both sample generation and inference. Second, we analyze the family of inverse autoregressive flows (inverse AF)…

    Submitted 5 October, 2017; originally announced October 2017.

    Comments: 2 figures, 5 pages, submitted to ICML Principled Approaches to Deep Learning workshop

  32. arXiv:1705.07450

    cs.CV

    Image Segmentation by Iterative Inference from Conditional Score Estimation

    Authors: Adriana Romero, Michal Drozdzal, Akram Erraqabi, Simon Jégou, Yoshua Bengio

    Abstract: Inspired by the combination of feedforward and iterative computations in the visual cortex, and taking advantage of the ability of denoising autoencoders to estimate the score of a joint distribution, we propose a novel approach to iterative inference for capturing and exploiting the complex joint distribution of output variables conditioned on some input variables. This approach is applied to im…

    Submitted 18 August, 2017; v1 submitted 21 May, 2017; originally announced May 2017.

  33. arXiv:1702.05174

    cs.CV

    Learning Normalized Inputs for Iterative Estimation in Medical Image Segmentation

    Authors: Michal Drozdzal, Gabriel Chartrand, Eugene Vorontsov, Lisa Di Jorio, An Tang, Adriana Romero, Yoshua Bengio, Chris Pal, Samuel Kadoury

    Abstract: In this paper, we introduce a simple, yet powerful pipeline for medical image segmentation that combines Fully Convolutional Networks (FCNs) with Fully Convolutional Residual Networks (FC-ResNets). We propose and examine a design that takes particular advantage of recent advances in the understanding of both Convolutional Neural Networks as well as ResNets. Our approach focuses upon the importance…

    Submitted 16 February, 2017; originally announced February 2017.

  34. arXiv:1612.00799

    cs.CV

    A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images

    Authors: David Vázquez, Jorge Bernal, F. Javier Sánchez, Gloria Fernández-Esparrach, Antonio M. López, Adriana Romero, Michal Drozdzal, Aaron Courville

    Abstract: Colorectal cancer (CRC) is the third cause of cancer death worldwide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search for polyps and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are polyp miss-rate and inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced…

    Submitted 2 December, 2016; originally announced December 2016.

  35. arXiv:1611.09326

    cs.CV

    The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation

    Authors: Simon Jégou, Michal Drozdzal, David Vazquez, Adriana Romero, Yoshua Bengio

    Abstract: State-of-the-art approaches for semantic image segmentation are built on Convolutional Neural Networks (CNNs). The typical segmentation architecture is composed of (a) a downsampling path responsible for extracting coarse semantic features, followed by (b) an upsampling path trained to recover the input image resolution at the output of the model and, optionally, (c) a post-processing module (e.g.…

    Submitted 31 October, 2017; v1 submitted 28 November, 2016; originally announced November 2016.

  36. arXiv:1608.04117

    cs.CV

    The Importance of Skip Connections in Biomedical Image Segmentation

    Authors: Michal Drozdzal, Eugene Vorontsov, Gabriel Chartrand, Samuel Kadoury, Chris Pal

    Abstract: In this paper, we study the influence of both long and short skip connections on Fully Convolutional Networks (FCN) for biomedical image segmentation. In standard FCNs, only long skip connections are used to skip features from the contracting path to the expanding path in order to recover spatial information lost during downsampling. We extend FCNs by adding short skip connections, that are simila…

    Submitted 22 September, 2016; v1 submitted 14 August, 2016; originally announced August 2016.

    Comments: Accepted to 2nd Workshop on Deep Learning in Medical Image Analysis (DLMIA 2016); Added references

  37. arXiv:1607.07604

    cs.CV

    Generic Feature Learning for Wireless Capsule Endoscopy Analysis

    Authors: Santi Seguí, Michal Drozdzal, Guillem Pascual, Petia Radeva, Carolina Malagelada, Fernando Azpiroz, Jordi Vitrià

    Abstract: The interpretation and analysis of the wireless capsule endoscopy recording is a complex task which requires sophisticated computer aided decision (CAD) systems in order to help physicians with the video screening and, finally, with the diagnosis. Most of the CAD systems in the capsule endoscopy share a common system design, but use very different image and video representations. As a result, each…

    Submitted 26 July, 2016; originally announced July 2016.