Skip to main content

Showing 1–4 of 4 results for author: Gill, M

Searching in archive eess. Search in all archives.
.
  1. arXiv:2001.01005  [pdf, other

    eess.IV cs.CV

    Segmentation of Cellular Patterns in Confocal Images of Melanocytic Lesions in vivo via a Multiscale Encoder-Decoder Network (MED-Net)

    Authors: Kivanc Kose, Alican Bozkurt, Christi Alessi-Fox, Melissa Gill, Caterina Longo, Giovanni Pellacani, Jennifer Dy, Dana H. Brooks, Milind Rajadhyaksha

    Abstract: In-vivo optical microscopy is advancing into routine clinical practice for non-invasively guiding diagnosis and treatment of cancer and other diseases, and thus beginning to reduce the need for traditional biopsy. However, reading and analysis of the optical microscopic images are generally still qualitative, relying mainly on visual examination. Here we present an automated semantic segmentation… ▽ More

    Submitted 3 January, 2020; originally announced January 2020.

  2. arXiv:1912.00938  [pdf

    eess.AS cs.SD

    Speaker detection in the wild: Lessons learned from JSALT 2019

    Authors: Paola Garcia, Jesus Villalba, Herve Bredin, Jun Du, Diego Castan, Alejandrina Cristia, Latane Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Leo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak

    Abstract: This paper presents the problems and solutions addressed at the JSALT workshop when using a single microphone for speaker detection in adverse scenarios. The main focus was to tackle a wide range of conditions that go from meetings to wild speech. We describe the research threads we explored and a set of modules that was successful for these scenarios. The ultimate goal was to explore speaker dete… ▽ More

    Submitted 2 December, 2019; originally announced December 2019.

    Comments: Submitted to ICASSP 2020

  3. arXiv:1911.01255  [pdf, other

    eess.AS cs.SD

    pyannote.audio: neural building blocks for speaker diarization

    Authors: Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill

    Abstract: We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pre-trained models covering a wide range of domains for voice activity detection,… ▽ More

    Submitted 4 November, 2019; originally announced November 2019.

    Comments: Submitted to ICASSP 2020

  4. arXiv:1910.10655  [pdf, other

    eess.AS

    End-to-end Domain-Adversarial Voice Activity Detection

    Authors: Marvin Lavechin, Marie-Philippe Gill, Ruben Bousbib, Hervé Bredin, Leibny Paola Garcia-Perera

    Abstract: Voice activity detection is the task of detecting speech regions in a given audio stream or recording. First, we design a neural network combining trainable filters and recurrent layers to tackle voice activity detection directly from the waveform. Experiments on the challenging DIHARD dataset show that the proposed end-to-end model reaches state-of-the-art performance and outperforms a variant wh… ▽ More

    Submitted 26 May, 2020; v1 submitted 23 October, 2019; originally announced October 2019.

    Comments: submitted to Interspeech 2020

    ACM Class: I.2.7