-
DINO-VITS: Data-Efficient Zero-Shot TTS with Self-Supervised Speaker Verification Loss for Noise Robustness
Authors:
Vikentii Pankov,
Valeria Pronina,
Alexander Kuzmin,
Maksim Borisov,
Nikita Usoltsev,
Xingshan Zeng,
Alexander Golubkov,
Nikolai Ermolenko,
Aleksandra Shirshova,
Yulia Matveeva
Abstract:
We address the noise-robustness problem of zero-shot TTS systems by proposing dual-objective training for the speaker encoder using a self-supervised DINO loss. This approach enhances the speaker encoder with the speech synthesis objective, capturing a wider range of speech characteristics beneficial for voice cloning. At the same time, the DINO objective improves speaker representation learning, ensuring robustness to noise and speaker discriminability. Experiments demonstrate significant improvements in subjective metrics under both clean and noisy conditions, outperforming traditional speaker-encoder-based TTS systems. Additionally, we explore training zero-shot TTS on noisy, unlabeled data. Our two-stage training strategy, leveraging self-supervised speech models to distinguish between noisy and clean speech, shows notable advances in similarity and naturalness, especially with noisy training datasets, compared to the ASR-transcription-based approach.
Submitted 18 June, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Deep learning Framework for Mobile Microscopy
Authors:
Anatasiia Kornilova,
Mikhail Salnikov,
Olga Novitskaya,
Maria Begicheva,
Egor Sevriugov,
Kirill Shcherbakov,
Valeriya Pronina,
Dmitry V. Dylov
Abstract:
Mobile microscopy is a promising technology to assist and accelerate disease diagnostics, but its widespread adoption is hindered by the mediocre quality of acquired images. Although some paired image translation and super-resolution approaches for mobile microscopy have emerged, a set of essential challenges, necessary for automating it in a high-throughput setting, have yet to be addressed. Issues such as in-focus/out-of-focus classification, fast scanning deblurring, and focus stacking all have specific peculiarities when the data are recorded using a mobile device. In this work, we aspire to create a comprehensive pipeline by connecting a set of methods purposely tuned to mobile microscopy: (1) a CNN model for stable in-focus/out-of-focus classification, (2) a modified DeblurGAN architecture for image deblurring, and (3) a FuseGAN model for combining in-focus parts from multiple images to boost the detail. We discuss the limitations of the existing solutions developed for professional clinical microscopes, propose corresponding improvements, and compare to other state-of-the-art mobile analytics solutions.
Submitted 18 February, 2021; v1 submitted 27 July, 2020;
originally announced July 2020.
-
Microscopy Image Restoration with Deep Wiener-Kolmogorov filters
Authors:
Valeriya Pronina,
Filippos Kokkinos,
Dmitry V. Dylov,
Stamatios Lefkimmiatis
Abstract:
Microscopy is a powerful visualization tool in biology, enabling the study of cells, tissues, and fundamental biological processes; yet the observed images typically suffer from blur and background noise. In this work, we propose a unifying framework of algorithms for Gaussian image deblurring and denoising. These algorithms are based on deep learning techniques for the design of learnable regularizers integrated into the Wiener-Kolmogorov filter. Our extensive experiments show that the proposed approach achieves superior image reconstruction quality and surpasses solutions that rely either on deep learning or on optimization schemes alone. Augmented with the variance stabilizing transformation, the proposed reconstruction pipeline can also be successfully applied to the problem of Poisson image deblurring, surpassing the state-of-the-art methods. Moreover, several variants of the proposed framework demonstrate competitive performance at low computational complexity, which is of high importance for real-time imaging applications.
Submitted 14 May, 2020; v1 submitted 25 November, 2019;
originally announced November 2019.