Search | arXiv e-print repository

Deep Probability Estimation

Authors: Sheng Liu, Aakash Kaku, Weicheng Zhu, Matan Leibovich, Sreyas Mohan, Boyang Yu, Haoxiang Huang, Laure Zanna, Narges Razavian, Jonathan Niles-Weed, Carlos Fernandez-Granda

Abstract: Reliable probability estimation is of crucial importance in many real-world applications where there is inherent (aleatoric) uncertainty. Probability-estimation models are trained on observed outcomes (e.g. whether it has rained or not, or whether a patient has died or not), because the ground-truth probabilities of the events of interest are typically unknown. The problem is therefore analogous t… ▽ More Reliable probability estimation is of crucial importance in many real-world applications where there is inherent (aleatoric) uncertainty. Probability-estimation models are trained on observed outcomes (e.g. whether it has rained or not, or whether a patient has died or not), because the ground-truth probabilities of the events of interest are typically unknown. The problem is therefore analogous to binary classification, with the difference that the objective is to estimate probabilities rather than predicting the specific outcome. This work investigates probability estimation from high-dimensional data using deep neural networks. There exist several methods to improve the probabilities generated by these models but they mostly focus on model (epistemic) uncertainty. For problems with inherent uncertainty, it is challenging to evaluate performance without access to ground-truth probabilities. To address this, we build a synthetic dataset to study and compare different computable metrics. We evaluate existing methods on the synthetic data as well as on three real-world probability estimation tasks, all of which involve inherent uncertainty: precipitation forecasting from radar images, predicting cancer patient survival from histopathology images, and predicting car crashes from dashcam videos. We also give a theoretical analysis of a model for high-dimensional probability estimation which reproduces several of the phenomena evinced in our experiments. Finally, we propose a new method for probability estimation using neural networks, which modifies the training process to promote output probabilities that are consistent with empirical probabilities computed from the data. The method outperforms existing approaches on most metrics on the simulated as well as real-world data. △ Less

Submitted 11 October, 2022; v1 submitted 20 November, 2021; originally announced November 2021.

Comments: SL, AK, WZ, ML, SM contributed equally to this work; 36 pages, 17 figures, 12 tables

Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:13746-13781, 2022

arXiv:2011.15045 [pdf, other]

Unsupervised Deep Video Denoising

Authors: Dev Yashpal Sheth, Sreyas Mohan, Joshua L. Vincent, Ramon Manzorro, Peter A. Crozier, Mitesh M. Khapra, Eero P. Simoncelli, Carlos Fernandez-Granda

Abstract: Deep convolutional neural networks (CNNs) for video denoising are typically trained with supervision, assuming the availability of clean videos. However, in many applications, such as microscopy, noiseless videos are not available. To address this, we propose an Unsupervised Deep Video Denoiser (UDVD), a CNN architecture designed to be trained exclusively with noisy data. The performance of UDVD i… ▽ More Deep convolutional neural networks (CNNs) for video denoising are typically trained with supervision, assuming the availability of clean videos. However, in many applications, such as microscopy, noiseless videos are not available. To address this, we propose an Unsupervised Deep Video Denoiser (UDVD), a CNN architecture designed to be trained exclusively with noisy data. The performance of UDVD is comparable to the supervised state-of-the-art, even when trained only on a single short noisy video. We demonstrate the promise of our approach in real-world imaging applications by denoising raw video, fluorescence-microscopy and electron-microscopy data. In contrast to many current approaches to video denoising, UDVD does not require explicit motion compensation. This is advantageous because motion compensation is computationally expensive, and can be unreliable when the input data are noisy. A gradient-based analysis reveals that UDVD automatically adapts to local motion in the input noisy videos. Thus, the network learns to perform implicit motion compensation, even though it is only trained for denoising. △ Less

Submitted 19 August, 2021; v1 submitted 30 November, 2020; originally announced November 2020.

Comments: Dev and Sreyas contributed equally. To appear at 2021 IEEE/CVF International Conference on Computer Vision (ICCV). See https://sreyas-mohan.github.io/udvd/ for code and more results

arXiv:2008.03096 [pdf, other]

doi 10.21437/Interspeech.2020-1822

Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning

Authors: Devang S Ram Mohan, Raphael Lenain, Lorenzo Foglianti, Tian Huey Teh, Marlene Staib, Alexandra Torresquintero, Jiameng Gao

Abstract: Modern approaches to text to speech require the entire input character sequence to be processed before any audio is synthesised. This latency limits the suitability of such models for time-sensitive tasks like simultaneous interpretation. Interleaving the action of reading a character with that of synthesising audio reduces this latency. However, the order of this sequence of interleaved actions v… ▽ More Modern approaches to text to speech require the entire input character sequence to be processed before any audio is synthesised. This latency limits the suitability of such models for time-sensitive tasks like simultaneous interpretation. Interleaving the action of reading a character with that of synthesising audio reduces this latency. However, the order of this sequence of interleaved actions varies across sentences, which raises the question of how the actions should be chosen. We propose a reinforcement learning based framework to train an agent to make this decision. We compare our performance against that of deterministic, rule-based systems. Our results demonstrate that our agent successfully balances the trade-off between the latency of audio generation and the quality of synthesised audio. More broadly, we show that neural sequence-to-sequence models can be adapted to run in an incremental manner. △ Less

Submitted 7 August, 2020; originally announced August 2020.

Comments: To be published in Interspeech 2020. 5 pages, 4 figures

arXiv:2002.04019 [pdf, other]

Be Like Water: Robustness to Extraneous Variables Via Adaptive Feature Normalization

Authors: Aakash Kaku, Sreyas Mohan, Avinash Parnandi, Heidi Schambra, Carlos Fernandez-Granda

Abstract: Extraneous variables are variables that are irrelevant for a certain task, but heavily affect the distribution of the available data. In this work, we show that the presence of such variables can degrade the performance of deep-learning models. We study three datasets where there is a strong influence of known extraneous variables: classification of upper-body movements in stroke patients, annotat… ▽ More Extraneous variables are variables that are irrelevant for a certain task, but heavily affect the distribution of the available data. In this work, we show that the presence of such variables can degrade the performance of deep-learning models. We study three datasets where there is a strong influence of known extraneous variables: classification of upper-body movements in stroke patients, annotation of surgical activities, and recognition of corrupted images. Models trained with batch normalization learn features that are highly dependent on the extraneous variables. In batch normalization, the statistics used to normalize the features are learned from the training set and fixed at test time, which produces a mismatch in the presence of varying extraneous variables. We demonstrate that estimating the feature statistics adaptively during inference, as in instance normalization, addresses this issue, producing normalized features that are more robust to changes in the extraneous variables. This results in a significant gain in performance for different network architectures and choices of feature statistics. △ Less

Submitted 25 February, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

Comments: Aakash and Sreyas contributed equally

arXiv:1911.07335 [pdf, other]

doi 10.1007/s10994-020-05897-1

Using Error Decay Prediction to Overcome Practical Issues of Deep Active Learning for Named Entity Recognition

Authors: Haw-Shiuan Chang, Shankar Vembu, Sunil Mohan, Rheeya Uppaal, Andrew McCallum

Abstract: Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks. However, they exhibit several weaknesses in practice, including (a) inability to use uncertainty sampling with black-box models, (b) lack of robustness to labeling noise, and (c) lack of transparency. In response, we propose a transparent batch active sampling framework by estimati… ▽ More Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks. However, they exhibit several weaknesses in practice, including (a) inability to use uncertainty sampling with black-box models, (b) lack of robustness to labeling noise, and (c) lack of transparency. In response, we propose a transparent batch active sampling framework by estimating the error decay curves of multiple feature-defined subsets of the data. Experiments on four named entity recognition (NER) tasks demonstrate that the proposed methods significantly outperform diversification-based methods for black-box NER taggers, and can make the sampling process more robust to labeling noise when combined with uncertainty-based methods. Furthermore, the analysis of experimental results sheds light on the weaknesses of different active sampling strategies, and when traditional uncertainty-based or diversification-based methods can be expected to work well. △ Less

Submitted 20 July, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

Comments: This is a pre-print of an article published in Springer Machine Learning journal. The final authenticated version is available online at: https://doi.org/10.1007/s10994-020-05897-1

arXiv:1908.07842 [pdf, other]

doi 10.1007/978-3-030-27272-2_3

Real-time Person Re-identification at the Edge: A Mixed Precision Approach

Authors: Mohammadreza Baharani, Shrey Mohan, Hamed Tabkhi

Abstract: A critical part of multi-person multi-camera tracking is person re-identification (re-ID) algorithm, which recognizes and retains identities of all detected unknown people throughout the video stream. Many re-ID algorithms today exemplify state of the art results, but not much work has been done to explore the deployment of such algorithms for computation and power constrained real-time scenarios.… ▽ More A critical part of multi-person multi-camera tracking is person re-identification (re-ID) algorithm, which recognizes and retains identities of all detected unknown people throughout the video stream. Many re-ID algorithms today exemplify state of the art results, but not much work has been done to explore the deployment of such algorithms for computation and power constrained real-time scenarios. In this paper, we study the effect of using a light-weight model, MobileNet-v2 for re-ID and investigate the impact of single (FP32) precision versus half (FP16) precision for training on the server and inference on the edge nodes. We further compare the results with the baseline model which uses ResNet-50 on state of the art benchmarks including CUHK03, Market-1501, and Duke-MTMC. The MobileNet-V2 mixed precision training method can improve both inference throughput on the edge node, and training time on server $3.25\times$ reaching to 27.77fps and $1.75\times$, respectively and decreases power consumption on the edge node by $1.45\times$, while it deteriorates accuracy only 5.6\% in respect to ResNet-50 single precision on the average for three different datasets. The code and pre-trained networks are publicly available at https://github.com/TeCSAR-UNCC/person-reid. △ Less

Submitted 19 August, 2019; originally announced August 2019.

Comments: This is a pre-print of an article published in International Conference on Image Analysis and Recognition (ICIAR 2019), Lecture Notes in Computer Science. The final authenticated version is available online at https://doi.org/10.1007/978-3-030-27272-2_3

Journal ref: International Conference on Image Analysis and Recognition (ICIAR 2019), Lecture Notes in Computer Science

arXiv:1906.05478 [pdf, other]

Robust and interpretable blind image denoising via bias-free convolutional neural networks

Authors: Sreyas Mohan, Zahra Kadkhodaie, Eero P. Simoncelli, Carlos Fernandez-Granda

Abstract: Deep convolutional networks often append additive constant ("bias") terms to their convolution operations, enabling a richer repertoire of functional map**s. Biases are also used to facilitate training, by subtracting mean response over batches of training images (a component of "batch normalization"). Recent state-of-the-art blind denoising methods (e.g., DnCNN) seem to require these terms for… ▽ More Deep convolutional networks often append additive constant ("bias") terms to their convolution operations, enabling a richer repertoire of functional map**s. Biases are also used to facilitate training, by subtracting mean response over batches of training images (a component of "batch normalization"). Recent state-of-the-art blind denoising methods (e.g., DnCNN) seem to require these terms for their success. Here, however, we show that these networks systematically overfit the noise levels for which they are trained: when deployed at noise levels outside the training range, performance degrades dramatically. In contrast, a bias-free architecture -- obtained by removing the constant terms in every layer of the network, including those used for batch normalization-- generalizes robustly across noise levels, while preserving state-of-the-art performance within the training range. Locally, the bias-free network acts linearly on the noisy image, enabling direct analysis of network behavior via standard linear-algebraic tools. These analyses provide interpretations of network functionality in terms of nonlinear adaptive filtering, and projection onto a union of low-dimensional subspaces, connecting the learning-based method to more traditional denoising methodology. △ Less

Submitted 8 February, 2020; v1 submitted 13 June, 2019; originally announced June 2019.

Comments: Published as conference paper in ICLR 2020

arXiv:1906.00823 [pdf, other]

Data-driven Estimation of Sinusoid Frequencies

Authors: Gautier Izacard, Sreyas Mohan, Carlos Fernandez-Granda

Abstract: Frequency estimation is a fundamental problem in signal processing, with applications in radar imaging, underwater acoustics, seismic imaging, and spectroscopy. The goal is to estimate the frequency of each component in a multisinusoidal signal from a finite number of noisy samples. A recent machine-learning approach uses a neural network to output a learned representation with local maxima at the… ▽ More Frequency estimation is a fundamental problem in signal processing, with applications in radar imaging, underwater acoustics, seismic imaging, and spectroscopy. The goal is to estimate the frequency of each component in a multisinusoidal signal from a finite number of noisy samples. A recent machine-learning approach uses a neural network to output a learned representation with local maxima at the position of the frequency estimates. In this work, we propose a novel neural-network architecture that produces a significantly more accurate representation, and combine it with an additional neural-network module trained to detect the number of frequencies. This yields a fast, fully-automatic method for frequency estimation that achieves state-of-the-art results. In particular, it outperforms existing techniques by a substantial margin at medium-to-high noise levels. △ Less

Submitted 3 February, 2021; v1 submitted 3 June, 2019; originally announced June 2019.

Showing 1–8 of 8 results for author: Mohan, S