Search | arXiv e-print repository

An Estimator for the Sensitivity to Perturbations of Deep Neural Networks

Authors: Naman Maheshwari, Nicholas Malaya, Scott Moe, Jaydeep P. Kulkarni, Sudhanva Gurumurthi

Abstract: For Deep Neural Networks (DNNs) to become useful in safety-critical applications, such as self-driving cars and disease diagnosis, they must be stable to perturbations in input and model parameters. Characterizing the sensitivity of a DNN to perturbations is necessary to determine minimal bit-width precision that may be used to safely represent the network. However, no general result exists that is capable of predicting the sensitivity of a given DNN to round-off error, noise, or other perturbations in input. This paper derives an estimator that can predict such quantities. The estimator is derived via inequalities and matrix norms, and the resulting quantity is roughly analogous to a condition number for the entire neural network. An approximation of the estimator is tested on two Convolutional Neural Networks, AlexNet and VGG-19, using the ImageNet dataset. For each of these networks, the tightness of the estimator is explored via random perturbations and adversarial attacks.

Submitted 24 July, 2023; originally announced July 2023.

Comments: Actual work and paper concluded in January 2019
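
The abstract above describes a norm-based quantity that bounds how much a network's output can move under a perturbation. The sketch below is not the paper's estimator; it illustrates the general idea under stated assumptions: a small fully connected ReLU network without biases, the infinity norm, and the standard product-of-layer-norms Lipschitz bound. All function names and the toy network are hypothetical.

```python
import random

def mat_vec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(x):
    """Elementwise ReLU, which is 1-Lipschitz in the infinity norm."""
    return [max(0.0, v) for v in x]

def inf_norm_matrix(W):
    """Induced infinity norm of a matrix: the largest absolute row sum."""
    return max(sum(abs(w) for w in row) for row in W)

def inf_norm_vector(x):
    return max(abs(v) for v in x)

def forward(layers, x):
    """Toy network (assumption): ReLU after every layer except the last."""
    for W in layers[:-1]:
        x = relu(mat_vec(W, x))
    return mat_vec(layers[-1], x)

def lipschitz_bound(layers):
    """Product of layer norms: an upper bound on output change per unit input change."""
    bound = 1.0
    for W in layers:
        bound *= inf_norm_matrix(W)
    return bound

def max_observed_ratio(layers, x, eps, trials, rng):
    """Largest ||f(x + d) - f(x)|| / ||d|| seen over random perturbations d."""
    base = forward(layers, x)
    worst = 0.0
    for _ in range(trials):
        delta = [rng.uniform(-eps, eps) for _ in x]
        out = forward(layers, [a + d for a, d in zip(x, delta)])
        diff = inf_norm_vector([a - b for a, b in zip(out, base)])
        size = inf_norm_vector(delta)
        if size > 0:
            worst = max(worst, diff / size)
    return worst

rng = random.Random(0)
layers = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(4)] for _ in range(3)]
x = [rng.uniform(-1, 1) for _ in range(4)]
bound = lipschitz_bound(layers)
observed = max_observed_ratio(layers, x, 1e-3, 200, rng)
print(f"bound={bound:.4f} observed={observed:.4f}")
```

Random perturbations usually fall well short of the bound, which is why the abstract also mentions adversarial attacks when probing tightness.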

arXiv:2201.11964 [pdf, other]

Dynamic Temporal Reconciliation by Reinforcement learning

Authors: Himanshi Charotia, Abhishek Garg, Gaurav Dhama, Naman Maheshwari

Abstract: Planning based on long and short term time series forecasts is a common practice across many industries. In this context, temporal aggregation and reconciliation techniques have been useful in improving forecasts, reducing model uncertainty, and providing a coherent forecast across different time horizons. However, an underlying assumption spanning all these techniques is the complete availability of data across all levels of the temporal hierarchy. While this offers mathematical convenience, most of the time low frequency data is only partially complete and is not available while forecasting. On the other hand, high frequency data can change significantly in a scenario like the COVID pandemic, and this change can be used to improve forecasts that would otherwise diverge significantly from long term actuals. We propose a dynamic reconciliation method whereby we formulate the problem of informing low frequency forecasts based on high frequency actuals as a Markov Decision Process (MDP), allowing for the fact that we do not have complete information about the dynamics of the process. This allows us to have the best long term estimates based on the most recent data available even if the low frequency cycles have only been partially completed. The MDP has been solved using a Time Differenced Reinforcement learning (TDRL) approach with customizable actions and improves the long term forecasts dramatically as compared to relying solely on historical low frequency data.
The result also underscores the fact that while low frequency forecasts can improve the high frequency forecasts, as mentioned in the temporal reconciliation literature (based on the assumption that low frequency forecasts have a lower noise to signal ratio), the high frequency forecasts can also be used to inform the low frequency forecasts.

Submitted 28 January, 2022; originally announced January 2022.

arXiv:2112.08052 [pdf, other]

Optimal Latent Space Forecasting for Large Collections of Short Time Series Using Temporal Matrix Factorization

Authors: Himanshi Charotia, Abhishek Garg, Gaurav Dhama, Naman Maheshwari

Abstract: In the context of time series forecasting, it is a common practice to evaluate multiple methods and choose one of these methods or an ensemble for producing the best forecasts. However, choosing among different ensembles over multiple methods remains a challenging task that undergoes a combinatorial explosion as the number of methods increases. In the context of demand forecasting or revenue forecasting, this challenge is further exacerbated by a large number of time series as well as limited historical data points available due to changing business context. Although deep learning forecasting methods aim to simultaneously forecast large collections of time series, they become challenging to apply in such scenarios due to the limited history available and might not yield desirable results. We propose a framework for forecasting short high-dimensional time series data by combining low-rank temporal matrix factorization and optimal model selection on latent time series using cross-validation. We demonstrate that forecasting the latent factors leads to significant performance gains as compared to directly applying different univariate models on time series. Performance has been validated on a truncated version of the M4 monthly dataset which contains time series data from multiple domains showing the general applicability of the method. Moreover, it is amenable to incorporating the analyst view of the future owing to the low number of latent factors which is usually impractical when applying forecasting methods directly to high dimensional datasets.

Submitted 15 December, 2021; originally announced December 2021.

arXiv:1805.08691 [pdf, other]

Highly Efficient 8-bit Low Precision Inference of Convolutional Neural Networks with IntelCaffe

Authors: Jiong Gong, Haihao Shen, Guoming Zhang, Xiaoli Liu, Shane Li, Ge **, Niharika Maheshwari, Evarist Fomenko, Eden Segal

Abstract: High throughput and low latency inference of deep neural networks are critical for the deployment of deep learning applications. This paper presents the efficient inference techniques of IntelCaffe, the first Intel optimized deep learning framework that supports efficient 8-bit low precision inference and model optimization techniques of convolutional neural networks on Intel Xeon Scalable Processors. The 8-bit optimized model is automatically generated with a calibration process from an FP32 model without the need for fine-tuning or retraining. We show that the inference throughput and latency with ResNet-50, Inception-v3 and SSD are improved by 1.38X-2.9X and 1.35X-3X respectively with negligible accuracy loss from the IntelCaffe FP32 baseline and by 56X-75X and 26X-37X from BVLC Caffe. All these techniques have been open-sourced on IntelCaffe GitHub, and the artifact is provided to reproduce the result on Amazon AWS Cloud.

Submitted 4 May, 2018; originally announced May 2018.

Comments: 1st Reproducible Tournament on Pareto-efficient Image Classification, co-held with ASPLOS 2018
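
The abstract above mentions generating an 8-bit model from an FP32 model through calibration, without retraining. The sketch below is not IntelCaffe's scheme, which the abstract does not specify; it assumes the simplest common variant, symmetric max-absolute-value scaling to the signed range [-127, 127]. The function names and sample values are hypothetical.

```python
def calibrate_scale(samples):
    """Pick a scale so the largest calibration magnitude maps to 127 (assumed scheme)."""
    max_abs = max(abs(v) for v in samples)
    return max_abs / 127.0 if max_abs > 0 else 1.0

def quantize(values, scale):
    """Round each FP32 value to the nearest step and clip to the signed 8-bit range."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def dequantize(quantized, scale):
    """Map 8-bit integers back to approximate floating-point values."""
    return [q * scale for q in quantized]

# Hypothetical calibration data standing in for observed FP32 activations.
calibration = [0.02, -1.27, 0.5, 0.9, -0.33]
scale = calibrate_scale(calibration)
q = quantize(calibration, scale)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(calibration, restored))
print(q, max_error)
```

For values inside the calibrated range, the round-trip error is at most half a quantization step, which is one reason accuracy loss can stay small when the calibration data is representative.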

arXiv:1306.1889 [pdf]

An Improved Structure Of Reversible Adder And Subtractor

Authors: Aakash Gupta, Pradeep Singla, Jitendra Gupta, Nitin Maheshwari

Abstract: In today's world, every day a new technology which is faster, smaller and more complex than its predecessor is being developed. The increased number of transistors packed onto a chip of a conventional system results in increased power consumption, which is why reversible logic has drawn the attention of researchers due to its lower heat dissipation. Reversible logic can be applied in areas such as quantum computing, optical computing, quantum dot cellular automata, low power VLSI circuits, and DNA computing. This paper presents reversible combinational circuits for an adder, a subtractor and a parity preserving subtractor. The circuits suggested in this paper are designed using Feynman, Double Feynman and MUX gates and are better than the existing ones in the literature in terms of quantum cost, garbage outputs and total logical calculations.

Submitted 8 June, 2013; originally announced June 2013.

Journal ref: International Journal of Electronics and Computer Science Engineering, Vol 2, No. 2, pp712-718, June 2013
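
The abstract above builds its circuits from Feynman and Double Feynman gates. The sketch below does not reproduce the paper's adder or subtractor; it only uses the standard definitions of the two gates to check, by enumeration, that each is reversible (a one-to-one mapping of inputs to outputs) and whether it preserves parity. The helper names are hypothetical.

```python
from itertools import product

def feynman(a, b):
    """Feynman (controlled-NOT) gate: (A, B) -> (A, A xor B)."""
    return a, a ^ b

def double_feynman(a, b, c):
    """Double Feynman gate: (A, B, C) -> (A, A xor B, A xor C)."""
    return a, a ^ b, a ^ c

def is_reversible(gate, n_inputs):
    """A gate is reversible if distinct inputs always give distinct outputs."""
    inputs = list(product((0, 1), repeat=n_inputs))
    outputs = {gate(*bits) for bits in inputs}
    return len(outputs) == len(inputs)

def preserves_parity(gate, n_inputs):
    """Parity preserving: XOR of the inputs equals XOR of the outputs."""
    return all(
        sum(bits) % 2 == sum(gate(*bits)) % 2
        for bits in product((0, 1), repeat=n_inputs)
    )

print(is_reversible(feynman, 2), preserves_parity(feynman, 2))
print(is_reversible(double_feynman, 3), preserves_parity(double_feynman, 3))
```

Both gates are reversible, but only the Double Feynman gate preserves parity, which is consistent with its use in parity preserving designs.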

arXiv:1011.6481 [pdf, ps, other]

A near optimal algorithm for finding Euclidean shortest path in polygonal domain

Authors: Rajasekhar Inkulu, Sanjiv Kapoor, S. N. Maheshwari

Abstract: We present an algorithm to find a {\it Euclidean Shortest Path} from a source vertex $s$ to a sink vertex $t$ in the presence of obstacles in $\Re^2$. Our algorithm takes $O(T+m(\lg{m})(\lg{n}))$ time and $O(n)$ space. Here, $O(T)$ is the time to triangulate the polygonal region, $m$ is the number of obstacles, and $n$ is the number of vertices. This bound is close to the known lower bound of $O(n+m\lg{m})$ time and $O(n)$ space. Our approach involves progressing the shortest path wavefront as in the continuous Dijkstra-type method, and confining its expansion to regions of interest.

Submitted 30 November, 2010; originally announced November 2010.

Comments: 50 pages

Showing 1–6 of 6 results for author: Maheshwari, N