-
A Principled Hierarchical Deep Learning Approach to Joint Image Compression and Classification
Authors:
Siyu Qi,
Achintha Wijesinghe,
Lahiru D. Chamain,
Zhi Ding
Abstract:
Among applications of deep learning (DL) involving low-cost sensors, remote image classification involves a physical channel that separates edge sensors and cloud classifiers. Traditional DL models must be divided between an encoder for the sensor and the decoder + classifier at the edge server. An important challenge is to effectively train such distributed models when the connecting channels have limited rate/capacity. Our goal is to optimize DL models such that the encoder latent requires low channel bandwidth while still delivering feature information for high classification accuracy. This work proposes a three-step joint learning strategy to guide encoders to extract features that are compact, discriminative, and amenable to common augmentations/transformations. We optimize latent dimension through an initial screening phase before end-to-end (E2E) training. To obtain an adjustable bit rate via a single pre-deployed encoder, we apply entropy-based quantization and/or manual truncation on the latent representations. Tests show that our proposed method achieves accuracy improvement of up to 1.5% on CIFAR-10 and 3% on CIFAR-100 over conventional E2E cross-entropy training.
Submitted 30 October, 2023;
originally announced October 2023.
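The abstract above mentions adjusting the bit rate of a fixed encoder by quantizing and/or truncating its latent representation. The following is only a minimal sketch of that general idea, not the authors' method: the helper names (`quantize`, `truncate`, `empirical_entropy_bits`), the uniform step size, and the toy latent vector are all assumptions made for illustration.

```python
import math
from collections import Counter


def truncate(latent, keep):
    """Keep only the first `keep` latent dimensions (hypothetical manual truncation)."""
    return latent[:keep]


def quantize(latent, step):
    """Uniform scalar quantization: map each value to an integer bin index."""
    return [round(v / step) for v in latent]


def empirical_entropy_bits(symbols):
    """Empirical Shannon entropy of the symbol sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy latent vector (made-up values, not taken from the paper).
latent = [0.12, -0.48, 0.51, 0.09, -0.02, 0.47, -0.52, 0.01]

full_symbols = quantize(latent, step=0.5)
short_symbols = quantize(truncate(latent, keep=4), step=0.5)

# Rough bit-cost estimate: symbols times entropy per symbol.
full_bits = len(full_symbols) * empirical_entropy_bits(full_symbols)
short_bits = len(short_symbols) * empirical_entropy_bits(short_symbols)
print(full_bits, short_bits)
```

In this toy setting, truncating the latent from eight to four dimensions halves the estimated bit cost. A coarser `step` would be the other way to lower the rate, at the price of losing feature detail.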
-
End-to-end optimized image compression for multiple machine tasks
Authors:
Lahiru D. Chamain,
Fabien Racapé,
Jean Bégaint,
Akshay Pushparaja,
Simon Feltman
Abstract:
An increasing share of captured images and videos are transmitted for storage and remote analysis by computer vision algorithms, rather than to be viewed by humans. Contrary to traditional standard codecs with engineered tools, neural network based codecs can be trained end-to-end to optimally compress images with respect to a target rate and any given differentiable performance metric. Although it is possible to train such compression tools to achieve better rate-accuracy performance for a particular computer vision task, it could be practical and relevant to re-use the compressed bit-stream for multiple machine tasks. For this purpose, we introduce 'Connectors' that are inserted between the decoder and the task algorithms to enable a direct transformation of the compressed content, which was previously optimized for a specific task, to multiple other machine tasks. We demonstrate the effectiveness of the proposed method by achieving significant rate-accuracy performance improvement for both image classification and object segmentation, using the same bit-stream, originally optimized for object detection.
Submitted 6 March, 2021;
originally announced March 2021.
-
End-to-end optimized image compression for machines, a study
Authors:
Lahiru D. Chamain,
Fabien Racapé,
Jean Bégaint,
Akshay Pushparaja,
Simon Feltman
Abstract:
An increasing share of image and video content is analyzed by machines rather than viewed by humans, and therefore it becomes relevant to optimize codecs for such applications where the analysis is performed remotely. Unfortunately, conventional coding tools are challenging to specialize for machine tasks as they were originally designed for human perception. However, neural network based codecs can be jointly trained end-to-end with any convolutional neural network (CNN)-based task model. In this paper, we propose to study an end-to-end framework enabling efficient image compression for remote machine task analysis, using a chain composed of a compression module and a task algorithm that can be optimized end-to-end. We show that it is possible to significantly improve the task accuracy when fine-tuning jointly the codec and the task networks, especially at low bit-rates. Depending on training or deployment constraints, selective fine-tuning can be applied only on the encoder, decoder or task network and still achieve rate-accuracy improvements over an off-the-shelf codec and task network. Our results also demonstrate the flexibility of end-to-end pipelines for practical applications.
Submitted 10 November, 2020;
originally announced November 2020.
-
Faster and Accurate Classification for JPEG2000 Compressed Images in Networked Applications
Authors:
Lahiru D. Chamain,
Zhi Ding
Abstract:
JPEG2000 (j2k) is a highly popular format for image and video compression. With the rapidly growing applications of cloud based image classification, most existing j2k-compatible schemes would stream compressed color images from the source before reconstruction at the processing center as inputs to deep CNNs. We propose to remove the computationally costly reconstruction step by training a deep CNN image classifier using the CDF 9/7 Discrete Wavelet Transformed (DWT) coefficients directly extracted from j2k-compressed images. We demonstrate additional computation savings by utilizing a shallower CNN to achieve classification of good accuracy in the DWT domain. Furthermore, we show that traditional augmentation transforms such as flipping/shifting are ineffective in the DWT domain and present different augmentation transformations to achieve more accurate classification without any additional cost. This way, faster and more accurate classification is possible for j2k encoded images without image reconstruction. Through experiments on CIFAR-10 and Tiny ImageNet data sets, we show that the performance of the proposed solution is consistent for image transmission over limited channel bandwidth.
Submitted 4 September, 2019;
originally announced September 2019.
-
Eigenvalue Based Detection of a Signal in Colored Noise: Finite and Asymptotic Analyses
Authors:
Lahiru D. Chamain,
Prathapasinghe Dharmawansa,
Saman Atapattu,
Chintha Tellambura
Abstract:
Signal detection in colored noise with an unknown covariance matrix has a myriad of applications in diverse scientific/engineering fields. The test statistic is the largest generalized eigenvalue (l.g.e.) of the whitened sample covariance matrix, which is constructed via $m$-dimensional $p$ signal-plus-noise samples and $m$-dimensional $n$ noise-only samples. A finite dimensional characterization of this statistic under the alternative hypothesis has hitherto been an open problem. We answer this problem by deriving the cumulative distribution function (c.d.f.) of this l.g.e. via the powerful orthogonal polynomial approach, exploiting the deformed Jacobi unitary ensemble (JUE). Two special cases and an asymptotic version of the c.d.f. are also derived. With this new c.d.f., we comprehensively analyze the receiver operating characteristics (ROC) of the detector. Importantly, when the noise-only covariance matrix is nearly rank deficient (i.e., $m=n$), we show that (a) when $m$ and $p$ increase such that $m/p$ is fixed, at each fixed signal-to-noise ratio (SNR), there exists an optimal ROC profile. We also establish a tight approximation of it; and (b) asymptotically, reliable signal detection is always possible (no matter how weak the signal is) if SNR scales with $m$.
Submitted 7 February, 2019;
originally announced February 2019.
-
Detection of a Signal in Colored Noise: A Random Matrix Theory Based Analysis
Authors:
Lahiru D. Chamain,
Prathapasinghe Dharmawansa,
Saman Atapattu,
Chintha Tellambura
Abstract:
This paper investigates the classical statistical signal processing problem of detecting a signal in the presence of colored noise with an unknown covariance matrix. In particular, we consider a scenario where m-dimensional p possible signal-plus-noise samples and m-dimensional n noise-only samples are available at the detector. Then the presence of a signal can be detected using the largest generalized eigenvalue (l.g.e.) of the so-called whitened sample covariance matrix. This amounts to statistically characterizing the maximum eigenvalue of the deformed Jacobi unitary ensemble (JUE). To this end, we employ the powerful orthogonal polynomial approach to determine a new finite dimensional expression for the cumulative distribution function (c.d.f.) of the l.g.e. of the deformed JUE. This new c.d.f. expression facilitates the further analysis of the receiver operating characteristics (ROC) of the detector. It turns out that, for m=n, when m and p increase such that m/p attains a fixed value, there exists an optimal ROC profile corresponding to each fixed signal-to-noise ratio (SNR). In this respect, we have established a tight approximation for the corresponding optimal ROC profile.
Submitted 28 January, 2019;
originally announced January 2019.