
Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding

Authors: Eric Lei, Hamed Hassani, Shirin Saeedi Bidokhti

Abstract: Neural compression has brought tremendous progress in designing lossy compressors with good rate-distortion (RD) performance at low complexity. Thus far, neural compression design involves transforming the source to a latent vector, which is then rounded to integers and entropy coded. While this approach has been shown to be optimal in a one-shot sense on certain sources, we show that it is highly sub-optimal on i.i.d. sequences, and in fact always recovers scalar quantization of the original source sequence. We demonstrate that the sub-optimality is due to the choice of quantization scheme in the latent space, and not the transform design. By employing lattice quantization instead of scalar quantization in the latent space, we demonstrate that Lattice Transform Coding (LTC) is able to recover optimal vector quantization at various dimensions and approach the asymptotically-achievable rate-distortion function at reasonable complexity. On general vector sources, LTC improves upon standard neural compressors in one-shot coding performance. LTC also enables neural compressors that perform block coding on i.i.d. vector sources, which yields coding gain over optimal one-shot coding.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2310.09890 [pdf, other]

Score-Based Methods for Discrete Optimization in Deep Learning

Authors: Eric Lei, Arman Adibi, Hamed Hassani

Abstract: Discrete optimization problems often arise in deep learning tasks, despite the fact that neural networks typically operate on continuous data. One class of these problems involve objective functions which depend on neural networks, but optimization variables which are discrete. Although the discrete optimization literature provides efficient algorithms, they are still impractical in these settings due to the high cost of an objective function evaluation, which involves a neural network forward-pass. In particular, they require $O(n)$ complexity per iteration, but real data such as point clouds have values of $n$ in thousands or more. In this paper, we investigate a score-based approximation framework to solve such problems. This framework uses a score function as a proxy for the marginal gain of the objective, leveraging embeddings of the discrete variables and speed of auto-differentiation frameworks to compute backward-passes in parallel. We experimentally demonstrate, in adversarial set classification tasks, that our method achieves a superior trade-off in terms of speed and solution quality compared to heuristic methods.

Submitted 15 October, 2023; originally announced October 2023.

arXiv:2308.15413 [pdf, other]

WrappingNet: Mesh Autoencoder via Deep Sphere Deformation

Authors: Eric Lei, Muhammad Asad Lodhi, Jiahao Pang, Junghyun Ahn, Dong Tian

Abstract: There have been recent efforts to learn more meaningful representations via fixed length codewords from mesh data, since a mesh serves as a complete model of underlying 3D shape compared to a point cloud. However, the mesh connectivity presents new difficulties when constructing a deep learning pipeline for meshes. Previous mesh unsupervised learning approaches typically assume category-specific templates, e.g., human face/body templates. It restricts the learned latent codes to only be meaningful for objects in a specific category, so the learned latent spaces are unable to be used across different types of objects. In this work, we present WrappingNet, the first mesh autoencoder enabling general mesh unsupervised learning over heterogeneous objects. It introduces a novel base graph in the bottleneck dedicated to representing mesh connectivity, which is shown to facilitate learning a shared latent space representing object shape. The superiority of WrappingNet mesh learning is further demonstrated via improved reconstruction quality and competitive classification compared to point cloud learning, as well as latent interpolation between meshes of different categories.

Submitted 29 August, 2023; originally announced August 2023.

arXiv:2307.01944 [pdf, other]

Text + Sketch: Image Compression at Ultra Low Rates

Authors: Eric Lei, Yiğit Berkay Uslu, Hamed Hassani, Shirin Saeedi Bidokhti

Abstract: Recent advances in text-to-image generative models provide the ability to generate high-quality images from short text descriptions. These foundation models, when pre-trained on billion-scale datasets, are effective for various downstream tasks with little or no further training. A natural question to ask is how such models may be adapted for image compression. We investigate several techniques in which the pre-trained models can be directly used to implement compression schemes targeting novel low rate regimes. We show how text descriptions can be used in conjunction with side information to generate high-fidelity reconstructions that preserve both semantics and spatial structure of the original. We demonstrate that at very low bit-rates, our method can significantly improve upon learned compressors in terms of perceptual and semantic fidelity, despite no end-to-end training.

Submitted 4 July, 2023; originally announced July 2023.

Comments: ICML 2023 Neural Compression Workshop

arXiv:2307.00246 [pdf, other]

On a Relation Between the Rate-Distortion Function and Optimal Transport

Authors: Eric Lei, Hamed Hassani, Shirin Saeedi Bidokhti

Abstract: We discuss a relationship between rate-distortion and optimal transport (OT) theory, even though they seem to be unrelated at first glance. In particular, we show that a function defined via an extremal entropic OT distance is equivalent to the rate-distortion function. We numerically verify this result as well as previous results that connect the Monge and Kantorovich problems to optimal scalar quantization. Thus, we unify solving scalar quantization and rate-distortion functions in an alternative fashion by using their respective optimal transport solvers.

Submitted 1 July, 2023; originally announced July 2023.

Comments: Published as a Tiny Paper at ICLR 2023; invited to present

arXiv:2305.16416 [pdf, other]

Federated Neural Compression Under Heterogeneous Data

Authors: Eric Lei, Hamed Hassani, Shirin Saeedi Bidokhti

Abstract: We discuss a federated learned compression problem, where the goal is to learn a compressor from real-world data which is scattered across clients and may be statistically heterogeneous, yet share a common underlying representation. We propose a distributed source model that encompasses both characteristics, and naturally suggests a compressor architecture that uses analysis and synthesis transforms shared by clients. Inspired by personalized federated learning methods, we employ an entropy model that is personalized to each client. This allows for a global latent space to be learned across clients, and personalized entropy models that adapt to the clients' latent distributions. We show empirically that this strategy outperforms solely local methods, which indicates that learned compression also benefits from a shared global representation in statistically heterogeneous federated settings.

Submitted 25 May, 2023; originally announced May 2023.

Comments: ISIT 2023

arXiv:2204.01612 [pdf, other]

doi 10.1109/JSAIT.2023.3273467

Neural Estimation of the Rate-Distortion Function With Applications to Operational Source Coding

Authors: Eric Lei, Hamed Hassani, Shirin Saeedi Bidokhti

Abstract: A fundamental question in designing lossy data compression schemes is how well one can do in comparison with the rate-distortion function, which describes the known theoretical limits of lossy compression. Motivated by the empirical success of deep neural network (DNN) compressors on large, real-world data, we investigate methods to estimate the rate-distortion function on such data, which would allow comparison of DNN compressors with optimality. While one could use the empirical distribution of the data and apply the Blahut-Arimoto algorithm, this approach presents several computational challenges and inaccuracies when the datasets are large and high-dimensional, such as the case of modern image datasets. Instead, we re-formulate the rate-distortion objective, and solve the resulting functional optimization problem using neural networks. We apply the resulting rate-distortion estimator, called NERD, on popular image datasets, and provide evidence that NERD can accurately estimate the rate-distortion function. Using our estimate, we show that the rate-distortion achievable by DNN compressors are within several bits of the rate-distortion function for real-world datasets. Additionally, NERD provides access to the rate-distortion achieving channel, as well as samples from its output marginal. Therefore, using recent results in reverse channel coding, we describe how NERD can be used to construct an operational one-shot lossy compression scheme with guarantees on the achievable rate and distortion. Experimental results demonstrate competitive performance with DNN compressors.

Submitted 1 February, 2023; v1 submitted 4 April, 2022; originally announced April 2022.

arXiv:2112.07575 [pdf, other]

Robust Graph Neural Networks via Probabilistic Lipschitz Constraints

Authors: Raghu Arghal, Eric Lei, Shirin Saeedi Bidokhti

Abstract: Graph neural networks (GNNs) have recently been demonstrated to perform well on a variety of network-based tasks such as decentralized control and resource allocation, and provide computationally efficient methods for these tasks which have traditionally been challenging in that regard. However, like many neural-network based systems, GNNs are susceptible to shifts and perturbations on their inputs, which can include both node attributes and graph structure. In order to make them more useful for real-world applications, it is important to ensure their robustness post-deployment. Motivated by controlling the Lipschitz constant of GNN filters with respect to the node attributes, we propose to constrain the frequency response of the GNN's filter banks. We extend this formulation to the dynamic graph setting using a continuous frequency response constraint, and solve a relaxed variant of the problem via the scenario approach. This allows for the use of the same computationally efficient algorithm on sampled constraints, which provides PAC-style guarantees on the stability of the GNN using results in scenario optimization. We also highlight an important connection between this setup and GNN stability to graph perturbations, and provide experimental results which demonstrate the efficacy and broadness of our approach.

Submitted 14 December, 2021; originally announced December 2021.

arXiv:2110.07007 [pdf, other]

Out-of-Distribution Robustness in Deep Learning Compression

Authors: Eric Lei, Hamed Hassani, Shirin Saeedi Bidokhti

Abstract: In recent years, deep neural network (DNN) compression systems have proved to be highly effective for designing source codes for many natural sources. However, like many other machine learning systems, these compressors suffer from vulnerabilities to distribution shifts as well as out-of-distribution (OOD) data, which reduces their real-world applications. In this paper, we initiate the study of OOD robust compression. Considering robustness to two types of ambiguity sets (Wasserstein balls and group shifts), we propose algorithmic and architectural frameworks built on two principled methods: one that trains DNN compressors using distributionally-robust optimization (DRO), and the other which uses a structured latent code. Our results demonstrate that both methods enforce robustness compared to a standard DNN compressor, and that using a structured code can be superior to the DRO compressor. We observe tradeoffs between robustness and distortion and corroborate these findings theoretically for a specific class of sources.

Submitted 13 October, 2021; originally announced October 2021.

Comments: Initially published at ICML-2021 ITR3 Workshop

arXiv:2009.02798 [pdf, other]

CSI-Based Multi-Antenna and Multi-Point Indoor Positioning Using Probability Fusion

Authors: Emre Gönültaş, Eric Lei, Jack Langerman, Howard Huang, Christoph Studer

Abstract: Channel state information (CSI)-based fingerprinting via neural networks (NNs) is a promising approach to enable accurate indoor and outdoor positioning of user equipments (UEs), even under challenging propagation conditions. In this paper, we propose a positioning pipeline for wireless LAN MIMO-OFDM systems which uses uplink CSI measurements obtained from one or more unsynchronized access points (APs). For each AP receiver, novel features are first extracted from the CSI that are robust to system impairments arising in real-world transceivers. These features are the inputs to a NN that extracts a probability map indicating the likelihood of a UE being at a given grid point. The NN output is then fused across multiple APs to provide a final position estimate. We provide experimental results with real-world indoor measurements under line-of-sight (LoS) and non-LoS propagation conditions for an 80MHz bandwidth IEEE 802.11ac system using a two-antenna transmit UE and two AP receivers each with four antennas. Our approach is shown to achieve centimeter-level median distance error, an order of magnitude improvement over a conventional baseline.

Submitted 31 August, 2021; v1 submitted 6 September, 2020; originally announced September 2020.

Comments: To appear in the IEEE Transactions on Wireless Communications

arXiv:1909.13355 [pdf, other]

Siamese Neural Networks for Wireless Positioning and Channel Charting

Authors: Eric Lei, Oscar Castañeda, Olav Tirkkonen, Tom Goldstein, Christoph Studer

Abstract: Neural networks have been proposed recently for positioning and channel charting of user equipments (UEs) in wireless systems. Both of these approaches process channel state information (CSI) that is acquired at a multi-antenna base-station in order to learn a function that maps CSI to location information. CSI-based positioning using deep neural networks requires a dataset that contains both CSI and associated location information. Channel charting (CC) only requires CSI information to extract relative position information. Since CC builds on dimensionality reduction, it can be implemented using autoencoders. In this paper, we propose a unified architecture based on Siamese networks that can be used for supervised UE positioning and unsupervised channel charting. In addition, our framework enables semisupervised positioning, where only a small set of location information is available during training. We use simulations to demonstrate that Siamese networks achieve similar or better performance than existing positioning and CC approaches with a single, unified neural network architecture.

Submitted 29 September, 2019; originally announced September 2019.

Comments: Presented at Allerton 2019; 8 pages

arXiv:1804.10742 [pdf, other]

Novel Prediction Techniques Based on Clusterwise Linear Regression

Authors: Igor Gitman, Jieshi Chen, Eric Lei, Artur Dubrawski

Abstract: In this paper we explore different regression models based on Clusterwise Linear Regression (CLR). CLR aims to find the partition of the data into $k$ clusters, such that linear regressions fitted to each of the clusters minimize overall mean squared error on the whole data. The main obstacle preventing to use found regression models for prediction on the unseen test points is the absence of a reasonable way to obtain CLR cluster labels when the values of target variable are unknown. In this paper we propose two novel approaches on how to solve this problem. The first approach, predictive CLR builds a separate classification model to predict test CLR labels. The second approach, constrained CLR utilizes a set of user-specified constraints that enforce certain points to go to the same clusters. Assuming the constraint values are known for the test points, they can be directly used to assign CLR labels. We evaluate these two approaches on three UCI ML datasets as well as on a large corpus of health insurance claims. We show that both of the proposed algorithms significantly improve over the known CLR-based regression methods. Moreover, predictive CLR consistently outperforms linear regression and random forest, and shows comparable performance to support vector regression on UCI ML datasets. The constrained CLR approach achieves the best performance on the health insurance dataset, while enjoying only $\approx 20$ times increased computational time over linear regression.

Submitted 28 April, 2018; originally announced April 2018.

arXiv:1709.05602 [pdf, ps, other]

Characterization of Hemodynamic Signal by Learning Multi-View Relationships

Authors: Eric Lei, Kyle Miller, Michael R. Pinsky, Artur Dubrawski

Abstract: Multi-view data are increasingly prevalent in practice. It is often relevant to analyze the relationships between pairs of views by multi-view component analysis techniques such as Canonical Correlation Analysis (CCA). However, data may easily exhibit nonlinear relations, which CCA cannot reveal. We aim to investigate the usefulness of nonlinear multi-view relations to characterize multi-view data in an explainable manner. To address this challenge, we propose a method to characterize globally nonlinear multi-view relationships as a mixture of linear relationships. A clustering method, it identifies partitions of observations that exhibit the same relationships and learns those relationships simultaneously. It defines cluster variables by multi-view rather than spatial relationships, unlike almost all other clustering methods. Furthermore, we introduce a supervised classification method that builds on our clustering method by employing multi-view relationships as discriminative factors. The value of these methods resides in their capability to find useful structure in the data that single-view or current multi-view methods may struggle to find. We demonstrate the potential utility of the proposed approach using an application in clinical informatics to detect and characterize slow bleeding in patients whose central venous pressure (CVP) is monitored at the bedside. Presently, CVP is considered an insensitive measure of a subject's intravascular volume status or its change. However, we reason that features of CVP during inspiration and expiration should be informative in early identification of emerging changes of patient status. We empirically show how the proposed method can help discover and analyze multiple-to-multiple correlations, which could be nonlinear or vary throughout the population, by finding explainable structure of operational interest to practitioners.

Submitted 8 December, 2019; v1 submitted 16 September, 2017; originally announced September 2017.

Showing 1–13 of 13 results for author: Lei, E