
On Semi-Supervised Estimation of Distributions

Authors: H. S. Melihcan Erol, Erixhen Sula, Lizhong Zheng

Abstract: We study the problem of estimating the joint probability mass function (pmf) over two random variables. In particular, the estimation is based on the observation of $m$ samples containing both variables and $n$ samples missing one fixed variable. We adopt the minimax framework with $l^p_p$ loss functions, and we show that the composition of uni-variate minimax estimators achieves minimax risk with the optimal first-order constant for $p \ge 2$, in the regime $m = o(n)$.

Submitted 15 May, 2023; v1 submitted 13 May, 2023; originally announced May 2023.

Comments: Presented at ISIT 2023

arXiv:2211.00537 [pdf, ps, other]

On the Semi-supervised Expectation Maximization

Authors: Erixhen Sula, Lizhong Zheng

Abstract: The Expectation Maximization (EM) algorithm is widely used as an iterative modification to maximum likelihood estimation when the data is incomplete. We focus on a semi-supervised case to learn the model from labeled and unlabeled samples. Existing work in the semi-supervised case has focused mainly on performance rather than convergence guarantees; we instead focus on the contribution of the labeled samples to the convergence rate. The analysis clearly demonstrates how the labeled samples improve the convergence rate for the exponential family mixture model. In this case, we assume that the population EM (EM with unlimited data) is initialized within the neighborhood of global convergence for the population EM that consists solely of samples that have not been labeled. The analysis for the labeled samples provides a comprehensive description of the convergence rate for the Gaussian mixture model. In addition, we extend the findings for labeled samples and offer an alternative proof for the population EM's convergence rate with unlabeled samples for the symmetric mixture of two Gaussians.

Submitted 25 January, 2023; v1 submitted 1 November, 2022; originally announced November 2022.

Comments: 7 pages, 0 figures

arXiv:2202.01260 [pdf, ps, other]

Shannon Bounds on Lossy Gray-Wyner Networks

Authors: Erixhen Sula, Michael Gastpar

Abstract: The Gray-Wyner network subject to a fidelity criterion is studied. Upper and lower bounds for the trade-offs between the private sum-rate and the common rate are obtained for arbitrary sources subject to mean-squared error distortion. The bounds meet exactly, leading to the computation of the rate region, when the source is jointly Gaussian. They meet partially when the sources are modeled via an additive Gaussian "channel". The bounds are inspired by the Shannon bounds on the rate-distortion problem.

Submitted 2 February, 2022; originally announced February 2022.

Comments: 6 pages, 2 figures

arXiv:2102.08157 [pdf, ps, other]

Lower bound on Wyner's Common Information

Authors: Erixhen Sula, Michael Gastpar

Abstract: An important notion of common information between two random variables is due to Wyner. In this paper, we derive a lower bound on Wyner's common information for continuous random variables. The new bound improves on the only other general lower bound on Wyner's common information, which is the mutual information. We also show that the new lower bound is tight for the so-called "Gaussian channels" case, namely, when the joint distribution of the random variables can be written as the sum of a single underlying random variable and Gaussian noises. We motivate this work from the recent variations of Wyner's common information and applications to network data compression problems such as the Gray-Wyner network.

Submitted 16 February, 2021; originally announced February 2021.

Comments: 6 pages, 3 figures

arXiv:2002.09297 [pdf]

doi 10.1109/JLT.2020.2977569

Supply-Power-Constrained Cable Capacity Maximization Using Multi-Layer Neural Networks

Authors: Junho Cho, Sethumadhavan Chandrasekhar, Erixhen Sula, Samuel Olsson, Ellsworth Burrows, Greg Raybon, Roland Ryf, Nicolas Fontaine, Jean-Christophe Antona, Steve Grubb, Peter Winzer, Andrew Chraplyvy

Abstract: We experimentally solve the problem of maximizing capacity under a total supply power constraint in a massively parallel submarine cable context, i.e., for a spatially uncoupled system in which fiber Kerr nonlinearity is not a dominant limitation. By using multi-layer neural networks trained with extensive measurement data acquired from a 12-span 744-km optical fiber link as an accurate digital twin of the true optical system, we experimentally maximize fiber capacity with respect to the transmit signal's spectral power distribution based on a gradient-descent algorithm. By observing convergence to approximately the same maximum capacity and power distribution for almost arbitrary initial conditions, we conjecture that the capacity surface is a concave function of the transmit signal power distribution. We then demonstrate that eliminating gain flattening filters (GFFs) from the optical amplifiers results in substantial capacity gains per Watt of electrical supply power compared to a conventional system that contains GFFs.

Submitted 20 February, 2020; originally announced February 2020.

Comments: arXiv admin note: text overlap with arXiv:1910.02050

arXiv:2002.01348 [pdf, ps, other]

The Gaussian lossy Gray-Wyner network

Authors: Erixhen Sula, Michael Gastpar

Abstract: We consider the problem of source coding subject to a fidelity criterion for the Gray-Wyner network that connects a single source with two receivers via a common channel and two private channels. The Pareto-optimal trade-offs between the sum-rate of the private channels and the rate of the common channel are completely characterized for jointly Gaussian sources subject to the mean-squared error criterion, leveraging convex duality and an argument involving the factorization of convex envelopes. Specifically, they are attained by selecting the auxiliary random variable to be jointly Gaussian with the sources.

Submitted 30 September, 2020; v1 submitted 2 February, 2020; originally announced February 2020.

Comments: 5 pages, 2 figures. Presented at the 54th Annual Conference on Information Sciences and Systems (CISS), Princeton, NJ, USA, March 18-20, 2020. arXiv admin note: text overlap with arXiv:1912.07083

arXiv:2002.00779 [pdf, ps, other]

Common Information Components Analysis

Authors: Michael Gastpar, Erixhen Sula

Abstract: We give an information-theoretic interpretation of Canonical Correlation Analysis (CCA) via (relaxed) Wyner's common information. CCA makes it possible to extract from two high-dimensional data sets low-dimensional descriptions (features) that capture the commonalities between the data sets, using a framework of correlations and linear transforms. Our interpretation first extracts the common information up to a pre-selected resolution level, and then projects this back onto each of the data sets. In the case of Gaussian statistics, this procedure precisely reduces to CCA, where the resolution level specifies the number of CCA components that are extracted. This also suggests a novel algorithm, Common Information Components Analysis (CICA), with several desirable features, including a natural extension beyond two data sets.

Submitted 28 February, 2020; v1 submitted 3 February, 2020; originally announced February 2020.

Comments: 5 pages, 1 figure. Presented at the 2020 Information Theory and Applications (ITA) Workshop, San Diego, CA, USA, February 2-7, 2020

arXiv:1912.07083 [pdf, ps, other]

On Wyner's Common Information in the Gaussian Case

Authors: Erixhen Sula, Michael Gastpar

Abstract: Wyner's Common Information and a natural relaxation are studied in the special case of Gaussian random variables. The relaxation replaces conditional independence by a bound on the conditional mutual information. The main contribution is the proof that Gaussian auxiliaries are optimal, leading to a closed-form formula. As a corollary, the proof technique also establishes the optimality of Gaussian auxiliaries for the Gaussian Gray-Wyner network, a long-standing open problem.

Submitted 28 September, 2020; v1 submitted 15 December, 2019; originally announced December 2019.

Comments: 26 pages, 4 figures

arXiv:1910.02050 [pdf]

Supply-Power-Constrained Cable Capacity Maximization Using Deep Neural Networks

Authors: Junho Cho, Sethumadhavan Chandrasekhar, Erixhen Sula, Samuel Olsson, Ellsworth Burrows, Greg Raybon, Roland Ryf, Nicolas Fontaine, Jean-Christophe Antona, Steve Grubb, Peter Winzer, Andrew Chraplyvy

Abstract: We experimentally achieve a 19% capacity gain per Watt of electrical supply power in a 12-span link by eliminating gain flattening filters and optimizing launch powers using machine learning by deep neural networks in a massively parallel fiber context.

Submitted 2 October, 2019; originally announced October 2019.

arXiv:1811.09525 [pdf, ps, other]

Sum-Rate Capacity for Symmetric Gaussian Multiple Access Channels with Feedback

Authors: Erixhen Sula, Michael Gastpar, Gerhard Kramer

Abstract: The feedback sum-rate capacity is established for the symmetric $J$-user Gaussian multiple-access channel (GMAC). The main contribution is a converse bound that combines the dependence-balance argument of Hekstra and Willems (1989) with a variant of the factorization of a convex envelope of Geng and Nair (2014). The converse bound matches the achievable sum-rate of the Fourier-Modulated Estimate Correction strategy of Kramer (2002).

Submitted 23 November, 2018; originally announced November 2018.

Comments: 16 pages, 2 figures, published in International Symposium on Information Theory (ISIT) 2018

arXiv:1712.10293 [pdf, ps, other]

Compute-Forward Multiple Access (CFMA): Practical Code Design

Authors: Erixhen Sula, Jingge Zhu, Adriano Pastore, Sung Hoon Lim, Michael Gastpar

Abstract: We present a practical strategy that aims to attain rate points on the dominant face of the multiple access channel capacity region using a standard low-complexity decoder. This technique builds on recent theoretical developments of Zhu and Gastpar on compute-forward multiple access (CFMA), which achieves the capacity of the multiple access channel using a sequential decoder. We illustrate this strategy with off-the-shelf LDPC codes. In the first stage of decoding, the receiver recovers a linear combination of the transmitted codewords using the sum-product algorithm (SPA). In the second stage, by using the recovered sum-of-codewords as side information, the receiver recovers one of the two codewords using a modified SPA, ultimately recovering both codewords. The main benefit of recovering the sum-of-codewords instead of the codeword itself is that it makes it possible to attain points on the dominant face of the multiple access channel capacity region without the need for rate-splitting or time sharing, while keeping the complexity on the order of a standard point-to-point decoder. This property is also shown to be crucial for some applications, e.g., interference channels. For all the simulations with single-layer binary codes, our proposed practical strategy is shown to be within 1.7 dB of the theoretical limits, without explicit optimization of the off-the-shelf LDPC codes.

Submitted 29 December, 2017; originally announced December 2017.

Comments: 30 pages, 14 figures, submitted to the IEEE Transactions on Wireless Communications

Showing 1–11 of 11 results for author: Sula, E