-
Data representation with optimal transport
Authors:
Rocío Díaz Martín,
Ivan V. Medri,
Gustavo Kunde Rohde
Abstract:
Optimal transport has been used to define bijective nonlinear transforms and different transport-related metrics for discriminating data and signals. Here we briefly describe the advances in this topic with the main applications and properties in each case.
Submitted 19 June, 2024;
originally announced June 2024.
-
Characterization of frames for source recovery from dynamical samples
Authors:
Akram Aldroubi,
Rocio Diaz Martin,
Le Gong,
Javad Mashreghi,
Ivan Medri
Abstract:
In this paper, we address the problem of recovering constant source terms in a discrete dynamical system represented by $x_{n+1} = Ax_n + w$, where $x_n$ is the $n$-th state in a Hilbert space $\mathcal{H}$, $A$ is a bounded linear operator in $\mathcal{B}(\mathcal{H})$, and $w$ is a source term within a closed subspace $W$ of $\mathcal{H}$. Our focus is on the stable recovery of $w$ using time-space sample measurements formed by inner products with vectors from a Bessel system $\mathcal{G} \subset \mathcal{H}$. We establish the necessary and sufficient conditions for the recovery of $w$ from these measurements, independent of the unknown initial state $x_0$ and for any $w \in W$. This research is particularly relevant to applications such as environmental monitoring, where precise source identification is critical.
Submitted 27 January, 2024;
originally announced January 2024.
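A worked step, using only the notation of the abstract above (the summation index $k$ and the sampling vector $g \in \mathcal{G}$ are written out here for illustration and are not taken from the paper): unrolling the recursion $x_{n+1} = Ax_n + w$ shows how the unknown initial state and the source both enter every sample, which is why the recovery conditions must hold independently of $x_0$.

```latex
x_n = A^n x_0 + \sum_{k=0}^{n-1} A^k w, \qquad n \ge 1,
\qquad\text{so}\qquad
\langle x_n, g \rangle
  = \langle A^n x_0, g \rangle
  + \Big\langle \sum_{k=0}^{n-1} A^k w,\; g \Big\rangle .
```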
-
Attention versus Contrastive Learning of Tabular Data -- A Data-centric Benchmarking
Authors:
Shourav B. Rabbani,
Ivan V. Medri,
Manar D. Samad
Abstract:
Despite groundbreaking success in image and text learning, deep learning has not achieved significant improvements against traditional machine learning (ML) when it comes to tabular data. This performance gap underscores the need for data-centric treatment and benchmarking of learning algorithms. Recently, attention and contrastive learning breakthroughs have shifted computer vision and natural language processing paradigms. However, the effectiveness of these advanced deep models on tabular data is sparsely studied using a few data sets with very large sample sizes, reporting mixed findings after benchmarking against a limited number of baselines. We argue that the heterogeneity of tabular data sets and selective baselines in the literature can bias the benchmarking outcomes. This article extensively evaluates state-of-the-art attention and contrastive learning methods on a wide selection of 28 tabular data sets (14 easy and 14 hard-to-classify) against traditional deep and machine learning. Our data-centric benchmarking demonstrates when traditional ML is preferred over deep learning and vice versa because no best learning method exists for all tabular data sets. Combining between-sample and between-feature attentions conquers the invincible traditional ML on tabular data sets by a significant margin but fails on high dimensional data, where contrastive learning takes a robust lead. While a hybrid attention-contrastive learning strategy mostly wins on hard-to-classify data sets, traditional methods are frequently superior on easy-to-classify data sets with presumably simpler decision boundaries. To the best of our knowledge, this is the first benchmarking paper with statistical analyses of attention and contrastive learning performances on a diverse selection of tabular data sets against traditional deep and machine learning baselines to facilitate further advances in this field.
Submitted 8 January, 2024;
originally announced January 2024.
-
LCOT: Linear circular optimal transport
Authors:
Rocio Diaz Martin,
Ivan Medri,
Yikun Bai,
Xinran Liu,
Kangbai Yan,
Gustavo K. Rohde,
Soheil Kolouri
Abstract:
The optimal transport problem for measures supported on non-Euclidean spaces has recently gained ample interest in diverse applications involving representation learning. In this paper, we focus on circular probability measures, i.e., probability measures supported on the unit circle, and introduce a new computationally efficient metric for these measures, denoted as Linear Circular Optimal Transport (LCOT). The proposed metric comes with an explicit linear embedding that allows one to apply Machine Learning (ML) algorithms to the embedded measures and seamlessly modify the underlying metric for the ML algorithm to LCOT. We show that the proposed metric is rooted in the Circular Optimal Transport (COT) and can be considered the linearization of the COT metric with respect to a fixed reference measure. We provide a theoretical analysis of the proposed metric and derive the computational complexities for pairwise comparison of circular probability measures. Lastly, through a set of numerical experiments, we demonstrate the benefits of LCOT in learning representations of circular measures.
Submitted 9 October, 2023;
originally announced October 2023.
-
Dynamical Sampling for the Recovery of Spatially Constant Source Terms in Dynamical Systems
Authors:
Akram Aldroubi,
Rocio Diaz Martin,
Ivan Medri
Abstract:
In this paper, we investigate the problem of source recovery in a dynamical system utilizing space-time samples. This is a specific issue within the broader field of dynamical sampling, which involves collecting samples from solutions to a differential equation across both space and time with the aim of recovering critical data, such as initial values, the sources, the driving operator, or other relevant details. Our focus in this study is the recovery of unknown, stationary sources across both space and time, leveraging space-time samples. This research may have significant applications; for instance, it could provide a model for strategically placing devices to measure the quantity of pollutants emanating from factory smokestacks and dispersing across a specific area. Space-time samples could be collected using measuring devices placed at various spatial locations and activated at different times. We present necessary and sufficient conditions for the positioning of these measuring devices to successfully resolve this dynamical sampling problem. This paper provides both a theoretical foundation for the recovery of sources in dynamical systems and potential practical applications.
Submitted 2 August, 2023;
originally announced August 2023.
-
Linear Optimal Partial Transport Embedding
Authors:
Yikun Bai,
Ivan Medri,
Rocio Diaz Martin,
Rana Muhammad Shahroz Khan,
Soheil Kolouri
Abstract:
Optimal transport (OT) has gained popularity due to its various applications in fields such as machine learning, statistics, and signal processing. However, the balanced mass requirement limits its performance in practical problems. To address these limitations, variants of the OT problem, including unbalanced OT, Optimal partial transport (OPT), and Hellinger Kantorovich (HK), have been proposed. In this paper, we propose the Linear optimal partial transport (LOPT) embedding, which extends the (local) linearization technique on OT and HK to the OPT problem. The proposed embedding allows for faster computation of OPT distance between pairs of positive measures. Besides our theoretical contributions, we demonstrate the LOPT embedding technique in point-cloud interpolation and PCA analysis.
Submitted 23 April, 2024; v1 submitted 6 February, 2023;
originally announced February 2023.
-
Deep Clustering of Tabular Data by Weighted Gaussian Distribution Learning
Authors:
Shourav B. Rabbani,
Ivan V. Medri,
Manar D. Samad
Abstract:
Deep learning methods are primarily proposed for supervised learning of images or text with limited applications to clustering problems. In contrast, tabular data with heterogeneous features pose unique challenges in representation learning, where deep learning has yet to replace traditional machine learning. This paper addresses these challenges in developing one of the first deep clustering methods for tabular data: Gaussian Cluster Embedding in Autoencoder Latent Space (G-CEALS). G-CEALS is an unsupervised deep clustering framework for learning the parameters of multivariate Gaussian cluster distributions by iteratively updating individual cluster weights. The G-CEALS method presents average rank orderings of 2.9(1.7) and 2.8(1.7) based on clustering accuracy and adjusted Rand index (ARI) scores on sixteen tabular data sets, respectively, and outperforms nine state-of-the-art clustering methods. G-CEALS substantially improves clustering performance compared to traditional K-means and GMM, which are still de facto methods for clustering tabular data. Similar computationally efficient and high-performing deep clustering frameworks are imperative to reap the myriad benefits of deep learning on tabular data over traditional machine learning.
Submitted 17 May, 2024; v1 submitted 2 January, 2023;
originally announced January 2023.
-
Signed Cumulative Distribution Transform for Parameter Estimation of 1-D Signals
Authors:
Sumati Thareja,
Gustavo Rohde,
Rocio Diaz Martin,
Ivan Medri,
Akram Aldroubi
Abstract:
We describe a method for signal parameter estimation using the signed cumulative distribution transform (SCDT), a recently introduced signal representation tool based on optimal transport theory. The method builds upon signal estimation using the cumulative distribution transform (CDT) originally introduced for positive distributions. Specifically, we show that Wasserstein-type distance minimization can be performed simply using linear least squares techniques in SCDT space for arbitrary signal classes, thus providing a global minimizer for the estimation problem even when the underlying signal is a nonlinear function of the unknown parameters. Comparisons to current signal estimation methods using $L_p$ minimization show the advantage of the method.
Submitted 16 July, 2022;
originally announced July 2022.
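The idea that a transport-based distance can be minimized by linear least squares can be illustrated on the simplest case: estimating a translation between two sets of samples from positive distributions. The sketch below is not the SCDT method of the paper; it assumes the classical one-dimensional fact that the Wasserstein-2 distance between two equal-size empirical distributions is the root-mean-square difference of their sorted samples, and that a translation acts as an additive constant on those sorted samples. All function names (`empirical_quantiles`, `w2_distance`, `estimate_shift`) are hypothetical and introduced only for this example.

```python
import numpy as np


def empirical_quantiles(samples):
    """Sorted samples, i.e. the empirical quantile function on a uniform grid."""
    return np.sort(np.asarray(samples, dtype=float))


def w2_distance(x, y):
    """Wasserstein-2 distance between two equal-size 1-D empirical distributions."""
    qx, qy = empirical_quantiles(x), empirical_quantiles(y)
    if qx.shape != qy.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.sqrt(np.mean((qx - qy) ** 2)))


def estimate_shift(reference, observed):
    """Least-squares translation estimate in quantile space.

    Minimizing sum_i (q_obs[i] - q_ref[i] - s)^2 over s is a linear
    least-squares problem whose solution is the mean quantile difference.
    """
    q_ref = empirical_quantiles(reference)
    q_obs = empirical_quantiles(observed)
    if q_ref.shape != q_obs.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(q_obs - q_ref))


# Illustrative data (assumed, not from the paper): a shifted, shuffled copy.
rng = np.random.default_rng(0)
reference = rng.normal(size=500)
observed = rng.permutation(reference + 2.5)

shift = estimate_shift(reference, observed)
residual = w2_distance(reference + shift, observed)
```

In this toy setting the estimated `shift` recovers the translation and the remaining distance `residual` is zero up to floating-point error; the estimate is a closed-form global minimizer even though the signal itself depends nonlinearly on the shift.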
-
The Signed Cumulative Distribution Transform for 1-D Signal Analysis and Classification
Authors:
Akram Aldroubi,
Rocio Diaz Martin,
Ivan Medri,
Gustavo K. Rohde,
Sumati Thareja
Abstract:
This paper presents a new mathematical signal transform that is especially suitable for decoding information related to non-rigid signal displacements. We provide a measure theoretic framework to extend the existing Cumulative Distribution Transform [ACHA 45 (2018), no. 3, 616-641] to arbitrary (signed) signals on $\overline{\mathbb{R}}$. We present both forward (analysis) and inverse (synthesis) formulas for the transform, and describe several of its properties including translation, scaling, convexity, linear separability and others. Finally, we describe a metric in transform space, and demonstrate the application of the transform in classifying (detecting) signals under random displacements.
Submitted 3 June, 2021;
originally announced June 2021.
-
Error Analysis on the Initial State Reconstruction Problem
Authors:
Rocío Díaz Martín,
Ivan Medri,
Juliana Osorio
Abstract:
In this paper we propose a method to estimate the initial state of a linear dynamical system with noisy observation. The method allows the user to have estimations in real time, that is, to have a new estimation for each new observation. Moreover, at each step, the covariance matrix of the error is known and it is proved that the dynamic of the state estimator error is always Lyapunov stable. Also, necessary and sufficient conditions are given to guarantee asymptotic stability for the error dynamics of an LTI dynamical system, which is itself an LTV system.
Submitted 5 May, 2021;
originally announced May 2021.
-
Continuous and discrete dynamical sampling
Authors:
Rocío Díaz Martín,
Ivan Medri,
Ursula Molter
Abstract:
In this paper we study the continuous dynamical sampling problem at infinite time in a complex Hilbert space $\mathcal{H}$. We find necessary and sufficient conditions on a bounded linear operator $A\in\mathcal{B}(\mathcal{H})$ and a set of vectors $\mathcal{G}\subset \mathcal{H}$, in order to obtain that $\{e^{tA}g\}_{g\in\mathcal{G}, t\in[0,\infty)}$ is a semi-continuous frame for $\mathcal{H}$. We study if it is possible to discretize the time variable $t$ and still have a frame for $\mathcal{H}$. We also relate the continuous iteration $e^{tA}$ on a set $\mathcal{G}$ to the discrete iteration $(A^\prime)^n$ on $\mathcal{G}^\prime$ for an adequate operator $A^\prime$ and set $\mathcal{G}^\prime\subset \mathcal{H}$.
Submitted 14 June, 2020;
originally announced June 2020.
-
Dynamical Sampling: a view from control theory
Authors:
Rocío Díaz Martín,
Ivan Medri,
Ursula Molter
Abstract:
In this contribution we establish a dictionary between terms in two different areas in order to show that many of the topics studied are common ones - just with a different terminology. We further analyze the relations between the discrete-time and continuous-time versions of the problem, using results from both of these fields. We will also differentiate between a discretization of the continuous-time dynamical system and a discrete-time dynamical system itself.
Submitted 3 March, 2020;
originally announced March 2020.
-
Operators generated by wavelets and their boundedness from Hp(Rn) into Lp(Rn)
Authors:
Rocío Díaz,
Iván Medri,
Pablo Rocha
Abstract:
We study the boundedness from $H^p(\mathbb{R}^n)$ into $L^p(\mathbb{R}^n)$ of certain operators generated by wavelets and Borel measures.
Submitted 25 April, 2020; v1 submitted 15 August, 2018;
originally announced August 2018.
-
One-dimensional singular problems involving the p-Laplacian and nonlinearities indefinite in sign
Authors:
Uriel Kaufmann,
Iván Medri
Abstract:
Let $Ω$ be a bounded open interval, let $p>1$ and $γ>0$, and let $m:Ω\rightarrow\mathbb{R}$ be a function that may change sign in $Ω$. In this article we study the existence and nonexistence of positive solutions for one-dimensional singular problems of the form $-(\left\vert u^{\prime}\right\vert ^{p-2}u^{\prime})^{\prime}=m\left( x\right) u^{-γ}$ in $Ω$, $u=0$ on $\partialΩ$. As a consequence we also derive existence results for other related nonlinearities.
Submitted 2 October, 2015; v1 submitted 8 June, 2015;
originally announced June 2015.
-
Strictly positive solutions for one-dimensional nonlinear elliptic problems
Authors:
Uriel Kaufmann,
Ivan Medri
Abstract:
We study existence and nonexistence of strictly positive solutions for the elliptic problems of the form $Lu=m\left( x\right) u^{p}$ in a bounded open interval, with zero boundary conditions, where $L$ is a strongly uniformly elliptic differential operator, $p\in\left( 0,1\right) $, and $m$ is a function that changes sign. We also characterize the set of values $p$ for which the problem admits a solution, and in addition an existence result for other nonlinearities is presented.
Submitted 14 May, 2014;
originally announced May 2014.
-
Strictly positive solutions for one-dimensional nonlinear problems involving the p-Laplacian
Authors:
Uriel Kaufmann,
Ivan Medri
Abstract:
Let $Ω$ be a bounded open interval, and let $p>1$ and $q\in\left(0,p-1\right) $. Let $m\in L^{p^{\prime}}\left(Ω\right) $ and $0\leq c\in L^{\infty}\left(Ω\right) $. We study existence of strictly positive solutions for elliptic problems of the form $-\left(\left\| u^{\prime}\right\|^{p-2}u^{\prime}\right) ^{\prime}+c\left(x\right) u^{p-1}=m\left(x\right) u^{q}$ in $Ω$, $u=0$ on $\partialΩ$. We mention that our results are new even in the case $c\equiv0$.
Submitted 6 July, 2013;
originally announced July 2013.