Search | arXiv e-print repository

doi 10.1109/ACCESS.2024.3356913

CasTGAN: Cascaded Generative Adversarial Network for Realistic Tabular Data Synthesis

Authors: Abdallah Alshantti, Damiano Varagnolo, Adil Rasheed, Aria Rahmati, Frank Westad

Abstract: Generative adversarial networks (GANs) have drawn considerable attention in recent years for their proven capability in generating synthetic data which can be utilised for multiple purposes. While GANs have demonstrated tremendous successes in producing synthetic data samples that replicate the dynamics of the original datasets, the validity of the synthetic data and the underlying privacy concern… ▽ More Generative adversarial networks (GANs) have drawn considerable attention in recent years for their proven capability in generating synthetic data which can be utilised for multiple purposes. While GANs have demonstrated tremendous successes in producing synthetic data samples that replicate the dynamics of the original datasets, the validity of the synthetic data and the underlying privacy concerns represent major challenges which are not sufficiently addressed. In this work, we design a cascaded tabular GAN framework (CasTGAN) for generating realistic tabular data with a specific focus on the validity of the output. In this context, validity refers to the the dependency between features that can be found in the real data, but is typically misrepresented by traditional generative models. Our key idea entails that employing a cascaded architecture in which a dedicated generator samples each feature, the synthetic output becomes more representative of the real data. Our experimental results demonstrate that our model is capable of generating synthetic tabular data that can be used for fitting machine learning models. In addition, our model captures well the constraints and the correlations between the features of the real data, especially the high dimensional datasets. Furthermore, we evaluate the risk of white-box privacy attacks on our model and subsequently show that applying some perturbations to the auxiliary learners in CasTGAN increases the overall robustness of our model against targeted attacks. △ Less

Submitted 22 January, 2024; v1 submitted 1 July, 2023; originally announced July 2023.

arXiv:2112.00115 [pdf, other]

Risk-based implementation of COLREGs for autonomous surface vehicles using deep reinforcement learning

Authors: Thomas Nakken Larsen, Amalie Heiberg, Eivind Meyer, Adil Rasheeda, Omer San, Damiano Varagnolo

Abstract: Autonomous systems are becoming ubiquitous and gaining momentum within the marine sector. Since the electrification of transport is happening simultaneously, autonomous marine vessels can reduce environmental impact, lower costs, and increase efficiency. Although close monitoring is still required to ensure safety, the ultimate goal is full autonomy. One major milestone is to develop a control sys… ▽ More Autonomous systems are becoming ubiquitous and gaining momentum within the marine sector. Since the electrification of transport is happening simultaneously, autonomous marine vessels can reduce environmental impact, lower costs, and increase efficiency. Although close monitoring is still required to ensure safety, the ultimate goal is full autonomy. One major milestone is to develop a control system that is versatile enough to handle any weather and encounter that is also robust and reliable. Additionally, the control system must adhere to the International Regulations for Preventing Collisions at Sea (COLREGs) for successful interaction with human sailors. Since the COLREGs were written for the human mind to interpret, they are written in ambiguous prose and therefore not machine-readable or verifiable. Due to these challenges and the wide variety of situations to be tackled, classical model-based approaches prove complicated to implement and computationally heavy. Within machine learning (ML), deep reinforcement learning (DRL) has shown great potential for a wide range of applications. The model-free and self-learning properties of DRL make it a promising candidate for autonomous vessels. In this work, a subset of the COLREGs is incorporated into a DRL-based path following and obstacle avoidance system using collision risk theory. The resulting autonomous agent dynamically interpolates between path following and COLREG-compliant collision avoidance in the training scenario, isolated encounter situations, and AIS-based simulations of real-world scenarios. △ Less

Submitted 30 November, 2021; originally announced December 2021.

arXiv:1708.00194 [pdf, other]

Distributed multi-agent Gaussian regression via finite-dimensional approximations

Authors: Gianluigi Pillonetto, Luca Schenato, Damiano Varagnolo

Abstract: We consider the problem of distributedly estimating Gaussian processes in multi-agent frameworks. Each agent collects few measurements and aims to collaboratively reconstruct a common estimate based on all data. Agents are assumed with limited computational and communication capabilities and to gather $M$ noisy measurements in total on input locations independently drawn from a known common probab… ▽ More We consider the problem of distributedly estimating Gaussian processes in multi-agent frameworks. Each agent collects few measurements and aims to collaboratively reconstruct a common estimate based on all data. Agents are assumed with limited computational and communication capabilities and to gather $M$ noisy measurements in total on input locations independently drawn from a known common probability density. The optimal solution would require agents to exchange all the $M$ input locations and measurements and then invert an $M \times M$ matrix, a non-scalable task. Differently, we propose two suboptimal approaches using the first $E$ orthonormal eigenfunctions obtained from the \ac{KL} expansion of the chosen kernel, where typically $E \ll M$. The benefits are that the computation and communication complexities scale with $E$ and not with $M$, and computing the required statistics can be performed via standard average consensus algorithms. We obtain probabilistic non-asymptotic bounds that determine a priori the desired level of estimation accuracy, and new distributed strategies relying on Stein's unbiased risk estimate (SURE) paradigms for tuning the regularization parameters and applicable to generic basis functions (thus not necessarily kernel eigenfunctions) and that can again be implemented via average consensus. The proposed estimators and bounds are finally tested on both synthetic and real field data. △ Less

Submitted 10 May, 2018; v1 submitted 1 August, 2017; originally announced August 2017.

Showing 1–3 of 3 results for author: Varagnolo, D