Generating Realistic Synthetic Relational Data through Graph Variational Autoencoders

Authors: Ciro Antonio Mami, Andrea Coser, Eric Medvet, Alexander T. P. Boudewijn, Marco Volpe, Michael Whitworth, Borut Svara, Gabriele Sgroi, Daniele Panfilo, Sebastiano Saccani

Abstract: Synthetic data generation has recently gained widespread attention as a more reliable alternative to traditional data anonymization. The involved methods were originally developed for image synthesis. Hence, their application to the typically tabular and relational datasets from healthcare, finance and other industries is non-trivial. While substantial research has been devoted to the generation of realistic tabular datasets, the study of synthetic relational databases is still in its infancy. In this paper, we combine the variational autoencoder framework with graph neural networks to generate realistic synthetic relational databases. We then apply the obtained method to two publicly available databases in computational experiments. The results indicate that real databases' structures are accurately preserved in the resulting synthetic datasets, even for large datasets with advanced data types.

Submitted 30 November, 2022; originally announced November 2022.

Comments: 8 pages, 2 figures, 2 tables, Synthetic Data 4 ML workshop of the Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2204.06481 [pdf, other]

doi 10.1145/3512290.3528762

Evolving Modular Soft Robots without Explicit Inter-Module Communication using Local Self-Attention

Authors: Federico Pigozzi, Yujin Tang, Eric Medvet, David Ha

Abstract: Modularity in robotics holds great potential. In principle, modular robots can be disassembled and reassembled into different robots, and possibly perform new tasks. Nevertheless, actually exploiting modularity is still an unsolved problem: controllers usually rely on inter-module communication, a practical requirement that makes modules not perfectly interchangeable and thus limits their flexibility. Here, we focus on Voxel-based Soft Robots (VSRs), aggregations of mechanically identical elastic blocks. We use the same neural controller inside each voxel, but without any inter-voxel communication, hence enabling ideal conditions for modularity: modules are all equal and interchangeable. We optimize the parameters of the neural controller, shared among the voxels, by evolutionary computation. Crucially, we use a local self-attention mechanism inside the controller to overcome the absence of inter-module communication channels, thus enabling our robots to truly be driven by the collective intelligence of their modules. We show experimentally that the evolved robots are effective in the task of locomotion: thanks to self-attention, instances of the same controller embodied in the same robot can focus on different inputs. We also find that the evolved controllers generalize to unseen morphologies, after a short fine-tuning, suggesting that an inductive bias related to the task arises from true modularity.

Submitted 13 April, 2022; originally announced April 2022.

Comments: Accepted at the Genetic and Evolutionary Computation Conference 2022 (GECCO'22) complex systems track as a full paper

arXiv:2204.02099 [pdf, ps, other]

Collective control of modular soft robots via embodied Spiking Neural Cellular Automata

Authors: Giorgia Nadizar, Eric Medvet, Stefano Nichele, Sidney Pontes-Filho

Abstract: Voxel-based Soft Robots (VSRs) are a form of modular soft robots, composed of several deformable cubes, i.e., voxels. Each VSR is thus an ensemble of simple agents, namely the voxels, which must cooperate to give rise to the overall VSR behavior. Within this paradigm, collective intelligence plays a key role in enabling the emergence of coordination, as each voxel is independently controlled, exploiting only the local sensory information together with some knowledge passed from its direct neighbors (distributed or collective control). In this work, we propose a novel form of collective control, influenced by Neural Cellular Automata (NCA) and based on the bio-inspired Spiking Neural Networks: the embodied Spiking NCA (SNCA). We experiment with different variants of SNCA, and find them to be competitive with the state-of-the-art distributed controllers for the task of locomotion. In addition, our findings show significant improvement with respect to the baseline in terms of adaptability to unforeseen environmental changes, which could be a determining factor for physical practicability of VSRs.

Submitted 5 April, 2022; originally announced April 2022.

Comments: Workshop on "From Cells to Societies: Collective Learning across Scales" at the International Conference on Learning Representations (Cells2Societies@ICLR)

arXiv:2204.02046 [pdf, other]

Less is More: A Call to Focus on Simpler Models in Genetic Programming for Interpretable Machine Learning

Authors: Marco Virgolin, Eric Medvet, Tanja Alderliesten, Peter A. N. Bosman

Abstract: Interpretability can be critical for the safe and responsible use of machine learning models in high-stakes applications. So far, evolutionary computation (EC), in particular in the form of genetic programming (GP), represents a key enabler for the discovery of interpretable machine learning (IML) models. In this short paper, we argue that research in GP for IML needs to focus on searching in the space of low-complexity models, by investigating new kinds of search strategies and recombination methods. Moreover, based on our experience of bringing research into clinical practice, we believe that research should strive to design better ways of modeling and pursuing interpretability, for the obtained solutions to ultimately be most useful.

Submitted 5 April, 2022; originally announced April 2022.

arXiv:2104.06060 [pdf, other]

Model Learning with Personalized Interpretability Estimation (ML-PIE)

Authors: Marco Virgolin, Andrea De Lorenzo, Francesca Randone, Eric Medvet, Mattias Wahde

Abstract: High-stakes applications require AI-generated models to be interpretable. Current algorithms for the synthesis of potentially interpretable models rely on objectives or regularization terms that represent interpretability only coarsely (e.g., model size) and are not designed for a specific user. Yet, interpretability is intrinsically subjective. In this paper, we propose an approach for the synthesis of models that are tailored to the user by enabling the user to steer the model synthesis process according to her or his preferences. We use a bi-objective evolutionary algorithm to synthesize models with trade-offs between accuracy and a user-specific notion of interpretability. The latter is estimated by a neural network that is trained concurrently with the evolution using the feedback of the user, which is collected using uncertainty-based active learning. To maximize usability, the user is only asked to tell, given two models at a time, which one is less complex. With experiments on two real-world datasets involving 61 participants, we find that our approach is capable of learning estimations of interpretability that can be very different for different users. Moreover, the users tend to prefer models found using the proposed approach over models found using non-personalized interpretability indices.

Submitted 27 April, 2021; v1 submitted 13 April, 2021; originally announced April 2021.

Comments: fix typos

arXiv:2004.11170 [pdf, ps, other]

Learning a Formula of Interpretability to Learn Interpretable Formulas

Authors: Marco Virgolin, Andrea De Lorenzo, Eric Medvet, Francesca Randone

Abstract: Many risk-sensitive applications require Machine Learning (ML) models to be interpretable. Attempts to obtain interpretable models typically rely on tuning, by trial-and-error, hyper-parameters of model complexity that are only loosely related to interpretability. We show that it is instead possible to take a meta-learning approach: an ML model of non-trivial Proxies of Human Interpretability (PHIs) can be learned from human feedback, then this model can be incorporated within an ML training process to directly optimize for interpretability. We show this for evolutionary symbolic regression. We first design and distribute a survey aimed at finding a link between features of mathematical formulas and two established PHIs, simulatability and decomposability. Next, we use the resulting dataset to learn an ML model of interpretability. Lastly, we query this model to estimate the interpretability of evolving solutions within bi-objective genetic programming. We perform experiments on five synthetic and eight real-world symbolic regression problems, comparing to the traditional use of solution size minimization. The results show that the use of our model leads to formulas that are, for the same level of accuracy-interpretability trade-off, either significantly more or equally accurate. Moreover, the formulas are also arguably more interpretable. Given the very positive results, we believe that our approach represents an important stepping stone for the design of next-generation interpretable (evolutionary) ML algorithms.

Submitted 28 May, 2020; v1 submitted 23 April, 2020; originally announced April 2020.

Comments: 16 pages, 4 figures; accepted at PPSN2020

arXiv:2001.08617 [pdf, other]

Design, Validation, and Case Studies of 2D-VSR-Sim, an Optimization-friendly Simulator of 2-D Voxel-based Soft Robots

Authors: Eric Medvet, Alberto Bartoli, Andrea De Lorenzo, Stefano Seriani

Abstract: Voxel-based soft robots (VSRs) are aggregations of soft blocks whose design is amenable to optimization. We here present a software tool, 2D-VSR-Sim, for facilitating research concerning the optimization of VSR body and brain. The software, written in Java, provides consistent interfaces for all the VSR aspects suitable for optimization and considers by design the presence of sensing, i.e., the possibility of exploiting the feedback from the environment for controlling the VSR. We experimentally characterize, from a mechanical point of view, the VSRs that can be simulated with 2D-VSR-Sim and we discuss the computational burden of the simulation. Finally, we show how 2D-VSR-Sim can be used to repeat the experiments of significant previous studies and, in perspective, to provide experimental answers to a variety of research questions.

Submitted 27 January, 2020; v1 submitted 23 January, 2020; originally announced January 2020.

Comments: 12 pages, 11 figures

arXiv:1812.02504 [pdf, ps, other]

Observing the Population Dynamics in GE by means of the Intrinsic Dimension

Authors: Eric Medvet, Alberto Bartoli, Alessio Ansuini, Fabiano Tarlao

Abstract: We explore the use of Intrinsic Dimension (ID) for gaining insights into how populations evolve in Evolutionary Algorithms. ID measures the minimum number of dimensions needed to accurately describe a dataset and its estimators are being used more and more in Machine Learning to cope with large datasets. We postulate that ID can provide information about the population which is complementary w.r.t. what (a simple measure of) diversity tells. We experimented with the application of ID to populations evolved with a recent variant of Grammatical Evolution. The preliminary results suggest that diversity and ID constitute two different points of view on the population dynamics.

Submitted 6 December, 2018; originally announced December 2018.

Comments: Evolutionary Machine Learning workshop at International Conference on Parallel Problem Solving from Nature (EML@PPSN), 2018, Coimbra (Portugal)

arXiv:1806.03215 [pdf, other]

(In)Secure Configuration Practices of WPA2 Enterprise Supplicants

Authors: Alberto Bartoli, Eric Medvet, Andrea De Lorenzo, Fabiano Tarlao

Abstract: WPA2 Enterprise is a fundamental technology for secure communication in enterprise wireless networks. A key requirement of this technology is that WiFi-enabled devices (i.e., supplicants) be correctly configured before connecting to the enterprise wireless network. Supplicants that are not configured correctly may very easily fall prey to attacks aimed at stealing the network credentials. Such credentials have an enormous value because they usually unlock access to all enterprise services. In this work we investigate whether users and technicians are aware of these important and widespread risks. We conducted two extensive analyses: a survey among approximately 1000 users about how they configured their WiFi devices for enterprise network access; and a review of approximately 310 network configuration guides made available by enterprise network administrators. The results provide strong indications that the key requirement of WPA2 Enterprise is violated systematically and thus can no longer be considered realistic.

Submitted 8 June, 2018; originally announced June 2018.

Comments: Please cite as: Alberto Bartoli, Eric Medvet, Andrea De Lorenzo, and Fabiano Tarlao. 2018. (In)Secure Configuration Practices of WPA2 Enterprise Supplicants. In Proceedings of Availability, Reliability and Security, Hamburg, August 2018 (ARES), 6 pages

arXiv:1308.1946 [pdf, other]

Citation Counts and Evaluation of Researchers in the Internet Age

Authors: A. Bartoli, E. Medvet

Abstract: Bibliometric measures derived from citation counts are increasingly being used as a research evaluation tool. Their strengths and weaknesses have been widely analyzed in the literature and are often the subject of vigorous debate. We believe there are a few fundamental issues related to the impact of the web that are not taken into account with the importance they deserve. We focus on evaluation of researchers, but several of our arguments may also be applied to evaluation of research institutions as well as of journals and conferences.

Submitted 7 August, 2013; originally announced August 2013.

Comments: 4 pages, 2 figures, 3 tables

ACM Class: K.3.2

Showing 1–10 of 10 results for author: Medvet, E