Search | arXiv e-print repository

A Discrete Particle Swarm Optimizer for the Design of Cryptographic Boolean Functions

Authors: Luca Mariot, Alberto Leporati, Luca Manzoni

Abstract: A Particle Swarm Optimizer for the search of balanced Boolean functions with good cryptographic properties is proposed in this paper. The algorithm is a modified version of the permutation PSO by Hu, Eberhart and Shi which preserves the Hamming weight of the particles positions, coupled with the Hill Climbing method devised by Millan, Clark and Dawson to improve the nonlinearity and deviation from… ▽ More A Particle Swarm Optimizer for the search of balanced Boolean functions with good cryptographic properties is proposed in this paper. The algorithm is a modified version of the permutation PSO by Hu, Eberhart and Shi which preserves the Hamming weight of the particles positions, coupled with the Hill Climbing method devised by Millan, Clark and Dawson to improve the nonlinearity and deviation from correlation immunity of Boolean functions. The parameters for the PSO velocity equation are tuned by means of two meta-optimization techniques, namely Local Unimodal Sampling (LUS) and Continuous Genetic Algorithms (CGA), finding that CGA produces better results. Using the CGA-evolved parameters, the PSO algorithm is then run on the spaces of Boolean functions from $n=7$ to $n=12$ variables. The results of the experiments are reported, observing that this new PSO algorithm generates Boolean functions featuring similar or better combinations of nonlinearity, correlation immunity and propagation criterion with respect to the ones obtained by other optimization methods. △ Less

Submitted 9 January, 2024; originally announced January 2024.

Comments: Extended version of the poster paper "Heuristic Search by Particle Swarm Optimization of Boolean Functions for Cryptographic Applications" published in GECCO 2015

arXiv:2312.09449 [pdf, other]

vEEGNet: learning latent representations to reconstruct EEG raw data via variational autoencoders

Authors: Alberto Zancanaro, Giulia Cisotto, Italo Zoppis, Sara Lucia Manzoni

Abstract: Electroencephalografic (EEG) data are complex multi-dimensional time-series that are very useful in many applications, from diagnostics to driving brain-computer interface systems. Their classification is still a challenging task, due to the inherent within- and between-subject variability and their low signal-to-noise ratio. On the other hand, the reconstruction of raw EEG data is even more diffi… ▽ More Electroencephalografic (EEG) data are complex multi-dimensional time-series that are very useful in many applications, from diagnostics to driving brain-computer interface systems. Their classification is still a challenging task, due to the inherent within- and between-subject variability and their low signal-to-noise ratio. On the other hand, the reconstruction of raw EEG data is even more difficult because of the high temporal resolution of these signals. Recent literature has proposed numerous machine and deep learning models that could classify, e.g., different types of movements, with an accuracy in the range 70% to 80% (with 4 classes). On the other hand, a limited number of works targeted the reconstruction problem, with very limited results. In this work, we propose vEEGNet, a DL architecture with two modules, i.e., an unsupervised module based on variational autoencoders to extract a latent representation of the data, and a supervised module based on a feed-forward neural network to classify different movements. To build the encoder and the decoder of VAE we exploited the well-known EEGNet network. We implemented two slightly different architectures of vEEGNet, thus showing state-of-the-art classification performance, and the ability to reconstruct both low-frequency and middle-range components of the raw EEG. Although preliminary, this work is promising as we found out that the low-frequency reconstructed signals are consistent with the so-called motor-related cortical potentials, well-known motor-related EEG patterns and we could improve over previous literature by reconstructing faster EEG components, too. Further investigations are needed to explore the potentialities of vEEGNet in reconstructing the full EEG data, generating new samples, and studying the relationship between classification and reconstruction performance. △ Less

Submitted 16 November, 2023; originally announced December 2023.

arXiv:2312.00799 [pdf, other]

hvEEGNet: exploiting hierarchical VAEs on EEG data for neuroscience applications

Authors: Giulia Cisotto, Alberto Zancanaro, Italo F. Zoppis, Sara L. Manzoni

Abstract: With the recent success of artificial intelligence in neuroscience, a number of deep learning (DL) models were proposed for classification, anomaly detection, and pattern recognition tasks in electroencephalography (EEG). EEG is a multi-channel time-series that provides information about the individual brain activity for diagnostics, neuro-rehabilitation, and other applications (including emotions… ▽ More With the recent success of artificial intelligence in neuroscience, a number of deep learning (DL) models were proposed for classification, anomaly detection, and pattern recognition tasks in electroencephalography (EEG). EEG is a multi-channel time-series that provides information about the individual brain activity for diagnostics, neuro-rehabilitation, and other applications (including emotions recognition). Two main issues challenge the existing DL-based modeling methods for EEG: the high variability between subjects and the low signal-to-noise ratio making it difficult to ensure a good quality in the EEG data. In this paper, we propose two variational autoencoder models, namely vEEGNet-ver3 and hvEEGNet, to target the problem of high-fidelity EEG reconstruction. We properly designed their architectures using the blocks of the well-known EEGNet as the encoder, and proposed a loss function based on dynamic time war**. We tested the models on the public Dataset 2a - BCI Competition IV, where EEG was collected from 9 subjects and 22 channels. hvEEGNet was found to reconstruct the EEG data with very high-fidelity, outperforming most previous solutions (including our vEEGNet-ver3 ). Furthermore, this was consistent across all subjects. Interestingly, hvEEGNet made it possible to discover that this popular dataset includes a number of corrupted EEG recordings that might have influenced previous literature results. We also investigated the training behaviour of our models and related it with the quality and the size of the input EEG dataset, aiming at opening a new research debate on this relationship. In the future, hvEEGNet could be used as anomaly (e.g., artefact) detector in large EEG datasets to support the domain experts, but also the latent representations it provides could be used in other classification problems and EEG data generation. △ Less

Submitted 20 November, 2023; originally announced December 2023.

arXiv:2307.15612 [pdf, ps, other]

Fixed Points and Attractors of Reactantless and Inhibitorless Reaction Systems

Authors: Rocco Ascone, Giulia Bernardini, Luca Manzoni

Abstract: Reaction systems are discrete dynamical systems that model biochemical processes in living cells using finite sets of reactants, inhibitors, and products. We investigate the computational complexity of a comprehensive set of problems related to the existence of fixed points and attractors in two constrained classes of reaction systems, in which either reactants or inhibitors are disallowed. These… ▽ More Reaction systems are discrete dynamical systems that model biochemical processes in living cells using finite sets of reactants, inhibitors, and products. We investigate the computational complexity of a comprehensive set of problems related to the existence of fixed points and attractors in two constrained classes of reaction systems, in which either reactants or inhibitors are disallowed. These problems have biological relevance and have been extensively studied in the unconstrained case; however, they remain unexplored in the context of reactantless or inhibitorless systems. Interestingly, we demonstrate that although the absence of reactants or inhibitors simplifies the system's dynamics, it does not always lead to a reduction in the complexity of the considered problems. △ Less

Submitted 30 October, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

Comments: 32 pages. The previous version contained major mistakes. The wrong results are the former Corollaries 9, 10, 15, 17, 23, 25, 25, 32 and 36, Proposition 31 and Remark 30 (all following from a single mistake). We have now corrected all these results

arXiv:2305.16956 [pdf, other]

Local Search, Semantics, and Genetic Programming: a Global Analysis

Authors: Fabio Anselmi, Mauro Castelli, Alberto d'Onofrio, Luca Manzoni, Luca Mariot, Martina Saletta

Abstract: Geometric Semantic Geometric Programming (GSGP) is one of the most prominent Genetic Programming (GP) variants, thanks to its solid theoretical background, the excellent performance achieved, and the execution time significantly smaller than standard syntax-based GP. In recent years, a new mutation operator, Geometric Semantic Mutation with Local Search (GSM-LS), has been proposed to include a loc… ▽ More Geometric Semantic Geometric Programming (GSGP) is one of the most prominent Genetic Programming (GP) variants, thanks to its solid theoretical background, the excellent performance achieved, and the execution time significantly smaller than standard syntax-based GP. In recent years, a new mutation operator, Geometric Semantic Mutation with Local Search (GSM-LS), has been proposed to include a local search step in the mutation process based on the idea that performing a linear regression during the mutation can allow for a faster convergence to good-quality solutions. While GSM-LS helps the convergence of the evolutionary search, it is prone to overfitting. Thus, it was suggested to use GSM-LS only for a limited number of generations and, subsequently, to switch back to standard geometric semantic mutation. A more recently defined variant of GSGP (called GSGP-reg) also includes a local search step but shares similar strengths and weaknesses with GSM-LS. Here we explore multiple possibilities to limit the overfitting of GSM-LS and GSGP-reg, ranging from adaptive methods to estimate the risk of overfitting at each mutation to a simple regularized regression. The results show that the method used to limit overfitting is not that important: providing that a technique to control overfitting is used, it is possible to consistently outperform standard GSGP on both training and unseen data. The obtained results allow practitioners to better understand the role of local search in GSGP and demonstrate that simple regularization strategies are effective in controlling overfitting. △ Less

Submitted 26 May, 2023; originally announced May 2023.

Comments: 22 pages, 4 figures, 1 table

arXiv:2303.05228 [pdf, ps, other]

A classification of S-boxes generated by Orthogonal Cellular Automata

Authors: Luca Mariot, Luca Manzoni

Abstract: Most of the approaches published in the literature to construct S-boxes via Cellular Automata (CA) work by either iterating a finite CA for several time steps, or by a one-shot application of the global rule. The main characteristic that brings together these works is that they employ a single CA rule to define the vectorial Boolean function of the S-box. In this work, we explore a different direc… ▽ More Most of the approaches published in the literature to construct S-boxes via Cellular Automata (CA) work by either iterating a finite CA for several time steps, or by a one-shot application of the global rule. The main characteristic that brings together these works is that they employ a single CA rule to define the vectorial Boolean function of the S-box. In this work, we explore a different direction for the design of S-boxes that leverages on Orthogonal CA (OCA), i.e. pairs of CA rules giving rise to orthogonal Latin squares. The motivation stands on the facts that an OCA pair already defines a bijective transformation, and moreover the orthogonality property of the resulting Latin squares ensures a minimum amount of diffusion. We exhaustively enumerate all S-boxes generated by OCA pairs of diameter $4 \le d \le 6$, and measure their nonlinearity. Interestingly, we observe that for $d=4$ and $d=5$ all S-boxes are linear, despite the underlying CA local rules being nonlinear. The smallest nonlinear S-boxes emerges for $d=6$, but their nonlinearity is still too low to be used in practice. Nonetheless, we unearth an interesting structure of linear OCA S-boxes, proving that their Linear Components Space (LCS) is itself the image of a linear CA, or equivalently a polynomial code. We finally classify all linear OCA S-boxes in terms of their generator polynomials. △ Less

Submitted 9 March, 2023; originally announced March 2023.

Comments: 22 pages. Extended version of "On the Linear Components Space of S-boxes Generated by Orthogonal Cellular Automata" arXiv:2203.14365v, presented at ACRI 2022. Currently under submission at Natural Computing

arXiv:2211.11551 [pdf, other]

Evolutionary Strategies for the Design of Binary Linear Codes

Authors: Claude Carlet, Luca Mariot, Luca Manzoni, Stjepan Picek

Abstract: The design of binary error-correcting codes is a challenging optimization problem with several applications in telecommunications and storage, which has also been addressed with metaheuristic techniques and evolutionary algorithms. Still, all these efforts focused on optimizing the minimum distance of unrestricted binary codes, i.e., with no constraints on their linearity, which is a desirable pro… ▽ More The design of binary error-correcting codes is a challenging optimization problem with several applications in telecommunications and storage, which has also been addressed with metaheuristic techniques and evolutionary algorithms. Still, all these efforts focused on optimizing the minimum distance of unrestricted binary codes, i.e., with no constraints on their linearity, which is a desirable property for efficient implementations. In this paper, we present an Evolutionary Strategy (ES) algorithm that explores only the subset of linear codes of a fixed length and dimension. To that end, we represent the candidate solutions as binary matrices and devise variation operators that preserve their ranks. Our experiments show that up to length $n=14$, our ES always converges to an optimal solution with a full success rate, and the evolved codes are all inequivalent to the Best-Known Linear Code (BKLC) given by MAGMA. On the other hand, for larger lengths, both the success rate of the ES as well as the diversity of the evolved codes start to drop, with the extreme case of $(16,8,5)$ codes which all turn out to be equivalent to MAGMA's BKLC. △ Less

Submitted 21 November, 2022; originally announced November 2022.

Comments: 15 pages, 3 figures, 3 tables

arXiv:2207.08280 [pdf, ps, other]

Building Correlation Immune Functions from Sets of Mutually Orthogonal Cellular Automata

Authors: Luca Mariot, Luca Manzoni

Abstract: Correlation immune Boolean functions play an important role in the implementation of efficient masking countermeasures for side-channel attacks in cryptography. In this paper, we investigate a method to construct correlation immune functions through families of mutually orthogonal cellular automata (MOCA). First, we show that the orthogonal array (OA) associated to a family of MOCA can be expanded… ▽ More Correlation immune Boolean functions play an important role in the implementation of efficient masking countermeasures for side-channel attacks in cryptography. In this paper, we investigate a method to construct correlation immune functions through families of mutually orthogonal cellular automata (MOCA). First, we show that the orthogonal array (OA) associated to a family of MOCA can be expanded to a binary OA of strength at least 2. To prove this result, we exploit the characterization of MOCA in terms of orthogonal labelings on de Bruijn graphs. Then, we use the resulting binary OA to define the support of a second-order correlation immune function. Next, we perform some computational experiments to construct all such functions up to $n=12$ variables, and observe that their correlation immunity order is actually greater, always at least 3. We conclude by discussing how these results open up interesting perspectives for future research, with respect to the search of new correlation-immune functions and binary orthogonal arrays. △ Less

Submitted 17 July, 2022; originally announced July 2022.

Comments: 15 pages, 1 figure, 1 table

arXiv:2206.10974 [pdf, other]

The Influence of Local Search over Genetic Algorithms with Balanced Representations

Authors: Luca Manzoni, Luca Mariot, Eva Tuba

Abstract: We continue the study of Genetic Algorithms (GA) on combinatorial optimization problems where the candidate solutions need to satisfy a balancedness constraint. It has been observed that the reduction of the search space size granted by ad-hoc crossover and mutation operators does not usually translate to a substantial improvement of the GA performances. There is still no clear explanation of this… ▽ More We continue the study of Genetic Algorithms (GA) on combinatorial optimization problems where the candidate solutions need to satisfy a balancedness constraint. It has been observed that the reduction of the search space size granted by ad-hoc crossover and mutation operators does not usually translate to a substantial improvement of the GA performances. There is still no clear explanation of this phenomenon, although it is suspected that a balanced representation might yield a more irregular fitness landscape, where it could be more difficult for GA to converge to a global optimum. In this paper, we investigate this issue by adding a local search step to a GA with balanced operators, and use it to evolve highly nonlinear balanced Boolean functions. In particular, we organize our experiments around two research questions, namely if local search (1) improves the convergence speed of GA, and (2) decreases the population diversity. Surprisingly, while our results answer affirmatively the first question, they also show that adding local search actually \emph{increases} the diversity among the individuals in the population. We link these findings to some recent results on fitness landscape analysis for problems on Boolean functions. △ Less

Submitted 22 June, 2022; originally announced June 2022.

Comments: 14 pages, 4 figures

arXiv:2205.02598 [pdf, other]

The Effect of Multi-Generational Selection in Geometric Semantic Genetic Programming

Authors: Mauro Castelli, Luca Manzoni, Luca Mariot, Giuliamaria Menara, Gloria Pietropolli

Abstract: Among the evolutionary methods, one that is quite prominent is Genetic Programming, and, in recent years, a variant called Geometric Semantic Genetic Programming (GSGP) has shown to be successfully applicable to many real-world problems. Due to a peculiarity in its implementation, GSGP needs to store all the evolutionary history, i.e., all populations from the first one. We exploit this stored inf… ▽ More Among the evolutionary methods, one that is quite prominent is Genetic Programming, and, in recent years, a variant called Geometric Semantic Genetic Programming (GSGP) has shown to be successfully applicable to many real-world problems. Due to a peculiarity in its implementation, GSGP needs to store all the evolutionary history, i.e., all populations from the first one. We exploit this stored information to define a multi-generational selection scheme that is able to use individuals from older populations. We show that a limited ability to use "old" generations is actually useful for the search process, thus showing a zero-cost way of improving the performances of GSGP. △ Less

Submitted 5 May, 2022; originally announced May 2022.

Comments: 19 pages, 4 figures, 5 tables. Submitted to Applied Sciences

arXiv:2203.14365 [pdf, ps, other]

On the Linear Components Space of S-boxes Generated by Orthogonal Cellular Automata

Authors: Luca Mariot, Luca Manzoni

Abstract: We investigate S-boxes defined by pairs of Orthogonal Cellular Automata (OCA), motivated by the fact that such CA always define bijective vectorial Boolean functions, and could thus be interesting for the design of block ciphers. In particular, we perform an exhaustive search of all nonlinear OCA pairs of diameter $d=4$ and $d=5$, which generate S-boxes of size $6\times 6$ and $8\times 8$, respect… ▽ More We investigate S-boxes defined by pairs of Orthogonal Cellular Automata (OCA), motivated by the fact that such CA always define bijective vectorial Boolean functions, and could thus be interesting for the design of block ciphers. In particular, we perform an exhaustive search of all nonlinear OCA pairs of diameter $d=4$ and $d=5$, which generate S-boxes of size $6\times 6$ and $8\times 8$, respectively. Surprisingly, all these S-boxes turn out to be linear, and thus they are not useful for the design of confusion layers in block ciphers. However, a closer inspection of these S-boxes reveals a very interesting structure. Indeed, we remark that the linear components space of the OCA-based S-boxes found by our exhaustive search are themselves the kernels of linear CA, or, equivalently, \emph{polynomial codes}. We finally classify the polynomial codes of the S-boxes obtained in our exhaustive search and observe that, in most cases, they actually correspond to the cyclic code with generator polynomial $X^{b}+1$, where $b=d-1$. Although these findings rule out the possibility of using OCA to design good S-boxes in block ciphers, they give nonetheless some interesting insights for a theoretical characterization of nonlinear OCA pairs, which is still an open question in general. △ Less

Submitted 14 May, 2022; v1 submitted 27 March, 2022; originally announced March 2022.

Comments: 13 pages, updated version accepted in ACRI 2022

arXiv:2111.13248 [pdf, ps, other]

Heuristic Search of (Semi-)Bent Functions based on Cellular Automata

Authors: Luca Mariot, Martina Saletta, Alberto Leporati, Luca Manzoni

Abstract: An interesting thread in the research of Boolean functions for cryptography and coding theory is the study of secondary constructions: given a known function with a good cryptographic profile, the aim is to extend it to a (usually larger) function possessing analogous properties. In this work, we continue the investigation of a secondary construction based on cellular automata, focusing on the cla… ▽ More An interesting thread in the research of Boolean functions for cryptography and coding theory is the study of secondary constructions: given a known function with a good cryptographic profile, the aim is to extend it to a (usually larger) function possessing analogous properties. In this work, we continue the investigation of a secondary construction based on cellular automata, focusing on the classes of bent and semi-bent functions. We prove that our construction preserves the algebraic degree of the local rule, and we narrow our attention to the subclass of quadratic functions, performing several experiments based on exhaustive combinatorial search and heuristic optimization through Evolutionary Strategies (ES). Finally, we classify the obtained results up to permutation equivalence, remarking that the number of equivalence classes that our CA-XOR construction can successfully extend grows very quickly with respect to the CA diameter. △ Less

Submitted 25 November, 2021; originally announced November 2021.

Comments: 29 pages, 2 figures, 2 tables, preprint submitted to Natural Computing

arXiv:2106.01900 [pdf, other]

doi 10.1016/j.eswa.2021.116029

Salp Swarm Optimization: a Critical Review

Authors: Mauro Castelli, Luca Manzoni, Luca Mariot, Marco S. Nobile, Andrea Tangherloni

Abstract: In the crowded environment of bio-inspired population-based metaheuristics, the Salp Swarm Optimization (SSO) algorithm recently appeared and immediately gained a lot of momentum. Inspired by the peculiar spatial arrangement of salp colonies, which are displaced in long chains following a leader, this algorithm seems to provide an interesting optimization performance. However, the original work wa… ▽ More In the crowded environment of bio-inspired population-based metaheuristics, the Salp Swarm Optimization (SSO) algorithm recently appeared and immediately gained a lot of momentum. Inspired by the peculiar spatial arrangement of salp colonies, which are displaced in long chains following a leader, this algorithm seems to provide an interesting optimization performance. However, the original work was characterized by some conceptual and mathematical flaws, which influenced all ensuing papers on the subject. In this manuscript, we perform a critical review of SSO, highlighting all the issues present in the literature and their negative effects on the optimization process carried out by this algorithm. We also propose a mathematically correct version of SSO, named Amended Salp Swarm Optimizer (ASSO) that fixes all the discussed problems. We benchmarked the performance of ASSO on a set of tailored experiments, showing that it is able to achieve better results than the original SSO. Finally, we performed an extensive study aimed at understanding whether SSO and its variants provide advantages compared to other metaheuristics. The experimental results, where SSO cannot outperform simple well-known metaheuristics, suggest that the scientific community can safely abandon SSO. △ Less

Submitted 6 November, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

Comments: 25 pages, 6 figures. Published in Expert Systems with Applications

Journal ref: Expert Systems with Applications, Volume 189, 2022, 116029

arXiv:2105.04308 [pdf, ps, other]

Parallel Sandpiles or Spurious Bidirectional Icepiles?

Authors: Gianpiero Cattaneo, Luca Manzoni

Abstract: In a recent paper E. Formenti and K. Perrot (FP) introduce a global rule assumed to describe the discrete time dynamics associated with a sandpile model under the parallel application of a suitable local rule acting on d dimensional lattices of cells equipped with uniform neighborhood. In this paper we submit this approach to a critical analysis, in the simplest elementary particular case of a one… ▽ More In a recent paper E. Formenti and K. Perrot (FP) introduce a global rule assumed to describe the discrete time dynamics associated with a sandpile model under the parallel application of a suitable local rule acting on d dimensional lattices of cells equipped with uniform neighborhood. In this paper we submit this approach to a critical analysis, in the simplest elementary particular case of a one-dimensional lattice, which can be divided in two parts. In the first part we prove that the FP global rule does not describe the dynamics of standard sandpiles, but rather furnishes a description of the quite different situation of height difference between consecutive piles. This is a semantic uncorrect difference of interpretation. In the second part we investigate the consequences of the uncorrect FP assumption proving that their global rule describes a bidirectional spurious dynamics of icepiles (rather than sandpiles), in the sense that this latter is the consequence of application of three local rules: bidirectional vertical rule, bidirectional horizontal rule (typical of icepiles), and a granule jump from the bottom to the top (spurious rule of the dynamics). △ Less

Submitted 16 May, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

arXiv:2005.08300 [pdf, other]

Exploring Semi-bent Boolean Functions Arising from Cellular Automata

Authors: Luca Mariot, Martina Saletta, Alberto Leporati, Luca Manzoni

Abstract: Semi-bent Boolean functions are interesting from a cryptographic standpoint, since they possess several desirable properties such as having a low and flat Walsh spectrum, which is useful to resist linear cryptanalysis. In this paper, we consider the search of semi-bent functions through a construction based on cellular automata (CA). In particular, the construction defines a Boolean function by co… ▽ More Semi-bent Boolean functions are interesting from a cryptographic standpoint, since they possess several desirable properties such as having a low and flat Walsh spectrum, which is useful to resist linear cryptanalysis. In this paper, we consider the search of semi-bent functions through a construction based on cellular automata (CA). In particular, the construction defines a Boolean function by computing the XOR of all output cells in the CA. Since the resulting Boolean functions have the same algebraic degree of the CA local rule, we devise a combinatorial algorithm to enumerate all quadratic Boolean functions. We then apply this algorithm to exhaustively explore the space of quadratic rules of up to 6 variables, selecting only those for which our CA-based construction always yields semi-bent functions of up to 20 variables. Finally, we filter the obtained rules with respect to their balancedness, and remark that the semi-bent functions generated through our construction by the remaining rules have a constant number of linear structures. △ Less

Submitted 17 May, 2020; originally announced May 2020.

Comments: 12 pages, 1 figure, submitted to ACRI 2020

arXiv:2004.13832 [pdf, other]

Towards an evolutionary-based approach for natural language processing

Authors: Luca Manzoni, Domagoj Jakobovic, Luca Mariot, Stjepan Picek, Mauro Castelli

Abstract: Tasks related to Natural Language Processing (NLP) have recently been the focus of a large research endeavor by the machine learning community. The increased interest in this area is mainly due to the success of deep learning methods. Genetic Programming (GP), however, was not under the spotlight with respect to NLP tasks. Here, we propose a first proof-of-concept that combines GP with the well es… ▽ More Tasks related to Natural Language Processing (NLP) have recently been the focus of a large research endeavor by the machine learning community. The increased interest in this area is mainly due to the success of deep learning methods. Genetic Programming (GP), however, was not under the spotlight with respect to NLP tasks. Here, we propose a first proof-of-concept that combines GP with the well established NLP tool word2vec for the next word prediction task. The main idea is that, once words have been moved into a vector space, traditional GP operators can successfully work on vectors, thus producing meaningful words as the output. To assess the suitability of this approach, we perform an experimental evaluation on a set of existing newspaper headlines. Individuals resulting from this (pre-)training phase can be employed as the initial population in other NLP tasks, like sentence generation, which will be the focus of future investigations, possibly employing adversarial co-evolutionary approaches. △ Less

Submitted 23 April, 2020; originally announced April 2020.

Comments: 18 pages, 7 figures, 2 tables. Accepted for publication at the Genetic and Evolutionary Computation Conference (GECCO 2020)

arXiv:2004.11331 [pdf, other]

Tip the Balance: Improving Exploration of Balanced Crossover Operators by Adaptive Bias

Authors: Luca Manzoni, Luca Mariot, Eva Tuba

Abstract: The use of balanced crossover operators in Genetic Algorithms (GA) ensures that the binary strings generated as offsprings have the same Hamming weight of the parents, a constraint which is sought in certain discrete optimization problems. Although this method reduces the size of the search space, the resulting fitness landscape often becomes more difficult for the GA to explore and to discover op… ▽ More The use of balanced crossover operators in Genetic Algorithms (GA) ensures that the binary strings generated as offsprings have the same Hamming weight of the parents, a constraint which is sought in certain discrete optimization problems. Although this method reduces the size of the search space, the resulting fitness landscape often becomes more difficult for the GA to explore and to discover optimal solutions. This issue has been studied in this paper by applying an adaptive bias strategy to a counter-based crossover operator that introduces unbalancedness in the offspring with a certain probability, which is decreased throughout the evolutionary process. Experiments show that improving the exploration of the search space with this adaptive bias strategy is beneficial for the GA performances in terms of the number of optimal solutions found for the balanced nonlinear Boolean functions problem. △ Less

Submitted 23 April, 2020; originally announced April 2020.

Comments: 13 pages, 4 figures

arXiv:2004.11300 [pdf, other]

CoInGP: Convolutional Inpainting with Genetic Programming

Authors: Domagoj Jakobovic, Luca Manzoni, Luca Mariot, Stjepan Picek, Mauro Castelli

Abstract: We investigate the use of Genetic Programming (GP) as a convolutional predictor for missing pixels in images. The training phase is performed by swee** a sliding window over an image, where the pixels on the border represent the inputs of a GP tree. The output of the tree is taken as the predicted value for the central pixel. We consider two topologies for the sliding window, namely the Moore an… ▽ More We investigate the use of Genetic Programming (GP) as a convolutional predictor for missing pixels in images. The training phase is performed by swee** a sliding window over an image, where the pixels on the border represent the inputs of a GP tree. The output of the tree is taken as the predicted value for the central pixel. We consider two topologies for the sliding window, namely the Moore and the Von Neumann neighborhood. The best GP tree scoring the lowest prediction error over the training set is then used to predict the pixels in the test set. We experimentally assess our approach through two experiments. In the first one, we train a GP tree over a subset of 1000 complete images from the MNIST dataset. The results show that GP can learn the distribution of the pixels with respect to a simple baseline predictor, with no significant differences observed between the two neighborhoods. In the second experiment, we train a GP convolutional predictor on two degraded images, removing around 20% of their pixels. In this case, we observe that the Moore neighborhood works better, although the Von Neumann neighborhood allows for a larger training set. △ Less

Submitted 25 April, 2021; v1 submitted 23 April, 2020; originally announced April 2020.

Comments: 21 pages, 8 figures, updated pre-print accepted at GECCO 2021

arXiv:1904.10494 [pdf, other]

Balanced Crossover Operators in Genetic Algorithms

Authors: Luca Manzoni, Luca Mariot, Eva Tuba

Abstract: In several combinatorial optimization problems arising in cryptography and design theory, the admissible solutions must often satisfy a balancedness constraint, such as being represented by bitstrings with a fixed number of ones. For this reason, several works in the literature tackling these optimization problems with Genetic Algorithms (GA) introduced new balanced crossover operators which ensur… ▽ More In several combinatorial optimization problems arising in cryptography and design theory, the admissible solutions must often satisfy a balancedness constraint, such as being represented by bitstrings with a fixed number of ones. For this reason, several works in the literature tackling these optimization problems with Genetic Algorithms (GA) introduced new balanced crossover operators which ensure that the offspring has the same balancedness characteristics of the parents. However, the use of such operators has never been thoroughly motivated, except for some generic considerations about search space reduction. In this paper, we undertake a rigorous statistical investigation on the effect of balanced and unbalanced crossover operators against three optimization problems from the area of cryptography and coding theory: nonlinear balanced Boolean functions, binary Orthogonal Arrays (OA) and bent functions. In particular, we consider three different balanced crossover operators (each with two variants: "left-to-right" and "shuffled"), two of which have never been published before, and compare their performances with classic one-point crossover. We are able to confirm that the balanced crossover operators performs better than all three balanced crossover operators. Furthermore, in two out of three crossovers, the "left-to-right" version performs better than the "shuffled" version. △ Less

Submitted 17 November, 2019; v1 submitted 23 April, 2019; originally announced April 2019.

Comments: 26 pages, 9 figures. General revision of the original draft following reviews

arXiv:1904.07445 [pdf, other]

doi 10.3233/FI-2020-1878

The many roads to the simulation of reaction systems

Authors: Claudio Ferretti, Alberto Leporati, Luca Manzoni, Antonio E. Porreca

Abstract: Reaction systems are a computational model inspired by the bio-chemical reactions that happen inside biological cells. They have been and currently are studied for their many nice theoretical properties. They are also a useful modeling tool for biochemical systems, but in order to be able to employ them effectively in the field the presence of efficient and widely available simulators is essential… ▽ More Reaction systems are a computational model inspired by the bio-chemical reactions that happen inside biological cells. They have been and currently are studied for their many nice theoretical properties. They are also a useful modeling tool for biochemical systems, but in order to be able to employ them effectively in the field the presence of efficient and widely available simulators is essential. Here we explore three different algorithms and implementations of the simulation, comparing them to the current state of the art. We also show that we can obtain performances comparable to GPU-based simulations on real-world systems by using a carefully tuned CPU-based simulator. △ Less

Submitted 15 April, 2019; originally announced April 2019.

Comments: Postprint, to appear in Fundamenta Informaticae

Journal ref: Fundamenta Informaticae, vol. 171, no. 1-4, pp. 175-188, 2020

arXiv:1903.07913 [pdf, other]

doi 10.1016/j.ic.2019.03.006

Complexity of the dynamics of reaction systems

Authors: Alberto Dennunzio, Enrico Formenti, Luca Manzoni, Antonio E. Porreca

Abstract: Reaction systems are discrete dynamical systems inspired by bio-chemical processes, whose dynamical behaviour is expressed by set-theoretic operations on finite sets. Reaction systems thus provide a description of bio-chemical phenomena that complements the more traditional approaches, for instance those based on differential equations. A comprehensive list of decision problems about the dynamical… ▽ More Reaction systems are discrete dynamical systems inspired by bio-chemical processes, whose dynamical behaviour is expressed by set-theoretic operations on finite sets. Reaction systems thus provide a description of bio-chemical phenomena that complements the more traditional approaches, for instance those based on differential equations. A comprehensive list of decision problems about the dynamical behavior of reaction systems (such as cycles and fixed/periodic points, attractors, and reachability) is provided along with the corresponding computational complexity, which ranges from tractable problems to PSPACE-complete problems. △ Less

Submitted 19 March, 2019; originally announced March 2019.

Comments: Preprint

Journal ref: Information and Computation, Volume 267, August 2019, Pages 96-109

arXiv:1902.09523 [pdf, ps, other]

doi 10.1007/s41965-019-00011-4

Characterizing PSPACE with shallow non-confluent P systems

Authors: Alberto Leporati, Luca Manzoni, Giancarlo Mauri, Antonio E. Porreca, Claudio Zandron

Abstract: In P systems with active membranes, the question of understanding the power of non-confluence within a polynomial time bound is still an open problem. It is known that, for shallow P systems, that is, with only one level of nesting, non-confluence allows them to solve conjecturally harder problems than confluent P systems, thus reaching PSPACE. Here we show that PSPACE is not only a bound, but act… ▽ More In P systems with active membranes, the question of understanding the power of non-confluence within a polynomial time bound is still an open problem. It is known that, for shallow P systems, that is, with only one level of nesting, non-confluence allows them to solve conjecturally harder problems than confluent P systems, thus reaching PSPACE. Here we show that PSPACE is not only a bound, but actually an exact characterization. Therefore, the power endowed by non-confluence to shallow P systems is equal to the power gained by confluent P systems when non-elementary membrane division and polynomial depth are allowed, thus suggesting a connection between the roles of non-confluence and nesting depth. △ Less

Submitted 22 February, 2019; originally announced February 2019.

Comments: Preprint. arXiv admin note: text overlap with arXiv:1902.03879

Journal ref: Journal of Membrane Computing 1, 75-84 (2019)

arXiv:1902.06775 [pdf, ps, other]

doi 10.1016/j.ins.2019.02.023

On the dynamical behaviour of linear higher-order cellular automata and its decidability

Authors: Alberto Dennunzio, Enrico Formenti, Luca Manzoni, Luciano Margara, Antonio E. Porreca

Abstract: Higher-order cellular automata (HOCA) are a variant of cellular automata (CA) used in many applications (ranging, for instance, from the design of secret sharing schemes to data compression and image processing), and in which the global state of the system at time $t$ depends not only on the state at time $t-1$, as in the original model, but also on the states at time $t-2, \ldots, t-n$, where… ▽ More Higher-order cellular automata (HOCA) are a variant of cellular automata (CA) used in many applications (ranging, for instance, from the design of secret sharing schemes to data compression and image processing), and in which the global state of the system at time $t$ depends not only on the state at time $t-1$, as in the original model, but also on the states at time $t-2, \ldots, t-n$, where $n$ is the memory size of the HOCA. We provide decidable characterizations of two important dynamical properties, namely, sensitivity to the initial conditions and equicontinuity, for linear HOCA over the alphabet $\mathbb{Z}_m$. Such characterizations extend the ones shown in [23] for linear CA (LCA) over the alphabet $\mathbb{Z}^{n}_m$ in the case $n=1$. We also prove that linear HOCA of size memory $n$ over $\mathbb{Z}_m$ form a class that is indistinguishable from a specific subclass of LCA over $\mathbb{Z}_m^n$. This enables to decide injectivity and surjectivity for linear HOCA of size memory $n$ over $\mathbb{Z}_m$ using the decidable characterization provided in [2] and [19] for injectivity and surjectivity of LCA over $\mathbb{Z}^n_m$. Finally, we prove an equivalence between LCA over $\mathbb{Z}_m^n$ and an important class of non-uniform CA, another variant of CA used in many applications. △ Less

Submitted 18 February, 2019; originally announced February 2019.

Comments: Preprint

Journal ref: Information Sciences 486 (2019) 73-87

arXiv:1902.03883 [pdf, other]

doi 10.1007/s41965-020-00031-5

A Turing machine simulation by P systems without charges

Authors: Alberto Leporati, Luca Manzoni, Giancarlo Mauri, Antonio E. Porreca, Claudio Zandron

Abstract: It is well known that the kind of P systems involved in the definition of the P conjecture is able to solve problems in the complexity class $\mathbf{P}$ by leveraging the uniformity condition. Here we show that these systems are indeed able to simulate deterministic Turing machines working in polynomial time with a weaker uniformity condition and using only one level of membrane nesting. This all… ▽ More It is well known that the kind of P systems involved in the definition of the P conjecture is able to solve problems in the complexity class $\mathbf{P}$ by leveraging the uniformity condition. Here we show that these systems are indeed able to simulate deterministic Turing machines working in polynomial time with a weaker uniformity condition and using only one level of membrane nesting. This allows us to embed this construction into more complex membrane structures, possibly showing that constructions similar to the one performed in [1] for P systems with charges can be carried out also in this case. △ Less

Submitted 11 February, 2019; originally announced February 2019.

Comments: Asian Branch of International Conference on Membrane Computing (ACMC 2018). arXiv admin note: text overlap with arXiv:1902.03879

Journal ref: Journal of Membrane Computing 2, 71-79 (2020)

arXiv:1902.03879 [pdf, ps, other]

doi 10.1007/978-3-030-12797-8_13

Solving QSAT in sublinear depth

Authors: Alberto Leporati, Luca Manzoni, Giancarlo Mauri, Antonio E. Porreca, Claudio Zandron

Abstract: Among $\mathbf{PSPACE}$-complete problems, QSAT, or quantified SAT, is one of the most used to show that the class of problems solvable in polynomial time by families of a given variant of P systems includes the whole $\mathbf{PSPACE}$. However, most solutions require a membrane nesting depth that is linear with respect to the number of variables of the QSAT instance under consideration. While a s… ▽ More Among $\mathbf{PSPACE}$-complete problems, QSAT, or quantified SAT, is one of the most used to show that the class of problems solvable in polynomial time by families of a given variant of P systems includes the whole $\mathbf{PSPACE}$. However, most solutions require a membrane nesting depth that is linear with respect to the number of variables of the QSAT instance under consideration. While a system of a certain depth is needed, since depth 1 systems only allows to solve problems in $\mathbf{P^{\#P}}$, it was until now unclear if a linear depth was, in fact, necessary. Here we use P systems with active membranes with charges, and we provide a construction that proves that QSAT can be solved with a sublinear nesting depth of order $\frac{n}{\log n}$, where $n$ is the number of variables in the quantified formula given as input. △ Less

Submitted 12 February, 2019; v1 submitted 11 February, 2019; originally announced February 2019.

Comments: 19th International Conference on Membrane Computing (CMC19)

Journal ref: Lecture Notes in Computer Science 11399 (2019) 188-201

arXiv:1901.01534 [pdf, ps, other]

Search Space Reduction of Asynchrony Immune Cellular Automata by Center Permutivity

Authors: Luca Mariot, Luca Manzoni, Alberto Dennunzio

Abstract: We continue the study of asynchrony immunity in cellular automata (CA), which can be considered as a weaker version of correlation immunity in the context of vectorial Boolean functions. The property could have applications as a countermeasure for side-channel attacks in CA-based cryptographic primitives, such as S-boxes and pseudorandom number generators. We first give some theoretical results on… ▽ More We continue the study of asynchrony immunity in cellular automata (CA), which can be considered as a weaker version of correlation immunity in the context of vectorial Boolean functions. The property could have applications as a countermeasure for side-channel attacks in CA-based cryptographic primitives, such as S-boxes and pseudorandom number generators. We first give some theoretical results on the necessary conditions that a CA rule must satisfy in order to meet asynchrony immunity, the most important one being center permutivity. Next, we perform an exhaustive search of all asynchrony immune CA rules of neighborhood size up to $5$, leveraging on the discovered theoretical properties to greatly reduce the size of the search space. △ Less

Submitted 21 July, 2019; v1 submitted 6 January, 2019; originally announced January 2019.

Comments: 12 pages, 2 figures, extended version of the paper "Asynchrony Immune Cellular Automata" by L. Mariot presented at ACA 2016. Corrected small typos from previous version and added three bibliographical references

arXiv:1801.07668 [pdf, other]

doi 10.1007/978-3-319-77553-1_4

Pruning Techniques for Mixed Ensembles of Genetic Programming Models

Authors: Mauro Castelli, Ivo Gonçalves, Luca Manzoni, Leonardo Vanneschi

Abstract: The objective of this paper is to define an effective strategy for building an ensemble of Genetic Programming (GP) models. Ensemble methods are widely used in machine learning due to their features: they average out biases, they reduce the variance and they usually generalize better than single models. Despite these advantages, building ensemble of GP models is not a well-developed topic in the e… ▽ More The objective of this paper is to define an effective strategy for building an ensemble of Genetic Programming (GP) models. Ensemble methods are widely used in machine learning due to their features: they average out biases, they reduce the variance and they usually generalize better than single models. Despite these advantages, building ensemble of GP models is not a well-developed topic in the evolutionary computation community. To fill this gap, we propose a strategy that blends individuals produced by standard syntax-based GP and individuals produced by geometric semantic genetic programming, one of the newest semantics-based method developed in GP. In fact, recent literature showed that combining syntax and semantics could improve the generalization ability of a GP model. Additionally, to improve the diversity of the GP models used to build up the ensemble, we propose different pruning criteria that are based on correlation and entropy, a commonly used measure in information theory. Experimental results,obtained over different complex problems, suggest that the pruning criteria based on correlation and entropy could be effective in improving the generalization ability of the ensemble model and in reducing the computational burden required to build it. △ Less

Submitted 23 January, 2018; originally announced January 2018.

arXiv:1707.00451 [pdf, other]

A Distance Between Populations for n-Points Crossover in Genetic Algorithms

Authors: Mauro Castelli, Gianpiero Cattaneo, Luca Manzoni, Leonardo Vanneschi

Abstract: Genetic algorithms (GAs) are an optimization technique that has been successfully used on many real-world problems. There exist different approaches to their theoretical study. In this paper we complete a recently presented approach to model one-point crossover using pretopologies (or Cech topologies) in two ways. First, we extend it to the case of n-points crossover. Then, we experimentally study… ▽ More Genetic algorithms (GAs) are an optimization technique that has been successfully used on many real-world problems. There exist different approaches to their theoretical study. In this paper we complete a recently presented approach to model one-point crossover using pretopologies (or Cech topologies) in two ways. First, we extend it to the case of n-points crossover. Then, we experimentally study how the distance distribution changes when the number of crossover points increases. △ Less

Submitted 3 July, 2017; originally announced July 2017.

arXiv:1208.2437 [pdf, other]

An Efficient Genetic Programming System with Geometric Semantic Operators and its Application to Human Oral Bioavailability Prediction

Authors: Mauro Castelli, Luca Manzoni, Leonardo Vanneschi

Abstract: Very recently new genetic operators, called geometric semantic operators, have been defined for genetic programming. Contrarily to standard genetic operators, which are uniquely based on the syntax of the individuals, these new operators are based on their semantics, meaning with it the set of input-output pairs on training data. Furthermore, these operators present the interesting property of ind… ▽ More Very recently new genetic operators, called geometric semantic operators, have been defined for genetic programming. Contrarily to standard genetic operators, which are uniquely based on the syntax of the individuals, these new operators are based on their semantics, meaning with it the set of input-output pairs on training data. Furthermore, these operators present the interesting property of inducing a unimodal fitness landscape for every problem that consists in finding a match between given input and output data (for instance regression and classification). Nevertheless, the current definition of these operators has a serious limitation: they impose an exponential growth in the size of the individuals in the population, so their use is impossible in practice. This paper is intended to overcome this limitation, presenting a new genetic programming system that implements geometric semantic operators in an extremely efficient way. To demonstrate the power of the proposed system, we use it to solve a complex real-life application in the field of pharmacokinetic: the prediction of the human oral bioavailability of potential new drugs. Besides the excellent performances on training data, which were expected because the fitness landscape is unimodal, we also report an excellent generalization ability of the proposed system, at least for the studied application. In fact, it outperforms standard genetic programming and a wide set of other well-known machine learning methods. △ Less

Submitted 12 August, 2012; originally announced August 2012.

arXiv:1105.0065 [pdf, ps, other]

Computational Aspects of Asynchronous CA

Authors: Jérôme Chandesris, Alberto Dennunzio, Enrico Formenti, Luca Manzoni

Abstract: This work studies some aspects of the computational power of fully asynchronous cellular automata (ACA). We deal with some notions of simulation between ACA and Turing Machines. In particular, we characterize the updating sequences specifying which are "universal", i.e., allowing a (specific family of) ACA to simulate any TM on any input. We also consider the computational cost of such simulations… ▽ More This work studies some aspects of the computational power of fully asynchronous cellular automata (ACA). We deal with some notions of simulation between ACA and Turing Machines. In particular, we characterize the updating sequences specifying which are "universal", i.e., allowing a (specific family of) ACA to simulate any TM on any input. We also consider the computational cost of such simulations. △ Less

Submitted 30 April, 2011; originally announced May 2011.

Showing 1–30 of 30 results for author: Manzoni, L