Search | arXiv e-print repository

Induced Feature Selection by Structured Pruning

Authors: Nathan Hubens, Victor Delvigne, Matei Mancas, Bernard Gosselin, Marius Preda, Titus Zaharia

Abstract: The advent of sparsity inducing techniques in neural networks has been of a great help in the last few years. Indeed, those methods allowed to find lighter and faster networks, able to perform more efficiently in resource-constrained environment such as mobile devices or highly requested servers. Such a sparsity is generally imposed on the weights of neural networks, reducing the footprint of the… ▽ More The advent of sparsity inducing techniques in neural networks has been of a great help in the last few years. Indeed, those methods allowed to find lighter and faster networks, able to perform more efficiently in resource-constrained environment such as mobile devices or highly requested servers. Such a sparsity is generally imposed on the weights of neural networks, reducing the footprint of the architecture. In this work, we go one step further by imposing sparsity jointly on the weights and on the input data. This can be achieved following a three-step process: 1) impose a certain structured sparsity on the weights of the network; 2) track back input features corresponding to zeroed blocks of weight; 3) remove useless weights and input features and retrain the network. Performing pruning both on the network and on input data not only allows for extreme reduction in terms of parameters and operations but can also serve as an interpretation process. Indeed, with the help of data pruning, we now have information about which input feature is useful for the network to keep its performance. Experiments conducted on a variety of architectures and datasets: MLP validated on MNIST, CIFAR10/100 and ConvNets (VGG16 and ResNet18), validated on CIFAR10/100 and CALTECH101 respectively, show that it is possible to achieve additional gains in terms of total parameters and in FLOPs by performing pruning on input data, while also increasing accuracy. △ Less

Submitted 20 March, 2023; originally announced March 2023.

arXiv:2203.05807 [pdf, other]

Improve Convolutional Neural Network Pruning by Maximizing Filter Variety

Authors: Nathan Hubens, Matei Mancas, Bernard Gosselin, Marius Preda, Titus Zaharia

Abstract: Neural network pruning is a widely used strategy for reducing model storage and computing requirements. It allows to lower the complexity of the network by introducing sparsity in the weights. Because taking advantage of sparse matrices is still challenging, pruning is often performed in a structured way, i.e. removing entire convolution filters in the case of ConvNets, according to a chosen pruni… ▽ More Neural network pruning is a widely used strategy for reducing model storage and computing requirements. It allows to lower the complexity of the network by introducing sparsity in the weights. Because taking advantage of sparse matrices is still challenging, pruning is often performed in a structured way, i.e. removing entire convolution filters in the case of ConvNets, according to a chosen pruning criteria. Common pruning criteria, such as l1-norm or movement, usually do not consider the individual utility of filters, which may lead to: (1) the removal of filters exhibiting rare, thus important and discriminative behaviour, and (2) the retaining of filters with redundant information. In this paper, we present a technique solving those two issues, and which can be appended to any pruning criteria. This technique ensures that the criteria of selection focuses on redundant filters, while retaining the rare ones, thus maximizing the variety of remaining filters. The experimental results, carried out on different datasets (CIFAR-10, CIFAR-100 and CALTECH-101) and using different architectures (VGG-16 and ResNet-18) demonstrate that it is possible to achieve similar sparsity levels while maintaining a higher performance when appending our filter selection technique to pruning criteria. Moreover, we assess the quality of the found sparse sub-networks by applying the Lottery Ticket Hypothesis and find that the addition of our method allows to discover better performing tickets in most cases △ Less

Submitted 11 March, 2022; originally announced March 2022.

Comments: 10 pages

arXiv:2112.08227 [pdf, other]

doi 10.1145/3378184.3378224

An Experimental Study of the Impact of Pre-training on the Pruning of a Convolutional Neural Network

Authors: Nathan Hubens, Matei Mancas, Bernard Gosselin, Marius Preda, Titus Zaharia

Abstract: In recent years, deep neural networks have known a wide success in various application domains. However, they require important computational and memory resources, which severely hinders their deployment, notably on mobile devices or for real-time applications. Neural networks usually involve a large number of parameters, which correspond to the weights of the network. Such parameters, obtained wi… ▽ More In recent years, deep neural networks have known a wide success in various application domains. However, they require important computational and memory resources, which severely hinders their deployment, notably on mobile devices or for real-time applications. Neural networks usually involve a large number of parameters, which correspond to the weights of the network. Such parameters, obtained with the help of a training process, are determinant for the performance of the network. However, they are also highly redundant. The pruning methods notably attempt to reduce the size of the parameter set, by identifying and removing the irrelevant weights. In this paper, we examine the impact of the training strategy on the pruning efficiency. Two training modalities are considered and compared: (1) fine-tuned and (2) from scratch. The experimental results obtained on four datasets (CIFAR10, CIFAR100, SVHN and Caltech101) and for two different CNNs (VGG16 and MobileNet) demonstrate that a network that has been pre-trained on a large corpus (e.g. ImageNet) and then fine-tuned on a particular dataset can be pruned much more efficiently (up to 80% of parameter reduction) than the same network trained from scratch. △ Less

Submitted 15 December, 2021; originally announced December 2021.

Comments: 7 pages, published at APPIS 2020

arXiv:2107.02086 [pdf, other]

One-Cycle Pruning: Pruning ConvNets Under a Tight Training Budget

Authors: Nathan Hubens, Matei Mancas, Bernard Gosselin, Marius Preda, Titus Zaharia

Abstract: Introducing sparsity in a neural network has been an efficient way to reduce its complexity while kee** its performance almost intact. Most of the time, sparsity is introduced using a three-stage pipeline: 1) train the model to convergence, 2) prune the model according to some criterion, 3) fine-tune the pruned model to recover performance. The last two steps are often performed iteratively, lea… ▽ More Introducing sparsity in a neural network has been an efficient way to reduce its complexity while kee** its performance almost intact. Most of the time, sparsity is introduced using a three-stage pipeline: 1) train the model to convergence, 2) prune the model according to some criterion, 3) fine-tune the pruned model to recover performance. The last two steps are often performed iteratively, leading to reasonable results but also to a time-consuming and complex process. In our work, we propose to get rid of the first step of the pipeline and to combine the two other steps in a single pruning-training cycle, allowing the model to jointly learn for the optimal weights while being pruned. We do this by introducing a novel pruning schedule, named One-Cycle Pruning, which starts pruning from the beginning of the training, and until its very end. Adopting such a schedule not only leads to better performing pruned models but also drastically reduces the training budget required to prune a model. Experiments are conducted on a variety of architectures (VGG-16 and ResNet-18) and datasets (CIFAR-10, CIFAR-100 and Caltech-101), and for relatively high sparsity values (80%, 90%, 95% of weights removed). Our results show that One-Cycle Pruning consistently outperforms commonly used pruning schedules such as One-Shot Pruning, Iterative Pruning and Automated Gradual Pruning, on a fixed training budget. △ Less

Submitted 3 July, 2022; v1 submitted 5 July, 2021; originally announced July 2021.

Comments: Accepted at Sparsity in Neural Networks (SNN 2021)

arXiv:2105.02814 [pdf, other]

End-to-end deep meta modelling to calibrate and optimize energy consumption and comfort

Authors: Max Cohen, Sylvain Le Corff, Maurice Charbit, Marius Preda, Gilles Nozière

Abstract: In this paper, we propose a new end-to-end methodology to optimize the energy performance as well as comfort and air quality in large buildings without any renovation work. We introduce a metamodel based on recurrent neural networks and trained to predict the behavior of a general class of buildings using a database sampled from a simulation program. This metamodel is then deployed in different fr… ▽ More In this paper, we propose a new end-to-end methodology to optimize the energy performance as well as comfort and air quality in large buildings without any renovation work. We introduce a metamodel based on recurrent neural networks and trained to predict the behavior of a general class of buildings using a database sampled from a simulation program. This metamodel is then deployed in different frameworks and its parameters are calibrated using the specific data of two real buildings. Parameters are estimated by comparing the predictions of the metamodel with real data obtained from sensors using the CMA-ES algorithm, a derivative free optimization procedure. Then, energy consumptions are optimized while maintaining a target thermal comfort and air quality, using the NSGA-II multi-objective optimization procedure. The numerical experiments illustrate how this metamodel ensures a significant gain in energy efficiency, up to almost 10%, while being computationally much more appealing than numerical models and flexible enough to be adapted to several types of buildings. △ Less

Submitted 5 November, 2021; v1 submitted 1 February, 2021; originally announced May 2021.

Comments: arXiv admin note: text overlap with arXiv:2006.12390

Journal ref: Energy and Buildings, Elsevier, 2021

arXiv:2103.09113 [pdf, other]

EtherSolve: Computing an Accurate Control-Flow Graph from Ethereum Bytecode

Authors: Filippo Contro, Marco Crosara, Mariano Ceccato, Mila Dalla Preda

Abstract: Motivated by the immutable nature of Ethereum smart contracts and of their transactions, quite many approaches have been proposed to detect defects and security problems before smart contracts become persistent in the blockchain and they are granted control on substantial financial value. Because smart contracts source code might not be available, static analysis approaches mostly face the chall… ▽ More Motivated by the immutable nature of Ethereum smart contracts and of their transactions, quite many approaches have been proposed to detect defects and security problems before smart contracts become persistent in the blockchain and they are granted control on substantial financial value. Because smart contracts source code might not be available, static analysis approaches mostly face the challenge of analysing compiled Ethereum bytecode, that is available directly from the official blockchain. However, due to the intrinsic complexity of Ethereum bytecode (especially in jump resolution), static analysis encounters significant obstacles that reduce the accuracy of exiting automated tools. This paper presents a novel static analysis algorithm based on the symbolic execution of the Ethereum operand stack that allows us to resolve jumps in Ethereum bytecode and to construct an accurate control-flow graph (CFG) of the compiled smart contracts. EtherSolve is a prototype implementation of our approach. Experimental results on a significant set of real world Ethereum smart contracts show that EtherSolve improves the accuracy of the execrated CFGs with respect to the state of the art available approaches. Many static analysis techniques are based on the CFG representation of the code and would therefore benefit from the accurate extraction of the CFG. For example, we implemented a simple extension of EtherSolve that allows to detect instances of the re-entrancy vulnerability. △ Less

Submitted 16 March, 2021; originally announced March 2021.

arXiv:2006.12390 [pdf, other]

End-to-end deep metamodeling to calibrate and optimize energy loads

Authors: Max Cohen, Maurice Charbit, Sylvain Le Corff, Marius Preda, Gilles Nozière

Abstract: In this paper, we propose a new end-to-end methodology to optimize the energy performance and the comfort, air quality and hygiene of large buildings. A metamodel based on a Transformer network is introduced and trained using a dataset sampled with a simulation program. Then, a few physical parameters and the building management system settings of this metamodel are calibrated using the CMA-ES opt… ▽ More In this paper, we propose a new end-to-end methodology to optimize the energy performance and the comfort, air quality and hygiene of large buildings. A metamodel based on a Transformer network is introduced and trained using a dataset sampled with a simulation program. Then, a few physical parameters and the building management system settings of this metamodel are calibrated using the CMA-ES optimization algorithm and real data obtained from sensors. Finally, the optimal settings to minimize the energy loads while maintaining a target thermal comfort and air quality are obtained using a multi-objective optimization procedure. The numerical experiments illustrate how this metamodel ensures a significant gain in energy efficiency while being computationally much more appealing than models requiring a huge number of physical parameters to be estimated. △ Less

Submitted 19 June, 2020; originally announced June 2020.

arXiv:1702.02406 [pdf, other]

SEA: String Executability Analysis by Abstract Interpretation

Authors: Vincenzo Arceri, Mila Dalla Preda, Roberto Giacobazzi, Isabella Mastroeni

Abstract: Dynamic languages often employ reflection primitives to turn dynamically generated text into executable code at run-time. These features make standard static analysis extremely hard if not impossible because its essential data structures, i.e., the control-flow graph and the system of recursive equations associated with the program to analyse, are themselves dynamically mutating objects. We introd… ▽ More Dynamic languages often employ reflection primitives to turn dynamically generated text into executable code at run-time. These features make standard static analysis extremely hard if not impossible because its essential data structures, i.e., the control-flow graph and the system of recursive equations associated with the program to analyse, are themselves dynamically mutating objects. We introduce SEA, an abstract interpreter for automatic sound string executability analysis of dynamic languages employing bounded (i.e, finitely nested) reflection and dynamic code generation. Strings are statically approximated in an abstract domain of finite state automata with basic operations implemented as symbolic transducers. SEA combines standard program analysis together with string executability analysis. The analysis of a call to reflection determines a call to the same abstract interpreter over a code which is synthesised directly from the result of the static string executability analysis at that program point. The use of regular languages for approximating dynamically generated code structures allows SEA to soundly approximate safety properties of self modifying programs yet maintaining efficiency. Soundness here means that the semantics of the code synthesised by the analyser to resolve reflection over-approximates the semantics of the code dynamically built at run-rime by the program at that point. △ Less

Submitted 8 February, 2017; originally announced February 2017.

Comments: 28 pages, 11 figures

arXiv:1611.09067 [pdf, other]

doi 10.23638/LMCS-13(2:1)2017

Dynamic Choreographies: Theory And Implementation

Authors: Mila Dalla Preda, Maurizio Gabbrielli, Saverio Giallorenzo, Ivan Lanese, Jacopo Mauro

Abstract: Programming distributed applications free from communication deadlocks and race conditions is complex. Preserving these properties when applications are updated at runtime is even harder. We present a choreographic approach for programming updatable, distributed applications. We define a choreography language, called Dynamic Interaction-Oriented Choreography (AIOC), that allows the programmer to s… ▽ More Programming distributed applications free from communication deadlocks and race conditions is complex. Preserving these properties when applications are updated at runtime is even harder. We present a choreographic approach for programming updatable, distributed applications. We define a choreography language, called Dynamic Interaction-Oriented Choreography (AIOC), that allows the programmer to specify, from a global viewpoint, which parts of the application can be updated. At runtime, these parts may be replaced by new AIOC fragments from outside the application. AIOC programs are compiled, generating code for each participant in a process-level language called Dynamic Process-Oriented Choreographies (APOC). We prove that APOC distributed applications generated from AIOC specifications are deadlock free and race free and that these properties hold also after any runtime update. We instantiate the theoretical model above into a programming framework called Adaptable Interaction-Oriented Choreographies in Jolie (AIOCJ) that comprises an integrated development environment, a compiler from an extension of AIOCs to distributed Jolie programs, and a runtime environment to support their execution. △ Less

Submitted 6 April, 2017; v1 submitted 28 November, 2016; originally announced November 2016.

Comments: arXiv admin note: text overlap with arXiv:1407.0970

Journal ref: Logical Methods in Computer Science, Volume 13, Issue 2 (April 10, 2017) lmcs:3263

arXiv:1407.0975 [pdf, other]

AIOCJ: A Choreographic Framework for Safe Adaptive Distributed Applications

Authors: Mila Dalla Preda, Saverio Giallorenzo, Ivan Lanese, Jacopo Mauro, Maurizio Gabbrielli

Abstract: We present AIOCJ, a framework for programming distributed adaptive applications. Applications are programmed using AIOC, a choreographic language suited for expressing patterns of interaction from a global point of view. AIOC allows the programmer to specify which parts of the application can be adapted. Adaptation takes place at runtime by means of rules, which can change during the execution to… ▽ More We present AIOCJ, a framework for programming distributed adaptive applications. Applications are programmed using AIOC, a choreographic language suited for expressing patterns of interaction from a global point of view. AIOC allows the programmer to specify which parts of the application can be adapted. Adaptation takes place at runtime by means of rules, which can change during the execution to tackle possibly unforeseen adaptation needs. AIOCJ relies on a solid theory that ensures applications to be deadlock-free by construction also after adaptation. We describe the architecture of AIOCJ, the design of the AIOC language, and an empirical validation of the framework. △ Less

Submitted 10 July, 2014; v1 submitted 3 July, 2014; originally announced July 2014.

Comments: Technical Report

arXiv:1407.0970 [pdf, ps, other]

Dynamic Choreographies - Safe Runtime Updates of Distributed Applications

Authors: Mila Dalla Preda, Maurizio Gabbrielli, Saverio Giallorenzo, Ivan Lanese, Jacopo Mauro

Abstract: Programming distributed applications free from communication deadlocks and races is complex. Preserving these properties when applications are updated at runtime is even harder. We present DIOC, a language for programming distributed applications that are free from deadlocks and races by construction. A DIOC program describes a whole distributed application as a unique entity (choreography). DIOC… ▽ More Programming distributed applications free from communication deadlocks and races is complex. Preserving these properties when applications are updated at runtime is even harder. We present DIOC, a language for programming distributed applications that are free from deadlocks and races by construction. A DIOC program describes a whole distributed application as a unique entity (choreography). DIOC allows the programmer to specify which parts of the application can be updated. At runtime, these parts may be replaced by new DIOC fragments from outside the application. DIOC programs are compiled, generating code for each site, in a lower-level language called DPOC. We formalise both DIOC and DPOC semantics as labelled transition systems and prove the correctness of the compilation as a trace equivalence result. As corollaries, DPOC applications are free from communication deadlocks and races, even in presence of runtime updates. △ Less

Submitted 31 March, 2015; v1 submitted 3 July, 2014; originally announced July 2014.

Comments: Technical Report

Showing 1–11 of 11 results for author: Preda, M