Towards Generalized On-Chip Communication for Programmable Accelerators in Heterogeneous Architectures

Authors: Joseph Zuckerman, John-David Wellman, Ajay Vanamali, Manish Shankar, Gabriele Tombesi, Karthik Swaminathan, Kevin Lee, Mohit Kapur, Robert Philhower, Pradip Bose, Luca P. Carloni

Abstract: We present several enhancements to the open-source ESP platform to support flexible and efficient on-chip communication for programmable accelerators in heterogeneous SoCs. These enhancements include 1) a flexible point-to-point communication mechanism between accelerators, 2) a multicast NoC that supports data forwarding to multiple accelerators simultaneously, 3) accelerator synchronization leveraging the SoC's coherence protocol, 4) an accelerator interface that offers fine-grained control over the communication mode used, and 5) an example ISA extension to support our enhancements. Our solution adds negligible area to the SoC architecture and requires minimal changes to the accelerators themselves. We have validated most of these features in complex FPGA prototypes and plan to include them in the open-source release of ESP in the coming months.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Appeared in the Sixth International Workshop on Domain Specific System Architecture (DOSSA-6)

arXiv:2403.19035 [pdf, other]

MiMiC: A High-Performance Framework for Multiscale Molecular Dynamics Simulations

Authors: Andrej Antalík, Andrea Levy, Sonata Kvedaravičiūtė, Sophia K. Johnson, David Carrasco-Busturia, Bharath Raghavan, François Mouvet, Angela Acocella, Sambit Das, Vikram Gavini, Davide Mandelli, Emiliano Ippoliti, Simone Meloni, Paolo Carloni, Ursula Rothlisberger, Jógvan Magnus Haugaard Olsen

Abstract: MiMiC is a framework for performing multiscale simulations, where individual subsystems are handled at different resolutions and/or levels of theory by loosely coupled external programs. To make it highly efficient and flexible, we adopt an interoperable approach based on a multiple-program multiple-data paradigm, serving as an intermediary responsible for fast data exchange and interactions between the subsystems. The main goal of MiMiC is to avoid interfering with the underlying parallelization of the external programs, including the operability on hybrid architectures (e.g., CPU/GPU), and keep their setup and execution as close as possible to the original. At the moment, MiMiC offers an efficient implementation of electrostatic embedding QM/MM that has demonstrated unprecedented parallel scaling in simulations of large biomolecules using CPMD and GROMACS as QM and MM engines, respectively. However, as it is designed for high flexibility with general multiscale models in mind, it can be straightforwardly extended beyond QM/MM. In this article, we illustrate the software design and the features of the framework, which make it a compelling choice for multiscale simulations in the upcoming era of exascale high-performance computing.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2311.05571 [pdf, other]

Effective Data-Driven Collective Variables for Free Energy Calculations from Metadynamics of Paths

Authors: Lukas Müllender, Andrea Rizzi, Michele Parrinello, Paolo Carloni, Davide Mandelli

Abstract: A variety of enhanced sampling methods predict multidimensional free energy landscapes associated with biological and other molecular processes as a function of a few selected collective variables (CVs). The accuracy of these methods is crucially dependent on the ability of the chosen CVs to capture the relevant slow degrees of freedom of the system. For complex processes, finding such CVs is the real challenge. Machine learning (ML) CVs offer, in principle, a solution to handle this problem. However, these methods rely on the availability of high-quality datasets -- ideally incorporating information about physical pathways and transition states -- which are difficult to access, therefore greatly limiting their domain of application. Here, we demonstrate how these datasets can be generated by means of enhanced sampling simulations in trajectory space via the metadynamics of paths [arXiv:2002.09281] algorithm. The approach is expected to provide a general and efficient way to generate efficient ML-based CVs for the fast prediction of free energy landscapes in enhanced sampling simulations. We demonstrate our approach with two numerical examples, a two-dimensional model potential and the isomerization of alanine dipeptide, using deep targeted discriminant analysis as our ML-based CV of choice.

Submitted 8 April, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

arXiv:2303.13337 [pdf, other]

Scalability of 3D-DFT by block tensor-matrix multiplication on the JUWELS Cluster

Authors: Nitin Malapally, Viacheslav Bolnykh, Estela Suarez, Paolo Carloni, Thomas Lippert, Davide Mandelli

Abstract: The 3D Discrete Fourier Transform (DFT) is a technique used to solve problems in disparate fields. Nowadays, the commonly adopted implementation of the 3D-DFT is derived from the Fast Fourier Transform (FFT) algorithm. However, evidence indicates that the distributed memory 3D-FFT algorithm does not scale well due to its use of all-to-all communication. Here, building on the work of Sedukhin et al. [Proceedings of the 30th International Conference on Computers and Their Applications, CATA 2015 pp. 193-200 (01 2015)], we revisit the possibility of improving the scaling of the 3D-DFT by using an alternative approach that uses point-to-point communication, albeit at a higher arithmetic complexity. The new algorithm exploits tensor-matrix multiplications on a volumetrically decomposed domain via three specially adapted variants of Cannon's algorithm. It has here been implemented as a C++ library called S3DFT and tested on the JUWELS Cluster at the Jülich Supercomputing Center. Our implementation of the shared memory tensor-matrix multiplication attained 88% of the theoretical single node peak performance. One variant of the distributed memory tensor-matrix multiplication shows excellent scaling, while the other two show poorer performance, which can be attributed to their intrinsic communication patterns. A comparison of S3DFT with the Intel MKL and FFTW3 libraries indicates that currently iMKL performs best overall, followed in order by FFTW3 and S3DFT. This picture might change with further improvements of the algorithm and/or when running on clusters that use network connections with higher latency, e.g. on cloud platforms.

Submitted 23 March, 2023; originally announced March 2023.

Comments: 18 pages, 8 figures

arXiv:2302.07683 [pdf, other]

doi 10.1073/pnas.2304308120

Multimap targeted free energy estimation

Authors: Andrea Rizzi, Paolo Carloni, Michele Parrinello

Abstract: We present a new method to compute free energies at a quantum mechanical (QM) level of theory from molecular simulations using cheap reference potential energy functions, such as force fields. To overcome the poor overlap between the reference and target distributions, we generalize targeted free energy perturbation (TFEP) to employ multiple configuration maps. While TFEP maps have been obtained before from an expensive training of a normalizing flow neural network (NN), our multimap estimator allows us to use the same set of QM calculations to both optimize the maps and estimate the free energy, thus removing almost completely the overhead due to training. A multimap extension of the multistate Bennett acceptance ratio estimator is also derived for cases where samples from two or more states are available. Furthermore, we propose a one-epoch learning policy that can be used to efficiently avoid overfitting when computing the loss function is expensive compared to generating data. Finally, we show how our multimap approach can be combined with enhanced sampling strategies to overcome the pervasive problem of poor convergence due to slow degrees of freedom. We test our method on the HiPen dataset of drug-like molecules and fragments, and we show that it can accelerate the calculation of the free energy difference of switching from a force field to a DFTB3 potential by about 3 orders of magnitude compared to standard FEP and by a factor of about 8 compared to previously published nonequilibrium calculations.

Submitted 23 August, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

Comments: Main Text: 12 pages, 5 figures, 7 equations. Supplemental Material: 17 pages, 5 figures, 22 equations

Journal ref: Proceedings of the National Academy of Sciences, 120(46), 2023

arXiv:2206.01901 [pdf, other]

Enabling Heterogeneous, Multicore SoC Research with RISC-V and ESP

Authors: Joseph Zuckerman, Paolo Mantovani, Davide Giri, Luca P. Carloni

Abstract: Heterogeneous, multicore SoC architectures are a critical component of today's computing landscape. However, supporting both increasing heterogeneity and multicore execution are significant design challenges. Meanwhile, the growing RISC-V and open-source hardware (OSH) movements have resulted in an increased number of open-source RISC-V processor implementations; however, there are fewer open source SoC design platforms that integrate these processor cores. We present modifications to ESP, an open-source SoC design platform, to enable multicore execution with the RISC-V CVA6 processor. Our implementation is modular and based on standardized interfaces. These properties simplify the integration of new cores. Our modifications enable RISC-V-based SoCs designed with ESP for FPGA to boot Linux SMP and execute multithreaded applications. Coupled with ESP's emphasis on accelerator-centric architectures, our contributions enable the seamless design of a wide range of heterogeneous, multicore SoCs.

Submitted 4 June, 2022; originally announced June 2022.

Comments: To appear in the Sixth Workshop on Computer Architecture Research with RISC-V (CARRV 2022)

arXiv:2201.04740 [pdf, other]

doi 10.1109/NYSDS.2019.8909784

Accelerating Deep Neural Networks for Real-time Data Selection for High-resolution Imaging Particle Detectors

Authors: Yeon-Jae Jwa, Giuseppe Di Guglielmo, Luca P. Carloni, Georgia Karagiorgi

Abstract: This paper presents the custom implementation, optimization, and performance evaluation of convolutional neural networks on field programmable gate arrays, for the purposes of accelerating deep neural network inference on large, two-dimensional image inputs. The targeted application is that of data selection for high-resolution particle imaging detectors, and in particular liquid argon time projection chamber detectors, such as that employed by the future Deep Underground Neutrino Experiment. We motivate this particular application based on the excellent performance of deep neural networks on classifying simulated raw data from the DUNE LArTPC, combined with the need for power-efficient data processing in the case of remote, long-term, and limited-access operating detector conditions.

Submitted 12 January, 2022; originally announced January 2022.

Comments: 10 pages, 5 figures, 8 tables

Journal ref: Published in 2019 New York Scientific Data Summit (NYSDS); Publisher: IEEE; Date Added to IEEE Xplore: 25 November 2019

arXiv:2109.06382 [pdf, other]

doi 10.1145/3466752.3480065

Cohmeleon: Learning-Based Orchestration of Accelerator Coherence in Heterogeneous SoCs

Authors: Joseph Zuckerman, Davide Giri, Jihye Kwon, Paolo Mantovani, Luca P. Carloni

Abstract: One of the most critical aspects of integrating loosely-coupled accelerators in heterogeneous SoC architectures is orchestrating their interactions with the memory hierarchy, especially in terms of navigating the various cache-coherence options: from accelerators accessing off-chip memory directly, bypassing the cache hierarchy, to accelerators having their own private cache. By running real-size applications on FPGA-based prototypes of many-accelerator multi-core SoCs, we show that the best cache-coherence mode for a given accelerator varies at runtime, depending on the accelerator's characteristics, the workload size, and the overall SoC status. Cohmeleon applies reinforcement learning to select the best coherence mode for each accelerator dynamically at runtime, as opposed to statically at design time. It makes these selections adaptively, by continuously observing the system and measuring its performance. Cohmeleon is accelerator-agnostic, architecture-independent, and it requires minimal hardware support. Cohmeleon is also transparent to application programmers and has a negligible software overhead. FPGA-based experiments show that our runtime approach offers, on average, a 38% speedup with a 66% reduction of off-chip memory accesses compared to state-of-the-art design-time approaches. Moreover, it can match runtime solutions that are manually tuned for the target architecture.

Submitted 13 September, 2021; originally announced September 2021.

Comments: To appear in the 54th IEEE/ACM Symposium on Microarchitecture (MICRO 2021)

arXiv:2106.09295 [pdf, other]

doi 10.1021/acs.jpclett.1c02135

Targeted free energy perturbation revisited: Accurate free energies from mapped reference potentials

Authors: Andrea Rizzi, Paolo Carloni, Michele Parrinello

Abstract: We present an approach that extends the theory of targeted free energy perturbation (TFEP) to calculate free energy differences and free energy surfaces at an accurate quantum mechanical level of theory from a cheaper reference potential. The convergence is accelerated by a mapping function that increases the overlap between the target and the reference distributions. Building on recent work, we show that this map can be learned with a normalizing flow neural network, without requiring simulations with the expensive target potential but only a small number of single-point calculations, and, crucially, avoiding the systematic error that was found previously. We validate the method by numerically evaluating the free energy difference in a system with a double-well potential and by describing the free energy landscape of a simple chemical reaction in the gas phase.

Submitted 9 August, 2021; v1 submitted 17 June, 2021; originally announced June 2021.

Comments: Main Text: 7 pages, 2 figures, 18 equations. Supplemental Material: 4 pages, 26 equations

Journal ref: J. Phys. Chem. Lett. 12 (2021) 9449-9454

arXiv:2103.05579 [pdf, other]

hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices

Authors: Farah Fahim, Benjamin Hawks, Christian Herwig, James Hirschauer, Sergo Jindariani, Nhan Tran, Luca P. Carloni, Giuseppe Di Guglielmo, Philip Harris, Jeffrey Krupa, Dylan Rankin, Manuel Blanco Valentin, Josiah Hester, Yingyi Luo, John Mamish, Seda Orgrenci-Memik, Thea Aarrestad, Hamza Javed, Vladimir Loncar, Maurizio Pierini, Adrian Alan Pol, Sioni Summers, Javier Duarte, Scott Hauck, Shih-Chieh Hsu, et al. (5 additional authors not shown)

Abstract: Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends include an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.

Submitted 23 March, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: 10 pages, 8 figures, TinyML Research Symposium 2021

Report number: FERMILAB-CONF-21-080-SCD

arXiv:2009.01178 [pdf, other]

doi 10.1145/3400302.3415753

Agile SoC Development with Open ESP

Authors: Paolo Mantovani, Davide Giri, Giuseppe Di Guglielmo, Luca Piccolboni, Joseph Zuckerman, Emilio G. Cota, Michele Petracca, Christian Pilato, Luca P. Carloni

Abstract: ESP is an open-source research platform for heterogeneous SoC design. The platform combines a modular tile-based architecture with a variety of application-oriented flows for the design and optimization of accelerators. The ESP architecture is highly scalable and strikes a balance between regularity and specialization. The companion methodology raises the level of abstraction to system-level design and enables an automated flow from software and hardware development to full-system prototyping on FPGA. For application developers, ESP offers domain-specific automated solutions to synthesize new accelerators for their software and to map complex workloads onto the SoC architecture. For hardware engineers, ESP offers automated solutions to integrate their accelerator designs into the complete SoC. Conceived as a heterogeneous integration platform and tested through years of teaching at Columbia University, ESP supports the open-source hardware community by providing a flexible platform for agile SoC development.

Submitted 2 September, 2020; originally announced September 2020.

Comments: Invited Paper at the 2020 International Conference On Computer Aided Design (ICCAD) - Special Session on Opensource Tools and Platforms for Agile Development of Specialized Architectures

arXiv:2007.01061 [pdf, other]

doi 10.1109/SP40001.2021.00010

CRYLOGGER: Detecting Crypto Misuses Dynamically

Authors: Luca Piccolboni, Giuseppe Di Guglielmo, Luca P. Carloni, Simha Sethumadhavan

Abstract: Cryptographic (crypto) algorithms are the essential ingredients of all secure systems: crypto hash functions and encryption algorithms, for example, can guarantee properties such as integrity and confidentiality. Developers, however, can misuse the application programming interfaces (API) of such algorithms by using constant keys and weak passwords. This paper presents CRYLOGGER, the first open-source tool to detect crypto misuses dynamically. CRYLOGGER logs the parameters that are passed to the crypto APIs during the execution and checks their legitimacy offline by using a list of crypto rules. We compare CRYLOGGER with CryptoGuard, one of the most effective static tools to detect crypto misuses. We show that our tool complements the results of CryptoGuard, making the case for combining static and dynamic approaches. We analyze 1780 popular Android apps downloaded from the Google Play Store to show that CRYLOGGER can detect crypto misuses on thousands of apps dynamically and automatically. We reverse-engineer 28 Android apps and confirm the issues flagged by CRYLOGGER. We also disclose the most critical vulnerabilities to app developers and collect their feedback.

Submitted 2 July, 2020; originally announced July 2020.

Comments: To appear in the Proceedings of the IEEE Symposium on Security & Privacy (SP) 2021

arXiv:2004.07415 [pdf, other]

The MosaicSim Simulator (Full Technical Report)

Authors: Opeoluwa Matthews, Aninda Manocha, Davide Giri, Marcelo Orenes-Vera, Esin Tureci, Tyler Sorensen, Tae Jun Ham, Juan L. Aragón, Luca P. Carloni, Margaret Martonosi

Abstract: As Moore's Law has slowed and Dennard Scaling has ended, architects are increasingly turning to heterogeneous parallelism and domain-specific hardware-software co-designs. These trends present new challenges for simulation-based performance assessments that are central to early-stage architectural exploration. Simulators must be lightweight to support rich heterogeneous combinations of general purpose cores and specialized processing units. They must also support agile exploration of hardware-software co-design, i.e. changes in the programming model, compiler, ISA, and specialized hardware. To meet these challenges, we introduce MosaicSim, a lightweight, modular simulator for heterogeneous systems, offering accuracy and agility designed specifically for hardware-software co-design explorations. By integrating the LLVM toolchain, MosaicSim enables efficient modeling of instruction dependencies and flexible additions across the stack. Its modularity also allows the composition and integration of different hardware components. We first demonstrate that MosaicSim captures architectural bottlenecks in applications, and accurately models both scaling trends in a multicore setting and accelerator behavior. We then present two case-studies where MosaicSim enables straightforward design space explorations for emerging systems, i.e. data science application acceleration and heterogeneous parallel architectures.

Submitted 15 April, 2020; originally announced April 2020.

Comments: This is a full technical report on the MosaicSim simulator. This version is a variation of the original ISPASS publication with additions describing the accuracy of MosaicSim's memory hierarchy performance modeling and additional hardware features, e.g. branch predictors. This technical report will be maintained as the MosaicSim developers continue to augment the simulator with more features

arXiv:2004.03640 [pdf, other]

doi 10.23919/DATE48585.2020.9116317

ESP4ML: Platform-Based Design of Systems-on-Chip for Embedded Machine Learning

Authors: Davide Giri, Kuan-Lin Chiu, Giuseppe Di Guglielmo, Paolo Mantovani, Luca P. Carloni

Abstract: We present ESP4ML, an open-source system-level design flow to build and program SoC architectures for embedded applications that require the hardware acceleration of machine learning and signal processing algorithms. We realized ESP4ML by combining two established open-source projects (ESP and HLS4ML) into a new, fully-automated design flow. For the SoC integration of accelerators generated by HLS4ML, we designed a set of new parameterized interface circuits synthesizable with high-level synthesis. For accelerator configuration and management, we developed an embedded software runtime system on top of Linux. With this HW/SW layer, we addressed the challenge of dynamically shaping the data traffic on a network-on-chip to activate and support the reconfigurable pipelines of accelerators that are needed by the application workloads currently running on the SoC. We demonstrate our vertically-integrated contributions with the FPGA-based implementations of complete SoC instances booting Linux and executing computer-vision applications that process images taken from the Google Street View database.

Submitted 18 June, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

Comments: Paper published in the proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE)

Journal ref: Design, Automation and Test in Europe Conference & Exhibition (DATE), Grenoble, France, 2020, pp. 1049-1054

arXiv:1912.11153 [pdf, other]

doi 10.1109/TCAD.2018.2857321

PAGURUS: Low-Overhead Dynamic Information Flow Tracking on Loosely Coupled Accelerators

Authors: Luca Piccolboni, Giuseppe Di Guglielmo, Luca P. Carloni

Abstract: Software-based attacks exploit bugs or vulnerabilities to get unauthorized access or leak confidential information. Dynamic information flow tracking (DIFT) is a security technique to track spurious information flows and provide strong security guarantees against such attacks. To secure heterogeneous systems, the spurious information flows must be tracked through all their components, including processors, accelerators (i.e., application-specific hardware components) and memories. We present PAGURUS, a flexible methodology to design a low-overhead shell circuit that adds DIFT support to accelerators. The shell uses a coarse-grain DIFT approach, thus not requiring to make modifications to the accelerator's implementation. We analyze the performance and area overhead of the DIFT shell on FPGAs and we propose a metric, called information leakage, to measure its security guarantees. We perform a design-space exploration to show that we can synthesize accelerators with different characteristics in terms of performance, cost and security guarantees. We also present a case study where we use the DIFT shell to secure an accelerator running on an embedded platform with a DIFT-enhanced RISC-V core.

Submitted 18 December, 2019; originally announced December 2019.

Comments: Published in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD)

Report number: IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. Volume 37 Number 11 (November 2018)

arXiv:1912.10823 [pdf, other]

doi 10.1145/3126566

COSMOS: Coordination of High-Level Synthesis and Memory Optimization for Hardware Accelerators

Authors: Luca Piccolboni, Paolo Mantovani, Giuseppe Di Guglielmo, Luca P. Carloni

Abstract: Hardware accelerators are key to the efficiency and performance of system-on-chip (SoC) architectures. With high-level synthesis (HLS), designers can easily obtain several performance-cost trade-off implementations for each component of a complex hardware accelerator. However, navigating this design space in search of the Pareto-optimal implementations at the system level is a hard optimization task. We present COSMOS, an automatic methodology for the design-space exploration (DSE) of complex accelerators, that coordinates both HLS and memory optimization tools in a compositional way. First, thanks to the co-design of datapath and memory, COSMOS produces a large set of Pareto-optimal implementations for each component of the accelerator. Then, COSMOS leverages compositional design techniques to quickly converge to the desired trade-off point between cost and performance at the system level. When applied to the system-level design (SLD) of an accelerator for wide-area motion imagery (WAMI), COSMOS explores the design space as completely as an exhaustive search, but it reduces the number of invocations to the HLS tool by up to 14.6x.

Submitted 18 December, 2019; originally announced December 2019.

Comments: Published in ACM Transactions on Embedded Computing Systems (TECS)

Journal ref: ACM Trans. Embed. Comput. Syst. 16, 5s, Article 150 (October 2017)

arXiv:1904.10726 [pdf, other]

Exhaustive Search of Ligand Binding Pathways via Volume-based Metadynamics

Authors: Riccardo Capelli, Paolo Carloni, Michele Parrinello

Abstract: Determining the complete set of ligands' binding/unbinding pathways is important for drug discovery and to rationally interpret mutation data. Here we have developed a metadynamics-based technique that addresses this issue and allows estimating affinities in the presence of multiple escape pathways. Our approach is shown on a Lysozyme T4 variant in complex with the benzene molecule. The calculated binding free energy is in agreement with experimental data. Remarkably, not only were we able to find all the previously identified ligand binding pathways, but we also uncovered 3 new ones. These results were obtained at a small computational cost, making this approach valuable for practical applications, such as screening of small-compound libraries.

Submitted 24 April, 2019; originally announced April 2019.

arXiv:1711.05613 [pdf]

doi 10.1021/acs.jctc.7b00508

Open Boundary Simulations of Proteins and Their Hydration Shells by Hamiltonian Adaptive Resolution Scheme

Authors: Thomas Tarenzi, Vania Calandrini, Raffaello Potestio, Alejandro Giorgetti, Paolo Carloni

Abstract: The recently proposed Hamiltonian Adaptive Resolution Scheme (H-AdResS) allows molecular simulations to be performed in an open boundary framework. It allows the resolution of a specific subset of molecules (usually the solvent), which are free to diffuse between the atomistic region and the coarse-grained reservoir, to be changed on the fly. So far, the method has been successfully applied to pure liquids. Coupling the H-AdResS methodology to hybrid models of proteins, such as the Molecular Mechanics/Coarse-Grained (MM/CG) scheme, is a promising approach for rigorous calculations of ligand binding free energies in low-resolution protein models. Towards this goal, here we apply for the first time H-AdResS to two atomistic proteins in dual-resolution solvent, proving its ability to reproduce structural and dynamic properties of both the proteins and the solvent, as obtained from atomistic simulations.

Submitted 15 November, 2017; originally announced November 2017.

Comments: This document is the Accepted Manuscript version of a Published Work that appeared in final form in Journal of Chemical Theory and Computation, copyright © American Chemical Society after peer review and technical editing by the publisher

Journal ref: Journal of Chemical Theory and Computation 2017 13 (11), 5647-5657

arXiv:1710.02202 [pdf, other]

doi 10.1039/C4CP02621G

Statistical Analysis of $σ$-Holes: A Novel Complementary View on Halogen Bonding

Authors: Michal H. Kolář, Paolo Carloni, Pavel Hobza

Abstract: To contribute to the understanding of noncovalent binding of halogenated molecules with a biological activity, electrostatic potential (ESP) maps of more than 2,500 compounds were thoroughly analysed. A peculiar region of positive ESP, called $σ$-hole, is a concept of central importance for halogen bonding. We aim at simplifying the view on $σ$-holes and provide general trends in organic drug-like molecules. The results are in fair agreement with crystallographic surveys of small molecules as well as of biomolecular complexes and attempt to improve the intuition of chemists when dealing with halogenated compounds.

Submitted 25 September, 2017; originally announced October 2017.

Comments: 5 pages, 3 figures, published

Journal ref: Physical Chemistry Chemical Physics 2014, 16(36), 19111-19114

arXiv:1702.07929 [pdf]

doi 10.1021/acs.jpclett.7b00127

Proton Dynamics in Protein Mass Spectrometry

Authors: **yu Li, Wen** Lyu, Giulia Rossetti, Albert Konijnenberg, Antonino Natalello, Emiliano Ippoliti, Modesto Orozco, Frank Sobott, Rita Grandori, Paolo Carloni

Abstract: Native electrospray ionization/ion mobility-mass spectrometry (ESI/IM-MS) allows an accurate determination of low-resolution structural features of proteins. Yet, the presence of proton dynamics, observed already by us for DNA in the gas phase, and its impact on protein structural determinants, have not been investigated so far. Here, we address this issue by a multi-step simulation strategy on a pharmacologically relevant peptide, the N-terminal residues of amyloid-beta peptide (Abeta(1-16)). Our calculations reproduce the experimental maximum charge state from ESI-MS and are also in fair agreement with collision cross section (CCS) data measured here by ESI/IM-MS. Although the main structural features are preserved, subtle conformational changes do take place in the first ~0.1 ms of dynamics. In addition, intramolecular proton dynamics processes occur on the ps-timescale in the gas phase as emerging from quantum mechanics/molecular mechanics (QM/MM) simulations at the B3LYP level of theory. We conclude that proton transfer phenomena do occur frequently during fly time in ESI-MS experiments (typically on the ms timescale). However, the structural changes associated with the process do not significantly affect the structural determinants.

Submitted 25 February, 2017; originally announced February 2017.

Comments: J. Phys. Chem. Lett. 2017

arXiv:1608.01912 [pdf, ps, other]

doi 10.1007/s10867-017-9443-x

DNA like-charge attraction and overcharging by divalent counterions in the presence of divalent co-ions

Authors: Nguyen Viet Duc, Toan T. Nguyen, Paolo Carloni

Abstract: Strongly correlated electrostatics of DNA systems has drawn the interest of many groups, especially the condensation and overcharging of DNA by multivalent counterions. By adding counterions of different valencies and shapes, one can enhance or reduce DNA overcharging. In this paper, we focus on the effect of multivalent co-ions, specifically divalent co-ions such as SO$_4^{2-}$. A computational experiment of DNA condensation using Monte-Carlo simulation in the grand canonical ensemble is carried out, where the DNA system is in equilibrium with a bulk solution containing a mixture of salts with co-ions of different valency. Compared to a system with purely monovalent co-ions, the influence of divalent co-ions shows up in multiple aspects. Divalent co-ions lead to an increase of monovalent salt in the DNA condensate. Because monovalent salts mostly participate in linear screening of electrostatic interactions in the system, the entry of more monovalent salt molecules into the condensate leads to screening out of short-range DNA-DNA like-charge attraction and weaker DNA condensation free energy. The overcharging of DNA by multivalent counterions is also reduced in the presence of divalent co-ions. Strong repulsions between DNA and divalent co-ions and among divalent co-ions themselves lead to a depletion of negative ions near the DNA surface as compared to the case without divalent co-ions. At large distance, the DNA-DNA repulsive interaction is stronger in the presence of divalent co-ions, suggesting that the role of divalent co-ions is not only that of simple stronger linear screening.

Submitted 24 May, 2017; v1 submitted 5 August, 2016; originally announced August 2016.

Comments: 10 pages, 6 figures

Journal ref: J. Biol. Phys. (2017)

arXiv:1307.5565 [pdf, other]

doi 10.1021/ct3009914

RNA/peptide binding driven by electrostatics -- Insight from bi-directional pulling simulations

Authors: Trang N. Do, Paolo Carloni, Gabriele Varani, Giovanni Bussi

Abstract: RNA/protein interactions play crucial roles in controlling gene expression. They are becoming important targets for pharmaceutical applications. Due to RNA flexibility and to the strength of electrostatic interactions, standard docking methods are insufficient. We here present a computational method which allows studying the binding of RNA molecules and charged peptides with atomistic, explicit-solvent molecular dynamics. In our method, a suitable estimate of the electrostatic interaction is used as an order parameter (collective variable) which is then accelerated using bi-directional pulling simulations. Since the electrostatic interaction is only used to enhance the sampling, the approximations used to compute it do not affect the final accuracy. The method is employed to characterize the binding of TAR RNA from HIV-1 and a small cyclic peptide. Our simulation protocol allows blindly predicting the binding pocket and pose as well as the binding affinity. The method is general and could be applied to study other electrostatics-driven binding events.

Submitted 21 July, 2013; originally announced July 2013.

Comments: Reprinted (adapted) with permission from J. Chem. Theory Comput., 2013, 9 (3) 1720 (2013). Copyright (2013) American Chemical Society

Journal ref: J. Chem. Theory Comput. 9, 1720 (2013)

arXiv:0805.3829 [pdf, ps, other]

Many-Body meets QM/MM: Application to indole in water solution

Authors: Adriano Mosca Conte, Emiliano Ippoliti, Rodolfo Del Sole, Paolo Carloni, Olivia Pulci

Abstract: Spectral properties of chromophores are used to probe complex biological processes in vitro and in vivo, yet how the environment tunes their optical properties is far from being fully understood. Here we present a method to calculate such properties on large scale systems, like biologically relevant molecules in aqueous solution. Our approach is based on many body perturbation theory combined with a quantum-mechanics/molecular-mechanics (QM/MM) approach. We show here how to include quasi-particle and excitonic effects for the calculation of optical absorption spectra in a QM/MM scheme. We apply this scheme, together with the well established TDDFT approach, to indole in water solution. Our calculations show that the solvent induces a redshift in the main spectral peak of indole, in quantitative agreement with the experiments, and point to the importance of performing averages over molecular dynamics configurations for calculating optical properties.

Submitted 25 May, 2008; originally announced May 2008.

Comments: 5 pages, 4 figures

arXiv:q-bio/0609025 [pdf, ps, other]

doi 10.1021/ja060896t

Convergent dynamics in the protease enzymatic superfamily

Authors: Vincenzo Carnevale, Simone Raugei, Cristian Micheletti, Paolo Carloni

Abstract: Proteases regulate various aspects of the life cycle in all organisms by cleaving specific peptide bonds. Their action is so central for biochemical processes that at least 2% of any known genome encodes for proteolytic enzymes. Here we show that selected protease pairs, despite differences in oligomeric state, catalytic residues and fold, share a common structural organization of functionally relevant regions which are further shown to undergo similar concerted movements. The structural and dynamical similarities found pervasively across evolutionarily distant clans point to common mechanisms for peptide hydrolysis.

Submitted 15 September, 2006; originally announced September 2006.

Comments: 13 pages, 6 figures

Journal ref: J. Am. Chem. Soc. 128, 9766-9772 (2006)

arXiv:cond-mat/0405145 [pdf, ps, other]

Accurate and efficient description of protein vibrational dynamics: comparing molecular dynamics and Gaussian models

Authors: Cristian Micheletti, Paolo Carloni, Amos Maritan

Abstract: Current all-atom potential based molecular dynamics (MD) allow the identification of a protein's functional motions on a wide range of time-scales, up to a few tens of ns. However, functional large scale motions of proteins may occur on a time-scale currently not accessible by all-atom potential based molecular dynamics. To avoid the massive computational effort required by this approach, several simplified schemes have been introduced. One of the most satisfactory is the Gaussian Network approach based on the energy expansion in terms of the deviation of the protein backbone from its native configuration. Here we consider an extension of this model which captures in a more realistic way the distribution of native interactions due to the introduction of effective sidechain centroids. Since their location is entirely determined by the protein backbone, the model is amenable to the same exact and computationally efficient treatment as previous simpler models. The ability of the model to describe the correlated motion of protein residues in thermodynamic equilibrium is established through a series of successful comparisons with an extensive (14 ns) MD simulation based on the AMBER potential of HIV-1 protease in complex with a peptide substrate. Thus, the model presented here emerges as a powerful tool to provide preliminary, fast yet accurate characterizations of proteins' near-native motion.

Submitted 7 May, 2004; originally announced May 2004.

Comments: 14 pages, 7 figures

arXiv:cond-mat/0101229 [pdf, ps, other]

Molecular Dynamics Studies on HIV-1 Protease: Drug Resistance and Folding Pathways

Authors: Fabio Cecconi, Cristian Micheletti, Paolo Carloni, Amos Maritan

Abstract: Drug resistance to HIV-1 Protease involves accumulation of multiple mutations in the protein. Here we investigate the role of these mutations by using molecular dynamics simulations which exploit the influence of the native-state topology in the folding process. Our calculations show that sites contributing to phenotypic resistance of FDA-approved drugs are among the most sensitive positions for the stability of partially folded states and should play a relevant role in the folding process. Furthermore, associations between amino acid sites mutating under drug treatment are shown to be statistically correlated. The striking correlation between clinical data and our calculations suggests a novel approach to the design of drugs tailored to bind regions crucial not only for protein function but also for folding.

Submitted 16 January, 2001; originally announced January 2001.

Comments: Revtex, 14 pages, 7 eps figures. Proteins, Structure Function and Genetics, in press (2001)

arXiv:physics/9907003 [pdf, ps, other]

Serine Proteases: an Ab Initio Molecular Dynamics Study

Authors: L. De Santis, P. Carloni

Abstract: In serine proteases (SP's), the H-bond between His-57 and Asp-102, and that between Gly-193 and the transition state intermediate play a crucial role for enzymatic function. To shed light on the nature of these interactions, we have carried out ab initio molecular dynamics simulations on complexes representing adducts between the reaction intermediate and elastase (one protein belonging to the SP family). Our calculations indicate the presence of a low-barrier H-bond between His-57 and Asp-102, in complete agreement with NMR experiments on enzyme–transition state analog complexes. Comparison with an ab initio molecular dynamics simulation on a model of the substrate–enzyme adduct indicates that the Gly-193–induced strong stabilization of the intermediate is accomplished by charge/dipole interactions and not by H-bonding as previously suggested. Inclusion of the protein electric field in the calculations does not affect significantly the charge distribution.

Submitted 1 July, 1999; originally announced July 1999.

Comments: 27 pages, 7 figures in separate files. To appear in PROTEINS: Structure, Function and Genetics

arXiv:cond-mat/9902227 [pdf, ps, other]

Protein Design is a Key Factor for Subunit-subunit Association

Authors: Cecilia Clementi, Paolo Carloni, Amos Maritan

Abstract: Fundamental questions about the role of the quaternary structures are addressed using a statistical mechanics off-lattice model of a dimer protein. The model, in spite of its simplicity, captures key features of the monomer-monomer interactions revealed by atomic force experiments. Force curves during association and dissociation are characterized by sudden jumps followed by smooth behavior and form hysteresis loops. Furthermore, the process is reversible in a finite range of temperature stabilizing the dimer. It is shown that in the interface between the two monomeric subunits the design procedure naturally favors those amino acids whose mutual interaction is stronger. Furthermore, it is shown that the width of the hysteresis loop increases as the design procedure improves, i.e. stabilizes the dimer more.

Submitted 16 February, 1999; originally announced February 1999.

Comments: submitted to "Proceedings of the National Academy of Sciences, USA"

Showing 1–28 of 28 results for author: Carloni, P