HD-Bind: Encoding of Molecular Structure with Low Precision, Hyperdimensional Binary Representations
Authors:
Derek Jones,
Jonathan E. Allen,
Xiaohua Zhang,
Behnam Khaleghi,
Jaeyoung Kang,
Weihong Xu,
Niema Moshiri,
Tajana S. Rosing
Abstract:
Publicly available collections of drug-like molecules have grown to comprise 10s of billions of possibilities in recent history due to advances in chemical synthesis. Traditional methods for identifying ``hit'' molecules from a large collection of potential drug-like candidates have relied on biophysical theory to compute approximations to the Gibbs free energy of the binding interaction between t…
▽ More
Publicly available collections of drug-like molecules have grown to comprise 10s of billions of possibilities in recent history due to advances in chemical synthesis. Traditional methods for identifying ``hit'' molecules from a large collection of potential drug-like candidates have relied on biophysical theory to compute approximations to the Gibbs free energy of the binding interaction between the drug to its protein target. A major drawback of the approaches is that they require exceptional computing capabilities to consider for even relatively small collections of molecules.
Hyperdimensional Computing (HDC) is a recently proposed learning paradigm that is able to leverage low-precision binary vector arithmetic to build efficient representations of the data that can be obtained without the need for gradient-based optimization approaches that are required in many conventional machine learning and deep learning approaches. This algorithmic simplicity allows for acceleration in hardware that has been previously demonstrated for a range of application areas. We consider existing HDC approaches for molecular property classification and introduce two novel encoding algorithms that leverage the extended connectivity fingerprint (ECFP) algorithm.
We show that HDC-based inference methods are as much as 90 times more efficient than more complex representative machine learning methods and achieve an acceleration of nearly 9 orders of magnitude as compared to inference with molecular docking. We demonstrate multiple approaches for the encoding of molecular data for HDC and examine their relative performance on a range of challenging molecular property prediction and drug-protein binding classification tasks. Our work thus motivates further investigation into molecular representation learning to develop ultra-efficient pre-screening tools.
△ Less
Submitted 27 March, 2023;
originally announced March 2023.
Trustworthy AI Inference Systems: An Industry Research View
Authors:
Rosario Cammarota,
Matthias Schunter,
Anand Rajan,
Fabian Boemer,
Ágnes Kiss,
Amos Treiber,
Christian Weinert,
Thomas Schneider,
Emmanuel Stapf,
Ahmad-Reza Sadeghi,
Daniel Demmler,
Joshua Stock,
Huili Chen,
Siam Umar Hussain,
Sadegh Riazi,
Farinaz Koushanfar,
Saransh Gupta,
Tajan Simunic Rosing,
Kamalika Chaudhuri,
Hamid Nejatollahi,
Nikil Dutt,
Mohsen Imani,
Kim Laine,
Anuj Dubey,
Aydin Aysu
, et al. (4 additional authors not shown)
Abstract:
In this work, we provide an industry research view for approaching the design, deployment, and operation of trustworthy Artificial Intelligence (AI) inference systems. Such systems provide customers with timely, informed, and customized inferences to aid their decision, while at the same time utilizing appropriate security protection mechanisms for AI models. Additionally, such systems should also…
▽ More
In this work, we provide an industry research view for approaching the design, deployment, and operation of trustworthy Artificial Intelligence (AI) inference systems. Such systems provide customers with timely, informed, and customized inferences to aid their decision, while at the same time utilizing appropriate security protection mechanisms for AI models. Additionally, such systems should also use Privacy-Enhancing Technologies (PETs) to protect customers' data at any time. To approach the subject, we start by introducing current trends in AI inference systems. We continue by elaborating on the relationship between Intellectual Property (IP) and private data protection in such systems. Regarding the protection mechanisms, we survey the security and privacy building blocks instrumental in designing, building, deploying, and operating private AI inference systems. For example, we highlight opportunities and challenges in AI systems using trusted execution environments combined with more recent advances in cryptographic techniques to protect data in use. Finally, we outline areas of further development that require the global collective attention of industry, academia, and government researchers to sustain the operation of trustworthy AI inference systems.
△ Less
Submitted 10 February, 2023; v1 submitted 10 August, 2020;
originally announced August 2020.
QubitHD: A Stochastic Acceleration Method for HD Computing-Based Machine Learning
Authors:
Samuel Bosch,
Alexander Sanchez de la Cerda,
Mohsen Imani,
Tajana Simunic Rosing,
Giovanni De Micheli
Abstract:
Machine Learning algorithms based on Brain-inspired Hyperdimensional(HD) computing imitate cognition by exploiting statistical properties of high-dimensional vector spaces. It is a promising solution for achieving high energy efficiency in different machine learning tasks, such as classification, semi-supervised learning, and clustering. A weakness of existing HD computing-based ML algorithms is t…
▽ More
Machine Learning algorithms based on Brain-inspired Hyperdimensional(HD) computing imitate cognition by exploiting statistical properties of high-dimensional vector spaces. It is a promising solution for achieving high energy efficiency in different machine learning tasks, such as classification, semi-supervised learning, and clustering. A weakness of existing HD computing-based ML algorithms is the fact that they have to be binarized to achieve very high energy efficiency. At the same time, binarized models reach lower classification accuracies. To solve the problem of the trade-off between energy efficiency and classification accuracy, we propose the QubitHD algorithm. It stochastically binarizes HD-based algorithms, while maintaining comparable classification accuracies to their non-binarized counterparts. The FPGA implementation of QubitHD provides a 65% improvement in terms of energy efficiency, and a 95% improvement in terms of training time, as compared to state-of-the-art HD-based ML algorithms. It also outperforms state-of-the-art low-cost classifiers (such as Binarized Neural Networks) in terms of speed and energy efficiency by an order of magnitude during training and inference.
△ Less
Submitted 10 October, 2022; v1 submitted 27 November, 2019;
originally announced November 2019.