-
cuConv: A CUDA Implementation of Convolution for CNN Inference
Authors:
Marc Jordà,
Pedro Valero-Lara,
Antonio J. Peña
Abstract:
Convolutions are the core operation of deep learning applications based on Convolutional Neural Networks (CNNs). Current GPU architectures are highly efficient for training and deploying deep CNNs, and hence are widely used in production for this purpose. State-of-the-art implementations, however, are inefficient for some commonly used network configurations.
In this paper we propose a GPU-based implementation of the convolution operation for CNN inference that favors coalesced accesses, without requiring prior data transformations. Our experiments demonstrate that our proposal yields notable performance improvements in a range of common CNN forward propagation convolution configurations, with speedups of up to 2.29x with respect to the best implementation of convolution in cuDNN, hence covering a relevant region in currently existing approaches.
Submitted 30 March, 2021;
originally announced March 2021.
-
Enabling Homomorphically Encrypted Inference for Large DNN Models
Authors:
Guillermo Lloret-Talavera,
Marc Jorda,
Harald Servat,
Fabian Boemer,
Chetan Chauhan,
Shigeki Tomishima,
Nilesh N. Shah,
Antonio J. Peña
Abstract:
The proliferation of machine learning services in the last few years has raised data privacy concerns. Homomorphic encryption (HE) enables inference using encrypted data, but it incurs 100x-10,000x memory and runtime overheads. Secure deep neural network (DNN) inference using HE is currently limited by computing and memory resources, with frameworks requiring hundreds of gigabytes of DRAM to evaluate small models. To overcome these limitations, in this paper we explore the feasibility of leveraging hybrid memory systems composed of DRAM and persistent memory. In particular, we explore the recently released Intel Optane PMem technology and the Intel HE-Transformer nGraph to run large neural networks such as MobileNetV2 (in its largest variant) and ResNet-50 for the first time in the literature. We present an in-depth analysis of the efficiency of the executions with different hardware and software configurations. Our results show that DNN inference using HE exhibits access patterns that are friendly to this memory configuration, yielding efficient executions.
Submitted 29 April, 2021; v1 submitted 30 March, 2021;
originally announced March 2021.
-
UniGrasp: Learning a Unified Model to Grasp with Multifingered Robotic Hands
Authors:
Lin Shao,
Fabio Ferreira,
Mikael Jorda,
Varun Nambiar,
Jianlan Luo,
Eugen Solowjow,
Juan Aparicio Ojea,
Oussama Khatib,
Jeannette Bohg
Abstract:
To achieve a successful grasp, gripper attributes such as its geometry and kinematics play a role as important as the object geometry. The majority of previous work has focused on developing grasp methods that generalize over novel object geometry but are specific to a certain robot hand. We propose UniGrasp, an efficient data-driven grasp synthesis method that considers both the object geometry and gripper attributes as inputs. UniGrasp is based on a novel deep neural network architecture that selects sets of contact points from the input point cloud of the object. The proposed model is trained on a large dataset to produce contact points that are in force closure and reachable by the robot hand. By using contact points as output, we can transfer between a diverse set of multifingered robotic hands. Our model produces over 90% valid contact points in Top10 predictions in simulation and more than 90% successful grasps in real world experiments for various known two-fingered and three-fingered grippers. Our model also achieves 93%, 83% and 90% successful grasps in real world experiments for an unseen two-fingered gripper and two unseen multi-fingered anthropomorphic robotic hands.
Submitted 7 September, 2020; v1 submitted 23 October, 2019;
originally announced October 2019.
-
geomstats: a Python Package for Riemannian Geometry in Machine Learning
Authors:
Nina Miolane,
Johan Mathe,
Claire Donnat,
Mikael Jorda,
Xavier Pennec
Abstract:
We introduce geomstats, a Python package that performs computations on manifolds such as hyperspheres, hyperbolic spaces, spaces of symmetric positive definite matrices and Lie groups of transformations. We provide efficient and extensively unit-tested implementations of these manifolds, together with useful Riemannian metrics and associated Exponential and Logarithm maps. The corresponding geodesic distances provide a range of intuitive choices of Machine Learning loss functions. We also give the corresponding Riemannian gradients. The operations implemented in geomstats are available with different computing backends such as numpy, tensorflow and keras. We have enabled GPU implementation and integrated geomstats manifold computations into the keras deep learning framework. This paper also presents a review of manifolds in machine learning and an overview of the geomstats package with examples demonstrating its use for efficient and user-friendly Riemannian geometry.
Submitted 5 November, 2018; v1 submitted 21 May, 2018;
originally announced May 2018.
-
Real Time Collision Detection and Identification for Robotic Manipulators
Authors:
Elena Galbally,
Mikael Jorda
Abstract:
The majority of everyday tasks involve interacting with unstructured environments. This implies that, for robots to be truly useful, they must be able to handle contacts. This paper explores how a particle filter can be used to localize a contact point and estimate the external force. We demonstrate the capability of the particle filter on a simulated 4DoF planar robotic arm and compare it to a well-established analytical approach.
Submitted 1 February, 2018;
originally announced February 2018.