-
Virtuoso: Video-based Intelligence for real-time tuning on SOCs
Authors:
Jayoung Lee,
PengCheng Wang,
Ran Xu,
Venkat Dasari,
Noah Weston,
Yin Li,
Saurabh Bagchi,
Somali Chaterji
Abstract:
Efficient and adaptive computer vision systems have been proposed to make computer vision tasks, such as image classification and object detection, optimized for embedded or mobile devices. These solutions, quite recent in their origin, focus on optimizing the model (a deep neural network, DNN) or the system by designing an adaptive system with approximation knobs. In spite of several recent efforts, we show that existing solutions suffer from two major drawbacks. First, the system does not consider energy consumption of the models while making a decision on which model to run. Second, the evaluation does not consider the practical scenario of contention on the device, due to other co-resident workloads. In this work, we propose an efficient and adaptive video object detection system, Virtuoso, which is jointly optimized for accuracy, energy efficiency, and latency. Underlying Virtuoso is a multi-branch execution kernel that is capable of running at different operating points in the accuracy-energy-latency axes, and a lightweight runtime scheduler to select the best-fit execution branch to satisfy the user requirement. To fairly compare with Virtuoso, we benchmark 15 state-of-the-art or widely used protocols, including Faster R-CNN (FRCNN), YOLO v3, SSD, EfficientDet, SELSA, MEGA, REPP, FastAdapt, and our in-house adaptive variants of FRCNN+, YOLO+, SSD+, and EfficientDet+ (our variants have enhanced efficiency for mobiles). With this comprehensive benchmark, Virtuoso has shown superiority to all the above protocols, leading the accuracy frontier at every efficiency level on NVIDIA Jetson mobile GPUs. Specifically, Virtuoso has achieved an accuracy of 63.9%, which is more than 10 percentage points higher than popular object detection models such as FRCNN (51.1%) and YOLO (49.5%).
Submitted 24 December, 2021;
originally announced December 2021.
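
The Virtuoso abstract above describes a runtime scheduler that picks the best-fit execution branch for a user requirement. A minimal sketch of that selection step follows, assuming each branch comes with predicted accuracy, latency, and energy figures. The `Branch` type, the `select_branch` function, the field names, and the budget parameters are all hypothetical and are not taken from the paper; the paper's actual scheduler may work quite differently.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Branch:
    """One operating point of a multi-branch detector (hypothetical fields)."""
    name: str
    accuracy: float    # predicted accuracy, as a fraction in [0, 1]
    latency_ms: float  # predicted per-frame latency
    energy_mj: float   # predicted per-frame energy


def select_branch(
    branches: Sequence[Branch],
    max_latency_ms: float,
    max_energy_mj: float,
) -> Optional[Branch]:
    """Return the most accurate branch that fits both budgets, or None."""
    feasible = [
        b for b in branches
        if b.latency_ms <= max_latency_ms and b.energy_mj <= max_energy_mj
    ]
    if not feasible:
        return None  # no branch satisfies the requirement
    return max(feasible, key=lambda b: b.accuracy)


# Placeholder numbers for illustration only; these are not measurements.
example_branches = [
    Branch("light", accuracy=0.40, latency_ms=20.0, energy_mj=100.0),
    Branch("medium", accuracy=0.55, latency_ms=40.0, energy_mj=250.0),
    Branch("heavy", accuracy=0.65, latency_ms=90.0, energy_mj=600.0),
]
chosen = select_branch(example_branches, max_latency_ms=50.0, max_energy_mj=300.0)
```

Under contention from co-resident workloads, the predicted latency and energy of each branch would presumably shift, so a scheduler of this kind would need to re-evaluate its choice as conditions change.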
-
Two-Photon Dual-Comb LiDAR
Authors:
Hollie Wright,
**ghua Sun,
David McKendrick,
Nick Weston,
Derryck T. Reid
Abstract:
The interferometric signals produced in conventional dual-comb laser ranging require femtosecond lasers with long-term f_CEO stability, and are limited to an upper sampling rate by radio-frequency aliasing considerations. By using cross-polarized dual combs and two-photon detection we demonstrate carrier-phase-insensitive cross-correlations at sampling rates of up to 12x the conventional dual-comb aliasing limit, recording these in a digitizer-based acquisition system to implement ranging with sub-100-nm precision. We then extend this concept to show how the high data burden of conventional dual-comb acquisition can be eliminated by using a simple microcontroller as a ns-precision stopwatch to record the time intervals separating the two-photon cross-correlation pulses, providing real-time and continuous LiDAR-like distance metrology capable of sub-100-nm precision and dynamic acquisition for unlimited periods.
Submitted 19 August, 2021;
originally announced August 2021.
-
SMASH: One-Shot Model Architecture Search through HyperNetworks
Authors:
Andrew Brock,
Theodore Lim,
J. M. Ritchie,
Nick Weston
Abstract:
Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model's architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectively search over a wide range of architectures at the cost of a single training run. To facilitate this search, we develop a flexible mechanism based on memory read-writes that allows us to define a wide range of network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases. We validate our method (SMASH) on CIFAR-10 and CIFAR-100, STL-10, ModelNet10, and Imagenet32x32, achieving competitive performance with similarly-sized hand-designed networks. Our code is available at https://github.com/ajbrock/SMASH
Submitted 17 August, 2017;
originally announced August 2017.
-
FreezeOut: Accelerate Training by Progressively Freezing Layers
Authors:
Andrew Brock,
Theodore Lim,
J. M. Ritchie,
Nick Weston
Abstract:
The early layers of a deep neural net have the fewest parameters, but take up the most computation. In this extended abstract, we propose to only train the hidden layers for a set portion of the training run, freezing them out one-by-one and excluding them from the backward pass. Through experiments on CIFAR, we empirically demonstrate that FreezeOut yields savings of up to 20% wall-clock time during training with 3% loss in accuracy for DenseNets, a 20% speedup without loss of accuracy for ResNets, and no improvement for VGG networks. Our code is publicly available at https://github.com/ajbrock/FreezeOut
Submitted 18 June, 2017; v1 submitted 15 June, 2017;
originally announced June 2017.
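
The FreezeOut abstract above describes training each hidden layer for only a set portion of the run and freezing layers one by one. A minimal sketch of such a schedule follows, assuming the freeze points are spaced linearly from a starting fraction `t0` (for the earliest layer) to 1.0 (for the last layer). The linear spacing, the `t0` parameter, and both function names are assumptions made for illustration; the authors' released code may use a different schedule.

```python
from typing import List


def freeze_points(num_layers: int, t0: float = 0.5) -> List[float]:
    """Fraction of the training run at which each layer is frozen.

    Layer 0 (earliest) freezes at t0, the last layer at 1.0, with linear
    spacing in between. The spacing rule is an assumption of this sketch.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if num_layers == 1:
        return [1.0]
    step = (1.0 - t0) / (num_layers - 1)
    return [t0 + i * step for i in range(num_layers)]


def trainable_layers(num_layers: int, progress: float, t0: float = 0.5) -> List[int]:
    """Indices of layers still being trained at a given progress in [0, 1].

    Layers missing from the result are frozen and can be excluded from
    the backward pass.
    """
    points = freeze_points(num_layers, t0)
    return [i for i, t in enumerate(points) if progress < t]
```

Because layers freeze from the input side first, the backward pass can stop at the earliest still-trainable layer, which is where the wall-clock savings would come from.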
-
Neural Photo Editing with Introspective Adversarial Networks
Authors:
Andrew Brock,
Theodore Lim,
J. M. Ritchie,
Nick Weston
Abstract:
The increasingly photorealistic sample quality of generative image models suggests their feasibility in applications beyond image generation. We present the Neural Photo Editor, an interface that leverages the power of generative neural networks to make large, semantically coherent changes to existing images. To tackle the challenge of achieving accurate reconstructions without loss of feature quality, we introduce the Introspective Adversarial Network, a novel hybridization of the VAE and GAN. Our model efficiently captures long-range dependencies through use of a computational block based on weight-shared dilated convolutions, and improves generalization performance with Orthogonal Regularization, a novel weight regularization method. We validate our contributions on CelebA, SVHN, and CIFAR-100, and produce samples and reconstructions with high visual fidelity.
Submitted 6 February, 2017; v1 submitted 22 September, 2016;
originally announced September 2016.
-
Generative and Discriminative Voxel Modeling with Convolutional Neural Networks
Authors:
Andrew Brock,
Theodore Lim,
J. M. Ritchie,
Nick Weston
Abstract:
When working with three-dimensional data, choice of representation is key. We explore voxel-based models, and present evidence for the viability of voxellated representations in applications including shape modeling and object classification. Our key contributions are methods for training voxel-based variational autoencoders, a user interface for exploring the latent space learned by the autoencoder, and a deep convolutional neural network architecture for object classification. We address challenges unique to voxel-based representations, and empirically evaluate our models on the ModelNet benchmark, where we demonstrate a 51.5% relative improvement in the state of the art for object classification.
Submitted 16 August, 2016; v1 submitted 15 August, 2016;
originally announced August 2016.