High Throughput Multi-Channel Parallelized Diffraction Convolutional Neural Network Accelerator

Authors: Zibo Hu, Shurui Li, Russell L. T. Schwartz, Maria Solyanik-Gorgone, Mario Miscuglio, Puneet Gupta, Volker J. Sorger

Abstract: Convolutional neural networks are paramount in image and signal processing, including the relevant classification and training tasks, and constitute the majority of machine learning compute demand today. With convolution operations being computationally intensive, next generation hardware accelerators need to offer parallelization and algorithmic-hardware homomorphism. Fortunately, diffractive display optics is capable of million-channel parallel data processing at low latency; however, it has thus far only shown slow, tens-of-Hertz single image and kernel capability, thereby significantly underdelivering on its performance potential. Here, we demonstrate an operation-parallelized high-throughput Fourier optic convolutional neural network accelerator. For the first time, simultaneous processing of multiple kernels in the Fourier domain, enabled by optical diffraction, has been achieved alongside the input parallelism already conventional in the field. Additionally, we show an approximately one-hundred-fold system speed-up over existing optical diffraction-based processors, and this demonstration rivals the performance of modern electronic solutions. Therefore, this system is capable of processing large-scale matrices about ten times faster than state-of-the-art electronic systems.

Submitted 7 July, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

Comments: 13 pages, 4 figures

arXiv:2105.09943 [pdf, other]

doi 10.1080/23746149.2021.1981155

Prospects and applications of photonic neural networks

Authors: Chaoran Huang, Volker J. Sorger, Mario Miscuglio, Mohammed Al-Qadasi, Avilash Mukherjee, Sudip Shekhar, Lukas Chrostowski, Lutz Lampe, Mitchell Nichols, Mable P. Fok, Daniel Brunner, Alexander N. Tait, Thomas Ferreira de Lima, Bicky A. Marquez, Paul R. Prucnal, Bhavin J. Shastri

Abstract: Neural networks have enabled applications in artificial intelligence through machine learning, and neuromorphic computing. Software implementations of neural networks on conventional computers that have separate memory and processor (and that operate sequentially) are limited in speed and energy efficiency. Neuromorphic engineering aims to build processors in which hardware mimics neurons and synapses in the brain for distributed and parallel processing. Neuromorphic engineering enabled by photonics (optical physics) can offer sub-nanosecond latencies and high bandwidth with low energies to extend the domain of artificial intelligence and neuromorphic computing applications to machine learning acceleration, nonlinear programming, intelligent signal processing, etc. Photonic neural networks have been demonstrated on integrated platforms and free-space optics depending on the class of applications being targeted. Here, we discuss the prospects and demonstrated applications of these photonic neural networks.

Submitted 20 May, 2021; originally announced May 2021.

arXiv:2102.10398 [pdf]

All-Chalcogenide Programmable All-Optical Deep Neural Networks

Authors: Ting Yu, Xiaoxuan Ma, Ernest Pastor, Jonathan K. George, Simon Wall, Mario Miscuglio, Robert E. Simpson, Volker J. Sorger

Abstract: Deep learning algorithms are revolutionising many aspects of modern life. Typically, they are implemented in CMOS-based hardware with severely limited memory access times and inefficient data-routing. All-optical neural networks without any electro-optic conversions could alleviate these shortcomings. However, an all-optical nonlinear activation function, which is a vital building block for optical neural networks, needs to be developed efficiently on-chip. Here, we introduce and demonstrate both optical synapse weighting and all-optical nonlinear thresholding using two different effects in a chalcogenide material photonic platform. We show how the structural phase transitions in a wide-bandgap phase-change material enable storing the neural network weights via non-volatile photonic memory, whilst resonant bond destabilisation is used as a nonlinear activation threshold without changing the material. These two different transitions within chalcogenides enable programmable neural networks with near-zero static power consumption once trained, in addition to picosecond delays when performing inference tasks, which are not limited by the wire charging that limits electrical circuits; for instance, we show that nanosecond-order weight programming and near-instantaneous weight updates enable accurate inference tasks within 20 picoseconds in a 3-layer all-optical neural network.
Optical neural networks that bypass electro-optic conversion altogether hold promise for network-edge machine learning applications where real-time decision-making is critical, such as autonomous vehicles or navigation systems, for example in signal pre-processing for LIDAR systems.

Submitted 27 February, 2021; v1 submitted 20 February, 2021; originally announced February 2021.

arXiv:2011.07391 [pdf, other]

Channel Tiling for Improved Performance and Accuracy of Optical Neural Network Accelerators

Authors: Shurui Li, Mario Miscuglio, Volker J. Sorger, Puneet Gupta

Abstract: Low latency, high throughput inference on Convolution Neural Networks (CNNs) remains a challenge, especially for applications requiring large input or large kernel sizes. 4F optics provides a solution to accelerate CNNs by converting convolutions into Fourier-domain point-wise multiplications that are computationally 'free' in optical domain. However, existing 4F CNN systems suffer from the all-positive sensor readout issue which makes the implementation of a multi-channel, multi-layer CNN not scalable or even impractical. In this paper we propose a simple channel tiling scheme for 4F CNN systems that utilizes the high resolution of 4F system to perform channel summation inherently in optical domain before sensor detection, so the outputs of different channels can be correctly accumulated. Compared to state of the art, channel tiling gives similar accuracy, significantly better robustness to sensing quantization (33% improvement in required sensing precision) error and noise (10dB reduction in tolerable sensing noise), 0.5X total filters required, 10-50X+ throughput improvement and as much as 3X reduction in required output camera resolution/bandwidth. Not requiring any additional optical hardware, the proposed channel tiling approach addresses an important throughput and precision bottleneck of high-speed, massively-parallel optical 4F computing systems.

Submitted 14 January, 2021; v1 submitted 14 November, 2020; originally announced November 2020.

Comments: 11 pages, 8 figures
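
Note on the abstract above: its statement that 4F optics turns convolutions into Fourier-domain point-wise multiplications is an application of the convolution theorem. The following is a minimal numerical sketch of that identity only, assuming a 1D circular convolution and a naive DFT; it is not the paper's channel tiling scheme or its optical model, and the function names (`dft`, `circular_convolve_direct`, `circular_convolve_fourier`) and sample values are hypothetical, chosen for illustration.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse when requested)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The inverse transform carries the 1/n normalization.
        out.append(acc / n if inverse else acc)
    return out


def circular_convolve_direct(signal, kernel):
    """Circular convolution computed directly from its definition."""
    n = len(signal)
    return [
        sum(signal[(i - j) % n] * kernel[j] for j in range(n))
        for i in range(n)
    ]


def circular_convolve_fourier(signal, kernel):
    """Same convolution via point-wise multiplication in the Fourier domain."""
    spectrum = [a * b for a, b in zip(dft(signal), dft(kernel))]
    return [v.real for v in dft(spectrum, inverse=True)]


# Illustrative values only; the kernel is zero-padded to the signal length.
signal = [1.0, 2.0, 3.0, 4.0]
kernel = [1.0, -1.0, 0.0, 0.0]

direct = circular_convolve_direct(signal, kernel)
fourier = circular_convolve_fourier(signal, kernel)
```

Both routes agree up to floating-point rounding. In this digital sketch the transforms are where the computational cost lies, while the multiplication between them is a single point-wise pass over the spectrum.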

arXiv:2007.05380 [pdf]

Analog Computing with Metatronic Circuits

Authors: Mario Miscuglio, Yaliang Gui, Xiaoxuan Ma, Shuai Sun, Tarek El-Ghazawi, Tatsuo Itoh, Andrea Alù, Volker J. Sorger

Abstract: Analog photonic solutions offer unique opportunities to address complex computational tasks with unprecedented performance in terms of energy dissipation and speeds, overcoming current limitations of modern computing architectures based on electron flows and digital approaches. The lack of modularization and lumped element reconfigurability in photonics has prevented the transition to an all-optical analog computing platform. Here, we explore a nanophotonic platform based on epsilon-near-zero materials capable of solving in the analog domain partial differential equations (PDE). Wavelength stretching in zero-index media enables highly nonlocal interactions within the board based on the conduction of electric displacement, which can be monitored to extract the solution of a broad class of PDE problems. By exploiting control of deposition technique through process parameters, we demonstrate the possibility of implementing the proposed nano-optic processor using CMOS-compatible indium-tin-oxide, whose optical properties can be tuned by carrier injection to obtain programmability at high speeds and low energy requirements. Our nano-optical analog processor can be integrated at chip-scale, processing arbitrary inputs at the speed of light.

Submitted 10 July, 2020; originally announced July 2020.

arXiv:2002.03780 [pdf]

doi 10.1063/5.0001942

Photonic tensor cores for machine learning

Authors: Mario Miscuglio, Volker J. Sorger

Abstract: With an ongoing trend in computing hardware towards increased heterogeneity, domain-specific co-processors are emerging as alternatives to centralized paradigms. The tensor core unit (TPU) has been shown to outperform graphics processing units by almost 3 orders of magnitude, enabled by higher signal throughput and energy efficiency. In this context, photons bear a number of synergistic physical properties while phase-change materials allow for local nonvolatile mnemonic functionality in these emerging distributed non-von Neumann architectures. While several photonic neural network designs have been explored, a photonic TPU to perform matrix vector multiplication and summation is yet outstanding. Here we introduce an integrated photonics-based TPU by strategically utilizing a) photonic parallelism via wavelength division multiplexing, b) high 2 Peta-operations-per-second throughputs enabled by 10s of picosecond-short delays from optoelectronics and compact photonic integrated circuitry, and c) zero power-consuming novel photonic multi-state memories based on phase-change materials featuring vanishing losses in the amorphous state. Combining these physical synergies of material, function, and system, we show that the performance of this 8-bit photonic TPU can be 2-3 orders higher compared to an electrical TPU whilst featuring similar chip areas. This work shows that photonic specialized processors have the potential to augment electronic systems and may perform exceptionally well in network-edge devices in the looming 5G networks and beyond.

Submitted 29 June, 2020; v1 submitted 1 February, 2020; originally announced February 2020.

Journal ref: Applied Physics Reviews 7, 031404 (2020)

arXiv:1906.10487 [pdf, other]

A Winograd-based Integrated Photonics Accelerator for Convolutional Neural Networks

Authors: Armin Mehrabian, Mario Miscuglio, Yousra Alkabani, Volker J. Sorger, Tarek El-Ghazawi

Abstract: Neural Networks (NNs) have become the mainstream technology in the artificial intelligence (AI) renaissance over the past decade. Among different types of neural networks, convolutional neural networks (CNNs) have been widely adopted as they have achieved leading results in many fields such as computer vision and speech recognition. This success in part is due to the widespread availability of capable underlying hardware platforms. Applications have always been a driving factor for design of such hardware architectures. Hardware specialization can expose us to novel architectural solutions, which can outperform general purpose computers for tasks at hand. Although different applications demand for different performance measures, they all share speed and energy efficiency as high priorities. Meanwhile, photonics processing has seen a resurgence due to its inherited high speed and low power nature. Here, we investigate the potential of using photonics in CNNs by proposing a CNN accelerator design based on Winograd filtering algorithm. Our evaluation results show that while a photonic accelerator can compete with current-state-of-the-art electronic platforms in terms of both speed and power, it has the potential to improve the energy efficiency by up to three orders of magnitude.

Submitted 4 December, 2019; v1 submitted 25 June, 2019; originally announced June 2019.

Comments: 12 pages, photonics, artificial intelligence, convolutional neural networks, Winograd

MSC Class: B.0; B.7; C.1; C.1.2; C.1.4; C.3; C.5; I.2; I.2.5; I.2.10; I.2.11; I.4; I.5; I.5.2; I.5.4; I.5.5; I.6; I.6.3

ACM Class: B.0; B.7; C.1; C.1.2; C.1.4; C.3; C.5; I.2; I.2.5; I.2.10; I.2.11; I.4; I.5; I.5.2; I.5.4; I.5.5; I.6; I.6.3

Showing 1–7 of 7 results for author: Miscuglio, M