-
CT Synthesis with Conditional Diffusion Models for Abdominal Lymph Node Segmentation
Authors:
Yongrui Yu,
Hanyu Chen,
Zitian Zhang,
Qiong Xiao,
Wenhui Lei,
Linrui Dai,
Yu Fu,
Hui Tan,
Guan Wang,
Peng Gao,
Xiaofan Zhang
Abstract:
Despite the significant success achieved by deep learning methods in medical image segmentation, researchers still struggle in the computer-aided diagnosis of abdominal lymph nodes due to the complex abdominal environment, small and indistinguishable lesions, and limited annotated data. To address these problems, we present a pipeline that integrates a conditional diffusion model for lymph node generation with the nnU-Net model for lymph node segmentation, improving the segmentation performance of abdominal lymph nodes by synthesizing diverse, realistic abdominal lymph node data. We propose LN-DDPM, a conditional denoising diffusion probabilistic model (DDPM) for lymph node (LN) generation. LN-DDPM uses lymph node masks and anatomical structure masks as model conditions. These conditions act through two conditioning mechanisms, global structure conditioning and local detail conditioning, to distinguish lymph nodes from their surroundings and better capture lymph node characteristics. The resulting paired abdominal lymph node images and masks are used for the downstream segmentation task. Experimental results on abdominal lymph node datasets demonstrate that LN-DDPM outperforms other generative methods in abdominal lymph node image synthesis and better assists the downstream abdominal lymph node segmentation task.
Submitted 26 March, 2024;
originally announced March 2024.
-
LimSim++: A Closed-Loop Platform for Deploying Multimodal LLMs in Autonomous Driving
Authors:
Daocheng Fu,
Wenjie Lei,
Licheng Wen,
Pinlong Cai,
Song Mao,
Min Dou,
Botian Shi,
Yu Qiao
Abstract:
The emergence of Multimodal Large Language Models ((M)LLMs) has ushered in new avenues in artificial intelligence, particularly for autonomous driving by offering enhanced understanding and reasoning capabilities. This paper introduces LimSim++, an extended version of LimSim designed for the application of (M)LLMs in autonomous driving. Acknowledging the limitations of existing simulation platforms, LimSim++ addresses the need for a long-term closed-loop infrastructure supporting continuous learning and improved generalization in autonomous driving. The platform offers extended-duration, multi-scenario simulations, providing crucial information for (M)LLM-driven vehicles. Users can engage in prompt engineering, model evaluation, and framework enhancement, making LimSim++ a versatile tool for research and practice. This paper additionally introduces a baseline (M)LLM-driven framework, systematically validated through quantitative experiments across diverse scenarios. The open-source resources of LimSim++ are available at: https://pjlab-adg.github.io/limsim-plus/.
Submitted 12 April, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
DiVa: An Iterative Framework to Harvest More Diverse and Valid Labels from User Comments for Music
Authors:
Hongru Liang,
Jingyao Liu,
Yuanxin Xiang,
Jiachen Du,
Lanjun Zhou,
Shushen Pan,
Wenqiang Lei
Abstract:
Towards effective music search, it is vital to form a complete set of labels for each song. However, current solutions fall short because they cannot produce mappings diverse enough to make up for the information missed by the gold labels. Based on the observation that such missing information may already be present in user comments, we propose to study automated music labeling in an essential but under-explored setting, where the model is required to harvest more diverse and valid labels from users' comments given limited gold labels. To this end, we design an iterative framework (DiVa) to harvest more $\underline{\text{Di}}$verse and $\underline{\text{Va}}$lid labels from user comments for music. The framework enables a classifier to form complete sets of labels for songs via pseudo-labels inferred from pre-trained classifiers and a novel joint score function. Experiments on a densely annotated test set show the superiority of DiVa over state-of-the-art solutions in producing more diverse labels missed by the gold labels. We hope our work can inspire future research on automated music labeling.
Submitted 9 August, 2023;
originally announced August 2023.
-
Spatio-Temporal Structure Consistency for Semi-supervised Medical Image Classification
Authors:
Wentao Lei,
Lei Liu,
Li Liu
Abstract:
Intelligent medical diagnosis has shown remarkable progress based on large-scale datasets with precise annotations. However, few labeled images are available because expert annotation is very expensive. To fully exploit the easily available unlabeled data, we propose a novel Spatio-Temporal Structure Consistent (STSC) learning framework. Specifically, a Gram matrix is derived to combine spatial structure consistency and temporal structure consistency. This Gram matrix captures the structural similarity among the representations of different training samples. At the spatial level, our framework explicitly enforces the consistency of structural similarity among different samples under perturbations. At the temporal level, we consider the consistency of the structural similarity across training iterations by digging out the stable sub-structures in a relation graph. Experiments on two medical image datasets (i.e., ISIC 2018 challenge and ChestX-ray14) show that our method outperforms state-of-the-art SSL methods. Furthermore, extensive qualitative analysis of the Gram matrices and Grad-CAM heatmaps is presented to validate the effectiveness of our method.
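As a rough illustration of the Gram-matrix idea in this abstract, the sketch below builds a cosine-similarity Gram matrix over a batch of feature vectors and measures how much it changes between two perturbed views of the same batch. The function names, the cosine normalization, and the mean-squared penalty are assumptions made for illustration; the abstract does not specify the exact formulation used in STSC.

```python
import math


def gram_matrix(features):
    """Cosine-similarity Gram matrix for a batch of feature vectors.

    Entry (i, j) is the similarity between sample i and sample j.
    Normalizing each vector first is an illustrative assumption.
    """
    normed = []
    for vec in features:
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # guard against zero vectors
        normed.append([v / norm for v in vec])
    return [[sum(a * b for a, b in zip(u, w)) for w in normed] for u in normed]


def structure_consistency(feats_view_a, feats_view_b):
    """Mean squared difference between the Gram matrices of two views."""
    g_a = gram_matrix(feats_view_a)
    g_b = gram_matrix(feats_view_b)
    n = len(g_a)
    total = sum((g_a[i][j] - g_b[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)
```

A penalty of zero means the pairwise similarity structure of the batch is unchanged by the perturbation, which is the property a spatial consistency term of this kind would encourage.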
Submitted 2 March, 2023;
originally announced March 2023.
-
HMRNet: High and Multi-Resolution Network with Bidirectional Feature Calibration for Brain Structure Segmentation in Radiotherapy
Authors:
Hao Fu,
Guotai Wang,
Wenhui Lei,
Wei Xu,
Qianfei Zhao,
Shichuan Zhang,
Kang Li,
Shaoting Zhang
Abstract:
Accurate segmentation of Anatomical brain Barriers to Cancer spread (ABCs) plays an important role in automatic delineation of the Clinical Target Volume (CTV) of brain tumors in radiotherapy. Although variants of U-Net are state-of-the-art segmentation models, they have limited performance when dealing with ABCs structures of various shapes and sizes, especially thin structures (e.g., the falx cerebri) that span only a few slices. To deal with this problem, we propose a High and Multi-Resolution Network (HMRNet) that consists of a multi-scale feature learning branch and a high-resolution branch, which can maintain high-resolution contextual information and extract more robust representations of anatomical structures at various scales. We further design a Bidirectional Feature Calibration (BFC) block to enable the two branches to generate spatial attention maps for mutual feature calibration. Considering the different sizes and positions of ABCs structures, our network was applied after a rough localization of each structure to obtain fine segmentation results. Experiments on the MICCAI 2020 ABCs challenge dataset showed that: 1) our proposed two-stage segmentation strategy largely outperformed methods segmenting all the structures in just one stage; 2) the proposed HMRNet with two branches can maintain high-resolution representations and is effective in improving performance on thin structures; 3) the proposed BFC block outperformed existing attention methods using monodirectional feature calibration. Our method won second place in the ABCs 2020 challenge and has potential for more accurate and reasonable delineation of the CTV of brain tumors.
Submitted 6 June, 2022;
originally announced June 2022.
-
Outage Analysis of Aerial Semi-Grant-Free NOMA Systems
Authors:
Hongjiang Lei,
Chen Zhu,
Ki-Hong Park,
Imran Shafique Ansari,
Weijia Lei,
Hong Tang,
Kyeong Jin Kim
Abstract:
In this paper, we analyze the outage performance of unmanned aerial vehicle (UAV)-enabled downlink non-orthogonal multiple access (NOMA) communication systems with the semi-grant-free (SGF) transmission scheme. A UAV provides coverage services for a grant-based (GB) user, and one user is allowed to utilize the same channel resource opportunistically. The hybrid successive interference cancellation scheme is implemented in downlink NOMA scenarios for the first time. The analytical expressions for the exact and asymptotic outage probability (OP) of the grant-free (GF) user are derived. The results demonstrate that a non-zero diversity order can be achieved only under stringent conditions on users' quality of service requirements. To address this issue, we then propose an efficient dynamic power allocation (DPA) scheme that relaxes such data rate constraints. The analytical expressions for the exact and asymptotic OP of the GF user with the DPA scheme are derived. Finally, Monte Carlo simulation results are presented to validate the correctness of the derived analytical expressions and demonstrate the effects of the UAV's location and altitude on the OP of the GF user.
Submitted 18 February, 2024; v1 submitted 12 May, 2022;
originally announced May 2022.
-
One-shot Weakly-Supervised Segmentation in Medical Images
Authors:
Wenhui Lei,
Qi Su,
Ran Gu,
Na Wang,
Xinglong Liu,
Guotai Wang,
Xiaofan Zhang,
Shaoting Zhang
Abstract:
Deep neural networks usually require a large number of accurate annotations to achieve outstanding performance in medical image segmentation. One-shot segmentation and weakly-supervised learning are promising research directions that lower labeling effort by learning a new class from only one annotated image and by utilizing coarse labels, respectively. Previous works usually fail to leverage the anatomical structure and suffer from class imbalance and low contrast problems. Hence, we present an innovative framework for 3D medical image segmentation in one-shot and weakly-supervised settings. First, a propagation-reconstruction network is proposed to project scribbles from an annotated volume to unlabeled 3D images, based on the assumption that anatomical patterns in different human bodies are similar. Then a dual-level feature denoising module is designed to refine the scribbles based on anatomical- and pixel-level features. After expanding the scribbles to pseudo masks, we can train a segmentation model for the new class with a noisy-label training strategy. Experiments on one abdomen and one head-and-neck CT dataset show that the proposed method obtains significant improvement over state-of-the-art methods and performs robustly even under severe class imbalance and low contrast.
Submitted 21 November, 2021;
originally announced November 2021.
-
Domain Composition and Attention for Unseen-Domain Generalizable Medical Image Segmentation
Authors:
Ran Gu,
Jingyang Zhang,
Rui Huang,
Wenhui Lei,
Guotai Wang,
Shaoting Zhang
Abstract:
Domain-generalizable models are attracting increasing attention in medical image analysis, since data are commonly acquired from different institutes with various imaging protocols and scanners. To tackle this challenging domain generalization problem, we propose a Domain Composition and Attention-based network (DCA-Net) to improve the ability of domain representation and generalization. First, we present a domain composition method that represents a given domain by a linear combination of a set of basis representations (i.e., a representation bank). Second, a novel plug-and-play parallel domain preceptor is proposed to learn these basis representations, and we introduce a divergence constraint function to encourage the basis representations to be as divergent as possible. Then, a domain attention module is proposed to learn the linear combination coefficients of the basis representations. The result of the linear combination is used to calibrate the feature maps of an input image, which enables the model to generalize to different and even unseen domains. We validate our method on a public prostate MRI dataset acquired from six different institutions with apparent domain shift. Experimental results show that our proposed model generalizes well to different and even unseen domains and outperforms state-of-the-art methods on the multi-domain prostate segmentation task.
Submitted 18 September, 2021;
originally announced September 2021.
-
Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge Industrial IoT
Authors:
Wanlu Lei,
Yu Ye,
Ming Xiao,
Mikael Skoglund,
Zhu Han
Abstract:
Edge computing provides a promising paradigm to support the implementation of the Industrial Internet of Things (IIoT) by offloading tasks to nearby edge nodes. Meanwhile, increasing network size makes centralized data processing impractical due to limited bandwidth, and consequently a decentralized learning scheme is preferable. Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes. For RL in a decentralized setup, edge nodes (agents) connected through a communication network aim to work collaboratively to find a policy that optimizes the global reward, defined as the sum of local rewards. However, communication costs, scalability, and adaptation in complex environments with heterogeneous agents may significantly limit the performance of decentralized RL. The alternating direction method of multipliers (ADMM) has a structure that allows for decentralized implementation and has shown faster convergence than gradient-descent-based methods. Therefore, we propose an adaptive stochastic incremental ADMM (asI-ADMM) algorithm and apply it to decentralized RL in edge-computing-empowered IIoT networks. We provide convergence properties for the proposed algorithms by designing a Lyapunov function and prove that asI-ADMM has an $O(\frac{1}{k}) + O(\frac{1}{M})$ convergence rate, where $k$ and $M$ are the number of iterations and the number of batch samples, respectively. Then, we test our algorithm on two supervised learning problems. For performance evaluation, we simulate two applications in decentralized RL settings with homogeneous and heterogeneous agents. The experimental results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and adapt well to complex IoT environments.
Submitted 30 June, 2021;
originally announced July 2021.
-
Automatic Segmentation of Organs-at-Risk from Head-and-Neck CT using Separable Convolutional Neural Network with Hard-Region-Weighted Loss
Authors:
Wenhui Lei,
Haochen Mei,
Zhengwentai Sun,
Shan Ye,
Ran Gu,
Huan Wang,
Rui Huang,
Shichuan Zhang,
Shaoting Zhang,
Guotai Wang
Abstract:
Nasopharyngeal Carcinoma (NPC) is a leading form of Head-and-Neck (HAN) cancer in the Arctic, China, Southeast Asia, and the Middle East/North Africa. Accurate segmentation of Organs-at-Risk (OAR) from Computed Tomography (CT) images with uncertainty information is critical for effective planning of radiation therapy for NPC treatment. Despite the state-of-the-art performance achieved by Convolutional Neural Networks (CNNs) for automatic segmentation of OARs, existing methods do not provide uncertainty estimation of the segmentation results for treatment planning, and their accuracy is still limited by several factors, including the low contrast of soft tissues in CT, highly imbalanced sizes of OARs, and large inter-slice spacing. To address these problems, we propose a novel framework for accurate OAR segmentation with reliable uncertainty estimation. First, we propose a Segmental Linear Function (SLF) to transform the intensity of CT images so that multiple organs become more distinguishable than with existing methods based on a simple window width/level, which often gives better visibility of one organ while hiding the others. Second, to deal with the large inter-slice spacing, we introduce a novel 2.5D network (named 3D-SepNet) specially designed for clinical HAN CT scans with anisotropic spacing. Third, existing hardness-aware loss functions often deal with class-level hardness, whereas our proposed attention to hard voxels (ATH) uses a voxel-level hardness strategy, which is better suited to hard regions even when the corresponding class is easy. Our code is now available at https://github.com/HiLab-git/SepNet.
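To make the Segmental Linear Function idea concrete, here is a minimal sketch of a piecewise-linear intensity mapping. The knot positions below are hypothetical values chosen only for illustration; the abstract does not give the breakpoints the authors use, and `segmental_linear` is not a function from their repository.

```python
def segmental_linear(value, knots):
    """Map an intensity through a piecewise-linear curve.

    `knots` is a list of (input_intensity, output_intensity) pairs sorted
    by input intensity. Values outside the knot range are clamped.
    """
    if value <= knots[0][0]:
        return knots[0][1]
    if value >= knots[-1][0]:
        return knots[-1][1]
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= value <= x1:
            # Linear interpolation inside the segment that contains `value`.
            return y0 + (y1 - y0) * (value - x0) / (x1 - x0)
    raise ValueError("knots must be sorted by input intensity")


# Hypothetical knots: the soft-tissue range (-200 to 200) gets a steep slope,
# while air and bone ranges are compressed instead of being clipped.
EXAMPLE_KNOTS = [(-500.0, 0.0), (-200.0, 0.2), (200.0, 0.8), (1500.0, 1.0)]
```

Unlike a single window width/level, which clips everything outside one range, a mapping of this shape keeps several intensity ranges visible at once, each with its own contrast.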
Submitted 3 February, 2021;
originally announced February 2021.
-
Automatic Segmentation of Gross Target Volume of Nasopharynx Cancer using Ensemble of Multiscale Deep Neural Networks with Spatial Attention
Authors:
Haochen Mei,
Wenhui Lei,
Ran Gu,
Shan Ye,
Zhengwentai Sun,
Shichuan Zhang,
Guotai Wang
Abstract:
Radiotherapy is the main treatment modality for nasopharynx cancer. Delineation of the Gross Target Volume (GTV) from medical images such as CT and MRI is a prerequisite for radiotherapy. As manual delineation is time-consuming and laborious, automatic segmentation of the GTV has the potential to improve this process. Currently, most deep learning-based automatic GTV delineation methods are performed on medical images such as CT. However, automatic delineation is challenged by the low contrast between the pathology regions and surrounding soft tissues, the small target region, and the anisotropic resolution of clinical CT images. To deal with these problems, we propose a 2.5D Convolutional Neural Network (CNN) to handle the difference between in-plane and through-plane resolution. Furthermore, we propose a spatial attention module to enable the network to focus on the small target, and use channel attention to further improve the segmentation performance. Moreover, we use a multi-scale sampling method for training so that the networks can learn features at different scales, which are combined with a multi-model ensemble method to improve the robustness of segmentation results. We also estimate the uncertainty of segmentation results based on our model ensemble, which is of great importance for indicating the reliability of automatic segmentation results for radiotherapy planning.
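One common way to obtain an uncertainty estimate from a model ensemble is to look at how much the members disagree at each voxel. The sketch below averages per-voxel foreground probabilities across models and reports their variance; this choice of measure is an assumption for illustration, since the abstract does not say which uncertainty measure the authors compute.

```python
def ensemble_mean_and_variance(prob_maps):
    """Combine per-voxel foreground probabilities from several models.

    `prob_maps` holds one flat list of probabilities per ensemble member,
    all of the same length. Returns (mean, variance) lists; a higher
    variance marks voxels where the members disagree more.
    """
    n_models = len(prob_maps)
    n_voxels = len(prob_maps[0])
    means = [sum(m[i] for m in prob_maps) / n_models for i in range(n_voxels)]
    variances = [
        sum((m[i] - means[i]) ** 2 for m in prob_maps) / n_models
        for i in range(n_voxels)
    ]
    return means, variances
```

The mean map can be thresholded to give the ensemble segmentation, while the variance map flags regions that may deserve manual review during planning.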
Submitted 27 January, 2021;
originally announced January 2021.
-
PiRhDy: Learning Pitch-, Rhythm-, and Dynamics-aware Embeddings for Symbolic Music
Authors:
Hongru Liang,
Wenqiang Lei,
Paul Yaozhu Chan,
Zhenglu Yang,
Maosong Sun,
Tat-Seng Chua
Abstract:
Definitive embeddings remain a fundamental challenge of computational musicology for symbolic music in deep learning today. Analogous to natural language, music can be modeled as a sequence of tokens. This motivates the majority of existing solutions to explore the use of word embedding models to build music embeddings. However, music differs from natural languages in two key aspects: (1) a musical token is multi-faceted -- it comprises pitch, rhythm, and dynamics information; and (2) musical context is two-dimensional -- each musical token is dependent on both melodic and harmonic contexts. In this work, we provide a comprehensive solution by proposing a novel framework named PiRhDy that integrates pitch, rhythm, and dynamics information seamlessly. PiRhDy adopts a hierarchical strategy which can be decomposed into two steps: (1) token (i.e., note event) modeling, which separately represents pitch, rhythm, and dynamics and integrates them into a single token embedding; and (2) context modeling, which utilizes melodic and harmonic knowledge to train the token embedding. A thorough study was made of each component and sub-strategy of PiRhDy. We further validate our embeddings in three downstream tasks -- melody completion, accompaniment suggestion, and genre classification. Results indicate a significant advancement of the neural approach towards symbolic music as well as PiRhDy's potential as a pretrained tool for a broad range of symbolic music applications.
Submitted 15 October, 2020;
originally announced October 2020.
-
Semi-Supervised Active Learning for COVID-19 Lung Ultrasound Multi-symptom Classification
Authors:
Lei Liu,
Wentao Lei,
Yongfang Luo,
Cheng Feng,
Xiang Wan,
Li Liu
Abstract:
Ultrasound (US) is a non-invasive yet effective medical diagnostic imaging technique for the COVID-19 global pandemic. However, due to complex feature behaviors and expensive annotations of US images, it is difficult to apply Artificial Intelligence (AI)-assisted approaches to multi-symptom (multi-label) classification of the lung. To overcome these difficulties, we propose a novel semi-supervised Two-Stream Active Learning (TSAL) method to model complicated features and reduce labeling costs in an iterative procedure. The core component of TSAL is the multi-label learning mechanism, in which label correlation information is used to design a multi-label margin (MLM) strategy and confidence validation for automatically selecting informative samples and confident labels. On this basis, a multi-symptom multi-label (MSML) classification network is proposed to learn discriminative features of lung symptoms, and human-machine interaction is exploited to confirm the final annotations that are used to fine-tune MSML with progressively labeled data. Moreover, a novel lung US dataset named COVID19-LUSMS is built, currently containing 71 clinical patients with 6,836 images sampled from 678 videos. Experimental evaluations show that TSAL using only 20% of the data can achieve superior performance to the baseline and the state of the art. Qualitatively, visualization of both the attention map and the sample distribution confirms good consistency with clinical knowledge.
Submitted 28 February, 2021; v1 submitted 9 September, 2020;
originally announced September 2020.
-
Deep Reinforcement Learning Based Spectrum Allocation in Integrated Access and Backhaul Networks
Authors:
Wanlu Lei,
Yu Ye,
Ming Xiao
Abstract:
We develop a framework based on deep reinforcement learning (DRL) to solve the spectrum allocation problem in the emerging integrated access and backhaul (IAB) architecture with large-scale deployment and a dynamic environment. The available spectrum is divided into several orthogonal sub-channels, and the donor base station (DBS) and all IAB nodes have the same spectrum resource for allocation, where a DBS utilizes those sub-channels for access links of associated user equipment (UE) as well as for backhaul links of associated IAB nodes, and an IAB node can utilize all for its associated UEs. This is one of the key features in which 5G differs from traditional settings where the backhaul networks were designed independently from the access networks. With the goal of maximizing the sum log-rate of all UE groups, we formulate the spectrum allocation problem as a mixed-integer non-linear programming problem. However, it is intractable to find an optimal solution, especially when the IAB network is large and time-varying. To tackle this problem, we propose to use the latest DRL method by integrating an actor-critic spectrum allocation (ACSA) scheme and a deep neural network (DNN) to achieve real-time spectrum allocation in different scenarios. The proposed methods are evaluated through numerical simulations and show promising results compared with some baseline allocation policies.
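The sum log-rate objective mentioned in this abstract can be illustrated with a few lines of code. The rates below are made-up numbers in arbitrary units, and `sum_log_rate` is a name introduced here for illustration; the sketch only shows the shape of the objective, not the paper's system model or its DRL solver.

```python
import math


def sum_log_rate(group_rates):
    """Sum of the natural logarithms of the per-group rates.

    Because the logarithm is concave, this objective rewards allocations
    that keep every group's rate reasonably high rather than maximizing
    the total rate alone.
    """
    return sum(math.log(rate) for rate in group_rates)


# Two hypothetical allocations with the same total rate (10 units).
balanced = [5.0, 5.0]
skewed = [9.0, 1.0]
```

With these values, `sum_log_rate(balanced)` is larger than `sum_log_rate(skewed)`, which is why a log-rate objective is often read as a fairness-aware alternative to maximizing the plain sum rate.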
Submitted 27 April, 2020;
originally announced April 2020.
-
Analog Weights in ReRAM DNN Accelerators
Authors:
Jason K. Eshraghian,
Sung-Mo Kang,
Seungbum Baek,
Garrick Orchard,
Herbert Ho-Ching Iu,
Wen Lei
Abstract:
Artificial neural networks have become ubiquitous in modern life, which has triggered the emergence of a new class of application-specific integrated circuits for their acceleration. ReRAM-based accelerators have gained significant traction due to their ability to leverage in-memory computations. In a crossbar structure, they can perform multiply-and-accumulate operations more efficiently than standard CMOS logic. By virtue of being resistive switches, ReRAM switches can only reliably store one of two states. This is a severe limitation on the range of values in a computational kernel. This paper presents a novel scheme for alleviating the single-bit-per-device restriction by exploiting the frequency dependence of v-i plane hysteresis, assigning kernel information not only to the device conductance but also partially distributing it to the frequency of a time-varying input. We show that this approach reduces average power consumption for a single crossbar convolution by up to a factor of 16 for an unsigned 8-bit input image, where each convolutional process consumes a worst case of 1.1 mW, and reduces area by a factor of 8, without reducing accuracy to the level of binarized neural networks. This presents a massive saving in computing cost when there are many simultaneous in-situ multiply-and-accumulate processes occurring across different crossbars.
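For readers unfamiliar with crossbar arithmetic, the sketch below shows the idealized multiply-and-accumulate that a resistive crossbar performs: each column current is the sum of row voltages weighted by device conductances (Ohm's law per device, Kirchhoff's current law per column). It ignores wire resistance and device non-idealities, uses illustrative values, and does not model the frequency-dependent scheme proposed in the paper.

```python
def crossbar_mac(row_voltages, conductances):
    """Idealized column currents of a resistive crossbar.

    `row_voltages[i]` is the voltage applied to row i, and
    `conductances[i][j]` is the conductance of the device joining row i
    to column j. Column j collects I_j = sum_i V_i * G_ij.
    """
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(row_voltages, conductances))
        for j in range(n_cols)
    ]
```

For example, `crossbar_mac([1.0, 2.0], [[1.0, 0.0], [0.5, 1.0]])` gives `[2.0, 2.0]`: the whole vector-matrix product is produced in a single analog read, which is the source of the efficiency the abstract refers to.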
Submitted 26 April, 2019;
originally announced April 2019.