DeepSense-V2V: A Vehicle-to-Vehicle Multi-Modal Sensing, Localization, and Communications Dataset

Authors: Joao Morais, Gouranga Charan, Nikhil Srinivas, Ahmed Alkhateeb

Abstract: High data rate and low-latency vehicle-to-vehicle (V2V) communication are essential for future intelligent transport systems to enable coordination, enhance safety, and support distributed computing and intelligence requirements. Developing effective communication strategies, however, demands realistic test scenarios and datasets. This is important at the high-frequency bands where more spectrum is available, yet harvesting this bandwidth is challenged by the need for directional transmission and the sensitivity of signal propagation to blockages. This work presents the first large-scale multi-modal dataset for studying mmWave vehicle-to-vehicle communications. It presents a two-vehicle testbed that comprises data from a 360-degree camera, four radars, four 60 GHz phased arrays, a 3D lidar, and two precise GPSs. The dataset contains vehicles driving during the day and night for 120 km in intercity and rural settings, with speeds up to 100 km per hour. More than one million objects were detected across all images, from trucks to bicycles. This work further includes detailed dataset statistics that prove the coverage of various situations and highlights how this dataset can enable novel machine-learning applications.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 14 pages, 15 figures, 2 tables. The dataset is available on the DeepSense6G website: https://deepsense6g.net/

arXiv:2402.14766

Environment Semantic Communication: Enabling Distributed Sensing Aided Networks

Authors: Shoaib Imran, Gouranga Charan, Ahmed Alkhateeb

Abstract: Millimeter-wave (mmWave) and terahertz (THz) communication systems require large antenna arrays and use narrow directive beams to ensure sufficient receive signal power. However, selecting the optimal beams for these large antenna arrays incurs a significant beam training overhead, making it challenging to support applications involving high mobility. In recent years, machine learning (ML) solutions have shown promising results in reducing the beam training overhead by utilizing various sensing modalities such as GPS position and RGB images. However, the existing approaches are mainly limited to scenarios with only a single object of interest present in the wireless environment and focus only on co-located sensing, where all the sensors are installed at the communication terminal. This brings key challenges such as the limited sensing coverage compared to the coverage of the communication system and the difficulty in handling non-line-of-sight scenarios. To overcome these limitations, our paper proposes the deployment of multiple distributed sensing nodes, each equipped with an RGB camera. These nodes focus on extracting environmental semantics from the captured RGB images. The semantic data, rather than the raw images, are then transmitted to the basestation. This strategy significantly alleviates the overhead associated with the data storage and transmission of the raw images. Furthermore, semantic communication enhances the system's adaptability and responsiveness to dynamic environments, allowing for prioritization and transmission of contextually relevant information. Experimental results on the DeepSense 6G dataset demonstrate the effectiveness of the proposed solution in reducing the sensing data transmission overhead while accurately predicting the optimal beams in realistic communication environments.

Submitted 22 February, 2024; originally announced February 2024.

Comments: The code and dataset are available on the DeepSense website https://www.deepsense6g.net/

arXiv:2308.10362

Vehicle Cameras Guide mmWave Beams: Approach and Real-World V2V Demonstration

Authors: Tawfik Osman, Gouranga Charan, Ahmed Alkhateeb

Abstract: Accurately aligning millimeter-wave (mmWave) and terahertz (THz) narrow beams is essential to satisfy reliability and high data rates of 5G and beyond wireless communication systems. However, achieving this objective is difficult, especially in vehicle-to-vehicle (V2V) communication scenarios, where both transmitter and receiver are constantly mobile. Recently, additional sensing modalities, such as visual sensors, have attracted significant interest due to their capability to provide accurate information about the wireless environment. To that end, in this paper, we develop a deep learning solution for V2V scenarios to predict future beams using images from a 360 camera attached to the vehicle. The developed solution is evaluated on a real-world multi-modal mmWave V2V communication dataset comprising co-existing 360 camera and mmWave beam training data. The proposed vision-aided solution achieves $\approx 85\%$ top-5 beam prediction accuracy while significantly reducing the beam training overhead. This highlights the potential of utilizing vision for enabling highly-mobile V2V communications.

Submitted 20 August, 2023; originally announced August 2023.

Comments: Dataset and code files are available on the DeepSense 6G website https://deepsense6g.net/

arXiv:2308.06868

Camera Based mmWave Beam Prediction: Towards Multi-Candidate Real-World Scenarios

Authors: Gouranga Charan, Muhammad Alrabeiah, Tawfik Osman, Ahmed Alkhateeb

Abstract: Leveraging sensory information to aid the millimeter-wave (mmWave) and sub-terahertz (sub-THz) beam selection process is attracting increasing interest. This sensory data, captured for example by cameras at the basestations, has the potential of significantly reducing the beam sweeping overhead and enabling highly-mobile applications. The solutions developed so far, however, have mainly considered single-candidate scenarios, i.e., scenarios with a single candidate user in the visual scene, and were evaluated using synthetic datasets. To address these limitations, this paper extensively investigates the sensing-aided beam prediction problem in a real-world multi-object vehicle-to-infrastructure (V2I) scenario and presents a comprehensive machine learning-based framework. In particular, this paper proposes to utilize visual and positional data to predict the optimal beam indices as an alternative to the conventional beam sweeping approaches. For this, a novel user (transmitter) identification solution has been developed, a key step in realizing sensing-aided multi-candidate and multi-user beam prediction solutions. The proposed solutions are evaluated on the large-scale real-world DeepSense $6$G dataset. Experimental results in realistic V2I communication scenarios indicate that the proposed solutions achieve close to $100\%$ top-5 beam prediction accuracy for the scenarios with single-user and close to $95\%$ top-5 beam prediction accuracy for multi-candidate scenarios. Furthermore, the proposed approach can identify the probable transmitting candidate with more than $93\%$ accuracy across the different scenarios. This highlights a promising approach for nearly eliminating the beam training overhead in mmWave/THz communication systems.

Submitted 13 August, 2023; originally announced August 2023.

Comments: Dataset and code files are available on the DeepSense 6G website https://deepsense6g.net/

arXiv:2302.06736

Environment Semantic Aided Communication: A Real World Demonstration for Beam Prediction

Authors: Shoaib Imran, Gouranga Charan, Ahmed Alkhateeb

Abstract: Millimeter-wave (mmWave) and terahertz (THz) communication systems adopt large antenna arrays to ensure adequate receive signal power. However, adjusting the narrow beams of these antenna arrays typically incurs high beam training overhead that scales with the number of antennas. Recently proposed vision-aided beam prediction solutions, which utilize \textit{raw RGB images} captured at the basestation to predict the optimal beams, have shown initial promising results. However, they still have a considerable computational complexity, limiting their adoption in the real world. To address these challenges, this paper focuses on developing and comparing various approaches that extract lightweight semantic information from the visual data. The results show that the proposed solutions can significantly decrease the computational requirements while achieving similar beam prediction accuracy compared to the previously proposed vision-aided solutions.

Submitted 13 February, 2023; originally announced February 2023.

Comments: Based on the DeepSense dataset https://deepsense6g.net/. arXiv admin note: text overlap with arXiv:2205.12187

arXiv:2301.11283

Real-Time Digital Twins: Vision and Research Directions for 6G and Beyond

Authors: Ahmed Alkhateeb, Shuaifeng Jiang, Gouranga Charan

Abstract: This article presents a vision where \textit{real-time} digital twins of the physical wireless environments are continuously updated using multi-modal sensing data from the distributed infrastructure and user devices, and are used to make communication and sensing decisions. This vision is mainly enabled by the advances in precise 3D maps, multi-modal sensing, ray-tracing computations, and machine/deep learning. This article details this vision, explains the different approaches for constructing and utilizing these real-time digital twins, discusses the applications and open problems, and presents a research platform that can be used to investigate various digital twin research directions.

Submitted 26 January, 2023; originally announced January 2023.

Comments: The 6G digital twin research platform will be available soon on https://deepverse6g.net/

arXiv:2211.09769

DeepSense 6G: A Large-Scale Real-World Multi-Modal Sensing and Communication Dataset

Authors: Ahmed Alkhateeb, Gouranga Charan, Tawfik Osman, Andrew Hredzak, João Morais, Umut Demirhan, Nikhil Srinivas

Abstract: This article presents the DeepSense 6G dataset, which is a large-scale dataset based on real-world measurements of co-existing multi-modal sensing and communication data. The DeepSense 6G dataset is built to advance deep learning research in a wide range of applications in the intersection of multi-modal sensing, communication, and positioning. This article provides a detailed overview of the DeepSense dataset structure, adopted testbeds, data collection and processing methodology, deployment scenarios, and example applications, with the objective of facilitating the adoption and reproducibility of multi-modal sensing and communication datasets.

Submitted 20 March, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

Comments: The dataset is available on the DeepSense 6G website http://deepsense6g.net/

arXiv:2211.07569

Millimeter Wave Drones with Cameras: Computer Vision Aided Wireless Beam Prediction

Authors: Gouranga Charan, Andrew Hredzak, Ahmed Alkhateeb

Abstract: Millimeter wave (mmWave) and terahertz (THz) drones have the potential to enable several futuristic applications such as coverage extension, enhanced security monitoring, and disaster management. However, these drones need to deploy large antenna arrays and use narrow directive beams to maintain a sufficient link budget. The large beam training overhead associated with these arrays makes adjusting these narrow beams challenging for highly-mobile drones. To address these challenges, this paper proposes a vision-aided machine learning-based approach that leverages visual data collected from cameras installed on the drones to enable fast and accurate beam prediction. Further, to facilitate the evaluation of the proposed solution, we build a synthetic drone communication dataset consisting of co-existing wireless and visual data. The proposed vision-aided solution achieves a top-$1$ beam prediction accuracy of $\approx 91\%$ and close to $100\%$ top-$3$ accuracy. These results highlight the efficacy of the proposed solution towards enabling highly mobile mmWave/THz drone communication.

Submitted 14 November, 2022; originally announced November 2022.

Comments: The mmWave drone dataset and code files will be available soon! arXiv admin note: text overlap with arXiv:2205.12187

arXiv:2209.07519

Multi-Modal Beam Prediction Challenge 2022: Towards Generalization

Authors: Gouranga Charan, Umut Demirhan, João Morais, Arash Behboodi, Hamed Pezeshki, Ahmed Alkhateeb

Abstract: Beam management is a challenging task for millimeter wave (mmWave) and sub-terahertz communication systems, especially in scenarios with highly-mobile users. Leveraging external sensing modalities such as vision, LiDAR, radar, position, or a combination of them, to address this beam management challenge has recently attracted increasing interest from both academia and industry. This is mainly motivated by the dependency of the beam direction decision on the user location and the geometry of the surrounding environment -- information that can be acquired from the sensory data. To realize the promised beam management gains, such as the significant reduction in beam alignment overhead, in practice, however, these solutions need to account for important aspects. For example, these multi-modal sensing aided beam selection approaches should be able to generalize their learning to unseen scenarios and should be able to operate in realistic dense deployments. The "Multi-Modal Beam Prediction Challenge 2022: Towards Generalization" competition is offered to provide a platform for investigating these critical questions. In order to facilitate the generalizability study, the competition offers a large-scale multi-modal dataset with co-existing communication and sensing data collected across multiple real-world locations and different times of the day. In this paper, along with the detailed descriptions of the problem statement and the development dataset, we provide a baseline solution that utilizes the user position data to predict the optimal beam indices. The objective of this challenge is to go beyond a simple feasibility study and enable necessary research in this direction, paving the way towards generalizable multi-modal sensing-aided beam management for real-world future communication systems.

Submitted 13 October, 2022; v1 submitted 15 September, 2022; originally announced September 2022.

Comments: The dataset is available on the ML competition page: https://deepsense6g.net/multi-modal-beam-prediction-challenge/

arXiv:2203.05548

LiDAR Aided Future Beam Prediction in Real-World Millimeter Wave V2I Communications

Authors: Shuaifeng Jiang, Gouranga Charan, Ahmed Alkhateeb

Abstract: This paper presents the first large-scale real-world evaluation for using LiDAR data to guide the mmWave beam prediction task. A machine learning (ML) model that leverages the LiDAR sensory data to predict the current and future beams was developed. Based on the large-scale real-world dataset, DeepSense 6G, this model was evaluated in a vehicle-to-infrastructure communication scenario with highly-mobile vehicles. The experimental results show that the developed LiDAR-aided beam prediction and tracking model can predict the optimal beam in $95\%$ of the cases and with more than $90\%$ reduction in the beam training overhead. The LiDAR-aided beam tracking achieves comparable accuracy performance to a baseline solution that has perfect knowledge of the previous optimal beams, without requiring any knowledge about the previous optimal beam information and without any need for beam calibration. This highlights a promising solution for the critical beam alignment challenges in mmWave and terahertz communication systems.

Submitted 10 March, 2022; originally announced March 2022.

Comments: The dataset and code files will be available on the DeepSense 6G website https://deepsense6g.net/

arXiv:2203.01907

Computer Vision Aided Blockage Prediction in Real-World Millimeter Wave Deployments

Authors: Gouranga Charan, Ahmed Alkhateeb

Abstract: This paper provides the first real-world evaluation of using visual (RGB camera) data and machine learning for proactively predicting millimeter wave (mmWave) dynamic link blockages before they happen. Proactively predicting line-of-sight (LOS) link blockages enables mmWave/sub-THz networks to make proactive network management decisions (such as proactive beam switching and hand-off) before a link failure happens. This can significantly enhance the network reliability and latency while efficiently utilizing the wireless resources. To evaluate this gain in reality, this paper (i) develops a computer vision based solution that processes the visual data captured by a camera installed at the infrastructure node and (ii) studies the feasibility of the proposed solution based on the large-scale real-world dataset, DeepSense 6G, that comprises multi-modal sensing and communication data. Based on the adopted real-world dataset, the developed solution achieves $\approx 90\%$ accuracy in predicting blockages happening within the future $0.1$s and $\approx 80\%$ for blockages happening within $1$s, which highlights a promising solution for mmWave/sub-THz communication networks.

Submitted 3 March, 2022; originally announced March 2022.

Comments: The dataset and code files will be available on the DeepSense 6G dataset website https://deepsense6g.net/

arXiv:2111.07574

Vision-Position Multi-Modal Beam Prediction Using Real Millimeter Wave Datasets

Authors: Gouranga Charan, Tawfik Osman, Andrew Hredzak, Ngwe Thawdar, Ahmed Alkhateeb

Abstract: Enabling highly-mobile millimeter wave (mmWave) and terahertz (THz) wireless communication applications requires overcoming the critical challenges associated with the large antenna arrays deployed at these systems. In particular, adjusting the narrow beams of these antenna arrays typically incurs high beam training overhead that scales with the number of antennas. To address these challenges, this paper proposes a multi-modal machine learning based approach that leverages positional and visual (camera) data collected from the wireless communication environment for fast beam prediction. The developed framework has been tested on a real-world vehicular dataset comprising practical GPS, camera, and mmWave beam training data. The results show the proposed approach achieves more than $\approx$ 75\% top-1 beam prediction accuracy and close to 100\% top-3 beam prediction accuracy in realistic communication scenarios.

Submitted 15 November, 2021; originally announced November 2021.

Comments: Dataset and code files will be available on the DeepSense 6G website http://deepsense6g.net/

arXiv:2102.09527

Vision-Aided 6G Wireless Communications: Blockage Prediction and Proactive Handoff

Authors: Gouranga Charan, Muhammad Alrabeiah, Ahmed Alkhateeb

Abstract: The sensitivity to blockages is a key challenge for the high-frequency (5G millimeter wave and 6G sub-terahertz) wireless networks. Since these networks mainly rely on line-of-sight (LOS) links, sudden link blockages highly threaten the reliability of the networks. Further, when the LOS link is blocked, the network typically needs to hand off the user to another LOS basestation, which may incur critical time latency, especially if a search over a large codebook of narrow beams is needed. A promising way to tackle the reliability and latency challenges lies in enabling proaction in wireless networks. Proaction basically allows the network to anticipate blockages, especially dynamic blockages, and initiate user hand-off beforehand. This paper presents a complete machine learning framework for enabling proaction in wireless networks relying on visual data captured, for example, by RGB cameras deployed at the base stations. In particular, the paper proposes a vision-aided wireless communication solution that utilizes bimodal machine learning to perform proactive blockage prediction and user hand-off. The bedrock of this solution is a deep learning algorithm that learns from visual and wireless data how to predict incoming blockages. The predictions of this algorithm are used by the wireless network to proactively initiate hand-off decisions and avoid any unnecessary latency. The algorithm is developed on a vision-wireless dataset generated using the ViWi data-generation framework. Experimental results on two basestations with different cameras indicate that the algorithm is capable of accurately detecting incoming blockages more than $\sim 90\%$ of the time. Such blockage prediction ability is directly reflected in the accuracy of proactive hand-off, which also approaches $87\%$. This highlights a promising direction for enabling high reliability and low latency in future wireless networks.

Submitted 19 February, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

Comments: Submitted to IEEE, 30 pages, 11 figures. The dataset will be available soon on the ViWi website https://viwi-dataset.net/

arXiv:2006.09902

Vision-Aided Dynamic Blockage Prediction for 6G Wireless Communication Networks

Authors: Gouranga Charan, Muhammad Alrabeiah, Ahmed Alkhateeb

Abstract: Unlocking the full potential of millimeter-wave and sub-terahertz wireless communication networks hinges on realizing unprecedented low-latency and high-reliability requirements. The challenge in meeting those requirements lies partly in the sensitivity of signals in the millimeter-wave and sub-terahertz frequency ranges to blockages. One promising way to tackle that challenge is to help a wireless network develop a sense of its surrounding using machine learning. This paper attempts to do that by utilizing deep learning and computer vision. It proposes a novel solution that proactively predicts \textit{dynamic} link blockages. More specifically, it develops a deep neural network architecture that learns from observed sequences of RGB images and beamforming vectors how to predict possible future link blockages. The proposed architecture is evaluated on a publicly available dataset that represents a synthetic dynamic communication scenario with multiple moving users and blockages. It scores a link-blockage prediction accuracy in the neighborhood of 86\%, a performance that is unlikely to be matched without utilizing visual data.

Submitted 17 June, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

Comments: The dataset and code files will be available soon on the ViWi website: https://www.viwi-dataset.net/

arXiv:1906.08866

Towards Efficient Neural Networks On-a-chip: Joint Hardware-Algorithm Approaches

Authors: Xiaocong Du, Gokul Krishnan, Abinash Mohanty, Zheng Li, Gouranga Charan, Yu Cao

Abstract: Machine learning algorithms have made significant advances in many applications. However, their hardware implementation on state-of-the-art platforms still faces several challenges and is limited by various factors, such as memory volume, memory bandwidth and interconnection overhead. The adoption of the crossbar architecture with emerging memory technology partially solves the problem but induces process variation and other concerns. In this paper, we will present novel solutions to two fundamental issues in crossbar implementation of Artificial Intelligence (AI) algorithms: device variation and insufficient interconnections. These solutions are inspired by the statistical properties of algorithms themselves, especially the redundancy in neural network nodes and connections. By Random Sparse Adaptation and pruning the connections following the Small-World model, we demonstrate robust and efficient performance on representative datasets such as MNIST and CIFAR-10. Moreover, we present the Continuous Growth and Pruning algorithm for future learning and adaptation on hardware.

Submitted 27 May, 2019; originally announced June 2019.

arXiv:1905.11550

Single-Net Continual Learning with Progressive Segmented Training (PST)

Authors: Xiaocong Du, Gouranga Charan, Frank Liu, Yu Cao

Abstract: There is an increasing need for continual learning in dynamic systems, such as the self-driving vehicle, the surveillance drone, and the robotic system. Such a system requires learning from the data stream, training the model to preserve previous information and adapt to a new task, and generating a single-headed vector for future inference. Different from previous approaches with dynamic structures, this work focuses on a single network and model segmentation to prevent catastrophic forgetting. Leveraging the redundant capacity of a single network, model parameters for each task are separated into two groups: one important group which is frozen to preserve current knowledge, and a secondary group to be saved (not pruned) for future learning. A fixed-size memory containing a small amount of previously seen data is further adopted to assist the training. Without additional regularization, the simple yet effective approach of PST successfully incorporates multiple tasks and achieves state-of-the-art accuracy in the single-head evaluation on the CIFAR-10 and CIFAR-100 datasets. Moreover, the segmented training significantly improves computation efficiency in continual learning.

Submitted 19 December, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

Showing 1–16 of 16 results for author: Charan, G