-
CEST-KAN: Kolmogorov-Arnold Networks for CEST MRI Data Analysis
Authors:
Jiawen Wang,
Pei Cai,
Ziyan Wang,
Huabin Zhang,
Jianpan Huang
Abstract:
Purpose: This study aims to propose and investigate the feasibility of using Kolmogorov-Arnold Network (KAN) for CEST MRI data analysis (CEST-KAN). Methods: CEST MRI data were acquired from twelve healthy volunteers at 3T. Data from ten subjects were used for training, while the remaining two were reserved for testing. The performance of multi-layer perceptron (MLP) and KAN models with the same ne…
▽ More
Purpose: This study aims to propose and investigate the feasibility of using Kolmogorov-Arnold Network (KAN) for CEST MRI data analysis (CEST-KAN). Methods: CEST MRI data were acquired from twelve healthy volunteers at 3T. Data from ten subjects were used for training, while the remaining two were reserved for testing. The performance of multi-layer perceptron (MLP) and KAN models with the same network settings were evaluated and compared to the conventional multi-pool Lorentzian fitting (MPLF) method in generating water and multiple CEST contrasts, including amide, relayed nuclear Overhauser effect (rNOE), and magnetization transfer (MT). Results: The water and CEST maps generated by both MLP and KAN were visually comparable to the MPLF results. However, the KAN model demonstrated higher accuracy in extrapolating the CEST fitting metrics, as evidenced by the smaller validation loss during training and smaller absolute error during testing. Voxel-wise correlation analysis showed that all four CEST fitting metrics generated by KAN consistently exhibited higher Pearson coefficients than the MLP results, indicating superior performance. Moreover, the KAN models consistently outperformed the MLP models in varying hidden layer numbers despite longer training time. Conclusion: In this study, we demonstrated for the first time the feasibility of utilizing KAN for CEST MRI data analysis, highlighting its superiority over MLP in this task. The findings suggest that CEST-KAN has the potential to be a robust and reliable post-analysis tool for CEST MRI in clinical settings.
△ Less
Submitted 25 June, 2024; v1 submitted 23 June, 2024;
originally announced June 2024.
-
LimSim++: A Closed-Loop Platform for Deploying Multimodal LLMs in Autonomous Driving
Authors:
Daocheng Fu,
Wenjie Lei,
Licheng Wen,
Pinlong Cai,
Song Mao,
Min Dou,
Botian Shi,
Yu Qiao
Abstract:
The emergence of Multimodal Large Language Models ((M)LLMs) has ushered in new avenues in artificial intelligence, particularly for autonomous driving by offering enhanced understanding and reasoning capabilities. This paper introduces LimSim++, an extended version of LimSim designed for the application of (M)LLMs in autonomous driving. Acknowledging the limitations of existing simulation platform…
▽ More
The emergence of Multimodal Large Language Models ((M)LLMs) has ushered in new avenues in artificial intelligence, particularly for autonomous driving by offering enhanced understanding and reasoning capabilities. This paper introduces LimSim++, an extended version of LimSim designed for the application of (M)LLMs in autonomous driving. Acknowledging the limitations of existing simulation platforms, LimSim++ addresses the need for a long-term closed-loop infrastructure supporting continuous learning and improved generalization in autonomous driving. The platform offers extended-duration, multi-scenario simulations, providing crucial information for (M)LLM-driven vehicles. Users can engage in prompt engineering, model evaluation, and framework enhancement, making LimSim++ a versatile tool for research and practice. This paper additionally introduces a baseline (M)LLM-driven framework, systematically validated through quantitative experiments across diverse scenarios. The open-source resources of LimSim++ are available at: https://pjlab-adg.github.io/limsim-plus/.
△ Less
Submitted 12 April, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Pubic Symphysis-Fetal Head Segmentation Using Pure Transformer with Bi-level Routing Attention
Authors:
Pengzhou Cai,
Jiang Lu,
Yanxin Li,
Libin Lan
Abstract:
In this paper, we propose a method, named BRAU-Net, to solve the pubic symphysis-fetal head segmentation task. The method adopts a U-Net-like pure Transformer architecture with bi-level routing attention and skip connections, which effectively learns local-global semantic information. The proposed BRAU-Net was evaluated on transperineal Ultrasound images dataset from the pubic symphysis-fetal head…
▽ More
In this paper, we propose a method, named BRAU-Net, to solve the pubic symphysis-fetal head segmentation task. The method adopts a U-Net-like pure Transformer architecture with bi-level routing attention and skip connections, which effectively learns local-global semantic information. The proposed BRAU-Net was evaluated on transperineal Ultrasound images dataset from the pubic symphysis-fetal head segmentation and angle of progression (FH-PS-AOP) challenge. The results demonstrate that the proposed BRAU-Net achieves comparable a final score. The codes will be available at https://github.com/Caipengzhou/BRAU-Net.
△ Less
Submitted 7 October, 2023; v1 submitted 30 September, 2023;
originally announced October 2023.
-
TrafficMCTS: A Closed-Loop Traffic Flow Generation Framework with Group-Based Monte Carlo Tree Search
Authors:
Licheng Wen,
Ze Fu,
Pinlong Cai,
Daocheng Fu,
Song Mao,
Botian Shi
Abstract:
Digital twins for intelligent transportation systems are currently attracting great interests, in which generating realistic, diverse, and human-like traffic flow in simulations is a formidable challenge. Current approaches often hinge on predefined driver models, objective optimization, or reliance on pre-recorded driving datasets, imposing limitations on their scalability, versatility, and adapt…
▽ More
Digital twins for intelligent transportation systems are currently attracting great interests, in which generating realistic, diverse, and human-like traffic flow in simulations is a formidable challenge. Current approaches often hinge on predefined driver models, objective optimization, or reliance on pre-recorded driving datasets, imposing limitations on their scalability, versatility, and adaptability. In this paper, we introduce TrafficMCTS, an innovative framework that harnesses the synergy of groupbased Monte Carlo tree search (MCTS) and Social Value Orientation (SVO) to engender a multifaceted traffic flow replete with varying driving styles and cooperative tendencies. Anchored by a closed-loop architecture, our framework enables vehicles to dynamically adapt to their environment in real time, and ensure feasible collision-free trajectories. Through comprehensive comparisons with state-of-the-art methods, we illuminate the advantages of our approach in terms of computational efficiency, planning success rate, intent completion time, and diversity metrics. Besides, we simulate highway and roundabout scenarios to illustrate the effectiveness of the proposed framework and highlight its ability to induce diverse social behaviors within the traffic flow. Finally, we validate the scalability of TrafficMCTS by showcasing its prowess in simultaneously mass vehicles within a sprawling road network, cultivating a landscape of traffic flow that mirrors the intricacies of human behavior.
△ Less
Submitted 31 August, 2023; v1 submitted 24 August, 2023;
originally announced August 2023.
-
LimSim: A Long-term Interactive Multi-scenario Traffic Simulator
Authors:
Licheng Wen,
Daocheng Fu,
Song Mao,
Pinlong Cai,
Min Dou,
Yikang Li,
Yu Qiao
Abstract:
With the growing popularity of digital twin and autonomous driving in transportation, the demand for simulation systems capable of generating high-fidelity and reliable scenarios is increasing. Existing simulation systems suffer from a lack of support for different types of scenarios, and the vehicle models used in these systems are too simplistic. Thus, such systems fail to represent driving styl…
▽ More
With the growing popularity of digital twin and autonomous driving in transportation, the demand for simulation systems capable of generating high-fidelity and reliable scenarios is increasing. Existing simulation systems suffer from a lack of support for different types of scenarios, and the vehicle models used in these systems are too simplistic. Thus, such systems fail to represent driving styles and multi-vehicle interactions, and struggle to handle corner cases in the dataset. In this paper, we propose LimSim, the Long-term Interactive Multi-scenario traffic Simulator, which aims to provide a long-term continuous simulation capability under the urban road network. LimSim can simulate fine-grained dynamic scenarios and focus on the diverse interactions between multiple vehicles in the traffic flow. This paper provides a detailed introduction to the framework and features of the LimSim, and demonstrates its performance through case studies and experiments. LimSim is now open source on GitHub: https://www.github.com/PJLab-ADG/LimSim .
△ Less
Submitted 26 July, 2023; v1 submitted 13 July, 2023;
originally announced July 2023.
-
Tiny-PPG: A Lightweight Deep Neural Network for Real-Time Detection of Motion Artifacts in Photoplethysmogram Signals on Edge Devices
Authors:
Yali Zheng,
Chen Wu,
Peizheng Cai,
Zhiqiang Zhong,
Hongda Huang,
Yuqi Jiang
Abstract:
Photoplethysmogram (PPG) signals are easily contaminated by motion artifacts in real-world settings, despite their widespread use in Internet-of-Things (IoT) based wearable and smart health devices for cardiovascular health monitoring. This study proposed a lightweight deep neural network, called Tiny-PPG, for accurate and real-time PPG artifact segmentation on IoT edge devices. The model was trai…
▽ More
Photoplethysmogram (PPG) signals are easily contaminated by motion artifacts in real-world settings, despite their widespread use in Internet-of-Things (IoT) based wearable and smart health devices for cardiovascular health monitoring. This study proposed a lightweight deep neural network, called Tiny-PPG, for accurate and real-time PPG artifact segmentation on IoT edge devices. The model was trained and tested on a public dataset, PPG DaLiA, which featured complex artifacts with diverse lengths and morphologies during various daily activities of 15 subjects using a watch-type device (Empatica E4). The model structure, training method and loss function were specifically designed to balance detection accuracy and speed for real-time PPG artifact detection in resource-constrained embedded devices. To optimize the model size and capability in multi-scale feature representation, the model employed depth-wise separable convolution and atrous spatial pyramid pooling modules, respectively. Additionally, the contrastive loss was also utilized to further optimize the feature embeddings. With additional model pruning, Tiny-PPG achieved state-of-the-art detection accuracy of 87.4% while only having 19,726 model parameters (0.15 megabytes), and was successfully deployed on an STM32 embedded system for real-time PPG artifact detection. Therefore, this study provides an effective solution for resource-constraint IoT smart health devices in PPG artifact detection.
△ Less
Submitted 10 October, 2023; v1 submitted 5 May, 2023;
originally announced May 2023.
-
Carl-Lead: Lidar-based End-to-End Autonomous Driving with Contrastive Deep Reinforcement Learning
Authors:
Peide Cai,
Sukai Wang,
Hengli Wang,
Ming Liu
Abstract:
Autonomous driving in urban crowds at unregulated intersections is challenging, where dynamic occlusions and uncertain behaviors of other vehicles should be carefully considered. Traditional methods are heuristic and based on hand-engineered rules and parameters, but scale poorly in new situations. Therefore, they require high labor cost to design and maintain rules in all foreseeable scenarios. R…
▽ More
Autonomous driving in urban crowds at unregulated intersections is challenging, where dynamic occlusions and uncertain behaviors of other vehicles should be carefully considered. Traditional methods are heuristic and based on hand-engineered rules and parameters, but scale poorly in new situations. Therefore, they require high labor cost to design and maintain rules in all foreseeable scenarios. Recently, deep reinforcement learning (DRL) has shown promising results in urban driving scenarios. However, DRL is known to be sample inefficient, and most previous works assume perfect observations such as ground-truth locations and motions of vehicles without considering noises and occlusions, which might be a too strong assumption for policy deployment. In this work, we use DRL to train lidar-based end-to-end driving policies that naturally consider imperfect partial observations. We further use unsupervised contrastive representation learning as an auxiliary task to improve the sample efficiency. The comparative evaluation results reveal that our method achieves higher success rates than the state-of-the-art (SOTA) lidar-based end-to-end driving network, better trades off safety and efficiency than the carefully tuned rule-based method, and generalizes better to new scenarios than the baselines. Demo videos are available at https://caipeide.github.io/carl-lead/.
△ Less
Submitted 17 September, 2021;
originally announced September 2021.
-
DQ-GAT: Towards Safe and Efficient Autonomous Driving with Deep Q-Learning and Graph Attention Networks
Authors:
Peide Cai,
Hengli Wang,
Yuxiang Sun,
Ming Liu
Abstract:
Autonomous driving in multi-agent dynamic traffic scenarios is challenging: the behaviors of road users are uncertain and are hard to model explicitly, and the ego-vehicle should apply complicated negotiation skills with them, such as yielding, merging and taking turns, to achieve both safe and efficient driving in various settings. Traditional planning methods are largely rule-based and scale poo…
▽ More
Autonomous driving in multi-agent dynamic traffic scenarios is challenging: the behaviors of road users are uncertain and are hard to model explicitly, and the ego-vehicle should apply complicated negotiation skills with them, such as yielding, merging and taking turns, to achieve both safe and efficient driving in various settings. Traditional planning methods are largely rule-based and scale poorly in these complex dynamic scenarios, often leading to reactive or even overly conservative behaviors. Therefore, they require tedious human efforts to maintain workability. Recently, deep learning-based methods have shown promising results with better generalization capability but less hand engineering efforts. However, they are either implemented with supervised imitation learning (IL), which suffers from dataset bias and distribution mismatch issues, or are trained with deep reinforcement learning (DRL) but focus on one specific traffic scenario. In this work, we propose DQ-GAT to achieve scalable and proactive autonomous driving, where graph attention-based networks are used to implicitly model interactions, and deep Q-learning is employed to train the network end-to-end in an unsupervised manner. Extensive experiments in a high-fidelity driving simulator show that our method achieves higher success rates than previous learning-based methods and a traditional rule-based method, and better trades off safety and efficiency in both seen and unseen scenarios. Moreover, qualitative results on a trajectory dataset indicate that our learned policy can be transferred to the real world for practical applications with real-time speeds. Demonstration videos are available at https://caipeide.github.io/dq-gat/.
△ Less
Submitted 18 June, 2022; v1 submitted 11 August, 2021;
originally announced August 2021.
-
Vision-Based Autonomous Car Racing Using Deep Imitative Reinforcement Learning
Authors:
Peide Cai,
Hengli Wang,
Huaiyang Huang,
Yuxuan Liu,
Ming Liu
Abstract:
Autonomous car racing is a challenging task in the robotic control area. Traditional modular methods require accurate map**, localization and planning, which makes them computationally inefficient and sensitive to environmental changes. Recently, deep-learning-based end-to-end systems have shown promising results for autonomous driving/racing. However, they are commonly implemented by supervised…
▽ More
Autonomous car racing is a challenging task in the robotic control area. Traditional modular methods require accurate map**, localization and planning, which makes them computationally inefficient and sensitive to environmental changes. Recently, deep-learning-based end-to-end systems have shown promising results for autonomous driving/racing. However, they are commonly implemented by supervised imitation learning (IL), which suffers from the distribution mismatch problem, or by reinforcement learning (RL), which requires a huge amount of risky interaction data. In this work, we present a general deep imitative reinforcement learning approach (DIRL), which successfully achieves agile autonomous racing using visual inputs. The driving knowledge is acquired from both IL and model-based RL, where the agent can learn from human teachers as well as perform self-improvement by safely interacting with an offline world model. We validate our algorithm both in a high-fidelity driving simulation and on a real-world 1/20-scale RC-car with limited onboard computation. The evaluation results demonstrate that our method outperforms previous IL and RL methods in terms of sample efficiency and task performance. Demonstration videos are available at https://caipeide.github.io/autorace-dirl/
△ Less
Submitted 17 July, 2021;
originally announced July 2021.
-
DiGNet: Learning Scalable Self-Driving Policies for Generic Traffic Scenarios with Graph Neural Networks
Authors:
Peide Cai,
Hengli Wang,
Yuxiang Sun,
Ming Liu
Abstract:
Traditional decision and planning frameworks for self-driving vehicles (SDVs) scale poorly in new scenarios, thus they require tedious hand-tuning of rules and parameters to maintain acceptable performance in all foreseeable cases. Recently, self-driving methods based on deep learning have shown promising results with better generalization capability but less hand engineering effort. However, most…
▽ More
Traditional decision and planning frameworks for self-driving vehicles (SDVs) scale poorly in new scenarios, thus they require tedious hand-tuning of rules and parameters to maintain acceptable performance in all foreseeable cases. Recently, self-driving methods based on deep learning have shown promising results with better generalization capability but less hand engineering effort. However, most of the previous learning-based methods are trained and evaluated in limited driving scenarios with scattered tasks, such as lane-following, autonomous braking, and conditional driving. In this paper, we propose a graph-based deep network to achieve scalable self-driving that can handle massive traffic scenarios. Specifically, more than 7,000 km of evaluation is conducted in a high-fidelity driving simulator, in which our method can obey the traffic rules and safely navigate the vehicle in a large variety of urban, rural, and highway environments, including unprotected left turns, narrow roads, roundabouts, and pedestrian-rich intersections. Demonstration videos are available at https://caipeide.github.io/dignet/.
△ Less
Submitted 29 July, 2021; v1 submitted 13 November, 2020;
originally announced November 2020.
-
SNE-RoadSeg: Incorporating Surface Normal Information into Semantic Segmentation for Accurate Freespace Detection
Authors:
Rui Fan,
Hengli Wang,
Peide Cai,
Ming Liu
Abstract:
Freespace detection is an essential component of visual perception for self-driving cars. The recent efforts made in data-fusion convolutional neural networks (CNNs) have significantly improved semantic driving scene segmentation. Freespace can be hypothesized as a ground plane, on which the points have similar surface normals. Hence, in this paper, we first introduce a novel module, named surface…
▽ More
Freespace detection is an essential component of visual perception for self-driving cars. The recent efforts made in data-fusion convolutional neural networks (CNNs) have significantly improved semantic driving scene segmentation. Freespace can be hypothesized as a ground plane, on which the points have similar surface normals. Hence, in this paper, we first introduce a novel module, named surface normal estimator (SNE), which can infer surface normal information from dense depth/disparity images with high accuracy and efficiency. Furthermore, we propose a data-fusion CNN architecture, referred to as RoadSeg, which can extract and fuse features from both RGB images and the inferred surface normal information for accurate freespace detection. For research purposes, we publish a large-scale synthetic freespace detection dataset, named Ready-to-Drive (R2D) road dataset, collected under different illumination and weather conditions. The experimental results demonstrate that our proposed SNE module can benefit all the state-of-the-art CNNs for freespace detection, and our SNE-RoadSeg achieves the best overall performance among different datasets.
△ Less
Submitted 25 August, 2020;
originally announced August 2020.
-
Probabilistic End-to-End Vehicle Navigation in Complex Dynamic Environments with Multimodal Sensor Fusion
Authors:
Peide Cai,
Sukai Wang,
Yuxiang Sun,
Ming Liu
Abstract:
All-day and all-weather navigation is a critical capability for autonomous driving, which requires proper reaction to varied environmental conditions and complex agent behaviors. Recently, with the rise of deep learning, end-to-end control for autonomous vehicles has been well studied. However, most works are solely based on visual information, which can be degraded by challenging illumination con…
▽ More
All-day and all-weather navigation is a critical capability for autonomous driving, which requires proper reaction to varied environmental conditions and complex agent behaviors. Recently, with the rise of deep learning, end-to-end control for autonomous vehicles has been well studied. However, most works are solely based on visual information, which can be degraded by challenging illumination conditions such as dim light or total darkness. In addition, they usually generate and apply deterministic control commands without considering the uncertainties in the future. In this paper, based on imitation learning, we propose a probabilistic driving model with ultiperception capability utilizing the information from the camera, lidar and radar. We further evaluate its driving performance online on our new driving benchmark, which includes various environmental conditions (e.g., urban and rural areas, traffic densities, weather and times of the day) and dynamic obstacles (e.g., vehicles, pedestrians, motorcyclists and bicyclists). The results suggest that our proposed model outperforms baselines and achieves excellent generalization performance in unseen environments with heavy traffic and extreme weather.
△ Less
Submitted 4 May, 2020;
originally announced May 2020.
-
VTGNet: A Vision-based Trajectory Generation Network for Autonomous Vehicles in Urban Environments
Authors:
Peide Cai,
Yuxiang Sun,
Hengli Wang,
Ming Liu
Abstract:
Traditional methods for autonomous driving are implemented with many building blocks from perception, planning and control, making them difficult to generalize to varied scenarios due to complex assumptions and interdependencies. Recently, the end-to-end driving method has emerged, which performs well and generalizes to new environments by directly learning from export-provided data. However, many…
▽ More
Traditional methods for autonomous driving are implemented with many building blocks from perception, planning and control, making them difficult to generalize to varied scenarios due to complex assumptions and interdependencies. Recently, the end-to-end driving method has emerged, which performs well and generalizes to new environments by directly learning from export-provided data. However, many existing methods on this topic neglect to check the confidence of the driving actions and the ability to recover from driving mistakes. In this paper, we develop an uncertainty-aware end-to-end trajectory generation method based on imitation learning. It can extract spatiotemporal features from the front-view camera images for scene understanding, and then generate collision-free trajectories several seconds into the future. The experimental results suggest that under various weather and lighting conditions, our network can reliably generate trajectories in different urban environments, such as turning at intersections and slowing down for collision avoidance. Furthermore, closed-loop driving tests suggest that the proposed method achieves better cross-scene/platform driving results than the state-of-the-art (SOTA) end-to-end control method, where our model can recover from off-center and off-orientation errors and capture 80% of dangerous cases with high uncertainty estimations.
△ Less
Submitted 23 October, 2020; v1 submitted 27 April, 2020;
originally announced April 2020.
-
High-speed Autonomous Drifting with Deep Reinforcement Learning
Authors:
Peide Cai,
Xiaodong Mei,
Lei Tai,
Yuxiang Sun,
Ming Liu
Abstract:
Drifting is a complicated task for autonomous vehicle control. Most traditional methods in this area are based on motion equations derived by the understanding of vehicle dynamics, which is difficult to be modeled precisely. We propose a robust drift controller without explicit motion equations, which is based on the latest model-free deep reinforcement learning algorithm soft actor-critic. The dr…
▽ More
Drifting is a complicated task for autonomous vehicle control. Most traditional methods in this area are based on motion equations derived by the understanding of vehicle dynamics, which is difficult to be modeled precisely. We propose a robust drift controller without explicit motion equations, which is based on the latest model-free deep reinforcement learning algorithm soft actor-critic. The drift control problem is formulated as a trajectory following task, where the errorbased state and reward are designed. After being trained on tracks with different levels of difficulty, our controller is capable of making the vehicle drift through various sharp corners quickly and stably in the unseen map. The proposed controller is further shown to have excellent generalization ability, which can directly handle unseen vehicle types with different physical properties, such as mass, tire friction, etc.
△ Less
Submitted 5 January, 2020;
originally announced January 2020.