-
OMRA: Online Motion Resolution Adaptation to Remedy Domain Shift in Learned Hierarchical B-frame Coding
Authors:
Zong-Lin Gao,
Sang NguyenQuang,
Wen-Hsiao Peng,
Xiem HoangVan
Abstract:
Learned hierarchical B-frame coding aims to leverage bi-directional reference frames for better coding efficiency. However, the domain shift between training and test scenarios due to dataset limitations poses a challenge. This issue arises from training the codec with small groups of pictures (GOP) but testing it on large GOPs. Specifically, the motion estimation network, when trained on small GO…
▽ More
Learned hierarchical B-frame coding aims to leverage bi-directional reference frames for better coding efficiency. However, the domain shift between training and test scenarios due to dataset limitations poses a challenge. This issue arises from training the codec with small groups of pictures (GOP) but testing it on large GOPs. Specifically, the motion estimation network, when trained on small GOPs, is unable to handle large motion at test time, incurring a negative impact on compression performance. To mitigate the domain shift, we present an online motion resolution adaptation (OMRA) method. It adapts the spatial resolution of video frames on a per-frame basis to suit the capability of the motion estimation network in a pre-trained B-frame codec. Our OMRA is an online, inference technique. It need not re-train the codec and is readily applicable to existing B-frame codecs that adopt hierarchical bi-directional prediction. Experimental results show that OMRA significantly enhances the compression performance of two state-of-the-art learned B-frame codecs on commonly used datasets.
△ Less
Submitted 20 February, 2024;
originally announced February 2024.
-
Machine Learning-Based Malicious Vehicle Detection for Security Threats and Attacks in Vehicle Ad-hoc Network (VANET) Communications
Authors:
Thanh Nguyen Canh,
Xiem HoangVan
Abstract:
With the rapid growth of Vehicle Ad-hoc Network (VANET) as a promising technology for efficient and reliable communication among vehicles and infrastructure, the security and integrity of VANET communications has become a critical concern. One of the significant threats to VANET is the presence of blackhole attacks, where malicious nodes disrupt the network's functionality and compromise data conf…
▽ More
With the rapid growth of Vehicle Ad-hoc Network (VANET) as a promising technology for efficient and reliable communication among vehicles and infrastructure, the security and integrity of VANET communications has become a critical concern. One of the significant threats to VANET is the presence of blackhole attacks, where malicious nodes disrupt the network's functionality and compromise data confidentiality, integrity, and availability. In this paper, we propose a machine learning-based approach for blackhole detection in VANET. To achieve this task, we first create a comprehensive dataset comprising normal and malicious traffic flows. Afterward, we study and define a promising set of features to discriminate the blackhole attacks. Finally, we evaluate various machine learning algorithms, including Gradient Boosting, Random Forest, Support Vector Machines, k-Nearest Neighbors, Gaussian Naive Bayes, and Logistic Regression. Experimental results demonstrate the effectiveness of these algorithms in distinguishing between normal and malicious nodes. Our findings also highlight the potential of machine learning based approach in enhancing the security of VANET by detecting and mitigating blackhole attacks.
△ Less
Submitted 16 January, 2024;
originally announced January 2024.
-
S3M: Semantic Segmentation Sparse Map** for UAVs with RGB-D Camera
Authors:
Thanh Nguyen Canh,
Van-Truong Nguyen,
Xiem HoangVan,
Armagan Elibol,
Nak Young Chong
Abstract:
Unmanned Aerial Vehicles (UAVs) hold immense potential for critical applications, such as search and rescue operations, where accurate perception of indoor environments is paramount. However, the concurrent amalgamation of localization, 3D reconstruction, and semantic segmentation presents a notable hurdle, especially in the context of UAVs equipped with constrained power and computational resourc…
▽ More
Unmanned Aerial Vehicles (UAVs) hold immense potential for critical applications, such as search and rescue operations, where accurate perception of indoor environments is paramount. However, the concurrent amalgamation of localization, 3D reconstruction, and semantic segmentation presents a notable hurdle, especially in the context of UAVs equipped with constrained power and computational resources. This paper presents a novel approach to address challenges in semantic information extraction and utilization within UAV operations. Our system integrates state-of-the-art visual SLAM to estimate a comprehensive 6-DoF pose and advanced object segmentation methods at the back end. To improve the computational and storage efficiency of the framework, we adopt a streamlined voxel-based 3D map representation - OctoMap to build a working system. Furthermore, the fusion algorithm is incorporated to obtain the semantic information of each frame from the front-end SLAM task, and the corresponding point. By leveraging semantic information, our framework enhances the UAV's ability to perceive and navigate through indoor spaces, addressing challenges in pose estimation accuracy and uncertainty reduction. Through Gazebo simulations, we validate the efficacy of our proposed system and successfully embed our approach into a Jetson Xavier AGX unit for real-world applications.
△ Less
Submitted 16 January, 2024;
originally announced January 2024.
-
Object-Oriented Semantic Map** for Reliable UAVs Navigation
Authors:
Thanh Nguyen Canh,
Armagan Elibol,
Nak Young Chong,
Xiem HoangVan
Abstract:
To autonomously navigate in real-world environments, special in search and rescue operations, Unmanned Aerial Vehicles (UAVs) necessitate comprehensive maps to ensure safety. However, the prevalent metric map often lacks semantic information crucial for holistic scene comprehension. In this paper, we proposed a system to construct a probabilistic metric map enriched with object information extract…
▽ More
To autonomously navigate in real-world environments, special in search and rescue operations, Unmanned Aerial Vehicles (UAVs) necessitate comprehensive maps to ensure safety. However, the prevalent metric map often lacks semantic information crucial for holistic scene comprehension. In this paper, we proposed a system to construct a probabilistic metric map enriched with object information extracted from the environment from RGB-D images. Our approach combines a state-of-the-art YOLOv8-based object detection framework at the front end and a 2D SLAM method - CartoGrapher at the back end. To effectively track and position semantic object classes extracted from the front-end interface, we employ the innovative BoT-SORT methodology. A novel association method is introduced to extract the position of objects and then project it with the metric map. Unlike previous research, our approach takes into reliable navigating in the environment with various hollow bottom objects. The output of our system is a probabilistic map, which significantly enhances the map's representation by incorporating object-specific attributes, encompassing class distinctions, accurate positioning, and object heights. A number of experiments have been conducted to evaluate our proposed approach. The results show that the robot can effectively produce augmented semantic maps containing several objects (notably chairs and desks). Furthermore, our system is evaluated within an embedded computer - Jetson Xavier AGX unit to demonstrate the use case in real-world applications.
△ Less
Submitted 16 January, 2024;
originally announced January 2024.
-
Multisensor Data Fusion for Reliable Obstacle Avoidance
Authors:
Thanh Nguyen Canh,
Truong Son Nguyen,
Cong Hoang Quach,
Xiem HoangVan,
Manh Duong Phung
Abstract:
In this work, we propose a new approach that combines data from multiple sensors for reliable obstacle avoidance. The sensors include two depth cameras and a LiDAR arranged so that they can capture the whole 3D area in front of the robot and a 2D slide around it. To fuse the data from these sensors, we first use an external camera as a reference to combine data from two depth cameras. A projection…
▽ More
In this work, we propose a new approach that combines data from multiple sensors for reliable obstacle avoidance. The sensors include two depth cameras and a LiDAR arranged so that they can capture the whole 3D area in front of the robot and a 2D slide around it. To fuse the data from these sensors, we first use an external camera as a reference to combine data from two depth cameras. A projection technique is then introduced to convert the 3D point cloud data of the cameras to its 2D correspondence. An obstacle avoidance algorithm is then developed based on the dynamic window approach. A number of experiments have been conducted to evaluate our proposed approach. The results show that the robot can effectively avoid static and dynamic obstacles of different shapes and sizes in different environments.
△ Less
Submitted 26 December, 2022;
originally announced December 2022.