
OMRA: Online Motion Resolution Adaptation to Remedy Domain Shift in Learned Hierarchical B-frame Coding

Authors: Zong-Lin Gao, Sang NguyenQuang, Wen-Hsiao Peng, Xiem HoangVan

Abstract: Learned hierarchical B-frame coding aims to leverage bi-directional reference frames for better coding efficiency. However, the domain shift between training and test scenarios due to dataset limitations poses a challenge. This issue arises from training the codec with small groups of pictures (GOP) but testing it on large GOPs. Specifically, the motion estimation network, when trained on small GOPs, is unable to handle large motion at test time, incurring a negative impact on compression performance. To mitigate the domain shift, we present an online motion resolution adaptation (OMRA) method. It adapts the spatial resolution of video frames on a per-frame basis to suit the capability of the motion estimation network in a pre-trained B-frame codec. Our OMRA is an online, inference technique. It need not re-train the codec and is readily applicable to existing B-frame codecs that adopt hierarchical bi-directional prediction. Experimental results show that OMRA significantly enhances the compression performance of two state-of-the-art learned B-frame codecs on commonly used datasets.

Submitted 20 February, 2024; originally announced February 2024.

Comments: 7 pages, submitted to IEEE ICIP 2024

arXiv:2401.08135 [pdf, other]

Machine Learning-Based Malicious Vehicle Detection for Security Threats and Attacks in Vehicle Ad-hoc Network (VANET) Communications

Authors: Thanh Nguyen Canh, Xiem HoangVan

Abstract: With the rapid growth of Vehicle Ad-hoc Network (VANET) as a promising technology for efficient and reliable communication among vehicles and infrastructure, the security and integrity of VANET communications has become a critical concern. One of the significant threats to VANET is the presence of blackhole attacks, where malicious nodes disrupt the network's functionality and compromise data confidentiality, integrity, and availability. In this paper, we propose a machine learning-based approach for blackhole detection in VANET. To achieve this task, we first create a comprehensive dataset comprising normal and malicious traffic flows. Afterward, we study and define a promising set of features to discriminate the blackhole attacks. Finally, we evaluate various machine learning algorithms, including Gradient Boosting, Random Forest, Support Vector Machines, k-Nearest Neighbors, Gaussian Naive Bayes, and Logistic Regression. Experimental results demonstrate the effectiveness of these algorithms in distinguishing between normal and malicious nodes. Our findings also highlight the potential of machine learning based approach in enhancing the security of VANET by detecting and mitigating blackhole attacks.

Submitted 16 January, 2024; originally announced January 2024.

Comments: In the 2023 RIVF International Conference on Computing and Communication Technologies, Hanoi, Vietnam

arXiv:2401.08134 [pdf, other]

S3M: Semantic Segmentation Sparse Mapping for UAVs with RGB-D Camera

Authors: Thanh Nguyen Canh, Van-Truong Nguyen, Xiem HoangVan, Armagan Elibol, Nak Young Chong

Abstract: Unmanned Aerial Vehicles (UAVs) hold immense potential for critical applications, such as search and rescue operations, where accurate perception of indoor environments is paramount. However, the concurrent amalgamation of localization, 3D reconstruction, and semantic segmentation presents a notable hurdle, especially in the context of UAVs equipped with constrained power and computational resources. This paper presents a novel approach to address challenges in semantic information extraction and utilization within UAV operations. Our system integrates state-of-the-art visual SLAM to estimate a comprehensive 6-DoF pose and advanced object segmentation methods at the back end. To improve the computational and storage efficiency of the framework, we adopt a streamlined voxel-based 3D map representation - OctoMap to build a working system. Furthermore, the fusion algorithm is incorporated to obtain the semantic information of each frame from the front-end SLAM task, and the corresponding point. By leveraging semantic information, our framework enhances the UAV's ability to perceive and navigate through indoor spaces, addressing challenges in pose estimation accuracy and uncertainty reduction. Through Gazebo simulations, we validate the efficacy of our proposed system and successfully embed our approach into a Jetson Xavier AGX unit for real-world applications.

Submitted 16 January, 2024; originally announced January 2024.

Comments: In The 2024 IEEE/SICE International Symposium on System Integration (SII2024), Ha Long, Vietnam

arXiv:2401.08132 [pdf, other]

Object-Oriented Semantic Mapping for Reliable UAVs Navigation

Authors: Thanh Nguyen Canh, Armagan Elibol, Nak Young Chong, Xiem HoangVan

Abstract: To autonomously navigate in real-world environments, especially in search and rescue operations, Unmanned Aerial Vehicles (UAVs) necessitate comprehensive maps to ensure safety. However, the prevalent metric map often lacks semantic information crucial for holistic scene comprehension. In this paper, we propose a system to construct a probabilistic metric map enriched with object information extracted from the environment from RGB-D images. Our approach combines a state-of-the-art YOLOv8-based object detection framework at the front end and a 2D SLAM method - Cartographer at the back end. To effectively track and position semantic object classes extracted from the front-end interface, we employ the innovative BoT-SORT methodology. A novel association method is introduced to extract the position of objects and then project it with the metric map. Unlike previous research, our approach takes into account reliable navigation in environments with various hollow-bottom objects. The output of our system is a probabilistic map, which significantly enhances the map's representation by incorporating object-specific attributes, encompassing class distinctions, accurate positioning, and object heights. A number of experiments have been conducted to evaluate our proposed approach. The results show that the robot can effectively produce augmented semantic maps containing several objects (notably chairs and desks). Furthermore, our system is evaluated within an embedded computer - Jetson Xavier AGX unit to demonstrate the use case in real-world applications.

Submitted 16 January, 2024; originally announced January 2024.

Comments: In the 12th International Conference on Control, Automation and Information Sciences (ICCAIS 2023), Hanoi, Vietnam

arXiv:2212.13218 [pdf, other]

Multisensor Data Fusion for Reliable Obstacle Avoidance

Authors: Thanh Nguyen Canh, Truong Son Nguyen, Cong Hoang Quach, Xiem HoangVan, Manh Duong Phung

Abstract: In this work, we propose a new approach that combines data from multiple sensors for reliable obstacle avoidance. The sensors include two depth cameras and a LiDAR arranged so that they can capture the whole 3D area in front of the robot and a 2D slide around it. To fuse the data from these sensors, we first use an external camera as a reference to combine data from two depth cameras. A projection technique is then introduced to convert the 3D point cloud data of the cameras to its 2D correspondence. An obstacle avoidance algorithm is then developed based on the dynamic window approach. A number of experiments have been conducted to evaluate our proposed approach. The results show that the robot can effectively avoid static and dynamic obstacles of different shapes and sizes in different environments.

Submitted 26 December, 2022; originally announced December 2022.

Comments: In the 11th International Conference on Control, Automation and Information Sciences (ICCAIS 2022), Hanoi, Vietnam

Showing 1–5 of 5 results for author: HoangVan, X