MARS: Multi-Scale Adaptive Robotics Vision for Underwater Object Detection and Domain Generalization

Authors: Lyes Saad Saoud, Lakmal Seneviratne, Irfan Hussain

Abstract: Underwater robotic vision encounters significant challenges, necessitating advanced solutions to enhance performance and adaptability. This paper presents MARS (Multi-Scale Adaptive Robotics Vision), a novel approach to underwater object detection tailored for diverse underwater scenarios. MARS integrates Residual Attention YOLOv3 with Domain-Adaptive Multi-Scale Attention (DAMSA) to enhance detection accuracy and adapt to different domains. During training, DAMSA introduces domain class-based attention, enabling the model to emphasize domain-specific features. Our comprehensive evaluation across various underwater datasets demonstrates MARS's performance. On the original dataset, MARS achieves a mean Average Precision (mAP) of 58.57\%, showcasing its proficiency in detecting critical underwater objects like echinus, starfish, holothurian, scallop, and waterweeds. This capability holds promise for applications in marine robotics, marine biology research, and environmental monitoring. Furthermore, MARS excels at mitigating domain shifts. On the augmented dataset, which incorporates all enhancements (+Domain +Residual+Channel Attention+Multi-Scale Attention), MARS achieves an mAP of 36.16\%. This result underscores its robustness and adaptability in recognizing objects and performing well across a range of underwater conditions. The source code for MARS is publicly available on GitHub at https://github.com/LyesSaadSaoud/MARS-Object-Detection/

Submitted 23 December, 2023; originally announced December 2023.

arXiv:2312.06801 [pdf, other]

ADOD: Adaptive Domain-Aware Object Detection with Residual Attention for Underwater Environments

Authors: Lyes Saad Saoud, Zhenwei Niu, Atif Sultan, Lakmal Seneviratne, Irfan Hussain

Abstract: This research presents ADOD, a novel approach to address domain generalization in underwater object detection. Our method enhances the model's ability to generalize across diverse and unseen domains, ensuring robustness in various underwater environments. The first key contribution is Residual Attention YOLOv3, a novel variant of the YOLOv3 framework empowered by residual attention modules. These modules enable the model to focus on informative features while suppressing background noise, leading to improved detection accuracy and adaptability to different domains. The second contribution is the attention-based domain classification module, vital during training. This module helps the model identify domain-specific information, facilitating the learning of domain-invariant features. Consequently, ADOD can generalize effectively to underwater environments with distinct visual characteristics. Extensive experiments on diverse underwater datasets demonstrate ADOD's superior performance compared to state-of-the-art domain generalization methods, particularly in challenging scenarios. The proposed model achieves exceptional detection performance in both seen and unseen domains, showcasing its effectiveness in handling domain shifts in underwater object detection tasks. ADOD represents a significant advancement in adaptive object detection, providing a promising solution for real-world applications in underwater environments. With the prevalence of domain shifts in such settings, the model's strong generalization ability becomes a valuable asset for practical underwater surveillance and marine research endeavors.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2311.17197 [pdf, other]

Marine$\mathcal{X}$: Design and Implementation of Unmanned Surface Vessel for Vision Guided Navigation

Authors: Muhayy Ud Din, Ahmed Humais, Waseem Akram, Mohamed Alblooshi, Lyes Saad Saoud, Abdelrahman Alblooshi, Lakmal Seneviratne, Irfan Hussain

Abstract: Marine robots, particularly Unmanned Surface Vessels (USVs), have gained considerable attention for their diverse applications in maritime tasks, including search and rescue, environmental monitoring, and maritime security. This paper presents the design and implementation of a USV named marine$\mathcal{X}$. The hardware components of marine$\mathcal{X}$ are meticulously developed to ensure robustness, efficiency, and adaptability to varying environmental conditions. Furthermore, the integration of a vision-based object tracking algorithm empowers marine$\mathcal{X}$ to autonomously track and monitor specific objects on the water surface. The control system utilizes PID control, enabling precise navigation of marine$\mathcal{X}$ while maintaining a desired course and distance to the target object. To assess the performance of marine$\mathcal{X}$, comprehensive testing is conducted, encompassing simulation, trials in the marine pool, and real-world tests in the open sea. The successful outcomes of these tests demonstrate the USV's capabilities in achieving real-time object tracking, showcasing its potential for various applications in maritime operations.

Submitted 28 November, 2023; originally announced November 2023.

Comments: accepted in ICAR

Journal ref: The 21st International Conference on Advanced Robotics (ICAR 2023)
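
The abstract above states that the vessel's control system uses PID control to hold a desired course and distance to a tracked object. As a rough illustration only, a textbook discrete PID loop can be sketched as follows. The class name, the gains, and the two-controller split (heading and distance) are assumptions made for this sketch and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook discrete PID controller (hypothetical gains, not the paper's)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        """Return the control output for the current error and time step dt (seconds)."""
        if dt <= 0:
            raise ValueError("dt must be positive")
        # Accumulate the integral term with a simple rectangular rule.
        self.integral += error * dt
        # No derivative contribution on the first call, since there is no previous error.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Assumed usage: one controller per tracked quantity.
heading_pid = PID(kp=2.0, ki=0.5, kd=0.1)   # error: bearing to target (radians)
distance_pid = PID(kp=1.0, ki=0.0, kd=0.2)  # error: range minus desired range (metres)

turn_command = heading_pid.update(error=1.0, dt=0.1)
thrust_command = distance_pid.update(error=3.0, dt=0.1)
```

In a real vessel the outputs would typically be saturated and the integral term protected against wind-up; both are omitted here to keep the sketch short.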

arXiv:2310.02573 [pdf, other]

Robust Collision Detection for Robots with Variable Stiffness Actuation by Using MAD-CNN: Modularized-Attention-Dilated Convolutional Neural Network

Authors: Zhenwei Niu, Lyes Saad Saoud, Irfan Hussain

Abstract: Ensuring safety is paramount in the field of collaborative robotics to mitigate the risks of human injury and environmental damage. Apart from collision avoidance, it is crucial for robots to rapidly detect and respond to unexpected collisions. While several learning-based collision detection methods have been introduced as alternatives to purely model-based detection techniques, there is currently a lack of such methods designed for collaborative robots equipped with variable stiffness actuators. Moreover, there is potential for further enhancing the network's robustness and improving the data efficiency of training. In this paper, we propose a new network, the Modularized Attention-Dilated Convolutional Neural Network (MAD-CNN), for collision detection in robots equipped with variable stiffness actuators. Our model incorporates a dual inductive bias mechanism and an attention module to enhance data efficiency and improve robustness. In particular, MAD-CNN is trained using only a four-minute collision dataset focusing on the highest level of joint stiffness. Despite limited training data, MAD-CNN robustly detects all collisions with minimal detection delay across various stiffness conditions. Moreover, it exhibits a higher level of collision sensitivity, which helps in handling false positives, a common issue in learning-based methods. Experimental results demonstrate that the proposed MAD-CNN model outperforms existing state-of-the-art models in terms of collision sensitivity and robustness.

Submitted 30 January, 2024; v1 submitted 4 October, 2023; originally announced October 2023.
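
MAD-CNN, as its name indicates, builds on dilated convolutions. The abstract does not give the layer configuration, so the following is only a generic sketch of a one-dimensional dilated convolution (no padding, stride 1, cross-correlation as in most deep learning libraries). The function name and the example signal are assumptions for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid 1-D dilated convolution (cross-correlation form), stride 1.

    Each output sample combines kernel taps spaced `dilation` samples apart,
    so the receptive field is (len(kernel) - 1) * dilation + 1 samples.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


# Hypothetical joint-torque-like signal and a simple difference kernel.
signal = [1, 2, 3, 4, 5, 6]
kernel = [1, 0, -1]

dense = dilated_conv1d(signal, kernel, dilation=1)  # receptive field of 3 samples
wide = dilated_conv1d(signal, kernel, dilation=2)   # receptive field of 5 samples
```

Increasing the dilation widens the time window each output sees without adding weights, which is the usual motivation for dilated layers on time-series inputs.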

arXiv:2310.01795 [pdf, other]

TempoNet: Empowering long-term Knee Joint Angle Prediction with Dynamic Temporal Attention in Exoskeleton Control

Authors: Lyes Saad Saoud, Irfan Hussain

Abstract: In the realm of exoskeleton control, achieving precise control poses challenges due to the mechanical delay of exoskeletons. To address this, incorporating future gait trajectories as feed-forward input has been proposed. However, existing deep learning models for gait prediction mainly focus on short-term predictions, leaving the long-term performance of these models relatively unexplored. In this study, we present TempoNet, a novel model specifically designed for precise knee joint angle prediction. By harnessing dynamic temporal attention within the Transformer-based architecture, TempoNet surpasses existing models in forecasting knee joint angles over extended time horizons. Notably, our model achieves a remarkable reduction of 10\% to 185\% in Mean Absolute Error (MAE) for 100 ms ahead forecasting compared to other transformer-based models, demonstrating its effectiveness. Furthermore, TempoNet exhibits further reliability and superiority over the baseline Transformer model, outperforming it by 14\% in MAE for the 200 ms prediction horizon. These findings highlight the efficacy of TempoNet in accurately predicting knee joint angles and emphasize the importance of incorporating dynamic temporal attention. TempoNet's capability to enhance knee joint angle prediction accuracy opens up possibilities for precise control, improved rehabilitation outcomes, advanced sports performance analysis, and deeper insights into biomechanical research. Code implementation for the TempoNet model can be found in the GitHub repository: https://github.com/LyesSaadSaoud/TempoNet.

Submitted 3 October, 2023; originally announced October 2023.

Comments: Accepted for presentation at the 2023 IEEE-RAS International Conference on Humanoid Robots, Austin, USA, on December 12-14

arXiv:2308.14762 [pdf, other]

Autonomous Underwater Robotic System for Aquaculture Applications

Authors: Waseem Akram, Muhayyuddin Ahmed, Lyes Saad Saoud, Lakmal Seneviratne, Irfan Hussain

Abstract: Aquaculture is a thriving food-producing sector, accounting for over half of global fish consumption. However, aquafarms face significant challenges such as biofouling, vegetation, and holes within their net pens, which have a profound effect on the efficiency and sustainability of fish production. Currently, divers and/or remotely operated vehicles are deployed for inspecting and maintaining aquafarms; this approach is expensive and requires highly skilled human operators. This work aims to develop a robotic-based automatic net defect detection system for aquaculture net pens oriented to on-ROV processing and real-time detection of different aqua-net defects such as biofouling, vegetation, net holes, and plastic. The proposed system integrates both deep learning-based methods for aqua-net defect detection and a feedback control law for the vehicle movement around the aqua-net to obtain a clear sequence of net images and inspect the status of the net. This work contributes to the areas of aquaculture inspection, marine robotics, and deep learning, aiming to reduce cost, improve quality, and ease operation.

Submitted 26 August, 2023; originally announced August 2023.

Comments: arXiv admin note: text overlap with arXiv:2308.13826

arXiv:2306.06900 [pdf, other]

Improving Knee Joint Angle Prediction through Dynamic Contextual Focus and Gated Linear Units

Authors: Lyes Saad Saoud, Humaid Ibrahim, Ahmad Aljarah, Irfan Hussain

Abstract: Accurate knee joint angle prediction is crucial for biomechanical analysis and rehabilitation. In this study, we introduce FocalGatedNet, a novel deep learning model that incorporates Dynamic Contextual Focus (DCF) Attention and Gated Linear Units (GLU) to enhance feature dependencies and interactions. Our model is evaluated on a large-scale dataset and compared to established models in multi-step gait trajectory prediction. Our results reveal that FocalGatedNet outperforms existing models for long-term prediction lengths (20 ms, 60 ms, 80 ms, and 100 ms), demonstrating significant improvements in Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). Specifically for the case of 80 ms, FocalGatedNet achieves a notable MAE reduction of up to 24\%, RMSE reduction of up to 14\%, and MAPE reduction of up to 36\% when compared to Transformer, highlighting its effectiveness in capturing complex knee joint angle patterns. Moreover, FocalGatedNet maintains a lower computational load than most equivalent deep learning models, making it an efficient choice for real-time biomechanical analysis and rehabilitation applications.

Submitted 2 October, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

Comments: Under consideration at Pattern Recognition Letters
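
The knee-angle abstracts in this listing report results as MAE, RMSE, and MAPE, and as percentage reductions relative to a baseline. For readers unfamiliar with these metrics, a minimal sketch using their standard definitions follows. The sample angles and the baseline and proposed error values are made up for illustration and are not results from the papers.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent. Assumes no true value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_error, proposed_error):
    """Percentage by which the proposed model's error is lower than the baseline's."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Made-up knee angles in degrees, for illustration only.
angles_true = [10.0, 20.0, 30.0]
angles_pred = [12.0, 18.0, 33.0]

print(mae(angles_true, angles_pred))
print(rmse(angles_true, angles_pred))
print(mape(angles_true, angles_pred))
print(relative_reduction(0.50, 0.38))  # hypothetical baseline vs. proposed MAE
```

Under this definition a reduction cannot exceed 100\% when errors are non-negative, so reported figures above that value presumably rely on a different normalization than the one sketched here.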

arXiv:2207.02589 [pdf, other]

Cascaded Deep Hybrid Models for Multistep Household Energy Consumption Forecasting

Authors: Lyes Saad Saoud, Hasan AlMarzouqi, Ramy Hussein

Abstract: Sustainability requires increased energy efficiency with minimal waste. Future power systems should thus provide high levels of flexibility in controlling energy consumption. Precise projections of future energy demand/load at the aggregate and individual site levels are of great importance for decision makers and professionals in the energy industry. Forecasting energy loads has become more advantageous for energy providers and customers, allowing them to establish an efficient production strategy to satisfy demand. This study introduces two hybrid cascaded models for forecasting multistep household power consumption at different resolutions. The first model integrates Stationary Wavelet Transform (SWT), as an efficient signal preprocessing technique, with Convolutional Neural Networks and Long Short Term Memory (LSTM). The second hybrid model combines SWT with a self-attention based neural network architecture named transformer. The major constraint of using time-frequency analysis methods such as SWT in multistep energy forecasting problems is that they require sequential signals, making signal reconstruction problematic in multistep forecasting applications. The cascaded models can efficiently address this problem by using the recursive outputs. Experimental results show that the proposed hybrid models achieve superior prediction performance compared to the existing multistep power consumption prediction methods. The results will pave the way for more accurate and reliable forecasting of household power consumption.

Submitted 13 October, 2022; v1 submitted 6 July, 2022; originally announced July 2022.

Comments: Under consideration at Pattern Recognition Letters
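
The energy-forecasting abstract says its cascaded models handle multistep prediction by using recursive outputs. The general idea of recursive multistep forecasting, where each prediction is fed back as input for the next step, can be sketched as below. The one-step predictor here is a plain window mean standing in for the paper's SWT and neural network stages, which are not reproduced; all names are assumptions for this sketch.

```python
def recursive_forecast(history, one_step_model, window, steps):
    """Forecast `steps` values ahead by feeding each prediction back as input.

    `one_step_model` is any callable mapping a list of `window` past values
    to the next value.
    """
    if window > len(history):
        raise ValueError("window is longer than the available history")
    buffer = list(history[-window:])
    forecasts = []
    for _ in range(steps):
        next_value = one_step_model(buffer)
        forecasts.append(next_value)
        # Slide the window: drop the oldest sample, append the new prediction.
        buffer = buffer[1:] + [next_value]
    return forecasts


def window_mean(values):
    """Placeholder one-step predictor (assumption): mean of the input window."""
    return sum(values) / len(values)


# Made-up household load readings in kW.
load_history = [1.0, 2.0, 3.0]
predicted = recursive_forecast(load_history, window_mean, window=2, steps=2)
```

Because later steps depend on earlier predictions, errors can accumulate along the horizon; that trade-off is inherent to any recursive scheme.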

arXiv:2206.09731 [pdf, other]

doi 10.1109/TGRS.2023.3268159

Semantic Labeling of High Resolution Images Using EfficientUNets and Transformers

Authors: Hasan AlMarzouqi, Lyes Saad Saoud

Abstract: Semantic segmentation necessitates approaches that learn high-level characteristics while dealing with enormous amounts of data. Convolutional neural networks (CNNs) can learn unique and adaptive features to achieve this aim. However, due to the large size and high spatial resolution of remote sensing images, these networks cannot analyze an entire scene efficiently. Recently, deep transformers have proven their capability to record global interactions between different objects in the image. In this paper, we propose a new segmentation model that combines convolutional neural networks with transformers, and show that this mixture of local and global feature extraction techniques provides significant advantages in remote sensing segmentation. In addition, the proposed model includes two fusion layers that are designed to represent multi-modal inputs and output of the network efficiently. The input fusion layer extracts feature maps summarizing the relationship between image content and elevation maps (DSM). The output fusion layer uses a novel multi-task segmentation strategy where class labels are identified using class-specific feature extraction layers and loss functions. Finally, a fast-marching method is used to convert all unidentified class labels to their closest known neighbors. Our results demonstrate that the proposed methodology improves segmentation accuracy compared to state-of-the-art techniques.

Submitted 22 June, 2022; v1 submitted 20 June, 2022; originally announced June 2022.
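
The segmentation abstract ends with a post-processing step that assigns each unidentified pixel the label of its closest known neighbor. The paper uses a fast-marching method for this. The sketch below swaps in a simpler multi-source breadth-first search on a 4-connected grid, which approximates "closest" by grid steps rather than Euclidean distance. The `unknown` marker value and the function name are assumptions.

```python
from collections import deque


def fill_unknown_labels(grid, unknown=-1):
    """Replace `unknown` cells with the label of the nearest labelled cell.

    Distance is counted in 4-connected grid steps; ties go to whichever
    labelled cell reaches the pixel first in scan order. This is a BFS
    stand-in, not the fast-marching method named in the abstract.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    # Start the search from every labelled cell at once.
    queue = deque(
        (r, c) for r in range(rows) for c in range(cols) if out[r][c] != unknown
    )
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and out[nr][nc] == unknown:
                out[nr][nc] = out[r][c]
                queue.append((nr, nc))
    return out


# Tiny made-up label map: classes 1 and 2, with -1 marking unidentified pixels.
labels = [[1, -1, -1, 2]]
filled = fill_unknown_labels(labels)
```

The input grid is copied rather than modified, so the original label map remains available for comparison.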

Showing 1–9 of 9 results for author: Saoud, L S