-
Ego-to-Exo: Interfacing Third Person Visuals from Egocentric Views in Real-time for Improved ROV Teleoperation
Authors:
Adnan Abdullah,
Ruo Chen,
Ioannis Rekleitis,
Md Jahidul Islam
Abstract:
Underwater ROVs (Remotely Operated Vehicles) are unmanned submersible vehicles designed for exploring and operating in the depths of the ocean. Despite using high-end cameras, typical teleoperation engines based on first-person (egocentric) views limit a surface operator's ability to maneuver and navigate the ROV in complex deep-water missions. In this paper, we present an interactive teleoperation interface that (i) offers on-demand "third"-person (exocentric) visuals from past egocentric views, and (ii) facilitates enhanced peripheral information with augmented ROV pose in real-time. We achieve this by integrating a 3D geometry-based Ego-to-Exo view synthesis algorithm into a monocular SLAM system for accurate trajectory estimation. The proposed closed-form solution only uses past egocentric views from the ROV and a SLAM backbone for pose estimation, which makes it portable to existing ROV platforms. Unlike data-driven solutions, it is invariant to applications and waterbody-specific scenes. We validate the geometric accuracy of the proposed framework through extensive experiments of 2-DOF indoor navigation and 6-DOF underwater cave exploration in challenging low-light conditions. We demonstrate the benefits of dynamic Ego-to-Exo view generation and real-time pose rendering for remote ROV teleoperation by following navigation guides such as cavelines inside underwater caves. This new way of interactive ROV teleoperation opens up promising opportunities for future research in underwater telerobotics.
Submitted 30 June, 2024;
originally announced July 2024.
-
Adaptive Lightweight Security for Performance Efficiency in Critical Healthcare Monitoring
Authors:
Ijaz Ahmad,
Faheem Shahid,
Ijaz Ahmad,
Johirul Islam,
Kazi Nymul Haque,
Erkki Harjula
Abstract:
The healthcare infrastructure requires robust security procedures, technologies, and policies due to its critical nature. Since the Internet of Things (IoT) with its diverse technologies has become an integral component of future healthcare systems, its security requires a thorough analysis due to its inherent security limitations that arise from resource constraints. Existing communication technologies used for IoT connectivity, such as 5G, provide communications security with the underlying communication infrastructure to a certain level. However, the evolving healthcare paradigm requires adaptive security procedures and technologies that can adapt to the varying resource constraints of IoT devices. This need for adaptive security is particularly pronounced when considering components outside the security sandbox of 5G, such as IoT nodes and M2M connections, which introduce additional security challenges. This article brings forth the unique healthcare monitoring requirements and studies the existing encryption-based security approaches to provide the necessary security. Furthermore, this research introduces a novel approach to optimizing security and performance in IoT in healthcare, particularly in critical use cases such as remote patient monitoring. Finally, the results from the practical implementation demonstrate a marked improvement in the system performance.
Submitted 6 June, 2024;
originally announced June 2024.
-
Blockchain-based AI Methods for Managing Industrial IoT: Recent Developments, Integration Challenges and Opportunities
Authors:
Anichur Rahman,
Dipanjali Kundu,
Tanoy Debnath,
Muaz Rahman,
Airin Afroj Aishi,
Jahidul Islam
Abstract:
Blockchain (BC), Artificial Intelligence (AI), and the smart Industrial Internet of Things (IIoT) are not only leading technologies but also help society raise its standard of living and make life easier for users. These technologies have been applied in various domains for different purposes, and they have successfully assisted in developing systems such as smart cities, homes, manufacturers, education, and industries. Moreover, these technologies need to address various issues (security, privacy, confidentiality, scalability) and application challenges in diverse fields. In this context, with the increasing demand for solutions to these issues, the authors present a comprehensive survey of AI approaches with BC in the smart IIoT. First, we give a state-of-the-art overview of AI, BC, and smart IoT applications. Then, we describe the benefits of integrating these technologies and discuss the established methods, tools, and strategies. Most importantly, we highlight the main issues (security, stability, scalability, and confidentiality) and outline strategies and methods for addressing them. Furthermore, we discuss the individual and collaborative benefits of applications. Lastly, we examine the open research challenges and potential future directions for BC-based AI approaches in intelligent IIoT systems.
Submitted 21 May, 2024;
originally announced May 2024.
-
Bangladeshi Native Vehicle Detection in Wild
Authors:
Bipin Saha,
Md. Johirul Islam,
Shaikh Khaled Mostaque,
Aditya Bhowmik,
Tapodhir Karmakar Taton,
Md. Nakib Hayat Chowdhury,
Mamun Bin Ibne Reaz
Abstract:
The success of autonomous navigation relies on robust and precise vehicle recognition, which is hindered by the scarcity of region-specific vehicle detection datasets, impeding the development of context-aware systems. To advance terrestrial object detection research, this paper proposes a native vehicle detection dataset for the most commonly seen vehicle classes in Bangladesh. 17 distinct vehicle classes have been taken into account, with 81542 fully annotated instances across 17326 images. Each image width is set to at least 1280px. The dataset's average vehicle bounding box-to-image ratio is 4.7036. This Bangladesh Native Vehicle Dataset (BNVD) accounts for varied geography, illumination, vehicle sizes, and orientations to be more robust in unexpected scenarios. To examine the BNVD dataset, this work provides a thorough assessment with four successive You Only Look Once (YOLO) models, namely YOLO v5, v6, v7, and v8. The dataset's effectiveness is methodically evaluated and contrasted with other vehicle datasets already in use. The BNVD dataset achieves a mean average precision (mAP) of 0.848 at 50% intersection over union (IoU), with corresponding precision and recall values of 0.841 and 0.774. The research findings indicate a mAP of 0.643 over an IoU range of 0.5 to 0.95. The experiments show that the BNVD dataset serves as a reliable representation of vehicle distribution and presents considerable complexities.
Submitted 20 May, 2024;
originally announced May 2024.
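The abstract above reports mAP at an IoU threshold of 0.5. As a minimal sketch of what that threshold means, the snippet below computes intersection over union for two axis-aligned boxes. The `(x1, y1, x2, y2)` box format and the function names are assumptions made for illustration; they are not taken from the BNVD paper or its evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a match when its IoU with the ground truth meets the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

Under this convention, mAP at 0.5 counts a detection as correct only when `matches_at` holds for a ground-truth box of the same class; the 0.5 to 0.95 figure averages over a range of stricter thresholds.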
-
AquaSonic: Acoustic Manipulation of Underwater Data Center Operations and Resource Management
Authors:
Jennifer Sheldon,
Weidong Zhu,
Adnan Abdullah,
Sri Hrushikesh Varma Bhupathiraju,
Takeshi Sugawara,
Kevin R. B. Butler,
Md Jahidul Islam,
Sara Rampazzi
Abstract:
Underwater datacenters (UDCs) hold promise as next-generation data storage due to their energy efficiency and environmental sustainability benefits. While the natural cooling properties of water save power, the isolated aquatic environment and long-range sound propagation in water create unique vulnerabilities which differ from those of on-land data centers. Our research discovers the unique vulnerabilities of fault-tolerant storage devices, resource allocation software, and distributed file systems to acoustic injection attacks in UDCs. With a realistic testbed approximating UDC server operations, we empirically characterize the capabilities of acoustic injection underwater and find that an attacker can reduce fault-tolerant RAID 5 storage system throughput by 17% up to 100%. Our closed-water analyses reveal that attackers can (i) cause unresponsiveness and automatic node removal in a distributed filesystem with only 2.4 minutes of sustained acoustic injection, (ii) induce a distributed database's latency to increase by up to 92.7% to reduce system reliability, and (iii) induce load-balance managers to redirect up to 74% of resources to a target server to cause overload or force resource colocation. Furthermore, we perform open-water experiments in a lake and find that an attacker can cause controlled throughput degradation at a maximum allowable distance of 6.35 m using a commercial speaker. We also investigate and discuss the effectiveness of standard defenses against acoustic injection attacks. Finally, we formulate a novel machine learning-based detection system that reaches 0% False Positive Rate and 98.2% True Positive Rate trained on our dataset of profiled hard disk drives under 30-second FIO benchmark execution. With this work, we aim to help manufacturers proactively protect UDCs against acoustic injection attacks and ensure the security of subsea computing infrastructures.
Submitted 7 May, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
Transcribing Bengali Text with Regional Dialects to IPA using District Guided Tokens
Authors:
S M Jishanul Islam,
Sadia Ahmmed,
Sahid Hossain Mustakim
Abstract:
Accurate transcription of Bengali text to the International Phonetic Alphabet (IPA) is a challenging task due to the complex phonology of the language and context-dependent sound changes. The challenge is even greater for regional Bengali dialects due to the lack of standardized spelling conventions for these dialects, the presence of local and foreign words popular in those regions, and phonological diversity across regions. This paper presents an approach to this sequence-to-sequence problem by introducing the District Guided Tokens (DGT) technique on a new dataset spanning six districts of Bangladesh. The key idea is to provide the model with explicit information about the regional dialect or "district" of the input text before generating the IPA transcription. This is achieved by prepending a district token to the input sequence, effectively guiding the model to understand the unique phonetic patterns associated with each district. The DGT technique is applied to fine-tune several transformer-based models on this new dataset. Experimental results demonstrate the effectiveness of DGT, with the ByT5 model achieving superior performance over word-based models like mT5, BanglaT5, and umT5. This is attributed to ByT5's ability to handle a high percentage of out-of-vocabulary words in the test set. The proposed approach highlights the importance of incorporating regional dialect information into ubiquitous natural language processing systems for languages with diverse phonological variations. This work was a result of the "Bhashamul" challenge, which is dedicated to solving the problem of transcribing Bengali text with regional dialects to IPA: https://www.kaggle.com/competitions/regipa/. The training and inference notebooks are available through the competition link.
Submitted 2 April, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
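The abstract above describes prepending a district token to the input sequence. A minimal sketch of that preprocessing step follows, under stated assumptions: the `<district>` token format, the helper names, and the placeholder district labels are hypothetical, since the listing specifies neither the token syntax nor the six districts.

```python
def add_district_token(text: str, district: str) -> str:
    """Prepend a district marker so a seq2seq model can condition on the dialect.

    The "<name>" token format is an assumption for illustration only.
    """
    return f"<{district.strip().lower()}> {text.strip()}"


def build_inputs(samples):
    """Turn (district, text) pairs into model-ready input strings."""
    return [add_district_token(text, district) for district, text in samples]


# Hypothetical placeholder labels; the real dataset spans six districts of Bangladesh.
examples = [("District_A", "some Bengali sentence"), ("District_B", "another sentence")]
model_inputs = build_inputs(examples)
```

The resulting strings would then be tokenized and passed to the fine-tuned model as usual. With a byte-level model such as ByT5, the marker is consumed as raw bytes and needs no vocabulary change, whereas a subword model may split it into several pieces unless it is registered as a special token.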
-
Online Allocation with Replenishable Budgets: Worst Case and Beyond
Authors:
Jianyi Yang,
Pengfei Li,
Mohammad Jaminur Islam,
Shaolei Ren
Abstract:
This paper studies online resource allocation with replenishable budgets, where budgets can be replenished on top of the initial budget and an agent sequentially chooses online allocation decisions without violating the available budget constraint at each round. We propose a novel online algorithm, called OACP (Opportunistic Allocation with Conservative Pricing), that conservatively adjusts dual variables while opportunistically utilizing available resources. OACP achieves a bounded asymptotic competitive ratio in adversarial settings as the number of decision rounds T gets large. Importantly, the asymptotic competitive ratio of OACP is optimal in the absence of additional assumptions on budget replenishment. To further improve the competitive ratio, we make a mild assumption that there is budget replenishment every T^* >= 1 decision rounds and propose OACP+ to dynamically adjust the total budget assignment for online allocation. Next, we move beyond the worst-case and propose LA-OACP (Learning-Augmented OACP/OACP+), a novel learning-augmented algorithm for online allocation with replenishable budgets. We prove that LA-OACP can improve the average utility compared to OACP/OACP+ when the ML predictor is properly trained, while still offering worst-case utility guarantees when the ML predictions are arbitrarily wrong. Finally, we run simulation studies of sustainable AI inference powered by renewables, validating our analysis and demonstrating the empirical benefits of LA-OACP.
Submitted 8 January, 2024;
originally announced January 2024.
-
The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language
Authors:
Jian Zhu,
Changbing Yang,
Farhan Samir,
Jahurul Islam
Abstract:
In this project, we demonstrate that phoneme-based models for speech processing can achieve strong crosslinguistic generalizability to unseen languages. We curated the IPAPACK, a massively multilingual speech corpus with phonemic transcriptions, encompassing more than 115 languages from diverse language families, selectively checked by linguists. Based on the IPAPACK, we propose CLAP-IPA, a multilingual phoneme-speech contrastive embedding model capable of open-vocabulary matching between arbitrary speech signals and phonemic sequences. The proposed model was tested on 95 unseen languages, showing strong generalizability across languages. Temporal alignments between phonemes and speech signals also emerged from contrastive training, enabling zero-shot forced alignment in unseen languages. We further introduced a neural forced aligner, IPA-ALIGNER, by finetuning CLAP-IPA with the Forward-Sum loss to learn better phone-to-audio alignment. Evaluation results suggest that IPA-ALIGNER can generalize to unseen languages without adaptation.
Submitted 1 April, 2024; v1 submitted 14 November, 2023;
originally announced November 2023.
-
CaveSeg: Deep Semantic Segmentation and Scene Parsing for Autonomous Underwater Cave Exploration
Authors:
A. Abdullah,
T. Barua,
R. Tibbetts,
Z. Chen,
M. J. Islam,
I. Rekleitis
Abstract:
In this paper, we present CaveSeg - the first visual learning pipeline for semantic segmentation and scene parsing for AUV navigation inside underwater caves. We address the problem of scarce annotated training data by preparing a comprehensive dataset for semantic segmentation of underwater cave scenes. It contains pixel annotations for important navigation markers (e.g. caveline, arrows), obstacles (e.g. ground plane and overhead layers), scuba divers, and open areas for servoing. Through comprehensive benchmark analyses on cave systems in the USA, Mexico, and Spain, we demonstrate that robust deep visual models can be developed based on CaveSeg for fast semantic scene parsing of underwater cave environments. In particular, we formulate a novel transformer-based model that is computationally light and offers near real-time execution in addition to achieving state-of-the-art performance. Finally, we explore the design choices and implications of semantic segmentation for visual servoing by AUVs inside underwater caves. The proposed model and benchmark dataset open up promising opportunities for future research in autonomous underwater cave exploration and mapping.
Submitted 10 May, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
-
DFT insights into MAX phase borides Hf2AB [A = S, Se, Te] in comparison with MAX phase carbides Hf2AC [A = S, Se, Te]
Authors:
J. Islam,
M. D. Islam,
M. A. Ali,
H. Akter,
A. Hossain,
M. Biswas,
M. M. Hossain,
M. M. Uddin,
S. H. Naqib
Abstract:
In this work, density functional theory (DFT) based calculations were performed to compute the physical properties (structural stability, mechanical behavior, electronic, thermodynamic, and optical properties) of synthesized MAX phases Hf2SB, Hf2SC, Hf2SeB, Hf2SeC, Hf2TeB, and the as-yet-undiscovered MAX carbide phase Hf2TeC. Calculations of formation energy, phonon dispersion curves, and elastic constants confirmed the stability of the aforementioned compounds. The obtained values of lattice parameters, elastic constants, and elastic moduli of Hf2SB, Hf2SC, Hf2SeB, Hf2SeC, and Hf2TeB showed fair agreement with earlier studies, whereas the values of the mentioned parameters for the predicted Hf2TeC exhibit a good consequence of B replacement by C. The considered MAX phases exhibit anisotropic mechanical properties. The metallic nature and its anisotropic behavior were revealed by the electronic band structure and density of states. The analysis of the thermal properties (Debye temperature, melting temperature, minimum thermal conductivity, and Grüneisen parameter) confirmed that the carbide phases were more suited than the boride phases considered herein. The MAX phase response to incoming photons further demonstrated that they were metallic. Their suitability for use as coating materials to prevent solar heating was demonstrated by the reflectivity spectra. Additionally, this study demonstrated the impact of B replacing C in the MAX phases.
Submitted 21 June, 2023;
originally announced June 2023.
-
Dynamic Exploration-Exploitation Trade-Off in Active Learning Regression with Bayesian Hierarchical Modeling
Authors:
Upala Junaida Islam,
Kamran Paynabar,
George Runger,
Ashif Sikandar Iquebal
Abstract:
Active learning provides a framework to adaptively query the most informative experiments towards learning an unknown black-box function. Various approaches to active learning have been proposed in the literature; however, they focus on either exploration or exploitation in the design space. Methods that do consider exploration-exploitation simultaneously employ fixed or ad-hoc measures to control the trade-off that may not be optimal. In this paper, we develop a Bayesian hierarchical approach, referred to as BHEEM, to dynamically balance the exploration-exploitation trade-off as more data points are queried. To sample from the posterior distribution of the trade-off parameter, we subsequently formulate an approximate Bayesian computation approach based on the linear dependence of queried data in the feature space. Simulated and real-world examples show the proposed approach achieves at least 21% and 11% average improvement when compared to pure exploration and exploitation strategies, respectively. More importantly, we note that by optimally balancing the trade-off between exploration and exploitation, BHEEM performs better than or at least as well as either pure exploration or pure exploitation.
Submitted 30 September, 2023; v1 submitted 15 April, 2023;
originally announced April 2023.
-
Weakly Supervised Caveline Detection For AUV Navigation Inside Underwater Caves
Authors:
Boxiao Yu,
Reagan Tibbetts,
Titon Barua,
Ailani Morales,
Ioannis Rekleitis,
Md Jahidul Islam
Abstract:
Underwater caves are challenging environments that are crucial for water resource management, and for our understanding of hydro-geology and history. Mapping underwater caves is a time-consuming, labor-intensive, and hazardous operation. For autonomous cave mapping by underwater robots, the major challenge lies in vision-based estimation in the complete absence of ambient light, which results in constantly moving shadows due to the motion of the camera-light setup. Thus, detecting and following the caveline as navigation guidance is paramount for robots in autonomous cave mapping missions. In this paper, we present a computationally light caveline detection model based on a novel Vision Transformer (ViT)-based learning pipeline. We address the problem of scarce annotated training data by a weakly supervised formulation where the learning is reinforced through a series of noisy predictions from intermediate sub-optimal models. We validate the utility and effectiveness of such weak supervision for caveline detection and tracking in three different cave locations: USA, Mexico, and Spain. Experimental results demonstrate that our proposed model, CL-ViT, balances the robustness-efficiency trade-off, ensuring good generalization performance while offering 10+ FPS on single-board (Jetson TX2) devices.
Submitted 28 June, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
Case-Base Neural Networks: survival analysis with time-varying, higher-order interactions
Authors:
Jesse Islam,
Maxime Turgeon,
Robert Sladek,
Sahir Bhatnagar
Abstract:
In the context of survival analysis, data-driven neural network-based methods have been developed to model complex covariate effects. While these methods may provide better predictive performance than regression-based approaches, not all can model time-varying interactions and complex baseline hazards. To address this, we propose Case-Base Neural Networks (CBNNs) as a new approach that combines the case-base sampling framework with flexible neural network architectures. Using a novel sampling scheme and data augmentation to naturally account for censoring, we construct a feed-forward neural network that includes time as an input. CBNNs predict the probability of an event occurring at a given moment to estimate the full hazard function. We compare the performance of CBNNs to regression and neural network-based survival methods in a simulation and three case studies using two time-dependent metrics. First, we examine performance on a simulation involving a complex baseline hazard and time-varying interactions to assess all methods, with CBNN outperforming competitors. Then, we apply all methods to three real data applications, with CBNNs outperforming the competing models in two studies and showing similar performance in the third. Our results highlight the benefit of combining case-base sampling with deep learning to provide a simple and flexible framework for data-driven modeling of single event survival outcomes that estimates time-varying effects and a complex baseline hazard by design. An R package is available at https://github.com/Jesse-Islam/cbnn.
Submitted 9 January, 2024; v1 submitted 16 January, 2023;
originally announced January 2023.
-
Enhancing Data Security for Cloud Computing Applications through Distributed Blockchain-based SDN Architecture in IoT Networks
Authors:
Anichur Rahman,
Md. Jahidul Islam,
Rafiqul Islam,
Ayesha Aziz,
Dipanjali Kundu,
Sadia Sazzad,
Md. Razaul Karim,
Mahedi Hasan,
Ziaur Rahman,
Said Elnaffar,
Shahab S. Band
Abstract:
Blockchain (BC) and Software Defined Networking (SDN) are some of the most prominent emerging technologies in recent research. These technologies provide security, integrity, as well as confidentiality in their respective applications. Cloud computing has also been a popular comprehensive technology for several years. Confidential information is often shared with the cloud infrastructure to give customers access to remote resources, such as computation and storage operations. However, cloud computing also presents substantial security threats, issues, and challenges. Therefore, to overcome these difficulties, we propose integrating Blockchain and SDN in the cloud computing platform. In this research, we introduce the architecture to better secure clouds. Moreover, we leverage a distributed Blockchain approach to convey security, confidentiality, privacy, integrity, adaptability, and scalability in the proposed architecture. BC provides a distributed or decentralized and efficient environment for users. Also, we present an SDN approach to improving the reliability, stability, and load balancing capabilities of the cloud infrastructure. Finally, we provide an experimental evaluation of the performance of our SDN and BC-based implementation using different parameters, also monitoring some attacks in the system and proving its efficacy.
Submitted 27 November, 2022;
originally announced November 2022.
-
UDepth: Fast Monocular Depth Estimation for Visually-guided Underwater Robots
Authors:
Boxiao Yu,
Jiayi Wu,
Md Jahidul Islam
Abstract:
In this paper, we present a fast monocular depth estimation method for enabling 3D perception capabilities of low-cost underwater robots. We formulate a novel end-to-end deep visual learning pipeline named UDepth, which incorporates domain knowledge of image formation characteristics of natural underwater scenes. First, we adapt a new input space from raw RGB image space by exploiting an underwater light attenuation prior, and then devise a least-squared formulation for coarse pixel-wise depth prediction. Subsequently, we extend this into a domain projection loss that guides the end-to-end learning of UDepth on over 9K RGB-D training samples. UDepth is designed with a computationally light MobileNetV2 backbone and a Transformer-based optimizer for ensuring fast inference rates on embedded systems. By domain-aware design choices and through comprehensive experimental analyses, we demonstrate that it is possible to achieve state-of-the-art depth estimation performance while ensuring a small computational footprint. Specifically, with 70%-80% fewer network parameters than existing benchmarks, UDepth achieves comparable and often better depth estimation performance. While the full model offers inference rates of over 66 FPS (13 FPS) on a single GPU (CPU core), our domain projection for coarse depth prediction runs at 51.5 FPS on single-board NVIDIA Jetson TX2s. The inference pipelines are available at https://github.com/uf-robopi/UDepth.
Submitted 2 February, 2023; v1 submitted 25 September, 2022;
originally announced September 2022.
-
On the Integration of Blockchain and SDN: Overview, Applications, and Future Perspectives
Authors:
Anichur Rahman,
Antonio Montieri,
Dipanjali Kundu,
Md. Razaul Karim,
Md. Jahidul Islam,
Sara Umme,
Alfredo Nascita,
Antonio Pescapè
Abstract:
Blockchain (BC) and Software-Defined Networking (SDN) are leading technologies which have recently found applications in several network-related scenarios and have consequently experienced a growing interest in the research community. Indeed, current networks connect a massive number of objects over the Internet and in this complex scenario, to ensure security, privacy, confidentiality, and programmability, the utilization of BC and SDN have been successfully proposed. In this work, we provide a comprehensive survey regarding these two recent research trends and review the related state-of-the-art literature. We first describe the main features of each technology and discuss their most common and used variants. Furthermore, we envision the integration of such technologies to jointly take advantage of these latter efficiently. Indeed, we consider their group-wise utilization -- named BC-SDN -- based on the need for stronger security and privacy. Additionally, we cover the application fields of these technologies both individually and combined. Finally, we discuss the open issues of reviewed research and describe potential directions for future avenues regarding the integration of BC and SDN.
To summarize, the contribution of the present survey spans from an overview of the literature background on BC and SDN to the discussion of the benefits and limitations of BC-SDN integration in different fields, which also raises open challenges and possible future avenues examined herein. To the best of our knowledge, compared to existing surveys, this is the first work that analyzes the aforementioned aspects in light of a broad BC-SDN integration, with a specific focus on security and privacy issues in actual utilization scenarios.
Submitted 3 August, 2022;
originally announced August 2022.
-
Machine Learning Approaches to Predict Breast Cancer: Bangladesh Perspective
Authors:
Taminul Islam,
Arindom Kundu,
Nazmul Islam Khan,
Choyon Chandra Bonik,
Flora Akter,
Md Jihadul Islam
Abstract:
Breast cancer has become one of the most prominent causes of death in recent years. Among all malignancies, it is the most frequent and the leading cause of death for women globally. Manually diagnosing this disease requires a good amount of time and expertise. Breast cancer detection is time-consuming, and the spread of the disease can be reduced by developing machine-based breast cancer predictions. In machine learning, the system can learn from prior instances and find hard-to-detect patterns in noisy or complicated data sets using various statistical, probabilistic, and optimization approaches. This work compares the classification accuracy, precision, sensitivity, and specificity of several machine learning algorithms on a newly collected dataset. Five machine learning approaches (Decision Tree, Random Forest, Logistic Regression, Naive Bayes, and XGBoost) have been implemented to get the best performance on our dataset. This study focuses on finding the best algorithm that can forecast breast cancer with maximum accuracy in terms of its classes. This work evaluated the quality of each algorithm's data classification in terms of efficiency and effectiveness, and also compared the results with other published work in this domain. After implementing the models, this study achieved the best model accuracy, 94%, with Random Forest and XGBoost.
Submitted 29 June, 2022;
originally announced June 2022.
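Note: the four metrics named in the abstract above (accuracy, precision, sensitivity, specificity) all follow from a binary confusion matrix. The following is a minimal sketch under stated assumptions, not the authors' code: the function name `binary_metrics` and the counts are hypothetical and chosen only for illustration.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute four standard metrics from binary confusion-matrix counts.

    Assumes every denominator is non-zero (at least one positive prediction,
    one actual positive, and one actual negative).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # share of all cases classified correctly
        "precision": tp / (tp + fp),       # share of predicted positives that are real
        "sensitivity": tp / (tp + fn),     # share of actual positives that are found
        "specificity": tn / (tn + fp),     # share of actual negatives that are found
    }


# Hypothetical counts, not taken from the paper's dataset.
metrics = binary_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters for a screening task, since a classifier can reach high accuracy on an imbalanced dataset while still missing many positive cases.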
-
Role of shape anisotropy on thermal gradient-driven domain wall dynamics in magnetic nanowires
Authors:
M. T. Islam,
M. A. S. Akanda,
F. Yesmin,
M. A. J. Pikul,
J. M. T. Islam
Abstract:
We investigate the magnetic domain wall (DW) dynamics in uniaxial/biaxial nanowires under a thermal gradient (TG). The findings reveal that the DW propagates toward the hotter region in both nanowires. The main physics behind these observations is the magnonic angular momentum transfer to the DW. The hard (shape) anisotropy in the biaxial nanowire contributes an additional torque; hence the DW speed is larger than that in the uniaxial nanowire. At lower damping, the DW velocity is smaller and increases with damping, which is opposite to the usual expectation. To explain this, it is predicted that standing spin-waves (which do not carry net energy/momentum) may form together with travelling spin-waves if the propagation length of thermally-generated spin-waves is larger than the nanowire length. At larger damping, the DW velocity decreases with damping since the magnon propagation length decreases. Therefore, the above findings might be useful in realizing spintronic (racetrack memory) devices.
Submitted 15 October, 2022; v1 submitted 19 April, 2022;
originally announced April 2022.
-
Myoelectric Pattern Recognition Performance Enhancement Using Nonlinear Features
Authors:
Md. Johirul Islam,
Shamim Ahmad,
Fahmida Haque,
Mamun Bin Ibne Reaz,
Mohammad A. S. Bhuiyan,
Md. Rezaul Islam
Abstract:
The multichannel electrode array used for electromyogram (EMG) pattern recognition provides good performance, but it has a high cost, is computationally expensive, and is inconvenient to wear. Therefore, researchers try to use as few channels as possible while maintaining improved pattern recognition performance. However, minimizing the number of channels affects the performance due to the least separable margin among the movements possessing weak signal strengths. To meet these challenges, two time-domain features based on nonlinear scaling, the log of the mean absolute value (LMAV) and the nonlinear scaled value (NSV), are proposed. In this study, we validate the proposed features on two datasets, four existing feature extraction methods, variable window sizes, and various signal-to-noise ratios (SNR). In addition, we also propose a feature extraction method where the LMAV and NSV are grouped with the existing 11 time-domain features. The proposed feature extraction method enhances accuracy, sensitivity, specificity, precision, and F1 score by 1.00%, 5.01%, 0.55%, 4.71%, and 5.06% for dataset 1, and 1.18%, 5.90%, 0.66%, 5.63%, and 6.04% for dataset 2, respectively. Therefore, the experimental results strongly support the proposed feature extraction method as a step toward improved myoelectric pattern recognition performance.
Submitted 28 March, 2022;
originally announced March 2022.
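Note: the LMAV feature named in the abstract above can be read literally as the logarithm of a window's mean absolute value. The sketch below illustrates that reading only. The natural-log base, the `eps` guard, the window parameters, and all function names are assumptions of this example rather than details from the paper. NSV is not sketched because the abstract does not define it.

```python
import math


def mean_absolute_value(window):
    """Classic MAV time-domain feature: average of absolute sample values."""
    return sum(abs(x) for x in window) / len(window)


def lmav(window, eps=1e-12):
    """Log of the mean absolute value (natural log assumed).

    eps avoids log(0) on an all-zero window; it is an assumption of this
    sketch, not part of any published definition.
    """
    return math.log(mean_absolute_value(window) + eps)


def sliding_windows(signal, size, step):
    """Yield overlapping analysis windows over a single-channel signal."""
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]


# Hypothetical single-channel samples, for illustration only.
signal = [0.1, -0.2, 0.4, -0.8, 0.05, -0.05, 0.3, -0.3]
features = [lmav(w) for w in sliding_windows(signal, size=4, step=2)]
print(features)
```

The logarithm compresses large amplitudes and spreads out small ones, which is one plausible reason such a nonlinear scaling could help separate movements with weak signal strength.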
-
Manas: Mining Software Repositories to Assist AutoML
Authors:
Giang Nguyen,
Md Johir Islam,
Rangeet Pan,
Hridesh Rajan
Abstract:
Today deep learning is widely used for building software. A software engineering problem with deep learning is that finding an appropriate convolutional neural network (CNN) model for the task can be a challenge for developers. Recent work on AutoML, more precisely neural architecture search (NAS), embodied by tools like Auto-Keras, aims to solve this problem by essentially viewing it as a search problem where the starting point is a default CNN model, and mutation of this CNN model allows exploration of the space of CNN models to find a CNN model that will work best for the problem. These works have had significant success in producing high-accuracy CNN models. There are two problems, however. First, NAS can be very costly, often taking several hours to complete. Second, CNN models produced by NAS can be very complex, which makes them harder to understand and costlier to train. We propose a novel approach for NAS, where instead of starting from a default CNN model, the initial model is selected from a repository of models extracted from GitHub. The intuition is that developers solving a similar problem may have developed a better starting point compared to the default model. We also analyze common layer patterns of CNN models in the wild to understand changes that the developers make to improve their models. Our approach uses commonly occurring changes as mutation operators in NAS. We have extended Auto-Keras to implement our approach. Our evaluation using 8 top-voted problems from Kaggle for tasks including image classification and image regression shows that given the same search time, without loss of accuracy, Manas produces models with 42.9% to 99.6% fewer parameters than Auto-Keras' models. Benchmarked on GPU, Manas' models train 30.3% to 641.6% faster than Auto-Keras' models.
Submitted 13 February, 2022; v1 submitted 6 December, 2021;
originally announced December 2021.
-
Fast Direct Stereo Visual SLAM
Authors:
Jiawei Mo,
Md Jahidul Islam,
Junaed Sattar
Abstract:
We propose a novel approach for fast and accurate stereo visual Simultaneous Localization and Mapping (SLAM) independent of feature detection and matching. We extend monocular Direct Sparse Odometry (DSO) to a stereo system by optimizing the scale of the 3D points to minimize photometric error for the stereo configuration, which yields a computationally efficient and robust method compared to conventional stereo matching. We further extend it to a full SLAM system with loop closure to reduce accumulated errors. With the assumption of forward camera motion, we imitate a LiDAR scan using the 3D points obtained from the visual odometry and adapt a LiDAR descriptor for place recognition to facilitate more efficient detection of loop closures. Afterward, we estimate the relative pose using direct alignment by minimizing the photometric error for potential loop closures. Optionally, further improvement over direct alignment is achieved by using the Iterative Closest Point (ICP) algorithm. Lastly, we optimize a pose graph to improve SLAM accuracy globally. By avoiding feature detection or matching in our SLAM system, we ensure high computational efficiency and robustness. Thorough experimental validations on public datasets demonstrate its effectiveness compared to the state-of-the-art approaches.
Submitted 3 December, 2021;
originally announced December 2021.
-
DistB-SDoIndustry: Enhancing Security in Industry 4.0 Services based on Distributed Blockchain through Software Defined Networking-IoT Enabled Architecture
Authors:
Anichur Rahman,
Umme Sara,
Dipanjali Kundu,
Saiful Islam,
Md. Jahidul Islam,
Mahedi Hasan,
Ziaur Rahman,
Mostofa Kamal Nasir
Abstract:
The concept of Industry 4.0 is a newly emerging focus of research throughout the world. However, it poses many challenges in controlling data, which can be addressed with technologies such as Internet of Things (IoT), Big Data, Artificial Intelligence (AI), Software Defined Networking (SDN), and Blockchain (BC) for managing data securely. Further, the complexity of sensors, appliances, and sensor networks connecting to the internet, together with the model of Industry 4.0, has created the challenge of designing systems, infrastructure, and smart applications capable of continuously analyzing the data produced. To address these challenges, the authors present a distributed Blockchain-based security architecture for Industry 4.0 applications in an SDN-IoT enabled environment, where the Blockchain provides robustness, privacy, and confidentiality to the desired system. In addition, the SDN-IoT incorporates the different services of Industry 4.0 with more security as well as flexibility. Furthermore, the authors combine IoT, SDN, and Blockchain to improve the security and privacy of Industry 4.0 services. Finally, the authors evaluate the performance and security of the presented architecture in a variety of ways.
Submitted 17 December, 2020;
originally announced December 2020.
-
DistB-Condo: Distributed Blockchain-based IoT-SDN Model for Smart Condominium
Authors:
Anichur Rahman,
Md. Jahidul Islam,
Ziaur Rahman,
Md. Mahfuz Reza,
Adnan Anwar,
M. A. Parvez Mahmud,
Mostofa Kamal Nasir,
Rafidah Md Noor
Abstract:
Condominium network refers to intra-organization networks, where smart buildings or apartments are connected and share resources over the network. A secured communication platform or channel has been highlighted as a key requirement for a reliable condominium, which can be ensured by utilizing advanced techniques and platforms like Software-Defined Network (SDN), Network Function Virtualization (NFV) and Blockchain (BC). These technologies provide a robust and secured platform to meet all kinds of challenges, such as safety, confidentiality, flexibility, efficiency, and availability. This work suggests a distributed, scalable IoT-SDN with Blockchain-based NFV framework for a smart condominium (DistB-Condo) that can act as an efficient secured platform for a small community. Moreover, the Blockchain-based IoT-SDN with NFV framework provides the combined benefits of leading technologies. It also presents an optimized Cluster Head Selection (CHS) algorithm for selecting a Cluster Head (CH) among the clusters that efficiently saves energy. Besides, a decentralized and secured Blockchain approach has been introduced that provides stronger security and privacy for the desired condominium network. Our proposed approach also has the ability to detect attacks in an IoT environment. Finally, this article evaluates the performance of the proposed architecture using different parameters (e.g., throughput, packet arrival rate, and response time). The proposed approach outperforms the existing OF-Based SDN. DistB-Condo has better throughput on average and much higher bandwidth (Mbps) than the OF-Based SDN approach in the presence of attacks. Also, the proposed model has an average response time 5% lower than the core model.
Submitted 17 December, 2020;
originally announced December 2020.
-
A Generative Approach for Detection-driven Underwater Image Enhancement
Authors:
Chelsey Edge,
Md Jahidul Islam,
Christopher Morse,
Junaed Sattar
Abstract:
In this paper, we introduce a generative model for image enhancement specifically for improving diver detection in the underwater domain. In particular, we present a model that integrates generative adversarial network (GAN)-based image enhancement with the diver detection task. Our proposed approach restructures the GAN objective function to include information from a pre-trained diver detector with the goal to generate images which would enhance the accuracy of the detector in adverse visual conditions. By incorporating the detector output into both the generator and discriminator networks, our model is able to focus on enhancing images beyond aesthetic qualities and specifically to improve robotic detection of scuba divers. We train our network on a large dataset of scuba divers, using a state-of-the-art diver detector, and demonstrate its utility on images collected from oceanic explorations of human-robot teams. Experimental evaluations demonstrate that our approach significantly improves diver detection performance over raw, unenhanced images, and even outperforms detection performance on the output of state-of-the-art underwater image enhancement algorithms. Finally, we demonstrate the inference performance of our network on embedded devices to highlight the feasibility of operating on board mobile robotic platforms.
Submitted 10 December, 2020;
originally announced December 2020.
-
SVAM: Saliency-guided Visual Attention Modeling by Autonomous Underwater Robots
Authors:
Md Jahidul Islam,
Ruobing Wang,
Junaed Sattar
Abstract:
This paper presents a holistic approach to saliency-guided visual attention modeling (SVAM) for use by autonomous underwater robots. Our proposed model, named SVAM-Net, integrates deep visual features at various scales and semantics for effective salient object detection (SOD) in natural underwater images. The SVAM-Net architecture is configured in a unique way to jointly accommodate bottom-up and top-down learning within two separate branches of the network while sharing the same encoding layers. We design dedicated spatial attention modules (SAMs) along these learning pathways to exploit the coarse-level and fine-level semantic features for SOD at four stages of abstractions. The bottom-up branch performs a rough yet reasonably accurate saliency estimation at a fast rate, whereas the deeper top-down branch incorporates a residual refinement module (RRM) that provides fine-grained localization of the salient objects. Extensive performance evaluation of SVAM-Net on benchmark datasets clearly demonstrates its effectiveness for underwater SOD. We also validate its generalization performance by several ocean trials' data that include test images of diverse underwater scenes and waterbodies, and also images with unseen natural objects. Moreover, we analyze its computational feasibility for robotic deployments and demonstrate its utility in several important use cases of visual attention modeling.
Submitted 14 April, 2022; v1 submitted 12 November, 2020;
originally announced November 2020.
-
IMU-Assisted Learning of Single-View Rolling Shutter Correction
Authors:
Jiawei Mo,
Md Jahidul Islam,
Junaed Sattar
Abstract:
Rolling shutter distortion is highly undesirable for photography and computer vision algorithms (e.g., visual SLAM) because pixels can be potentially captured at different times and poses. In this paper, we propose a deep neural network to predict depth and row-wise pose from a single image for rolling shutter correction. Our contribution in this work is to incorporate inertial measurement unit (IMU) data into the pose refinement process, which, compared to the state-of-the-art, greatly enhances the pose prediction. The improved accuracy and robustness make it possible for numerous vision algorithms to use imagery captured by rolling shutter cameras and produce highly accurate results. We also extend a dataset to have real rolling shutter images, IMU data, depth maps, camera poses, and corresponding global shutter images for rolling shutter correction training. We demonstrate the efficacy of the proposed method by evaluating the performance of Direct Sparse Odometry (DSO) algorithm on rolling shutter imagery corrected using the proposed approach. Results show marked improvements of the DSO algorithm over using uncorrected imagery, validating the proposed approach.
Submitted 14 September, 2021; v1 submitted 5 November, 2020;
originally announced November 2020.
-
casebase: An Alternative Framework For Survival Analysis and Comparison of Event Rates
Authors:
Sahir Rai Bhatnagar,
Maxime Turgeon,
Jesse Islam,
James A. Hanley,
Olli Saarela
Abstract:
In epidemiological studies of time-to-event data, a quantity of interest to the clinician and the patient is the risk of an event given a covariate profile. However, methods relying on time matching or risk-set sampling (including Cox regression) eliminate the baseline hazard from the likelihood expression or the estimating function. The baseline hazard then needs to be estimated separately using a non-parametric approach. This leads to step-wise estimates of the cumulative incidence that are difficult to interpret. Using case-base sampling, Hanley & Miettinen (2009) explained how the parametric hazard functions can be estimated using logistic regression. Their approach naturally leads to estimates of the cumulative incidence that are smooth-in-time. In this paper, we present the casebase R package, a comprehensive and flexible toolkit for parametric survival analysis. We describe how the case-base framework can also be used in more complex settings: competing risks, time-varying exposure, and variable selection. Our package also includes an extensive array of visualization tools to complement the analysis of time-to-event data. We illustrate all these features through four different case studies. *SRB and MT contributed equally to this work.
Submitted 21 September, 2020;
originally announced September 2020.
-
Repairing Deep Neural Networks: Fix Patterns and Challenges
Authors:
Md Johirul Islam,
Rangeet Pan,
Giang Nguyen,
Hridesh Rajan
Abstract:
Significant interest in applying Deep Neural Networks (DNNs) has fueled the need to support engineering of software that uses DNNs. Repairing software that uses DNNs is one such unmistakable SE need where automated tools could be beneficial; however, we do not fully understand challenges to repairing and patterns that are utilized when manually repairing DNNs. What challenges should automated repair tools address? What are the repair patterns whose automation could help developers? Which repair patterns should be assigned a higher priority for building automated bug repair tools? This work presents a comprehensive study of bug fix patterns to address these questions. We have studied 415 repairs from Stack Overflow and 555 repairs from GitHub for five popular deep learning libraries Caffe, Keras, TensorFlow, Theano, and Torch to understand challenges in repairs and bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns; the most common bug fix patterns are fixing data dimension and neural network connectivity; DNN bug fixes have the potential to introduce adversarial vulnerabilities; DNN bug fixes frequently introduce new bugs; and DNN bug localization, reuse of trained model, and coping with frequent releases are major challenges faced by developers when fixing bugs. We also contribute a benchmark of 667 DNN (bug, repair) instances.
Submitted 2 May, 2020;
originally announced May 2020.
-
Semantic Segmentation of Underwater Imagery: Dataset and Benchmark
Authors:
Md Jahidul Islam,
Chelsey Edge,
Yuyang Xiao,
Peigen Luo,
Muntaqim Mehtaz,
Christopher Morse,
Sadman Sakib Enan,
Junaed Sattar
Abstract:
In this paper, we present the first large-scale dataset for semantic Segmentation of Underwater IMagery (SUIM). It contains over 1500 images with pixel annotations for eight object categories: fish (vertebrates), reefs (invertebrates), aquatic plants, wrecks/ruins, human divers, robots, and sea-floor. The images have been rigorously collected during oceanic explorations and human-robot collaborative experiments, and annotated by human participants. We also present a benchmark evaluation of state-of-the-art semantic segmentation approaches based on standard performance metrics. In addition, we present SUIM-Net, a fully-convolutional encoder-decoder model that balances the trade-off between performance and computational efficiency. It offers competitive performance while ensuring fast end-to-end inference, which is essential for its use in the autonomy pipeline of visually-guided underwater robots. In particular, we demonstrate its usability benefits for visual servoing, saliency prediction, and detailed scene understanding. With a variety of use cases, the proposed model and benchmark dataset open up promising opportunities for future research in underwater robot vision.
Submitted 13 September, 2020; v1 submitted 2 April, 2020;
originally announced April 2020.
-
Simultaneous Enhancement and Super-Resolution of Underwater Imagery for Improved Visual Perception
Authors:
Md Jahidul Islam,
Peigen Luo,
Junaed Sattar
Abstract:
In this paper, we introduce and tackle the simultaneous enhancement and super-resolution (SESR) problem for underwater robot vision and provide an efficient solution for near real-time applications. We present Deep SESR, a residual-in-residual network-based generative model that can learn to restore perceptual image qualities at 2x, 3x, or 4x higher spatial resolution. We supervise its training by formulating a multi-modal objective function that addresses the chrominance-specific underwater color degradation, lack of image sharpness, and loss in high-level feature representation. It is also supervised to learn salient foreground regions in the image, which in turn guides the network to learn global contrast enhancement. We design an end-to-end training pipeline to jointly learn the saliency prediction and SESR on a shared hierarchical feature space for fast inference. Moreover, we present UFO-120, the first dataset to facilitate large-scale SESR learning; it contains over 1500 training samples and a benchmark test set of 120 samples. By thorough experimental evaluation on the UFO-120 and other standard datasets, we demonstrate that Deep SESR outperforms the existing solutions for underwater image enhancement and super-resolution. We also validate its generalization performance on several test cases that include underwater images with diverse spectral and spatial degradation levels, and also terrestrial images with unseen natural objects. Lastly, we analyze its computational feasibility for single-board deployments and demonstrate its operational benefits for visually-guided underwater robots. The model and dataset information will be available at: https://github.com/xahidbuffon/Deep-SESR.
Submitted 4 February, 2020;
originally announced February 2020.
-
Understanding 3D CNN Behavior for Alzheimer's Disease Diagnosis from Brain PET Scan
Authors:
Jyoti Islam,
Yanqing Zhang
Abstract:
Recently, Convolutional Neural Networks (CNNs) have demonstrated impressive performance in medical image analysis. However, there is a lack of clear understanding of why and how CNNs perform so well on image analysis tasks. How a CNN analyzes an image and discriminates among samples of different classes is usually considered non-transparent. As a result, it becomes difficult to apply CNN-based approaches in clinical procedures and automated disease diagnosis systems. In this paper, we consider this issue and work on visualizing and understanding the decision of a Convolutional Neural Network for Alzheimer's Disease (AD) diagnosis. We develop a 3D deep convolutional neural network for AD diagnosis using brain PET scans and propose using five visualization techniques - Sensitivity Analysis (Backpropagation), Guided Backpropagation, Occlusion, Brain Area Occlusion, and Layer-wise Relevance Propagation (LRP) - to understand the decision of the CNN by highlighting the relevant areas in the PET data.
Submitted 25 December, 2019; v1 submitted 10 December, 2019;
originally announced December 2019.
-
Machine Vision for Improved Human-Robot Cooperation in Adverse Underwater Conditions
Authors:
Md Jahidul Islam
Abstract:
Visually-guided underwater robots are deployed alongside human divers for cooperative exploration, inspection, and monitoring tasks in numerous shallow-water and coastal-water applications. The most essential capability of such companion robots is to visually interpret their surroundings and assist the divers during various stages of an underwater mission. Despite recent technological advancements, the existing systems and solutions for real-time visual perception are greatly affected by marine artifacts such as poor visibility, lighting variation, and the scarcity of salient features. The difficulties are exacerbated by a host of non-linear image distortions caused by the vulnerabilities of underwater light propagation (e.g., wavelength-dependent attenuation, absorption, and scattering). In this dissertation, we present a set of novel and improved visual perception solutions to address these challenges for effective underwater human-robot cooperation.
Specifically, we develop robust and efficient modules for Autonomous Underwater Vehicles (AUVs) to follow and interact with companion divers by accurately perceiving their surroundings while relying on noisy visual sensing alone. Moreover, our proposed perception solutions enable visually-guided robots to see better in noisy sensing conditions and do better with limited computational resources and real-time constraints. The research outcomes entail novel design and efficient implementation of the underlying vision and learning-based algorithms with extensive field experimental validations and feasibility analyses for single-board deployments. In addition to advancing the state-of-the-art, the proposed methodologies and systems take us one step closer toward bridging the gap between theory and practice for improved human-robot cooperation in the wild.
Submitted 29 July, 2021; v1 submitted 28 October, 2019;
originally announced November 2019.
-
Halogen Doped Electronic Properties of 2D ZnO: A First Principles Study
Authors:
H. M. R. Faruque,
K. Hosen,
A. S. M. J. Islam,
M. S. Islam
Abstract:
In recent times, two dimensional (2D) ZnO has attracted great attention in the field of nano-research due to its extraordinary electronic, thermal and optical properties. In this paper, we have explored the effects of halogen impurity doping such as F, Cl, and Br atoms on the electronic properties of 2D ZnO using first principles calculation. The pristine 2D ZnO exhibits a semiconducting behavior with a direct bandgap of 1.67 eV on the Γ point. However, when impurities such as F, Cl, or Br atoms are introduced, the 2D ZnO shows semi-metallic behavior with almost zero bandgap. It is perceived that, owing to the introduction of F impurity, the zero bandgap is exhibited at the K point of the electronic band structure. However, in the case of Cl and Br impurities, the nearly zero bandgap is observed elsewhere rather than on the K point. Moreover, due to the introduction of impurity atoms, the Fermi level also shifted towards the conduction band (CB) suggesting an increase of the carrier concentration in the density of states (DOS) results. These findings might be very much beneficial when doping effect, especially halogen impurity doping is considered to modulate the electronic properties of 2D ZnO in the near future.
Submitted 16 October, 2019;
originally announced October 2019.
-
Underwater Image Super-Resolution using Deep Residual Multipliers
Authors:
Md Jahidul Islam,
Sadman Sakib Enan,
Peigen Luo,
Junaed Sattar
Abstract:
We present a deep residual network-based generative model for single image super-resolution (SISR) of underwater imagery for use by autonomous underwater robots. We also provide an adversarial training pipeline for learning SISR from paired data. In order to supervise the training, we formulate an objective function that evaluates the perceptual quality of an image based on its global content, color, and local style information. Additionally, we present USR-248, a large-scale dataset of three sets of underwater images of 'high' (640x480) and 'low' (80x60, 160x120, and 320x240) spatial resolution. USR-248 contains paired instances for supervised training of 2x, 4x, or 8x SISR models. Furthermore, we validate the effectiveness of our proposed model through qualitative and quantitative experiments and compare the results with several state-of-the-art models' performances. We also analyze its practical feasibility for applications such as scene understanding and attention modeling in noisy visual conditions.
Submitted 24 February, 2020; v1 submitted 20 September, 2019;
originally announced September 2019.
-
What Do Developers Ask About ML Libraries? A Large-scale Study Using Stack Overflow
Authors:
Md Johirul Islam,
Hoan Anh Nguyen,
Rangeet Pan,
Hridesh Rajan
Abstract:
Modern software systems are increasingly including machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To that end, this work reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely Tensorflow, Keras, scikit-learn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. We classify these questions into seven typical stages of an ML pipeline to understand the correlation between the library and the stage. Then we study the questions and perform statistical analysis to explore the answer to four research objectives (finding the most difficult stage, understanding the nature of problems, nature of libraries and studying whether the difficulties stayed consistent over time). Our findings reveal the urgent need for software engineering (SE) research in this area. Both static and dynamic analyses are mostly absent and badly needed to help developers find errors earlier. While there has been some early research on debugging, much more work is needed. API misuses are prevalent and API design improvements are sorely needed. Last and somewhat surprisingly, a tug of war between providing higher levels of abstractions and the need to understand the behavior of the trained model is prevalent.
Submitted 27 June, 2019;
originally announced June 2019.
-
A Comprehensive Study on Deep Learning Bug Characteristics
Authors:
Md Johirul Islam,
Giang Nguyen,
Rangeet Pan,
Hridesh Rajan
Abstract:
Deep learning has gained substantial popularity in recent years. Developers mainly rely on libraries and tools to add deep learning capabilities to their software. What kinds of bugs are frequently found in such software? What are the root causes of such bugs? What impacts do such bugs have? Which stages of deep learning pipeline are more bug prone? Are there any antipatterns? Understanding such characteristics of bugs in deep learning software has the potential to foster the development of better deep learning platforms, debugging mechanisms, development practices, and encourage the development of analysis and verification frameworks. Therefore, we study 2716 high-quality posts from Stack Overflow and 500 bug fix commits from Github about five popular deep learning libraries Caffe, Keras, Tensorflow, Theano, and Torch to understand the types of bugs, root causes of bugs, impacts of bugs, bug-prone stage of deep learning pipeline as well as whether there are some common antipatterns found in this buggy software. The key findings of our study include: data bug and logic bug are the most severe bug types in deep learning software appearing more than 48% of the times, major root causes of these bugs are Incorrect Model Parameter (IPS) and Structural Inefficiency (SI) showing up more than 43% of the times. We have also found that the bugs in the usage of deep learning libraries have some common antipatterns that lead to a strong correlation of bug types among the libraries.
Submitted 3 June, 2019;
originally announced June 2019.
-
Identifying Classes Susceptible to Adversarial Attacks
Authors:
Rangeet Pan,
Md Johirul Islam,
Shibbir Ahmed,
Hridesh Rajan
Abstract:
Despite numerous attempts to defend deep learning based image classifiers, they remain susceptible to the adversarial attacks. This paper proposes a technique to identify susceptible classes, those classes that are more easily subverted. To identify the susceptible classes we use distance-based measures and apply them on a trained model. Based on the distance among original classes, we create mapping among original classes and adversarial classes that helps to reduce the randomness of a model to a significant amount in an adversarial setting. We analyze the high dimensional geometry among the feature classes and identify the k most susceptible target classes in an adversarial attack. We conduct experiments using MNIST, Fashion MNIST, CIFAR-10 (ImageNet and ResNet-32) datasets. Finally, we evaluate our techniques in order to determine which distance-based measure works best and how the randomness of a model changes with perturbation.
Submitted 30 May, 2019;
originally announced May 2019.
-
Fast Underwater Image Enhancement for Improved Visual Perception
Authors:
Md Jahidul Islam,
Youya Xia,
Junaed Sattar
Abstract:
In this paper, we present a conditional generative adversarial network-based model for real-time underwater image enhancement. To supervise the adversarial training, we formulate an objective function that evaluates the perceptual image quality based on its global content, color, local texture, and style information. We also present EUVP, a large-scale dataset of a paired and unpaired collection of underwater images (of 'poor' and 'good' quality) that are captured using seven different cameras over various visibility conditions during oceanic explorations and human-robot collaborative experiments. In addition, we perform several qualitative and quantitative evaluations which suggest that the proposed model can learn to enhance underwater image quality from both paired and unpaired training. More importantly, the enhanced images provide improved performances of standard models for underwater object detection, human pose estimation, and saliency prediction. These results validate that it is suitable for real-time preprocessing in the autonomy pipeline by visually-guided underwater robots. The model and associated training pipelines are available at https://github.com/xahidbuffon/funie-gan.
Submitted 8 February, 2020; v1 submitted 23 March, 2019;
originally announced March 2019.
-
Robot-to-Robot Relative Pose Estimation using Humans as Markers
Authors:
Md Jahidul Islam,
Jiawei Mo,
Junaed Sattar
Abstract:
In this paper, we propose a method to determine the 3D relative pose of pairs of communicating robots by using human pose-based key-points as correspondences. We adopt a 'leader-follower' framework, where at first, the leader robot visually detects and triangulates the key-points using the state-of-the-art pose detector named OpenPose. Afterward, the follower robots match the corresponding 2D projections on their respective calibrated cameras and find their relative poses by solving the perspective-n-point (PnP) problem. In the proposed method, we design an efficient person re-identification technique for associating the mutually visible humans in the scene. Additionally, we present an iterative optimization algorithm to refine the associated key-points based on their local structural properties in the image space. We demonstrate that these refinement processes are essential to establish accurate key-point correspondences across viewpoints. Furthermore, we evaluate the performance of the proposed relative pose estimation system through several experiments conducted in terrestrial and underwater environments. Finally, we discuss the relevant operational challenges of this approach and analyze its feasibility for multi-robot cooperative systems in human-dominated social settings and feature-deprived environments such as underwater.
Submitted 6 September, 2020; v1 submitted 2 March, 2019;
originally announced March 2019.
-
Towards Robust Lung Segmentation in Chest Radiographs with Deep Learning
Authors:
Jyoti Islam,
Yanqing Zhang
Abstract:
Automated segmentation of Lungs plays a crucial role in the computer-aided diagnosis of chest X-Ray (CXR) images. Developing an efficient Lung segmentation model is challenging because of difficulties such as the presence of several edges at the rib cage and clavicle, inconsistent lung shape among different individuals, and the appearance of the lung apex. In this paper, we propose a robust model for Lung segmentation in Chest Radiographs. Our model learns to ignore the irrelevant regions in an input Chest Radiograph while highlighting regions useful for lung segmentation. The proposed model is evaluated on two public chest X-Ray datasets (Montgomery County, MD, USA, and Shenzhen No. 3 People's Hospital in China). The experimental result with a DICE score of 98.6% demonstrates the robustness of our proposed lung segmentation approach.
Submitted 30 November, 2018;
originally announced November 2018.
-
Towards a Generic Diver-Following Algorithm: Balancing Robustness and Efficiency in Deep Visual Detection
Authors:
Md Jahidul Islam,
Michael Fulton,
Junaed Sattar
Abstract:
This paper explores the design and development of a class of robust diver-following algorithms for autonomous underwater robots. By considering the operational challenges for underwater visual tracking in diverse real-world settings, we formulate a set of desired features of a generic diver following algorithm. We attempt to accommodate these features and maximize general tracking performance by exploiting the state-of-the-art deep object detection models. We fine-tune the building blocks of these models with a goal of balancing the trade-off between robustness and efficiency in an onboard setting under real-time constraints. Subsequently, we design an architecturally simple Convolutional Neural Network (CNN)-based diver-detection model that is much faster than the state-of-the-art deep models yet provides comparable detection performances. In addition, we validate the performance and effectiveness of the proposed diver-following modules through a number of field experiments in closed-water and open-water environments.
Submitted 18 September, 2018;
originally announced September 2018.
-
Identifying Protein-Protein Interaction using Tree LSTM and Structured Attention
Authors:
Mahtab Ahmed,
Jumayel Islam,
Muhammad Rifayat Samee,
Robert E. Mercer
Abstract:
Identifying interactions between proteins is important to understand underlying biological processes. Extracting a protein-protein interaction (PPI) from the raw text is often very difficult. Previous supervised learning methods have used handcrafted features on human-annotated data sets. In this paper, we propose a novel tree recurrent neural network with structured attention architecture for doing PPI. Our architecture achieves state of the art results (precision, recall, and F1-score) on the AIMed and BioInfer benchmark data sets. Moreover, our models achieve a significant improvement over previous best models without any explicit feature extraction. Our experimental results show that traditional recurrent networks have inferior performance compared to tree recurrent networks for the supervised PPI problem.
Submitted 27 July, 2018;
originally announced August 2018.
-
A Cyberinfrastructure for BigData Transportation Engineering
Authors:
Md Johirul Islam,
Anuj Sharma,
Hridesh Rajan
Abstract:
Big Data-driven transportation engineering has the potential to improve utilization of road infrastructure, decrease traffic fatalities, improve fuel consumption, decrease construction worker injuries, among others. Despite these benefits, research on Big Data-driven transportation engineering is difficult today due to the computational expertise required to get started. This work proposes BoaT, a transportation-specific programming language, and its Big Data infrastructure that is aimed at decreasing this barrier to entry. Our evaluation that uses over two dozen research questions from six categories shows that research is easier to realize as a BoaT computer program, an order of magnitude faster when this program is run, and exhibits a 12-14x decrease in storage requirements.
Submitted 30 April, 2018;
originally announced May 2018.
-
Understanding Human Motion and Gestures for Underwater Human-Robot Collaboration
Authors:
Md Jahidul Islam
Abstract:
In this paper, we present a number of robust methodologies for an underwater robot to visually detect, follow, and interact with a diver for collaborative task execution. We design and develop two autonomous diver-following algorithms, the first of which utilizes both spatial- and frequency-domain features pertaining to human swimming patterns in order to visually track a diver. The second algorithm uses a convolutional neural network-based model for robust tracking-by-detection. In addition, we propose a hand gesture-based human-robot communication framework that is syntactically simpler and computationally more efficient than the existing grammar-based frameworks. In the proposed interaction framework, deep visual detectors are used to provide accurate hand gesture recognition; subsequently, a finite-state machine performs robust and efficient gesture-to-instruction mapping. The distinguishing feature of this framework is that it can be easily adopted by divers for communicating with underwater robots without using artificial markers or requiring memorization of complex language rules. Furthermore, we validate the performance and effectiveness of the proposed methodologies through extensive field experiments in closed- and open-water environments. Finally, we perform a user interaction study to demonstrate the usability benefits of our proposed interaction framework compared to existing methods.
Submitted 6 April, 2018;
originally announced April 2018.
-
Robotic Detection of Marine Litter Using Deep Visual Detection Models
Authors:
Michael Fulton,
Jungseok Hong,
Md Jahidul Islam,
Junaed Sattar
Abstract:
Trash deposits in aquatic environments have a destructive effect on marine ecosystems and pose a long-term economic and environmental threat. Autonomous underwater vehicles (AUVs) could very well contribute to the solution of this problem by finding and eventually removing trash. This paper evaluates a number of deep-learning algorithms performing the task of visually detecting trash in realistic underwater environments, with the eventual goal of exploration, mapping, and extraction of such debris by using AUVs. A large and publicly-available dataset of actual debris in open-water locations is annotated for training a number of convolutional neural network architectures for object detection. The trained networks are then evaluated on a set of images from other portions of that dataset, providing insight into approaches for developing the detection capabilities of an AUV for underwater trash removal. In addition, the evaluation is performed on three different platforms of varying processing power, which serves to assess these algorithms' fitness for real-time applications.
Submitted 21 September, 2018; v1 submitted 3 April, 2018;
originally announced April 2018.
-
Person Following by Autonomous Robots: A Categorical Overview
Authors:
Md Jahidul Islam,
Jungseok Hong,
Junaed Sattar
Abstract:
A wide range of human-robot collaborative applications in diverse domains such as manufacturing, health care, the entertainment industry, and social interactions, require an autonomous robot to follow its human companion. Different working environments and applications pose diverse challenges by adding constraints on the choice of sensors, the degree of autonomy, and dynamics of a person-following robot. Researchers have addressed these challenges in many ways and contributed to the development of a large body of literature. This paper provides a comprehensive overview of the literature by categorizing different aspects of person-following by autonomous robots. Also, the corresponding operational challenges are identified based on various design choices for ground, underwater, and aerial scenarios. In addition, state-of-the-art methods for perception, planning, control, and interaction are elaborately discussed and their applicability in varied operational scenarios are presented. Then, some of the prominent methods are qualitatively compared, corresponding practicalities are illustrated, and their feasibility is analyzed for various use-cases. Furthermore, several prospective application areas are identified, and open problems are highlighted for future research.
Submitted 17 September, 2019; v1 submitted 21 March, 2018;
originally announced March 2018.
-
Enhancing Underwater Imagery using Generative Adversarial Networks
Authors:
Cameron Fabbri,
Md Jahidul Islam,
Junaed Sattar
Abstract:
Autonomous underwater vehicles (AUVs) rely on a variety of sensors - acoustic, inertial and visual - for intelligent decision making. Due to its non-intrusive, passive nature, and high information content, vision is an attractive sensing modality, particularly at shallower depths. However, factors such as light refraction and absorption, suspended particles in the water, and color distortion affect the quality of visual data, resulting in noisy and distorted images. AUVs that rely on visual sensing thus face difficult challenges, and consequently exhibit poor performance on vision-driven tasks. This paper proposes a method to improve the quality of visual underwater scenes using Generative Adversarial Networks (GANs), with the goal of improving input to vision-driven behaviors further down the autonomy pipeline. Furthermore, we show how recently proposed methods are able to generate a dataset for the purpose of such underwater image restoration. For any visually-guided underwater robots, this improvement can result in increased safety and reliability through robust visual perception. To that effect, we present quantitative and qualitative data which demonstrates that images corrected through the proposed approach generate more visually appealing images, and also provide increased accuracy for a diver tracking algorithm.
Submitted 11 January, 2018;
originally announced January 2018.
-
An Ensemble of Deep Convolutional Neural Networks for Alzheimer's Disease Detection and Classification
Authors:
Jyoti Islam,
Yanqing Zhang
Abstract:
Alzheimer's Disease destroys brain cells causing people to lose their memory, mental functions and ability to continue daily activities. It is a severe neurological brain disorder which is not curable, but earlier detection of Alzheimer's Disease can help for proper treatment and to prevent brain tissue damage. Detection and classification of Alzheimer's Disease (AD) is challenging because sometimes the signs that distinguish Alzheimer's Disease MRI data can be found in normal healthy brain MRI data of older people. Moreover, there is a relatively small amount of data available to train the automated Alzheimer's Disease detection and classification model. In this paper, we present a novel Alzheimer's Disease detection and classification model using brain MRI data analysis. We develop an ensemble of deep convolutional neural networks and demonstrate superior performance on the Open Access Series of Imaging Studies (OASIS) dataset.
Submitted 19 December, 2017; v1 submitted 1 December, 2017;
originally announced December 2017.
-
Sentiment analysis of twitter data
Authors:
Hamid Bagheri,
Md Johirul Islam
Abstract:
Social networks are the main resources to gather information about people's opinion and sentiments towards different topics as they spend hours daily on social media and share their opinion. In this technical paper, we show the application of sentimental analysis and how to connect to Twitter and run sentimental analysis queries. We run experiments on different queries from politics to humanity and show the interesting results. We realized that the neutral sentiments for tweets are significantly high which clearly shows the limitations of the current works.
Submitted 15 December, 2017; v1 submitted 15 November, 2017;
originally announced November 2017.
-
Dynamic Reconfiguration of Mission Parameters in Underwater Human-Robot Collaboration
Authors:
Md Jahidul Islam,
Marc Ho,
Junaed Sattar
Abstract:
This paper presents a real-time programming and parameter reconfiguration method for autonomous underwater robots in human-robot collaborative tasks. Using a set of intuitive and meaningful hand gestures, we develop a syntactically simple framework that is computationally more efficient than a complex, grammar-based approach. In the proposed framework, a convolutional neural network is trained to provide accurate hand gesture recognition; subsequently, a finite-state machine-based deterministic model performs efficient gesture-to-instruction mapping, and further improves robustness of the interaction scheme. The key aspect of this framework is that it can be easily adopted by divers for communicating simple instructions to underwater robots without using artificial tags such as fiducial markers, or requiring them to memorize a potentially complex set of language rules. Extensive experiments are performed both on field-trial data and through simulation, which demonstrate the robustness, efficiency, and portability of this framework in a number of different scenarios. Finally, a user interaction study is presented that illustrates the gain in usability of our proposed interaction framework compared to the existing methods for underwater domains.
Submitted 20 February, 2018; v1 submitted 25 September, 2017;
originally announced September 2017.