Search | arXiv e-print repository

MDHA: Multi-Scale Deformable Transformer with Hybrid Anchors for Multi-View 3D Object Detection

Authors: Michelle Adeline, Junn Yong Loo, Vishnu Monn Baskaran

Abstract: Multi-view 3D object detection is a crucial component of autonomous driving systems. Contemporary query-based methods primarily depend either on dataset-specific initialization of 3D anchors, introducing bias, or utilize dense attention mechanisms, which are computationally inefficient and unscalable. To overcome these issues, we present MDHA, a novel sparse query-based framework, which constructs adaptive 3D output proposals using hybrid anchors from multi-view, multi-scale input. Fixed 2D anchors are combined with depth predictions to form 2.5D anchors, which are projected to obtain 3D proposals. To ensure high efficiency, our proposed Anchor Encoder performs sparse refinement and selects the top-k anchors and features. Moreover, while existing multi-view attention mechanisms rely on projecting reference points to multiple images, our novel Circular Deformable Attention mechanism only projects to a single image but allows reference points to seamlessly attend to adjacent images, improving efficiency without compromising on performance. On the nuScenes val set, it achieves 46.4% mAP and 55.0% NDS with a ResNet101 backbone. MDHA significantly outperforms the baseline, where anchor proposals are modelled as learnable embeddings.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.08015 [pdf]

Highly agile flat swimming robot

Authors: Florian Hartmann, Mrudhula Baskaran, Gaetan Raynaud, Mehdi Benbedda, Karen Mulleners, Herbert Shea

Abstract: Exploring bodies of water on their surface allows robots to efficiently communicate and harvest energy from the sun. On the water surface, however, robots often face highly unstructured environments, cluttered with plant matter, animals, and debris. We report a fast (5.1 cm/s translation and 195 °/s rotation), centimeter-scale swimming robot with high maneuverability and autonomous untethered operation. Locomotion is enabled by a pair of soft, millimeter-thin, undulating pectoral fins, in which traveling waves are electrically excited to generate propulsion. The robots navigate through narrow spaces, through grassy plants, and push objects weighing over 16x their body weight. Such robots can allow distributed environmental monitoring as well as continuous measurement of plant and water parameters for aqua-farming.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2306.12361 [pdf, other]

Sigma-point Kalman Filter with Nonlinear Unknown Input Estimation via Optimization and Data-driven Approach for Dynamic Systems

Authors: Junn Yong Loo, Ze Yang Ding, Vishnu Monn Baskaran, Surya Girinatha Nurzaman, Chee Pin Tan

Abstract: Most works on joint state and unknown input (UI) estimation require the assumption that the UIs are linear; this is potentially restrictive as it does not hold in many intelligent autonomous systems. To overcome this restriction and circumvent the need to linearize the system, we propose a derivative-free Unknown Input Sigma-point Kalman Filter (SPKF-nUI) where the SPKF is interconnected with a general nonlinear UI estimator that can be implemented via nonlinear optimization and data-driven approaches. The nonlinear UI estimator uses the posterior state estimate which is less susceptible to state prediction error. In addition, we introduce a joint sigma-point transformation scheme to incorporate both the state and UI uncertainties in the estimation of SPKF-nUI. An in-depth stochastic stability analysis proves that the proposed SPKF-nUI yields exponentially converging estimation error bounds under reasonable assumptions. Finally, two case studies are carried out on a simulation-based rigid robot and a physical soft robot, i.e., robots made of soft materials with complex dynamics to validate effectiveness of the proposed filter on nonlinear dynamic systems. Our results demonstrate that the proposed SPKF-nUI achieves the lowest state and UI estimation errors when compared to the existing nonlinear state-UI filters.

Submitted 24 June, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

arXiv:2306.04919 [pdf, other]

Unsupervised Cross-Domain Soft Sensor Modelling via Deep Physics-Inspired Particle Flow Bayes

Authors: Junn Yong Loo, Ze Yang Ding, Surya G. Nurzaman, Chee-Ming Ting, Vishnu Monn Baskaran, Chee Pin Tan

Abstract: Data-driven soft sensors are essential for achieving accurate perception through reliable state inference. However, developing representative soft sensor models is challenged by issues such as missing labels, domain adaptability, and temporal coherence in data. To address these challenges, we propose a deep Particle Flow Bayes (DPFB) framework for cross-domain soft sensor modeling in the absence of target state labels. In particular, a sequential Bayes objective is first formulated to perform the maximum likelihood estimation underlying the cross-domain soft sensing problem. At the core of the framework, we incorporate a physics-inspired particle flow that optimizes the sequential Bayes objective to perform an exact Bayes update of the model extracted latent and hidden features. As a result, these contributions enable the proposed framework to learn a rich approximate posterior feature representation capable of characterizing complex cross-domain system dynamics and performing effective time series unsupervised domain adaptation (UDA). Finally, we validate the framework on a complex industrial multiphase flow process system with complex dynamics and multiple operating conditions. The results demonstrate that the DPFB framework achieves superior cross-domain soft sensing performance, outperforming state-of-the-art deep UDA and normalizing flow approaches.

Submitted 8 July, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

arXiv:2303.01693 [pdf, other]

doi 10.1109/ICRA48891.2023.10160662

Cross-domain Transfer Learning and State Inference for Soft Robots via a Semi-supervised Sequential Variational Bayes Framework

Authors: Shageenderan Sapai, Junn Yong Loo, Ze Yang Ding, Chee Pin Tan, Raphael CW Phan, Vishnu Monn Baskaran, Surya Girinatha Nurzaman

Abstract: Recently, data-driven models such as deep neural networks have shown to be promising tools for modelling and state inference in soft robots. However, voluminous amounts of data are necessary for deep models to perform effectively, which requires exhaustive and quality data collection, particularly of state labels. Consequently, obtaining labelled state data for soft robotic systems is challenged for various reasons, including difficulty in the sensorization of soft robots and the inconvenience of collecting data in unstructured environments. To address this challenge, in this paper, we propose a semi-supervised sequential variational Bayes (DSVB) framework for transfer learning and state inference in soft robots with missing state labels on certain robot configurations. Considering that soft robots may exhibit distinct dynamics under different robot configurations, a feature space transfer strategy is also incorporated to promote the adaptation of latent features across multiple configurations. Unlike existing transfer learning approaches, our proposed DSVB employs a recurrent neural network to model the nonlinear dynamics and temporal coherence in soft robot data. The proposed framework is validated on multiple setup configurations of a pneumatic-based soft robot finger. Experimental results on four transfer scenarios demonstrate that DSVB performs effective transfer learning and accurate state inference amidst missing state labels. The data and code are available at https://github.com/shageenderan/DSVB.

Submitted 25 August, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

Comments: Accepted at the International Conference on Robotics and Automation (ICRA) 2023

arXiv:2302.11361 [pdf, other]

HDR image watermarking using saliency detection and quantization index modulation

Authors: Ahmed Khan, Minoru Kuribayashi, KokSheik Wong, Vishnu Monn Baskaran

Abstract: High-dynamic range (HDR) images are circulated rapidly over the internet with risks of being exploited for unauthorized usage. To protect these images, some HDR image based watermarking (HDR-IW) methods were put forward. However, they inherited the same problem faced by conventional IW methods for standard dynamic range (SDR) images, where only trade-offs among conflicting requirements are managed instead of simultaneous improvement. In this paper, a novel saliency (eye-catching object) detection based trade-off independent HDR-IW is proposed, to simultaneously improve robustness, imperceptibility and payload. First, the host image goes through our proposed salient object detection model to produce a saliency map, which is, in turn, exploited to segment the foreground and background of the host image. Next, the binary watermark is partitioned into the foregrounds and backgrounds using the same mask and scrambled using a random permutation algorithm. Finally, the watermark segments are embedded into selected bit-plane of the corresponding host segments using quantized indexed modulation. Experimental results suggest that the proposed work outperforms state-of-the-art methods in terms of improving the conflicting requirements.

Submitted 23 February, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

arXiv:2107.14771 [pdf, other]

doi 10.1017/flo.2022.9

Lagrangian analysis of bio-inspired vortex ring formation

Authors: Mrudhula Baskaran, Karen Mulleners

Abstract: Pulsatile jet propulsion is a highly energy-efficient swimming mode used by aquatic animals that continues to inspire engineers of underwater vehicles. Here, we present a bio-inspired jet propulsor that combines the flexible hull of a jellyfish with the bivalve compression of a scallop to create individual vortex rings for thrust generation. Similar to biological jetters, our propulsor generates a non-linear temporal exit velocity profile and has a finite volume capacity. The formation process of the vortices generated by this jet profile is analysed using time-resolved velocity field measurements. The transient development of the vortex properties is characterised based on the evolution of ridges in the finite-time Lyapunov exponent field and on local extrema in the pressure field derived from the velocity data. Special attention is directed toward the vortex merging observed in the trailing jet. During vortex merging, the Lagrangian vortex boundaries first contract in the stream-wise direction before expanding in the normal direction to keep the non-dimensional energy at its minimum value. The circulation, diameter, and translation velocity of the vortex increase due to merging. The vortex merging takes place because the velocity of the trailing vortex is higher than the velocity of the main vortex prior to merging.

Submitted 29 April, 2022; v1 submitted 30 July, 2021; originally announced July 2021.

arXiv:2105.15154 [pdf, other]

CO2-driven diffusiophoresis and water cleaning: Similarity solutions for predicting the exclusion zone in a channel flow

Authors: Suin Shim, Mrudhula Baskaran, Ethan H. Thai, Howard A. Stone

Abstract: We investigate experimentally and theoretically diffusiophoretic separation of negatively charged particles in a rectangular channel flow, driven by CO2 dissolution from one side-wall. Since the negatively charged particles create an exclusion zone near the boundary where CO2 is introduced, we model the problem by applying a shear flow approximation in a two-dimensional configuration. From the form of the equations we define a similarity variable to transform the reaction-diffusion equations for CO2 and ions and the advection-diffusion equation for the particle distribution to ordinary differential equations. The definition of the similarity variable suggests a characteristic length scale for the particle exclusion zone. We consider height-averaged flow behaviors in rectangular channels to rationalize and connect our experimental observations with the model, by calculating the wall shear rate as functions of channel dimensions. Our observations and the theoretical model provide the design parameters such as flow speed, channel dimensions and CO2 pressure for the in-flow water cleaning systems.

Submitted 31 May, 2021; originally announced May 2021.

arXiv:2007.00754 [pdf, other]

Simulation and Analysis of Distributed Wireless Sensor Network using Message Passing Interface

Authors: Bhanuka Manesha Samarasekara Vitharana Gamage, Vishnu Monn Baskaran

Abstract: Wireless Sensor Networks (WSN) are used by many industries from environment monitoring systems to NASA's space exploration programs, as it has allowed society to monitor and prevent problems before they occur with less cost and maintenance. This document aims to propose and analyze an efficient inter process communication (IPC) architecture using a nearest neighbor/grid based socket architecture. A parallelized version of the AES encryption algorithm is also used in order to increase the security of the WSN. First the proposed architecture is compared and contrasted against other well established architectures. Next, the benefits and drawbacks of the AES encryption algorithm are elucidated. The Message Passing Interface (MPI) library in C is used for the communication while OpenMP is used for parallelizing the encryption algorithm. Next an analysis is performed on the results obtained from multiple simulations. Finally a conclusion is made that the grid based IPC architecture with AES parallel encryption helps WSNs maintain security in communication while being cost and power efficient to operate.

Submitted 1 July, 2020; originally announced July 2020.

Comments: 11 pages, 11 figures

arXiv:2007.00745 [pdf, other]

Efficient Generation of Mandelbrot Set using Message Passing Interface

Authors: Bhanuka Manesha Samarasekara Vitharana Gamage, Vishnu Monn Baskaran

Abstract: With the increasing need for safer and reliable systems, Mandelbrot Set's use in the encryption world is evident to everyone. This document aims to provide an efficient method to generate this set using data parallelism. First Bernstein's conditions are used to ensure that the data is parallelizable when generating the Mandelbrot Set. Then Amdahl's Law is used to calculate the theoretical speed up, to be used to compare three partition schemes. The three partition schemes discussed in this document are the Naïve Row Segmentation, the First Come First Served Row Segmentation and the Alternating Row Segmentation. The Message Passing Interface (MPI) library in C is used for all of the communication. After testing all the implementations on MonARCH, the results demonstrate that the Naïve Row Segmentation approach did not perform up to par. But the Alternating Row Segmentation approach performs better when the number of tasks is $< 16$, whereas the First Come First Served approach performs better when the number of tasks is $\ge 16$.

Submitted 1 July, 2020; originally announced July 2020.

Comments: 12 pages, 10 figures

arXiv:1902.03585 [pdf, other]

doi 10.1109/TCYB.2019.2897162

Angle-Closure Detection in Anterior Segment OCT based on Multi-Level Deep Network

Authors: Huazhu Fu, Yanwu Xu, Stephen Lin, Damon Wing Kee Wong, Mani Baskaran, Meenakshi Mahesh, Tin Aung, Jiang Liu

Abstract: Irreversible visual impairment is often caused by primary angle-closure glaucoma, which could be detected via Anterior Segment Optical Coherence Tomography (AS-OCT). In this paper, an automated system based on deep learning is presented for angle-closure detection in AS-OCT images. Our system learns a discriminative representation from training data that captures subtle visual cues not modeled by handcrafted features. A Multi-Level Deep Network (MLDN) is proposed to formulate this learning, which utilizes three particular AS-OCT regions based on clinical priors: the global anterior segment structure, local iris region, and anterior chamber angle (ACA) patch. In our method, a sliding window based detector is designed to localize the ACA region, which addresses ACA detection as a regression task. Then, three parallel sub-networks are applied to extract AS-OCT representations for the global image and at clinically-relevant local regions. Finally, the extracted deep features of these sub-networks are concatenated into one fully connected layer to predict the angle-closure detection result. In the experiments, our system is shown to surpass previous detection methods and other deep learning systems on two clinical AS-OCT datasets.

Submitted 10 February, 2019; originally announced February 2019.

Comments: 9 pages, accepted by IEEE Transactions on Cybernetics

arXiv:1806.05781 [pdf, other]

doi 10.3389/fpsyg.2018.01128

A Survey of Automatic Facial Micro-expression Analysis: Databases, Methods and Challenges

Authors: Yee-Hui Oh, John See, Anh Cat Le Ngo, Raphael Chung-Wei Phan, Vishnu Monn Baskaran

Abstract: Over the last few years, automatic facial micro-expression analysis has garnered increasing attention from experts across different disciplines because of its potential applications in various fields such as clinical diagnosis, forensic investigation and security systems. Advances in computer algorithms and video acquisition technology have rendered machine analysis of facial micro-expressions possible today, in contrast to decades ago when it was primarily the domain of psychiatrists where analysis was largely manual. Indeed, although the study of facial micro-expressions is a well-established field in psychology, it is still relatively new from the computational perspective with many interesting problems. In this survey, we present a comprehensive review of state-of-the-art databases and methods for micro-expressions spotting and recognition. Individual stages involved in the automation of these tasks are also described and reviewed at length. In addition, we also deliberate on the challenges and future directions in this growing field of automatic facial micro-expression analysis.

Submitted 14 June, 2018; originally announced June 2018.

Comments: 45 pages, single column preprint version. Submitted: 2 December 2017, Accepted: 12 June 2018 to Frontiers in Psychology

arXiv:1601.05458 [pdf, ps, other]

Efficient Compilation to Event-Driven Task Programs

Authors: Benoit Meister, Muthu Baskaran, Benoit Pradelle, Thomas Henretty, Richard Lethin

Abstract: As illustrated by the emergence of a class of new languages and runtimes, it is expected that a large portion of the programs to run on extreme scale computers will need to be written as graphs of event-driven tasks (EDTs). EDT runtime systems, which schedule such collections of tasks, enable more concurrency than traditional runtimes by reducing the amount of inter-task synchronization, improving dynamic load balancing and making more operations asynchronous. We present an efficient technique to generate such task graphs from a polyhedral representation of a program, both in terms of compilation time and asymptotic execution time. Task dependences become materialized in different forms, depending upon the synchronization model available with the targeted runtime. We explore the different ways of programming EDTs using each synchronization model, and identify important sources of overhead associated with them. We evaluate these programming schemes according to the cost they entail in terms of sequential start-up, in-flight task management, space used for synchronization objects, and garbage collection of these objects. While our implementation and evaluation take place in a polyhedral compiler, the presented overhead cost analysis is useful in the more general context of automatic code generation.

Submitted 6 January, 2016; originally announced January 2016.

Comments: 18 pages, 6 figures

arXiv:1512.01542 [pdf, other]

Optimizing the domain wall fermion Dirac operator using the R-Stream source-to-source compiler

Authors: Meifeng Lin, Eric Papenhausen, M. Harper Langston, Benoit Meister, Muthu Baskaran, Taku Izubuchi, Chulwoo Jung

Abstract: The application of the Dirac operator on a spinor field, the Dslash operation, is the most computation-intensive part of the lattice QCD simulations. It is often the key kernel to optimize to achieve maximum performance on various platforms. Here we report on a project to optimize the domain wall fermion Dirac operator in Columbia Physics System (CPS) using the R-Stream source-to-source compiler. Our initial target platform is the Intel PC clusters. We discuss the optimization strategies involved before and after the automatic code generation with R-Stream and present some preliminary benchmark results.

Submitted 4 December, 2015; originally announced December 2015.

Comments: 7 pages, 4 figures. Proceedings of the 33rd International Symposium on Lattice Field Theory, July 14 -18, 2015, Kobe, Japan

Journal ref: PoS(LATTICE 2015)022

arXiv:1409.1914 [pdf, ps, other]

A Tale of Three Runtimes

Authors: Nicolas Vasilache, Muthu Baskaran, Tom Henretty, Benoit Meister, M. Harper Langston, Sanket Tavarageri, Richard Lethin

Abstract: This contribution discusses the automatic generation of event-driven, tuple-space based programs for task-oriented execution models from a sequential C specification. We developed a hierarchical mapping solution using auto-parallelizing compiler technology to target three different runtimes relying on event-driven tasks (EDTs). Our solution benefits from the important observation that loop types encode short, transitive relations among EDTs that are compact and efficiently evaluated at runtime. In this context, permutable loops are of particular importance as they translate immediately into conservative point-to-point synchronizations of distance 1. Our solution generates calls into a runtime-agnostic C++ layer, which we have retargeted to Intel's Concurrent Collections (CnC), ETI's SWARM, and the Open Community Runtime (OCR). Experience with other runtime systems motivates our introduction of support for hierarchical async-finishes in CnC. Experimental data is provided to show the benefit of automatically generated code for EDT-based runtimes as well as comparisons across runtimes.

Submitted 5 September, 2014; originally announced September 2014.

Showing 1–15 of 15 results for author: Baskaran, M