-
Conditional Variational Diffusion Models
Authors:
Gabriel della Maggiora,
Luis Alberto Croquevielle,
Nikita Deshpande,
Harry Horsley,
Thomas Heinis,
Artur Yakimovich
Abstract:
Inverse problems aim to determine parameters from observations, a crucial task in engineering and science. Lately, generative models, especially diffusion models, have gained popularity in this area for their ability to produce realistic solutions and their good mathematical properties. Despite their success, an important drawback of diffusion models is their sensitivity to the choice of variance schedule, which controls the dynamics of the diffusion process. Fine-tuning this schedule for specific applications is crucial but time-costly and does not guarantee an optimal result. We propose a novel approach for learning the schedule as part of the training process. Our method supports probabilistic conditioning on data, provides high-quality solutions, and is flexible, proving able to adapt to different applications with minimum overhead. This approach is tested in two unrelated inverse problems: super-resolution microscopy and quantitative phase imaging, yielding comparable or superior results to previous methods and fine-tuned diffusion models. We conclude that fine-tuning the schedule by experimentation should be avoided because it can be learned during training in a stable way that yields better results.
Submitted 26 April, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
A generalization of the achievable rate of a MISO system using Bode-Fano wideband matching theory
Authors:
Nitish Deshpande,
Miguel R. Castellanos,
Saeed R. Khosravirad,
Jinfeng Du,
Harish Viswanathan,
Robert W. Heath Jr
Abstract:
Impedance-matching networks affect power transfer from the radio frequency (RF) chains to the antennas. Their design impacts the signal to noise ratio (SNR) and the achievable rate. In this paper, we maximize the information-theoretic achievable rate of a multiple-input-single-output (MISO) system with wideband matching constraints. Using a multiport circuit theory approach with frequency-selective scattering parameters, we propose a general framework for optimizing the MISO achievable rate that incorporates Bode-Fano wideband matching theory. We express the solution to the achievable rate optimization problem in terms of the optimized transmission coefficient and the Lagrangian parameters corresponding to the Bode-Fano inequality constraints. We apply this framework to a single electric Chu's antenna and an array of two electric Chu's antennas. We compare the optimized achievable rate obtained numerically with other benchmarks like the ideal achievable rate computed by disregarding matching constraints and the achievable rate obtained by using sub-optimal matching strategies like conjugate matching and frequency-flat transmission. We also propose a practical methodology to approximate the achievable rate bound by using the optimal transmission coefficient to derive a physically realizable matching network through the ADS software.
Submitted 14 October, 2023;
originally announced October 2023.
-
Learning Skills from Demonstrations: A Trend from Motion Primitives to Experience Abstraction
Authors:
Mehrdad Tavassoli,
Sunny Katyara,
Maria Pozzi,
Nikhil Deshpande,
Darwin G. Caldwell,
Domenico Prattichizzo
Abstract:
The uses of robots are changing from static environments in factories to encompass novel concepts such as Human-Robot Collaboration in unstructured settings. Pre-programming all the functionalities for robots becomes impractical, and hence, robots need to learn how to react to new events autonomously, just like humans. However, humans, unlike machines, are naturally skilled in responding to unexpected circumstances based on either experiences or observations. Hence, embedding such anthropoid behaviours into robots entails the development of neuro-cognitive models that emulate motor skills under a robot learning paradigm. Effective encoding of these skills is bound to the proper choice of tools and techniques. This paper studies different motion and behaviour learning methods ranging from Movement Primitives (MP) to Experience Abstraction (EA), applied to different robotic tasks. These methods are scrutinized and then experimentally benchmarked by reconstructing a standard pick-n-place task. Apart from providing a standard guideline for the selection of strategies and algorithms, this paper aims to offer perspectives on their possible extensions and improvements.
Submitted 14 October, 2022;
originally announced October 2022.
-
A Perception-Driven Approach To Immersive Remote Telerobotics
Authors:
Y. T. Tefera,
D. Mazzanti,
S. Anastasi,
D. G. Caldwell,
P. Fiorini,
N. Deshpande
Abstract:
Virtual Reality (VR) interfaces are increasingly used as remote visualization media in telerobotics. Remote environments captured through RGB-D cameras and visualized using VR interfaces can enhance operators' situational awareness and sense of presence. However, this approach has strict requirements for the speed, throughput, and quality of the visualized 3D data. Further, telerobotics requires operators to focus fully on their tasks, demanding high perceptual and cognitive skills. This paper presents a work-in-progress framework that addresses these challenges by taking the human visual system (HVS) as an inspiration. Human eyes use attentional mechanisms to select and draw user engagement to a specific place in the dynamic environment. Inspired by this, the framework implements functionalities to draw users' engagement to a specific place while simultaneously reducing latency and bandwidth requirements.
Submitted 11 October, 2022;
originally announced October 2022.
-
A wideband generalization of the near-field region for extremely large phased-arrays
Authors:
Nitish Deshpande,
Miguel R. Castellanos,
Saeed R. Khosravirad,
Jinfeng Du,
Harish Viswanathan,
Robert W. Heath Jr
Abstract:
The narrowband and far-field assumption in conventional wireless system design leads to a mismatch with the optimal beamforming required for wideband and near-field systems. This discrepancy is exacerbated for larger apertures and bandwidths. To characterize the behavior of near-field and wideband systems, we derive the beamforming gain expression achieved by a frequency-flat phased array designed for plane-wave propagation. To determine the far-field to near-field boundary for a wideband system, we propose a frequency-selective distance metric. The proposed far-field threshold increases for frequencies away from the center frequency. The analysis results in a fundamental upper bound on the product of the array aperture and the system bandwidth. We present numerical results to illustrate how the gain threshold affects the maximum usable bandwidth for the n260 and n261 5G NR bands.
Submitted 29 June, 2022; v1 submitted 28 June, 2022;
originally announced June 2022.
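For context on the far-field to near-field boundary mentioned in the abstract above, the following is a minimal sketch of the conventional narrowband reference point, the Fraunhofer distance 2D²/λ. It is not the frequency-selective metric proposed in the paper. The function name and the 0.5 m aperture are hypothetical values chosen for illustration; the band edges are the nominal n260 (37 to 40 GHz) and n261 (27.5 to 28.35 GHz) ranges.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def fraunhofer_distance(aperture_m: float, frequency_hz: float) -> float:
    """Classical narrowband far-field boundary 2 * D^2 / wavelength, in metres."""
    wavelength_m = C / frequency_hz
    return 2.0 * aperture_m ** 2 / wavelength_m


# Nominal band edges in Hz; the aperture below is an assumed example value.
bands_hz = {"n260": (37.0e9, 40.0e9), "n261": (27.5e9, 28.35e9)}
aperture_m = 0.5

for name, (f_lo, f_hi) in bands_hz.items():
    f_c = 0.5 * (f_lo + f_hi)  # band centre frequency
    d_ff = fraunhofer_distance(aperture_m, f_c)
    print(f"{name}: centre {f_c / 1e9:.3f} GHz, Fraunhofer distance {d_ff:.1f} m")
```

Because this boundary is evaluated at a single frequency, it says nothing about how the boundary shifts across a wide band, which is the gap the abstract's frequency-selective metric is described as addressing.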
-
Temporal Multimodal Multivariate Learning
Authors:
Hyoshin Park,
Justice Darko,
Niharika Deshpande,
Venktesh Pandey,
Hui Su,
Masahiro Ono,
Dedrick Barkely,
Larkin Folsom,
Derek Posselt,
Steve Chien
Abstract:
We introduce temporal multimodal multivariate learning, a new family of decision-making models that can indirectly learn and transfer online information from simultaneous observations of a probability distribution with more than one peak or more than one outcome variable from one time stage to another. We approximate the posterior by sequentially removing additional uncertainties across different variables and time, based on data-physics driven correlation, to address a broader class of challenging time-dependent decision-making problems under uncertainty. Extensive experiments on real-world datasets (i.e., urban traffic data and hurricane ensemble forecasting data) demonstrate the superior performance of the proposed targeted decision-making over the state-of-the-art baseline prediction methods across various settings.
Submitted 14 June, 2022;
originally announced June 2022.
-
Addressing Tactic Volatility in Self-Adaptive Systems Using Evolved Recurrent Neural Networks and Uncertainty Reduction Tactics
Authors:
Aizaz Ul Haq,
Niranjana Deshpande,
AbdElRahman ElSaid,
Travis Desell,
Daniel E. Krutz
Abstract:
Self-adaptive systems frequently use tactics to perform adaptations. Tactic examples include the implementation of additional security measures when an intrusion is detected, or activating a cooling mechanism when temperature thresholds are surpassed. Tactic volatility occurs in real-world systems and is defined as variable behavior in the attributes of a tactic, such as its latency or cost. A system's inability to effectively account for tactic volatility adversely impacts its efficiency and resiliency against the dynamics of real-world environments. To enable systems' efficiency against tactic volatility, we propose a Tactic Volatility Aware (TVA-E) process utilizing evolved Recurrent Neural Networks (eRNN) to provide accurate tactic predictions. TVA-E is also the first known process to take advantage of uncertainty reduction tactics to provide additional information to the decision-making process and reduce uncertainty. TVA-E easily integrates into popular adaptation processes enabling it to immediately benefit a large number of existing self-adaptive systems. Simulations using 52,106 tactic records demonstrate that: I) eRNN is an effective prediction mechanism, II) TVA-E represents an improvement over existing state-of-the-art processes in accounting for tactic volatility, and III) Uncertainty reduction tactics are beneficial in accounting for tactic volatility. The developed dataset and tool can be found at https://tacticvolatility.github.io/
Submitted 21 April, 2022;
originally announced April 2022.
-
Highly Generalizable Models for Multilingual Hate Speech Detection
Authors:
Neha Deshpande,
Nicholas Farris,
Vidhur Kumar
Abstract:
Hate speech detection has become an important research topic within the past decade. More private corporations need to regulate user-generated content on different platforms across the globe. In this paper, we introduce a study of multilingual hate speech classification. We compile a dataset of 11 languages and resolve different taxonomies by analyzing the combined data with binary labels: hate speech or not hate speech. Defining hate speech in a single way across different languages and datasets may erase cultural nuances in the definition; therefore, we utilize language-agnostic embeddings provided by LASER and MUSE in order to develop models that can use a generalized definition of hate speech across datasets. Furthermore, we evaluate prior state-of-the-art methodologies for hate speech detection under our expanded dataset. We conduct three types of experiments for a binary hate speech classification task: Multilingual-Train Monolingual-Test, Monolingual-Train Monolingual-Test and Language-Family-Train Monolingual-Test scenarios, to see if performance increases for each language due to learning more from other language data.
Submitted 26 January, 2022;
originally announced January 2022.
-
Navigation In Urban Environments Amongst Pedestrians Using Multi-Objective Deep Reinforcement Learning
Authors:
Niranjan Deshpande,
Dominique Vaufreydaz,
Anne Spalanzani
Abstract:
Urban autonomous driving in the presence of pedestrians as vulnerable road users is still a challenging and less examined research problem. This work formulates navigation in urban environments as a multi-objective reinforcement learning problem. A deep learning variant of thresholded lexicographic Q-learning is presented for autonomous navigation amongst pedestrians. The multi-objective DQN agent is trained on a custom urban environment developed in the CARLA simulator. The proposed method is evaluated by comparing it with a single-objective DQN variant on known and unknown environments. Evaluation results show that the proposed method outperforms the single-objective DQN variant in all evaluated aspects.
Submitted 11 October, 2021;
originally announced October 2021.
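The thresholded lexicographic action selection named in the abstract above can be sketched as follows. This is a minimal illustration of the generic technique under stated assumptions, not the authors' implementation: the function name, the list-based Q-value layout, and the example numbers are all hypothetical. Objectives are assumed to be ordered from highest to lowest priority, each thresholded objective has its Q-values clipped at its threshold, and the remaining objectives are compared unclipped.

```python
from typing import List, Sequence


def tlq_select_action(
    q_values: Sequence[Sequence[float]],
    thresholds: Sequence[float],
) -> int:
    """Pick an action by thresholded lexicographic ordering.

    q_values[i][a] is the Q-value of action a for objective i, with
    objectives ordered from highest to lowest priority. thresholds[i]
    caps objective i; objectives beyond len(thresholds) are not clipped.
    """
    candidates: List[int] = list(range(len(q_values[0])))
    for i, q_i in enumerate(q_values):
        if i < len(thresholds):
            # Clip so that all "good enough" actions tie on this objective.
            scores = {a: min(q_i[a], thresholds[i]) for a in candidates}
        else:
            scores = {a: q_i[a] for a in candidates}
        best = max(scores.values())
        # Keep only the actions that are best on this objective.
        candidates = [a for a in candidates if scores[a] == best]
        if len(candidates) == 1:
            break
    return candidates[0]


# Hypothetical example: objective 0 is safety, objective 1 is progress.
q = [[0.9, 0.6, 0.2], [1.0, 5.0, 9.0]]
print(tlq_select_action(q, thresholds=[0.5]))   # actions 0 and 1 tie on safety
print(tlq_select_action(q, thresholds=[0.95]))  # safety alone decides
```

With a low safety threshold, every action that clears it is treated as equally safe and the lower-priority objective breaks the tie; raising the threshold lets the high-priority objective dominate the choice.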
-
Fusing Visuo-Tactile Perception into Kernelized Synergies for Robust Grasping and Fine Manipulation of Non-rigid Objects
Authors:
Sunny Katyara,
Nikhil Deshpande,
Fanny Ficuciello,
Fei Chen,
Bruno Siciliano,
Darwin G. Caldwell
Abstract:
Handling non-rigid objects using robot hands necessitates a framework that incorporates not only human-level dexterity and cognition but also multi-sensory information and system dynamics for robust and fine interactions. In this research, our previously developed kernelized synergies framework, inspired by human behaviour in reusing the same subspace for grasping and manipulation, is augmented with visuo-tactile perception for autonomous and flexible adaptation to unknown objects. To detect objects and estimate their poses, a simplified visual pipeline using the RANSAC algorithm with Euclidean clustering and an SVM classifier is exploited. To modulate interaction efforts while grasping and manipulating non-rigid objects, tactile feedback from a T40S shokac chip sensor, generating 3D force information, is incorporated. Moreover, different kernel functions are examined in the kernelized synergies framework to evaluate its performance and potential against task reproducibility, execution, generalization and synergistic re-usability. Experiments performed with a robot arm-hand system validate the capability and usability of the upgraded framework in stably grasping and dexterously manipulating non-rigid objects.
Submitted 15 September, 2021;
originally announced September 2021.
-
Formulating Intuitive Stack-of-Tasks using Visuo-Tactile Perception for Collaborative Human-Robot Fine Manipulation
Authors:
Sunny Katyara,
Nikhil Deshpande,
Fanny Ficuciello,
Tao Teng,
Bruno Siciliano,
Darwin G. Caldwell,
Fei Chen
Abstract:
Enabling robots to work in close proximity to humans necessitates a control framework that not only incorporates multi-sensory information for autonomous and coordinated interactions but also has perceptive task planning to ensure an adaptable and flexible collaborative behaviour. In this research, an intuitive stack-of-tasks (iSoT) formulation is proposed that defines the robot's actions by considering the human-arm postures and the task progression. The framework is augmented with visuo-tactile information to effectively perceive the collaborative environment and intuitively switch between the planned sub-tasks. The visual feedback from depth cameras monitors and estimates the objects' poses and human-arm postures, while the tactile data provides the exploration skills to detect and maintain the desired contacts to avoid object slippage. To evaluate the performance, effectiveness and usability of the proposed framework, assembly and disassembly tasks, performed by the human-human and human-robot partners, are considered and analyzed using distinct evaluation metrics, i.e., approach adaptation, grasp correction, task coordination latency, cumulative posture deviation, and task repeatability.
Submitted 4 January, 2022; v1 submitted 9 March, 2021;
originally announced March 2021.
-
Behavioral decision-making for urban autonomous driving in the presence of pedestrians using Deep Recurrent Q-Network
Authors:
Niranjan Deshpande,
Dominique Vaufreydaz,
Anne Spalanzani
Abstract:
Decision making for autonomous driving in urban environments is challenging due to the complexity of the road structure and the uncertainty in the behavior of diverse road users. Traditional methods consist of manually designed rules as the driving policy, which require expert domain knowledge, are difficult to generalize and might give sub-optimal results as the environment gets complex. In contrast, using reinforcement learning, an optimal driving policy could be learned and improved automatically through several interactions with the environment. However, current research in the field of reinforcement learning for autonomous driving is mainly focused on highway setups, with little to no emphasis on urban environments. In this work, a deep reinforcement learning based decision-making approach for high-level driving behavior is proposed for urban environments in the presence of pedestrians. For this, the use of a Deep Recurrent Q-Network (DRQN) is explored, a method combining the state-of-the-art Deep Q-Network (DQN) with a long short-term memory (LSTM) layer, helping the agent gain a memory of the environment. A 3-D state representation is designed as the input, combined with a well-defined reward function, to train the agent to learn an appropriate behavior policy in a real-world-like urban simulator. The proposed method is evaluated for dense urban scenarios and compared with a rule-based approach, and results show that the proposed DRQN-based driving behavior decision maker outperforms the rule-based approach.
Submitted 26 October, 2020;
originally announced October 2020.
-
Towards a Magnetically Actuated Laser Scanner for Endoscopic Microsurgeries
Authors:
Alperen Acemoglu,
Nikhil Deshpande,
Leonardo S. Mattos
Abstract:
This article presents the design and assembly of a novel magnetically actuated endoscopic laser scanner device. The device is designed to perform 2D position control and high-speed scanning of a fiber-based laser for operation in narrow workspaces. The device includes laser focusing optics to allow non-contact incisions and a tablet-based control interface for intuitive teleoperation. The performance of the proof-of-concept device is analysed through controllability and usability studies. The computer-controlled high-speed scanning demonstrates repeatable results with 21 µm precision and a stable response up to 48 Hz. Teleoperation user trials, performed for trajectory-following tasks with 12 subjects, show an accuracy of 39 µm. The innovative design of the device can be applied to both surgical and diagnostic (imaging) applications in endoscopic systems.
Submitted 21 November, 2017;
originally announced November 2017.
-
Review of Robust Video Watermarking Algorithms
Authors:
Neeta Deshpande,
Archana Rajurkar,
R Manthalkar
Abstract:
There has been a remarkable increase in data exchange over the web and in the widespread use of digital media. As a result, multimedia data transfers have also grown. The mounting interest in digital watermarking over the last decade is largely due to the increased need for copyright protection of digital content, and is further driven by commercial prospects. Applications of video watermarking in copy control, broadcast monitoring, fingerprinting, video authentication, copyright protection, etc. are rising rapidly. The main aspects of information hiding are capacity, security and robustness. Capacity deals with the amount of information that can be hidden. Security concerns the ability of anyone to detect the hidden information, and robustness refers to how much modification the cover content can withstand before the concealed information is destroyed. Video watermarking algorithms normally favor robustness. In a robust algorithm it is not possible to eliminate the watermark without severe degradation of the cover content. In this paper, we introduce the notion of video watermarking and the features required to design a robust watermarked video for a valuable application. We review several algorithms and introduce frequently used key techniques. The aim of this paper is to focus on the various domains of video watermarking techniques. The majority of the reviewed video watermarking methods emphasize the robustness of the algorithm.
Submitted 11 April, 2010;
originally announced April 2010.