Search | arXiv e-print repository

arXiv:2405.19307 [pdf, other]

Data Efficient Behavior Cloning for Fine Manipulation via Continuity-based Corrective Labels

Authors: Abhay Deshpande, Liyiming Ke, Quinn Pfeifer, Abhishek Gupta, Siddhartha S. Srinivasa

Abstract: We consider imitation learning with access only to expert demonstrations, whose real-world application is often limited by covariate shift due to compounding errors during execution. We investigate the effectiveness of the Continuity-based Corrective Labels for Imitation Learning (CCIL) framework in mitigating this issue for real-world fine manipulation tasks. CCIL generates corrective labels by l… ▽ More We consider imitation learning with access only to expert demonstrations, whose real-world application is often limited by covariate shift due to compounding errors during execution. We investigate the effectiveness of the Continuity-based Corrective Labels for Imitation Learning (CCIL) framework in mitigating this issue for real-world fine manipulation tasks. CCIL generates corrective labels by learning a locally continuous dynamics model from demonstrations to guide the agent back toward expert states. Through extensive experiments on peg insertion and fine gras**, we provide the first empirical validation that CCIL can significantly improve imitation learning performance despite discontinuities present in contact-rich manipulation. We find that: (1) real-world manipulation exhibits sufficient local smoothness to apply CCIL, (2) generated corrective labels are most beneficial in low-data regimes, and (3) label filtering based on estimated dynamics model error enables performance gains. To effectively apply CCIL to robotic domains, we offer a practical instantiation of the framework and insights into design choices and hyperparameter selection. Our work demonstrates CCIL's practicality for alleviating compounding errors in imitation learning on physical robots. △ Less

Submitted 3 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

arXiv:2403.11298 [pdf, other]

Multi-Sample Long Range Path Planning under Sensing Uncertainty for Off-Road Autonomous Driving

Authors: Matt Schmittle, Rohan Baijal, Brian Hou, Siddhartha Srinivasa, Byron Boots

Abstract: We focus on the problem of long-range dynamic replanning for off-road autonomous vehicles, where a robot plans paths through a previously unobserved environment while continuously receiving noisy local observations. An effective approach for planning under sensing uncertainty is determinization, where one converts a stochastic world into a deterministic one and plans under this simplification. Thi… ▽ More We focus on the problem of long-range dynamic replanning for off-road autonomous vehicles, where a robot plans paths through a previously unobserved environment while continuously receiving noisy local observations. An effective approach for planning under sensing uncertainty is determinization, where one converts a stochastic world into a deterministic one and plans under this simplification. This makes the planning problem tractable, but the cost of following the planned path in the real world may be different than in the determinized world. This causes collisions if the determinized world optimistically ignores obstacles, or causes unnecessarily long routes if the determinized world pessimistically imagines more obstacles. We aim to be robust to uncertainty over potential worlds while still achieving the efficiency benefits of determinization. We evaluate algorithms for dynamic replanning on a large real-world dataset of challenging long-range planning problems from the DARPA RACER program. Our method, Dynamic Replanning via Evaluating and Aggregating Multiple Samples (DREAMS), outperforms other determinization-based approaches in terms of combined traversal time and collision cost. https://sites.google.com/cs.washington.edu/dreams/ △ Less

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.04134 [pdf, other]

An Adaptable, Safe, and Portable Robot-Assisted Feeding System

Authors: Ethan Kroll Gordon, Rajat Kumar Jenamani, Amal Nanavati, Ziang Liu, Haya Bolotski, Raida Karim, Daniel Stabile, Atharva Kashyap, Bernie Hao Zhu, Xilai Dai, Tyler Schrenk, Jonathan Ko, Taylor Kessler Faulkner, Tapomayukh Bhattacharjee, Siddhartha Srinivasa

Abstract: We demonstrate a robot-assisted feeding system that enables people with mobility impairments to feed themselves. Our system design embodies Safety, Portability, and User Control, with comprehensive full-stack safety checks, the ability to be mounted on and powered by any powered wheelchair, and a custom web-app allowing care-recipients to leverage their own assistive devices for robot control. For… ▽ More We demonstrate a robot-assisted feeding system that enables people with mobility impairments to feed themselves. Our system design embodies Safety, Portability, and User Control, with comprehensive full-stack safety checks, the ability to be mounted on and powered by any powered wheelchair, and a custom web-app allowing care-recipients to leverage their own assistive devices for robot control. For bite acquisition, we leverage multi-modal online learning to tractably adapt to unseen food types. For bite transfer, we leverage real-time mouth perception and interaction-aware control. Co-designed with community researchers, our system has been validated through multiple end-user studies. △ Less

Submitted 6 March, 2024; originally announced March 2024.

Comments: HRI 2024 Demo; Corrected inaccurate author ordering in ACM DL which occurred due to formatting issues

arXiv:2403.00489 [pdf, other]

Multiple Ways of Working with Users to Develop Physically Assistive Robots

Authors: Amal Nanavati, Max Pascher, Vinitha Ranganeni, Ethan K. Gordon, Taylor Kessler Faulkner, Siddhartha S. Srinivasa, Maya Cakmak, Patrícia Alves-Oliveira, Jens Gerken

Abstract: Despite the growth of physically assistive robotics (PAR) research over the last decade, nearly half of PAR user studies do not involve participants with the target disabilities. There are several reasons for this -- recruitment challenges, small sample sizes, and transportation logistics -- all influenced by systemic barriers that people with disabilities face. However, it is well-established tha… ▽ More Despite the growth of physically assistive robotics (PAR) research over the last decade, nearly half of PAR user studies do not involve participants with the target disabilities. There are several reasons for this -- recruitment challenges, small sample sizes, and transportation logistics -- all influenced by systemic barriers that people with disabilities face. However, it is well-established that working with end-users results in technology that better addresses their needs and integrates with their lived circumstances. In this paper, we reflect on multiple approaches we have taken to working with people with motor impairments across the design, development, and evaluation of three PAR projects: (a) assistive feeding with a robot arm; (b) assistive teleoperation with a mobile manipulator; and (c) shared control with a robot arm. We discuss these approaches to working with users along three dimensions -- individual vs. community-level insight, logistic burden on end-users vs. researchers, and benefit to researchers vs. community -- and share recommendations for how other PAR researchers can incorporate users into their work. △ Less

Submitted 7 March, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

Comments: A3DE '24: Workshop on Assistive Applications, Accessibility, and Disability Ethics at the ACM/IEEE International Conference on Human-Robot Interaction

arXiv:2401.12159 [pdf, other]

Transcending To Notions

Authors: Sama Sai Karthik, Jayati Deshmukh, Srinath Srinivasa

Abstract: Social identities play an important role in the dynamics of human societies, and it can be argued that some sense of identification with a larger cause or idea plays a critical role in making humans act responsibly. Often social activists strive to get populations to identify with some cause or notion -- like green energy, diversity, etc. in order to bring about desired social changes. We explore… ▽ More Social identities play an important role in the dynamics of human societies, and it can be argued that some sense of identification with a larger cause or idea plays a critical role in making humans act responsibly. Often social activists strive to get populations to identify with some cause or notion -- like green energy, diversity, etc. in order to bring about desired social changes. We explore the problem of designing computational models for social identities in the context of autonomous AI agents. For this, we propose an agent model that enables agents to identify with certain notions and show how this affects collective outcomes. We also contrast between associations of identity with rational preferences. The proposed model is simulated in an application context of urban mobility, where we show how changes in social identity affect mobility patterns and collective outcomes. △ Less

Submitted 22 January, 2024; originally announced January 2024.

arXiv:2401.11881 [pdf, other]

Modelling the Dynamics of Identity and Fairness in Ultimatum Game

Authors: Janvi Chhabra, Jayati Deshmukh, Srinath Srinivasa

Abstract: Allocation games are zero-sum games that model the distribution of resources among multiple agents. In this paper, we explore the interplay between an elastic sense of subjective identity and its impact on notions of fairness in allocation. An elastic sense of identity in agents is known to lead to responsible decision-making in non-cooperative, non-zero-sum games like Prisoners' Dilemma, and is a… ▽ More Allocation games are zero-sum games that model the distribution of resources among multiple agents. In this paper, we explore the interplay between an elastic sense of subjective identity and its impact on notions of fairness in allocation. An elastic sense of identity in agents is known to lead to responsible decision-making in non-cooperative, non-zero-sum games like Prisoners' Dilemma, and is a desirable feature to add into agent models. However, when it comes to allocation, an elastic sense of identity can be shown to exacerbate inequities in allocation, giving no rational incentive for agents to act fairly towards one another. This lead us to introduce a sense of fairness as an innate characteristic of autonomous agency. For this, we implement the well-known Ultimatum Game between two agents, where their elastic sense of self (controlled by a parameter called $γ$) and a sense of fairness (controlled by a parameter called $τ$) are both varied. We study the points at which agents find it no longer rational to identify with the other agent, and uphold their sense of fairness, and vice versa. Such a study also helps us discern the subtle difference between responsibility and fairness when it comes to autonomous agency. △ Less

Submitted 22 January, 2024; originally announced January 2024.

arXiv:2311.11199 [pdf, other]

HOUND: An Open-Source, Low-cost Research Platform for High-speed Off-road Underactuated Nonholonomic Driving

Authors: Sidharth Talia, Matt Schmittle, Alexander Lambert, Alexander Spitzer, Christoforos Mavrogiannis, Siddhartha S. Srinivasa

Abstract: Off-road vehicles are susceptible to rollovers in terrains with large elevation features, such as steep hills, ditches, and berms. One way to protect them against rollovers is ruggedization through the use of industrial-grade parts and physical modifications. However, this solution can be prohibitively expensive for academic research labs. Our key insight is that a software-based rollover-preventi… ▽ More Off-road vehicles are susceptible to rollovers in terrains with large elevation features, such as steep hills, ditches, and berms. One way to protect them against rollovers is ruggedization through the use of industrial-grade parts and physical modifications. However, this solution can be prohibitively expensive for academic research labs. Our key insight is that a software-based rollover-prevention system (RPS) enables the use of commercial-off-the-shelf hardware parts that are cheaper than their industrial counterparts, thus reducing overall cost. In this paper, we present HOUND, a small-scale, inexpensive, off-road autonomy platform that can handle challenging outdoor terrains at high speeds through the integration of an RPS. HOUND is integrated with a complete stack for perception and control, geared towards aggressive offroad driving. We deploy HOUND in the real world, at high speeds, on four different terrains covering 50 km of driving and highlight its utility in preventing rollovers and traversing difficult terrain. Additionally, through integration with BeamNG, a state-of-the-art driving simulator, we demonstrate a significant reduction in rollovers without compromising turning ability across a series of simulated experiments. Supplementary material can be found on our website, where we will also release all design documents for the platform: https://sites.google.com/view/prl-hound . △ Less

Submitted 18 November, 2023; originally announced November 2023.

Comments: 6 Pages, 8 Figures

arXiv:2310.18290 [pdf, other]

An Approach to Automatically generating Riddles aiding Concept Attainment

Authors: Niharika Sri Parasa, Chaitali Diwan, Srinath Srinivasa

Abstract: One of the primary challenges in online learning environments, is to retain learner engagement. Several different instructional strategies are proposed both in online and offline environments to enhance learner engagement. The Concept Attainment Model is one such instructional strategy that focuses on learners acquiring a deeper understanding of a concept rather than just its dictionary definition… ▽ More One of the primary challenges in online learning environments, is to retain learner engagement. Several different instructional strategies are proposed both in online and offline environments to enhance learner engagement. The Concept Attainment Model is one such instructional strategy that focuses on learners acquiring a deeper understanding of a concept rather than just its dictionary definition. This is done by searching and listing the properties used to distinguish examples from non-examples of various concepts. Our work attempts to apply the Concept Attainment Model to build conceptual riddles, to deploy over online learning environments. The approach involves creating factual triples from learning resources, classifying them based on their uniqueness to a concept into `Topic Markers' and `Common', followed by generating riddles based on the Concept Attainment Model's format and capturing all possible solutions to those riddles. The results obtained from the human evaluation of riddles prove encouraging. △ Less

Submitted 27 October, 2023; originally announced October 2023.

Comments: 9 pages, 5 figures

arXiv:2310.12972 [pdf, other]

CCIL: Continuity-based Data Augmentation for Corrective Imitation Learning

Authors: Liyiming Ke, Yunchu Zhang, Abhay Deshpande, Siddhartha Srinivasa, Abhishek Gupta

Abstract: We present a new technique to enhance the robustness of imitation learning methods by generating corrective data to account for compounding errors and disturbances. While existing methods rely on interactive expert labeling, additional offline datasets, or domain-specific invariances, our approach requires minimal additional assumptions beyond access to expert data. The key insight is to leverage… ▽ More We present a new technique to enhance the robustness of imitation learning methods by generating corrective data to account for compounding errors and disturbances. While existing methods rely on interactive expert labeling, additional offline datasets, or domain-specific invariances, our approach requires minimal additional assumptions beyond access to expert data. The key insight is to leverage local continuity in the environment dynamics to generate corrective labels. Our method first constructs a dynamics model from the expert demonstration, encouraging local Lipschitz continuity in the learned model. In locally continuous regions, this model allows us to generate corrective labels within the neighborhood of the demonstrations but beyond the actual set of states and actions in the dataset. Training on this augmented data enhances the agent's ability to recover from perturbations and deal with compounding errors. We demonstrate the effectiveness of our generated labels through experiments in a variety of robotics domains in simulation that have distinct forms of continuity and discontinuity, including classic control problems, drone flying, navigation with high-dimensional sensor observations, legged locomotion, and tabletop manipulation. △ Less

Submitted 3 June, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

arXiv:2310.07018 [pdf, other]

NEWTON: Are Large Language Models Capable of Physical Reasoning?

Authors: Yi Ru Wang, Jiafei Duan, Dieter Fox, Siddhartha Srinivasa

Abstract: Large Language Models (LLMs), through their contextualized representations, have been empirically proven to encapsulate syntactic, semantic, word sense, and common-sense knowledge. However, there has been limited exploration of their physical reasoning abilities, specifically concerning the crucial attributes for comprehending everyday objects. To address this gap, we introduce NEWTON, a repositor… ▽ More Large Language Models (LLMs), through their contextualized representations, have been empirically proven to encapsulate syntactic, semantic, word sense, and common-sense knowledge. However, there has been limited exploration of their physical reasoning abilities, specifically concerning the crucial attributes for comprehending everyday objects. To address this gap, we introduce NEWTON, a repository and benchmark for evaluating the physics reasoning skills of LLMs. Further, to enable domain-specific adaptation of this benchmark, we present a pipeline to enable researchers to generate a variant of this benchmark that has been customized to the objects and attributes relevant for their application. The NEWTON repository comprises a collection of 2800 object-attribute pairs, providing the foundation for generating infinite-scale assessment templates. The NEWTON benchmark consists of 160K QA questions, curated using the NEWTON repository to investigate the physical reasoning capabilities of several mainstream language models across foundational, explicit, and implicit reasoning tasks. Through extensive empirical analysis, our results highlight the capabilities of LLMs for physical reasoning. We find that LLMs like GPT-4 demonstrate strong reasoning capabilities in scenario-based tasks but exhibit less consistency in object-attribute reasoning compared to humans (50% vs. 84%). Furthermore, the NEWTON platform demonstrates its potential for evaluating and enhancing language models, paving the way for their integration into physically grounded settings, such as robotic manipulation. Project site: https://newtonreasoning.github.io △ Less

Submitted 10 October, 2023; originally announced October 2023.

Comments: EMNLP 2023 Findings; 8 pages, 3 figures, 7 tables; Project page: https://newtonreasoning.github.io

arXiv:2309.16789 [pdf, other]

Extensible Consent Management Architectures for Data Trusts

Authors: Balambiga Ayappane, Rohith Vaidyanathan, Srinath Srinivasa, Jayati Deshmukh

Abstract: Sensitive personal information of individuals and non-personal information of organizations or communities often needs to be legitimately exchanged among different stakeholders, to provide services, maintain public health, law and order, and so on. While such exchanges are necessary, they also impose enormous privacy and security challenges. Data protection laws like GDPR for personal data and Ind… ▽ More Sensitive personal information of individuals and non-personal information of organizations or communities often needs to be legitimately exchanged among different stakeholders, to provide services, maintain public health, law and order, and so on. While such exchanges are necessary, they also impose enormous privacy and security challenges. Data protection laws like GDPR for personal data and Indian Non-personal data protection draft specify conditions and the \textit{legal capacity} in which personal and non-personal information can be solicited and disseminated further. But there is a dearth of formalisms for specifying legal capacities and jurisdictional boundaries, so that open-ended exchange of such data can be implemented. This paper proposes an extensible framework for consent management in Data Trusts in which data can flow across a network through "role tunnels" established based on corresponding legal capacities. △ Less

Submitted 28 September, 2023; originally announced September 2023.

Comments: An earlier version of this paper was published in ISIC 2021

arXiv:2309.14580 [pdf, other]

CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss

Authors: Rakshith Sharma Srinivasa, Jae** Cho, Chouchang Yang, Yashas Malur Saidutta, Ching-Hua Lee, Yilin Shen, Hongxia **

Abstract: This paper considers contrastive training for cross-modal 0-shot transfer wherein a pre-trained model in one modality is used for representation learning in another domain using pairwise data. The learnt models in the latter domain can then be used for a diverse set of tasks in a zero-shot way, similar to ``Contrastive Language-Image Pre-training (CLIP)'' and ``Locked-image Tuning (LiT)'' that hav… ▽ More This paper considers contrastive training for cross-modal 0-shot transfer wherein a pre-trained model in one modality is used for representation learning in another domain using pairwise data. The learnt models in the latter domain can then be used for a diverse set of tasks in a zero-shot way, similar to ``Contrastive Language-Image Pre-training (CLIP)'' and ``Locked-image Tuning (LiT)'' that have recently gained considerable attention. Most existing works for cross-modal representation alignment (including CLIP and LiT) use the standard contrastive training objective, which employs sets of positive and negative examples to align similar and repel dissimilar training data samples. However, similarity amongst training examples has a more continuous nature, thus calling for a more `non-binary' treatment. To address this, we propose a novel loss function called Continuously Weighted Contrastive Loss (CWCL) that employs a continuous measure of similarity. With CWCL, we seek to align the embedding space of one modality with another. Owing to the continuous nature of similarity in the proposed loss function, these models outperform existing methods for 0-shot transfer across multiple models, datasets and modalities. Particularly, we consider the modality pairs of image-text and speech-text and our models achieve 5-8% (absolute) improvement over previous state-of-the-art methods in 0-shot image classification and 20-30% (absolute) improvement in 0-shot speech-to-intent classification and keyword classification. △ Less

Submitted 25 September, 2023; originally announced September 2023.

Comments: Accepted to Neural Information Processing Systems (NeurIPS) 2023 conference

arXiv:2309.06112 [pdf, ps, other]

Characterizing Latent Perspectives of Media Houses Towards Public Figures

Authors: Sharath Srivatsa, Srinath Srinivasa

Abstract: Media houses reporting on public figures, often come with their own biases stemming from their respective worldviews. A characterization of these underlying patterns helps us in better understanding and interpreting news stories. For this, we need diverse or subjective summarizations, which may not be amenable for classifying into predefined class labels. This work proposes a zero-shot approach fo… ▽ More Media houses reporting on public figures, often come with their own biases stemming from their respective worldviews. A characterization of these underlying patterns helps us in better understanding and interpreting news stories. For this, we need diverse or subjective summarizations, which may not be amenable for classifying into predefined class labels. This work proposes a zero-shot approach for non-extractive or generative characterizations of person entities from a corpus using GPT-2. We use well-articulated articles from several well-known news media houses as a corpus to build a sound argument for this approach. First, we fine-tune a GPT-2 pre-trained language model with a corpus where specific person entities are characterized. Second, we further fine-tune this with demonstrations of person entity characterizations, created from a corpus of programmatically constructed characterizations. This twice fine-tuned model is primed with manual prompts consisting of entity names that were not previously encountered in the second fine-tuning, to generate a simple sentence about the entity. The results were encouraging, when compared against actual characterizations from the corpus. △ Less

Submitted 12 September, 2023; originally announced September 2023.

arXiv:2307.06951

AI For Global Climate Cooperation 2023 Competition Proceedings

Authors: Yoshua Bengio, Prateek Gupta, Lu Li, Soham Phade, Sunil Srinivasa, Andrew Williams, Tianyu Zhang, Yang Zhang, Stephan Zheng

Abstract: The international community must collaborate to mitigate climate change and sustain economic growth. However, collaboration is hard to achieve, partly because no global authority can ensure compliance with international climate agreements. Combining AI with climate-economic simulations offers a promising solution to design international frameworks, including negotiation protocols and climate agree… ▽ More The international community must collaborate to mitigate climate change and sustain economic growth. However, collaboration is hard to achieve, partly because no global authority can ensure compliance with international climate agreements. Combining AI with climate-economic simulations offers a promising solution to design international frameworks, including negotiation protocols and climate agreements, that promote and incentivize collaboration. In addition, these frameworks should also have policy goals fulfillment, and sustained commitment, taking into account climate-economic dynamics and strategic behaviors. These challenges require an interdisciplinary approach across machine learning, economics, climate science, law, policy, ethics, and other fields. Towards this objective, we organized AI for Global Climate Cooperation, a Mila competition in which teams submitted proposals and analyses of international frameworks, based on (modifications of) RICE-N, an AI-driven integrated assessment model (IAM). In particular, RICE-N supports modeling regional decision-making using AI agents. Furthermore, the IAM then models the climate-economic impact of those decisions into the future. Whereas the first track focused only on performance metrics, the proposals submitted to the second track were evaluated both quantitatively and qualitatively. The quantitative evaluation focused on a combination of (i) the degree of mitigation of global temperature rise and (ii) the increase in economic productivity. On the other hand, an interdisciplinary panel of human experts in law, policy, sociology, economics and environmental science, evaluated the solutions qualitatively. In particular, the panel considered the effectiveness, simplicity, feasibility, ethics, and notions of climate justice of the protocols. In the third track, the participants were asked to critique and improve RICE-N. △ Less

Submitted 10 July, 2023; originally announced July 2023.

arXiv:2304.03416 [pdf, other]

To Wake-up or Not to Wake-up: Reducing Keyword False Alarm by Successive Refinement

Authors: Yashas Malur Saidutta, Rakshith Sharma Srinivasa, Ching-Hua Lee, Chouchang Yang, Yilin Shen, Hongxia **

Abstract: Keyword spotting systems continuously process audio streams to detect keywords. One of the most challenging tasks in designing such systems is to reduce False Alarm (FA) which happens when the system falsely registers a keyword despite the keyword not being uttered. In this paper, we propose a simple yet elegant solution to this problem that follows from the law of total probability. We show that… ▽ More Keyword spotting systems continuously process audio streams to detect keywords. One of the most challenging tasks in designing such systems is to reduce False Alarm (FA) which happens when the system falsely registers a keyword despite the keyword not being uttered. In this paper, we propose a simple yet elegant solution to this problem that follows from the law of total probability. We show that existing deep keyword spotting mechanisms can be improved by Successive Refinement, where the system first classifies whether the input audio is speech or not, followed by whether the input is keyword-like or not, and finally classifies which keyword was uttered. We show across multiple models with size ranging from 13K parameters to 2.41M parameters, the successive refinement technique reduces FA by up to a factor of 8 on in-domain held-out FA data, and up to a factor of 7 on out-of-domain (OOD) FA data. Further, our proposed approach is "plug-and-play" and can be applied to any deep keyword spotting model. △ Less

Submitted 6 April, 2023; originally announced April 2023.

Comments: Accepted for publication in ICASSP 2023

arXiv:2303.05508 [pdf, other]

Cherry-Picking with Reinforcement Learning : Robust Dynamic Gras** in Unstable Conditions

Authors: Yunchu Zhang, Liyiming Ke, Abhay Deshpande, Abhishek Gupta, Siddhartha Srinivasa

Abstract: Gras** small objects surrounded by unstable or non-rigid material plays a crucial role in applications such as surgery, harvesting, construction, disaster recovery, and assisted feeding. This task is especially difficult when fine manipulation is required in the presence of sensor noise and perception errors; errors inevitably trigger dynamic motion, which is challenging to model precisely. Circ… ▽ More Gras** small objects surrounded by unstable or non-rigid material plays a crucial role in applications such as surgery, harvesting, construction, disaster recovery, and assisted feeding. This task is especially difficult when fine manipulation is required in the presence of sensor noise and perception errors; errors inevitably trigger dynamic motion, which is challenging to model precisely. Circumventing the difficulty to build accurate models for contacts and dynamics, data-driven methods like reinforcement learning (RL) can optimize task performance via trial and error, reducing the need for accurate models of contacts and dynamics. Applying RL methods to real robots, however, has been hindered by factors such as prohibitively high sample complexity or the high training infrastructure cost for providing resets on hardware. This work presents CherryBot, an RL system that uses chopsticks for fine manipulation that surpasses human reactiveness for some dynamic gras** tasks. By integrating imprecise simulators, suboptimal demonstrations and external state estimation, we study how to make a real-world robot learning system sample efficient and general while reducing the human effort required for supervision. Our system shows continual improvement through 30 minutes of real-world interaction: through reactive retry, it achieves an almost 100% success rate on the demanding task of using chopsticks to grasp small objects swinging in the air. We demonstrate the reactiveness, robustness and generalizability of CherryBot to varying object shapes and dynamics (e.g., external disturbances like wind and human perturbations). Videos are available at https://goodcherrybot.github.io/. △ Less

Submitted 28 June, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

arXiv:2303.01428 [pdf, other]

PuSHR: A Multirobot System for Nonprehensile Rearrangement

Authors: Sidharth Talia, Arnav Thareja, Christoforos Mavrogiannis, Matt Schmittle, Siddhartha S. Srinivasa

Abstract: We focus on the problem of rearranging a set of objects with a team of car-like robot pushers built using off-the-shelf components. Maintaining control of pushed objects while avoiding collisions in a tight space demands highly coordinated motion that is challenging to execute on constrained hardware. Centralized replanning approaches become intractable even for small-sized problems whereas decent… ▽ More We focus on the problem of rearranging a set of objects with a team of car-like robot pushers built using off-the-shelf components. Maintaining control of pushed objects while avoiding collisions in a tight space demands highly coordinated motion that is challenging to execute on constrained hardware. Centralized replanning approaches become intractable even for small-sized problems whereas decentralized approaches often get stuck in deadlocks. Our key insight is that by carefully assigning pushing tasks to robots, we could reduce the complexity of the rearrangement task, enabling robust performance via scalable decentralized control. Based on this insight, we built PuSHR, a system that optimally assigns pushing tasks and trajectories to robots offline, and performs trajectory tracking via decentralized control online. Through an ablation study in simulation, we demonstrate that PuSHR dominates baselines ranging from purely decentralized to fully decentralized in terms of success rate and time efficiency across challenging tasks with up to 4 robots. Hardware experiments demonstrate the transfer of our system to the real world and highlight its robustness to model inaccuracies. Our code can be found at https://github.com/prl-mushr/pushr, and videos from our experiments at https://youtu.be/DIWmZerF_O8. △ Less

Submitted 2 March, 2023; originally announced March 2023.

arXiv:2303.01424 [pdf, other]

From Crowd Motion Prediction to Robot Navigation in Crowds

Authors: Sriyash Poddar, Christoforos Mavrogiannis, Siddhartha S. Srinivasa

Abstract: We focus on robot navigation in crowded environments. To navigate safely and efficiently within crowds, robots need models for crowd motion prediction. Building such models is hard due to the high dimensionality of multiagent domains and the challenge of collecting or simulating interaction-rich crowd-robot demonstrations. While there has been important progress on models for offline pedestrian mo… ▽ More We focus on robot navigation in crowded environments. To navigate safely and efficiently within crowds, robots need models for crowd motion prediction. Building such models is hard due to the high dimensionality of multiagent domains and the challenge of collecting or simulating interaction-rich crowd-robot demonstrations. While there has been important progress on models for offline pedestrian motion forecasting, transferring their performance on real robots is nontrivial due to close interaction settings and novelty effects on users. In this paper, we investigate the utility of a recent state-of-the-art motion prediction model (S-GAN) for crowd navigation tasks. We incorporate this model into a model predictive controller (MPC) and deploy it on a self-balancing robot which we subject to a diverse range of crowd behaviors in the lab. We demonstrate that while S-GAN motion prediction accuracy transfers to the real world, its value is not reflected on navigation performance, measured with respect to safety and efficiency; in fact, the MPC performs indistinguishably even when using a simple constant-velocity prediction model, suggesting that substantial model improvements might be needed to yield significant gains for crowd navigation tasks. Footage from our experiments can be found at https://youtu.be/mzFiXg8KsZ0. △ Less

Submitted 2 March, 2023; originally announced March 2023.

arXiv:2301.03957 [pdf, ps, other]

AI based approach to Trailer Generation for Online Educational Courses

Authors: Prakhar Mishra, Chaitali Diwan, Srinath Srinivasa, G. Srinivasaraghavan

Abstract: In this paper, we propose an AI based approach to Trailer Generation in the form of short videos for online educational courses. Trailers give an overview of the course to the learners and help them make an informed choice about the courses they want to learn. It also helps to generate curiosity and interest among the learners and encourages them to pursue a course. While it is possible to manuall… ▽ More In this paper, we propose an AI based approach to Trailer Generation in the form of short videos for online educational courses. Trailers give an overview of the course to the learners and help them make an informed choice about the courses they want to learn. It also helps to generate curiosity and interest among the learners and encourages them to pursue a course. While it is possible to manually generate the trailers, it requires extensive human efforts and skills over a broad spectrum of design, span selection, video editing, domain knowledge, etc., thus making it time-consuming and expensive, especially in an academic setting. The framework we propose in this work is a template based method for video trailer generation, where most of the textual content of the trailer is auto-generated and the trailer video is automatically generated, by leveraging Machine Learning and Natural Language Processing techniques. The proposed trailer is in the form of a timeline consisting of various fragments created by selecting, para-phrasing or generating content using various proposed techniques. The fragments are further enhanced by adding voice-over text, subtitles, animations, etc., to create a holistic experience. Finally, we perform user evaluation with 63 human evaluators for evaluating the trailers generated by our system and the results obtained were encouraging. △ Less

Submitted 10 January, 2023; originally announced January 2023.

Comments: 8 pages, 9 figures

arXiv:2210.12851 [pdf, other]

Lazy Incremental Search for Efficient Replanning with Bounded Suboptimality Guarantees

Authors: Jaein Lim, Mahdi Ghanei, R. Connor Lawson, Siddhartha Srinivasa, Panagiotis Tsiotras

Abstract: We present a lazy incremental search algorithm, Lifelong-GLS (L-GLS), along with its bounded suboptimal version, Bounded L-GLS (B-LGLS) that combine the search efficiency of incremental search algorithms with the evaluation efficiency of lazy search algorithms for fast replanning in problem domains where edge-evaluations are more expensive than vertex-expansions. The proposed algorithms generalize… ▽ More We present a lazy incremental search algorithm, Lifelong-GLS (L-GLS), along with its bounded suboptimal version, Bounded L-GLS (B-LGLS) that combine the search efficiency of incremental search algorithms with the evaluation efficiency of lazy search algorithms for fast replanning in problem domains where edge-evaluations are more expensive than vertex-expansions. The proposed algorithms generalize Lifelong Planning A* (LPA*) and its bounded suboptimal version, Truncated LPA* (TLPA*), within the Generalized Lazy Search (GLS) framework, so as to restrict expensive edge evaluations only to the current shortest subpath when the cost-to-come inconsistencies are propagated during repair. We also present dynamic versions of the L-GLS and B-LGLS algorithms, called Generalized D* (GD*) and Bounded Generalized D* (B-GD*), respectively, for efficient replanning with non-stationary queries, designed specifically for navigation of mobile robots. We prove that the proposed algorithms are complete and correct in finding a solution that is guaranteed not to exceed the optimal solution cost by a user-chosen factor. Our numerical and experimental results support the claim that the proposed integration of the incremental and lazy search frameworks can help find solutions faster compared to the regular incremental or regular lazy search algorithms when the underlying graph representation changes often. △ Less

Submitted 23 October, 2022; originally announced October 2022.

Comments: arXiv admin note: text overlap with arXiv:2105.12076

arXiv:2210.07077 [pdf, ps, other]

Sketching low-rank matrices with a shared column space by convex programming

Authors: Rakshith S Srinivasa, Seonho Kim, Kiryung Lee

Abstract: In many practical applications including remote sensing, multi-task learning, and multi-spectrum imaging, data are described as a set of matrices sharing a common column space. We consider the joint estimation of such matrices from their noisy linear measurements. We study a convex estimator regularized by a pair of matrix norms. The measurement model corresponds to block-wise sensing and the reco… ▽ More In many practical applications including remote sensing, multi-task learning, and multi-spectrum imaging, data are described as a set of matrices sharing a common column space. We consider the joint estimation of such matrices from their noisy linear measurements. We study a convex estimator regularized by a pair of matrix norms. The measurement model corresponds to block-wise sensing and the reconstruction is possible only when the total energy is well distributed over blocks. The first norm, which is the maximum-block-Frobenius norm, favors such a solution. This condition is analogous to the notion of low-spikiness in matrix completion or column-wise sensing. The second norm, which is a tensor norm on a pair of suitable Banach spaces, induces low-rankness in the solution together with the first norm. We demonstrate that the joint estimation provides a significant gain over the individual recovery of each matrix when the number of matrices sharing a column space and the ambient dimension of the shared column space are large relative to the number of columns in each matrix. The convex estimator is cast as a semidefinite program and an efficient ADMM algorithm is derived. The empirical behavior of the convex estimator is illustrated using Monte Carlo simulations and recovery performance is compared to existing methods in the literature. △ Less

Submitted 5 June, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

arXiv:2210.06479 [pdf, other]

Real World Offline Reinforcement Learning with Realistic Data Source

Authors: Gaoyue Zhou, Liyiming Ke, Siddhartha Srinivasa, Abhinav Gupta, Aravind Rajeswaran, Vikash Kumar

Abstract: Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience. However, current ORL benchmarks are almost entirely in simulation and utilize contrived datasets like replay buffers of online RL agents or sub-optimal trajectories, and thus hold limited relevance for real-world robotics. In this work (Real-ORL), we posi… ▽ More Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience. However, current ORL benchmarks are almost entirely in simulation and utilize contrived datasets like replay buffers of online RL agents or sub-optimal trajectories, and thus hold limited relevance for real-world robotics. In this work (Real-ORL), we posit that data collected from safe operations of closely related tasks are more practical data sources for real-world robot learning. Under these settings, we perform an extensive (6500+ trajectories collected over 800+ robot hours and 270+ human labor hour) empirical study evaluating generalization and transfer capabilities of representative ORL methods on four real-world tabletop manipulation tasks. Our study finds that ORL and imitation learning prefer different action spaces, and that ORL algorithms can generalize from leveraging offline heterogeneous data sources and outperform imitation learning. We release our dataset and implementations at URL: https://sites.google.com/view/real-orl △ Less

Submitted 12 October, 2022; originally announced October 2022.

Comments: Project website: https://sites.google.com/view/real-orl

arXiv:2209.06999 [pdf]

Data Science Approach to predict the winning Fantasy Cricket Team Dream 11 Fantasy Sports

Authors: Sachin Kumar S, Prithvi HV, C Nandini

Abstract: The evolution of digital technology and the increasing popularity of sports inspired the innovators to take the experience of users with a proclivity towards sports to a whole new different level, by introducing Fantasy Sports Platforms FSPs. The application of Data Science and Analytics is Ubiquitous in the Modern World. Data Science and Analytics open doors to gain a deeper understanding and hel… ▽ More The evolution of digital technology and the increasing popularity of sports inspired the innovators to take the experience of users with a proclivity towards sports to a whole new different level, by introducing Fantasy Sports Platforms FSPs. The application of Data Science and Analytics is Ubiquitous in the Modern World. Data Science and Analytics open doors to gain a deeper understanding and help in the decision making process. We firmly believed that we could adopt Data Science to predict the winning fantasy cricket team on the FSP, Dream 11. We built a predictive model that predicts the performance of players in a prospective game. We used a combination of Greedy and Knapsack Algorithms to prescribe the combination of 11 players to create a fantasy cricket team that has the most significant statistical odds of finishing as the strongest team thereby giving us a higher chance of winning the pot of bets on the Dream 11 FSP. We used PyCaret Python Library to help us understand and adopt the best Regressor Algorithm for our problem statement to make precise predictions. Further, we used Plotly Python Library to give us visual insights into the team, and players performances by accounting for the statistical, and subjective factors of a prospective game. The interactive plots help us to bolster the recommendations of our predictive model. You either win big, win small, or lose your bet based on the performance of the players selected for your fantasy team in the prospective game, and our model increases the probability of you winning big. △ Less

Submitted 14 September, 2022; originally announced September 2022.

arXiv:2209.04836 [pdf, other]

Git Re-Basin: Merging Models modulo Permutation Symmetries

Authors: Samuel K. Ainsworth, Jonathan Hayase, Siddhartha Srinivasa

Abstract: The success of deep learning is due in large part to our ability to solve certain massive non-convex optimization problems with relative ease. Though non-convex optimization is NP-hard, simple algorithms -- often variants of stochastic gradient descent -- exhibit surprising effectiveness in fitting large neural networks in practice. We argue that neural network loss landscapes often contain (nearl… ▽ More The success of deep learning is due in large part to our ability to solve certain massive non-convex optimization problems with relative ease. Though non-convex optimization is NP-hard, simple algorithms -- often variants of stochastic gradient descent -- exhibit surprising effectiveness in fitting large neural networks in practice. We argue that neural network loss landscapes often contain (nearly) a single basin after accounting for all possible permutation symmetries of hidden units a la Entezari et al. 2021. We introduce three algorithms to permute the units of one model to bring them into alignment with a reference model in order to merge the two models in weight space. This transformation produces a functionally equivalent set of weights that lie in an approximately convex basin near the reference model. Experimentally, we demonstrate the single basin phenomenon across a variety of model architectures and datasets, including the first (to our knowledge) demonstration of zero-barrier linear mode connectivity between independently trained ResNet models on CIFAR-10. Additionally, we identify intriguing phenomena relating model width and training time to mode connectivity. Finally, we discuss shortcomings of the linear mode connectivity hypothesis, including a counterexample to the single basin theory. △ Less

Submitted 1 March, 2023; v1 submitted 11 September, 2022; originally announced September 2022.

arXiv:2208.07004 [pdf, other]

AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N

Authors: Tianyu Zhang, Andrew Williams, Soham Phade, Sunil Srinivasa, Yang Zhang, Prateek Gupta, Yoshua Bengio, Stephan Zheng

Abstract: Comprehensive global cooperation is essential to limit global temperature increases while continuing economic development, e.g., reducing severe inequality or achieving long-term economic growth. Achieving long-term cooperation on climate change mitigation with n strategic agents poses a complex game-theoretic problem. For example, agents may negotiate and reach climate agreements, but there is no… ▽ More Comprehensive global cooperation is essential to limit global temperature increases while continuing economic development, e.g., reducing severe inequality or achieving long-term economic growth. Achieving long-term cooperation on climate change mitigation with n strategic agents poses a complex game-theoretic problem. For example, agents may negotiate and reach climate agreements, but there is no central authority to enforce adherence to those agreements. Hence, it is critical to design negotiation and agreement frameworks that foster cooperation, allow all agents to meet their individual policy objectives, and incentivize long-term adherence. This is an interdisciplinary challenge that calls for collaboration between researchers in machine learning, economics, climate science, law, policy, ethics, and other fields. In particular, we argue that machine learning is a critical tool to address the complexity of this domain. To facilitate this research, here we introduce RICE-N, a multi-region integrated assessment model that simulates the global climate and economy, and which can be used to design and evaluate the strategic outcomes for different negotiation and agreement frameworks. We also describe how to use multi-agent reinforcement learning to train rational agents using RICE-N. This framework underpinsAI for Global Climate Cooperation, a working group collaboration and competition on climate negotiation and agreement design. Here, we invite the scientific community to design and evaluate their solutions using RICE-N, machine learning, economic intuition, and other domain knowledge. More information can be found on www.ai4climatecoop.org. △ Less

Submitted 15 August, 2022; originally announced August 2022.

Comments: 12 pages (21 with appendices), 5 figures. For associated working group, see https://www.ai4climatecoop.org/

MSC Class: 93A16; 91-10; 68T07 ACM Class: I.2.11; J.2; J.4

arXiv:2207.06362 [pdf, other]

Iterative Linear Quadratic Optimization for Nonlinear Control: Differentiable Programming Algorithmic Templates

Authors: Vincent Roulet, Siddhartha Srinivasa, Maryam Fazel, Zaid Harchaoui

Abstract: We present the implementation of nonlinear control algorithms based on linear and quadratic approximations of the objective from a functional viewpoint. We present a gradient descent, a Gauss-Newton method, a Newton method, differential dynamic programming approaches with linear quadratic or quadratic approximations, various line-search strategies, and regularized variants of these algorithms. We… ▽ More We present the implementation of nonlinear control algorithms based on linear and quadratic approximations of the objective from a functional viewpoint. We present a gradient descent, a Gauss-Newton method, a Newton method, differential dynamic programming approaches with linear quadratic or quadratic approximations, various line-search strategies, and regularized variants of these algorithms. We derive the computational complexities of all algorithms in a differentiable programming framework and present sufficient optimality conditions. We compare the algorithms on several benchmarks, such as autonomous car racing using a bicycle model of a car. The algorithms are coded in a differentiable programming language in a publicly available package. △ Less

Submitted 13 July, 2022; originally announced July 2022.

Comments: This is a companion report to the arXiv report "Complexity Bounds of Iterative Linear Quadratic Optimization Algorithms for Discrete Time Nonlinear Control" <arXiv:2204.02322> by the same authors

MSC Class: 68Q25; 49M37 ACM Class: G.1.6

arXiv:2204.08405 [pdf, ps, other]

Zero-shot Entity and Tweet Characterization with Designed Conditional Prompts and Contexts

Authors: Sharath Srivatsa, Tushar Mohan, Kumari Neha, Nishchay Malakar, Ponnurangam Kumaraguru, Srinath Srinivasa

Abstract: Online news and social media have been the de facto mediums to disseminate information globally from the beginning of the last decade. However, bias in content and purpose of intentions are not regulated, and managing bias is the responsibility of content consumers. In this regard, understanding the stances and biases of news sources towards specific entities becomes important. To address this pro… ▽ More Online news and social media have been the de facto mediums to disseminate information globally from the beginning of the last decade. However, bias in content and purpose of intentions are not regulated, and managing bias is the responsibility of content consumers. In this regard, understanding the stances and biases of news sources towards specific entities becomes important. To address this problem, we use pretrained language models, which have been shown to bring about good results with no task-specific training or few-shot training. In this work, we approach the problem of characterizing Named Entities and Tweets as an open-ended text classification and open-ended fact probing problem.We evaluate the zero-shot language model capabilities of Generative Pretrained Transformer 2 (GPT-2) to characterize Entities and Tweets subjectively with human psychology-inspired and logical conditional prefixes and contexts. First, we fine-tune the GPT-2 model on a sufficiently large news corpus and evaluate subjective characterization of popular entities in the corpus by priming with prefixes. Second, we fine-tune GPT-2 with a Tweets corpus from a few popular hashtags and evaluate characterizing tweets by priming the language model with prefixes, questions, and contextual synopsis prompts. Entity characterization results were positive across measures and human evaluation. △ Less

Submitted 18 April, 2022; originally announced April 2022.

arXiv:2204.06501 [pdf, other]

Clinical trial site matching with improved diversity using fair policy learning

Authors: Rakshith S Srinivasa, Cheng Qian, Brandon Theodorou, Jeffrey Spaeder, Cao Xiao, Lucas Glass, Jimeng Sun

Abstract: The ongoing pandemic has highlighted the importance of reliable and efficient clinical trials in healthcare. Trial sites, where the trials are conducted, are chosen mainly based on feasibility in terms of medical expertise and access to a large group of patients. More recently, the issue of diversity and inclusion in clinical trials is gaining importance. Different patient groups may experience th… ▽ More The ongoing pandemic has highlighted the importance of reliable and efficient clinical trials in healthcare. Trial sites, where the trials are conducted, are chosen mainly based on feasibility in terms of medical expertise and access to a large group of patients. More recently, the issue of diversity and inclusion in clinical trials is gaining importance. Different patient groups may experience the effects of a medical drug/ treatment differently and hence need to be included in the clinical trials. These groups could be based on ethnicity, co-morbidities, age, or economic factors. Thus, designing a method for trial site selection that accounts for both feasibility and diversity is a crucial and urgent goal. In this paper, we formulate this problem as a ranking problem with fairness constraints. Using principles of fairness in machine learning, we learn a model that maps a clinical trial description to a ranked list of potential trial sites. Unlike existing fairness frameworks, the group membership of each trial site is non-binary: each trial site may have access to patients from multiple groups. We propose fairness criteria based on demographic parity to address such a multi-group membership scenario. We test our method on 480 real-world clinical trials and show that our model results in a list of potential trial sites that provides access to a diverse set of patients while also ensuing a high number of enrolled patients. △ Less

Submitted 13 April, 2022; originally announced April 2022.

ACM Class: J.3; I.2.1

arXiv:2204.02460 [pdf, other]

Electrostatic Brakes Enable Individual Joint Control of Underactuated, Highly Articulated Robots

Authors: Patrick Lancaster, Christoforos Mavrogiannis, Siddhartha Srinivasa, Joshua Smith

Abstract: Highly articulated organisms serve as blueprints for incredibly dexterous mechanisms, but building similarly capable robotic counterparts has been hindered by the difficulties of develo** electromechanical actuators with both the high strength and compactness of biological muscle. We develop a stackable electrostatic brake that has comparable specific tension and weight to that of muscles and in… ▽ More Highly articulated organisms serve as blueprints for incredibly dexterous mechanisms, but building similarly capable robotic counterparts has been hindered by the difficulties of develo** electromechanical actuators with both the high strength and compactness of biological muscle. We develop a stackable electrostatic brake that has comparable specific tension and weight to that of muscles and integrate it into a robotic joint. Compared to electromechanical motors, our brake-equipped joint is four times lighter and one thousand times more power efficient while exerting similar holding torques. Our joint design enables a ten degree-of-freedom robot equipped with only one motor to manipulate multiple objects simultaneously. We also show that the use of brakes allows a two-fingered robot to perform in-hand re-positioning of an object 45% more quickly and with 53% lower positioning error than without brakes. Relative to fully actuated robots, our findings suggest that robots equipped with such electrostatic brakes will have lower weight, volume, and power consumption yet retain the ability to reach arbitrary joint configurations. △ Less

Submitted 30 October, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: 18 pages, 17 figures

arXiv:2204.02371 [pdf, other]

Optical Proximity Sensing for Pose Estimation During In-Hand Manipulation

Authors: Patrick Lancaster, Pratik Gyawali, Christoforos Mavrogiannis, Siddhartha S. Srinivasa, Joshua R. Smith

Abstract: During in-hand manipulation, robots must be able to continuously estimate the pose of the object in order to generate appropriate control actions. The performance of algorithms for pose estimation hinges on the robot's sensors being able to detect discriminative geometric object features, but previous sensing modalities are unable to make such measurements robustly. The robot's fingers can occlude… ▽ More During in-hand manipulation, robots must be able to continuously estimate the pose of the object in order to generate appropriate control actions. The performance of algorithms for pose estimation hinges on the robot's sensors being able to detect discriminative geometric object features, but previous sensing modalities are unable to make such measurements robustly. The robot's fingers can occlude the view of environment- or robot-mounted image sensors, and tactile sensors can only measure at the local areas of contact. Motivated by fingertip-embedded proximity sensors' robustness to occlusion and ability to measure beyond the local areas of contact, we present the first evaluation of proximity sensor based pose estimation for in-hand manipulation. We develop a novel two-fingered hand with fingertip-embedded optical time-of-flight proximity sensors as a testbed for pose estimation during planar in-hand manipulation. Here, the in-hand manipulation task consists of the robot moving a cylindrical object from one end of its workspace to the other. We demonstrate, with statistical significance, that proximity-sensor based pose estimation via particle filtering during in-hand manipulation: a) exhibits 50% lower average pose error than a tactile-sensor based baseline; b) empowers a model predictive controller to achieve 30% lower final positioning error compared to when using tactile-sensor based pose estimates. △ Less

Submitted 30 October, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: 8 pages, 6 figures

arXiv:2204.02322 [pdf, ps, other]

Complexity Bounds of Iterative Linear Quadratic Optimization Algorithms for Discrete Time Nonlinear Control

Authors: Vincent Roulet, Siddhartha Srinivasa, Maryam Fazel, Zaid Harchaoui

Abstract: A classical approach for solving discrete time nonlinear control on a finite horizon consists in repeatedly minimizing linear quadratic approximations of the original problem around current candidate solutions. While widely popular in many domains, such an approach has mainly been analyzed locally. We observe that global convergence guarantees can be ensured provided that the linearized discrete t… ▽ More A classical approach for solving discrete time nonlinear control on a finite horizon consists in repeatedly minimizing linear quadratic approximations of the original problem around current candidate solutions. While widely popular in many domains, such an approach has mainly been analyzed locally. We observe that global convergence guarantees can be ensured provided that the linearized discrete time dynamics are surjective and costs on the state variables are strongly convex. We present how the surjectivity of the linearized dynamics can be ensured by appropriate discretization schemes given the existence of a feedback linearization scheme. We present complexity bounds of algorithms based on linear quadratic approximations through the lens of generalized Gauss-Newton methods. Our analysis uncovers several convergence phases for regularized generalized Gauss-Newton algorithms. △ Less

Submitted 5 April, 2022; originally announced April 2022.

MSC Class: 68Q25; 49M37 ACM Class: G.1.6

arXiv:2202.07074 [pdf, other]

doi 10.1109/LRA.2020.2969912

Benchmarking Robot Manipulation with the Rubik's Cube

Authors: Boling Yang, Patrick E. Lancaster, Siddhartha S. Srinivasa, Joshua R. Smith

Abstract: Benchmarks for robot manipulation are crucial to measuring progress in the field, yet there are few benchmarks that demonstrate critical manipulation skills, possess standardized metrics, and can be attempted by a wide array of robot platforms. To address a lack of such benchmarks, we propose Rubik's cube manipulation as a benchmark to measure simultaneous performance of precise manipulation and s… ▽ More Benchmarks for robot manipulation are crucial to measuring progress in the field, yet there are few benchmarks that demonstrate critical manipulation skills, possess standardized metrics, and can be attempted by a wide array of robot platforms. To address a lack of such benchmarks, we propose Rubik's cube manipulation as a benchmark to measure simultaneous performance of precise manipulation and sequential manipulation. The sub-structure of the Rubik's cube demands precise positioning of the robot's end effectors, while its highly reconfigurable nature enables tasks that require the robot to manage pose uncertainty throughout long sequences of actions. We present a protocol for quantitatively measuring both the accuracy and speed of Rubik's cube manipulation. This protocol can be attempted by any general-purpose manipulator, and only requires a standard 3x3 Rubik's cube and a flat surface upon which the Rubik's cube initially rests (e.g. a table). We demonstrate this protocol for two distinct baseline approaches on a PR2 robot. The first baseline provides a fundamental approach for pose-based Rubik's cube manipulation. The second baseline demonstrates the benchmark's ability to quantify improved performance by the system, particularly that resulting from the integration of pre-touch sensing. To demonstrate the benchmark's applicability to other robot platforms and algorithmic approaches, we present the functional blocks required to enable the HERB robot to manipulate the Rubik's cube via push-gras**. △ Less

Submitted 14 February, 2022; originally announced February 2022.

Comments: IEEE RAL

Journal ref: IEEE Robotics and Automation Letters 5.2 (2020): 2094-2099

arXiv:2202.00071 [pdf, other]

JULIA: Joint Multi-linear and Nonlinear Identification for Tensor Completion

Authors: Cheng Qian, Kejun Huang, Lucas Glass, Rakshith S. Srinivasa, Jimeng Sun

Abstract: Tensor completion aims at imputing missing entries from a partially observed tensor. Existing tensor completion methods often assume either multi-linear or nonlinear relationships between latent components. However, real-world tensors have much more complex patterns where both multi-linear and nonlinear relationships may coexist. In such cases, the existing methods are insufficient to describe t… ▽ More Tensor completion aims at imputing missing entries from a partially observed tensor. Existing tensor completion methods often assume either multi-linear or nonlinear relationships between latent components. However, real-world tensors have much more complex patterns where both multi-linear and nonlinear relationships may coexist. In such cases, the existing methods are insufficient to describe the data structure. This paper proposes a Joint mUlti-linear and nonLinear IdentificAtion (JULIA) framework for large-scale tensor completion. JULIA unifies the multi-linear and nonlinear tensor completion models with several advantages over the existing methods: 1) Flexible model selection, i.e., it fits a tensor by assigning its values as a combination of multi-linear and nonlinear components; 2) Compatible with existing nonlinear tensor completion methods; 3) Efficient training based on a well-designed alternating optimization approach. Experiments on six real large-scale tensors demonstrate that JULIA outperforms many existing tensor completion algorithms. Furthermore, JULIA can improve the performance of a class of nonlinear tensor completion methods. The results show that in some large-scale tensor completion scenarios, baseline methods with JULIA are able to obtain up to 55% lower root mean-squared-error and save 67% computational complexity. △ Less

Submitted 31 January, 2022; originally announced February 2022.

arXiv:2201.05576 [pdf, other]

AI and the Sense of Self

Authors: Srinath Srinivasa, Jayati Deshmukh

Abstract: After several winters, AI is center-stage once again, with current advances enabling a vast array of AI applications. This renewed wave of AI has brought back to the fore several questions from the past, about philosophical foundations of intelligence and common sense -- predominantly motivated by ethical concerns of AI decision-making. In this paper, we address some of the arguments that led to r… ▽ More After several winters, AI is center-stage once again, with current advances enabling a vast array of AI applications. This renewed wave of AI has brought back to the fore several questions from the past, about philosophical foundations of intelligence and common sense -- predominantly motivated by ethical concerns of AI decision-making. In this paper, we address some of the arguments that led to research interest in intelligent agents, and argue for their relevance even in today's context. Specifically we focus on the cognitive sense of "self" and its role in autonomous decision-making leading to responsible behaviour. The authors hope to make a case for greater research interest in building richer computational models of AI agents with a sense of self. △ Less

Submitted 7 January, 2022; originally announced January 2022.

Comments: Previous version of this paper was published in Jijnasa 2021

arXiv:2112.05575 [pdf, other]

doi 10.4018/978-1-7998-2975-1.ch001

Paradigms of Computational Agency

Authors: Srinath Srinivasa, Jayati Deshmukh

Abstract: Agent-based models have emerged as a promising paradigm for addressing ever increasing complexity of information systems. In its initial days in the 1990s when object-oriented modeling was at its peak, an agent was treated as a special kind of "object" that had a persistent state and its own independent thread of execution. Since then, agent-based models have diversified enormously to even open ne… ▽ More Agent-based models have emerged as a promising paradigm for addressing ever increasing complexity of information systems. In its initial days in the 1990s when object-oriented modeling was at its peak, an agent was treated as a special kind of "object" that had a persistent state and its own independent thread of execution. Since then, agent-based models have diversified enormously to even open new conceptual insights about the nature of systems in general. This paper presents a perspective on the disparate ways in which our understanding of agency, as well as computational models of agency have evolved. Advances in hardware like GPUs, that brought neural networks back to life, may also similarly infuse new life into agent-based models, as well as pave the way for advancements in research on Artificial General Intelligence (AGI). △ Less

Submitted 10 December, 2021; originally announced December 2021.

Comments: Earlier version of this paper was published as a book chapter in the book titled Novel Approaches to Information Systems Design

Journal ref: In Novel Approaches to Information Systems Design, pp. 1-19. IGI Global, 2020

arXiv:2111.11401 [pdf, other]

Balancing Efficiency and Comfort in Robot-Assisted Bite Transfer

Authors: Suneel Belkhale, Ethan K. Gordon, Yuxiao Chen, Siddhartha Srinivasa, Tapomayukh Bhattacharjee, Dorsa Sadigh

Abstract: Robot-assisted feeding in household environments is challenging because it requires robots to generate trajectories that effectively bring food items of varying shapes and sizes into the mouth while making sure the user is comfortable. Our key insight is that in order to solve this challenge, robots must balance the efficiency of feeding a food item with the comfort of each individual bite. We for… ▽ More Robot-assisted feeding in household environments is challenging because it requires robots to generate trajectories that effectively bring food items of varying shapes and sizes into the mouth while making sure the user is comfortable. Our key insight is that in order to solve this challenge, robots must balance the efficiency of feeding a food item with the comfort of each individual bite. We formalize comfort and efficiency as heuristics to incorporate in motion planning. We present an approach based on heuristics-guided bi-directional Rapidly-exploring Random Trees (h-BiRRT) that selects bite transfer trajectories of arbitrary food item geometries and shapes using our developed bite efficiency and comfort heuristics and a learned constraint model. Real-robot evaluations show that optimizing both comfort and efficiency significantly outperforms a fixed-pose based method, and users preferred our method significantly more than that of a method that maximizes only user comfort. Videos and Appendices are found on our website: https://sites.google.com/view/comfortbitetransfer-icra22/home. △ Less

Submitted 24 June, 2022; v1 submitted 22 November, 2021; originally announced November 2021.

arXiv:2111.09436 [pdf]

doi 10.1038/s41928-023-00930-2

Cryogenic Memory Technologies

Authors: Shamiul Alam, Md Shafayat Hossain, Srivatsa Rangachar Srinivasa, Ahmedullah Aziz

Abstract: The surging interest in quantum computing, space electronics, and superconducting circuits has led to new developments in cryogenic data storage technology. Quantum computers promise to far extend our processing capabilities and may allow solving currently intractable computational challenges. Even with the advent of the quantum computing era, ultra-fast and energy-efficient classical computing sy… ▽ More The surging interest in quantum computing, space electronics, and superconducting circuits has led to new developments in cryogenic data storage technology. Quantum computers promise to far extend our processing capabilities and may allow solving currently intractable computational challenges. Even with the advent of the quantum computing era, ultra-fast and energy-efficient classical computing systems are still in high demand. One of the classical platforms that can achieve this dream combination is superconducting single flux quantum (SFQ) electronics. A major roadblock towards implementing scalable quantum computers and practical SFQ circuits is the lack of suitable and compatible cryogenic memory that can operate at 4 Kelvin (or lower) temperature. Cryogenic memory is also critically important in space-based applications. A multitude of device technologies have already been explored to find suitable candidates for cryogenic data storage. Here, we review the existing and emerging variants of cryogenic memory technologies. To ensure an organized discussion, we categorize the family of cryogenic memory platforms into three types: superconducting, non-superconducting, and hybrid. We scrutinize the challenges associated with these technologies and discuss their future prospects. △ Less

Submitted 1 July, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

Comments: 21 pages, 6 figures, 1 table

Journal ref: Nature Electronics (2023)

arXiv:2111.02972 [pdf, other]

Stein Variational Probabilistic Roadmaps

Authors: Alexander Lambert, Brian Hou, Rosario Scalise, Siddhartha S. Srinivasa, Byron Boots

Abstract: Efficient and reliable generation of global path plans are necessary for safe execution and deployment of autonomous systems. In order to generate planning graphs which adequately resolve the topology of a given environment, many sampling-based motion planners resort to coarse, heuristically-driven strategies which often fail to generalize to new and varied surroundings. Further, many of these app… ▽ More Efficient and reliable generation of global path plans are necessary for safe execution and deployment of autonomous systems. In order to generate planning graphs which adequately resolve the topology of a given environment, many sampling-based motion planners resort to coarse, heuristically-driven strategies which often fail to generalize to new and varied surroundings. Further, many of these approaches are not designed to contend with partial-observability. We posit that such uncertainty in environment geometry can, in fact, help drive the sampling process in generating feasible, and probabilistically-safe planning graphs. We propose a method for Probabilistic Roadmaps which relies on particle-based Variational Inference to efficiently cover the posterior distribution over feasible regions in configuration space. Our approach, Stein Variational Probabilistic Roadmap (SV-PRM), results in sample-efficient generation of planning-graphs and large improvements over traditional sampling approaches. We demonstrate the approach on a variety of challenging planning problems, including real-world probabilistic occupancy maps and high-dof manipulation problems common in robotics. △ Less

Submitted 20 May, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

Comments: Pre-print

arXiv:2110.15205 [pdf, ps, other]

Approximately low-rank recovery from noisy and local measurements by convex program

Authors: Kiryung Lee, Rakshith Sharma Srinivasa, Marius Junge, Justin Romberg

Abstract: Low-rank matrix models have been universally useful for numerous applications, from classical system identification to more modern matrix completion in signal processing and statistics. The nuclear norm has been employed as a convex surrogate of the low-rankness since it induces a low-rank solution to inverse problems. While the nuclear norm for low rankness has an excellent analogy with the… ▽ More Low-rank matrix models have been universally useful for numerous applications, from classical system identification to more modern matrix completion in signal processing and statistics. The nuclear norm has been employed as a convex surrogate of the low-rankness since it induces a low-rank solution to inverse problems. While the nuclear norm for low rankness has an excellent analogy with the $\ell_1$ norm for sparsity through the singular value decomposition, other matrix norms also induce low-rankness. Particularly as one interprets a matrix as a linear operator between Banach spaces, various tensor product norms generalize the role of the nuclear norm. We provide a tensor-norm-constrained estimator for the recovery of approximately low-rank matrices from local measurements corrupted with noise. A tensor-norm regularizer is designed to adapt to the local structure. We derive statistical analysis of the estimator over matrix completion and decentralized sketching by applying Maurey's empirical method to tensor products of Banach spaces. The estimator provides a near-optimal error bound in a minimax sense and admits a polynomial-time algorithm for these applications. △ Less

Submitted 3 March, 2023; v1 submitted 28 October, 2021; originally announced October 2021.

arXiv:2110.04669 [pdf, other]

Leveraging Experience in Lazy Search

Authors: Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots, Siddhartha Srinivasa

Abstract: Lazy graph search algorithms are efficient at solving motion planning problems where edge evaluation is the computational bottleneck. These algorithms work by lazily computing the shortest potentially feasible path, evaluating edges along that path, and repeating until a feasible path is found. The order in which edges are selected is critical to minimizing the total number of edge evaluations: a… ▽ More Lazy graph search algorithms are efficient at solving motion planning problems where edge evaluation is the computational bottleneck. These algorithms work by lazily computing the shortest potentially feasible path, evaluating edges along that path, and repeating until a feasible path is found. The order in which edges are selected is critical to minimizing the total number of edge evaluations: a good edge selector chooses edges that are not only likely to be invalid, but also eliminates future paths from consideration. We wish to learn such a selector by leveraging prior experience. We formulate this problem as a Markov Decision Process (MDP) on the state of the search problem. While solving this large MDP is generally intractable, we show that we can compute oracular selectors that can solve the MDP during training. With access to such oracles, we use imitation learning to find effective policies. If new search problems are sufficiently similar to problems solved during training, the learned policy will choose a good edge evaluation ordering and solve the motion planning problem quickly. We evaluate our algorithms on a wide range of 2D and 7D problems and show that the learned selector outperforms baseline commonly used heuristics. We further provide a novel theoretical analysis of lazy search in a Bayesian framework as well as regret guarantees on our imitation learning based approach to motion planning. △ Less

Submitted 9 October, 2021; originally announced October 2021.

Comments: Extended journal version accepted for publication at Autonomous Robots; 17 pages. arXiv admin note: substantial text overlap with arXiv:1907.07238

arXiv:2109.10957 [pdf, other]

Real Robot Challenge: A Robotics Competition in the Cloud

Authors: Stefan Bauer, Felix Widmaier, Manuel Wüthrich, Annika Buchholz, Sebastian Stark, Anirudh Goyal, Thomas Steinbrenner, Joel Akpo, Shruti Joshi, Vincent Berenz, Vaibhav Agrawal, Niklas Funk, Julen Urain De Jesus, Jan Peters, Joe Watson, Claire Chen, Krishnan Srinivasan, Junwu Zhang, Jeffrey Zhang, Matthew R. Walter, Rishabh Madan, Charles Schaff, Takahiro Maeda, Takuma Yoneda, Denis Yarats , et al. (17 additional authors not shown)

Abstract: Dexterous manipulation remains an open problem in robotics. To coordinate efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at MPI for Intelligent Systems and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able… ▽ More Dexterous manipulation remains an open problem in robotics. To coordinate efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at MPI for Intelligent Systems and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able to control the platforms remotely by submitting code that is executed automatically, akin to a computational cluster. Using this setup, i) we host robotics competitions, where teams from anywhere in the world access our platforms to tackle challenging tasks ii) we publish the datasets collected during these competitions (consisting of hundreds of robot hours), and iii) we give researchers access to these platforms for their own projects. △ Less

Submitted 10 June, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

arXiv:2109.10652 [pdf, other]

Gotta catch 'em all: a Multistage Framework for honeypot fingerprinting

Authors: Shreyas Srinivasa, Jens Myrup Pedersen, Emmanouil Vasilomanolakis

Abstract: Honeypots are decoy systems that lure attackers by presenting them with a seemingly vulnerable system. They provide an early detection mechanism as well as a method for learning how adversaries work and think. However, over the last years, a number of researchers have shown methods for fingerprinting honeypots. This significantly decreases the value of a honeypot; if an attacker is able to recogni… ▽ More Honeypots are decoy systems that lure attackers by presenting them with a seemingly vulnerable system. They provide an early detection mechanism as well as a method for learning how adversaries work and think. However, over the last years, a number of researchers have shown methods for fingerprinting honeypots. This significantly decreases the value of a honeypot; if an attacker is able to recognize the existence of such a system, they can evade it. In this article, we revisit the honeypot identification field, by providing a holistic framework that includes state of the art and novel fingerprinting components. We decrease the probability of false positives by proposing a rigid multi-step approach for labeling a system as a honeypot. We perform extensive scans covering 2.9 billion addresses of the IPv4 space and identify a total of 21,855 honeypot instances. Moreover, we present a number of interesting side-findings such as the identification of more than 354,431 non-honeypot systems that represent potentially vulnerable servers (e.g. SSH servers with default password configurations and vulnerable versions). Lastly, we discuss countermeasures against honeypot fingerprinting techniques. △ Less

Submitted 22 September, 2021; originally announced September 2021.

arXiv:2109.07060 [pdf, other]

Analyzing Multiagent Interactions in Traffic Scenes via Topological Braids

Authors: Christoforos Mavrogiannis, Jonathan DeCastro, Siddhartha S. Srinivasa

Abstract: We focus on the problem of analyzing multiagent interactions in traffic domains. Understanding the space of behavior of real-world traffic may offer significant advantages for algorithmic design, data-driven methodologies, and benchmarking. However, the high dimensionality of the space and the stochasticity of human behavior may hinder the identification of important interaction patterns. Our key… ▽ More We focus on the problem of analyzing multiagent interactions in traffic domains. Understanding the space of behavior of real-world traffic may offer significant advantages for algorithmic design, data-driven methodologies, and benchmarking. However, the high dimensionality of the space and the stochasticity of human behavior may hinder the identification of important interaction patterns. Our key insight is that traffic environments feature significant geometric and temporal structure, leading to highly organized collective behaviors, often drawn from a small set of dominant modes. In this work, we propose a representation based on the formalism of topological braids that can summarize arbitrarily complex multiagent behavior into a compact object of dual geometric and symbolic nature, capturing critical events of interaction. This representation allows us to formally enumerate the space of outcomes in a traffic scene and characterize their complexity. We illustrate the value of the proposed representation in summarizing critical aspects of real-world traffic behavior through a case study on recent driving datasets. We show that despite the density of real-world traffic, observed behavior tends to follow highly organized patterns of low interaction. Our framework may be a valuable tool for evaluating the richness of driving datasets, but also for synthetically designing balanced training datasets or benchmarks. △ Less

Submitted 18 May, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

Comments: Accepted at ICRA

arXiv:2109.05084 [pdf, other]

Winding Through: Crowd Navigation via Topological Invariance

Authors: Christoforos Mavrogiannis, Krishna Balasubramanian, Sriyash Poddar, Anush Gandra, Siddhartha S. Srinivasa

Abstract: We focus on robot navigation in crowded environments. The challenge of predicting the motion of a crowd around a robot makes it hard to ensure human safety and comfort. Recent approaches often employ end-to-end techniques for robot control or deep architectures for high-fidelity human motion prediction. While these methods achieve important performance benchmarks in simulated domains, dataset limi… ▽ More We focus on robot navigation in crowded environments. The challenge of predicting the motion of a crowd around a robot makes it hard to ensure human safety and comfort. Recent approaches often employ end-to-end techniques for robot control or deep architectures for high-fidelity human motion prediction. While these methods achieve important performance benchmarks in simulated domains, dataset limitations and high sample complexity tend to prevent them from transferring to real-world environments. Our key insight is that a low-dimensional representation that captures critical features of crowd-robot dynamics could be sufficient to enable a robot to wind through a crowd smoothly. To this end, we mathematically formalize the act of passing between two agents as a rotation, using a notion of topological invariance. Based on this formalism, we design a cost functional that favors robot trajectories contributing higher passing progress and penalizes switching between different sides of a human. We incorporate this functional into a model predictive controller that employs a simple constant-velocity model of human motion prediction. This results in robot motion that accomplishes statistically significantly higher clearances from the crowd compared to state-of-the-art baselines while maintaining competitive levels of efficiency, across extensive simulations and challenging real-world experiments on a self-balancing robot. △ Less

Submitted 22 November, 2022; v1 submitted 10 September, 2021; originally announced September 2021.

Comments: Final version to appear at IEEE RA-L - minor acknowledgments fix

arXiv:2108.13976 [pdf, other]

WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU

Authors: Tian Lan, Sunil Srinivasa, Huan Wang, Stephan Zheng

Abstract: Deep reinforcement learning (RL) is a powerful framework to train decision-making models in complex environments. However, RL can be slow as it requires repeated interaction with a simulation of the environment. In particular, there are key system engineering bottlenecks when using RL in complex environments that feature multiple agents with high-dimensional state, observation, or action spaces. W… ▽ More Deep reinforcement learning (RL) is a powerful framework to train decision-making models in complex environments. However, RL can be slow as it requires repeated interaction with a simulation of the environment. In particular, there are key system engineering bottlenecks when using RL in complex environments that feature multiple agents with high-dimensional state, observation, or action spaces. We present WarpDrive, a flexible, lightweight, and easy-to-use open-source RL framework that implements end-to-end deep multi-agent RL on a single GPU (Graphics Processing Unit), built on PyCUDA and PyTorch. Using the extreme parallelization capability of GPUs, WarpDrive enables orders-of-magnitude faster RL compared to common implementations that blend CPU simulations and GPU models. Our design runs simulations and the agents in each simulation in parallel. It eliminates data copying between CPU and GPU. It also uses a single simulation data store on the GPU that is safely updated in-place. WarpDrive provides a lightweight Python interface and flexible environment wrappers that are easy to use and extend. Together, this allows the user to easily run thousands of concurrent multi-agent simulations and train on extremely large batches of experience. Through extensive experiments, we verify that WarpDrive provides high-throughput and scales almost linearly to many agents and parallel environments. For example, WarpDrive yields 2.9 million environment steps/second with 2000 environments and 1000 agents (at least 100x higher throughput compared to a CPU implementation) in a benchmark Tag simulation. As such, WarpDrive is a fast and extensible multi-agent RL platform to significantly accelerate research and development. △ Less

Submitted 8 October, 2021; v1 submitted 31 August, 2021; originally announced August 2021.

Comments: TL and SS contributed equally. Code is available at https://www.github.com/salesforce/warp-drive. 11 pages, 14 figures

arXiv:2108.02904 [pdf, other]

Building a Foundation for Data-Driven, Interpretable, and Robust Policy Design using the AI Economist

Authors: Alexander Trott, Sunil Srinivasa, Douwe van der Wal, Sebastien Haneuse, Stephan Zheng

Abstract: Optimizing economic and public policy is critical to address socioeconomic issues and trade-offs, e.g., improving equality, productivity, or wellness, and poses a complex mechanism design problem. A policy designer needs to consider multiple objectives, policy levers, and behavioral responses from strategic actors who optimize for their individual objectives. Moreover, real-world policies should b… ▽ More Optimizing economic and public policy is critical to address socioeconomic issues and trade-offs, e.g., improving equality, productivity, or wellness, and poses a complex mechanism design problem. A policy designer needs to consider multiple objectives, policy levers, and behavioral responses from strategic actors who optimize for their individual objectives. Moreover, real-world policies should be explainable and robust to simulation-to-reality gaps, e.g., due to calibration issues. Existing approaches are often limited to a narrow set of policy levers or objectives that are hard to measure, do not yield explicit optimal policies, or do not consider strategic behavior, for example. Hence, it remains challenging to optimize policy in real-world scenarios. Here we show that the AI Economist framework enables effective, flexible, and interpretable policy design using two-level reinforcement learning (RL) and data-driven simulations. We validate our framework on optimizing the stringency of US state policies and Federal subsidies during a pandemic, e.g., COVID-19, using a simulation fitted to real data. We find that log-linear policies trained using RL significantly improve social welfare, based on both public health and economic outcomes, compared to past outcomes. Their behavior can be explained, e.g., well-performing policies respond strongly to changes in recovery and vaccination rates. They are also robust to calibration errors, e.g., infection rates that are over or underestimated. As of yet, real-world policymaking has not seen adoption of machine learning methods at large, including RL and AI-driven simulations. Our results show the potential of AI to guide policy design and improve social welfare amidst the complexity of the real world. △ Less

Submitted 5 August, 2021; originally announced August 2021.

Comments: 41 pages, 14 figures. AT, SS, and SZ contributed equally

arXiv:2108.02755 [pdf, other]

The AI Economist: Optimal Economic Policy Design via Two-level Deep Reinforcement Learning

Authors: Stephan Zheng, Alexander Trott, Sunil Srinivasa, David C. Parkes, Richard Socher

Abstract: AI and reinforcement learning (RL) have improved many areas, but are not yet widely adopted in economic policy design, mechanism design, or economics at large. At the same time, current economic methodology is limited by a lack of counterfactual data, simplistic behavioral models, and limited opportunities to experiment with policies and evaluate behavioral responses. Here we show that machine-lea… ▽ More AI and reinforcement learning (RL) have improved many areas, but are not yet widely adopted in economic policy design, mechanism design, or economics at large. At the same time, current economic methodology is limited by a lack of counterfactual data, simplistic behavioral models, and limited opportunities to experiment with policies and evaluate behavioral responses. Here we show that machine-learning-based economic simulation is a powerful policy and mechanism design framework to overcome these limitations. The AI Economist is a two-level, deep RL framework that trains both agents and a social planner who co-adapt, providing a tractable solution to the highly unstable and novel two-level RL challenge. From a simple specification of an economy, we learn rational agent behaviors that adapt to learned planner policies and vice versa. We demonstrate the efficacy of the AI Economist on the problem of optimal taxation. In simple one-step economies, the AI Economist recovers the optimal tax policy of economic theory. In complex, dynamic economies, the AI Economist substantially improves both utilitarian social welfare and the trade-off between equality and productivity over baselines. It does so despite emergent tax-gaming strategies, while accounting for agent interactions and behavioral change more accurately than economic theory. These results demonstrate for the first time that two-level, deep RL can be used for understanding and as a complement to theory for economic design, unlocking a new computational learning-based approach to understanding economic policy. △ Less

Submitted 5 August, 2021; originally announced August 2021.

Comments: Substantial Extension of arXiv:2004.13332. SZ and AT contributed equally

arXiv:2108.01254 [pdf, other]

doi 10.1109/RO-MAN46459.2019.8956243

Desk Organization: Effect of Multimodal Inputs on Spatial Relational Learning

Authors: Ryan Rowe, Shivam Singhal, Daqing Yi, Tapomayukh Bhattacharjee, Siddhartha S. Srinivasa

Abstract: For robots to operate in a three dimensional world and interact with humans, learning spatial relationships among objects in the surrounding is necessary. Reasoning about the state of the world requires inputs from many different sensory modalities including vision ($V$) and haptics ($H$). We examine the problem of desk organization: learning how humans spatially position different objects on a pl… ▽ More For robots to operate in a three dimensional world and interact with humans, learning spatial relationships among objects in the surrounding is necessary. Reasoning about the state of the world requires inputs from many different sensory modalities including vision ($V$) and haptics ($H$). We examine the problem of desk organization: learning how humans spatially position different objects on a planar surface according to organizational ''preference''. We model this problem by examining how humans position objects given multiple features received from vision and haptic modalities. However, organizational habits vary greatly between people both in structure and adherence. To deal with user organizational preferences, we add an additional modality, ''utility'' ($U$), which informs on a particular human's perceived usefulness of a given object. Models were trained as generalized (over many different people) or tailored (per person). We use two types of models: random forests, which focus on precise multi-task classification, and Markov logic networks, which provide an easily interpretable insight into organizational habits. The models were applied to both synthetic data, which proved to be learnable when using fixed organizational constraints, and human-study data, on which the random forest achieved over 90% accuracy. Over all combinations of $\{H, U, V\}$ modalities, $UV$ and $HUV$ were the most informative for organization. In a follow-up study, we gauged participants preference of desk organizations by a generalized random forest organization vs. by a random model. On average, participants rated the random forest models as 4.15 on a 5-point Likert scale compared to 1.84 for the random model △ Less

Submitted 2 August, 2021; originally announced August 2021.

Comments: 8 pages, 7 figures

ACM Class: I.2.9

Journal ref: 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) (pp. 1-8). IEEE

arXiv:2105.12076 [pdf, other]

Lazy Lifelong Planning for Efficient Replanning in Graphs with Expensive Edge Evaluation

Authors: Jaein Lim, Siddhartha Srinivasa, Panagiotis Tsiotras

Abstract: We present an incremental search algorithm, called Lifelong-GLS, which combines the vertex efficiency of Lifelong Planning A* (LPA*) and the edge efficiency of Generalized Lazy Search (GLS) for efficient replanning on dynamic graphs where edge evaluation is expensive. We use a lazily evaluated LPA* to repair the cost-to-come inconsistencies of the relevant region of the current search tree based o… ▽ More We present an incremental search algorithm, called Lifelong-GLS, which combines the vertex efficiency of Lifelong Planning A* (LPA*) and the edge efficiency of Generalized Lazy Search (GLS) for efficient replanning on dynamic graphs where edge evaluation is expensive. We use a lazily evaluated LPA* to repair the cost-to-come inconsistencies of the relevant region of the current search tree based on the previous search results, and then we restrict the expensive edge evaluations only to the current shortest subpath as in the GLS framework. The proposed algorithm is complete and correct in finding the optimal solution in the current graph, if one exists. We also show that the search returns a bounded suboptimal solution, if an inflated heuristic edge weight is used and the tree repairing propagation is truncated early for faster search. Finally, we show the efficiency of the proposed algorithm compared to the standard LPA* and the GLS algorithms over consecutive search episodes in a dynamic environment. For each search, the proposed algorithm reduces the edge evaluations by a significant amount compared to the LPA*. Both the number of vertex expansions and the number of edge evaluations are reduced substantially compared to GLS, as the proposed algorithm utilizes previous search results to facilitate the new search. △ Less

Submitted 25 May, 2021; originally announced May 2021.

arXiv:2105.02087 [pdf, other]

doi 10.1109/LRA.2021.3129139

Benchmarking Structured Policies and Policy Optimization for Real-World Dexterous Object Manipulation

Authors: Niklas Funk, Charles Schaff, Rishabh Madan, Takuma Yoneda, Julen Urain De Jesus, Joe Watson, Ethan K. Gordon, Felix Widmaier, Stefan Bauer, Siddhartha S. Srinivasa, Tapomayukh Bhattacharjee, Matthew R. Walter, Jan Peters

Abstract: Dexterous manipulation is a challenging and important problem in robotics. While data-driven methods are a promising approach, current benchmarks require simulation or extensive engineering support due to the sample inefficiency of popular methods. We present benchmarks for the TriFinger system, an open-source robotic platform for dexterous manipulation and the focus of the 2020 Real Robot Challen… ▽ More Dexterous manipulation is a challenging and important problem in robotics. While data-driven methods are a promising approach, current benchmarks require simulation or extensive engineering support due to the sample inefficiency of popular methods. We present benchmarks for the TriFinger system, an open-source robotic platform for dexterous manipulation and the focus of the 2020 Real Robot Challenge. The benchmarked methods, which were successful in the challenge, can be generally described as structured policies, as they combine elements of classical robotics and modern policy optimization. This inclusion of inductive biases facilitates sample efficiency, interpretability, reliability and high performance. The key aspects of this benchmarking is validation of the baselines across both simulation and the real system, thorough ablation study over the core features of each solution, and a retrospective analysis of the challenge as a manipulation benchmark. The code and demo videos for this work can be found on our website (https://sites.google.com/view/benchmark-rrc). △ Less

Submitted 8 December, 2021; v1 submitted 5 May, 2021; originally announced May 2021.

Journal ref: IEEE Robotics and Automation Letters 7 (2022) 478-485

Showing 1–50 of 130 results for author: Srinivasa, S