Showing 1–50 of 54 results for author: Faust, A

  1. arXiv:2404.11018  [pdf, other]

    cs.LG cs.AI cs.CL

    Many-Shot In-Context Learning

    Authors: Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Luis Rosias, Stephanie Chan, Biao Zhang, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle

    Abstract: Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, without any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples -- the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative…

    Submitted 22 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  2. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  3. arXiv:2403.03950  [pdf, other]

    cs.LG cs.AI stat.ML

    Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

    Authors: Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taïga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal

    Abstract: Value functions are a central component of deep reinforcement learning (RL). These functions, parameterized by neural networks, are trained using a mean squared error regression objective to match bootstrapped target values. However, scaling value-based RL methods that use regression to large networks, such as high-capacity Transformers, has proven challenging. This difficulty is in stark contrast…

    Submitted 6 March, 2024; originally announced March 2024.

  4. arXiv:2402.05821  [pdf, other]

    cs.LG cs.NE

    Guided Evolution with Binary Discriminators for ML Program Search

    Authors: John D. Co-Reyes, Yingjie Miao, George Tucker, Aleksandra Faust, Esteban Real

    Abstract: How to automatically design better machine learning programs is an open problem within AutoML. While evolution has been a popular tool to search for better ML programs, using learning itself to guide the search has been less successful and less understood on harder problems but has the promise to dramatically increase the speed and final performance of the optimization process. We propose guiding…

    Submitted 8 February, 2024; originally announced February 2024.

  5. arXiv:2311.18751  [pdf, other]

    cs.LG cs.AI cs.CL

    Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web

    Authors: Hiroki Furuta, Yutaka Matsuo, Aleksandra Faust, Izzeddin Gur

    Abstract: Language model agents (LMA) recently emerged as a promising paradigm on multi-step decision making tasks, often outperforming humans and other reinforcement learning agents. Despite the promise, their performance on real-world applications that often involve combinations of tasks is still underexplored. In this work, we introduce a new benchmark, called CompWoB -- 50 new compositional web automatio…

    Submitted 4 February, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: Code: https://github.com/google-research/google-research/tree/master/compositional_rl/compwob

  6. arXiv:2311.02462  [pdf, ps, other]

    cs.AI

    Levels of AGI for Operationalizing Progress on the Path to AGI

    Authors: Meredith Ringel Morris, Jascha Sohl-dickstein, Noah Fiedel, Tris Warkentin, Allan Dafoe, Aleksandra Faust, Clement Farabet, Shane Legg

    Abstract: We propose a framework for classifying the capabilities and behavior of Artificial General Intelligence (AGI) models and their precursors. This framework introduces levels of AGI performance, generality, and autonomy, providing a common language to compare models, assess risks, and measure progress along the path to AGI. To develop our framework, we analyze existing definitions of AGI, and distill…

    Submitted 5 June, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: version 4 - Position Paper accepted to ICML 2024. Note that due to ICML position paper titling format requirements, the title has changed slightly from that of the original arXiv pre-print. The original pre-print title was "Levels of AGI: Operationalizing Progress on the Path to AGI" but the official published title for ICML 2024 is "Levels of AGI for Operationalizing Progress on the Path to AGI"

    Journal ref: Proceedings of ICML 2024

  7. arXiv:2310.08710  [pdf, other]

    cs.RO cs.LG

    Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research

    Authors: Cole Gulino, Justin Fu, Wenjie Luo, George Tucker, Eli Bronstein, Yiren Lu, Jean Harb, Xinlei Pan, Yan Wang, Xiangyu Chen, John D. Co-Reyes, Rishabh Agarwal, Rebecca Roelofs, Yao Lu, Nico Montali, Paul Mougin, Zoey Yang, Brandyn White, Aleksandra Faust, Rowan McAllister, Dragomir Anguelov, Benjamin Sapp

    Abstract: Simulation is an essential tool to develop and benchmark autonomous vehicle planning software in a safe and cost-effective manner. However, realistic simulation requires accurate modeling of nuanced and complex multi-agent interactive behaviors. To address these challenges, we introduce Waymax, a new data-driven simulator for autonomous driving in multi-agent scenes, designed for large-scale simul…

    Submitted 12 October, 2023; originally announced October 2023.

  8. arXiv:2307.12856  [pdf, other]

    cs.LG cs.AI cs.CL

    A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis

    Authors: Izzeddin Gur, Hiroki Furuta, Austin Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust

    Abstract: Pre-trained large language models (LLMs) have recently achieved better generalization and sample efficiency in autonomous web automation. However, the performance on real-world websites has still suffered from (1) open domainness, (2) limited context length, and (3) lack of inductive bias on HTML. We introduce WebAgent, an LLM-driven agent that learns from self-experience to complete tasks on real…

    Submitted 25 February, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: Accepted to ICLR 2024 (Oral)

  9. arXiv:2307.00184  [pdf, other]

    cs.CL cs.AI cs.CY cs.HC

    Personality Traits in Large Language Models

    Authors: Greg Serapio-García, Mustafa Safdari, Clément Crepy, Luning Sun, Stephen Fitz, Peter Romero, Marwa Abdulhai, Aleksandra Faust, Maja Matarić

    Abstract: The advent of large language models (LLMs) has revolutionized natural language processing, enabling the generation of coherent and contextually relevant human-like text. As LLMs increasingly power conversational agents used by the general public world-wide, the synthetic personality embedded in these models, by virtue of training on large amounts of human data, is becoming increasingly important.…

    Submitted 21 September, 2023; v1 submitted 30 June, 2023; originally announced July 2023.

    MSC Class: 68T35 ACM Class: I.2.7

  10. arXiv:2306.08888  [pdf, other]

    cs.AR cs.LG

    ArchGym: An Open-Source Gymnasium for Machine Learning Assisted Architecture Design

    Authors: Srivatsan Krishnan, Amir Yazdanbaksh, Shvetank Prakash, Jason Jabbour, Ikechukwu Uchendu, Susobhan Ghosh, Behzad Boroujerdian, Daniel Richins, Devashree Tripathy, Aleksandra Faust, Vijay Janapa Reddi

    Abstract: Machine learning is a prevalent approach to tame the complexity of design space exploration for domain-specific architectures. Using ML for design space exploration poses challenges. First, it's not straightforward to identify the suitable algorithm from an increasing pool of ML methods. Second, assessing the trade-offs between performance and sample efficiency across these methods is inconclusive…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: International Symposium on Computer Architecture (ISCA 2023)

  11. arXiv:2306.07580  [pdf, other]

    cs.RO

    SayTap: Language to Quadrupedal Locomotion

    Authors: Yujin Tang, Wenhao Yu, Jie Tan, Heiga Zen, Aleksandra Faust, Tatsuya Harada

    Abstract: Large language models (LLMs) have demonstrated the potential to perform high-level planning. Yet, it remains a challenge for LLMs to comprehend low-level commands, such as joint angle targets or motor torques. This paper proposes an approach to use foot contact patterns as an interface that bridges human commands in natural language and a locomotion controller that outputs these low-level commands…

    Submitted 14 September, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

  12. arXiv:2305.11854  [pdf, other]

    cs.LG cs.AI stat.ML

    Multimodal Web Navigation with Instruction-Finetuned Foundation Models

    Authors: Hiroki Furuta, Kuang-Huei Lee, Ofir Nachum, Yutaka Matsuo, Aleksandra Faust, Shixiang Shane Gu, Izzeddin Gur

    Abstract: The progress of autonomous web navigation has been hindered by the dependence on billions of exploratory interactions via online reinforcement learning, and domain-specific model designs that make it difficult to leverage generalization from rich out-of-domain data. In this work, we study data-driven offline training for web agents with vision-language foundation models. We propose an instruction-…

    Submitted 25 February, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted to ICLR 2024. Website: https://sites.google.com/view/mm-webnav/

  13. arXiv:2304.00432  [pdf, other]

    eess.SY

    Multi-Agent Reachability Calibration with Conformal Prediction

    Authors: Anish Muthali, Haotian Shen, Sampada Deglurkar, Michael H. Lim, Rebecca Roelofs, Aleksandra Faust, Claire Tomlin

    Abstract: We investigate methods to provide safety assurances for autonomous agents that incorporate predictions of other, uncontrolled agents' behavior into their own trajectory planning. Given a learning-based forecasting model that predicts agents' trajectories, we introduce a method for providing probabilistic assurances on the model's prediction error with calibrated confidence intervals. Through quant…

    Submitted 13 December, 2023; v1 submitted 1 April, 2023; originally announced April 2023.

  14. arXiv:2212.11419  [pdf, other]

    cs.AI cs.RO

    Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios

    Authors: Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Rebecca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine

    Abstract: Imitation learning (IL) is a simple and powerful way to use high-quality human driving data, which can be collected at scale, to produce human-like behavior. However, policies based on imitation learning alone often fail to sufficiently account for safety and reliability concerns. In this paper, we show how imitation learning combined with reinforcement learning using simple rewards can substantia…

    Submitted 10 August, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    ACM Class: I.2.9; I.2.6

  15. arXiv:2211.16385  [pdf, other]

    cs.AR cs.AI cs.LG cs.MA

    Multi-Agent Reinforcement Learning for Microprocessor Design Space Exploration

    Authors: Srivatsan Krishnan, Natasha Jaques, Shayegan Omidshafiei, Dan Zhang, Izzeddin Gur, Vijay Janapa Reddi, Aleksandra Faust

    Abstract: Microprocessor architects are increasingly resorting to domain-specific customization in the quest for high-performance and energy-efficiency. As the systems grow in complexity, fine-tuning architectural parameters across multiple sub-systems (e.g., datapath, memory blocks in different hierarchies, interconnects, compiler optimization, etc.) quickly results in a combinatorial explosion of design s…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: Workshop on ML for Systems at NeurIPS 2022

  16. CLUTR: Curriculum Learning via Unsupervised Task Representation Learning

    Authors: Abdus Salam Azad, Izzeddin Gur, Jasper Emhoff, Nathaniel Alexis, Aleksandra Faust, Pieter Abbeel, Ion Stoica

    Abstract: Reinforcement Learning (RL) algorithms are often known for sample inefficiency and difficult generalization. Recently, Unsupervised Environment Design (UED) emerged as a new paradigm for zero-shot generalization by simultaneously learning a task distribution and agent policies on the generated tasks. This is a non-stationary process where the task distribution evolves along with agent policies; cr…

    Submitted 7 March, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: Preprint, Currently Under Review

  17. arXiv:2210.03945  [pdf, other]

    cs.LG cs.AI

    Understanding HTML with Large Language Models

    Authors: Izzeddin Gur, Ofir Nachum, Yingjie Miao, Mustafa Safdari, Austin Huang, Aakanksha Chowdhery, Sharan Narang, Noah Fiedel, Aleksandra Faust

    Abstract: Large language models (LLMs) have shown exceptional performance on a variety of natural language tasks. Yet, their capabilities for HTML understanding -- i.e., parsing the raw HTML of a webpage, with applications to automation of web-based tasks, crawling, and browser-assisted retrieval -- have not been fully explored. We contribute HTML understanding models (fine-tuned LLMs) and an in-depth analy…

    Submitted 19 May, 2023; v1 submitted 8 October, 2022; originally announced October 2022.

  18. arXiv:2205.12648  [pdf, other]

    cs.LG cs.AI

    Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization

    Authors: Sungryull Sohn, Hyunjae Woo, Jongwook Choi, lyubing qiang, Izzeddin Gur, Aleksandra Faust, Honglak Lee

    Abstract: We tackle real-world problems with complex structures beyond the pixel-based game or simulator. We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph that defines a set of subtasks and their dependencies that are unknown to the agent. Different from the previous meta-RL methods trying to directly infer the unstructured task embedding, our mul…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to UAI 2022 as an oral presentation

  19. arXiv:2205.05748  [pdf, other]

    cs.LG cs.RO

    Tiny Robot Learning: Challenges and Directions for Machine Learning in Resource-Constrained Robots

    Authors: Sabrina M. Neuman, Brian Plancher, Bardienus P. Duisterhof, Srivatsan Krishnan, Colby Banbury, Mark Mazumder, Shvetank Prakash, Jason Jabbour, Aleksandra Faust, Guido C. H. E. de Croon, Vijay Janapa Reddi

    Abstract: Machine learning (ML) has become a pervasive tool across computing systems. An emerging application that stress-tests the challenges of ML system design is tiny robot learning, the deployment of ML on resource-constrained low-cost autonomous robots. Tiny robot learning lies at the intersection of embedded systems, robotics, and ML, compounding the challenges of these domains. Tiny robot learning i…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: 4 pages, 3 figures, 1 table, in IEEE AICAS 2022

  20. arXiv:2204.10898  [pdf, other]

    cs.RO cs.AR

    Roofline Model for UAVs: A Bottleneck Analysis Tool for Onboard Compute Characterization of Autonomous Unmanned Aerial Vehicles

    Authors: Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Ninad Jadhav, Aleksandra Faust, Vijay Janapa Reddi

    Abstract: We introduce an early-phase bottleneck analysis and characterization model called the F-1 for designing computing systems that target autonomous Unmanned Aerial Vehicles (UAVs). The model provides insights by exploiting the fundamental relationships between various components in the autonomous UAV, such as sensor, compute, and body dynamics. To guarantee safe operation while maximizing the perform…

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: To Appear in 2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). arXiv admin note: substantial text overlap with arXiv:2111.03792

  21. arXiv:2204.04292  [pdf, other]

    cs.LG

    Evolving Pareto-Optimal Actor-Critic Algorithms for Generalizability and Stability

    Authors: Juan Jose Garau-Luis, Yingjie Miao, John D. Co-Reyes, Aaron Parisi, Jie Tan, Esteban Real, Aleksandra Faust

    Abstract: Generalizability and stability are two key objectives for operating reinforcement learning (RL) agents in the real world. Designing RL algorithms that optimize these objectives can be a costly and painstaking process. This paper presents MetaPG, an evolutionary method for automated design of actor-critic loss functions. MetaPG explicitly optimizes for generalizability and performance, and implicit…

    Submitted 24 April, 2023; v1 submitted 8 April, 2022; originally announced April 2022.

  22. arXiv:2201.08896  [pdf, other]

    cs.LG cs.AI

    Environment Generation for Zero-Shot Compositional Reinforcement Learning

    Authors: Izzeddin Gur, Natasha Jaques, Yingjie Miao, Jongwook Choi, Manoj Tiwari, Honglak Lee, Aleksandra Faust

    Abstract: Many real-world problems are compositional - solving them requires completing interdependent sub-tasks, either in series or in parallel, that can be represented as a dependency graph. Deep reinforcement learning (RL) agents often struggle to learn such complex tasks due to the long time horizons and sparse rewards. To address this problem, we present Compositional Design of Environments (CoDE), wh…

    Submitted 21 January, 2022; originally announced January 2022.

    Comments: Published in NeurIPS 2021

  23. Automated Reinforcement Learning (AutoRL): A Survey and Open Problems

    Authors: Jack Parker-Holder, Raghu Rajan, Xingyou Song, André Biedenkapp, Yingjie Miao, Theresa Eimer, Baohe Zhang, Vu Nguyen, Roberto Calandra, Aleksandra Faust, Frank Hutter, Marius Lindauer

    Abstract: The combination of Reinforcement Learning (RL) with deep learning has led to a series of impressive feats, with many believing (deep) RL provides a path towards generally capable agents. However, the success of RL agents is often highly sensitive to design choices in the training process, which may require tedious and error-prone manual tuning. This makes it challenging to use RL for new problems,…

    Submitted 2 June, 2022; v1 submitted 11 January, 2022; originally announced January 2022.

    Comments: Published in JAIR. Co-first authors and co-last authors are listed in alphabetical order

    MSC Class: 68T01 ACM Class: I.2.6

    Journal ref: Journal of Artificial Intelligence Research 74 (2022) 517-568

  24. arXiv:2112.09456  [pdf, other]

    cs.AI cs.LG cs.RO eess.SY

    Compositional Learning-based Planning for Vision POMDPs

    Authors: Sampada Deglurkar, Michael H. Lim, Johnathan Tucker, Zachary N. Sunberg, Aleksandra Faust, Claire J. Tomlin

    Abstract: The Partially Observable Markov Decision Process (POMDP) is a powerful framework for capturing decision-making problems that involve state and transition uncertainty. However, most current POMDP planners cannot effectively handle high-dimensional image observations prevalent in real world applications, and often require lengthy online training that requires interaction with the environment. In thi…

    Submitted 2 December, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

  25. arXiv:2111.12872  [pdf, other]

    cs.CV cs.CL

    Less is More: Generating Grounded Navigation Instructions from Landmarks

    Authors: Su Wang, Ceslee Montgomery, Jordi Orbay, Vighnesh Birodkar, Aleksandra Faust, Izzeddin Gur, Natasha Jaques, Austin Waters, Jason Baldridge, Peter Anderson

    Abstract: We study the automatic generation of navigation instructions from 360-degree images captured on indoor routes. Existing generators suffer from poor visual grounding, causing them to rely on language priors and hallucinate objects. Our MARKY-MT5 system addresses this by focusing on visual landmarks; it comprises a first stage landmark detector and a second stage generator -- a multimodal, multiling…

    Submitted 4 April, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: CVPR 2022 Camera-ready

  26. arXiv:2111.03792   

    cs.RO

    Roofline Model for UAVs: A Bottleneck Analysis Tool for Designing Compute Systems for Autonomous Drones

    Authors: Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Aleksandra Faust, Vijay Janapa Reddi

    Abstract: We present a bottleneck analysis tool for designing compute systems for autonomous Unmanned Aerial Vehicles (UAV). The tool provides insights by exploiting the fundamental relationships between various components in the autonomous UAV such as sensor, compute, body dynamics. To guarantee safe operation while maximizing the performance (e.g., velocity) of the UAV, the compute, sensor, and other mech…

    Submitted 15 June, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

    Comments: The latest and updated version with conference is available here: arXiv:2204.10898

  27. arXiv:2109.07578  [pdf, other]

    cs.LG cs.AI cs.RO eess.SY

    Multi-Task Learning with Sequence-Conditioned Transporter Networks

    Authors: Michael H. Lim, Andy Zeng, Brian Ichter, Maryam Bandari, Erwin Coumans, Claire Tomlin, Stefan Schaal, Aleksandra Faust

    Abstract: Enabling robots to solve multiple manipulation tasks has a wide range of industrial applications. While learning-based approaches enjoy flexibility and generalizability, scaling these approaches to solve such compositional tasks remains a challenge. In this work, we aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling. First, we propose a new suite of be…

    Submitted 15 September, 2021; originally announced September 2021.

  28. arXiv:2106.02229  [pdf, other]

    cs.LG cs.AI cs.CV

    Differentiable Architecture Search for Reinforcement Learning

    Authors: Yingjie Miao, Xingyou Song, John D. Co-Reyes, Daiyi Peng, Summer Yue, Eugene Brevdo, Aleksandra Faust

    Abstract: In this paper, we investigate the fundamental question: To what extent are gradient-based neural architecture search (NAS) techniques applicable to RL? Using the original DARTS as a convenient baseline, we discover that the discrete architectures found can achieve up to 250% performance compared to manual architecture designs on both discrete and continuous action space environments across off-pol…

    Submitted 15 November, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

    Comments: Published as a conference paper at the first Automated Machine Learning Conference (AutoML-Conf) 2022. Code can be found at https://github.com/google/brain_autorl/tree/main/rl_darts

  29. arXiv:2104.07750  [pdf, other]

    cs.AI cs.MA

    Joint Attention for Multi-Agent Coordination and Social Learning

    Authors: Dennis Lee, Natasha Jaques, Chase Kew, Jiaxing Wu, Douglas Eck, Dale Schuurmans, Aleksandra Faust

    Abstract: Joint attention -- the ability to purposefully coordinate attention with another agent, and mutually attend to the same thing -- is a critical component of human social cognition. In this paper, we ask whether joint attention can be useful as a mechanism for improving multi-agent coordination and social learning. We first develop deep reinforcement learning (RL) agents with a recurrent visual atten…

    Submitted 7 August, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

  30. arXiv:2103.01991  [pdf, other]

    cs.LG cs.AI cs.MA

    Adversarial Environment Generation for Learning to Navigate the Web

    Authors: Izzeddin Gur, Natasha Jaques, Kevin Malta, Manoj Tiwari, Honglak Lee, Aleksandra Faust

    Abstract: Learning to autonomously navigate the web is a difficult sequential decision making task. The state and action spaces are large and combinatorial in nature, and websites are dynamic environments consisting of several pages. One of the bottlenecks of training web navigation agents is providing a learnable curriculum of training environments that can cover the large variety of real-world websites. T…

    Submitted 2 March, 2021; originally announced March 2021.

    Comments: Presented at Deep RL Workshop, NeurIPS, 2020

  31. arXiv:2102.02988  [pdf, other]

    cs.RO cs.AI cs.AR cs.LG

    AutoPilot: Automating SoC Design Space Exploration for SWaP Constrained Autonomous UAVs

    Authors: Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Paul Whatmough, Aleksandra Faust, Sabrina Neuman, Gu-Yeon Wei, David Brooks, Vijay Janapa Reddi

    Abstract: Building domain-specific accelerators for autonomous unmanned aerial vehicles (UAVs) is challenging due to a lack of systematic methodology for designing onboard compute. Balancing a computing system for a UAV requires considering both the cyber (e.g., sensor rate, compute performance) and physical (e.g., payload weight) characteristics that affect overall performance. Iterating over the many comp…

    Submitted 10 September, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

  32. arXiv:2101.03958  [pdf, other]

    cs.LG cs.AI cs.NE

    Evolving Reinforcement Learning Algorithms

    Authors: John D. Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Sergey Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust

    Abstract: We propose a method for meta-learning reinforcement learning algorithms by searching over the space of computational graphs which compute the loss function for a value-based model-free RL agent to optimize. The learned algorithms are domain-agnostic and can generalize to new environments not seen during training. Our method can both learn from scratch and bootstrap off known existing algorithms, l…

    Submitted 10 November, 2022; v1 submitted 8 January, 2021; originally announced January 2021.

    Comments: ICLR 2021 Oral. See project website at https://sites.google.com/view/evolvingrl

  33. arXiv:2003.09354  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    Visual Navigation Among Humans with Optimal Control as a Supervisor

    Authors: Varun Tolani, Somil Bansal, Aleksandra Faust, Claire Tomlin

    Abstract: Real world visual navigation requires robots to operate in unfamiliar, human-occupied dynamic environments. Navigation around humans is especially difficult because it requires anticipating their future motion, which can be quite challenging. We propose an approach that combines learning-based perception with model-based optimal control to navigate among humans based only on monocular, first-perso…

    Submitted 12 February, 2021; v1 submitted 20 March, 2020; originally announced March 2020.

    Comments: Project Website: https://smlbansal.github.io/LB-WayPtNav-DH/

  34. arXiv:2003.06906  [pdf, other]

    cs.MA cs.AI cs.LG cs.RO

    Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous

    Authors: Rose E. Wang, J. Chase Kew, Dennis Lee, Tsang-Wei Edward Lee, Tingnan Zhang, Brian Ichter, Jie Tan, Aleksandra Faust

    Abstract: Collaboration requires agents to align their goals on the fly. Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans. We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous. Starting with pretrained, single-agent point to p…

    Submitted 9 November, 2020; v1 submitted 15 March, 2020; originally announced March 2020.

    Comments: CoRL 2020. The video is available at: https://youtu.be/-ydXHUtPzWE

  35. arXiv:1910.05917  [pdf, other]

    cs.RO cs.LG

    Neural Collision Clearance Estimator for Batched Motion Planning

    Authors: J. Chase Kew, Brian Ichter, Maryam Bandari, Tsang-Wei Edward Lee, Aleksandra Faust

    Abstract: We present a neural network collision checking heuristic, ClearanceNet, and a planning algorithm, CN-RRT. ClearanceNet learns to predict separation distance (minimum distance between robot and workspace) with respect to a workspace. CN-RRT then efficiently computes a motion plan by leveraging three key features of ClearanceNet. First, CN-RRT explores the space by expanding multiple nodes at the sa…

    Submitted 14 July, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

  36. arXiv:1910.03701  [pdf, other]

    cs.RO cs.LG

    Learned Critical Probabilistic Roadmaps for Robotic Motion Planning

    Authors: Brian Ichter, Edward Schmerling, Tsang-Wei Edward Lee, Aleksandra Faust

    Abstract: Sampling-based motion planning techniques have emerged as an efficient algorithmic paradigm for solving complex motion planning problems. These approaches use a set of probing samples to construct an implicit graph representation of the robot's state space, allowing arbitrarily accurate representations as the number of samples increases to infinity. In practice, however, solution trajectories only…

    Submitted 8 October, 2019; originally announced October 2019.

  37. arXiv:1910.01055  [pdf, other]

    cs.LG cs.AI cs.RO

    QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning

    Authors: Srivatsan Krishnan, Maximilian Lam, Sharad Chitlangia, Zishen Wan, Gabriel Barth-Maron, Aleksandra Faust, Vijay Janapa Reddi

    Abstract: Deep reinforcement learning continues to show tremendous potential in achieving task-level autonomy; however, its computational and energy demands remain prohibitively high. In this paper, we tackle this problem by applying quantization to reinforcement learning. To that end, we introduce a novel Reinforcement Learning (RL) training paradigm, ActorQ, to speed up actor-learner distributed…

    Submitted 13 November, 2022; v1 submitted 2 October, 2019; originally announced October 2019.

    Comments: Equal contribution from first three authors. Updating with QuaRL for sustainable (carbon emissions) RL results

    Journal ref: Published in Transactions on Machine Learning Research (07/2022)

  38. arXiv:1909.12971  [pdf, other]

    cs.AI cs.RO

    Zero-shot Imitation Learning from Demonstrations for Legged Robot Visual Navigation

    Authors: Xinlei Pan, Tingnan Zhang, Brian Ichter, Aleksandra Faust, Jie Tan, Sehoon Ha

    Abstract: Imitation learning is a popular approach for training visual navigation policies. However, collecting expert demonstrations for legged robots is challenging as these robots can be hard to control, move slowly, and cannot operate continuously for a long time. Here, we propose a zero-shot imitation learning approach for training a visual navigation policy on legged robots from human (third-person pe…

    Submitted 4 March, 2020; v1 submitted 27 September, 2019; originally announced September 2019.

    Comments: Accepted by ICRA 2020. Project website: https://sites.google.com/berkeley.edu/zero-shot-lfd/

  39. arXiv:1909.11236

    cs.RO cs.AI cs.LG eess.SY

    Learning to Seek: Autonomous Source Seeking with Deep Reinforcement Learning Onboard a Nano Drone Microcontroller

    Authors: Bardienus P. Duisterhof, Srivatsan Krishnan, Jonathan J. Cruz, Colby R. Banbury, William Fu, Aleksandra Faust, Guido C. H. E. de Croon, Vijay Janapa Reddi

    Abstract: We present fully autonomous source seeking onboard a highly constrained nano quadcopter, by contributing application-specific system and observation feature design to enable inference of a deep-RL policy onboard a nano quadcopter. Our deep-RL algorithm finds a high-performance solution to a challenging problem, even in presence of high noise levels and generalizes across real and simulation enviro…

    Submitted 15 January, 2021; v1 submitted 24 September, 2019; originally announced September 2019.

  40. arXiv:1907.04799

    cs.RO cs.AI cs.LG

    RL-RRT: Kinodynamic Motion Planning via Learning Reachability Estimators from RL Policies

    Authors: Hao-Tien Lewis Chiang, Jasmine Hsu, Marek Fiser, Lydia Tapia, Aleksandra Faust

    Abstract: This paper addresses two challenges facing sampling-based kinodynamic motion planning: a way to identify good candidate states for local transitions and the subsequent computationally intractable steering between these candidate states. Through the combination of sampling-based planning, a Rapidly Exploring Randomized Tree (RRT) and an efficient kinodynamic motion planner through machine learning,…

    Submitted 12 July, 2019; v1 submitted 10 July, 2019; originally announced July 2019.

    Comments: Accepted to Robotics and Automation Letters in June 2019

    Journal ref: Robotics and Automation Letters 2019

  41. arXiv:1906.10513

    cs.RO

    The Role of Compute in Autonomous Aerial Vehicles

    Authors: Behzad Boroujerdian, Hasan Genc, Srivatsan Krishnan, Bardienus Pieter Duisterhof, Brian Plancher, Kayvan Mansoorshahi, Marcelino Almeida, Wenzhi Cui, Aleksandra Faust, Vijay Janapa Reddi

    Abstract: Autonomous-mobile cyber-physical machines are part of our future. Specifically, unmanned-aerial-vehicles have seen a resurgence in activity with use-cases such as package delivery. These systems face many challenges such as their low-endurance caused by limited onboard-energy, hence, improving the mission-time and energy are of importance. Such improvements traditionally are delivered through bett…

    Submitted 23 June, 2019; originally announced June 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1905.06388

  42. arXiv:1906.00421

    cs.RO cs.LG

    Air Learning: A Deep Reinforcement Learning Gym for Autonomous Aerial Robot Visual Navigation

    Authors: Srivatsan Krishnan, Behzad Boroujerdian, William Fu, Aleksandra Faust, Vijay Janapa Reddi

    Abstract: We introduce Air Learning, an open-source simulator, and a gym environment for deep reinforcement learning research on resource-constrained aerial robots. Equipped with domain randomization, Air Learning exposes a UAV agent to a diverse set of challenging scenarios. We seed the toolset with point-to-point obstacle avoidance tasks in three different environments and Deep Q Networks (DQN) and Proxim…

    Submitted 13 November, 2022; v1 submitted 2 June, 2019; originally announced June 2019.

    Comments: To Appear in Springer Machine Learning Journal (Special Issue on Reinforcement Learning for Real Life). Updating the title to match the Springer Machine Learning Journal

  43. arXiv:1905.07628

    cs.LG cs.AI cs.NE stat.ML

    Evolving Rewards to Automate Reinforcement Learning

    Authors: Aleksandra Faust, Anthony Francis, Dar Mehta

    Abstract: Many continuous control tasks have easily formulated objectives, yet using them directly as a reward in reinforcement learning (RL) leads to suboptimal policies. Therefore, many classical control tasks guide RL training using complex rewards, which require tedious hand-tuning. We automate the reward search with AutoRL, an evolutionary layer over standard RL that treats reward tuning as hyperparame…

    Submitted 18 May, 2019; originally announced May 2019.

    Comments: Accepted to 6th AutoML@ICML

  44. arXiv:1905.06388

    cs.RO

    MAVBench: Micro Aerial Vehicle Benchmarking

    Authors: Behzad Boroujerdian, Hasan Genc, Srivatsan Krishnan, Wenzhi Cui, Aleksandra Faust, Vijay Janapa Reddi

    Abstract: Unmanned Aerial Vehicles (UAVs) are getting closer to becoming ubiquitous in everyday life. Among them, Micro Aerial Vehicles (MAVs) have seen an outburst of attention recently, specifically in the area with a demand for autonomy. A key challenge standing in the way of making MAVs autonomous is that researchers lack the comprehensive understanding of how performance, power, and computational bottl…

    Submitted 31 May, 2019; v1 submitted 15 May, 2019; originally announced May 2019.

    Journal ref: 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)

  45. arXiv:1902.09458

    cs.RO cs.AI cs.LG

    Long-Range Indoor Navigation with PRM-RL

    Authors: Anthony Francis, Aleksandra Faust, Hao-Tien Lewis Chiang, Jasmine Hsu, J. Chase Kew, Marek Fiser, Tsang-Wei Edward Lee

    Abstract: Long-range indoor navigation requires guiding robots with noisy sensors and controls through cluttered environments along paths that span a variety of buildings. We achieve this with PRM-RL, a hierarchical robot navigation method in which reinforcement learning agents that map noisy sensors to robot controls learn to solve short-range obstacle avoidance tasks, and then sampling-based planners map…

    Submitted 22 February, 2020; v1 submitted 25 February, 2019; originally announced February 2019.

    Comments: Accepted to T-RO

  46. arXiv:1901.10031

    cs.LG cs.AI stat.ML

    Lyapunov-based Safe Policy Optimization for Continuous Control

    Authors: Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, Mohammad Ghavamzadeh

    Abstract: We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that do not take the agent to undesirable situations. We formulate these problems as constrained Markov decision processes (CMDPs) and present safe policy optimization algorithms that are based on a Lyapunov approach to solve the…

    Submitted 11 February, 2019; v1 submitted 28 January, 2019; originally announced January 2019.

  47. arXiv:1812.09195

    cs.LG cs.CL stat.ML

    Learning to Navigate the Web

    Authors: Izzeddin Gur, Ulrich Rueckert, Aleksandra Faust, Dilek Hakkani-Tur

    Abstract: Learning in environments with large state and action spaces, and sparse rewards, can hinder a Reinforcement Learning (RL) agent's learning through trial-and-error. For instance, following natural language instructions on the Web (such as booking a flight ticket) leads to RL settings where input vocabulary and number of actionable elements on a page can grow very large. Even though recent approache…

    Submitted 21 December, 2018; originally announced December 2018.

    Comments: International Conference on Learning Representations (ICLR), 2019

  48. arXiv:1811.12651

    cs.RO

    PEARL: PrEference Appraisal Reinforcement Learning for Motion Planning

    Authors: Aleksandra Faust, Hao-Tien Lewis Chiang, Lydia Tapia

    Abstract: Robot motion planning often requires finding trajectories that balance different user intents, or preferences. One of these preferences is usually arrival at the goal, while another might be obstacle avoidance. Here, we formalize these, and similar, tasks as preference balancing tasks (PBTs) on acceleration controlled robots, and propose a motion planning solution, PrEference Appraisal Reinforceme…

    Submitted 30 November, 2018; originally announced November 2018.

    Comments: 20 pages

  49. arXiv:1809.10124

    cs.RO cs.AI cs.LG

    Learning Navigation Behaviors End-to-End with AutoRL

    Authors: Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis

    Abstract: We learn end-to-end point-to-point and path-following navigation behaviors that avoid moving obstacles. These policies receive noisy lidar observations and output robot linear and angular velocities. The policies are trained in small, static environments with AutoRL, an evolutionary automation layer around Reinforcement Learning (RL) that searches for a deep RL reward and neural network architectu…

    Submitted 1 February, 2019; v1 submitted 26 September, 2018; originally announced September 2018.

    Comments: Accepted to RA-L/ICRA 2019. Chiang and Faust contributed equally

  50. arXiv:1809.09261

    cs.AI cs.LG eess.SY

    Resilient Computing with Reinforcement Learning on a Dynamical System: Case Study in Sorting

    Authors: Aleksandra Faust, James B. Aimone, Conrad D. James, Lydia Tapia

    Abstract: Robots and autonomous agents often complete goal-based tasks with limited resources, relying on imperfect models and sensor measurements. In particular, reinforcement learning (RL) and feedback control can be used to help a robot achieve a goal. Taking advantage of this body of work, this paper formulates general computation as a feedback-control problem, which allows the agent to autonomously ove…

    Submitted 24 September, 2018; originally announced September 2018.

    Comments: 11 pages, accepted to CDC 2018. Here with additional evaluations