Showing 1–9 of 9 results for author: Sax, A

  1. arXiv:2110.04994

    cs.CV cs.AI cs.GR cs.RO

    Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans

    Authors: Ainaz Eftekhar, Alexander Sax, Roman Bachmann, Jitendra Malik, Amir Zamir

    Abstract: This paper introduces a pipeline to parametrically sample and render multi-task vision datasets from comprehensive 3D scans from the real world. Changing the sampling parameters allows one to "steer" the generated datasets to emphasize specific information. In addition to enabling interesting lines of research, we show the tooling and generated data suffice to train robust vision models. Common…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: ICCV 2021: See project website https://omnidata.vision

  2. arXiv:2103.10919

    cs.CV cs.LG

    Robustness via Cross-Domain Ensembles

    Authors: Teresa Yeo, Oğuzhan Fatih Kar, Alexander Sax, Amir Zamir

    Abstract: We present a method for making neural network predictions robust to shifts from the training data distribution. The proposed method is based on making predictions via a diverse set of cues (called 'middle domains') and ensembling them into one strong prediction. The premise of the idea is that predictions made via different cues respond differently to a distribution shift, hence one should be able…

    Submitted 3 September, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

    Comments: Project website at https://crossdomain-ensembles.epfl.ch/

  3. arXiv:2011.06698

    cs.RO cs.CV cs.LG

    Robust Policies via Mid-Level Visual Representations: An Experimental Study in Manipulation and Navigation

    Authors: Bryan Chen, Alexander Sax, Gene Lewis, Iro Armeni, Silvio Savarese, Amir Zamir, Jitendra Malik, Lerrel Pinto

    Abstract: Vision-based robotics often separates the control loop into one module for perception and a separate module for control. It is possible to train the whole system end-to-end (e.g. with deep RL), but doing it "from scratch" comes with a high sample complexity cost and the final result is often brittle, failing unexpectedly if the test environment differs from that of training. We study the effects…

    Submitted 12 November, 2020; originally announced November 2020.

    Comments: Extended version of CoRL 2020 camera ready. Supplementary released separately

  4. arXiv:2006.04096

    cs.CV cs.GR cs.LG

    Robust Learning Through Cross-Task Consistency

    Authors: Amir Zamir, Alexander Sax, Teresa Yeo, Oğuzhan Kar, Nikhil Cheerla, Rohan Suri, Zhangjie Cao, Jitendra Malik, Leonidas Guibas

    Abstract: Visual perception entails solving a wide set of tasks, e.g., object detection, depth estimation, etc. The predictions made for multiple tasks from the same image are not independent, and therefore, are expected to be consistent. We propose a broadly applicable and fully computational method for augmenting learning with Cross-Task Consistency. The proposed formulation is based on inference-path inv…

    Submitted 7 June, 2020; originally announced June 2020.

    Comments: CVPR 2020 (Oral). Project website, models, live demo at http://consistency.epfl.ch/

  5. arXiv:1912.13503

    cs.LG cs.CV cs.NE cs.RO

    Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks

    Authors: Jeffrey O Zhang, Alexander Sax, Amir Zamir, Leonidas Guibas, Jitendra Malik

    Abstract: When training a neural network for a desired task, one may prefer to adapt a pre-trained network rather than starting from randomly initialized weights. Adaptation can be useful in cases when training data is scarce, when a single learner needs to perform multiple tasks, or when one wishes to encode priors in the network. The most commonly employed approaches for network adaptation are fine-tuning…

    Submitted 30 July, 2020; v1 submitted 31 December, 2019; originally announced December 2019.

    Comments: In ECCV 2020 (Spotlight). For more, see project website and code at http://sidetuning.berkeley.edu

  6. arXiv:1912.11121

    cs.CV cs.LG cs.NE cs.RO

    Learning to Navigate Using Mid-Level Visual Priors

    Authors: Alexander Sax, Jeffrey O. Zhang, Bradley Emi, Amir Zamir, Silvio Savarese, Leonidas Guibas, Jitendra Malik

    Abstract: How much does having visual priors about the world (e.g. the fact that the world is 3D) assist in learning to perform downstream motor tasks (e.g. navigating a complex environment)? What are the consequences of not utilizing such visual priors in learning? We study these questions by integrating a generic perceptual skill set (a distance estimator, an edge detector, etc.) within a reinforcement le…

    Submitted 23 December, 2019; originally announced December 2019.

    Comments: In Conference on Robot Learning, 2019. See project website and demos at http://perceptual.actor/

  7. arXiv:1812.11971

    cs.CV cs.AI cs.LG cs.NE cs.RO

    Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Visuomotor Policies

    Authors: Alexander Sax, Bradley Emi, Amir R. Zamir, Leonidas Guibas, Silvio Savarese, Jitendra Malik

    Abstract: How much does having visual priors about the world (e.g. the fact that the world is 3D) assist in learning to perform downstream motor tasks (e.g. delivering a package)? We study this question by integrating a generic perceptual skill set (e.g. a distance estimator, an edge detector, etc.) within a reinforcement learning framework--see Figure 1. This skill set (hereafter mid-level perception) prov…

    Submitted 22 April, 2019; v1 submitted 31 December, 2018; originally announced December 2018.

    Comments: See project website, demos, and code at http://perceptual.actor

  8. arXiv:1808.10654

    cs.AI cs.CV cs.GR cs.LG cs.RO

    Gibson Env: Real-World Perception for Embodied Agents

    Authors: Fei Xia, Amir Zamir, Zhi-Yang He, Alexander Sax, Jitendra Malik, Silvio Savarese

    Abstract: Developing visual perception models for active agents and sensorimotor control are cumbersome to be done in the physical world, as existing algorithms are too slow to efficiently learn in real-time and robots are fragile and costly. This has given rise to learning-in-simulation which consequently casts a question on whether the results transfer to real-world. In this paper, we are concerned with t…

    Submitted 31 August, 2018; originally announced August 2018.

    Comments: Access the code, dataset, and project website at http://gibsonenv.vision/ . CVPR 2018

    Journal ref: CVPR 2018

  9. arXiv:1804.08328

    cs.CV cs.AI cs.LG cs.NE cs.RO

    Taskonomy: Disentangling Task Transfer Learning

    Authors: Amir Zamir, Alexander Sax, William Shen, Leonidas Guibas, Jitendra Malik, Silvio Savarese

    Abstract: Do visual tasks have a relationship, or are they unrelated? For instance, could having surface normals simplify estimating the depth of an image? Intuition answers these questions positively, implying existence of a structure among visual tasks. Knowing this structure has notable values; it is the concept underlying transfer learning and provides a principled way for identifying redundancies acros…

    Submitted 23 April, 2018; originally announced April 2018.

    Comments: CVPR 2018 (Oral). See project website and live demos at http://taskonomy.vision/