Showing 1–18 of 18 results for author: Jabri, A

  1. arXiv:2306.08068  [pdf, other]

    cs.CV cs.AI cs.LG

    DORSal: Diffusion for Object-centric Representations of Scenes et al.

    Authors: Allan Jabri, Sjoerd van Steenkiste, Emiel Hoogeboom, Mehdi S. M. Sajjadi, Thomas Kipf

    Abstract: Recent progress in 3D scene understanding enables scalable learning of representations across large datasets of diverse scenes. As a consequence, generalization to unseen scenes and objects, rendering novel views from just a single or a handful of input images, and controllable scene generation that supports editing, is now possible. However, training jointly on a large number of scenes typically…

    Submitted 2 May, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: Accepted to ICLR 2024. Project page: https://www.sjoerdvansteenkiste.com/dorsal

  2. arXiv:2306.00986  [pdf, other]

    cs.CV cs.LG stat.ML

    Diffusion Self-Guidance for Controllable Image Generation

    Authors: Dave Epstein, Allan Jabri, Ben Poole, Alexei A. Efros, Aleksander Holynski

    Abstract: Large-scale generative models are capable of producing high-quality images from detailed text descriptions. However, many aspects of an image are difficult or impossible to convey through text. We introduce self-guidance, a method that provides greater control over generated images by guiding the internal representations of diffusion models. We demonstrate that properties such as the shape, locati…

    Submitted 11 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Project page at https://dave.ml/selfguidance/

  3. arXiv:2305.08932  [pdf, other]

    cs.LG cs.AI

    MIMEx: Intrinsic Rewards from Masked Input Modeling

    Authors: Toru Lin, Allan Jabri

    Abstract: Exploring in environments with high-dimensional observations is hard. One promising approach for exploration is to use intrinsic rewards, which often boils down to estimating "novelty" of states, transitions, or trajectories with deep networks. Prior works have shown that conditional prediction objectives such as masked autoencoding can be seen as stochastic estimation of pseudo-likelihood. We sho…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: Code available at https://github.com/ToruOwO/mimex

  4. arXiv:2212.11972  [pdf, other]

    cs.LG cs.CV cs.NE

    Scalable Adaptive Computation for Iterative Generation

    Authors: Allan Jabri, David Fleet, Ting Chen

    Abstract: Natural data is redundant yet predominant architectures tile computation uniformly across their input and output space. We propose the Recurrent Interface Networks (RINs), an attention-based architecture that decouples its core computation from the dimensionality of the data, enabling adaptive computation for more scalable generation of high-dimensional data. RINs focus the bulk of computation (i.…

    Submitted 13 June, 2023; v1 submitted 22 December, 2022; originally announced December 2022.

    Comments: ICML'23. Code at https://github.com/google-research/pix2seq

  5. arXiv:2204.01784  [pdf, other]

    cs.CV

    Object Permanence Emerges in a Random Walk along Memory

    Authors: Pavel Tokmakov, Allan Jabri, Jie Li, Adrien Gaidon

    Abstract: This paper proposes a self-supervised objective for learning representations that localize objects under occlusion - a property known as object permanence. A central question is the choice of learning signal in cases of total occlusion. Rather than directly supervising the locations of invisible objects, we propose a self-supervised objective that requires neither human annotation, nor assumptions…

    Submitted 13 June, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

  6. arXiv:2203.10159  [pdf, other]

    cs.CV

    Discovering Objects that Can Move

    Authors: Zhipeng Bao, Pavel Tokmakov, Allan Jabri, Yu-Xiong Wang, Adrien Gaidon, Martial Hebert

    Abstract: This paper studies the problem of object discovery -- separating objects from the background without manual labels. Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions. However, by relying on appearance alone, these methods fail to separate objects from the background in cluttered scenes. This is a fundamental limitation since…

    Submitted 18 March, 2022; originally announced March 2022.

    Comments: Accepted to CVPR 2022

  7. arXiv:2201.08379  [pdf, other]

    cs.CV

    Learning Pixel Trajectories with Multiscale Contrastive Random Walks

    Authors: Zhangxing Bian, Allan Jabri, Alexei A. Efros, Andrew Owens

    Abstract: A range of video modeling tasks, from optical flow to multiple object tracking, share the same fundamental challenge: establishing space-time correspondence. Yet, approaches that dominate each space differ. We take a step towards bridging this gap by extending the recent contrastive random walk formulation to much denser, pixel-level space-time graphs. The main contribution is introducing hierarch…

    Submitted 4 April, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

  8. Rotational spectroscopic study and astronomical search for propiolamide in Sgr B2(N)

    Authors: E. R. Alonso, L. Kolesniková, A. Belloche, S. Mata, R. T. Garrod, A. Jabri, I. León, J. -C. Guillemin, H. S. P. Müller, K. M. Menten, J. L. Alonso

    Abstract: For all the amides detected in the interstellar medium (ISM), the corresponding nitriles or isonitriles have also been detected in the ISM, some of which have relatively high abundances. Among the abundant nitriles for which the corresponding amide has not yet been detected is cyanoacetylene (HCCCN), whose amide counterpart is propiolamide (HCCC(O)NH$_2$). With the aim of supporting searches for t…

    Submitted 9 February, 2021; originally announced February 2021.

    Comments: The article will be published as a regular paper in Astronomy and Astrophysics

    Journal ref: A&A 647, A55 (2021)

  9. arXiv:2006.14613  [pdf, other]

    cs.CV cs.LG eess.IV

    Space-Time Correspondence as a Contrastive Random Walk

    Authors: Allan Jabri, Andrew Owens, Alexei A. Efros

    Abstract: This paper proposes a simple self-supervised approach for learning a representation for visual correspondence from raw video. We cast correspondence as prediction of links in a space-time graph constructed from video. In this graph, the nodes are patches sampled from each frame, and nodes adjacent in time can share a directed edge. We learn a representation in which pairwise similarity defines tra…

    Submitted 3 December, 2020; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020 camera ready version -- Code at github.com/ajabri/videowalk

  10. arXiv:1912.11032  [pdf, other]

    cs.RO cs.AI cs.LG

    Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning

    Authors: Richard Li, Allan Jabri, Trevor Darrell, Pulkit Agrawal

    Abstract: Learning robotic manipulation tasks using reinforcement learning with sparse rewards is currently impractical due to the outrageous data requirements. Many practical tasks require manipulation of multiple objects, and the complexity of such tasks increases with the number of objects. Learning from a curriculum of increasingly complex tasks appears to be a natural solution, but unfortunately, does…

    Submitted 23 December, 2019; originally announced December 2019.

    Comments: 10 pages, 4 figures and 1 table in main article, 3 figures and 3 tables in appendix. Supplementary website and videos at https://richardrl.github.io/relational-rl/

  11. arXiv:1912.04226  [pdf, other]

    cs.AI cs.LG

    Unsupervised Curricula for Visual Meta-Reinforcement Learning

    Authors: Allan Jabri, Kyle Hsu, Ben Eysenbach, Abhishek Gupta, Sergey Levine, Chelsea Finn

    Abstract: In principle, meta-reinforcement learning algorithms leverage experience across many tasks to learn fast reinforcement learning (RL) strategies that transfer to similar tasks. However, current meta-RL approaches rely on manually-defined distributions of training tasks, and hand-crafting these task distributions can be challenging and time-consuming. Can "useful" pre-training tasks be discovered in…

    Submitted 9 December, 2019; originally announced December 2019.

    Comments: NeurIPS 2019

  12. arXiv:1903.07593  [pdf, other]

    cs.CV cs.AI cs.LG

    Learning Correspondence from the Cycle-Consistency of Time

    Authors: Xiaolong Wang, Allan Jabri, Alexei A. Efros

    Abstract: We introduce a self-supervised method for learning visual correspondence from unlabeled video. The main idea is to use cycle-consistency in time as free supervisory signal for learning visual representations from scratch. At training time, our model learns a feature map representation to be useful for performing cycle-consistent tracking. At test time, we use the acquired representation to find ne…

    Submitted 2 April, 2019; v1 submitted 18 March, 2019; originally announced March 2019.

    Comments: CVPR 2019 Oral. Project page: http://ajabri.github.io/timecycle

  13. arXiv:1804.00645  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Universal Planning Networks

    Authors: Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn

    Abstract: A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization. To this end, we introduce universal planning networks (UPN). UPNs embed differentiable planning within a goal-directed policy. This planning computation unrolls a forward model in a latent space and infers an optimal action plan through gradient…

    Submitted 4 April, 2018; v1 submitted 2 April, 2018; originally announced April 2018.

    Comments: Videos available at https://sites.google.com/view/upn-public/home

  14. arXiv:1707.06320  [pdf, other]

    cs.CL cs.CV

    Learning Visually Grounded Sentence Representations

    Authors: Douwe Kiela, Alexis Conneau, Allan Jabri, Maximilian Nickel

    Abstract: We introduce a variety of models, trained on a supervised image captioning corpus to predict the image features for a given caption, to perform sentence representation grounding. We train a grounded sentence encoder that achieves good performance on COCO caption and image retrieval and subsequently show that this encoder can successfully be transferred to various NLP tasks, with improved performan…

    Submitted 4 June, 2018; v1 submitted 19 July, 2017; originally announced July 2017.

    Comments: Published at NAACL-18

  15. arXiv:1701.08954  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    CommAI: Evaluating the first steps towards a useful general AI

    Authors: Marco Baroni, Armand Joulin, Allan Jabri, Germàn Kruszewski, Angeliki Lazaridou, Klemen Simonic, Tomas Mikolov

    Abstract: With machine learning successfully applied to new daunting problems almost every day, general AI starts looking like an attainable goal. However, most current research focuses instead on important but narrow applications, such as image classification or machine translation. We believe this to be largely due to the lack of objective ways to measure progress towards broad machine intelligence. In or…

    Submitted 27 March, 2017; v1 submitted 31 January, 2017; originally announced January 2017.

    Comments: Published in ICLR 2017 Workshop Track

  16. arXiv:1612.09161  [pdf, other]

    cs.CV

    Learning Visual N-Grams from Web Data

    Authors: Ang Li, Allan Jabri, Armand Joulin, Laurens van der Maaten

    Abstract: Real-world image recognition systems need to recognize tens of thousands of classes that constitute a plethora of visual concepts. The traditional approach of annotating thousands of images per class for training is infeasible in such a scenario, prompting the use of webly supervised data. This paper explores the training of image-recognition systems on large numbers of images and associated user…

    Submitted 5 August, 2017; v1 submitted 29 December, 2016; originally announced December 2016.

  17. arXiv:1606.08390  [pdf, ps, other]

    cs.CV

    Revisiting Visual Question Answering Baselines

    Authors: Allan Jabri, Armand Joulin, Laurens van der Maaten

    Abstract: Visual question answering (VQA) is an interesting learning setting for evaluating the abilities and shortcomings of current systems for image understanding. Many of the recently proposed VQA systems include attention or memory mechanisms designed to support "reasoning". For multiple-choice VQA, nearly all of these systems train a multi-class classifier on image and question features to predict an…

    Submitted 22 November, 2016; v1 submitted 27 June, 2016; originally announced June 2016.

    Comments: European Conference on Computer Vision

  18. arXiv:1511.02251  [pdf, other]

    cs.CV

    Learning Visual Features from Large Weakly Supervised Data

    Authors: Armand Joulin, Laurens van der Maaten, Allan Jabri, Nicolas Vasilache

    Abstract: Convolutional networks trained on large supervised dataset produce visual features which form the basis for the state-of-the-art in many computer-vision problems. Further improvements of these visual features will likely require even larger manually labeled data sets, which severely limits the pace at which progress can be made. In this paper, we explore the potential of leveraging massive, weakly…

    Submitted 6 November, 2015; originally announced November 2015.