Showing 1–11 of 11 results for author: Tung, H F

  1. arXiv:2309.01782  [pdf, other]

    cs.CV cs.AI cs.LG q-bio.NC

    3D View Prediction Models of the Dorsal Visual Stream

    Authors: Gabriel Sarch, Hsiao-Yu Fish Tung, Aria Wang, Jacob Prince, Michael Tarr

    Abstract: Deep neural network representations align well with brain activity in the ventral visual stream. However, the primate visual system has a distinct dorsal processing stream with different functional properties. To test if a model trained to perceive 3D scene geometry aligns better with neural responses in dorsal visual areas, we trained a self-supervised geometry-aware recurrent neural network (GRN…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: 2023 Conference on Cognitive Computational Neuroscience

  2. arXiv:2106.08261  [pdf, other]

    cs.AI cs.CV

    Physion: Evaluating Physical Prediction from Vision in Humans and Machines

    Authors: Daniel M. Bear, Elias Wang, Damian Mrowca, Felix J. Binder, Hsiao-Yu Fish Tung, R. T. Pramod, Cameron Holdaway, Sirui Tao, Kevin Smith, Fan-Yun Sun, Li Fei-Fei, Nancy Kanwisher, Joshua B. Tenenbaum, Daniel L. K. Yamins, Judith E. Fan

    Abstract: While current vision algorithms excel at many challenging tasks, it is unclear how well they understand the physical dynamics of real-world environments. Here we introduce Physion, a dataset and benchmark for rigorously evaluating the ability to predict how physical scenarios will evolve over time. Our dataset features realistic simulations of a wide range of physical phenomena, including rigid an…

    Submitted 20 June, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: 28 pages

    ACM Class: I.2.10; I.4.8; I.5

  3. arXiv:2011.06464  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    3D-OES: Viewpoint-Invariant Object-Factorized Environment Simulators

    Authors: Hsiao-Yu Fish Tung, Zhou Xian, Mihir Prabhudesai, Shamit Lal, Katerina Fragkiadaki

    Abstract: We propose an action-conditioned dynamics model that predicts scene changes caused by object and agent interactions in a viewpoint-invariant 3D neural scene representation space, inferred from RGB-D videos. In this 3D feature space, objects do not interfere with one another and their appearance persists over time and across viewpoints. This permits our model to predict future scenes long in the fu…

    Submitted 12 November, 2020; originally announced November 2020.

  4. arXiv:2010.16279  [pdf, other]

    cs.CV

    3D Object Recognition By Corresponding and Quantizing Neural 3D Scene Representations

    Authors: Mihir Prabhudesai, Shamit Lal, Hsiao-Yu Fish Tung, Adam W. Harley, Shubhankar Potdar, Katerina Fragkiadaki

    Abstract: We propose a system that learns to detect objects and infer their 3D poses in RGB-D images. Many existing systems can identify objects and infer 3D poses, but they heavily rely on human labels and 3D annotations. The challenge here is to achieve this without relying on strong supervision signals. To address this challenge, we propose a model that maps RGB-D images to a set of 3D visual feature map…

    Submitted 30 October, 2020; originally announced October 2020.

  5. arXiv:1910.01210  [pdf, other]

    cs.CV cs.LG cs.RO

    Embodied Language Grounding with 3D Visual Feature Representations

    Authors: Mihir Prabhudesai, Hsiao-Yu Fish Tung, Syed Ashar Javed, Maximilian Sieb, Adam W. Harley, Katerina Fragkiadaki

    Abstract: We propose associating language utterances to 3D visual abstractions of the scene they describe. The 3D visual abstractions are encoded as 3-dimensional visual feature maps. We infer these 3D visual scene feature maps from RGB images of the scene via view prediction: when the generated 3D scene feature map is neurally projected from a camera viewpoint, it should match the corresponding RGB image.…

    Submitted 17 June, 2021; v1 submitted 2 October, 2019; originally announced October 2019.

    Journal ref: Conference on Computer Vision and Pattern Recognition. 2020, pp. 2220-2229

  6. arXiv:1906.03764  [pdf, other]

    cs.CV

    Learning from Unlabelled Videos Using Contrastive Predictive Neural 3D Mapping

    Authors: Adam W. Harley, Shrinidhi K. Lakshmikanth, Fangyu Li, Xian Zhou, Hsiao-Yu Fish Tung, Katerina Fragkiadaki

    Abstract: Predictive coding theories suggest that the brain learns by predicting observations at various levels of abstraction. One of the most basic prediction tasks is view prediction: how would a given scene look from an alternative viewpoint? Humans excel at this task. Our ability to imagine and fill in missing information is tightly coupled with perception: we feel as if we see the world in 3 dimension…

    Submitted 16 May, 2020; v1 submitted 9 June, 2019; originally announced June 2019.

  7. arXiv:1901.00003  [pdf, other]

    cs.CV

    Learning Spatial Common Sense with Geometry-Aware Recurrent Networks

    Authors: Hsiao-Yu Fish Tung, Ricson Cheng, Katerina Fragkiadaki

    Abstract: We integrate two powerful ideas, geometry and deep visual representation learning, into recurrent network architectures for mobile visual scene understanding. The proposed networks learn to "lift" and integrate 2D visual features over time into latent 3D feature maps of the scene. They are equipped with differentiable geometric operations, such as projection, unprojection, egomotion estimation and…

    Submitted 8 April, 2019; v1 submitted 31 December, 2018; originally announced January 2019.

  8. arXiv:1804.10692  [pdf, other]

    cs.CV cs.RO

    Reward Learning from Narrated Demonstrations

    Authors: Hsiao-Yu Fish Tung, Adam W. Harley, Liang-Kang Huang, Katerina Fragkiadaki

    Abstract: Humans effortlessly "program" one another by communicating goals and desires in natural language. In contrast, humans program robotic behaviours by indicating desired object locations and poses to be achieved, by providing RGB images of goal configurations, or supplying a demonstration to be imitated. None of these methods generalize across environment variations, and they convey the goal in awkwa…

    Submitted 27 April, 2018; originally announced April 2018.

    Comments: The work has been accepted to Conference on Computer Vision and Pattern Recognition (CVPR) 2018

  9. arXiv:1712.01337  [pdf, other]

    cs.CV

    Self-supervised Learning of Motion Capture

    Authors: Hsiao-Yu Fish Tung, Hsiao-Wei Tung, Ersin Yumer, Katerina Fragkiadaki

    Abstract: Current state-of-the-art solutions for motion capture from a single camera are optimization driven: they optimize the parameters of a 3D human model so that its re-projection matches measurements in the video (e.g. person segmentation, optical flow, keypoint detections etc.). Optimization models are susceptible to local minima. This has been the bottleneck that forced using clean green-screen like…

    Submitted 4 December, 2017; originally announced December 2017.

    Comments: Neural Information Processing Systems (NIPS) 2017

  10. arXiv:1705.11166  [pdf, other]

    cs.CV

    Adversarial Inverse Graphics Networks: Learning 2D-to-3D Lifting and Image-to-Image Translation from Unpaired Supervision

    Authors: Hsiao-Yu Fish Tung, Adam W. Harley, William Seto, Katerina Fragkiadaki

    Abstract: Researchers have developed excellent feed-forward models that learn to map images to desired outputs, such as to the images' latent factors, or to other images, using supervised learning. Learning such mappings from unlabelled data, or improving upon supervised models by exploiting unlabelled data, remains elusive. We argue that there are two important parts to learning without annotations: (i) ma…

    Submitted 1 September, 2017; v1 submitted 31 May, 2017; originally announced May 2017.

    Journal ref: The IEEE International Conference on Computer Vision (ICCV), 2017, pp. 4354-4362

  11. arXiv:1704.00003  [pdf, other]

    cs.LG stat.ML

    Spectral Methods for Nonparametric Models

    Authors: Hsiao-Yu Fish Tung, Chao-Yuan Wu, Manzil Zaheer, Alexander J. Smola

    Abstract: Nonparametric models are a versatile, albeit computationally expensive, tool for modeling mixture models. In this paper, we introduce spectral methods for the two most popular nonparametric models: the Indian Buffet Process (IBP) and the Hierarchical Dirichlet Process (HDP). We show that using spectral methods for the inference of nonparametric models is computationally and statistically efficient.…

    Submitted 30 March, 2017; originally announced April 2017.

    Comments: Keywords: Spectral Methods, Indian Buffet Process, Hierarchical Dirichlet Process