Showing 1–14 of 14 results for author: Zhu, A Z

  1. arXiv:2401.02402  [pdf, other]

    cs.CV

    3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation

    Authors: Zihao Xiao, Longlong Jing, Shangxuan Wu, Alex Zihao Zhu, Jingwei Ji, Chiyu Max Jiang, Wei-Chih Hung, Thomas Funkhouser, Weicheng Kuo, Anelia Angelova, Yin Zhou, Shiwei Sheng

    Abstract: 3D panoptic segmentation is a challenging perception task, especially in autonomous driving. It aims to predict both semantic and instance annotations for 3D points in a scene. Although prior 3D panoptic segmentation approaches have achieved great performance on closed-set benchmarks, generalizing these approaches to unseen things and unseen stuff categories remains an open problem. For unseen obj…

    Submitted 2 April, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

  2. arXiv:2309.16889  [pdf, other]

    cs.CV

    Superpixel Transformers for Efficient Semantic Segmentation

    Authors: Alex Zihao Zhu, Jieru Mei, Siyuan Qiao, Hang Yan, Yukun Zhu, Liang-Chieh Chen, Henrik Kretzschmar

    Abstract: Semantic segmentation, which aims to classify every pixel in an image, is a key task in machine perception, with many applications across robotics and autonomous driving. Due to the high dimensionality of this task, most existing approaches use local operations, such as convolutions, to generate per-pixel features. However, these methods are typically unable to effectively leverage global context…

    Submitted 2 October, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: 8 pages, 5 figures, 4 tables. Presented at IROS 2023. Equal contribution by A. Zhu and J. Mei

  3. arXiv:2210.08113  [pdf, other]

    cs.CV

    Instance Segmentation with Cross-Modal Consistency

    Authors: Alex Zihao Zhu, Vincent Casser, Reza Mahjourian, Henrik Kretzschmar, Sören Pirk

    Abstract: Segmenting object instances is a key task in machine perception, with safety-critical applications in robotics and autonomous driving. We introduce a novel approach to instance segmentation that jointly leverages measurements from multiple sensor modalities, such as cameras and LiDAR. Our method learns to predict embeddings for each pixel or point that give rise to a dense segmentation of the scen…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: 8 pages, 9 figures, 5 tables. Presented at IROS 2022

  4. arXiv:2206.07704  [pdf, other]

    cs.CV

    Waymo Open Dataset: Panoramic Video Panoptic Segmentation

    Authors: Jieru Mei, Alex Zihao Zhu, Xinchen Yan, Hang Yan, Siyuan Qiao, Yukun Zhu, Liang-Chieh Chen, Henrik Kretzschmar, Dragomir Anguelov

    Abstract: Panoptic image segmentation is the computer vision task of finding groups of pixels in an image and assigning semantic classes and object instance identifiers to them. Research in image segmentation has become increasingly popular due to its critical applications in robotics and autonomous driving. The research community thereby relies on publicly available benchmark datasets to advance the state-o…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: Our dataset can be found at https://waymo.com/open

  5. arXiv:2003.06696  [pdf, other]

    cs.NE

    Spike-FlowNet: Event-based Optical Flow Estimation with Energy-Efficient Hybrid Neural Networks

    Authors: Chankyu Lee, Adarsh Kumar Kosta, Alex Zihao Zhu, Kenneth Chaney, Kostas Daniilidis, Kaushik Roy

    Abstract: Event-based cameras display great potential for a variety of tasks such as high-speed motion detection and navigation in low-light environments where conventional frame-based cameras suffer critically. This is attributed to their high temporal resolution, high dynamic range, and low-power consumption. However, conventional computer vision methods as well as deep Analog Neural Networks (ANNs) are n…

    Submitted 14 September, 2020; v1 submitted 14 March, 2020; originally announced March 2020.

    Comments: European Conference on Computer Vision (ECCV) 2020

  6. arXiv:1912.01584  [pdf, other]

    cs.CV

    EventGAN: Leveraging Large Scale Image Datasets for Event Cameras

    Authors: Alex Zihao Zhu, Ziyun Wang, Kaung Khant, Kostas Daniilidis

    Abstract: Event cameras provide a number of benefits over traditional cameras, such as the ability to track incredibly fast motions, high dynamic range, and low power consumption. However, their application into computer vision problems, many of which are primarily dominated by deep learning solutions, has been limited by the lack of labeled training data for events. In this work, we propose a method which…

    Submitted 19 December, 2019; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: 10 pages, 5 figures, 2 tables, Code: https://github.com/alexzzhu/EventGAN, Video: https://www.youtube.com/watch?v=Vcm4Iox4H2w

  7. arXiv:1902.06820  [pdf, other]

    cs.CV

    Motion Equivariant Networks for Event Cameras with the Temporal Normalization Transform

    Authors: Alex Zihao Zhu, Ziyun Wang, Kostas Daniilidis

    Abstract: In this work, we propose a novel transformation for events from an event camera that is equivariant to optical flow under convolutions in the 3-D spatiotemporal domain. Events are generated by changes in the image, which are typically due to motion, either of the camera or the scene. As a result, different motions result in a different set of events. For learning based tasks based on a static scen…

    Submitted 18 February, 2019; originally announced February 2019.

    Comments: 8 pages, 2 figures, 1 table

  8. arXiv:1812.08351  [pdf, other]

    cs.CV

    Robustness Meets Deep Learning: An End-to-End Hybrid Pipeline for Unsupervised Learning of Egomotion

    Authors: Alex Zihao Zhu, Wenxin Liu, Ziyun Wang, Vijay Kumar, Kostas Daniilidis

    Abstract: In this work, we propose a method that combines unsupervised deep learning predictions for optical flow and monocular disparity with a model based optimization procedure for instantaneous camera pose. Given the flow and disparity predictions from the network, we apply a RANSAC outlier rejection scheme to find an inlier set of flows and disparities, which we use to solve for the relative camera pos…

    Submitted 12 February, 2019; v1 submitted 19 December, 2018; originally announced December 2018.

    Comments: 10 pages, 5 figures, 7 tables

  9. arXiv:1812.08156  [pdf, other]

    cs.CV

    Unsupervised Event-based Learning of Optical Flow, Depth, and Egomotion

    Authors: Alex Zihao Zhu, Liangzhe Yuan, Kenneth Chaney, Kostas Daniilidis

    Abstract: In this work, we propose a novel framework for unsupervised learning for event cameras that learns motion information from only the event stream. In particular, we propose an input representation of the events in the form of a discretized volume that maintains the temporal distribution of the events, which we pass through a neural network to predict the motion of the events. This motion is used to…

    Submitted 19 December, 2018; originally announced December 2018.

    Comments: 9 pages, 7 figures

  10. arXiv:1803.09025  [pdf, other]

    cs.CV

    Realtime Time Synchronized Event-based Stereo

    Authors: Alex Zihao Zhu, Yibo Chen, Kostas Daniilidis

    Abstract: In this work, we propose a novel event based stereo method which addresses the problem of motion blur for a moving event camera. Our method uses the velocity of the camera and a range of disparities to synchronize the positions of the events, as if they were captured at a single point in time. We represent these events using a pair of novel time synchronized event disparity volumes, which we show…

    Submitted 18 October, 2018; v1 submitted 23 March, 2018; originally announced March 2018.

    Comments: 13 pages, 3 figures, 1 table. Video: https://youtu.be/4oa7e4hsrYo. Updated with final version with additional experiments

    Journal ref: European Conference on Computer Vision 2018

  11. EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras

    Authors: Alex Zihao Zhu, Liangzhe Yuan, Kenneth Chaney, Kostas Daniilidis

    Abstract: Event-based cameras have shown great promise in a variety of situations where frame based cameras suffer, such as high speed motions and high dynamic range scenes. However, developing algorithms for event measurements requires a new class of hand crafted algorithms. Deep learning has shown great success in providing model free solutions to many problems in the vision community, but existing networ…

    Submitted 13 August, 2018; v1 submitted 19 February, 2018; originally announced February 2018.

    Comments: 9 pages, 5 figures, 1 table. Accompanying video: https://youtu.be/eMHZBSoq0sE. Dataset: https://daniilidis-group.github.io/mvsec/, Robotics: Science and Systems 2018

  12. The Multi Vehicle Stereo Event Camera Dataset: An Event Camera Dataset for 3D Perception

    Authors: Alex Zihao Zhu, Dinesh Thakur, Tolga Ozaslan, Bernd Pfrommer, Vijay Kumar, Kostas Daniilidis

    Abstract: Event based cameras are a new passive sensing modality with a number of benefits over traditional cameras, including extremely low latency, asynchronous data acquisition, high dynamic range and very low power consumption. There has been a lot of recent interest and development in applying algorithms to use the events to perform a variety of 3D perception tasks, such as feature tracking, visual odo…

    Submitted 19 February, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

    Comments: 8 pages, 7 figures, 2 tables. Website: https://daniilidis-group.github.io/mvsec/. Video: https://www.youtube.com/watch?v=AwRMO5vFgak. Updated website and video in comments, DOI

  13. arXiv:1706.01966  [pdf, other]

    cs.RO

    Controlling a Robotic Stereo Camera Under Image Quantization Noise

    Authors: Charles Freundlich, Yan Zhang, Alex Zihao Zhu, Philippos Mordohai, Michael M. Zavlanos

    Abstract: In this paper, we address the problem of controlling a mobile stereo camera under image quantization noise. Assuming that a pair of images of a set of targets is available, the camera moves through a sequence of Next-Best-Views (NBVs), i.e., a sequence of views that minimize the trace of the targets' cumulative state covariance, constructed using a realistic model of the stereo rig that captures i…

    Submitted 13 January, 2018; v1 submitted 6 June, 2017; originally announced June 2017.

    Comments: International Journal of Robotics Research, October 2017

  14. arXiv:1309.7147  [pdf, ps, other]

    cond-mat.soft

    Stress relaxation for granular materials near Jamming under cyclic compression

    Authors: Somayeh Farhadi, Robert P Behringer, Alex Z Zhu

    Abstract: We have explored isotropically jammed states of semi-2D granular materials through cyclic compression. In each compression cycle, systems of either identical ellipses or bi-disperse disks, transition between jammed and unjammed states. We determine the evolution of the average pressure, P, and structure through consecutive jammed states. We observe a transition point, φm, above which P persists ov…

    Submitted 27 September, 2013; originally announced September 2013.

    Comments: 5 pages