Showing 1–10 of 10 results for author: Kitani, K

  1. arXiv:2406.08858

    cs.RO cs.CV cs.LG eess.SY

    OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning

    Authors: Tairan He, Zhengyi Luo, Xialin He, Wenli Xiao, Chong Zhang, Weinan Zhang, Kris Kitani, Changliu Liu, Guanya Shi

    Abstract: We present OmniH2O (Omni Human-to-Humanoid), a learning-based system for whole-body humanoid teleoperation and autonomy. Using kinematic pose as a universal control interface, OmniH2O enables various ways for a human to control a full-sized humanoid with dexterous hands, including using real-time teleoperation through VR headset, verbal instruction, and RGB camera. OmniH2O also enables full autono…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Project page: https://omni.human2humanoid.com/

  2. arXiv:2403.04436

    cs.RO cs.AI cs.LG eess.SY

    Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation

    Authors: Tairan He, Zhengyi Luo, Wenli Xiao, Chong Zhang, Kris Kitani, Changliu Liu, Guanya Shi

    Abstract: We present Human to Humanoid (H2O), a reinforcement learning (RL) based framework that enables real-time whole-body teleoperation of a full-sized humanoid robot with only an RGB camera. To create a large-scale retargeted motion dataset of human movements for humanoid robots, we propose a scalable "sim-to-data" process to filter and pick feasible motions using a privileged motion imitator. Afterwar…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Project website: https://human2humanoid.com/

  3. arXiv:2108.06858

    eess.IV cs.CV

    No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency

    Authors: S. Alireza Golestaneh, Saba Dadsetan, Kris M. Kitani

    Abstract: The goal of No-Reference Image Quality Assessment (NR-IQA) is to estimate the perceptual image quality in accordance with subjective evaluations, it is a complex and unsolved problem due to the absence of the pristine reference image. In this paper, we propose a novel model to address the NR-IQA task by leveraging a hybrid approach that benefits from Convolutional Neural Networks (CNNs) and self-a…

    Submitted 5 January, 2022; v1 submitted 15 August, 2021; originally announced August 2021.

  4. arXiv:2008.02787

    cs.CV cs.GR eess.IV eess.SP physics.optics

    Efficient Non-Line-of-Sight Imaging from Transient Sinograms

    Authors: Mariko Isogawa, Dorian Chan, Ye Yuan, Kris Kitani, Matthew O'Toole

    Abstract: Non-line-of-sight (NLOS) imaging techniques use light that diffusely reflects off of visible surfaces (e.g., walls) to see around corners. One approach involves using pulsed lasers and ultrafast sensors to measure the travel time of multiply scattered light. Unlike existing NLOS techniques that generally require densely raster scanning points across the entirety of a relay wall, we explore a more…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: ECCV 2020. Project page: https://marikoisogawa.github.io/project/c2nlos

  5. Improving Lesion Segmentation for Diabetic Retinopathy using Adversarial Learning

    Authors: Qiqi Xiao, Jiaxu Zou, Muqiao Yang, Alex Gaudio, Kris Kitani, Asim Smailagic, Pedro Costa, Min Xu

    Abstract: Diabetic Retinopathy (DR) is a leading cause of blindness in working age adults. DR lesions can be challenging to identify in fundus images, and automatic DR detection systems can offer strong clinical value. Of the publicly available labeled datasets for DR, the Indian Diabetic Retinopathy Image Dataset (IDRiD) presents retinal fundus images with pixel-level annotations of four distinct lesions:…

    Submitted 27 July, 2020; originally announced July 2020.

    Comments: Accepted to International Conference on Image Analysis and Recognition, ICIAR 2019. Published at https://doi.org/10.1007/978-3-030-27272-2_29. Code: https://github.com/zoujx96/DR-segmentation

  6. arXiv:2007.12034

    cs.CV cs.LG eess.IV

    AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification

    Authors: Xiaofang Wang, Xuehan Xiong, Maxim Neumann, AJ Piergiovanni, Michael S. Ryoo, Anelia Angelova, Kris M. Kitani, Wei Hua

    Abstract: Convolutional operations have two limitations: (1) do not explicitly model where to focus as the same filter is applied to all the positions, and (2) are unsuitable for modeling long-range dependencies as they only operate on a small neighborhood. While both limitations can be alleviated by attention operations, many design choices remain to be determined to use attention, especially when applying…

    Submitted 31 July, 2020; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  7. arXiv:2006.07327

    cs.CV cs.LG eess.IV

    GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with Multi-Feature Learning

    Authors: Xinshuo Weng, Yongxin Wang, Yunze Man, Kris Kitani

    Abstract: 3D Multi-object tracking (MOT) is crucial to autonomous systems. Recent work uses a standard tracking-by-detection pipeline, where feature extraction is first performed independently for each object in order to compute an affinity matrix. Then the affinity matrix is passed to the Hungarian algorithm for data association. A key process of this standard pipeline is to learn discriminative features f…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: CVPR 2020. My website for all my research works: http://www.xinshuoweng.com/

  8. arXiv:2006.03783

    cs.CV eess.IV

    No-Reference Image Quality Assessment via Feature Fusion and Multi-Task Learning

    Authors: S. Alireza Golestaneh, Kris Kitani

    Abstract: Blind or no-reference image quality assessment (NR-IQA) is a fundamental, unsolved, and yet challenging problem due to the unavailability of a reference image. It is vital to the streaming and social media industries that impact billions of viewers daily. Although previous NR-IQA methods leveraged different feature extraction approaches, the performance bottleneck still exists. In this paper, we p…

    Submitted 6 June, 2020; originally announced June 2020.

  9. arXiv:2003.14414

    cs.CV cs.LG cs.RO eess.IV

    Optical Non-Line-of-Sight Physics-based 3D Human Pose Estimation

    Authors: Mariko Isogawa, Ye Yuan, Matthew O'Toole, Kris Kitani

    Abstract: We describe a method for 3D human pose estimation from transient images (i.e., a 3D spatio-temporal histogram of photons) acquired by an optical non-line-of-sight (NLOS) imaging system. Our method can perceive 3D human pose by 'looking around corners' through the use of light indirectly reflected by the environment. We bring together a diverse set of technologies from NLOS imaging, human pose esti…

    Submitted 31 March, 2020; originally announced March 2020.

    Comments: CVPR 2020. Video: https://youtu.be/4HFulrdmLE8. Project page: https://marikoisogawa.github.io/project/nlos_pose

  10. arXiv:2003.08386

    cs.CV cs.LG eess.IV

    DLow: Diversifying Latent Flows for Diverse Human Motion Prediction

    Authors: Ye Yuan, Kris Kitani

    Abstract: Deep generative models are often used for human motion prediction as they are able to model multi-modal data distributions and characterize diverse human behavior. While much care has been taken into designing and learning deep generative models, how to efficiently produce diverse samples from a deep generative model after it has been trained is still an under-explored problem. To obtain samples f…

    Submitted 22 July, 2020; v1 submitted 18 March, 2020; originally announced March 2020.

    Comments: ECCV 2020. Project Page: https://www.ye-yuan.com/dlow