Showing 1–12 of 12 results for author: Ohkawa, T

  1. arXiv:2403.16428  [pdf, other]

    cs.CV

    Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

    Authors: Zicong Fan, Takehiko Ohkawa, Linlin Yang, Nie Lin, Zhishan Zhou, Shihao Zhou, Jiajun Liang, Zhong Gao, Xuanyang Zhang, Xue Zhang, Fei Li, Liu Zheng, Feng Lu, Karim Abou Zeid, Bastian Leibe, Jeongwan On, Seungryul Baek, Aditya Prakash, Saurabh Gupta, Kun He, Yoichi Sato, Otmar Hilliges, Hyung Jin Chang, Angela Yao

    Abstract: We interact with the world with our hands and see it through our own (egocentric) perspective. A holistic 3D understanding of such interactions from egocentric views is important for tasks in robotics, AR/VR, action recognition and motion generation. Accurately reconstructing such interactions in 3D is challenging due to heavy occlusion, viewpoint bias, camera distortion, and motion blur from the…

    Submitted 25 March, 2024; originally announced March 2024.

  2. arXiv:2403.04381  [pdf, other]

    cs.CV

    Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation

    Authors: Ruicong Liu, Takehiko Ohkawa, Mingfang Zhang, Yoichi Sato

    Abstract: The pursuit of accurate 3D hand pose estimation stands as a keystone for understanding human activity in the realm of egocentric vision. The majority of existing estimation methods still rely on single-view images as input, leading to potential limitations, e.g., limited field-of-view and ambiguity in depth. To address these problems, adding another camera to better capture the shape of hands is a…

    Submitted 9 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024. Code will be released at https://github.com/ut-vision/S2DHand

  3. arXiv:2311.17366  [pdf, other]

    cs.CV

    Generative Hierarchical Temporal Transformer for Hand Action Recognition and Motion Prediction

    Authors: Yilin Wen, Hao Pan, Takehiko Ohkawa, Lei Yang, Jia Pan, Yoichi Sato, Taku Komura, Wenping Wang

    Abstract: We present a novel framework that concurrently tackles hand action recognition and 3D future hand motion prediction. While previous works focus on either recognition or prediction, we propose a generative Transformer VAE architecture to jointly capture both aspects, facilitating realistic motion prediction by leveraging the short-term hand motion and long-term action consistency observed across ti…

    Submitted 24 December, 2023; v1 submitted 29 November, 2023; originally announced November 2023.

  4. arXiv:2311.16444  [pdf, other]

    cs.CV cs.CL

    Exo2EgoDVC: Dense Video Captioning of Egocentric Procedural Activities Using Web Instructional Videos

    Authors: Takehiko Ohkawa, Takuma Yagi, Taichi Nishimura, Ryosuke Furuta, Atsushi Hashimoto, Yoshitaka Ushiku, Yoichi Sato

    Abstract: We propose a novel benchmark for cross-view knowledge transfer of dense video captioning, adapting models from web instructional videos with exocentric views to an egocentric view. While dense video captioning (predicting time segments and their captions) is primarily studied with exocentric videos (e.g., YouCook2), benchmarks with egocentric videos are restricted due to data scarcity. To overcome…

    Submitted 29 November, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

  5. arXiv:2304.12301  [pdf, other]

    cs.CV

    AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation

    Authors: Takehiko Ohkawa, Kun He, Fadime Sener, Tomas Hodan, Luan Tran, Cem Keskin

    Abstract: We present AssemblyHands, a large-scale benchmark dataset with accurate 3D hand pose annotations, to facilitate the study of egocentric activities with challenging hand-object interactions. The dataset includes synchronized egocentric and exocentric images sampled from the recent Assembly101 dataset, in which participants assemble and disassemble take-apart toys. To obtain high-quality 3D hand pos…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: CVPR 2023. Project page: https://assemblyhands.github.io/

  6. arXiv:2209.02241  [pdf, other]

    cs.CV cs.AI

    Real-Time Cattle Interaction Recognition via Triple-stream Network

    Authors: Yang Yang, Mizuka Komatsu, Kenji Oyama, Takenao Ohkawa

    Abstract: In stockbreeding of beef cattle, computer vision-based approaches have been widely employed to monitor cattle conditions (e.g. the physical, physiology, and health). To this end, the accurate and effective recognition of cattle action is a prerequisite. Generally, most existing models are confined to individual behavior that uses video-based methods to extract spatial-temporal features for recogni…

    Submitted 6 September, 2022; originally announced September 2022.

    Comments: Accepted to ICMLA 2022

  7. arXiv:2206.02257  [pdf, other]

    cs.CV

    Efficient Annotation and Learning for 3D Hand Pose Estimation: A Survey

    Authors: Takehiko Ohkawa, Ryosuke Furuta, Yoichi Sato

    Abstract: In this survey, we present a systematic review of 3D hand pose estimation from the perspective of efficient annotation and learning. 3D hand pose estimation has been an important research area owing to its potential to enable various applications, such as video understanding, AR/VR, and robotics. However, the performance of models is tied to the quality and quantity of annotated 3D hand poses. Und…

    Submitted 26 April, 2023; v1 submitted 5 June, 2022; originally announced June 2022.

  8. arXiv:2203.08344  [pdf, other]

    cs.CV cs.LG

    Domain Adaptive Hand Keypoint and Pixel Localization in the Wild

    Authors: Takehiko Ohkawa, Yu-Jhe Li, Qichen Fu, Ryosuke Furuta, Kris M. Kitani, Yoichi Sato

    Abstract: We aim to improve the performance of regressing hand keypoints and segmenting pixel-level hand masks under new imaging conditions (e.g., outdoors) when we only have labeled images taken under very different conditions (e.g., indoors). In the real world, it is important that the model trained for both tasks works under various imaging conditions. However, their variation covered by existing labeled…

    Submitted 14 July, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: Accepted to ECCV 2022

  9. arXiv:2202.13941  [pdf, other]

    cs.CV cs.AI cs.LG

    Background Mixup Data Augmentation for Hand and Object-in-Contact Detection

    Authors: Koya Tango, Takehiko Ohkawa, Ryosuke Furuta, Yoichi Sato

    Abstract: Detecting the positions of human hands and objects-in-contact (hand-object detection) in each video frame is vital for understanding human activities from videos. For training an object detector, a method called Mixup, which overlays two training images to mitigate data bias, has been empirically shown to be effective for data augmentation. However, in hand-object detection, mixing two hand-manipu…

    Submitted 28 February, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: 5 pages, 4 figures

  10. Foreground-Aware Stylization and Consensus Pseudo-Labeling for Domain Adaptation of First-Person Hand Segmentation

    Authors: Takehiko Ohkawa, Takuma Yagi, Atsushi Hashimoto, Yoshitaka Ushiku, Yoichi Sato

    Abstract: Hand segmentation is a crucial task in first-person vision. Since first-person images exhibit strong bias in appearance among different environments, adapting a pre-trained segmentation model to a new domain is required in hand segmentation. Here, we focus on appearance gaps for hand regions and backgrounds separately. We propose (i) foreground-aware image stylization and (ii) consensus pseudo-lab…

    Submitted 27 March, 2022; v1 submitted 6 July, 2021; originally announced July 2021.

    Comments: Accepted to IEEE Access 2021

  11. arXiv:2003.00187  [pdf, other]

    cs.CV cs.LG

    Augmented Cyclic Consistency Regularization for Unpaired Image-to-Image Translation

    Authors: Takehiko Ohkawa, Naoto Inoue, Hirokatsu Kataoka, Nakamasa Inoue

    Abstract: Unpaired image-to-image (I2I) translation has received considerable attention in pattern recognition and computer vision because of recent advancements in generative adversarial networks (GANs). However, due to the lack of explicit supervision, unpaired I2I models often fail to generate realistic images, especially in challenging datasets with different backgrounds and poses. Hence, stabilization…

    Submitted 12 October, 2020; v1 submitted 29 February, 2020; originally announced March 2020.

    Comments: Accepted to ICPR 2020

  12. arXiv:1508.07123  [pdf]

    cs.AR cs.DC cs.RO

    Proposal of ROS-compliant FPGA Component for Low-Power Robotic Systems

    Authors: Kazushi Yamashina, Takeshi Ohkawa, Kanemitsu Ootsu, Takashi Yokota

    Abstract: In recent years, robots are required to be autonomous and their robotic software are sophisticated. Robots have a problem of insufficient performance, since it cannot equip with a high-performance microprocessor due to battery-power operation. On the other hand, FPGA devices can accelerate specific functions in a robot system without increasing power consumption by implementing customized circuits…

    Submitted 28 August, 2015; originally announced August 2015.

    Comments: Presented at Second International Workshop on FPGAs for Software Programmers (FSP 2015) (arXiv:1508.06320)

    Report number: FSP/2015/12