Showing 1–8 of 8 results for author: Yoshihashi, R

  1. arXiv:2309.01369

    cs.CV

    Exploring Limits of Diffusion-Synthetic Training with Weakly Supervised Semantic Segmentation

    Authors: Ryota Yoshihashi, Yuya Otsuka, Kenji Doi, Tomohiro Tanaka, Hirokatsu Kataoka

    Abstract: The advance of generative models for images has inspired various training techniques for image recognition utilizing synthetic images. In semantic segmentation, one promising approach is extracting pseudo-masks from attention maps in text-to-image diffusion models, which enables real-image-and-annotation-free training. However, the pioneering training method using the diffusion-synthetic images an…

    Submitted 15 April, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

  2. arXiv:2211.13844

    cs.CV cs.LG

    Ladder Siamese Network: a Method and Insights for Multi-level Self-Supervised Learning

    Authors: Ryota Yoshihashi, Shuhei Nishimura, Dai Yonebayashi, Yuya Otsuka, Tomohiro Tanaka, Takashi Miyazaki

    Abstract: Siamese-network-based self-supervised learning (SSL) suffers from slow convergence and instability in training. To alleviate this, we propose a framework to exploit intermediate self-supervisions in each stage of deep nets, called the Ladder Siamese Network. Our self-supervised losses encourage the intermediate layers to be consistent with different data augmentations to single samples, which faci…

    Submitted 24 November, 2022; originally announced November 2022.

  3. arXiv:2106.05611

    cs.CV

    Context-Free TextSpotter for Real-Time and Mobile End-to-End Text Detection and Recognition

    Authors: Ryota Yoshihashi, Tomohiro Tanaka, Kenji Doi, Takumi Fujino, Naoaki Yamashita

    Abstract: In the deployment of scene-text spotting systems on mobile platforms, lightweight models with low computation are preferable. In concept, end-to-end (E2E) text spotting is suitable for such purposes because it performs text detection and recognition in a single model. However, current state-of-the-art E2E methods rely on heavy feature extractors, recurrent sequence modellings, and complex shape al…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: To appear in ICDAR2021

  4. arXiv:2105.08253

    cs.CV

    Finding a Needle in a Haystack: Tiny Flying Object Detection in 4K Videos using a Joint Detection-and-Tracking Approach

    Authors: Ryota Yoshihashi, Rei Kawakami, Shaodi You, Tu Tuan Trinh, Makoto Iida, Takeshi Naemura

    Abstract: Detecting tiny objects in a high-resolution video is challenging because the visual information is little and unreliable. Specifically, the challenge includes very low resolution of the objects, MPEG artifacts due to compression and a large searching area with many hard negatives. Tracking is equally difficult because of the unreliable appearance, and the unreliable motion estimation. Luckily, we…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: arXiv admin note: text overlap with arXiv:1709.04666

  5. arXiv:1812.07134

    cs.CV

    Hybrid Loss for Learning Single-Image-based HDR Reconstruction

    Authors: Kenta Moriwaki, Ryota Yoshihashi, Rei Kawakami, Shaodi You, Takeshi Naemura

    Abstract: This paper tackles high-dynamic-range (HDR) image reconstruction given only a single low-dynamic-range (LDR) image as input. While the existing methods focus on minimizing the mean-squared-error (MSE) between the target and reconstructed images, we minimize a hybrid loss that consists of perceptual and adversarial losses in addition to HDR-reconstruction loss. The reconstruction loss instead of MS…

    Submitted 17 December, 2018; originally announced December 2018.

    Comments: 20 pages, 17 figures

  6. arXiv:1812.04246

    cs.CV

    Classification-Reconstruction Learning for Open-Set Recognition

    Authors: Ryota Yoshihashi, Wen Shao, Rei Kawakami, Shaodi You, Makoto Iida, Takeshi Naemura

    Abstract: Open-set classification is a problem of handling 'unknown' classes that are not contained in the training dataset, whereas traditional classifiers assume that only known classes appear in the test environment. Existing open-set classifiers rely on deep networks trained in a supervised manner on known classes in the training set; this causes specialization of learned representations to known classe…

    Submitted 6 October, 2019; v1 submitted 11 December, 2018; originally announced December 2018.

    Comments: 11 pages, 7 figures

  7. arXiv:1805.05569

    cs.CV

    Cross-connected Networks for Multi-task Learning of Detection and Segmentation

    Authors: Seiichiro Fukuda, Ryota Yoshihashi, Rei Kawakami, Shaodi You, Makoto Iida, Takeshi Naemura

    Abstract: Multi-task learning improves generalization performance by sharing knowledge among related tasks. Existing models are for task combinations annotated on the same dataset, while there are cases where multiple datasets are available for each task. How to utilize knowledge of successful single-task CNNs that are trained on each dataset has been explored less than multi-task learning with a single dat…

    Submitted 15 May, 2018; originally announced May 2018.

  8. arXiv:1709.04666

    cs.CV

    Differentiating Objects by Motion: Joint Detection and Tracking of Small Flying Objects

    Authors: Ryota Yoshihashi, Tu Tuan Trinh, Rei Kawakami, Shaodi You, Makoto Iida, Takeshi Naemura

    Abstract: While generic object detection has achieved large improvements with rich feature hierarchies from deep nets, detecting small objects with poor visual cues remains challenging. Motion cues from multiple frames may be more informative for detecting such hard-to-distinguish objects in each frame. However, how to encode discriminative motion patterns, such as deformations and pose changes that charact…

    Submitted 15 May, 2018; v1 submitted 14 September, 2017; originally announced September 2017.

    Comments: 10 pages, 8 figures