Showing 1–24 of 24 results for author: Suganuma, M

Searching in archive cs.

  1. arXiv:2311.03747  [pdf]

    cs.CV

    SBCFormer: Lightweight Network Capable of Full-size ImageNet Classification at 1 FPS on Single Board Computers

    Authors: Xiangyong Lu, Masanori Suganuma, Takayuki Okatani

    Abstract: Computer vision has become increasingly prevalent in solving real-world problems across diverse domains, including smart agriculture, fishery, and livestock management. These applications may not require processing many image frames per second, leading practitioners to use single board computers (SBCs). Although many lightweight networks have been developed for mobile/edge devices, they primarily…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: 11 pages, 2 figures, WACV2024

    Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV2024)

  2. arXiv:2310.04671  [pdf, other]

    cs.CV

    Exploring the Potential of Multi-Modal AI for Driving Hazard Prediction

    Authors: Korawat Charoenpitaks, Van-Quang Nguyen, Masanori Suganuma, Masahiro Takahashi, Ryoma Niihara, Takayuki Okatani

    Abstract: This paper addresses the problem of predicting hazards that drivers may encounter while driving a car. We formulate it as a task of anticipating impending accidents using a single input image captured by car dashcams. Unlike existing approaches to driving hazard prediction that rely on computational simulations or anomaly detection from videos, this study focuses on high-level inference from stati…

    Submitted 1 July, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: Main Paper: 11 pages, Supplementary Materials: 25 pages

    Journal ref: IEEE Trans. Intell. Veh. (2024) 1-11

  3. arXiv:2307.03243  [pdf, other]

    cs.CV

    That's BAD: Blind Anomaly Detection by Implicit Local Feature Clustering

    Authors: Jie Zhang, Masanori Suganuma, Takayuki Okatani

    Abstract: Recent studies on visual anomaly detection (AD) of industrial objects/textures have achieved quite good performance. They consider an unsupervised setting, specifically the one-class setting, in which we assume the availability of a set of normal (i.e., anomaly-free) images for training. In this paper, we consider a more challenging scenario of unsupervised AD, in which we detect anomalie…

    Submitted 6 July, 2023; originally announced July 2023.

  4. arXiv:2307.03101  [pdf, other]

    cs.CV

    Contextual Affinity Distillation for Image Anomaly Detection

    Authors: Jie Zhang, Masanori Suganuma, Takayuki Okatani

    Abstract: Previous works on unsupervised industrial anomaly detection mainly focus on local structural anomalies such as cracks and color contamination. While achieving significantly high detection performance on this kind of anomaly, they are faced with logical anomalies that violate the long-range dependencies such as a normal object placed in the wrong position. In this paper, based on previous knowledge…

    Submitted 6 July, 2023; originally announced July 2023.

  5. arXiv:2307.02897  [pdf, other]

    cs.CV

    RefVSR++: Exploiting Reference Inputs for Reference-based Video Super-resolution

    Authors: Han Zou, Masanori Suganuma, Takayuki Okatani

    Abstract: Smartphones equipped with a multi-camera system comprising multiple cameras with different field-of-view (FoVs) are becoming more prevalent. These camera configurations are compatible with reference-based SR and video SR, which can be executed simultaneously while recording video on the device. Thus, combining these two SR methods can improve image quality. Recently, Lee et al. have presented such…

    Submitted 6 July, 2023; originally announced July 2023.

  6. arXiv:2307.02875  [pdf, other]

    cs.CV

    Reference-based Motion Blur Removal: Learning to Utilize Sharpness in the Reference Image

    Authors: Han Zou, Masanori Suganuma, Takayuki Okatani

    Abstract: Despite the recent advancement in the study of removing motion blur in an image, it is still hard to deal with strong blurs. While there are limits in removing blurs from a single image, it has more potential to use multiple images, e.g., using an additional image as a reference to deblur a blurry image. A typical setting is deblurring an image using a nearby sharp image(s) in a video sequence, as…

    Submitted 6 July, 2023; originally announced July 2023.

  7. arXiv:2207.09775  [pdf, other]

    cs.CV

    Rectifying Open-set Object Detection: A Taxonomy, Practical Applications, and Proper Evaluation

    Authors: Yusuke Hosoya, Masanori Suganuma, Takayuki Okatani

    Abstract: Open-set object detection (OSOD) has recently gained attention. It is to detect unknown objects while correctly detecting known objects. In this paper, we first point out that the recent studies' formalization of OSOD, which generalizes open-set recognition (OSR) and thus considers an unlimited variety of unknown objects, has a fundamental issue. This issue emerges from the difference between imag…

    Submitted 29 November, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: 17 pages, 7 figures

  8. arXiv:2207.09666  [pdf, other]

    cs.CV cs.AI cs.CL

    GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features

    Authors: Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani

    Abstract: Current state-of-the-art methods for image captioning employ region-based features, as they provide object-level information that is essential to describe the content of images; they are usually extracted by an object detector such as Faster R-CNN. However, they have several issues, such as lack of contextual information, the risk of inaccurate detection, and the high computational cost. The first…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022; 14 pages with appendix; Code: https://github.com/davidnvq/grit

  9. arXiv:2207.03047  [pdf, other]

    cs.CV eess.IV

    Single-image Defocus Deblurring by Integration of Defocus Map Prediction Tracing the Inverse Problem Computation

    Authors: Qian Ye, Masanori Suganuma, Takayuki Okatani

    Abstract: In this paper, we consider the problem in defocus image deblurring. Previous classical methods follow two-steps approaches, i.e., first defocus map estimation and then the non-blind deblurring. In the era of deep learning, some researchers have tried to address these two problems by CNN. However, the simple concatenation of defocus map, which represents the blur level, leads to suboptimal performa…

    Submitted 6 July, 2022; originally announced July 2022.

  10. arXiv:2207.02539  [pdf, other]

    cs.CV

    Learning Regularized Multi-Scale Feature Flow for High Dynamic Range Imaging

    Authors: Qian Ye, Masanori Suganuma, Jun Xiao, Takayuki Okatani

    Abstract: Reconstructing ghosting-free high dynamic range (HDR) images of dynamic scenes from a set of multi-exposure images is a challenging task, especially with large object motion and occlusions, leading to visible artifacts using existing methods. To address this problem, we propose a deep network that tries to learn multi-scale feature flow guided by the regularized loss. It first extracts multi-scale…

    Submitted 6 July, 2022; originally announced July 2022.

  11. arXiv:2207.00067  [pdf, other]

    cs.CV cs.AI

    Rethinking Unsupervised Domain Adaptation for Semantic Segmentation

    Authors: Zhijie Wang, Masanori Suganuma, Takayuki Okatani

    Abstract: Unsupervised domain adaptation (UDA) adapts a model trained on one domain (called source) to a novel domain (called target) using only unlabeled data. Due to its high annotation cost, researchers have developed many UDA methods for semantic segmentation, which assume no labeled sample is available in the target domain. We question the practicality of this assumption for two reasons. First, after t…

    Submitted 22 January, 2024; v1 submitted 30 June, 2022; originally announced July 2022.

    Comments: Under review in Pattern Recognition Letters

  12. arXiv:2109.06432  [pdf, other]

    cs.CV cs.AI

    Improved Few-shot Segmentation by Redefinition of the Roles of Multi-level CNN Features

    Authors: Zhijie Wang, Masanori Suganuma, Takayuki Okatani

    Abstract: This study is concerned with few-shot segmentation, i.e., segmenting the region of an unseen object class in a query image, given support image(s) of its instances. The current methods rely on the pretrained CNN features of the support and query images. The key to good performance depends on the proper fusion of their mid-level and high-level features; the former contains shape-oriented informatio…

    Submitted 14 September, 2021; v1 submitted 14 September, 2021; originally announced September 2021.

  13. arXiv:2109.06422  [pdf, other]

    cs.CV cs.AI

    Cross-Region Domain Adaptation for Class-level Alignment

    Authors: Zhijie Wang, Xing Liu, Masanori Suganuma, Takayuki Okatani

    Abstract: Semantic segmentation requires a lot of training data, which necessitates costly annotation. There have been many studies on unsupervised domain adaptation (UDA) from one domain to another, e.g., from computer graphics to real images. However, there is still a gap in accuracy between UDA and supervised training on native domain data. It is arguably attributable to class-level misalignment between…

    Submitted 6 October, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

    Comments: Under review in Computer Vision and Image Understanding

  14. arXiv:2109.03585  [pdf, other]

    cs.CV

    Matching in the Dark: A Dataset for Matching Image Pairs of Low-light Scenes

    Authors: Wenzheng Song, Masanori Suganuma, Xing Liu, Noriyuki Shimobayashi, Daisuke Maruta, Takayuki Okatani

    Abstract: This paper considers matching images of low-light scenes, aiming to widen the frontier of SfM and visual SLAM applications. Recent image sensors can record the brightness of scenes with more than eight-bit precision, available in their RAW-format image. We are interested in making full use of such high-precision information to match extremely low-light scene images that conventional methods cannot…

    Submitted 14 September, 2021; v1 submitted 8 September, 2021; originally announced September 2021.

    Comments: 15 pages, 14 figures, ICCV2021

    MSC Class: 68T40; 68T07

  15. arXiv:2106.00596  [pdf, other]

    cs.CV

    Look Wide and Interpret Twice: Improving Performance on Interactive Instruction-following Tasks

    Authors: Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani

    Abstract: There is a growing interest in the community in making an embodied AI agent perform a complicated task while interacting with an environment following natural language directives. Recent studies have tackled the problem using ALFRED, a well-designed dataset for the task, but achieved only very low accuracy. This paper proposes a new method, which outperforms the previous methods by a large margin.…

    Submitted 6 June, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

    Comments: To appear in IJCAI2021. 8-page main paper and Appendix following. Appendix E for details of entry submission to EAI 2021. Github: https://github.com/davidnvq/lwit-alfred

  16. arXiv:2005.03463  [pdf, other]

    eess.IV cs.CV

    How Can CNNs Use Image Position for Segmentation?

    Authors: Rito Murase, Masanori Suganuma, Takayuki Okatani

    Abstract: Convolution is an equivariant operation, and image position does not affect its result. A recent study shows that the zero-padding employed in convolutional layers of CNNs provides position information to the CNNs. The study further claims that the position information enables accurate inference for several tasks, such as object recognition, segmentation, etc. However, there is a technical issue w…

    Submitted 7 May, 2020; originally announced May 2020.

    Comments: 11 pages

  17. arXiv:1911.11390  [pdf, other]

    cs.CV

    Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs

    Authors: Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani

    Abstract: It has been a primary concern in recent studies of vision and language tasks to design an effective attention mechanism dealing with interactions between the two modalities. The Transformer has recently been extended and applied to several bi-modal tasks, yielding promising results. For visual dialog, it becomes necessary to consider interactions between three or more inputs, i.e., an image, a que…

    Submitted 17 July, 2020; v1 submitted 26 November, 2019; originally announced November 2019.

    Comments: Accepted to ECCV 2020, 14 pages. Slight change in title

  18. arXiv:1910.09212  [pdf, other]

    cs.CV

    Analysis and a Solution of Momentarily Missed Detection for Anchor-based Object Detectors

    Authors: Yusuke Hosoya, Masanori Suganuma, Takayuki Okatani

    Abstract: The employment of convolutional neural networks has led to significant performance improvement on the task of object detection. However, when applying existing detectors to continuous frames in a video, we often encounter momentary miss-detection of objects, that is, objects are undetected exceptionally at a few frames, although they are correctly detected at all other frames. In this paper, we an…

    Submitted 16 January, 2020; v1 submitted 21 October, 2019; originally announced October 2019.

    Comments: Accepted to WACV 2020, 9 pages

  19. arXiv:1907.04508  [pdf, other]

    cs.CV

    Restoring Images with Unknown Degradation Factors by Recurrent Use of a Multi-branch Network

    Authors: Xing Liu, Masanori Suganuma, Xiyang Luo, Takayuki Okatani

    Abstract: The employment of convolutional neural networks has achieved unprecedented performance in the task of image restoration for a variety of degradation factors. However, high-performance networks have been specifically designed for a single degradation factor. In this paper, we tackle a harder problem, restoring a clean image from its degraded version with an unknown degradation factor, subject to th…

    Submitted 21 January, 2020; v1 submitted 10 July, 2019; originally announced July 2019.

  20. arXiv:1905.10628  [pdf, other]

    cs.CV

    Hyperparameter-Free Out-of-Distribution Detection Using Softmax of Scaled Cosine Similarity

    Authors: Engkarat Techapanurak, Masanori Suganuma, Takayuki Okatani

    Abstract: The ability to detect out-of-distribution (OOD) samples is vital to secure the reliability of deep neural networks in real-world applications. Considering the nature of OOD samples, detection methods should not have hyperparameters that need to be tuned depending on incoming OOD samples. However, most of the recently proposed methods do not meet this requirement, leading to compromised performance…

    Submitted 25 November, 2019; v1 submitted 25 May, 2019; originally announced May 2019.

    Comments: Extend the supplementary material

  21. arXiv:1903.08817  [pdf, other]

    cs.CV

    Dual Residual Networks Leveraging the Potential of Paired Operations for Image Restoration

    Authors: Xing Liu, Masanori Suganuma, Zhun Sun, Takayuki Okatani

    Abstract: In this paper, we study design of deep neural networks for tasks of image restoration. We propose a novel style of residual connections dubbed "dual residual connection", which exploits the potential of paired operations, e.g., up- and down-sampling or convolution with large- and small-size kernels. We design a modular block implementing this connection style; it is equipped with two containers to…

    Submitted 7 April, 2019; v1 submitted 20 March, 2019; originally announced March 2019.

    Comments: i) Accepted to CVPR 2019 ii) Code, trained models and additional results for visual comparison will be provided at https://github.com/liu-vis/DualResidualNetworks

  22. arXiv:1812.00733  [pdf, other]

    cs.CV

    Attention-based Adaptive Selection of Operations for Image Restoration in the Presence of Unknown Combined Distortions

    Authors: Masanori Suganuma, Xing Liu, Takayuki Okatani

    Abstract: Many studies have been conducted so far on image restoration, the problem of restoring a clean image from its distorted version. There are many different types of distortion which affect image quality. Previous studies have focused on single types of distortion, proposing methods for removing them. However, image quality degrades due to multiple factors in the real world. Thus, depending on applic…

    Submitted 7 April, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

    Comments: CVPR 2019

  23. arXiv:1803.00370  [pdf, other]

    cs.NE

    Exploiting the Potential of Standard Convolutional Autoencoders for Image Restoration by Evolutionary Search

    Authors: Masanori Suganuma, Mete Ozay, Takayuki Okatani

    Abstract: Researchers have applied deep neural networks to image restoration tasks, in which they proposed various network architectures, loss functions, and training methods. In particular, adversarial training, which is employed in recent studies, seems to be a key ingredient to success. In this paper, we show that simple convolutional autoencoders (CAEs) built upon only standard network components, i.e.,…

    Submitted 1 March, 2018; originally announced March 2018.

    Comments: Our code is available at https://github.com/sg-nm/Evolutionary-Autoencoders

  24. arXiv:1704.00764  [pdf, other]

    cs.NE

    A Genetic Programming Approach to Designing Convolutional Neural Network Architectures

    Authors: Masanori Suganuma, Shinichi Shirakawa, Tomoharu Nagao

    Abstract: The convolutional neural network (CNN), which is one of the deep learning models, has seen much success in a variety of computer vision tasks. However, designing CNN architectures still requires expert knowledge and a lot of trial and error. In this paper, we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP). In our m…

    Submitted 11 August, 2017; v1 submitted 3 April, 2017; originally announced April 2017.

    Comments: This is the revised version of the GECCO 2017 paper. The code of our method is available at https://github.com/sg-nm/cgp-cnn