Showing 1–8 of 8 results for author: Kawanabe, M

Searching in archive cs.
  1. arXiv:2405.16559  [pdf, other]

    cs.RO cs.CV

    Map-based Modular Approach for Zero-shot Embodied Question Answering

    Authors: Koya Sakamoto, Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Motoaki Kawanabe

    Abstract: Building robots capable of interacting with humans through natural language in the visual world presents a significant challenge in the field of robotics. To overcome this challenge, Embodied Question Answering (EQA) has been proposed as a benchmark task to measure the ability to identify an object navigating through a previously unseen environment in response to human-posed questions. Although so…

    Submitted 26 May, 2024; originally announced May 2024.

  2. arXiv:2310.18773  [pdf, other]

    cs.CV

    CityRefer: Geography-aware 3D Visual Grounding Dataset on City-scale Point Cloud Data

    Authors: Taiki Miyanishi, Fumiya Kitamori, Shuhei Kurita, Jungdae Lee, Motoaki Kawanabe, Nakamasa Inoue

    Abstract: City-scale 3D point cloud is a promising way to express detailed and complicated outdoor structures. It encompasses both the appearance and geometry features of segmented city components, including cars, streets, and buildings, that can be utilized for attractive applications such as user-interactive navigation of autonomous vehicles and drones. However, compared to the extensive text annotations…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: NeurIPS D&B 2023. The first two authors are equally contributed

  3. arXiv:2305.13876  [pdf, other]

    cs.CV

    Cross3DVG: Cross-Dataset 3D Visual Grounding on Different RGB-D Scans

    Authors: Taiki Miyanishi, Daichi Azuma, Shuhei Kurita, Motoaki Kawanabe

    Abstract: We present a novel task for cross-dataset visual grounding in 3D scenes (Cross3DVG), which overcomes limitations of existing 3D visual grounding models, specifically their restricted 3D resources and consequent tendencies of overfitting a specific 3D dataset. We created RIORefer, a large-scale 3D visual grounding dataset, to facilitate Cross3DVG. It includes more than 63k diverse descriptions of 3…

    Submitted 7 February, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 3DV 2024

  4. arXiv:2206.01323  [pdf, other]

    cs.LG eess.SP

    SPD domain-specific batch normalization to crack interpretable unsupervised domain adaptation in EEG

    Authors: Reinmar J Kobler, Jun-ichiro Hirayama, Qibin Zhao, Motoaki Kawanabe

    Abstract: Electroencephalography (EEG) provides access to neuronal dynamics non-invasively with millisecond resolution, rendering it a viable method in neuroscience and healthcare. However, its utility is limited as current EEG technology does not generalize well across domains (i.e., sessions and subjects) without expensive supervised re-calibration. Contemporary methods cast this transfer learning (TL) pr…

    Submitted 12 October, 2022; v1 submitted 2 June, 2022; originally announced June 2022.

    Comments: 10 pages, accepted at NeurIPS 2022

    ACM Class: I.5.1; I.5.4; J.3

  5. arXiv:2112.10482  [pdf, other]

    cs.CV

    ScanQA: 3D Question Answering for Spatial Scene Understanding

    Authors: Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Motoaki Kawanabe

    Abstract: We propose a new 3D spatial understanding task of 3D Question Answering (3D-QA). In the 3D-QA task, models receive visual information from the entire 3D scene of the rich RGB-D indoor scan and answer the given textual questions about the 3D scene. Unlike the 2D-question answering of VQA, the conventional 2D-QA models suffer from problems with spatial understanding of object alignment and direction…

    Submitted 7 May, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

    Comments: CVPR2022. The first three authors are equally contributed. Project page: https://github.com/ATR-DBI/ScanQA

  6. arXiv:2107.14398  [pdf, other]

    cs.LG eess.SP

    On the interpretation of linear Riemannian tangent space model parameters in M/EEG

    Authors: Reinmar J. Kobler, Jun-Ichiro Hirayama, Lea Hehenberger, Catarina Lopes-Dias, Gernot R. Müller-Putz, Motoaki Kawanabe

    Abstract: Riemannian tangent space methods offer state-of-the-art performance in magnetoencephalography (MEG) and electroencephalography (EEG) based applications such as brain-computer interfaces and biomarker development. One limitation, particularly relevant for biomarker development, is limited model interpretability compared to established component-based methods. Here, we propose a method to transform…

    Submitted 29 July, 2021; originally announced July 2021.

  7. Insights from Classifying Visual Concepts with Multiple Kernel Learning

    Authors: Alexander Binder, Shinichi Nakajima, Marius Kloft, Christina Müller, Wojciech Samek, Ulf Brefeld, Klaus-Robert Müller, Motoaki Kawanabe

    Abstract: Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques allow to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unf…

    Submitted 15 December, 2011; originally announced December 2011.

    Comments: 18 pages, 8 tables, 4 figures, format deviating from plos one submission format requirements for aesthetic reasons

    Journal ref: PLoS ONE 7(8): e38897, 2012

  8. arXiv:0912.1128  [pdf, ps, other]

    stat.ML cs.LG

    How to Explain Individual Classification Decisions

    Authors: David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen, Klaus-Robert Mueller

    Abstract: After building a classifier with modern tools of machine learning we typically have a black box at hand that is able to predict well for unseen data. Thus, we get an answer to the question what is the most likely label of a given unseen data point. However, most methods will provide no answer why the model predicted the particular label for a single instance and what features were most influenti…

    Submitted 6 December, 2009; originally announced December 2009.

    Comments: 31 pages, 14 figures