Showing 1–15 of 15 results for author: Kanehira, A

Searching in archive cs.
  1. arXiv:2403.02316  [pdf, other]

    cs.RO

    Designing Library of Skill-Agents for Hardware-Level Reusability

    Authors: Jun Takamatsu, Daichi Saito, Katsushi Ikeuchi, Atsushi Kanehira, Kazuhiro Sasabuchi, Naoki Wake

    Abstract: To use new robot hardware in a new environment, it is necessary to develop a control program tailored to that specific robot in that environment. Considering the reusability of software among robots is crucial to minimize the effort involved in this process and maximize software reuse across different robots in different environments. This paper proposes a method to remedy this process by consider…

    Submitted 20 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  2. arXiv:2311.12015  [pdf, other]

    cs.RO cs.CL cs.CV

    GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: We introduce a pipeline that enhances a general-purpose Vision Language Model, GPT-4V(ision), to facilitate one-shot visual teaching for robotic manipulation. This system analyzes videos of humans performing tasks and outputs executable robot programs that incorporate insights into affordances. The process begins with GPT-4V analyzing the videos to obtain textual explanations of environmental and…

    Submitted 6 May, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: 9 pages, 12 figures, 2 tables. Last updated on May 6th, 2024

  3. arXiv:2311.11007  [pdf, other]

    cs.RO

    Constraint-aware Policy for Compliant Manipulation

    Authors: Daichi Saito, Kazuhiro Sasabuchi, Naoki Wake, Atsushi Kanehira, Jun Takamatsu, Hideki Koike, Katsushi Ikeuchi

    Abstract: Robot manipulation in a physically-constrained environment requires compliant manipulation. Compliant manipulation is a manipulation skill to adjust hand motion based on the force imposed by the environment. Recently, reinforcement learning (RL) has been applied to solve household operations involving compliant manipulation. However, previous RL methods have primarily focused on designing a policy…

    Submitted 18 November, 2023; originally announced November 2023.

  4. arXiv:2310.11753  [pdf, other]

    cs.RO cs.CL

    Bias in Emotion Recognition with ChatGPT

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: This technical report explores the ability of ChatGPT in recognizing emotions from text, which can be the basis of various applications like interactive chatbots, data annotation, and mental health analysis. While prior research has shown ChatGPT's basic ability in sentiment analysis, its performance in more nuanced emotion recognition is not yet explored. Here, we conducted experiments to evaluat…

    Submitted 4 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: 5 pages, 4 figures, 6 tables

  5. arXiv:2306.01741  [pdf, other]

    cs.RO cs.CL

    GPT Models Meet Robotic Applications: Co-Speech Gesturing Chat System

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: This technical paper introduces a chatting robot system that utilizes recent advancements in large-scale language models (LLMs) such as GPT-3 and ChatGPT. The system is integrated with a co-speech gesture generation system, which selects appropriate gestures based on the conceptual meaning of speech. Our motivation is to explore ways of utilizing the recent progress in LLMs for practical robotic a…

    Submitted 10 May, 2023; originally announced June 2023.

  6. ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: This paper demonstrates how OpenAI's ChatGPT can be used in a few-shot setting to convert natural language instructions into a sequence of executable robot actions. The paper proposes easy-to-customize input prompts for ChatGPT that meet common requirements in practical applications, such as easy integration with robot execution systems and applicability to various environments while minimizing th…

    Submitted 29 August, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: 21 figures, 7 tables. Published in IEEE Access (in press). Last updated August 29th, 2023

  7. arXiv:2301.01382  [pdf, other]

    cs.RO

    Task-sequencing Simulator: Integrated Machine Learning to Execution Simulation for Robot Manipulation

    Authors: Kazuhiro Sasabuchi, Daichi Saito, Atsushi Kanehira, Naoki Wake, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: A task-sequencing simulator in robotics manipulation to integrate simulation-for-learning and simulation-for-execution is introduced. Unlike existing machine-learning simulation where a non-decomposed simulation is used to simulate a training scenario, the task-sequencing simulator runs a composed simulation using building blocks. This way, the simulation-for-learning is structured similarly to a…

    Submitted 3 January, 2023; originally announced January 2023.

    Comments: 7 pages, 6 figures

  8. Interactive Task Encoding System for Learning-from-Observation

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: We present the Interactive Task Encoding System (ITES) for teaching robots to perform manipulative tasks. ITES is designed as an input system for the Learning-from-Observation (LfO) framework, which enables household robots to be programmed using few-shot human demonstrations without the need for coding. In contrast to previous LfO systems that rely solely on visual demonstrations, ITES leverages…

    Submitted 28 April, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: 6 pages, 9 figures. Submitted to and accepted by 2023 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM). Last updated April 28th, 2023

  9. arXiv:2212.09242  [pdf, other]

    cs.RO

    Learning-from-Observation System Considering Hardware-Level Reusability

    Authors: Jun Takamatsu, Kazuhiro Sasabuchi, Naoki Wake, Atsushi Kanehira, Katsushi Ikeuchi

    Abstract: Robot developers develop various types of robots for satisfying users' various demands. Users' demands are related to their backgrounds and robots suitable for users may vary. If a certain developer would offer a robot that is different from the usual to a user, the robot-specific software has to be changed. On the other hand, robot-software developers would like to reuse their developed software…

    Submitted 18 December, 2022; originally announced December 2022.

    Comments: 5 pages, 4 figures

  10. arXiv:2106.04555  [pdf, other]

    cs.CV

    Hierarchical Lovász Embeddings for Proposal-free Panoptic Segmentation

    Authors: Tommi Kerola, Jie Li, Atsushi Kanehira, Yasunori Kudo, Alexis Vallet, Adrien Gaidon

    Abstract: Panoptic segmentation brings together two separate tasks: instance and semantic segmentation. Although they are related, unifying them faces an apparent paradox: how to learn simultaneously instance-specific and category-specific (i.e. instance-agnostic) representations jointly. Hence, state-of-the-art panoptic segmentation methods use complex models with a distinct stream for each task. In contra…

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: 13 pages, 9 figures, including supplementary material. To be published in CVPR 2021

  11. arXiv:1812.01280  [pdf, other]

    cs.CV

    Learning to Explain with Complemental Examples

    Authors: Atsushi Kanehira, Tatsuya Harada

    Abstract: This paper addresses the generation of explanations with visual examples. Given an input sample, we build a system that not only classifies it to a specific category, but also outputs linguistic explanations and a set of visual examples that render the decision interpretable. Focusing especially on the complementarity of the multimodal information, i.e., linguistic and visual examples, we attempt…

    Submitted 20 May, 2019; v1 submitted 4 December, 2018; originally announced December 2018.

    Comments: Camera ready version of CVPR'19

  12. arXiv:1812.01263  [pdf, other]

    cs.CV

    Multimodal Explanations by Predicting Counterfactuality in Videos

    Authors: Atsushi Kanehira, Kentaro Takemoto, Sho Inayoshi, Tatsuya Harada

    Abstract: This study addresses generating counterfactual explanations with multimodal information. Our goal is not only to classify a video into a specific category, but also to provide explanations on why it is not categorized to a specific class with combinations of visual-linguistic information. Requirements that the expected output should satisfy are referred to as counterfactuality in this paper: (1) C…

    Submitted 19 May, 2019; v1 submitted 4 December, 2018; originally announced December 2018.

    Comments: Camera ready version of CVPR'19

  13. arXiv:1804.02843  [pdf, other]

    cs.CV

    Viewpoint-aware Video Summarization

    Authors: Atsushi Kanehira, Luc Van Gool, Yoshitaka Ushiku, Tatsuya Harada

    Abstract: This paper introduces a novel variant of video summarization, namely building a summary that depends on the particular aspect of a video the viewer focuses on. We refer to this as $\textit{viewpoint}$. To infer what the desired $\textit{viewpoint}$ may be, we assume that several other videos are available, especially groups of videos, e.g., as folders on a person's phone or laptop. The semantic si…

    Submitted 10 April, 2018; v1 submitted 9 April, 2018; originally announced April 2018.

    Comments: to appear at CVPR 2018

  14. arXiv:1511.06783  [pdf, ps, other]

    cs.CV

    Recognizing Activities of Daily Living with a Wrist-mounted Camera

    Authors: Katsunori Ohnishi, Atsushi Kanehira, Asako Kanezaki, Tatsuya Harada

    Abstract: We present a novel dataset and a novel algorithm for recognizing activities of daily living (ADL) from a first-person wearable camera. Handled objects are crucially important for egocentric ADL recognition. For specific examination of objects related to users' actions separately from other objects in an environment, many previous works have addressed the detection of handled objects in images capt…

    Submitted 28 April, 2016; v1 submitted 20 November, 2015; originally announced November 2015.

    Comments: CVPR2016 spotlight presentation

  15. arXiv:1502.06064  [pdf, ps, other]

    stat.ML cs.LG cs.MS

    MILJS: Brand New JavaScript Libraries for Matrix Calculation and Machine Learning

    Authors: Ken Miura, Tetsuaki Mano, Atsushi Kanehira, Yuichiro Tsuchiya, Tatsuya Harada

    Abstract: MILJS is a collection of state-of-the-art, platform-independent, scalable, fast JavaScript libraries for matrix calculation and machine learning. Our core library offering a matrix calculation is called Sushi, which exhibits far better performance than any other leading machine learning libraries written in JavaScript. Especially, our matrix multiplication is 177 times faster than the fastest Java…

    Submitted 20 February, 2015; originally announced February 2015.