Skip to main content

Showing 1–1 of 1 results for author: Gavryushin, A

.
  1. arXiv:2301.09209  [pdf, other

    cs.CV cs.CL

    Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation

    Authors: Razvan-George Pasca, Alexey Gavryushin, Muhammad Hamza, Yen-Ling Kuo, Kaichun Mo, Luc Van Gool, Otmar Hilliges, Xi Wang

    Abstract: We study object interaction anticipation in egocentric videos. This task requires an understanding of the spatio-temporal context formed by past actions on objects, coined action context. We propose TransFusion, a multimodal transformer-based architecture. It exploits the representational power of language by summarizing the action context. TransFusion leverages pre-trained image captioning and vi… ▽ More

    Submitted 10 March, 2024; v1 submitted 22 January, 2023; originally announced January 2023.