Showing 1–5 of 5 results for author: Kameko, H

  1. arXiv:2404.03161  [pdf, other]

    cs.CV cs.CL cs.MM

    BioVL-QR: Egocentric Biochemical Video-and-Language Dataset Using Micro QR Codes

    Authors: Taichi Nishimura, Koki Yamamoto, Yuto Haneji, Keiya Kajimura, Chihiro Nishiwaki, Eriko Daikoku, Natsuko Okuda, Fumihito Ono, Hirotaka Kameko, Shinsuke Mori

    Abstract: This paper introduces a biochemical vision-and-language dataset, which consists of 24 egocentric experiment videos, corresponding protocols, and video-and-language alignments. The key challenge in the wet-lab domain is that detecting equipment, reagents, and containers is difficult because the lab environment is cluttered with objects on the table and some objects are indistinguishable. Therefore…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 6 pages

  2. arXiv:2403.16483  [pdf, other]

    cs.CL

    Automatic Construction of a Large-Scale Corpus for Geoparsing Using Wikipedia Hyperlinks

    Authors: Keyaki Ohno, Hirotaka Kameko, Keisuke Shirai, Taichi Nishimura, Shinsuke Mori

    Abstract: Geoparsing is the task of estimating the latitude and longitude (coordinates) of location expressions in texts. Geoparsing must deal with the ambiguity of the expressions that indicate multiple locations with the same notation. For evaluating geoparsing systems, several corpora have been proposed in previous work. However, these corpora are small-scale and suffer from the coverage of location expr…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: LREC-COLING 2024

  3. arXiv:2305.19497  [pdf, other]

    cs.CL

    Towards Flow Graph Prediction of Open-Domain Procedural Texts

    Authors: Keisuke Shirai, Hirotaka Kameko, Shinsuke Mori

    Abstract: Machine comprehension of procedural texts is essential for reasoning about the steps and automating the procedures. However, this requires identifying entities within a text and resolving the relationships between the entities. Previous work focused on the cooking domain and proposed a framework to convert a recipe text into a flow graph (FG) representation. In this work, we propose a framework ba…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: RepL4NLP 2023

  4. arXiv:2209.10134  [pdf, other]

    cs.MM cs.CL cs.CV

    Recipe Generation from Unsegmented Cooking Videos

    Authors: Taichi Nishimura, Atsushi Hashimoto, Yoshitaka Ushiku, Hirotaka Kameko, Shinsuke Mori

    Abstract: This paper tackles recipe generation from unsegmented cooking videos, a task that requires agents to (1) extract key events in completing the dish and (2) generate sentences for the extracted events. Our task is similar to dense video captioning (DVC), which aims at detecting events thoroughly and generating sentences for them. However, unlike DVC, in recipe generation, recipe story awareness is c…

    Submitted 18 February, 2024; v1 submitted 21 September, 2022; originally announced September 2022.

    Comments: Accepted at ACM TOMM; ACM Transactions on Multimedia Computing, Communications, and Applications

  5. arXiv:2209.05840  [pdf, other]

    cs.CL cs.AI

    Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows

    Authors: Keisuke Shirai, Atsushi Hashimoto, Taichi Nishimura, Hirotaka Kameko, Shuhei Kurita, Yoshitaka Ushiku, Shinsuke Mori

    Abstract: We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn each cooking action result in a recipe text. The dataset consists of object state changes and the workflow of the recipe text. The state change is represented as an image pair, while the workflow is represented as a recipe flow graph (r-FG). The image pairs are grounded in the r-FG, which provides the cross-mo…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: COLING 2022