Arukikata Travelogue Dataset with Geographic Entity Mention, Coreference, and Link Annotation
Authors:
Shohei Higashiyama,
Hiroki Ouchi,
Hiroki Teranishi,
Hiroyuki Otomo,
Yusuke Ide,
Aitaro Yamamoto,
Hiroyuki Shindo,
Yuki Matsuda,
Shoko Wakamiya,
Naoya Inoue,
Ikuya Yamada,
Taro Watanabe
Abstract:
Geoparsing is a fundamental technique for analyzing geo-entity information in text. We focus on document-level geoparsing, which considers geographic relatedness among geo-entity mentions, and presents a Japanese travelogue dataset designed for evaluating document-level geoparsing systems. Our dataset comprises 200 travelogue documents with rich geo-entity information: 12,171 mentions, 6,339 coref…
▽ More
Geoparsing is a fundamental technique for analyzing geo-entity information in text. We focus on document-level geoparsing, which considers geographic relatedness among geo-entity mentions, and presents a Japanese travelogue dataset designed for evaluating document-level geoparsing systems. Our dataset comprises 200 travelogue documents with rich geo-entity information: 12,171 mentions, 6,339 coreference clusters, and 2,551 geo-entities linked to geo-database entries.
△ Less
Submitted 23 May, 2023;
originally announced May 2023.