-
Evaluating Image Review Ability of Vision Language Models
Authors:
Shigeki Saito,
Kazuki Hayashi,
Yusuke Ide,
Yusuke Sakai,
Kazuma Onishi,
Toma Suzuki,
Seiji Gobara,
Hidetaka Kamigaito,
Katsuhiko Hayashi,
Taro Watanabe
Abstract:
Large-scale vision language models (LVLMs) are language models that can process both image and text inputs within a single model. This paper explores the use of LVLMs to generate review texts for images. The ability of LVLMs to review images is not fully understood, highlighting the need for a methodical evaluation of their review abilities. Unlike image captions, review texts can be written from various perspectives such as image composition and exposure. This diversity of review perspectives makes it difficult to uniquely determine a single correct review for an image. To address this challenge, we introduce an evaluation method based on rank correlation analysis, in which review texts are ranked by humans and by LVLMs, and the correlation between these rankings is then measured. We further validate this approach by creating a benchmark dataset aimed at assessing the image review ability of recent LVLMs. Our experiments with the dataset reveal that LVLMs, particularly those with proven superiority in other evaluative contexts, excel at distinguishing between high-quality and substandard image reviews.
Submitted 19 February, 2024;
originally announced February 2024.
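The rank-correlation evaluation described in the abstract above can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient is used, so Spearman's rho is an assumption here, and the function name `spearman_rho` and the example ranks are hypothetical.

```python
from typing import Sequence


def spearman_rho(rank_a: Sequence[int], rank_b: Sequence[int]) -> float:
    """Spearman rank correlation for two tie-free rankings of the same items."""
    if len(rank_a) != len(rank_b):
        raise ValueError("rankings must cover the same number of items")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two items to correlate")
    # Sum of squared rank differences, item by item.
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical ranks (1 = best) given to five review texts for one image.
human_ranks = [1, 2, 3, 4, 5]
model_ranks = [2, 1, 3, 5, 4]  # assumed model output, not data from the paper
rho = spearman_rho(human_ranks, model_ranks)
```

A value of `rho` close to 1 would mean the model orders the reviews much as the human annotators do. The closed-form formula used here is only valid when neither ranking contains ties.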
-
Japanese Lexical Complexity for Non-Native Readers: A New Dataset
Authors:
Yusuke Ide,
Masato Mita,
Adam Nohejl,
Hiroki Ouchi,
Taro Watanabe
Abstract:
Lexical complexity prediction (LCP) is the task of predicting the complexity of words in a text on a continuous scale. It plays a vital role in simplifying or annotating complex words to assist readers. To study lexical complexity in Japanese, we construct the first Japanese LCP dataset. Our dataset provides separate complexity scores for Chinese/Korean annotators and others to address the readers' L1-specific needs. In the baseline experiment, we demonstrate the effectiveness of a BERT-based system for Japanese LCP.
Submitted 30 June, 2023;
originally announced June 2023.
-
Arukikata Travelogue Dataset with Geographic Entity Mention, Coreference, and Link Annotation
Authors:
Shohei Higashiyama,
Hiroki Ouchi,
Hiroki Teranishi,
Hiroyuki Otomo,
Yusuke Ide,
Aitaro Yamamoto,
Hiroyuki Shindo,
Yuki Matsuda,
Shoko Wakamiya,
Naoya Inoue,
Ikuya Yamada,
Taro Watanabe
Abstract:
Geoparsing is a fundamental technique for analyzing geo-entity information in text. We focus on document-level geoparsing, which considers geographic relatedness among geo-entity mentions, and present a Japanese travelogue dataset designed for evaluating document-level geoparsing systems. Our dataset comprises 200 travelogue documents with rich geo-entity information: 12,171 mentions, 6,339 coreference clusters, and 2,551 geo-entities linked to geo-database entries.
Submitted 23 May, 2023;
originally announced May 2023.
-
Combinatorial and approximative analyses in a spatially random division process
Authors:
Yukio Hayashi,
Takayuki Komaki,
Yusuke Ide,
Takuya Machida,
Norio Konno
Abstract:
Fat-tailed frequency distributions are commonly observed for spatial characteristics such as the fragment size and mass of glass, the areas enclosed by city roads, and the pore size or volume in random packings. To provide a new analytical approach to these distributions, we consider a simple model that constructs a fractal-like hierarchical network based on random divisions of rectangles. The stochastic process forms a Markov chain and corresponds to directional random walks with splitting into four particles. We derive a combinatorial analytical form and its continuous approximation for the distribution of rectangle areas, and numerically show a good fit with the actual distribution in the averaging behavior of the divisions.
Submitted 31 January, 2013; v1 submitted 10 January, 2013;
originally announced January 2013.
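The random division of rectangles mentioned in the abstract above can be sketched as a short simulation. The abstract does not give the selection or splitting rule, so this sketch assumes that a rectangle is picked uniformly at random and cut into four by one horizontal and one vertical line through a uniformly chosen interior point. The function name `random_division` and all parameters are hypothetical.

```python
import random
from typing import List, Tuple

Rect = Tuple[float, float]  # (width, height)


def random_division(steps: int, seed: int = 0) -> List[Rect]:
    """Repeatedly split a randomly chosen rectangle into four smaller ones."""
    rng = random.Random(seed)
    rects: List[Rect] = [(1.0, 1.0)]  # start from the unit square
    for _ in range(steps):
        # Assumption: the rectangle to divide is chosen uniformly at random.
        w, h = rects.pop(rng.randrange(len(rects)))
        # Assumption: the cut point is uniform over the chosen rectangle.
        x = rng.uniform(0.0, w)
        y = rng.uniform(0.0, h)
        rects.extend([(x, y), (w - x, y), (x, h - y), (w - x, h - y)])
    return rects


rects = random_division(200)
# Areas sorted from largest to smallest, ready for a histogram or rank plot.
areas = sorted((w * h for w, h in rects), reverse=True)
```

Each step removes one rectangle and adds four, so `steps` divisions leave `1 + 3 * steps` rectangles whose areas sum to the area of the initial square.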
-
Spectral Properties of the Threshold Network Model
Authors:
Yusuke Ide,
Norio Konno,
Nobuaki Obata
Abstract:
We study the spectral distribution of the threshold network model. The results contain an explicit description and its asymptotic behaviour.
Submitted 31 December, 2009;
originally announced January 2010.