Showing 1–50 of 120 results for author: Nakashima, Y

  1. arXiv:2406.17760  [pdf, other]

    hep-ph

    Dark photon pair production via off-shell dark Higgs at FASER

    Authors: Takeshi Araki, Kento Asai, Yohei Nakashima, Takashi Shimomura

    Abstract: We consider a dark photon model in which the dark U(1) gauge symmetry is spontaneously broken by a vacuum expectation value of a new scalar boson. We focus on the ForwArd Search ExpeRiment (FASER) and calculate its sensitivity to the dark photon produced from the off-shell decay of the new scalar boson. It is found that the off-shell production extends the sensitivity region beyond the kinematical…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 19 pages, 4 figures, 1 table

    Report number: UME-PP-028, KYUSHU-HET-288

  2. arXiv:2406.13912  [pdf, other]

    cs.CV

    From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment

    Authors: Yusuke Hirota, Ryo Hachiuma, Chao-Han Huck Yang, Yuta Nakashima

    Abstract: Large language models (LLMs) have enhanced the capacity of vision-language models to caption visual text. This generative approach to image caption enrichment further makes textual captions more descriptive, improving alignment with the visual context. However, while many studies focus on benefits of generative caption enrichment (GCE), are there any negative side effects? We compare standard-form…

    Submitted 19 June, 2024; originally announced June 2024.

  3. arXiv:2406.09884  [pdf, other]

    cs.MM cs.CL cs.SI

    Enhancing Fake News Detection in Social Media via Label Propagation on Cross-modal Tweet Graph

    Authors: Wanqing Zhao, Yuta Nakashima, Haiyuan Chen, Noboru Babaguchi

    Abstract: Fake news detection in social media has become increasingly important due to the rapid proliferation of personal media channels and the consequential dissemination of misleading information. Existing methods, which primarily rely on multimodal features and graph-based techniques, have shown promising performance in detecting fake news. However, they still face a limitation, i.e., sparsity in graph…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9 pages

  4. SwipeGANSpace: Swipe-to-Compare Image Generation via Efficient Latent Space Exploration

    Authors: Yuto Nakashima, Mingzhe Yang, Yukino Baba

    Abstract: Generating preferred images using generative adversarial networks (GANs) is challenging owing to the high-dimensional nature of latent space. In this study, we propose a novel approach that uses simple user-swipe interactions to generate preferred images for users. To effectively explore the latent space with only swipe interactions, we apply principal component analysis to the latent space of the…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 11 pages, 13 figures

  5. arXiv:2404.03242  [pdf, other]

    cs.CV

    Would Deep Generative Models Amplify Bias in Future Models?

    Authors: Tianwei Chen, Yusuke Hirota, Mayu Otani, Noa Garcia, Yuta Nakashima

    Abstract: We investigate the impact of deep generative models on potential social biases in upcoming computer vision models. As the internet witnesses an increasing influx of AI-generated images, concerns arise regarding inherent biases that may accompany them, potentially leading to the dissemination of harmful content. This paper explores whether a detrimental feedback loop, resulting in bias amplificatio…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted to CVPR 2024

  6. arXiv:2402.19223  [pdf, other]

    cs.DS

    Edit and Alphabet-Ordering Sensitivity of Lex-parse

    Authors: Yuto Nakashima, Dominik Köppl, Mitsuru Funakoshi, Shunsuke Inenaga, Hideo Bannai

    Abstract: We investigate the compression sensitivity [Akagi et al., 2023] of lex-parse [Navarro et al., 2021] for two operations: (1) single character edit and (2) modification of the alphabet ordering, and give tight upper and lower bounds for both operations. For both lower bounds, we use the family of Fibonacci words. For the bounds on edit operations, our analysis makes heavy use of properties of the Ly…

    Submitted 12 May, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  7. arXiv:2402.19146  [pdf, other]

    cs.DS

    Computing Longest Common Subsequence under Cartesian-Tree Matching Model

    Authors: Taketo Tsujimoto, Hiroki Shibata, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga

    Abstract: Two strings of the same length are said to Cartesian-tree match (CT-match) if their Cartesian-trees are isomorphic [Park et al., TCS 2020]. Cartesian-tree matching is a natural model that allows for capturing similarities of numerical sequences. Oizumi et al. [CPM 2022] showed that subsequence pattern matching under CT-matching model can be solved in polynomial time. This current article follows a…

    Submitted 5 May, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  8. arXiv:2401.12464  [pdf, other]

    cs.HC cs.RO

    Estimation of posture and joint angle of human body using foot pressure distribution: Morphological computation with human foot

    Authors: Yo Kobayashi, Yasutaka Nakashima

    Abstract: This paper proposes a novel contact and wearable sensing system for estimating the upper body posture and joint angles (ankle, knee, and hip) of the human body using foot pressure distribution information obtained from a sensor attached to the plantar region. In the proposed estimation method, sensors are installed only on the plantar region, which is the end of the human body and the point of con…

    Submitted 22 January, 2024; originally announced January 2024.

  9. arXiv:2312.03027  [pdf, other]

    cs.CV

    Stable Diffusion Exposed: Gender Bias from Prompt to Image

    Authors: Yankun Wu, Yuta Nakashima, Noa Garcia

    Abstract: Recent studies have highlighted biases in generative models, shedding light on their predisposition towards gender-based stereotypes and imbalances. This paper contributes to this growing body of research by introducing an evaluation protocol designed to automatically analyze the impact of gender indicators on Stable Diffusion images. Leveraging insights from prior work, we explore how gender indi…

    Submitted 5 December, 2023; originally announced December 2023.

  10. arXiv:2311.18345  [pdf]

    cs.CY

    Situating the social issues of image generation models in the model life cycle: a sociotechnical approach

    Authors: Amelia Katirai, Noa Garcia, Kazuki Ide, Yuta Nakashima, Atsuo Kishimoto

    Abstract: The race to develop image generation models is intensifying, with a rapid increase in the number of text-to-image models available. This is coupled with growing public awareness of these technologies. Though other generative AI models--notably, large language models--have received recent critical attention for the social and other non-technical issues they raise, there has been relatively little c…

    Submitted 30 November, 2023; originally announced November 2023.

  11. arXiv:2311.03648  [pdf, other]

    cs.CV

    Instruct Me More! Random Prompting for Visual In-Context Learning

    Authors: Jiahao Zhang, Bowen Wang, Liangzhi Li, Yuta Nakashima, Hajime Nagahara

    Abstract: Large-scale models trained on extensive datasets have emerged as the preferred approach due to their high generalizability across various tasks. In-context learning (ICL), a popular strategy in natural language processing, uses such models for different tasks by providing instructive prompts but without updating model parameters. This idea is now being explored in computer vision, where an input-…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted at WACV 2024

  12. arXiv:2309.16162  [pdf, other]

    cs.HC

    ACT2G: Attention-based Contrastive Learning for Text-to-Gesture Generation

    Authors: Hitoshi Teshima, Naoki Wake, Diego Thomas, Yuta Nakashima, Hiroshi Kawasaki, Katsushi Ikeuchi

    Abstract: The recent increase in remote work, online meetings, and tele-operation tasks has made people realize that gestures for avatars and communication robots are more important than previously thought. Gestures are one of the key factors in achieving smooth and natural communication between humans and AI systems and have been intensively researched. Current gesture generation methods are mostly based on deep neural network using…

    Submitted 28 September, 2023; originally announced September 2023.

  13. arXiv:2309.15523  [pdf, other]

    cs.CV

    Improving Facade Parsing with Vision Transformers and Line Integration

    Authors: Bowen Wang, Jiaxing Zhang, Ran Zhang, Yunqin Li, Liangzhi Li, Yuta Nakashima

    Abstract: Facade parsing stands as a pivotal computer vision task with far-reaching applications in areas like architecture, urban planning, and energy efficiency. Despite the recent success of deep learning-based methods in yielding impressive results on certain open-source datasets, their viability for real-world applications remains uncertain. Real-world scenarios are considerably more intricate, demandi…

    Submitted 6 October, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: 13 pages, 7 figures, 9 tables

  14. arXiv:2308.02269  [pdf, other]

    cs.DS cs.FL cs.IR

    Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph

    Authors: Hiroki Arimura, Shunsuke Inenaga, Yasuaki Kobayashi, Yuto Nakashima, Mizuki Sue

    Abstract: In this paper, we present the first study of the computational complexity of converting an automata-based text index structure, called the Compact Directed Acyclic Word Graph (CDAWG), of size $e$ for a text $T$ of length $n$ into other text indexing structures for the same text, suitable for highly repetitive texts: the run-length BWT of size $r$, the irreducible PLCP array of size $r$, and the qu…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: The short version of this paper will appear in SPIRE 2023, Pisa, Italy, September 26-28, 2023, Lecture Notes in Computer Science, Springer

  15. arXiv:2307.07676  [pdf, other]

    cs.DS

    Computing SEQ-IC-LCS of Labeled Graphs

    Authors: Yuki Yonemoto, Yuto Nakashima, Shunsuke Inenaga

    Abstract: We consider labeled directed graphs where each vertex is labeled with a non-empty string. Such labeled graphs are also known as non-linear texts in the literature. In this paper, we introduce a new problem of comparing two given labeled graphs, called the SEQ-IC-LCS problem on labeled graphs. The goal of SEQ-IC-LCS is to compute the length of the longest common subsequence (LCS) $Z$ of two tar…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: Accepted for PSC 2023

  16. arXiv:2307.01967  [pdf, other]

    cs.DS

    Linear-time computation of generalized minimal absent words for multiple strings

    Authors: Kouta Okabe, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai

    Abstract: A string $w$ is called a minimal absent word (MAW) for a string $S$ if $w$ does not occur as a substring in $S$ and all proper substrings of $w$ occur in $S$. MAWs are well-studied combinatorial string objects that have potential applications in areas including bioinformatics, musicology, and data compression. In this paper, we generalize the notion of MAWs to a set…

    Submitted 28 July, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

    Comments: Accepted for SPIRE 2023

  17. Not Only Generative Art: Stable Diffusion for Content-Style Disentanglement in Art Analysis

    Authors: Yankun Wu, Yuta Nakashima, Noa Garcia

    Abstract: The duality of content and style is inherent to the nature of art. For humans, these two elements are clearly different: content refers to the objects and concepts in the piece of art, and style to the way it is expressed. This duality poses an important challenge for computer vision. The visual appearance of objects and concepts is modulated by the style that may reflect the author's emotions, so…

    Submitted 20 April, 2023; originally announced April 2023.

  18. arXiv:2304.10131  [pdf, other]

    cs.CV

    Learning Bottleneck Concepts in Image Classification

    Authors: Bowen Wang, Liangzhi Li, Yuta Nakashima, Hajime Nagahara

    Abstract: Interpreting and explaining the behavior of deep neural networks is critical for many tasks. Explainable AI provides a way to address this challenge, mostly by providing per-pixel relevance to the decision. Yet, interpreting such explanations may require expert knowledge. Some recent attempts toward interpretability adopt a concept-based framework, giving a higher-level relationship between some c…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted in CVPR 2023

  19. arXiv:2304.03693  [pdf, other]

    cs.CV

    Model-Agnostic Gender Debiased Image Captioning

    Authors: Yusuke Hirota, Yuta Nakashima, Noa Garcia

    Abstract: Image captioning models are known to perpetuate and amplify harmful societal bias in the training set. In this work, we aim to mitigate such gender bias in image captioning models. While prior work has addressed this problem by forcing models to focus on people to reduce gender misclassification, it conversely generates gender-stereotypical words at the expense of predicting the correct gender. Fr…

    Submitted 21 December, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  20. arXiv:2304.02828  [pdf, other]

    cs.CV cs.CY

    Uncurated Image-Text Datasets: Shedding Light on Demographic Bias

    Authors: Noa Garcia, Yusuke Hirota, Yankun Wu, Yuta Nakashima

    Abstract: The increasing tendency to collect large and uncurated datasets to train vision-and-language models has raised concerns about fair representations. It is known that even small but manually annotated datasets, such as MSCOCO, are affected by societal bias. This problem, far from being solved, may be getting worse with data crawled from the Internet without much control. In addition, the lack of too…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  21. arXiv:2304.01816  [pdf, other]

    cs.CV

    Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation

    Authors: Mayu Otani, Riku Togashi, Yu Sawai, Ryosuke Ishigami, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Shin'ichi Satoh

    Abstract: Human evaluation is critical for validating the performance of text-to-image generative models, as this highly cognitive process requires deep comprehension of text and images. However, our survey of 37 recent papers reveals that many works rely solely on automatic measures (e.g., FID) or perform poorly described human evaluations that are not reliable or repeatable. This paper proposes a standard…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  22. arXiv:2303.01726  [pdf, other]

    cs.DS

    Tight bounds for the sensitivity of CDAWGs with left-end edits

    Authors: Hiroto Fujimaru, Yuto Nakashima, Shunsuke Inenaga

    Abstract: Compact directed acyclic word graphs (CDAWGs) [Blumer et al. 1987] are a fundamental data structure on strings with applications in text pattern searching, data compression, and pattern discovery. Intuitively, the CDAWG of a string $T$ is obtained by merging isomorphic subtrees of the suffix tree [Weiner 1973] of the same string $T$, thus CDAWGs are a compact indexing structure. In this paper, we…

    Submitted 15 March, 2024; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: This is a full version of the paper that appeared in the proceedings of WORDS 2023

  23. arXiv:2302.02586  [pdf, other]

    cs.DS

    Optimal LZ-End Parsing is Hard

    Authors: Hideo Bannai, Mitsuru Funakoshi, Kazuhiro Kurita, Yuto Nakashima, Kazuhisa Seto, Takeaki Uno

    Abstract: LZ-End is a variant of the well-known Lempel-Ziv parsing family such that each phrase of the parsing has a previous occurrence, with the additional constraint that the previous occurrence must end at the end of a previous phrase. LZ-End was initially proposed as a greedy parsing, where each phrase is determined greedily from left to right, as the longest factor that satisfies the above constraint…

    Submitted 6 February, 2023; originally announced February 2023.

  24. arXiv:2301.13356  [pdf, other]

    cs.CV cs.LG

    Inference Time Evidences of Adversarial Attacks for Forensic on Transformers

    Authors: Hugo Lemarchant, Liangzi Li, Yiming Qian, Yuta Nakashima, Hajime Nagahara

    Abstract: Vision Transformers (ViTs) are becoming a very popular paradigm for vision tasks as they achieve state-of-the-art performance on image classification. However, although early works implied that this network structure had increased robustness against adversarial attacks, some works argue ViTs are still vulnerable. This paper presents our first attempt toward detecting adversarial attacks during inf…

    Submitted 30 January, 2023; originally announced January 2023.

  25. arXiv:2211.10056  [pdf, other]

    cs.CV

    Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization

    Authors: Zongshang Pang, Yuta Nakashima, Mayu Otani, Hajime Nagahara

    Abstract: Video summarization aims to select the most informative subset of frames in a video to facilitate efficient video browsing. Unsupervised methods usually rely on heuristic training objectives such as diversity and representativeness. However, such methods need to bootstrap the online-generated summaries to compute the objectives for importance score regression. We consider such a pipeline inefficie…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: To appear in WACV2023

  26. Faster Space-Efficient STR-IC-LCS Computation

    Authors: Yuki Yonemoto, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai

    Abstract: One of the most fundamental methods for comparing two given strings $A$ and $B$ is the longest common subsequence (LCS), where the task is to find (the length of) an LCS of $A$ and $B$. In this paper, we deal with the STR-IC-LCS problem which is one of the constrained LCS problems proposed by Chen and Chao [J. Comb. Optim, 2011]. A string $Z$ is said to be an STR-IC-LCS of three given strings $A$,…

    Submitted 19 May, 2024; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: This is a full version of "Space-Efficient STR-IC-LCS Computation" presented at SOFSEM 2023

    Journal ref: Theoretical Computer Science, Volume 1003, 1 July 2024, 114607

  27. arXiv:2210.06790  [pdf, other]

    cs.RO cs.MM

    Deep Gesture Generation for Social Robots Using Type-Specific Libraries

    Authors: Hitoshi Teshima, Naoki Wake, Diego Thomas, Yuta Nakashima, Hiroshi Kawasaki, Katsushi Ikeuchi

    Abstract: Body language such as conversational gesture is a powerful way to ease communication. Conversational gestures do not only make a speech more lively but also contain semantic meaning that helps to stress important information in the discussion. In the field of robotics, giving conversational agents (humanoid robots or virtual avatars) the ability to properly use gestures is critical, yet remain a t…

    Submitted 13 October, 2022; originally announced October 2022.

  28. arXiv:2210.02067  [pdf, other]

    cs.DS

    Computing maximal generalized palindromes

    Authors: Mitsuru Funakoshi, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Abstract: Palindromes are popular and important objects in textual data processing, bioinformatics, and combinatorics on words. Let $S = XaY$ be a string, where $X$ and $Y$ are of the same length and $a$ is either a single character or the empty string. Then, there exist two alternative definitions for palindromes: $S$ is said to be a palindrome if: Reversal-based definition: $S$ is equal to its reversal…

    Submitted 17 November, 2022; v1 submitted 5 October, 2022; originally announced October 2022.

  29. arXiv:2208.10758  [pdf, other]

    cs.CV cs.AI

    Learning More May Not Be Better: Knowledge Transferability in Vision and Language Tasks

    Authors: Tianwei Chen, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Hajime Nagahara

    Abstract: Is more data always better to train vision-and-language models? We study knowledge transferability in multi-modal tasks. The current tendency in machine learning is to assume that by joining multiple datasets from different tasks their overall performance will improve. However, we show that not all the knowledge transfers well or has a positive impact on related tasks, even when they share a commo…

    Submitted 23 August, 2022; originally announced August 2022.

  30. Gender and Racial Bias in Visual Question Answering Datasets

    Authors: Yusuke Hirota, Yuta Nakashima, Noa Garcia

    Abstract: Vision-and-language tasks have increasingly drawn more attention as a means to evaluate human-like reasoning in machine learning models. A popular task in the field is visual question answering (VQA), which aims to answer questions about images. However, VQA models have been shown to exploit language bias by learning the statistical correlations between questions and answers without looking into t…

    Submitted 3 June, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

    Comments: ACM Conference on Fairness, Accountability, and Transparency (FAccT 2022)

  31. arXiv:2203.16062  [pdf, other]

    cs.CV cs.IR

    AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval

    Authors: Riku Togashi, Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkila, Tetsuya Sakai

    Abstract: Evaluation measures have a crucial impact on the direction of research. Therefore, it is of utmost importance to develop appropriate and reliable evaluation measures for new applications where conventional measures are not well suited. Video Moment Retrieval (VMR) is one such application, and the current practice is to use R@$K,θ$ for evaluating VMR systems. However, this measure has two disadvant…

    Submitted 30 March, 2022; originally announced March 2022.

    Comments: Accepted by CVPR2022

  32. arXiv:2203.15395  [pdf, other]

    cs.CV cs.MM

    Quantifying Societal Bias Amplification in Image Captioning

    Authors: Yusuke Hirota, Yuta Nakashima, Noa Garcia

    Abstract: We study societal bias amplification in image captioning. Image captioning models have been shown to perpetuate gender and racial biases; however, metrics to measure, quantify, and evaluate the societal bias in captions are not yet standardized. We provide a comprehensive study on the strengths and limitations of each metric, and propose LIC, a metric to study captioning bias amplification. We arg…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  33. arXiv:2203.14438  [pdf, other]

    cs.CV

    Optimal Correction Cost for Object Detection Evaluation

    Authors: Mayu Otani, Riku Togashi, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Shin'ichi Satoh

    Abstract: Mean Average Precision (mAP) is the primary evaluation measure for object detection. Although object detection has a broad range of applications, mAP evaluates detectors in terms of the performance of ranked instance retrieval. Such an assumption about the evaluation task does not suit some downstream tasks. To alleviate the gap between downstream tasks and the evaluation scenario, we propose Optim…

    Submitted 27 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  34. arXiv:2202.13591  [pdf, other]

    cs.DS

    Minimal Absent Words on Run-Length Encoded Strings

    Authors: Tooru Akagi, Kouta Okabe, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga

    Abstract: A string $w$ is called a minimal absent word (MAW) for another string $T$ if $w$ does not occur (as a substring) in $T$ and any proper substring of $w$ occurs in $T$. State-of-the-art data structures for reporting the set $\mathsf{MAW}(T)$ of MAWs from a given string $T$ of length $n$ require $O(n)$ space, can be built in $O(n)$ time, and can report all MAWs in $O(|\mathsf{MAW}(T)|)$ time upon a q…

    Submitted 14 April, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: Accepted for CPM 2022

  35. arXiv:2110.13395  [pdf, other]

    cs.CV cs.AI

    Transferring Domain-Agnostic Knowledge in Video Question Answering

    Authors: Tianran Wu, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Haruo Takemura

    Abstract: Video question answering (VideoQA) is designed to answer a given question based on a relevant video clip. The currently available large-scale datasets have made it possible to formulate VideoQA as the joint understanding of visual and language information. However, this training procedure is costly and still falls short of human performance. In this paper, we investigate a transfer learning met…

    Submitted 25 October, 2021; originally announced October 2021.

  36. arXiv:2109.05743  [pdf, other]

    cs.CV cs.AI cs.CL

    Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

    Authors: Zechen Bai, Yuta Nakashima, Noa Garcia

    Abstract: Have you ever looked at a painting and wondered what is the story behind it? This work presents a framework to bring art closer to people by generating comprehensive descriptions of fine-art paintings. Generating informative descriptions for artworks, however, is extremely challenging, as it requires to 1) describe multiple aspects of the image such as its style, content, or composition, and 2) pr…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: ICCV 2021

  37. Built Year Prediction from Buddha Face with Heterogeneous Labels

    Authors: Yiming Qian, Cheikh Brahim El Vaigh, Yuta Nakashima, Benjamin Renoust, Hajime Nagahara, Yutaka Fujioka

    Abstract: Buddha statues are a part of human culture, especially in Asia, and they have been alongside human civilisation for more than 2,000 years. As history goes by, due to wars, natural disasters, and other reasons, the records that show the built years of Buddha statues went missing, which makes it an immense task for historians to estimate the built years. In this paper, we pursue the idea of…

    Submitted 2 September, 2021; originally announced September 2021.

  38. arXiv:2107.03000  [pdf, other]

    cs.CV

    PoseRN: A 2D pose refinement network for bias-free multi-view 3D human pose estimation

    Authors: Akihiko Sayo, Diego Thomas, Hiroshi Kawasaki, Yuta Nakashima, Katsushi Ikeuchi

    Abstract: We propose a new 2D pose refinement network that learns to predict the human bias in the estimated 2D pose. There are biases in 2D pose estimations that are due to differences between annotations of 2D joint locations based on annotators' perception and those defined by motion capture (MoCap) systems. These biases are crafted into publicly available 2D pose datasets and cannot be removed with exis…

    Submitted 6 July, 2021; originally announced July 2021.

  39. arXiv:2106.13445  [pdf, other]

    cs.CV

    A Picture May Be Worth a Hundred Words for Visual Question Answering

    Authors: Yusuke Hirota, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Ittetsu Taniguchi, Takao Onoye

    Abstract: How far can we go with textual representations for understanding pictures? In image understanding, it is essential to use concise but detailed image representations. Deep visual features extracted by vision models, such as Faster R-CNN, are prevalently used in multiple tasks, and especially in visual question answering (VQA). However, conventional deep visual features may struggle to convey all the…

    Submitted 25 June, 2021; originally announced June 2021.

  40. arXiv:2106.01595  [pdf, other]

    cs.DS

    Position Heaps for Cartesian-tree Matching on Strings and Tries

    Authors: Akio Nishimoto, Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga

    Abstract: The Cartesian-tree pattern matching is a recently introduced scheme of pattern matching that detects fragments in a sequential data stream which have a similar structure as a query pattern. Formally, Cartesian-tree pattern matching seeks all substrings $S'$ of the text string $S$ such that the Cartesian tree of $S'$ and that of a query pattern $P$ coincide. In this paper, we present a new indexing…

    Submitted 14 August, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

  41. arXiv:2106.01190  [pdf, ps, other]

    math.CO cs.DM

    Counting Lyndon Subsequences

    Authors: Ryo Hirakawa, Yuto Nakashima, Shunsuke Inenaga, Masayuki Takeda

    Abstract: Counting substrings/subsequences that preserve some property (e.g., palindromes, squares) is of important mathematical interest in stringology. Recently, Glen et al. studied the number of Lyndon factors in a string. A string $w = uv$ is called a Lyndon word if it is the lexicographically smallest among all of its conjugates $vu$. In this paper, we consider a more general problem "counting Lyndon s…

    Submitted 13 July, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

  42. arXiv:2106.01173  [pdf, other]

    cs.DS cs.DM

    On the approximation ratio of LZ-End to LZ77

    Authors: Takumi Ideue, Takuya Mieno, Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Masayuki Takeda

    Abstract: A family of Lempel-Ziv factorizations is a well-studied string structure. The LZ-End factorization is a member of the family that achieves faster extraction of arbitrary substrings (Kreft & Navarro, TCS 2013). One point of interest for LZ-End factorizations is the possible difference between the size of LZ-End and LZ77 factorizations. They also showed families of strings where the approximation ratio of…

    Submitted 15 August, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

  43. arXiv:2105.13744  [pdf, ps, other]

    cs.DS

    Grammar Index By Induced Suffix Sorting

    Authors: Tooru Akagi, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Abstract: Pattern matching is the most central task for text indices. Most recent indices leverage compression techniques to make pattern matching feasible for massive but highly-compressible datasets. Among indices of this kind, we propose a new compressed text index built upon a grammar compression based on induced suffix sorting [Nunes et al., DCC'18]. We show that this grammar exhibits a locality sensi…

    Submitted 28 May, 2021; originally announced May 2021.

    Comments: Our implementation is available at https://github.com/TooruAkagi/GCIS_Index

  44. arXiv:2105.11852  [pdf]

    cs.LG cs.CV

    GCNBoost: Artwork Classification by Label Propagation through a Knowledge Graph

    Authors: Cheikh Brahim El Vaigh, Noa Garcia, Benjamin Renoust, Chenhui Chu, Yuta Nakashima, Hajime Nagahara

    Abstract: The rise of digitization of cultural documents offers large-scale contents, opening the road for development of AI systems in order to preserve, search, and deliver cultural heritage. To organize such cultural content also means to classify them, a task that is very familiar to modern computer science. Contextual information is often the key to structure such real world data, and we propose to use…

    Submitted 25 May, 2021; originally announced May 2021.

  45. arXiv:2105.08496  [pdf, other]

    math.CO cs.DM cs.FL

    Combinatorics of minimal absent words for a sliding window

    Authors: Tooru Akagi, Yuki Kuhara, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Abstract: A string $w$ is called a minimal absent word (MAW) for another string $T$ if $w$ does not occur in $T$ but the proper substrings of $w$ occur in $T$. For example, let $Σ = \{\mathtt{a, b, c}\}$ be the alphabet. Then, the set of MAWs for string $w = \mathtt{abaab}$ is $\{\mathtt{aaa, aaba, bab, bb, c}\}$. In this paper, we study combinatorial properties of MAWs in the sliding window model, namely, h…

    Submitted 15 April, 2022; v1 submitted 18 May, 2021; originally announced May 2021.

    Comments: A part of the results of this article appeared in Proc. SOFSEM 2020 (also in arXiv:1909.02804). The results on binary alphabets are the main new material in this article
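
The MAW definition in the abstract above can be checked by brute force. The following is a minimal sketch for illustration only, not the authors' method: the function names `substrings` and `minimal_absent_words` are hypothetical, and the quadratic substring enumeration is only practical for short strings. It relies on the fact that a string is a MAW exactly when it is absent while both its longest proper prefix and its longest proper suffix occur.

```python
def substrings(text):
    """Return the set of all substrings of text, including the empty string."""
    n = len(text)
    return {text[i:j] for i in range(n + 1) for j in range(i, n + 1)}


def minimal_absent_words(text, alphabet):
    """Brute-force MAW computation (hypothetical helper, for illustration).

    A candidate w = u + c is a MAW when u occurs in text (longest proper
    prefix), w[1:] occurs in text (longest proper suffix), and w itself
    does not occur. Every other proper substring of w lies inside one of
    those two, so checking them is sufficient.
    """
    occurring = substrings(text)
    maws = set()
    for u in occurring:
        for c in alphabet:
            w = u + c
            if w not in occurring and w[1:] in occurring:
                maws.add(w)
    return maws


# The example string and alphabet from the abstract.
print(sorted(minimal_absent_words("abaab", "abc")))
```

Running it on the abstract's example string over the alphabet {a, b, c} reproduces the set listed there.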

  46. arXiv:2103.03468  [pdf, other]

    cs.DS cs.CC

    Compressed Communication Complexity of Hamming Distance

    Authors: Shiori Mitsuya, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Abstract: We consider the communication complexity of the Hamming distance of two strings. Bille et al. [SPIRE 2018] considered the communication complexity of the longest common prefix (LCP) problem in the setting where the two parties have their strings in a compressed form, i.e., represented by the Lempel-Ziv 77 factorization (LZ77) with/without self-references. We present a randomized public-coin protoc…

    Submitted 31 March, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

  47. arXiv:2101.11906  [pdf, other]

    physics.data-an cs.LG hep-ex physics.ins-det

    Development of a Vertex Finding Algorithm using Recurrent Neural Network

    Authors: Kiichi Goto, Taikan Suehara, Tamaki Yoshioka, Masakazu Kurata, Hajime Nagahara, Yuta Nakashima, Noriko Takemura, Masako Iwasaki

    Abstract: Deep learning is a rapidly-evolving technology with possibility to significantly improve physics reach of collider experiments. In this study we developed a novel algorithm of vertex finding for future lepton colliders such as the International Linear Collider. We deploy two networks; one is simple fully-connected layers to look for vertex seeds from track pairs, and the other is a customized Recu…

    Submitted 19 November, 2022; v1 submitted 28 January, 2021; originally announced January 2021.

    Comments: 16 pages, 9 figures

    Journal ref: Nucl. Instrum. Methods Phys. Res. 1047 (2023) 167836

  48. arXiv:2101.05479  [pdf, other]

    cs.CV cs.LG

    Understanding the Role of Scene Graphs in Visual Question Answering

    Authors: Vinay Damodaran, Sharanya Chakravarthy, Akshay Kumar, Anjana Umapathy, Teruko Mitamura, Yuta Nakashima, Noa Garcia, Chenhui Chu

    Abstract: Visual Question Answering (VQA) is of tremendous interest to the research community with important applications such as aiding visually impaired users and image-based search. In this work, we explore the use of scene graphs for solving the VQA task. We conduct experiments on the GQA dataset which presents a challenging set of questions requiring counting, compositionality and advanced reasoning ca…

    Submitted 16 January, 2021; v1 submitted 14 January, 2021; originally announced January 2021.

  49. arXiv:2012.10092  [pdf, other]

    cs.DS

    The Parameterized Suffix Tray

    Authors: Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Abstract: Let $Σ$ and $Π$ be disjoint alphabets, respectively called the static alphabet and the parameterized alphabet. Two strings $x$ and $y$ over $Σ\cup Π$ of equal length are said to parameterized match (p-match) if there exists a renaming bijection $f$ on $Σ$ and $Π$ which is identity on $Σ$ and maps the characters of $x$ to those of $y$ so that the two strings become identical. The indexing version o…

    Submitted 3 February, 2021; v1 submitted 18 December, 2020; originally announced December 2020.

    Comments: Accepted for CIAC 2021
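
The p-match definition in the abstract above can be tested directly for a pair of strings. The following is a minimal sketch under stated assumptions, not the paper's indexing structure: the function name `p_match` and the convention of passing the static alphabet as a set are hypothetical. It builds the renaming incrementally and rejects as soon as the mapping on parameterized characters stops being one-to-one, or a static character is not mapped to itself.

```python
def p_match(x, y, static_alphabet):
    """Return True if x and y parameterized-match (hypothetical helper).

    Characters in static_alphabet must be identical at each position.
    All other characters are treated as parameterized and must be related
    by a consistent one-to-one renaming, which can always be extended to
    a bijection on a finite parameterized alphabet.
    """
    if len(x) != len(y):
        return False
    forward, backward = {}, {}
    for a, b in zip(x, y):
        if a in static_alphabet or b in static_alphabet:
            # The renaming is the identity on static characters.
            if a != b:
                return False
            continue
        # setdefault records the first pairing and returns it afterwards,
        # so any later conflicting pairing is detected here.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


static = {"a", "b"}                      # assumed static alphabet
print(p_match("axbyx", "aybzy", static))  # renaming x -> y, y -> z
print(p_match("axbyx", "aybzz", static))  # x would need two images
```

The first call finds a consistent renaming and the second does not, because the two occurrences of `x` would have to map to different characters.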

  50. arXiv:2011.12527  [pdf, other]

    cs.CV

    Match Them Up: Visually Explainable Few-shot Image Classification

    Authors: Bowen Wang, Liangzhi Li, Manisha Verma, Yuta Nakashima, Ryo Kawasaki, Hajime Nagahara

    Abstract: Few-shot learning (FSL) approaches are usually based on an assumption that the pre-trained knowledge can be obtained from base (seen) categories and can be well transferred to novel (unseen) categories. However, there is no guarantee, especially for the latter part. This issue leads to the unknown nature of the inference process in most FSL methods, which hampers its application in some risk-sensi…

    Submitted 25 November, 2020; originally announced November 2020.