Showing 1–50 of 55 results for author: Nakayama, H

  1. arXiv:2406.19316  [pdf, other]

    cs.CV

    Enhanced Data Transfer Cooperating with Artificial Triplets for Scene Graph Generation

    Authors: KuanChao Chu, Satoshi Yamazaki, Hideki Nakayama

    Abstract: This work focuses on training dataset enhancement of informative relational triplets for Scene Graph Generation (SGG). Due to the lack of effective supervision, the current SGG model predictions perform poorly for informative relational triplets with inadequate training samples. Therefore, we propose two novel training dataset enhancement modules: Feature Space Triplet Augmentation (FSTA) and Soft…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Accepted to IEICE Transactions on Information and Systems in April 2024

  2. A Better LLM Evaluator for Text Generation: The Impact of Prompt Output Sequencing and Optimization

    Authors: KuanChao Chu, Yi-Pei Chen, Hideki Nakayama

    Abstract: This research investigates prompt designs of evaluating generated texts using large language models (LLMs). While LLMs are increasingly used for scoring various inputs, creating effective prompts for open-ended text evaluation remains challenging due to model sensitivity and subjectivity in evaluation of text generation. Our study experimented with different prompt structures, altering the sequenc…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Presented in JSAI 2024. The first two authors contributed equally. arXiv admin note: substantial text overlap with arXiv:2406.02863

  3. arXiv:2406.02863  [pdf, ps, other]

    cs.CL

    LLM as a Scorer: The Impact of Output Order on Dialogue Evaluation

    Authors: Yi-Pei Chen, KuanChao Chu, Hideki Nakayama

    Abstract: This research investigates the effect of prompt design on dialogue evaluation using large language models (LLMs). While LLMs are increasingly used for scoring various inputs, creating effective prompts for dialogue evaluation remains challenging due to model sensitivity and subjectivity in dialogue assessments. Our study experimented with different prompt structures, altering the sequence of outpu…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Presented in AAAI 2024 Spring Symposium. The first two authors contributed equally

  4. arXiv:2405.17974  [pdf, other]

    cs.CL cs.AI

    Recent Trends in Personalized Dialogue Generation: A Review of Datasets, Methodologies, and Evaluations

    Authors: Yi-Pei Chen, Noriki Nishida, Hideki Nakayama, Yuji Matsumoto

    Abstract: Enhancing user engagement through personalization in conversational agents has gained significance, especially with the advent of large language models that generate fluent responses. Personalized dialogue generation, however, is multifaceted and varies in its definition -- ranging from instilling a persona in the agent to capturing users' explicit and implicit cues. This paper seeks to systemical…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Presented in LREC-COLING 2024

  5. arXiv:2403.18187  [pdf, other]

    cs.CV

    LayoutFlow: Flow Matching for Layout Generation

    Authors: Julian Jorge Andrade Guerreiro, Naoto Inoue, Kento Masui, Mayu Otani, Hideki Nakayama

    Abstract: Finding a suitable layout represents a crucial task for diverse applications in graphic design. Motivated by simpler and smoother sampling trajectories, we explore the use of Flow Matching as an alternative to current diffusion-based layout generation models. Specifically, we propose LayoutFlow, an efficient flow-based model capable of generating high-quality layouts. Instead of progressively deno…

    Submitted 26 March, 2024; originally announced March 2024.

  6. arXiv:2311.15879  [pdf, other]

    cs.CV

    EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension

    Authors: Jiaxuan Li, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

    Abstract: Large language models (LLMs)-based image captioning has the capability of describing objects not explicitly observed in training data; yet novel objects occur frequently, necessitating the requirement of sustaining up-to-date object knowledge for open-world comprehension. Instead of relying on large amounts of data and/or scaling up network parameters, we introduce a highly effective retrieval-aug…

    Submitted 7 April, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: CVPR 2024

  7. arXiv:2311.12897  [pdf, other]

    cs.GR

    An Efficient 3D Gaussian Representation for Monocular/Multi-view Dynamic Scenes

    Authors: Kai Katsumata, Duc Minh Vo, Hideki Nakayama

    Abstract: In novel view synthesis of scenes from multiple input views, 3D Gaussian splatting emerges as a viable alternative to existing radiance field approaches, delivering great visual quality and real-time rendering. While successful in static scenes, the present advancement of 3D Gaussian representation, however, faces challenges in dynamic scenes in terms of memory consumption and the need for numerou…

    Submitted 21 November, 2023; originally announced November 2023.

    Comments: 10 pages, 10 figures

  8. arXiv:2308.10005  [pdf, other]

    cs.CV

    Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts

    Authors: Jiaxuan Li, Duc Minh Vo, Hideki Nakayama

    Abstract: Bias mitigation in image classification has been widely researched, and existing methods have yielded notable results. However, most of these methods implicitly assume that a given image contains only one type of known or unknown bias, failing to consider the complexities of real-world biases. We introduce a more challenging scenario, agnostic biases mitigation, aiming at bias removal regardless o…

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  9. arXiv:2307.08995  [pdf, other]

    cs.CV

    Revisiting Latent Space of GAN Inversion for Real Image Editing

    Authors: Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama

    Abstract: The exploration of the latent space in StyleGANs and GAN inversion exemplify impressive real-world image editing, yet the trade-off between reconstruction quality and editing quality remains an open problem. In this study, we revisit StyleGANs' hyperspherical prior $\mathcal{Z}$ and combine it with highly capable latent spaces to build combined spaces that faithfully invert real images while maint…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: 10 pages, 12 figures. arXiv admin note: substantial text overlap with arXiv:2306.00241

  10. arXiv:2307.08319  [pdf, other]

    cs.CV

    Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data

    Authors: Kai Katsumata, Duc Minh Vo, Tatsuya Harada, Hideki Nakayama

    Abstract: Label-noise or curated unlabeled data is used to compensate for the assumption of clean labeled data in training the conditional generative adversarial network; however, satisfying such an extended assumption is occasionally laborious or impractical. As a step towards generative modeling accessible to everyone, we introduce a novel conditional image generation framework that accepts noisy-labeled…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 10 pages, 13 figures

  11. arXiv:2306.00241  [pdf, other]

    cs.CV

    Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space

    Authors: Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama

    Abstract: The exploration of the latent space in StyleGANs and GAN inversion exemplify impressive real-world image editing, yet the trade-off between reconstruction quality and editing quality remains an open problem. In this study, we revisit StyleGANs' hyperspherical prior $\mathcal{Z}$ and $\mathcal{Z}^+$ and integrate them into seminal GAN inversion methods to improve editing quality. Besides faithful r…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: 5 pages, 9 figures, AI4CC Workshop at CVPR 2023

  12. arXiv:2304.08327  [pdf, other]

    cs.CL

    LED: A Dataset for Life Event Extraction from Dialogs

    Authors: Yi-Pei Chen, An-Zi Yen, Hen-Hsen Huang, Hideki Nakayama, Hsin-Hsi Chen

    Abstract: Lifelogging has gained more attention due to its wide applications, such as personalized recommendations or memory assistance. The issues of collecting and extracting personal life events have emerged. People often share their life experiences with others through conversations. However, extracting life events from conversations is rarely explored. In this paper, we present Life Event Dialog, a dat…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: Accepted to EACL 2023 Findings

  13. arXiv:2304.06602  [pdf, other]

    cs.CV

    A-CAP: Anticipation Captioning with Commonsense Knowledge

    Authors: Duc Minh Vo, Quoc-An Luong, Akihiro Sugimoto, Hideki Nakayama

    Abstract: Humans possess the capacity to reason about the future based on a sparse collection of visual cues acquired over time. In order to emulate this ability, we introduce a novel task called Anticipation Captioning, which generates a caption for an unseen oracle image using a sparsely temporally-ordered set of images. To tackle this new task, we propose a model called A-CAP, which incorporates commonse…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023

  14. arXiv:2210.08465  [pdf, other]

    cs.CV cs.CL

    Character-Centric Story Visualization via Visual Planning and Token Alignment

    Authors: Hong Chen, Rujun Han, Te-Lin Wu, Hideki Nakayama, Nanyun Peng

    Abstract: Story visualization advances the traditional text-to-image generation by enabling multiple image generation based on a complete story. This task requires machines to 1) understand long text inputs and 2) produce a globally consistent image sequence that illustrates the contents of the story. A key challenge of consistent story visualization is to preserve characters that are essential in stories.…

    Submitted 22 October, 2022; v1 submitted 16 October, 2022; originally announced October 2022.

    Comments: accepted by EMNLP2022

  15. arXiv:2210.08459  [pdf, other]

    cs.CL

    StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning

    Authors: Hong Chen, Duc Minh Vo, Hiroya Takamura, Yusuke Miyao, Hideki Nakayama

    Abstract: Existing automatic story evaluation methods place a premium on story lexical level coherence, deviating from human preference. We go beyond this limitation by considering a novel Story Evaluation method that mimics human preference when judging a story, namely StoryER, which consists of three sub-tasks: Ranking, Rating and Reasoning. Given eith…

    Submitted 21 October, 2022; v1 submitted 16 October, 2022; originally announced October 2022.

    Comments: accepted by EMNLP 2022

  16. arXiv:2209.13359  [pdf, other]

    cs.CV cs.CL

    Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding

    Authors: Erica K. Shimomoto, Edison Marrese-Taylor, Hiroya Takamura, Ichiro Kobayashi, Hideki Nakayama, Yusuke Miyao

    Abstract: This paper explores the task of Temporal Video Grounding (TVG) where, given an untrimmed video and a natural language sentence query, the goal is to recognize and determine temporal boundaries of action instances in the video described by the query. Recent works tackled this task by improving query inputs with large pre-trained language models (PLM) at the cost of more expensive training. However,…

    Submitted 25 May, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: Accepted for Findings of ACL2023

  17. arXiv:2206.05898  [pdf, other]

    cs.CV cs.LG

    Pixel to Binary Embedding Towards Robustness for CNNs

    Authors: Ikki Kishida, Hideki Nakayama

    Abstract: There are several problems with the robustness of Convolutional Neural Networks (CNNs). For example, the prediction of CNNs can be changed by adding a small magnitude of noise to an input, and the performances of CNNs are degraded when the distribution of input is shifted by a transformation never seen during training (e.g., the blur effect). There are approaches to replace pixel values with binar…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: Accepted to ICPR2022

  18. arXiv:2204.14249  [pdf, other]

    cs.CV

    OSSGAN: Open-Set Semi-Supervised Image Generation

    Authors: Kai Katsumata, Duc Minh Vo, Hideki Nakayama

    Abstract: We introduce a challenging training scheme of conditional GANs, called open-set semi-supervised image generation, where the training dataset consists of two parts: (i) labeled data and (ii) unlabeled data with samples belonging to one of the labeled data classes, namely, a closed-set, and samples not belonging to any of the labeled data classes, namely, an open-set. Unlike the existing semi-superv…

    Submitted 29 April, 2022; originally announced April 2022.

    Comments: Accepted at CVPR 2022

  19. arXiv:2203.14499  [pdf, other]

    cs.CV

    NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge

    Authors: Duc Minh Vo, Hong Chen, Akihiro Sugimoto, Hideki Nakayama

    Abstract: Novel object captioning aims at describing objects absent from training data, with the key ingredient being the provision of object vocabulary to the model. Although existing methods heavily rely on an object detection model, we view the detection step as vocabulary retrieval from an external knowledge in the form of embeddings for any object's definition from Wiktionary, where we use in the retri…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Accepted at CVPR 2022

  20. arXiv:2203.08456  [pdf, other]

    cs.CV

    PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression

    Authors: Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

    Abstract: We push forward neural network compression research by exploiting a novel challenging task of large-scale conditional generative adversarial networks (GANs) compression. To this end, we propose a gradually shrinking GAN (PPCD-GAN) by introducing progressive pruning residual block (PP-Res) and class-aware distillation. The PP-Res is an extension of the conventional residual block where each convolu…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: accepted at WACV 2022

  21. arXiv:2110.10774  [pdf, other]

    cs.CL

    SciXGen: A Scientific Paper Dataset for Context-Aware Text Generation

    Authors: Hong Chen, Hiroya Takamura, Hideki Nakayama

    Abstract: Generating texts in scientific papers requires not only capturing the content contained within the given input but also frequently acquiring the external information called context. We push forward the scientific text generation by proposing a new task, namely context-aware text generation in the scientific domain, aiming at exploiting the contributions of context in generated te…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: this paper was accepted by EMNLP2021-findings

  22. arXiv:2102.04600  [pdf, other]

    physics.chem-ph cs.LG

    Graph Energy-based Model for Substructure Preserving Molecular Design

    Authors: Ryuichiro Hataya, Hideki Nakayama, Kazuki Yoshizoe

    Abstract: It is common practice for chemists to search chemical databases based on substructures of compounds for finding molecules with desired properties. The purpose of de novo molecular generation is to generate instead of search. Existing machine learning based molecular design methods have no or limited ability in generating novel molecules that preserves a target substructure. Our Graph Energy-based…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: preprint

  23. arXiv:2102.02977  [pdf, other]

    cs.CL

    GraphPlan: Story Generation by Planning with Event Graph

    Authors: Hong Chen, Raphael Shu, Hiroya Takamura, Hideki Nakayama

    Abstract: Story generation is a task that aims to automatically produce multiple sentences to make up a meaningful story. This task is challenging because it requires high-level understanding of semantic meaning of sentences and causality of story events. Naive sequence-to-sequence models generally fail to acquire such knowledge, as the logical correctness can hardly be guaranteed in a text generation model…

    Submitted 4 February, 2021; originally announced February 2021.

  24. arXiv:2102.02963  [pdf, other]

    cs.CV cs.CL

    Commonsense Knowledge Aware Concept Selection For Diverse and Informative Visual Storytelling

    Authors: Hong Chen, Yifei Huang, Hiroya Takamura, Hideki Nakayama

    Abstract: Visual storytelling is a task of generating relevant and interesting stories for given image sequences. In this work we aim at increasing the diversity of the generated stories while preserving the informative content from the images. We propose to foster the diversity and informativeness of a generated story by using a concept selection module that suggests a set of concept candidates. Then, we u…

    Submitted 4 February, 2021; originally announced February 2021.

    Comments: Accepted by AAAI2021

  25. arXiv:2102.01847  [pdf, other]

    cs.CL

    An Investigation Between Schema Linking and Text-to-SQL Performance

    Authors: Yasufumi Taniguchi, Hiroki Nakayama, Kubo Takahiro, Jun Suzuki

    Abstract: Text-to-SQL is a crucial task toward developing methods for understanding natural language by computers. Recent neural approaches deliver excellent performance; however, models that are difficult to interpret inhibit future developments. Hence, this study aims to provide a better approach toward the interpretation of neural models. We hypothesize that the internal behavior of models at hand become…

    Submitted 2 February, 2021; originally announced February 2021.

  26. arXiv:2009.14759  [pdf, other]

    cs.AI

    Graph-based Heuristic Search for Module Selection Procedure in Neural Module Network

    Authors: Yuxuan Wu, Hideki Nakayama

    Abstract: Neural Module Network (NMN) is a machine learning model for solving the visual question answering tasks. NMN uses programs to encode modules' structures, and its modularized architecture enables it to solve logical problems more reasonably. However, because of the non-differentiable procedure of module selection, NMN is hard to be trained end-to-end. To overcome this problem, existing work either…

    Submitted 30 September, 2020; originally announced September 2020.

    Comments: in Neural Module Network[C]//Proceedings of the Asian Conference on Computer Vision. 2020

  27. arXiv:2007.13559  [pdf, other]

    cs.CV cs.LG eess.IV

    MADGAN: unsupervised Medical Anomaly Detection GAN using multiple adjacent brain MRI slice reconstruction

    Authors: Changhee Han, Leonardo Rundo, Kohei Murao, Tomoyuki Noguchi, Yuki Shimahara, Zoltan Adam Milacski, Saori Koshino, Evis Sala, Hideki Nakayama, Shinichi Satoh

    Abstract: Unsupervised learning can discover various unseen abnormalities, relying on large-scale unannotated medical images of healthy subjects. Towards this, unsupervised methods reconstruct a 2D/3D single medical image to detect outliers either in the learned feature space or from high reconstruction loss. However, without considering continuity between multiple adjacent slices, they cannot directly disc…

    Submitted 12 October, 2020; v1 submitted 24 July, 2020; originally announced July 2020.

    Comments: 23 pages, 11 figures, submitted to BMC Bioinformatics. Extended version of arXiv:1906.06114

  28. arXiv:2006.07965  [pdf, other]

    cs.CV

    Meta Approach to Data Augmentation Optimization

    Authors: Ryuichiro Hataya, Jan Zdenek, Kazuki Yoshizoe, Hideki Nakayama

    Abstract: Data augmentation policies drastically improve the performance of image recognition tasks, especially when the policies are optimized for the target data and tasks. In this paper, we propose to optimize image recognition models and data augmentation policies simultaneously to improve the performance using gradient descent. Unlike prior methods, our approach avoids using proxy tasks or reducing sea…

    Submitted 14 June, 2020; originally announced June 2020.

  29. arXiv:2005.00879  [pdf, other]

    cs.CL cs.LG

    Single Model Ensemble using Pseudo-Tags and Distinct Vectors

    Authors: Ryosuke Kuwabara, Jun Suzuki, Hideki Nakayama

    Abstract: Model ensemble techniques often increase task performance in neural networks; however, they require increased time, memory, and management effort. In this study, we propose a novel method that replicates the effects of a model ensemble with a single model. Our approach creates K-virtual models within a single parameter space using K-distinct pseudo-tags and K-distinct vectors. Experiments on text…

    Submitted 2 May, 2020; originally announced May 2020.

    Comments: Accepted by ACL2020

  30. arXiv:2001.03923  [pdf, ps, other]

    cs.CV cs.LG eess.IV

    Bridging the gap between AI and Healthcare sides: towards developing clinically relevant AI-powered diagnosis systems

    Authors: Changhee Han, Leonardo Rundo, Kohei Murao, Takafumi Nemoto, Hideki Nakayama

    Abstract: Despite the success of Convolutional Neural Network-based Computer-Aided Diagnosis research, its clinical applications remain challenging. Accordingly, developing medical Artificial Intelligence (AI) fitting into a clinical environment requires identifying/bridging the gap between AI and Healthcare sides. Since the biggest problem in Medical Imaging lies in data paucity, confirming the clinical re…

    Submitted 6 April, 2020; v1 submitted 12 January, 2020; originally announced January 2020.

    Comments: 13 pages, 2 figure, accepted to AIAI 2020

  31. Empirical Study of Easy and Hard Examples in CNN Training

    Authors: Ikki Kishida, Hideki Nakayama

    Abstract: Deep Neural Networks (DNNs) generalize well despite their massive size and capability of memorizing all examples. There is a hypothesis that DNNs start learning from simple patterns and the hypothesis is based on the existence of examples that are consistently well-classified at the early training stage (i.e., easy examples) and examples misclassified (i.e., hard examples). Easy examples are the e…

    Submitted 25 November, 2019; originally announced November 2019.

    Comments: Accepted to ICONIP 2019

    Journal ref: ICONIP 2019

  32. arXiv:1911.06987  [pdf, other]

    cs.CV

    Faster AutoAugment: Learning Augmentation Strategies using Backpropagation

    Authors: Ryuichiro Hataya, Jan Zdenek, Kazuki Yoshizoe, Hideki Nakayama

    Abstract: Data augmentation methods are indispensable heuristics to boost the performance of deep neural networks, especially in image recognition tasks. Recently, several studies have shown that augmentation strategies found by search algorithms outperform hand-made strategies. Such methods employ black-box search algorithms over image transformations with continuous or discrete parameters and require a lo…

    Submitted 16 November, 2019; originally announced November 2019.

  33. arXiv:1908.07181  [pdf, other]

    cs.CL cs.LG

    Latent-Variable Non-Autoregressive Neural Machine Translation with Deterministic Inference Using a Delta Posterior

    Authors: Raphael Shu, Jason Lee, Hideki Nakayama, Kyunghyun Cho

    Abstract: Although neural machine translation models reached high translation quality, the autoregressive nature makes inference difficult to parallelize and leads to high translation latency. Inspired by recent refinement-based approaches, we propose LaNMT, a latent-variable non-autoregressive model with continuous latent variables and deterministic inference procedure. In contrast to existing approaches,…

    Submitted 21 November, 2019; v1 submitted 20 August, 2019; originally announced August 2019.

    Comments: This paper was accepted to AAAI 2020, the copyright is transferred to AAAI

  34. arXiv:1906.06114  [pdf, other]

    eess.IV cs.CV

    GAN-based Multiple Adjacent Brain MRI Slice Reconstruction for Unsupervised Alzheimer's Disease Diagnosis

    Authors: Changhee Han, Leonardo Rundo, Kohei Murao, Zoltán Ádám Milacski, Kazuki Umemoto, Evis Sala, Hideki Nakayama, Shin'ichi Satoh

    Abstract: Unsupervised learning can discover various unseen diseases, relying on large-scale unannotated medical images of healthy subjects. Towards this, unsupervised methods reconstruct a single medical image to detect outliers either in the learned feature space or from high reconstruction loss. However, without considering continuity between multiple adjacent slices, they cannot directly discriminate di…

    Submitted 16 March, 2020; v1 submitted 14 June, 2019; originally announced June 2019.

    Comments: 10 pages, 4 figures, Accepted to Lecture Notes in Bioinformatics (LNBI) as a volume in the Springer series

  35. arXiv:1906.04962  [pdf, other]

    cs.CV eess.IV

    Synthesizing Diverse Lung Nodules Wherever Massively: 3D Multi-Conditional GAN-based CT Image Augmentation for Object Detection

    Authors: Changhee Han, Yoshiro Kitamura, Akira Kudo, Akimichi Ichinose, Leonardo Rundo, Yujiro Furukawa, Kazuki Umemoto, Yuanzhong Li, Hideki Nakayama

    Abstract: Accurate Computer-Assisted Diagnosis, relying on large-scale annotated pathological images, can alleviate the risk of overlooking the diagnosis. Unfortunately, in medical imaging, most available datasets are small/fragmented. To tackle this, as a Data Augmentation (DA) method, 3D conditional Generative Adversarial Networks (GANs) can synthesize desired realistic/diverse 3D images as additional tra…

    Submitted 12 August, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

    Comments: 9 pages, 6 figures, accepted to 3DV 2019

  36. arXiv:1905.13456  [pdf, other]

    eess.IV cs.AI cs.CV

    Combining Noise-to-Image and Image-to-Image GANs: Brain MR Image Augmentation for Tumor Detection

    Authors: Changhee Han, Leonardo Rundo, Ryosuke Araki, Yudai Nagano, Yujiro Furukawa, Giancarlo Mauri, Hideki Nakayama, Hideaki Hayashi

    Abstract: Convolutional Neural Networks (CNNs) achieve excellent computer-assisted diagnosis with sufficient annotated training data. However, most medical imaging datasets are small and fragmented. In this context, Generative Adversarial Networks (GANs) can synthesize realistic/diverse additional training images to fill the data lack in the real image distribution; researchers have improved classification…

    Submitted 9 October, 2019; v1 submitted 31 May, 2019; originally announced May 2019.

    Comments: 12 pages, 7 figures, accepted to IEEE ACCESS

  37. arXiv:1904.08254  [pdf, other]

    cs.CV cs.LG

    USE-Net: incorporating Squeeze-and-Excitation blocks into U-Net for prostate zonal segmentation of multi-institutional MRI datasets

    Authors: Leonardo Rundo, Changhee Han, Yudai Nagano, Jin Zhang, Ryuichiro Hataya, Carmelo Militello, Andrea Tangherloni, Marco S. Nobile, Claudio Ferretti, Daniela Besozzi, Maria Carla Gilardi, Salvatore Vitabile, Giancarlo Mauri, Hideki Nakayama, Paolo Cazzaniga

    Abstract: Prostate cancer is the most common malignant tumors in men but prostate Magnetic Resonance Imaging (MRI) analysis remains challenging. Besides whole prostate gland segmentation, the capability to differentiate between the blurry boundary of the Central Gland (CG) and Peripheral Zone (PZ) can lead to differential diagnosis, since tumor's frequency and severity differ in these regions. To tackle the…

    Submitted 17 July, 2019; v1 submitted 17 April, 2019; originally announced April 2019.

    Comments: 44 pages, 6 figures, Accepted to Neurocomputing, Co-first authors: Leonardo Rundo and Changhee Han

  38. arXiv:1904.00838  [pdf]

    cs.CV cs.AI

    Learning More with Less: GAN-based Medical Image Augmentation

    Authors: Changhee Han, Kohei Murao, Shin'ichi Satoh, Hideki Nakayama

    Abstract: Convolutional Neural Network (CNN)-based accurate prediction typically requires large-scale annotated training data. In Medical Imaging, however, both obtaining medical data and annotating them by expert physicians are challenging; to overcome this lack of data, Data Augmentation (DA) using Generative Adversarial Networks (GANs) is essential, since they can synthesize additional annotated training…

    Submitted 29 May, 2019; v1 submitted 29 March, 2019; originally announced April 2019.

    Comments: 6 pages, 2 figures, to appear in MEDICAL IMAGING TECHNOLOGY Special Issue

  39. arXiv:1903.12571  [pdf, other]

    cs.CV cs.AI

    CNN-based Prostate Zonal Segmentation on T2-weighted MR Images: A Cross-dataset Study

    Authors: Leonardo Rundo, Changhee Han, Jin Zhang, Ryuichiro Hataya, Yudai Nagano, Carmelo Militello, Claudio Ferretti, Marco S. Nobile, Andrea Tangherloni, Maria Carla Gilardi, Salvatore Vitabile, Hideki Nakayama, Giancarlo Mauri

    Abstract: Prostate cancer is the most common cancer among US men. However, prostate imaging is still challenging despite the advances in multi-parametric Magnetic Resonance Imaging (MRI), which provides both morphologic and functional information pertaining to the pathological regions. Along with whole prostate gland segmentation, distinguishing between the Central Gland (CG) and Peripheral Zone (PZ) can gu…

    Submitted 29 March, 2019; originally announced March 2019.

    Comments: 12 pages, 3 figures, Accepted to Neural Approaches to Dynamics of Signal Exchanges as a Springer book chapter

  40. arXiv:1903.12564  [pdf, other]

    cs.CV cs.AI

    Infinite Brain MR Images: PGGAN-based Data Augmentation for Tumor Detection

    Authors: Changhee Han, Leonardo Rundo, Ryosuke Araki, Yujiro Furukawa, Giancarlo Mauri, Hideki Nakayama, Hideaki Hayashi

    Abstract: Due to the lack of available annotated medical images, accurate computer-assisted diagnosis requires intensive Data Augmentation (DA) techniques, such as geometric/intensity transformations of original images; however, those transformed images intrinsically have a similar distribution to the original ones, leading to limited performance improvement. To fill the data lack in the real image distribu…

    Submitted 29 March, 2019; originally announced March 2019.

    Comments: 13 pages, 6 figures, Accepted to Neural Approaches to Dynamics of Signal Exchanges as a Springer book chapter

  41. Learning More with Less: Conditional PGGAN-based Data Augmentation for Brain Metastases Detection Using Highly-Rough Annotation on MR Images

    Authors: Changhee Han, Kohei Murao, Tomoyuki Noguchi, Yusuke Kawata, Fumiya Uchiyama, Leonardo Rundo, Hideki Nakayama, Shin'ichi Satoh

    Abstract: Accurate Computer-Assisted Diagnosis, associated with proper data wrangling, can alleviate the risk of overlooking the diagnosis in a clinical environment. Towards this, as a Data Augmentation (DA) technique, Generative Adversarial Networks (GANs) can synthesize additional training data to handle the small/fragmented medical imaging datasets collected from various scanners; those images are realis…

    Submitted 22 August, 2019; v1 submitted 26 February, 2019; originally announced February 2019.

    Comments: 9 pages, 7 figures, accepted to CIKM 2019 (acceptance rate: 19%)

  42. arXiv:1810.09309  [pdf, other]

    cs.CL cs.LG

    Real-time Neural-based Input Method

    Authors: Jiali Yao, Raphael Shu, Xinjian Li, Katsutoshi Ohtsuki, Hideki Nakayama

    Abstract: The input method is an essential service on every mobile and desktop device that provides text suggestions. It converts sequential keyboard inputs to the characters in its target language, which is indispensable for Japanese and Chinese users. Due to critical resource constraints and limited network bandwidth of the target devices, applying neural models to input methods is not well explored. In t…

    Submitted 19 October, 2018; originally announced October 2018.

  43. arXiv:1810.06859  [pdf, other]

    cs.CV

    Semantic Aware Attention Based Deep Object Co-segmentation

    Authors: Hong Chen, Yifei Huang, Hideki Nakayama

    Abstract: Object co-segmentation is the task of segmenting the same objects from multiple images. In this paper, we propose Attention Based Object Co-Segmentation, which utilizes a novel attention mechanism in the bottleneck layer of a deep neural network for the selection of semantically related features. Furthermore, we take the benefit of attention learner and propose an algorit…

    Submitted 16 October, 2018; originally announced October 2018.

  44. arXiv:1808.04525  [pdf, other]

    cs.CL

    Discrete Structural Planning for Neural Machine Translation

    Authors: Raphael Shu, Hideki Nakayama

    Abstract: Structural planning is important for producing long sentences, yet it is missing from current language generation models. In this work, we add a planning phase in neural machine translation to control the coarse structure of output sentences. The model first generates some planner codes, then predicts real output words conditioned on them. The codes are learned to capture the coarse structure…

    Submitted 14 August, 2018; originally announced August 2018.

  45. arXiv:1801.01777  [pdf]

    q-fin.ST cs.LG

    Deep Learning for Forecasting Stock Returns in the Cross-Section

    Authors: Masaya Abe, Hideki Nakayama

    Abstract: Many studies have been undertaken by using machine learning techniques, including neural networks, to predict stock returns. Recently, a method known as deep learning, which achieves high performance mainly in image recognition and speech recognition, has attracted attention in the machine learning field. This paper implements deep learning to predict one-month-ahead stock returns in the cross-sec…

    Submitted 12 June, 2018; v1 submitted 3 January, 2018; originally announced January 2018.

    Comments: 12 pages, 2 figures, 8 tables, accepted at PAKDD 2018

  46. arXiv:1711.07170  [pdf, other]

    cs.CV cs.AI

    Parameter Reference Loss for Unsupervised Domain Adaptation

    Authors: Jiren Jin, Richard G. Calland, Takeru Miyato, Brian K. Vogel, Hideki Nakayama

    Abstract: The success of deep learning in computer vision is mainly attributed to an abundance of data. However, collecting large-scale data is not always possible, especially for the supervised labels. Unsupervised domain adaptation (UDA) aims to utilize labeled data from a source domain to learn a model that generalizes to a target domain of unlabeled data. A large amount of existing work uses Siamese net…

    Submitted 5 December, 2017; v1 submitted 20 November, 2017; originally announced November 2017.

    Comments: Add experiments that compare parameter reference loss with existing methods using the same architecture

  47. arXiv:1711.01068  [pdf, other]

    cs.CL

    Compressing Word Embeddings via Deep Compositional Code Learning

    Authors: Raphael Shu, Hideki Nakayama

    Abstract: Natural language processing (NLP) models often require a massive number of parameters for word embeddings, resulting in a large storage or memory footprint. Deploying neural NLP models to mobile devices requires compressing the word embeddings without any significant sacrifices in performance. For this purpose, we propose to construct the embeddings with a few basis vectors. For each word, the compo…

    Submitted 17 November, 2017; v1 submitted 3 November, 2017; originally announced November 2017.

  48. arXiv:1707.01830  [pdf, other]

    cs.CL

    Single-Queue Decoding for Neural Machine Translation

    Authors: Raphael Shu, Hideki Nakayama

    Abstract: Neural machine translation models rely on the beam search algorithm for decoding. In practice, we found that the quality of hypotheses in the search space is negatively affected owing to the fixed beam size. To mitigate this problem, we store all hypotheses in a single priority queue and use a universal score function for hypothesis selection. The proposed algorithm is more flexible as the discard…

    Submitted 8 July, 2017; v1 submitted 6 July, 2017; originally announced July 2017.

  49. arXiv:1704.03169  [pdf, other]

    cs.CL

    Later-stage Minimum Bayes-Risk Decoding for Neural Machine Translation

    Authors: Raphael Shu, Hideki Nakayama

    Abstract: For a long time, sequence generation models have relied on the beam search algorithm to generate output sequences. However, the correctness of beam search degrades when the model is over-confident about a suboptimal prediction. In this paper, we propose to perform minimum Bayes-risk (MBR) decoding for some extra steps at a later stage. In order to speed up MBR decoding, we compute the Bayes ris…

    Submitted 8 June, 2017; v1 submitted 11 April, 2017; originally announced April 2017.

  50. arXiv:1702.00182  [pdf]

    cs.MM cs.GR

    Inkjet printing-based volumetric display projecting multiple full-colour 2D patterns

    Authors: Ryuji Hirayama, Tomotaka Suzuki, Tomoyoshi Shimobaba, Atsushi Shiraki, Makoto Naruse, Hirotaka Nakayama, Takashi Kakue, Tomoyoshi Ito

    Abstract: In this study, a method to construct a full-colour volumetric display is presented using a commercially available inkjet printer. Photoreactive luminescence materials are minutely and automatically printed as the volume elements, and volumetric displays are constructed with high resolution using easy-to-fabricate means that exploit inkjet printing technologies. The results experimentally demonstra…

    Submitted 1 February, 2017; originally announced February 2017.