Showing 1–15 of 15 results for author: Narihira, T

  1. arXiv:2405.17251  [pdf, other]

    cs.CV

    GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping

    Authors: Junyoung Seo, Kazumi Fukuda, Takashi Shibuya, Takuya Narihira, Naoki Murata, Shoukang Hu, Chieh-Hsin Lai, Seungryong Kim, Yuki Mitsufuji

    Abstract: Generating novel views from a single image remains a challenging task due to the complexity of 3D scenes and the limited diversity in the existing multi-view datasets to train a model on. Recent research combining large-scale text-to-image (T2I) models with monocular depth estimation (MDE) has shown promise in handling in-the-wild images. In these methods, an input view is geometrically warped to…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Project page: https://GenWarp-NVS.github.io

  2. arXiv:2303.15780  [pdf, other]

    cs.CV

    Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion

    Authors: Hiromichi Kamata, Yuiko Sakuma, Akio Hayakawa, Masato Ishii, Takuya Narihira

    Abstract: We propose a high-quality 3D-to-3D conversion method, Instruct 3D-to-3D. Our method is designed for a novel task, which is to convert a given 3D scene to another scene according to text instructions. Instruct 3D-to-3D applies pretrained Image-to-Image diffusion models for 3D-to-3D conversion. This enables the likelihood maximization of each viewpoint image and high-quality 3D generation. In additi…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: Project page: https://sony.github.io/Instruct3Dto3D-doc/

  3. arXiv:2303.13121  [pdf, other]

    cs.CV

    DetOFA: Efficient Training of Once-for-All Networks for Object Detection Using Path Filter

    Authors: Yuiko Sakuma, Masato Ishii, Takuya Narihira

    Abstract: We address the challenge of training a large supernet for the object detection task, using a relatively small amount of training data. Specifically, we propose an efficient supernet-based neural architecture search (NAS) method that uses search space pruning. The search space defined by the supernet is pruned by removing candidate models that are predicted to perform poorly. To effectively remove…

    Submitted 19 October, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted to ICCV workshop 2023

  4. arXiv:2302.00675  [pdf, other]

    cs.CV cs.GR

    NDJIR: Neural Direct and Joint Inverse Rendering for Geometry, Lights, and Materials of Real Object

    Authors: Kazuki Yoshiyama, Takuya Narihira

    Abstract: The goal of inverse rendering is to decompose geometry, lights, and materials given posed multi-view images. To achieve this goal, we propose neural direct and joint inverse rendering, NDJIR. Different from prior works which rely on some approximations of the rendering equation, NDJIR directly addresses the integrals in the rendering equation and jointly decomposes geometry: signed distance funct…

    Submitted 2 February, 2023; originally announced February 2023.

    Comments: 26 pages

  5. arXiv:2212.02024  [pdf, other]

    cs.CV cs.LG

    Fine-grained Image Editing by Pixel-wise Guidance Using Diffusion Models

    Authors: Naoki Matsunaga, Masato Ishii, Akio Hayakawa, Kenji Suzuki, Takuya Narihira

    Abstract: Our goal is to develop fine-grained real-image editing methods suitable for real-world applications. In this paper, we first summarize four requirements for these methods and propose a novel diffusion-based image editing framework with pixel-wise guidance that satisfies these requirements. Specifically, we train pixel-classifiers with a few annotated data and then infer the segmentation map of a t…

    Submitted 31 May, 2023; v1 submitted 4 December, 2022; originally announced December 2022.

    Comments: Accepted by AI for Content Creation (AI4CC) workshop at CVPR 2023

  6. arXiv:2202.10758  [pdf, other]

    cs.CV cs.LG

    Thinking the Fusion Strategy of Multi-reference Face Reenactment

    Authors: Takuya Yashima, Takuya Narihira, Tamaki Kojima

    Abstract: In recent advances of deep generative models, face reenactment (manipulating and controlling a human face, including head movement) has drawn much attention for its wide range of applicability. Despite its strong expressiveness, it is inevitable that the models fail to reconstruct or accurately generate the unseen side of the face from a given single reference image. Most of the existing methods alleviat…

    Submitted 22 February, 2022; originally announced February 2022.

    Comments: Submitted to ICIP2022, 5 pages, 3 figures, 3 tables

  7. arXiv:2103.11807  [pdf, other]

    cs.LG cs.AI

    Data Cleansing for Deep Neural Networks with Storage-efficient Approximation of Influence Functions

    Authors: Kenji Suzuki, Yoshiyuki Kobayashi, Takuya Narihira

    Abstract: Identifying the influence of training data for data cleansing can improve the accuracy of deep learning. An approach with stochastic gradient descent (SGD) called SGD-influence was proposed to calculate the influence scores, but its calculation costs are high. It is necessary to temporarily store the parameters of the model during the training phase for the inference phase to calculate influence sore…

    Submitted 1 June, 2021; v1 submitted 22 March, 2021; originally announced March 2021.

  8. arXiv:2103.04037  [pdf, other]

    cs.CV cs.CL

    Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision

    Authors: Andrew Shin, Masato Ishii, Takuya Narihira

    Abstract: Transformer architectures have brought about fundamental changes to the computational linguistics field, which had been dominated by recurrent neural networks for many years. Their success also implies drastic changes in cross-modal tasks with language and vision, and many researchers have already tackled the issue. In this paper, we review some of the most critical milestones in the field, as well as ov…

    Submitted 9 November, 2021; v1 submitted 6 March, 2021; originally announced March 2021.

    Comments: Accepted for publication by International Journal of Computer Vision (IJCV)

  9. arXiv:2102.06725  [pdf, other]

    cs.LG cs.CV

    Neural Network Libraries: A Deep Learning Framework Designed from Engineers' Perspectives

    Authors: Takuya Narihira, Javier Alonsogarcia, Fabien Cardinaux, Akio Hayakawa, Masato Ishii, Kazunori Iwaki, Thomas Kemp, Yoshiyuki Kobayashi, Lukas Mauch, Akira Nakamura, Yukio Obuchi, Andrew Shin, Kenji Suzuki, Stephen Tiedmann, Stefan Uhlich, Takuya Yashima, Kazuki Yoshiyama

    Abstract: While there exists a plethora of deep learning tools and frameworks, the fast-growing complexity of the field brings new demands and challenges, such as more flexible network design, speedy computation in distributed settings, and compatibility between different tools. In this paper, we introduce Neural Network Libraries (https://nnabla.org), a deep learning framework designed from engineer's perspe…

    Submitted 21 June, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: https://nnabla.org

  10. arXiv:2011.12528  [pdf, other]

    cs.CV

    Reference-Based Video Colorization with Spatiotemporal Correspondence

    Authors: Naofumi Akimoto, Akio Hayakawa, Andrew Shin, Takuya Narihira

    Abstract: We propose a novel reference-based video colorization framework with spatiotemporal correspondence. Reference-based methods colorize grayscale frames referencing a user input color frame. Existing methods suffer from the color leakage between objects and the emergence of average colors, derived from non-local semantic correspondence in space. To address this issue, we warp colors only from the reg…

    Submitted 25 November, 2020; originally announced November 2020.

  11. arXiv:2010.14109  [pdf, other]

    cs.LG

    Out-of-core Training for Extremely Large-Scale Neural Networks With Adaptive Window-Based Scheduling

    Authors: Akio Hayakawa, Takuya Narihira

    Abstract: While large neural networks demonstrate higher performance in various tasks, training large networks is difficult due to limitations on GPU memory size. We propose a novel out-of-core algorithm that enables faster training of extremely large-scale neural networks with sizes larger than allotted GPU memory. Under a given memory budget constraint, our scheduling algorithm locally adapts the timing o…

    Submitted 27 October, 2020; originally announced October 2020.

  12. arXiv:1908.03343  [pdf, ps, other]

    cs.LG cs.AI cs.RO

    Fully Convolutional Search Heuristic Learning for Rapid Path Planners

    Authors: Yuka Ariki, Takuya Narihira

    Abstract: Path-planning algorithms are an important part of a wide variety of robotic applications, such as mobile robot navigation and robot arm manipulation. However, in large search spaces in which local traps may exist, it remains challenging to reliably find a path while satisfying real-time constraints. Efforts to speed up the path search have led to the development of many practical path-planning alg…

    Submitted 9 August, 2019; originally announced August 2019.

    Comments: 11 pages, 4 figures

  13. arXiv:1512.02767  [pdf, other]

    cs.CV cs.LG cs.NE

    Affinity CNN: Learning Pixel-Centric Pairwise Relations for Figure/Ground Embedding

    Authors: Michael Maire, Takuya Narihira, Stella X. Yu

    Abstract: Spectral embedding provides a framework for solving perceptual organization problems, including image segmentation and figure/ground organization. From an affinity matrix describing pairwise relationships between pixels, it clusters pixels into regions, and, using a complex-valued extension, orders pixels according to layer. We train a convolutional neural network (CNN) to directly predict the pai…

    Submitted 11 April, 2016; v1 submitted 9 December, 2015; originally announced December 2015.

    Comments: minor updates; extended version of CVPR 2016 conference paper

  14. arXiv:1512.02311  [pdf, other]

    cs.CV

    Direct Intrinsics: Learning Albedo-Shading Decomposition by Convolutional Regression

    Authors: Takuya Narihira, Michael Maire, Stella X. Yu

    Abstract: We introduce a new approach to intrinsic image decomposition, the task of decomposing a single image into albedo and shading components. Our strategy, which we term direct intrinsics, is to learn a convolutional neural network (CNN) that directly predicts output albedo and shading channels from an input RGB image patch. Direct intrinsics is a departure from classical techniques for intrinsic image…

    Submitted 7 December, 2015; originally announced December 2015.

    Comments: International Conference on Computer Vision (ICCV), 2015

  15. arXiv:1511.06838  [pdf, other]

    cs.CV cs.CL

    Mapping Images to Sentiment Adjective Noun Pairs with Factorized Neural Nets

    Authors: Takuya Narihira, Damian Borth, Stella X. Yu, Karl Ni, Trevor Darrell

    Abstract: We consider the visual sentiment task of mapping an image to an adjective noun pair (ANP) such as "cute baby". To capture the two-factor structure of our ANP semantics as well as to overcome annotation noise and ambiguity, we propose a novel factorized CNN model which learns separate representations for adjectives and nouns but optimizes the classification performance over their product. Our exper…

    Submitted 20 November, 2015; originally announced November 2015.