Showing 1–50 of 61 results for author: Yi, K M

  1. arXiv:2404.13024

    cs.CV eess.IV

    BANF: Band-limited Neural Fields for Levels of Detail Reconstruction

    Authors: Ahan Shabanov, Shrisudhan Govindarajan, Cody Reading, Lily Goli, Daniel Rebain, Kwang Moo Yi, Andrea Tagliasacchi

    Abstract: Largely due to their implicit nature, neural fields lack a direct mechanism for filtering, as Fourier analysis from discrete signal processing is not directly applicable to these representations. Effective filtering of neural fields is critical to enable level-of-detail processing in downstream applications, and support operations that involve sampling the field on regular grids (e.g. marching cub…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Project Page: https://theialab.github.io/banf

  2. arXiv:2404.12547

    cs.CV

    Evaluating Alternatives to SFM Point Cloud Initialization for Gaussian Splatting

    Authors: Yalda Foroutan, Daniel Rebain, Kwang Moo Yi, Andrea Tagliasacchi

    Abstract: 3D Gaussian Splatting has recently been embraced as a versatile and effective method for scene reconstruction and novel view synthesis, owing to its high-quality results and compatibility with hardware rasterization. Despite its advantages, Gaussian Splatting's reliance on high-quality point cloud initialization by Structure-from-Motion (SFM) algorithms is a significant limitation to be overcome.…

    Submitted 23 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  3. arXiv:2404.09591

    cs.CV

    3D Gaussian Splatting as Markov Chain Monte Carlo

    Authors: Shakiba Kheradmand, Daniel Rebain, Gopal Sharma, Weiwei Sun, Jeff Tseng, Hossam Isack, Abhishek Kar, Andrea Tagliasacchi, Kwang Moo Yi

    Abstract: While 3D Gaussian Splatting has recently become popular for neural rendering, current methods rely on carefully engineered cloning and splitting strategies for placing Gaussians, which can lead to poor-quality renderings, and reliance on a good initialization. In this work, we rethink the set of 3D Gaussians as a random sample drawn from an underlying probability distribution describing the physic…

    Submitted 16 June, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  4. arXiv:2404.08327

    cs.CV

    Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training

    Authors: Hyesong Choi, Hyejin Park, Kwang Moo Yi, Sungmin Cha, Dongbo Min

    Abstract: In this paper, we introduce Saliency-Based Adaptive Masking (SBAM), a novel and cost-effective approach that significantly enhances the pre-training performance of Masked Image Modeling (MIM) approaches by prioritizing token salience. Our method provides robustness against variations in masking ratios, effectively mitigating the performance instability issues common in existing methods. This relax…

    Submitted 12 April, 2024; originally announced April 2024.

  5. arXiv:2312.12416

    cs.CV cs.LG

    Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models

    Authors: Shweta Mahajan, Tanzila Rahman, Kwang Moo Yi, Leonid Sigal

    Abstract: The quality of the prompts provided to text-to-image diffusion models determines how faithful the generated content is to the user's intent, often requiring 'prompt engineering'. To harness visual concepts from target images without prompt engineering, current approaches largely rely on embedding inversion by optimizing and then mapping them to pseudo-tokens. However, working with such high-dimens…

    Submitted 19 December, 2023; originally announced December 2023.

  6. arXiv:2312.06799

    cs.CV cs.LG

    Densify Your Labels: Unsupervised Clustering with Bipartite Matching for Weakly Supervised Point Cloud Segmentation

    Authors: Shaobo Xia, Jun Yue, Kacper Kania, Leyuan Fang, Andrea Tagliasacchi, Kwang Moo Yi, Weiwei Sun

    Abstract: We propose a weakly supervised semantic segmentation method for point clouds that predicts "per-point" labels from just "whole-scene" annotations while achieving the performance of recent fully supervised approaches. Our core idea is to propagate the scene-level labels to each point in the point cloud by creating pseudo labels in a conservative way. Specifically, we over-segment point cloud featur…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: The first two authors contributed equally; Project website: https://densify-your-labels.github.io/

  7. arXiv:2312.02362

    cs.CV cs.GR

    PointNeRF++: A multi-scale, point-based Neural Radiance Field

    Authors: Weiwei Sun, Eduard Trulls, Yang-Che Tseng, Sneha Sambandam, Gopal Sharma, Andrea Tagliasacchi, Kwang Moo Yi

    Abstract: Point clouds offer an attractive source of information to complement images in neural scene representations, especially when few images are available. Neural rendering methods based on point clouds do exist, but they do not perform well when the point cloud quality is low -- e.g., sparse or incomplete, which is often the case with real-world data. We overcome these problems with a simple represent…

    Submitted 21 March, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: Project website: https://pointnerfpp.github.io/

  8. arXiv:2312.02202

    cs.GR cs.CV

    Volumetric Rendering with Baked Quadrature Fields

    Authors: Gopal Sharma, Daniel Rebain, Kwang Moo Yi, Andrea Tagliasacchi

    Abstract: We propose a novel Neural Radiance Field (NeRF) representation for non-opaque scenes that allows fast inference by utilizing textured polygons. Despite the high-quality novel view rendering that NeRF provides, a critical limitation is that it relies on volume rendering that can be computationally expensive and does not utilize the advancements in modern graphics hardware. Existing methods for this…

    Submitted 2 December, 2023; originally announced December 2023.

  9. arXiv:2312.01305

    cs.CV cs.AI cs.GR

    ViVid-1-to-3: Novel View Synthesis with Video Diffusion Models

    Authors: Jeong-gi Kwak, Erqun Dong, Yuhe Jin, Hanseok Ko, Shweta Mahajan, Kwang Moo Yi

    Abstract: Generating novel views of an object from a single image is a challenging task. It requires an understanding of the underlying 3D structure of the object from an image and rendering high-quality, spatially consistent new views. While recent methods for view synthesis based on diffusion have shown great progress, achieving consistency among various view estimates and at the same time abiding by the…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: Project page: https://jgkwak95.github.io/ViVid-1-to-3/

  10. arXiv:2312.00075

    cs.CV

    Accelerating Neural Field Training via Soft Mining

    Authors: Shakiba Kheradmand, Daniel Rebain, Gopal Sharma, Hossam Isack, Abhishek Kar, Andrea Tagliasacchi, Kwang Moo Yi

    Abstract: We present an approach to accelerate Neural Field training by efficiently selecting sampling locations. While Neural Fields have recently become popular, they are often trained by uniformly sampling the training domain, or through handcrafted heuristics. We show that improved convergence and final training quality can be achieved by a soft mining technique based on importance sampling: rather than ei…

    Submitted 29 November, 2023; originally announced December 2023.

  11. arXiv:2312.00065

    cs.CV

    Unsupervised Keypoints from Pretrained Diffusion Models

    Authors: Eric Hedlin, Gopal Sharma, Shweta Mahajan, Xingzhe He, Hossam Isack, Abhishek Kar, Helge Rhodin, Andrea Tagliasacchi, Kwang Moo Yi

    Abstract: Unsupervised learning of keypoints and landmarks has seen significant progress with the help of modern neural network architectures, but performance is yet to match the supervised counterpart, making their practicability questionable. We leverage the emergent knowledge within text-to-image diffusion models, towards more robust unsupervised keypoints. Our core idea is to find text embeddings that w…

    Submitted 21 May, 2024; v1 submitted 29 November, 2023; originally announced December 2023.

  12. arXiv:2307.07663

    cs.CV

    INVE: Interactive Neural Video Editing

    Authors: Jiahui Huang, Leonid Sigal, Kwang Moo Yi, Oliver Wang, Joon-Young Lee

    Abstract: We present Interactive Neural Video Editing (INVE), a real-time video editing solution, which can assist the video editing process by consistently propagating sparse frame edits to the entire video clip. Our method is inspired by the recent work on Layered Neural Atlas (LNA). LNA, however, suffers from two major drawbacks: (1) the method is too slow for interactive editing, and (2) it offers insuf…

    Submitted 14 July, 2023; originally announced July 2023.

  13. arXiv:2306.16485

    astro-ph.IM astro-ph.SR

    StarUnLink: identifying and mitigating signals from communications satellites in stellar spectral surveys

    Authors: Spencer Bialek, Sara Lucatello, Sebastien Fabbro, Kwang Moo Yi, Kim Venn

    Abstract: A relatively new concern for the forthcoming massive spectroscopic sky surveys is the impact of contamination from low earth orbit satellites. Several hundred thousand of these satellites are licensed for launch in the next few years and it has been estimated that, in some cases, up to a few percent of spectra could be contaminated when using wide field, multi-fiber spectrographs. In this paper, a…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: 15 pages. To be published in MNRAS

  14. arXiv:2305.15581

    cs.CV

    Unsupervised Semantic Correspondence Using Stable Diffusion

    Authors: Eric Hedlin, Gopal Sharma, Shweta Mahajan, Hossam Isack, Abhishek Kar, Andrea Tagliasacchi, Kwang Moo Yi

    Abstract: Text-to-image diffusion models are now capable of generating images that are often indistinguishable from real images. To generate such images, these models must understand the semantics of the objects they are asked to generate. In this work we show that, without any training, one can leverage this semantic knowledge within diffusion models to find semantic correspondences - locations in multiple…

    Submitted 23 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Project website: https://github.com/ubc-vision/LDM_correspondences

  15. arXiv:2305.11111

    astro-ph.EP astro-ph.IM cs.LG

    PPDONet: Deep Operator Networks for Fast Prediction of Steady-State Solutions in Disk-Planet Systems

    Authors: Shunyuan Mao, Ruobing Dong, Lu Lu, Kwang Moo Yi, Sifan Wang, Paris Perdikaris

    Abstract: We develop a tool, which we name Protoplanetary Disk Operator Network (PPDONet), that can predict the solution of disk-planet interactions in protoplanetary disks in real-time. We base our tool on Deep Operator Networks (DeepONets), a class of neural networks capable of learning non-linear operators to represent deterministic and stochastic differential equations. With PPDONet we map three scalar…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: 10 pages, 6 figures, 2 tables; ApJL accepted

  16. arXiv:2305.07514

    cs.CV cs.GR

    BlendFields: Few-Shot Example-Driven Facial Modeling

    Authors: Kacper Kania, Stephan J. Garbin, Andrea Tagliasacchi, Virginia Estellers, Kwang Moo Yi, Julien Valentin, Tomasz Trzciński, Marek Kowalski

    Abstract: Generating faithful visualizations of human faces requires capturing both coarse and fine-level details of the face geometry and appearance. Existing methods are either data-driven, requiring an extensive corpus of data not publicly accessible to the research community, or fail to capture fine details because they rely on geometric face models that cannot represent fine-grained details in texture…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: Accepted to CVPR 2023. Project page: https://blendfields.github.io/

  17. arXiv:2304.13141

    cs.CV cs.GR

    CN-DHF: Compact Neural Double Height-Field Representations of 3D Shapes

    Authors: Eric Hedlin, Jinfan Yang, Nicholas Vining, Kwang Moo Yi, Alla Sheffer

    Abstract: We introduce CN-DHF (Compact Neural Double-Height-Field), a novel hybrid neural implicit 3D shape representation that is dramatically more compact than the current state of the art. Our representation leverages Double-Height-Field (DHF) geometries, defined as closed shapes bounded by a pair of oppositely oriented height-fields that share a common axis, and leverages the following key observations:…

    Submitted 26 April, 2023; v1 submitted 29 March, 2023; originally announced April 2023.

    Comments: Eric Hedlin and Jinfan Yang contributed equally to this work

  18. arXiv:2304.12390

    cs.CV cs.GR

    Pointersect: Neural Rendering with Cloud-Ray Intersection

    Authors: Jen-Hao Rick Chang, Wei-Yu Chen, Anurag Ranjan, Kwang Moo Yi, Oncel Tuzel

    Abstract: We propose a novel method that renders point clouds as if they are surfaces. The proposed method is differentiable and requires no scene-specific optimization. This unique capability enables, out-of-the-box, surface normal estimation, rendering room-scale point clouds, inverse rendering, and ray tracing with global illumination. Unlike existing work that focuses on converting point clouds to other…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  19. arXiv:2303.15437

    cs.CV

    FaceLit: Neural 3D Relightable Faces

    Authors: Anurag Ranjan, Kwang Moo Yi, Jen-Hao Rick Chang, Oncel Tuzel

    Abstract: We propose a generative framework, FaceLit, capable of generating a 3D face that can be rendered at various user-defined lighting conditions and views, learned purely from 2D images in-the-wild without any manual annotation. Unlike existing works that require careful capture setup or human labor, we rely on off-the-shelf pose and illumination estimators. With these estimates, we incorporate the Ph…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  20. arXiv:2212.01735

    cs.CV cs.AI cs.GR

    Neural Fourier Filter Bank

    Authors: Zhijie Wu, Yuhe Jin, Kwang Moo Yi

    Abstract: We present a novel method to provide efficient and highly detailed reconstructions. Inspired by wavelets, we learn a neural field that decomposes the signal both spatially and frequency-wise. We follow the recent grid-based paradigm for spatial decomposition, but unlike existing work, encourage specific frequencies to be stored in each grid via Fourier features encodings. We then apply a multi-laye…

    Submitted 24 August, 2023; v1 submitted 3 December, 2022; originally announced December 2022.

  21. arXiv:2210.15121

    cs.CV

    Bootstrapping Human Optical Flow and Pose

    Authors: Aritro Roy Arko, James J. Little, Kwang Moo Yi

    Abstract: We propose a bootstrapping framework to enhance human optical flow and pose. We show that, for videos involving humans in scenes, we can improve both the optical flow and the pose estimation quality of humans by considering the two tasks at the same time. We enhance optical flow estimates by fine-tuning them to fit the human pose estimates and vice versa. In more detail, we optimize the pose and o…

    Submitted 28 October, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted at BMVC 2022. Supplementary qualitative results: https://aritro30.github.io/results/. Code at https://github.com/ubc-vision/bootstrapping-human-optical-flow-and-pose

  22. arXiv:2209.10684

    cs.CV

    Attention Beats Concatenation for Conditioning Neural Fields

    Authors: Daniel Rebain, Mark J. Matthews, Kwang Moo Yi, Gopal Sharma, Dmitry Lagun, Andrea Tagliasacchi

    Abstract: Neural fields model signals by mapping coordinate inputs to sampled values. They are becoming an increasingly important backbone architecture across many fields from vision and graphics to biology and astronomy. In this paper, we explore the differences between common conditioning mechanisms within these networks, an essential ingredient in shifting neural fields from memorization of signals to ge…

    Submitted 21 September, 2022; originally announced September 2022.

  23. arXiv:2208.02337

    cs.CV cs.MM cs.SD eess.AS

    Estimating Visual Information From Audio Through Manifold Learning

    Authors: Fabrizio Pedersoli, Dryden Wiebe, Amin Banitalebi, Yong Zhang, George Tzanetakis, Kwang Moo Yi

    Abstract: We propose a new framework for extracting visual information about a scene only using audio signals. Audio-based methods can overcome some of the limitations of vision-based methods i.e., they do not require "line-of-sight", are robust to occlusions and changes in illumination, and can function as a backup in case vision/lidar sensors fail. Therefore, audio-based methods can be useful even for app…

    Submitted 13 September, 2022; v1 submitted 3 August, 2022; originally announced August 2022.

  24. arXiv:2207.09978

    cs.CV cs.GR

    NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds

    Authors: Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi

    Abstract: We introduce a method for instance proposal generation for 3D point clouds. Existing techniques typically directly regress proposals in a single feed-forward step, leading to inaccurate estimation. We show that this serves as a critical bottleneck, and propose a method based on iterative bilateral filtering with learned kernels. Following the spirit of bilateral filtering, we consider both the dee…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: Project website: https://neuralbf.github.io

  25. arXiv:2206.08460

    cs.CV cs.LG

    TUSK: Task-Agnostic Unsupervised Keypoints

    Authors: Yuhe Jin, Weiwei Sun, Jan Hosang, Eduard Trulls, Kwang Moo Yi

    Abstract: Existing unsupervised methods for keypoint learning rely heavily on the assumption that a specific keypoint type (e.g. elbow, digit, abstract geometric shape) appears only once in an image. This greatly limits their applicability, as each instance must be isolated before applying the method -- an issue that is never discussed or evaluated. We thus propose a novel method to learn Task-agnostic, UnSupe…

    Submitted 12 January, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

  26. arXiv:2205.14147

    eess.IV cs.LG

    FlowNet-PET: Unsupervised Learning to Perform Respiratory Motion Correction in PET Imaging

    Authors: Teaghan O'Briain, Carlos Uribe, Kwang Moo Yi, Jonas Teuwen, Ioannis Sechopoulos, Magdalena Bazalova-Carter

    Abstract: To correct for respiratory motion in PET imaging, an interpretable and unsupervised deep learning technique, FlowNet-PET, was constructed. The network was trained to predict the optical flow between two PET frames from different breathing amplitude ranges. The trained model aligns different retrospectively-gated PET images, providing a final image with similar counting statistics as a non-gated im…

    Submitted 2 August, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

  27. arXiv:2205.00076

    cs.CV

    A Simple Method to Boost Human Pose Estimation Accuracy by Correcting the Joint Regressor for the Human3.6m Dataset

    Authors: Eric Hedlin, Helge Rhodin, Kwang Moo Yi

    Abstract: Many human pose estimation methods estimate Skinned Multi-Person Linear (SMPL) models and regress the human joints from these SMPL estimates. In this work, we show that the most widely used SMPL-to-joint linear layer (joint regressor) is inaccurate, which may mislead pose evaluation results. To achieve a more accurate joint regressor, we propose a method to create pseudo-ground-truth SMPL poses, w…

    Submitted 29 April, 2022; originally announced May 2022.

  28. arXiv:2203.12575

    cs.CV

    NeuMan: Neural Human Radiance Field from a Single Video

    Authors: Wei Jiang, Kwang Moo Yi, Golnoosh Samei, Oncel Tuzel, Anurag Ranjan

    Abstract: Photorealistic rendering and reposing of humans is important for enabling augmented reality experiences. We propose a novel framework to reconstruct the human and the scene that can be rendered with novel human poses and views from just a single in-the-wild video. Given a video captured by a moving camera, we train two NeRF models: a human NeRF model and a scene NeRF model. To train these models,…

    Submitted 21 September, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

  29. arXiv:2203.03570

    cs.CV cs.GR cs.LG

    Kubric: A scalable dataset generator

    Authors: Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh-Ti Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehdi S. M. Sajjadi, et al. (10 additional authors not shown)

    Abstract: Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details. But collecting, processing and annotating real data at scale is difficult, expensive, and frequently raises additional privacy, fairness and legal concerns. Synthetic data is a powerful tool with the potential…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: 21 pages, CVPR2022

  30. Repurposing Existing Deep Networks for Caption and Aesthetic-Guided Image Cropping

    Authors: Nora Horanyi, Kedi Xia, Kwang Moo Yi, Abhishake Kumar Bojja, Ales Leonardis, Hyung Jin Chang

    Abstract: We propose a novel optimization framework that crops a given image based on user description and aesthetics. Unlike existing image cropping methods, where one typically trains a deep network to regress to crop parameters or cropping actions, we propose to directly optimize for the cropping parameters by repurposing pre-trained networks on image captioning and aesthetic tasks, without any fine-tuni…

    Submitted 6 January, 2022; originally announced January 2022.

    Journal ref: Pattern Recognition, 2022, 108485, ISSN 0031-3203

  31. arXiv:2112.01983

    cs.CV cs.GR

    CoNeRF: Controllable Neural Radiance Fields

    Authors: Kacper Kania, Kwang Moo Yi, Marek Kowalski, Tomasz Trzciński, Andrea Tagliasacchi

    Abstract: We extend neural 3D representations to allow for intuitive and interpretable user control beyond novel view rendering (i.e. camera control). We allow the user to annotate which part of the scene one wishes to control with just a small number of mask annotations in the training images. Our key idea is to treat the attributes as latent variables that are regressed by the neural network given the sce…

    Submitted 6 December, 2021; v1 submitted 3 December, 2021; originally announced December 2021.

    Comments: Project page: https://conerf.github.io/

  32. arXiv:2111.12747

    cs.CV

    Layered Controllable Video Generation

    Authors: Jiahui Huang, Yuhe Jin, Kwang Moo Yi, Leonid Sigal

    Abstract: We introduce layered controllable video generation, where we, without any supervision, decompose the initial frame of a video into foreground and background layers, with which the user can control the video generation process by simply manipulating the foreground mask. The key challenges are the unsupervised foreground-background separation, which is ambiguous, and ability to anticipate user manip…

    Submitted 30 September, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: This paper has been accepted to ECCV 2022 as an Oral paper

  33. arXiv:2111.09996

    cs.CV

    LOLNeRF: Learn from One Look

    Authors: Daniel Rebain, Mark Matthews, Kwang Moo Yi, Dmitry Lagun, Andrea Tagliasacchi

    Abstract: We present a method for learning a generative 3D model based on neural radiance fields, trained solely from data with only single views of each object. While generating realistic images is no longer a difficult task, producing the corresponding 3D structure such that they can be rendered from different views is non-trivial. We show that, unlike existing methods, one does not need multi-view data t…

    Submitted 25 April, 2022; v1 submitted 18 November, 2021; originally announced November 2021.

    Comments: See https://lolnerf.github.io for additional results

  34. arXiv:2106.03804

    cs.GR cs.CV

    Deep Medial Fields

    Authors: Daniel Rebain, Ke Li, Vincent Sitzmann, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi

    Abstract: Implicit representations of geometry, such as occupancy fields or signed distance fields (SDF), have recently re-gained popularity in encoding 3D solid shape in a functional form. In this work, we introduce medial fields: a field function derived from the medial axis transform (MAT) that makes available information about the underlying 3D geometry that is immediately useful for a number of downstr…

    Submitted 7 June, 2021; originally announced June 2021.

  35. arXiv:2103.14167

    cs.CV

    COTR: Correspondence Transformer for Matching Across Images

    Authors: Wei Jiang, Eduard Trulls, Jan Hosang, Andrea Tagliasacchi, Kwang Moo Yi

    Abstract: We propose a novel framework for finding correspondences in images based on a deep neural network that, given two images and a query point in one of them, finds its correspondence in the other. By doing so, one has the option to query only the points of interest and retrieve sparse correspondences, or to query all points in an image and obtain dense mappings. Importantly, in order to capture both…

    Submitted 17 August, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

  36. Convolutional neural network identification of galaxy post-mergers in UNIONS using IllustrisTNG

    Authors: Robert W. Bickley, Connor Bottrell, Maan H. Hani, Sara L. Ellison, Hossen Teimoorinia, Kwang Moo Yi, Scott Wilkinson, Stephen Gwyn, Michael J. Hudson

    Abstract: The Canada-France Imaging Survey (CFIS) will consist of deep, high-resolution r-band imaging over ~5000 square degrees of the sky, representing a first-rate opportunity to identify recently-merged galaxies. Due to the large number of galaxies in CFIS, we investigate the use of a convolutional neural network (CNN) for automated merger classification. Training samples of post-merger and isolated gal…

    Submitted 18 March, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

    Comments: 21 pages, 19 figures, 2 tables, Accepted for publication in MNRAS

  37. arXiv:2012.04718

    cs.CV cs.LG

    Canonical Capsules: Self-Supervised Capsules in Canonical Pose

    Authors: Weiwei Sun, Andrea Tagliasacchi, Boyang Deng, Sara Sabour, Soroosh Yazdani, Geoffrey Hinton, Kwang Moo Yi

    Abstract: We propose a self-supervised capsule architecture for 3D point clouds. We compute capsule decompositions of objects through permutation-equivariant attention, and self-supervise the process by training with pairs of randomly rotated objects. Our key idea is to aggregate the attention masks into semantic keypoints, and use these to supervise a decomposition that satisfies the capsule invariance/equ…

    Submitted 24 November, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: NeurIPS 2021; The first two authors contributed equally; Project website: https://canonical-capsules.github.io

  38. arXiv:2011.12490

    cs.CV cs.GR

    DeRF: Decomposed Radiance Fields

    Authors: Daniel Rebain, Wei Jiang, Soroosh Yazdani, Ke Li, Kwang Moo Yi, Andrea Tagliasacchi

    Abstract: With the advent of Neural Radiance Fields (NeRF), neural networks can now render novel views of a 3D scene with quality that fools the human eye. Yet, generating these images is very computationally intensive, limiting their applicability in practical scenarios. In this paper, we propose a technique based on spatial decomposition capable of mitigating this issue. Our key observation is that there…

    Submitted 24 November, 2020; originally announced November 2020.

  39. arXiv:2007.03112

    astro-ph.SR astro-ph.GA stat.ML

    Interpreting Stellar Spectra with Unsupervised Domain Adaptation

    Authors: Teaghan O'Briain, Yuan-Sen Ting, Sébastien Fabbro, Kwang M. Yi, Kim Venn, Spencer Bialek

    Abstract: We discuss how to achieve mapping from large sets of imperfect simulations and observational data with unsupervised domain adaptation. Under the hypothesis that simulated and observed data distributions share a common underlying representation, we show how it is possible to transfer between simulated and observed domains. Driven by an application to interpret stellar spectroscopic sky surveys, we…

    Submitted 6 July, 2020; originally announced July 2020.

    Comments: 4 pages, 4 figures, accepted to the ICML 2020 Machine Learning Interpretability for Scientific Discovery workshop. A full 20-page version is submitted to ApJ. The code used in this study is made publicly available on GitHub: https://github.com/teaghan/Cycle_SN

  40. arXiv:2007.03109

    astro-ph.SR astro-ph.GA astro-ph.IM physics.data-an stat.ML

    Cycle-StarNet: Bridging the gap between theory and data by leveraging large datasets

    Authors: Teaghan O'Briain, Yuan-Sen Ting, Sébastien Fabbro, Kwang M. Yi, Kim Venn, Spencer Bialek

    Abstract: The advancements in stellar spectroscopy data acquisition have made it necessary to accomplish similar improvements in efficient data analysis techniques. Current automated methods for analyzing spectra are either (a) data-driven, which requires prior knowledge of stellar parameters and elemental abundances, or (b) based on theoretical synthetic models that are susceptible to the gap between theor…

    Submitted 13 November, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: 23 pages, 15 figures, 2 tables, accepted for publication on Nov 12, 2020. A companion 4-page preview is accepted to the ICML 2020 Machine Learning Interpretability for Scientific Discovery workshop. The code used in this study is made publicly available on GitHub: https://github.com/teaghan/Cycle_SN

    Journal ref: 2021, ApJ, 906, 130

  41. arXiv:2004.07931

    cs.CV

    Eigendecomposition-Free Training of Deep Networks for Linear Least-Square Problems

    Authors: Zheng Dang, Kwang Moo Yi, Yinlin Hu, Fei Wang, Pascal Fua, Mathieu Salzmann

    Abstract: Many classical Computer Vision problems, such as essential matrix computation and pose estimation from 3D to 2D correspondences, can be tackled by solving a linear least-square problem, which can be done by finding the eigenvector corresponding to the smallest, or zero, eigenvalue of a matrix representing a linear system. Incorporating this in deep learning frameworks would allow us to explicitly…

    Submitted 15 April, 2020; originally announced April 2020.

    Comments: 16 pages, Accepted by TPAMI. arXiv admin note: substantial text overlap with arXiv:1803.08071

  42. arXiv:2003.11249  [pdf, other

    cs.LG cs.CV stat.ML

    VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning

    Authors: Jongwon Choi, Kwang Moo Yi, Jihoon Kim, Jinho Choo, Byoungjip Kim, Jin-Yeop Chang, Youngjune Gwon, Hyung Jin Chang

    Abstract: Active Learning for discriminative models has largely been studied with the focus on individual samples, with less emphasis on how classes are distributed or which classes are hard to deal with. In this work, we show that this is harmful. We propose a method based on the Bayes' rule, that can naturally incorporate class imbalance into the Active Learning framework. We derive that three terms shoul…

    Submitted 3 December, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

  43. Image Matching across Wide Baselines: From Paper to Practice

    Authors: Yuhe Jin, Dmytro Mishkin, Anastasiia Mishchuk, Jiri Matas, Pascal Fua, Kwang Moo Yi, Eduard Trulls

    Abstract: We introduce a comprehensive benchmark for local features and robust estimation algorithms, focusing on the downstream task -- the accuracy of the reconstructed camera pose -- as our primary metric. Our pipeline's modular structure allows easy integration, configuration, and combination of different methods and heuristics. This is demonstrated by embedding dozens of popular algorithms and evaluati…

    Submitted 11 February, 2021; v1 submitted 3 March, 2020; originally announced March 2020.

    Comments: Added: KeyNet-SOSNet, AffNet-HardNet, TFeat, MKD from kornia

  44. arXiv:1912.03629  [pdf, other

    cs.CV cs.GR cs.LG

    VoronoiNet: General Functional Approximators with Local Support

    Authors: Francis Williams, Daniele Panozzo, Kwang Moo Yi, Andrea Tagliasacchi

    Abstract: Voronoi diagrams are highly compact representations that are used in various Graphics applications. In this work, we show how to embed a differentiable version of it -- via a novel deep architecture -- into a generative deep network. By doing so, we achieve a highly compact latent embedding that is able to provide much more detailed reconstructions, both in 2D and 3D, for various shapes. In this t…

    Submitted 8 December, 2019; originally announced December 2019.

  45. arXiv:1911.10657  [pdf, other

    cs.CV

    Reducing the Human Effort in Developing PET-CT Registration

    Authors: Teaghan O'Briain, Kyong Hwan Jin, Hongyoon Choi, Erika Chin, Magdalena Bazalova-Carter, Kwang Moo Yi

    Abstract: We aim to reduce the tedious nature of developing and evaluating methods for aligning PET-CT scans from multiple patient visits. Current methods for registration rely on correspondences that are created manually by medical experts with 3D manipulation, or assisted alignments done by utilizing mutual information across CT scans that may not be consistent when transferred to the PET images. Instead,…

    Submitted 24 November, 2019; originally announced November 2019.

  46. arXiv:1911.02602  [pdf, other

    astro-ph.IM astro-ph.SR

    Assessing the performance of LTE and NLTE synthetic stellar spectra in a machine learning framework

    Authors: Spencer Bialek, Sébastien Fabbro, Kim A. Venn, Nripesh Kumar, Teaghan O'Briain, Kwang Moo Yi

    Abstract: In the current era of stellar spectroscopic surveys, synthetic spectral libraries are the basis for the derivation of stellar parameters and chemical abundances. In this paper, we compare the stellar parameters determined using five popular synthetic spectral grids (INTRIGOSS, FERRE, AMBRE, PHOENIX, and MPIA/1DNLTE) with our convolutional neural network (CNN, $\texttt{StarNet}$). The stellar param…

    Submitted 1 September, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: 20 pages, 17 figures, published by MNRAS. Code for StarNet available at https://github.com/Spiffical/StarNet

  47. LRP2020: Machine Learning Advantages in Canadian Astrophysics

    Authors: K. A. Venn, S. Fabbro, A Liu, Y. Hezaveh, L. Perreault-Levasseur, G. Eadie, S. Ellison, J. Woo, JJ. Kavelaars, K. M. Yi, R. Hlozek, J. Bovy, H. Teimoorinia, S. Ravanbakhsh, L. Spencer

    Abstract: The application of machine learning (ML) methods to the analysis of astrophysical datasets is on the rise, particularly as the computing power and complex algorithms become more powerful and accessible. As the field of ML enjoys a continuous stream of breakthroughs, its applications demonstrate the great potential of ML, ranging from achieving tens of millions of times increase in analysis speed (…

    Submitted 15 October, 2019; v1 submitted 2 October, 2019; originally announced October 2019.

    Comments: White paper E015 submitted to the Canadian Long Range Plan LRP2020

  48. arXiv:1909.08034  [pdf, other

    cs.CV

    Optimizing Through Learned Errors for Accurate Sports Field Registration

    Authors: Wei Jiang, Juan Camilo Gamboa Higuera, Baptiste Angles, Weiwei Sun, Mehrsan Javan, Kwang Moo Yi

    Abstract: We propose an optimization-based framework to register sports field templates onto broadcast videos. For accurate registration we go beyond the prevalent feed-forward paradigm. Instead, we propose to train a deep network that regresses the registration error, and then register images by finding the registration parameters that minimize the regressed error. We demonstrate the effectiveness of our m…

    Submitted 28 May, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

  49. arXiv:1908.05547  [pdf, other

    cs.CV

    Beyond Cartesian Representations for Local Descriptors

    Authors: Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, Eduard Trulls

    Abstract: The dominant approach for learning local patch descriptors relies on small image regions whose scale must be properly estimated a priori by a keypoint detector. In other words, if two patches are not in correspondence, their descriptors will not match. A strategy often used to alleviate this problem is to "pool" the pixel-wise features over log-polar regions, rather than regularly spaced ones. By…

    Submitted 15 August, 2019; originally announced August 2019.

  50. arXiv:1907.02545  [pdf, other

    cs.CV

    ACNe: Attentive Context Normalization for Robust Permutation-Equivariant Learning

    Authors: Weiwei Sun, Wei Jiang, Eduard Trulls, Andrea Tagliasacchi, Kwang Moo Yi

    Abstract: Many problems in computer vision require dealing with sparse, unordered data in the form of point clouds. Permutation-equivariant networks have become a popular solution -- they operate on individual data points with simple perceptrons and extract contextual information with global pooling. This can be achieved with a simple normalization of the feature maps, a global operation that is unaffected by…

    Submitted 31 January, 2021; v1 submitted 4 July, 2019; originally announced July 2019.

    Comments: CVPR 2020