-
Revealing the Utilized Rank of Subspaces of Learning in Neural Networks
Authors:
Isha Garg,
Christian Koguchi,
Eshan Verma,
Daniel Ulbricht
Abstract:
In this work, we study how well the learned weights of a neural network utilize the space available to them. This notion is related to capacity, but additionally incorporates the interaction of the network architecture with the dataset. Most learned weights appear to be full rank, and are therefore not amenable to low rank decomposition. This deceptively implies that the weights are utilizing the entire space available to them. We propose a simple data-driven transformation that projects the weights onto the subspace where the data and the weight interact. This preserves the functional mapping of the layer and reveals its low rank structure. In our findings, we conclude that most models utilize a fraction of the available space. For instance, for ViTB-16 and ViTL-16 trained on ImageNet, the mean layer utilization is 35% and 20% respectively. Our transformation results in reducing the parameters to 50% and 25% respectively, while resulting in less than 0.2% accuracy drop after fine-tuning. We also show that self-supervised pre-training drives this utilization up to 70%, justifying its suitability for downstream tasks.
Submitted 5 July, 2024;
originally announced July 2024.
-
GAUDI: A Neural Architect for Immersive 3D Scene Generation
Authors:
Miguel Angel Bautista,
Pengsheng Guo,
Samira Abnar,
Walter Talbott,
Alexander Toshev,
Zhuoyuan Chen,
Laurent Dinh,
Shuangfei Zhai,
Hanlin Goh,
Daniel Ulbricht,
Afshin Dehghan,
Josh Susskind
Abstract:
We introduce GAUDI, a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera. We tackle this challenging problem with a scalable yet powerful approach, where we first optimize a latent representation that disentangles radiance fields and camera poses. This latent representation is then used to learn a generative model that enables both unconditional and conditional generation of 3D scenes. Our model generalizes previous works that focus on single objects by removing the assumption that the camera pose distribution can be shared across samples. We show that GAUDI obtains state-of-the-art performance in the unconditional generative setting across multiple datasets and allows for conditional generation of 3D scenes given conditioning variables like sparse image observations or text that describes the scene.
Submitted 27 July, 2022;
originally announced July 2022.
-
Fast and Explicit Neural View Synthesis
Authors:
Pengsheng Guo,
Miguel Angel Bautista,
Alex Colburn,
Liang Yang,
Daniel Ulbricht,
Joshua M. Susskind,
Qi Shan
Abstract:
We study the problem of novel view synthesis from sparse source observations of a scene comprised of 3D objects. We propose a simple yet effective approach that is neither continuous nor implicit, challenging recent trends on view synthesis. Our approach explicitly encodes observations into a volumetric representation that enables amortized rendering. We demonstrate that although continuous radiance field representations have gained a lot of attention due to their expressive power, our simple approach obtains comparable or even better novel view reconstruction quality compared with state-of-the-art baselines while increasing rendering speed by over 400x. Our model is trained in a category-agnostic manner and does not require scene-specific optimization. Therefore, it is able to generalize novel view synthesis to object categories not seen during training. In addition, we show that with our simple formulation, we can use view synthesis as a self-supervision signal for efficient learning of 3D geometry without explicit 3D supervision.
Submitted 8 December, 2021; v1 submitted 12 July, 2021;
originally announced July 2021.
-
Learning to Branch for Multi-Task Learning
Authors:
Pengsheng Guo,
Chen-Yu Lee,
Daniel Ulbricht
Abstract:
Training multiple tasks jointly in one deep network yields reduced latency during inference and better performance over the single-task counterpart by sharing certain layers of a network. However, over-sharing a network could erroneously enforce over-generalization, causing negative knowledge transfer across tasks. Prior works rely on human intuition or pre-computed task relatedness scores for ad hoc branching structures. They provide sub-optimal end results and often require huge efforts for the trial-and-error process. In this work, we present an automated multi-task learning algorithm that learns where to share or branch within a network, designing an effective network topology that is directly optimized for multiple objectives across tasks. Specifically, we propose a novel tree-structured design space that casts a tree branching operation as a gumbel-softmax sampling procedure. This enables differentiable network splitting that is end-to-end trainable. We validate the proposed method on controlled synthetic data, CelebA, and Taskonomy.
Submitted 9 June, 2020; v1 submitted 2 June, 2020;
originally announced June 2020.
-
Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation
Authors:
Chen-Yu Lee,
Tanmay Batra,
Mohammad Haris Baig,
Daniel Ulbricht
Abstract:
In this work, we connect two distinct concepts for unsupervised domain adaptation: feature distribution alignment between domains by utilizing the task-specific decision boundary and the Wasserstein metric. Our proposed sliced Wasserstein discrepancy (SWD) is designed to capture the natural notion of dissimilarity between the outputs of task-specific classifiers. It provides a geometrically meaningful guidance to detect target samples that are far from the support of the source and enables efficient distribution alignment in an end-to-end trainable fashion. In the experiments, we validate the effectiveness and genericness of our method on digit and sign recognition, image classification, semantic segmentation, and object detection.
Submitted 10 March, 2019;
originally announced March 2019.
-
Three-Dimensional Organic Microlasers with Low Lasing Thresholds Fabricated by Multiphoton Lithography
Authors:
Vincent W. Chen,
Nina Sobeshchuk,
Clement Lafargue,
Eric S. Mansfield,
Jeannie Yom,
Luke Johnstone,
Joel M. Hales,
Stefan Bittner,
Severin Charpignon,
David Ulbricht,
Joseph Lautru,
Igor Denisyuk,
Joseph Zyss,
Joseph W. Perry,
Melanie Lebental
Abstract:
Cuboid-shaped organic microcavities containing a pyrromethene laser dye and supported upon a photonic crystal have been investigated as an approach to reducing the lasing threshold of the cavities. Multiphoton lithography facilitated fabrication of the cuboid cavities directly on the substrate or on the decoupling structure, while similar structures were fabricated on the substrate by UV lithography for comparison. Significant reduction of the lasing threshold by a factor of ~30 has been observed for cavities supported by the photonic crystal relative to those fabricated on the substrate. The lasing mode spectra of the cuboid microresonators provide strong evidence showing that the lasing modes are localized in the horizontal plane, with the shape of an inscribed diamond.
Submitted 2 April, 2014; v1 submitted 26 March, 2014;
originally announced March 2014.