
Revealing the Utilized Rank of Subspaces of Learning in Neural Networks

Authors: Isha Garg, Christian Koguchi, Eshan Verma, Daniel Ulbricht

Abstract: In this work, we study how well the learned weights of a neural network utilize the space available to them. This notion is related to capacity, but additionally incorporates the interaction of the network architecture with the dataset. Most learned weights appear to be full rank, and are therefore not amenable to low rank decomposition. This deceptively implies that the weights are utilizing the entire space available to them. We propose a simple data-driven transformation that projects the weights onto the subspace where the data and the weight interact. This preserves the functional mapping of the layer and reveals its low rank structure. In our findings, we conclude that most models utilize a fraction of the available space. For instance, for ViTB-16 and ViTL-16 trained on ImageNet, the mean layer utilization is 35% and 20% respectively. Our transformation results in reducing the parameters to 50% and 25% respectively, while resulting in less than 0.2% accuracy drop after fine-tuning. We also show that self-supervised pre-training drives this utilization up to 70%, justifying its suitability for downstream tasks.

Submitted 5 July, 2024; originally announced July 2024.

Comments: Presented at Efficient Systems for Foundation Models Workshop at the International Conference on Machine Learning (ICML) 2024

arXiv:2207.13751 [pdf, other]

GAUDI: A Neural Architect for Immersive 3D Scene Generation

Authors: Miguel Angel Bautista, Pengsheng Guo, Samira Abnar, Walter Talbott, Alexander Toshev, Zhuoyuan Chen, Laurent Dinh, Shuangfei Zhai, Hanlin Goh, Daniel Ulbricht, Afshin Dehghan, Josh Susskind

Abstract: We introduce GAUDI, a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera. We tackle this challenging problem with a scalable yet powerful approach, where we first optimize a latent representation that disentangles radiance fields and camera poses. This latent representation is then used to learn a generative model that enables both unconditional and conditional generation of 3D scenes. Our model generalizes previous works that focus on single objects by removing the assumption that the camera pose distribution can be shared across samples. We show that GAUDI obtains state-of-the-art performance in the unconditional generative setting across multiple datasets and allows for conditional generation of 3D scenes given conditioning variables like sparse image observations or text that describes the scene.

Submitted 27 July, 2022; originally announced July 2022.

Comments: Project webpage: https://github.com/apple/ml-gaudi

arXiv:2107.05775 [pdf, other]

Fast and Explicit Neural View Synthesis

Authors: Pengsheng Guo, Miguel Angel Bautista, Alex Colburn, Liang Yang, Daniel Ulbricht, Joshua M. Susskind, Qi Shan

Abstract: We study the problem of novel view synthesis from sparse source observations of a scene comprised of 3D objects. We propose a simple yet effective approach that is neither continuous nor implicit, challenging recent trends on view synthesis. Our approach explicitly encodes observations into a volumetric representation that enables amortized rendering. We demonstrate that although continuous radiance field representations have gained a lot of attention due to their expressive power, our simple approach obtains comparable or even better novel view reconstruction quality comparing with state-of-the-art baselines while increasing rendering speed by over 400x. Our model is trained in a category-agnostic manner and does not require scene-specific optimization. Therefore, it is able to generalize novel view synthesis to object categories not seen during training. In addition, we show that with our simple formulation, we can use view synthesis as a self-supervision signal for efficient learning of 3D geometry without explicit 3D supervision.

Submitted 8 December, 2021; v1 submitted 12 July, 2021; originally announced July 2021.

arXiv:2006.01895 [pdf, other]

Learning to Branch for Multi-Task Learning

Authors: Pengsheng Guo, Chen-Yu Lee, Daniel Ulbricht

Abstract: Training multiple tasks jointly in one deep network yields reduced latency during inference and better performance over the single-task counterpart by sharing certain layers of a network. However, over-sharing a network could erroneously enforce over-generalization, causing negative knowledge transfer across tasks. Prior works rely on human intuition or pre-computed task relatedness scores for ad hoc branching structures. They provide sub-optimal end results and often require huge efforts for the trial-and-error process. In this work, we present an automated multi-task learning algorithm that learns where to share or branch within a network, designing an effective network topology that is directly optimized for multiple objectives across tasks. Specifically, we propose a novel tree-structured design space that casts a tree branching operation as a gumbel-softmax sampling procedure. This enables differentiable network splitting that is end-to-end trainable. We validate the proposed method on controlled synthetic data, CelebA, and Taskonomy.

Submitted 9 June, 2020; v1 submitted 2 June, 2020; originally announced June 2020.

Comments: Accepted at ICML 2020

arXiv:1903.04064 [pdf, other]

Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation

Authors: Chen-Yu Lee, Tanmay Batra, Mohammad Haris Baig, Daniel Ulbricht

Abstract: In this work, we connect two distinct concepts for unsupervised domain adaptation: feature distribution alignment between domains by utilizing the task-specific decision boundary and the Wasserstein metric. Our proposed sliced Wasserstein discrepancy (SWD) is designed to capture the natural notion of dissimilarity between the outputs of task-specific classifiers. It provides a geometrically meaningful guidance to detect target samples that are far from the support of the source and enables efficient distribution alignment in an end-to-end trainable fashion. In the experiments, we validate the effectiveness and genericness of our method on digit and sign recognition, image classification, semantic segmentation, and object detection.

Submitted 10 March, 2019; originally announced March 2019.

Comments: Accepted at CVPR 2019

arXiv:1403.6899 [pdf]

doi 10.1364/OE.22.012316

Three-Dimensional Organic Microlasers with Low Lasing Thresholds Fabricated by Multiphoton Lithography

Authors: Vincent W. Chen, Nina Sobeshchuk, Clement Lafargue, Eric S. Mansfield, Jeannie Yom, Luke Johnstone, Joel M. Hales, Stefan Bittner, Severin Charpignon, David Ulbricht, Joseph Lautru, Igor Denisyuk, Joseph Zyss, Joseph W. Perry, Melanie Lebental

Abstract: Cuboid-shaped organic microcavities containing a pyrromethene laser dye and supported upon a photonic crystal have been investigated as an approach to reducing the lasing threshold of the cavities. Multiphoton lithography facilitated fabrication of the cuboid cavities directly on the substrate or on the decoupling structure, while similar structures were fabricated on the substrate by UV lithography for comparison. Significant reduction of the lasing threshold by a factor of ~30 has been observed for cavities supported by the photonic crystal relative to those fabricated on the substrate. The lasing mode spectra of the cuboid microresonators provide strong evidence showing that the lasing modes are localized in the horizontal plane, with the shape of an inscribed diamond.

Submitted 2 April, 2014; v1 submitted 26 March, 2014; originally announced March 2014.

Comments: 11 pages, 11 figures, 25 references

Journal ref: Opt. Express 22, 12316 (2014)

Showing 1–6 of 6 results for author: Ulbricht, D