Search | arXiv e-print repository

Coarse-To-Fine Tensor Trains for Compact Visual Representations

Authors: Sebastian Loeschcke, Dan Wang, Christian Leth-Espensen, Serge Belongie, Michael J. Kastoryano, Sagie Benaim

Abstract: The ability to learn compact, high-quality, and easy-to-optimize representations for visual data is paramount to many applications such as novel view synthesis and 3D reconstruction. Recent work has shown substantial success in using tensor networks to design such compact and high-quality representations. However, the ability to optimize tensor-based representations, and in particular, the highly… ▽ More The ability to learn compact, high-quality, and easy-to-optimize representations for visual data is paramount to many applications such as novel view synthesis and 3D reconstruction. Recent work has shown substantial success in using tensor networks to design such compact and high-quality representations. However, the ability to optimize tensor-based representations, and in particular, the highly compact tensor train representation, is still lacking. This has prevented practitioners from deploying the full potential of tensor networks for visual data. To this end, we propose 'Prolongation Upsampling Tensor Train (PuTT)', a novel method for learning tensor train representations in a coarse-to-fine manner. Our method involves the prolonging or `upsampling' of a learned tensor train representation, creating a sequence of 'coarse-to-fine' tensor trains that are incrementally refined. We evaluate our representation along three axes: (1). compression, (2). denoising capability, and (3). image completion capability. To assess these axes, we consider the tasks of image fitting, 3D fitting, and novel view synthesis, where our method shows an improved performance compared to state-of-the-art tensor-based methods. For full results see our project webpage: https://sebulo.github.io/PuTT_website/ △ Less

Submitted 6 June, 2024; originally announced June 2024.

Comments: Project webpage: https://sebulo.github.io/PuTT_website/

arXiv:2405.20322 [pdf, other]

Quantum generalizations of Glauber and Metropolis dynamics

Authors: András Gilyén, Chi-Fang Chen, Joao F. Doriguello, Michael J. Kastoryano

Abstract: Classical Markov Chain Monte Carlo methods have been essential for simulating statistical physical systems and have proven well applicable to other systems with complex degrees of freedom. Motivated by the statistical physics origins, Chen, Kastoryano, and Gilyén [CKG23] proposed a continuous-time quantum thermodynamic analog to Glauber dynamic that is (i) exactly detailed balanced, (ii) efficient… ▽ More Classical Markov Chain Monte Carlo methods have been essential for simulating statistical physical systems and have proven well applicable to other systems with complex degrees of freedom. Motivated by the statistical physics origins, Chen, Kastoryano, and Gilyén [CKG23] proposed a continuous-time quantum thermodynamic analog to Glauber dynamic that is (i) exactly detailed balanced, (ii) efficiently implementable, and (iii) quasi-local for geometrically local systems. Physically, their construction gives a smooth variant of the Davies' generator derived from weak system-bath interaction. In this work, we give an efficiently implementable discrete-time quantum counterpart to Metropolis sampling that also enjoys the desirable features (i)-(iii). Also, we give an alternative highly coherent quantum generalization of detailed balanced dynamics that resembles another physically derived master equation, and propose a smooth interpolation between this and earlier constructions. We study generic properties of all constructions, including the uniqueness of the fixed-point and the locality of the resulting operators. We hope our results provide a systematic approach to the possible quantum generalizations of classical Glauber and Metropolis dynamics. △ Less

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.16528 [pdf, other]

LoQT: Low Rank Adapters for Quantized Training

Authors: Sebastian Loeschcke, Mads Toftrup, Michael J. Kastoryano, Serge Belongie, Vésteinn Snæbjarnarson

Abstract: Training of large neural networks requires significant computational resources. Despite advances using low-rank adapters and quantization, pretraining of models such as LLMs on consumer hardware has not been possible without model sharding, offloading during training, or per-layer gradient updates. To address these limitations, we propose LoQT, a method for efficiently training quantized models. L… ▽ More Training of large neural networks requires significant computational resources. Despite advances using low-rank adapters and quantization, pretraining of models such as LLMs on consumer hardware has not been possible without model sharding, offloading during training, or per-layer gradient updates. To address these limitations, we propose LoQT, a method for efficiently training quantized models. LoQT uses gradient-based tensor factorization to initialize low-rank trainable weight matrices that are periodically merged into quantized full-rank weight matrices. Our approach is suitable for both pretraining and fine-tuning of models, which we demonstrate experimentally for language modeling and downstream task adaptation. We find that LoQT enables efficient training of models up to 7B parameters on a consumer-grade 24GB GPU. We also demonstrate the feasibility of training a 13B parameter model using per-layer gradient updates on the same hardware. △ Less

Submitted 26 May, 2024; originally announced May 2024.

Showing 1–3 of 3 results for author: Kastoryano, M J