Showing 1–6 of 6 results for author: Dolgov, S

Searching in archive cs.
  1. arXiv:2401.16367  [pdf, other]

    cs.LG cs.AI cs.CL

    TQCompressor: improving tensor decomposition methods in neural networks via permutations

    Authors: V. Abronin, A. Naumov, D. Mazur, D. Bystrov, K. Tsarova, Ar. Melnikov, I. Oseledets, S. Dolgov, R. Brasher, M. Perelshtein

    Abstract: We introduce TQCompressor, a novel method for neural network model compression with improved tensor decompositions. We explore the challenges posed by the computational and storage demands of pre-trained language models in NLP tasks and propose a permutation-based enhancement to Kronecker decomposition. This enhancement makes it possible to reduce loss in model expressivity which is usually associ…

    Submitted 29 January, 2024; originally announced January 2024.

  2. arXiv:2302.08134  [pdf, ps, other]

    stat.ML cs.LG

    A weighted subspace exponential kernel for support tensor machines

    Authors: Kirandeep Kour, Sergey Dolgov, Peter Benner, Martin Stoll, Max Pfeffer

    Abstract: High-dimensional data in the form of tensors are challenging for kernel classification methods. To both reduce the computational complexity and extract informative features, kernels based on low-rank tensor decompositions have been proposed. However, what decisive features of the tensors are exploited by these kernels is often unclear. In this paper we propose a novel kernel that is based on the T…

    Submitted 16 February, 2023; originally announced February 2023.

  3. arXiv:2209.01941  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Deep importance sampling using tensor trains with application to a priori and a posteriori rare event estimation

    Authors: Tiangang Cui, Sergey Dolgov, Robert Scheichl

    Abstract: We propose a deep importance sampling method that is suitable for estimating rare event probabilities in high-dimensional problems. We approximate the optimal importance distribution in a general importance sampling problem as the pushforward of a reference distribution under a composition of order-preserving transformations, in which each transformation is formed by a squared tensor-train decompo…

    Submitted 24 May, 2023; v1 submitted 5 September, 2022; originally announced September 2022.

  4. arXiv:2106.04170  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Conditional Deep Inverse Rosenblatt Transports

    Authors: Tiangang Cui, Sergey Dolgov, Olivier Zahm

    Abstract: We present a novel offline-online method to mitigate the computational burden of Bayesian inference, particularly in the regime where the posterior densities are computationally demanding to evaluate while real-time inference results are needed. In the offline phase, the proposed method learns the joint law of the parameter random variables and the observable random variables in the tensor-train (…

    Submitted 28 January, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: 41 pages

  5. arXiv:2007.06968  [pdf, other]

    stat.ML cs.LG math.NA stat.CO

    Deep composition of tensor-trains using squared inverse Rosenblatt transports

    Authors: Tiangang Cui, Sergey Dolgov

    Abstract: Characterising intractable high-dimensional random variables is one of the fundamental challenges in stochastic computation. The recent surge of transport maps offers a mathematical foundation and new insights for tackling this challenge by coupling intractable random variables with tractable reference random variables. This paper generalises the functional tensor-train approximation of the invers…

    Submitted 18 October, 2021; v1 submitted 14 July, 2020; originally announced July 2020.

    Comments: Found Comput Math (2021)

  6. arXiv:2002.05079  [pdf, other]

    cs.LG math.NA stat.ML

    Efficient Structure-preserving Support Tensor Train Machine

    Authors: Kirandeep Kour, Sergey Dolgov, Martin Stoll, Peter Benner

    Abstract: An increasing amount of collected data are high-dimensional multi-way arrays (tensors), and it is crucial for efficient learning algorithms to exploit this tensorial structure as much as possible. The ever-present curse of dimensionality for high dimensional data and the loss of structure when vectorizing the data motivates the use of tailored low-rank tensor classification methods. In the presenc…

    Submitted 3 August, 2021; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: 20 pages, 5 figures, 2 tables, 2 algorithms