Showing 1–3 of 3 results for author: Kostenok, E

  1. arXiv:2405.20935  [pdf, other]

    cs.LG cs.AI

    Effective Interplay between Sparsity and Quantization: From Theory to Practice

    Authors: Simla Burcu Harma, Ayan Chakraborty, Elizaveta Kostenok, Danila Mishin, Dongho Ha, Babak Falsafi, Martin Jaggi, Ming Liu, Yunho Oh, Suvinay Subramanian, Amir Yazdanbakhsh

    Abstract: The increasing size of deep neural networks necessitates effective model compression to improve computational efficiency and reduce their memory footprint. Sparsity and quantization are two prominent compression methods that have individually demonstrated significant reduction in computational and memory footprints while preserving model accuracy. While effective, the interplay between these two m…

    Submitted 31 May, 2024; originally announced May 2024.

  2. arXiv:2309.06527  [pdf, other]

    cs.CL cs.CR cs.LG

    Machine Translation Models Stand Strong in the Face of Adversarial Attacks

    Authors: Pavel Burnyshev, Elizaveta Kostenok, Alexey Zaytsev

    Abstract: Adversarial attacks expose vulnerabilities of deep learning models by introducing minor perturbations to the input, which lead to substantial alterations in the output. Our research focuses on the impact of such adversarial attacks on sequence-to-sequence (seq2seq) models, specifically machine translation models. We introduce algorithms that incorporate basic text perturbation heuristics and more…

    Submitted 10 September, 2023; originally announced September 2023.

    Journal ref: AIST-2023

  3. arXiv:2308.11295  [pdf, other]

    cs.LG

    Uncertainty Estimation of Transformers' Predictions via Topological Analysis of the Attention Matrices

    Authors: Elizaveta Kostenok, Daniil Cherniavskii, Alexey Zaytsev

    Abstract: Determining the degree of confidence of a deep learning model in its prediction is an open problem in the field of natural language processing. Most of the classical methods for uncertainty estimation are quite weak for text classification models. We set the task of obtaining an uncertainty estimate for neural networks based on the Transformer architecture. A key feature of such models is the atten…

    Submitted 22 August, 2023; originally announced August 2023.