Showing 1–3 of 3 results for author: Yanush, V

Searching in archive stat.
  1. arXiv:2006.06880  [pdf, other]

    stat.ML cs.CV cs.LG cs.NE

    Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks

    Authors: Alexander Shekhovtsov, Viktor Yanush

    Abstract: Training neural networks with binary weights and activations is a challenging problem due to the lack of gradients and difficulty of optimization over discrete weights. Many successful experimental results have been achieved with empirical straight-through (ST) approaches, proposing a variety of ad-hoc rules for propagating gradients through non-differentiable activations and updating discrete wei…

    Submitted 19 October, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: 33 pages, DAGM 2021 version (presented, to be published)

  2. arXiv:2006.03143  [pdf, other]

    stat.ML cs.LG

    Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks

    Authors: Alexander Shekhovtsov, Viktor Yanush, Boris Flach

    Abstract: In neural networks with binary activations and/or binary weights, training by gradient descent is complicated because the model has a piecewise constant response. We consider stochastic binary networks, obtained by adding noise in front of activations. The expected model response becomes a smooth function of the parameters; its gradient is well defined, but it is challenging to estimate accurately. We…

    Submitted 4 November, 2020; v1 submitted 4 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020

  3. arXiv:1901.08045  [pdf, other]

    stat.ML cs.LG

    Hamiltonian Monte-Carlo for Orthogonal Matrices

    Authors: Viktor Yanush, Dmitry Kropotov

    Abstract: We consider the problem of sampling from posterior distributions for Bayesian models where some parameters are restricted to be orthogonal matrices. Such matrices are sometimes used in neural network models for regularization and stabilization of training procedures, and can also parameterize matrices of bounded rank, positive-definite matrices, and others. In Byrne and Girolami (2013)…

    Submitted 23 January, 2019; originally announced January 2019.