Showing 1–2 of 2 results for author: Rogozhnikov, A

Search v0.5.6 released 2020-02-24

arXiv:2106.03143

cs.LG cs.CL cs.CV

CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings

Authors: Tatiana Likhomanenko, Qiantong Xu, Gabriel Synnaeve, Ronan Collobert, Alex Rogozhnikov

Abstract: Without positional information, attention-based Transformer neural networks are permutation-invariant. Absolute or relative positional embeddings are the most popular ways to feed Transformer models with positional information. Absolute positional embeddings are simple to implement, but suffer from generalization issues when evaluating on sequences longer than seen at training time. Relative positions are more robust to input length change, but are more complex to implement and yield inferior model throughput due to extra computational and memory costs. In this paper, we propose an augmentation-based approach (CAPE) for absolute positional embeddings, which keeps the advantages of both absolute (simplicity and speed) and relative positional embeddings (better generalization). In addition, our empirical evaluation on state-of-the-art models in machine translation, image and speech recognition demonstrates that CAPE leads to better generalization performance as well as increased stability with respect to training hyper-parameters.

Submitted 8 November, 2021; v1 submitted 6 June, 2021; originally announced June 2021.
arXiv:1706.01109

stat.ML cs.LG

InfiniteBoost: building infinite ensembles with gradient descent

Authors: Alex Rogozhnikov, Tatiana Likhomanenko

Abstract: In machine learning ensemble methods have demonstrated high accuracy for the variety of problems in different areas. Two notable ensemble methods widely used in practice are gradient boosting and random forests. In this paper we present InfiniteBoost - a novel algorithm, which combines important properties of these two approaches. The algorithm constructs the ensemble of trees for which two properties hold: trees of the ensemble incorporate the mistakes done by others; at the same time the ensemble could contain the infinite number of trees without the over-fitting effect. The proposed algorithm is evaluated on the regression, classification, and ranking tasks using large scale, publicly available datasets.

Submitted 21 September, 2018; v1 submitted 4 June, 2017; originally announced June 2017.

Comments: 7 pages, 5 figures, 3 tables
