Skip to main content

Showing 1–3 of 3 results for author: Salcianu, A

.
  1. arXiv:2203.17189  [pdf, other

    cs.LG cs.CL

    Scaling Up Models and Data with $\texttt{t5x}$ and $\texttt{seqio}$

    Authors: Adam Roberts, Hyung Won Chung, Anselm Levskaya, Gaurav Mishra, James Bradbury, Daniel Andor, Sharan Narang, Brian Lester, Colin Gaffney, Afroz Mohiuddin, Curtis Hawthorne, Aitor Lewkowycz, Alex Salcianu, Marc van Zee, Jacob Austin, Sebastian Goodman, Livio Baldini Soares, Haitang Hu, Sasha Tsvyashchenko, Aakanksha Chowdhery, Jasmijn Bastings, Jannis Bulian, Xavier Garcia, Jianmo Ni, Andrew Chen , et al. (18 additional authors not shown)

    Abstract: Recent neural network-based language models have benefited greatly from scaling up the size of training datasets and the number of parameters in the models themselves. Scaling can be complicated due to various factors including the need to distribute computation on supercomputer clusters (e.g., TPUs), prevent bottlenecks when infeeding data, and ensure reproducible results. In this work, we presen… ▽ More

    Submitted 31 March, 2022; originally announced March 2022.

  2. arXiv:2012.15524  [pdf, other

    cs.CL

    Fast WordPiece Tokenization

    Authors: Xinying Song, Alex Salcianu, Yang Song, Dave Dopson, Denny Zhou

    Abstract: Tokenization is a fundamental preprocessing step for almost all NLP tasks. In this paper, we propose efficient algorithms for the WordPiece tokenization used in BERT, from single-word tokenization to general text (e.g., sentence) tokenization. When tokenizing a single word, WordPiece uses a longest-match-first strategy, known as maximum matching. The best known algorithms so far are O(n^2) (where… ▽ More

    Submitted 5 October, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: Accepted to EMNLP 2021 as an oral paper

  3. arXiv:1708.00214  [pdf, other

    cs.CL cs.NE

    Natural Language Processing with Small Feed-Forward Networks

    Authors: Jan A. Botha, Emily Pitler, Ji Ma, Anton Bakalov, Alex Salcianu, David Weiss, Ryan McDonald, Slav Petrov

    Abstract: We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models. Motivated by resource-constrained environments like mobile phones, we showcase simple techniques for obtaining such small neural… ▽ More

    Submitted 1 August, 2017; originally announced August 2017.

    Comments: EMNLP 2017 short paper

    MSC Class: 68T50 ACM Class: I.2.7