Showing 1–5 of 5 results for author: Hewavitharana, S

  1. arXiv:2311.02084

    cs.CV cs.CL cs.IR

    ITEm: Unsupervised Image-Text Embedding Learning for eCommerce

    Authors: Baohao Liao, Michael Kozielski, Sanjika Hewavitharana, Jiangbo Yuan, Shahram Khadivi, Tomer Lancewicki

    Abstract: Product embedding serves as a cornerstone for a wide range of applications in eCommerce. The product embedding learned from multiple modalities shows significant improvement over that from a single modality, since different modalities provide complementary information. However, some modalities are more informatively dominant than others. How to teach a model to learn embedding from different modal…

    Submitted 26 February, 2024; v1 submitted 22 October, 2023; originally announced November 2023.

  2. arXiv:2211.04898

    cs.CL cs.AI

    Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token

    Authors: Baohao Liao, David Thulke, Sanjika Hewavitharana, Hermann Ney, Christof Monz

    Abstract: The pre-training of masked language models (MLMs) consumes massive computation to achieve good results on downstream NLP tasks, resulting in a large carbon footprint. In the vanilla MLM, the virtual tokens, [MASK]s, act as placeholders and gather the contextualized information from unmasked tokens to restore the corrupted information. It raises the question of whether we can append [MASK]s at a la…

    Submitted 15 November, 2022; v1 submitted 9 November, 2022; originally announced November 2022.

    Comments: Code available at: https://github.com/BaohaoLiao/3ml

  3. arXiv:2109.08712

    cs.CL cs.AI cs.LG

    Back-translation for Large-Scale Multilingual Machine Translation

    Authors: Baohao Liao, Shahram Khadivi, Sanjika Hewavitharana

    Abstract: This paper illustrates our approach to the shared task on large-scale multilingual machine translation in the sixth conference on machine translation (WMT-21). This work aims to build a single multilingual translation system with a hypothesis that a universal cross-language representation leads to better multilingual translation performance. We extend the exploration of different back-translation…

    Submitted 17 September, 2021; originally announced September 2021.

  4. arXiv:1906.03129

    cs.CL cs.AI

    Word-based Domain Adaptation for Neural Machine Translation

    Authors: Shen Yan, Leonard Dahlmann, Pavel Petrushkov, Sanjika Hewavitharana, Shahram Khadivi

    Abstract: In this paper, we empirically investigate applying word-level weights to adapt neural machine translation to e-commerce domains, where small e-commerce datasets and large out-of-domain datasets are available. In order to mine in-domain like words in the out-of-domain datasets, we compute word weights by using a domain-specific and a non-domain-specific language model followed by smoothing and bina…

    Submitted 7 June, 2019; originally announced June 2019.

    Comments: Published in the proceedings of the International Workshop on Spoken Language Translation (IWSLT), 2018

    Journal ref: Proceedings of the 15th International Workshop on Spoken Language Translation, Bruges, Belgium, October 29-30, 2018

  5. arXiv:1707.07835

    cs.IR cs.CL

    Towards Semantic Query Segmentation

    Authors: Ajinkya Kale, Thrivikrama Taula, Sanjika Hewavitharana, Amit Srivastava

    Abstract: Query Segmentation is one of the critical components for understanding users' search intent in Information Retrieval tasks. It involves grouping tokens in the search query into meaningful phrases which help downstream tasks like search relevance and query understanding. In this paper, we propose a novel approach to segment user queries using distributed query embeddings. Our key contribution is a…

    Submitted 25 July, 2017; originally announced July 2017.