Showing 1–2 of 2 results for author: Religa, T L

  1. arXiv:2305.13865  [pdf, other]

    cs.LG cs.CR

    Selective Pre-training for Private Fine-tuning

    Authors: Da Yu, Sivakanth Gopi, Janardhan Kulkarni, Zinan Lin, Saurabh Naik, Tomasz Lukasz Religa, Jian Yin, Huishuai Zhang

    Abstract: Text prediction models, when used in applications like email clients or word processors, must protect user data privacy and adhere to model size constraints. These constraints are crucial to meet memory and inference time requirements, as well as to reduce inference costs. Building small, fast, and private domain-specific language models is a thriving area of research. In this work, we show that a…

    Submitted 2 July, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Transactions on Machine Learning Research. Code available at https://github.com/dayu11/selective_pretraining_for_private_finetuning

  2. arXiv:2203.02094  [pdf, other]

    cs.LG cs.CL

    LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models

    Authors: Mojan Javaheripi, Gustavo H. de Rosa, Subhabrata Mukherjee, Shital Shah, Tomasz L. Religa, Caio C. T. Mendes, Sebastien Bubeck, Farinaz Koushanfar, Debadeepta Dey

    Abstract: The Transformer architecture is ubiquitously used as the building block of large-scale autoregressive language models. However, finding architectures with the optimal trade-off between task performance (perplexity) and hardware constraints like peak memory utilization and latency is non-trivial. This is exacerbated by the proliferation of various hardware. We leverage the somewhat surprising empir…

    Submitted 17 October, 2022; v1 submitted 3 March, 2022; originally announced March 2022.