Showing 1–5 of 5 results for author: Cromp, S

Searching in archive cs.
  1. arXiv:2406.00894  [pdf, other]

    cs.LG cs.AI cs.CL

    Pretrained Hybrids with MAD Skills

    Authors: Nicholas Roberts, Samuel Guo, Zhiqi Gao, Satya Sai Srinath Namburi GNVV, Sonia Cromp, Chengjun Wu, Chengyu Duan, Frederic Sala

    Abstract: While Transformers underpin modern large language models (LMs), there is a growing list of alternative architectures with new capabilities, promises, and tradeoffs. This makes choosing the right LM architecture challenging. Recently-proposed $\textit{hybrid architectures}$ seek a best-of-all-worlds approach that reaps the benefits of all architectures. Hybrid design is difficult for two reasons: i…

    Submitted 2 June, 2024; originally announced June 2024.

  2. arXiv:2404.08461  [pdf, other]

    cs.LG cs.AI

    OTTER: Improving Zero-Shot Classification via Optimal Transport

    Authors: Changho Shin, Jitian Zhao, Sonia Cromp, Harit Vishwakarma, Frederic Sala

    Abstract: Popular zero-shot models suffer due to artifacts inherited from pretraining. A particularly detrimental artifact, caused by unbalanced web-scale pretraining data, is mismatched label distribution. Existing approaches that seek to repair the label distribution are not suitable in zero-shot settings, as they have incompatible requirements such as access to labeled downstream task data or knowledge o…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 29 pages

  3. arXiv:2307.12226  [pdf, other]

    cs.LG cs.AI stat.ML

    Geometry-Aware Adaptation for Pretrained Models

    Authors: Nicholas Roberts, Xintong Li, Dyah Adila, Sonia Cromp, Tzu-Heng Huang, Jitian Zhao, Frederic Sala

    Abstract: Machine learning models -- including prominent zero-shot models -- are often trained on datasets whose labels are only a small proportion of a larger label space. Such spaces are commonly equipped with a metric that relates the labels via distances between them. We propose a simple approach to exploit this information to adapt the trained model to reliably predict new classes -- or, in the case of…

    Submitted 27 November, 2023; v1 submitted 23 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023

  4. arXiv:2303.17713  [pdf, other]

    cs.LG cs.CY stat.ML

    Mitigating Source Bias for Fairer Weak Supervision

    Authors: Changho Shin, Sonia Cromp, Dyah Adila, Frederic Sala

    Abstract: Weak supervision enables efficient development of training sets by reducing the need for ground truth labels. However, the techniques that make weak supervision attractive -- such as integrating any source of signal to estimate unknown labels -- also entail the danger that the produced pseudolabels are highly biased. Surprisingly, given everyday use and the potential for increased bias, weak super…

    Submitted 29 November, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: NeurIPS 2023

  5. arXiv:2107.12373  [pdf, ps, other]

    cs.DB cs.DS cs.LG

    Relational Boosted Regression Trees

    Authors: Sonia Cromp, Alireza Samadian, Kirk Pruhs

    Abstract: Many tasks use data housed in relational databases to train boosted regression tree models. In this paper, we give a relational adaptation of the greedy algorithm for training boosted regression trees. For the subproblem of calculating the sum of squared residuals of the dataset, which dominates the runtime of the boosting algorithm, we provide a $(1 + ε)$-approximation using the tensor sketch tec…

    Submitted 25 July, 2021; originally announced July 2021.