Showing 1–18 of 18 results for author: Ponomareva, N

  1. arXiv:2403.12983  [pdf, other]

    cs.CV cs.LG

    OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization

    Authors: Xiang Meng, Shibal Ibrahim, Kayhan Behdin, Hussein Hazimeh, Natalia Ponomareva, Rahul Mazumder

    Abstract: Structured pruning is a promising approach for reducing the inference costs of large vision and language models. By removing carefully chosen structures, e.g., neurons or attention heads, the improvements from this approach can be realized on standard deep learning hardware. In this work, we focus on structured pruning in the one-shot (post-training) setting, which does not require model retrainin…

    Submitted 2 March, 2024; originally announced March 2024.

  2. arXiv:2402.11120  [pdf, other]

    cs.LG cs.CV stat.ML

    DART: A Principled Approach to Adversarially Robust Unsupervised Domain Adaptation

    Authors: Yunjuan Wang, Hussein Hazimeh, Natalia Ponomareva, Alexey Kurakin, Ibrahim Hammoud, Raman Arora

    Abstract: Distribution shifts and adversarial examples are two major challenges for deploying machine learning models. While these challenges have been studied individually, their combination is an important topic that remains relatively under-explored. In this work, we study the problem of adversarial robustness under a common setting of distribution shift - unsupervised domain adaptation (UDA). Specifical…

    Submitted 16 February, 2024; originally announced February 2024.

  3. JetTrain: IDE-Native Machine Learning Experiments

    Authors: Artem Trofimov, Mikhail Kostyukov, Sergei Ugdyzhekov, Natalia Ponomareva, Igor Naumov, Maksim Melekhovets

    Abstract: Integrated development environments (IDEs) are prevalent code-writing and debugging tools. However, they have yet to be widely adopted for launching machine learning (ML) experiments. This work aims to fill this gap by introducing JetTrain, an IDE-integrated tool that delegates specific tasks from an IDE to remote computational resources. A user can write and debug code locally and then seamlessly…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: IDE workshop @ ICSE 2024

  4. arXiv:2402.04177  [pdf, other]

    cs.CL cs.LG stat.ML

    Scaling Laws for Downstream Task Performance of Large Language Models

    Authors: Berivan Isik, Natalia Ponomareva, Hussein Hazimeh, Dimitris Paparas, Sergei Vassilvitskii, Sanmi Koyejo

    Abstract: Scaling laws provide important insights that can guide the design of large language models (LLMs). Existing work has primarily focused on studying scaling laws for pretraining (upstream) loss. However, in transfer learning settings, in which LLMs are pretrained on an unsupervised dataset and then finetuned on a downstream task, we often also care about the downstream performance. In this work, we…

    Submitted 6 February, 2024; originally announced February 2024.

  5. arXiv:2306.03256  [pdf, other]

    cs.LG stat.ML

    Explaining and Adapting Graph Conditional Shift

    Authors: Qi Zhu, Yizhu Jiao, Natalia Ponomareva, Jiawei Han, Bryan Perozzi

    Abstract: Graph Neural Networks (GNNs) have shown remarkable performance on graph-structured data. However, recent empirical studies suggest that GNNs are very susceptible to distribution shift. There is still significant ambiguity about why graph-based models seem more vulnerable to these shifts. In this work we provide a thorough theoretical analysis on it by quantifying the magnitude of conditional shift…

    Submitted 5 June, 2023; originally announced June 2023.

  6. COMET: Learning Cardinality Constrained Mixture of Experts with Trees and Local Search

    Authors: Shibal Ibrahim, Wenyu Chen, Hussein Hazimeh, Natalia Ponomareva, Zhe Zhao, Rahul Mazumder

    Abstract: The sparse Mixture-of-Experts (Sparse-MoE) framework efficiently scales up model capacity in various domains, such as natural language processing and vision. Sparse-MoEs select a subset of the "experts" (thus, only a portion of the overall network) for each input sample using a sparse, trainable gate. Existing sparse gates are prone to convergence and performance issues when training with first-or…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted in KDD 2023

  7. arXiv:2306.01684  [pdf, other]

    cs.LG cs.CR

    Harnessing large-language models to generate private synthetic text

    Authors: Alexey Kurakin, Natalia Ponomareva, Umar Syed, Liam MacDermed, Andreas Terzis

    Abstract: Differentially private training algorithms like DP-SGD protect sensitive training data by ensuring that trained models do not reveal private information. An alternative approach, which this paper studies, is to use a sensitive dataset to generate synthetic data that is differentially private with respect to the original data, and then non-privately training a model on the synthetic data. Doing so…

    Submitted 10 January, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 31 pages; 7 figures; compared to previous version added result of LoRa-finetuning

  8. arXiv:2305.05973  [pdf, other]

    cs.CL cs.CR cs.IR

    Synthetic Query Generation for Privacy-Preserving Deep Retrieval Systems using Differentially Private Language Models

    Authors: Aldo Gael Carranza, Rezsa Farahani, Natalia Ponomareva, Alex Kurakin, Matthew Jagielski, Milad Nasr

    Abstract: We address the challenge of ensuring differential privacy (DP) guarantees in training deep retrieval systems. Training these systems often involves the use of contrastive-style losses, which are typically non-per-example decomposable, making them difficult to directly DP-train with since common techniques require per-example gradients. To address this issue, we propose an approach that prioritizes…

    Submitted 23 May, 2024; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: Accepted to NAACL 2024

  9. arXiv:2303.00654  [pdf, other]

    cs.LG cs.CR stat.ML

    How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy

    Authors: Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, Abhradeep Thakurta

    Abstract: ML models are ubiquitous in real world applications and are a constant focus of research. At the same time, the community has started to realize the importance of protecting the privacy of ML training data. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP t…

    Submitted 31 July, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Journal ref: Journal of Artificial Intelligence Research 77 (2023) 1113-1201

  10. arXiv:2302.14623  [pdf, other]

    cs.LG cs.CV math.OC

    Fast as CHITA: Neural Network Pruning with Combinatorial Optimization

    Authors: Riade Benbaki, Wenyu Chen, Xiang Meng, Hussein Hazimeh, Natalia Ponomareva, Zhe Zhao, Rahul Mazumder

    Abstract: The sheer size of modern neural networks makes model serving a serious computational challenge. A popular class of compression techniques overcomes this challenge by pruning or sparsifying the weights of pretrained networks. While useful, these techniques often face serious tradeoffs between computational requirements and compression quality. In this work, we propose a novel optimization-based pru…

    Submitted 28 February, 2023; originally announced February 2023.

  11. arXiv:2302.00089  [pdf, other]

    cs.LG cs.AI

    Mind the (optimality) Gap: A Gap-Aware Learning Rate Scheduler for Adversarial Nets

    Authors: Hussein Hazimeh, Natalia Ponomareva

    Abstract: Adversarial nets have proved to be powerful in various domains including generative modeling (GANs), transfer learning, and fairness. However, successfully training adversarial nets using first-order methods remains a major challenge. Typically, careful choices of the learning rates are needed to maintain the delicate balance between the competing networks. In this paper, we design a novel learnin…

    Submitted 31 January, 2023; originally announced February 2023.

    Comments: Accepted to AISTATS 2023

  12. Newer is not always better: Rethinking transferability metrics, their peculiarities, stability and performance

    Authors: Shibal Ibrahim, Natalia Ponomareva, Rahul Mazumder

    Abstract: Fine-tuning of large pre-trained image and language models on small customized datasets has become increasingly popular for improved prediction and efficient use of limited resources. Fine-tuning requires identification of best models to transfer-learn from and quantifying transferability prevents expensive re-training on all of the candidate models/tasks pairs. In this paper, we show that the sta…

    Submitted 26 May, 2023; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: Accepted in ECMLPKDD 2022

  13. arXiv:2108.01099  [pdf, other]

    cs.LG

    Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training Data

    Authors: Qi Zhu, Natalia Ponomareva, Jiawei Han, Bryan Perozzi

    Abstract: There has been a recent surge of interest in designing Graph Neural Networks (GNNs) for semi-supervised learning tasks. Unfortunately this work has assumed that the nodes labeled for use in training were selected uniformly at random (i.e. are an IID sample). However in many real world scenarios gathering labels for graph nodes is both expensive and inherently biased -- so this assumption can not b…

    Submitted 26 October, 2021; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: NeurIPS 2021

  14. arXiv:2002.07772  [pdf, other]

    cs.LG cs.CV stat.ML

    The Tree Ensemble Layer: Differentiability meets Conditional Computation

    Authors: Hussein Hazimeh, Natalia Ponomareva, Petros Mol, Zhenyu Tan, Rahul Mazumder

    Abstract: Neural networks and tree ensembles are state-of-the-art learners, each with its unique statistical and computational advantages. We aim to combine these advantages by introducing a new layer for neural networks, composed of an ensemble of differentiable decision trees (a.k.a. soft trees). While differentiable trees demonstrate promising results in the literature, they are typically slow in trainin…

    Submitted 10 July, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: ICML 2020

  15. arXiv:1909.08792  [pdf, other]

    cs.RO cs.AI cs.LG

    Agent Prioritization for Autonomous Navigation

    Authors: Khaled S. Refaat, Kai Ding, Natalia Ponomareva, Stéphane Ross

    Abstract: In autonomous navigation, a planning system reasons about other agents to plan a safe and plausible trajectory. Before planning starts, agents are typically processed with computationally intensive models for recognition, tracking, motion estimation and prediction. With limited computational resources and a large number of agents to process in real time, it becomes important to efficiently rank ag…

    Submitted 18 September, 2019; originally announced September 2019.

    Comments: 8 pages, accepted to IEEE/RSJ International Conference on Robots and Systems (IROS) 2019

  16. arXiv:1903.08708  [pdf, other]

    cs.LG stat.ML

    Accelerating Gradient Boosting Machine

    Authors: Haihao Lu, Sai Praneeth Karimireddy, Natalia Ponomareva, Vahab Mirrokni

    Abstract: Gradient Boosting Machine (GBM) is an extremely powerful supervised learning algorithm that is widely used in practice. GBM routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In this work, we propose Accelerated Gradient Boosting Machine (AGBM) by incorporating Nesterov's acceleration techniques into the design of GBM. The difficulty in accele…

    Submitted 27 August, 2020; v1 submitted 20 March, 2019; originally announced March 2019.

  17. arXiv:1710.11555  [pdf, other]

    stat.ML cs.LG

    TF Boosted Trees: A scalable TensorFlow based framework for gradient boosting

    Authors: Natalia Ponomareva, Soroush Radpour, Gilbert Hendry, Salem Haykal, Thomas Colthurst, Petr Mitrichev, Alexander Grushetsky

    Abstract: TF Boosted Trees (TFBT) is a new open-sourced framework for the distributed training of gradient boosted trees. It is based on TensorFlow, and its distinguishing features include a novel architecture, automatic loss differentiation, layer-by-layer boosting that results in smaller ensembles and faster prediction, principled multi-class handling, and a number of regularization techniques to prevent…

    Submitted 31 October, 2017; originally announced October 2017.

    Comments: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2017). The final publication will be available at link.springer.com and is available on ECML website http://ecmlpkdd2017.ijs.si/papers/paperID705.pdf

  18. arXiv:1710.11547  [pdf, other]

    stat.ML cs.LG

    Compact Multi-Class Boosted Trees

    Authors: Natalia Ponomareva, Thomas Colthurst, Gilbert Hendry, Salem Haykal, Soroush Radpour

    Abstract: Gradient boosted decision trees are a popular machine learning technique, in part because of their ability to give good accuracy with small models. We describe two extensions to the standard tree boosting algorithm designed to increase this advantage. The first improvement extends the boosting formalism from scalar-valued trees to vector-valued trees. This allows individual trees to be used as mul…

    Submitted 31 October, 2017; originally announced October 2017.

    Comments: Accepted for publication in IEEE Big Data 2017 http://cci.drexel.edu/bigdata/bigdata2017/AcceptedPapers.html