Skip to main content

Showing 1–3 of 3 results for author: Vucetic, D

.
  1. arXiv:2310.02779  [pdf, other

    cs.LG cs.GT

    Expected flow networks in stochastic environments and two-player zero-sum games

    Authors: Marco Jiralerspong, Bilun Sun, Danilo Vucetic, Tianyu Zhang, Yoshua Bengio, Gauthier Gidel, Nikolay Malkin

    Abstract: Generative flow networks (GFlowNets) are sequential sampling models trained to match a given distribution. GFlowNets have been successfully applied to various structured object generation tasks, sampling a diverse set of high-reward objects quickly. We propose expected flow networks (EFlowNets), which extend GFlowNets to stochastic environments. We show that EFlowNets outperform other GFlowNet for… ▽ More

    Submitted 13 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: ICLR 2024; code: https://github.com/GFNOrg/AdversarialFlowNetworks

  2. arXiv:2208.02070  [pdf, other

    cs.CL cs.LG

    Efficient Fine-Tuning of Compressed Language Models with Learners

    Authors: Danilo Vucetic, Mohammadreza Tayaranian, Maryam Ziaeefard, James J. Clark, Brett H. Meyer, Warren J. Gross

    Abstract: Fine-tuning BERT-based models is resource-intensive in memory, computation, and time. While many prior works aim to improve inference efficiency via compression techniques, e.g., pruning, these works do not explicitly address the computational challenges of training to downstream tasks. We introduce Learner modules and priming, novel methods for fine-tuning that exploit the overparameterization of… ▽ More

    Submitted 3 August, 2022; originally announced August 2022.

    Comments: 8 pages, 9 figures, 2 tables, presented at ICML 2022 workshop on Hardware-Aware Efficient Training (HAET 2022)

  3. Efficient Fine-Tuning of BERT Models on the Edge

    Authors: Danilo Vucetic, Mohammadreza Tayaranian, Maryam Ziaeefard, James J. Clark, Brett H. Meyer, Warren J. Gross

    Abstract: Resource-constrained devices are increasingly the deployment targets of machine learning applications. Static models, however, do not always suffice for dynamic environments. On-device training of models allows for quick adaptability to new scenarios. With the increasing size of deep neural networks, as noted with the likes of BERT and other natural language processing models, comes increased reso… ▽ More

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: 4 pages, 2 figures, 3 tables. To be published in ISCAS 2022 and made available on IEEE Xplore