Showing 1–11 of 11 results for author: Dun, C

  1. arXiv:2310.03899

    cs.LG

    CrysFormer: Protein Structure Prediction via 3d Patterson Maps and Partial Structure Attention

    Authors: Chen Dun, Qiutai Pan, Shikai Jin, Ria Stevens, Mitchell D. Miller, George N. Phillips, Jr., Anastasios Kyrillidis

    Abstract: Determining the structure of a protein has been a decades-long open question. A protein's three-dimensional structure often poses nontrivial computation costs, when classical simulation algorithms are utilized. Advances in the transformer neural network architecture -- such as AlphaFold2 -- achieve significant improvements for this problem, by learning from a large dataset of sequence information…

    Submitted 5 October, 2023; originally announced October 2023.

  2. arXiv:2310.02842

    cs.CL cs.AI

    Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation

    Authors: Chen Dun, Mirian Hipolito Garcia, Guoqing Zheng, Ahmed Hassan Awadallah, Anastasios Kyrillidis, Robert Sim

    Abstract: Large Language Models (LLMs) have the ability to solve a variety of tasks, such as text summarization and mathematical questions, just out of the box, but they are often trained with a single task in mind. Due to high computational costs, the current trend is to use prompt instruction tuning to better adjust monolithic, pretrained LLMs for new -- but often individual -- downstream tasks. Thus, how…

    Submitted 5 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

  3. arXiv:2309.03469

    cs.LG cs.AI cs.CV

    Fast FixMatch: Faster Semi-Supervised Learning with Curriculum Batch Size

    Authors: John Chen, Chen Dun, Anastasios Kyrillidis

    Abstract: Advances in Semi-Supervised Learning (SSL) have almost entirely closed the gap between SSL and Supervised Learning at a fraction of the number of labels. However, recent performance improvements have often come at the cost of significantly increased training computation. To address this, we propose Curriculum Batch Size (CBS), an unlabeled batch size curriculum which exploits the…

    Submitted 6 September, 2023; originally announced September 2023.

  4. arXiv:2306.08586

    cs.LG cs.AI math.OC

    FedJETs: Efficient Just-In-Time Personalization with Federated Mixture of Experts

    Authors: Chen Dun, Mirian Hipolito Garcia, Guoqing Zheng, Ahmed Hassan Awadallah, Robert Sim, Anastasios Kyrillidis, Dimitrios Dimitriadis

    Abstract: One of the goals in Federated Learning (FL) is to create personalized models that can adapt to the context of each participating client, while utilizing knowledge from a shared global model. Yet, often, personalization requires a fine-tuning step using clients' labeled data in order to achieve good performance. This may not be feasible in scenarios where incoming clients are fresh and/or have priv…

    Submitted 4 October, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 19 Pages

  5. arXiv:2210.16169

    cs.LG cs.AI cs.IT math.OC

    LOFT: Finding Lottery Tickets through Filter-wise Training

    Authors: Qihan Wang, Chen Dun, Fangshuo Liao, Chris Jermaine, Anastasios Kyrillidis

    Abstract: Recent work on the Lottery Ticket Hypothesis (LTH) shows that there exist "winning tickets" in large neural networks. These tickets represent "sparse" versions of the full model that can be trained independently to achieve comparable accuracy with respect to the full model. However, finding the winning tickets requires one to pretrain the large model for at least a number of ep…

    Submitted 28 October, 2022; originally announced October 2022.

  6. arXiv:2210.16105

    cs.LG cs.AI cs.IT math.OC

    Efficient and Light-Weight Federated Learning via Asynchronous Distributed Dropout

    Authors: Chen Dun, Mirian Hipolito, Chris Jermaine, Dimitrios Dimitriadis, Anastasios Kyrillidis

    Abstract: Asynchronous learning protocols have regained attention lately, especially in the Federated Learning (FL) setup, where slower clients can severely impede the learning process. Herein, we propose AsyncDrop, a novel asynchronous FL framework that utilizes dropout regularization to handle device heterogeneity in distributed settings. Overall, AsyncDrop achieves better performance co…

    Submitted 28 October, 2022; originally announced October 2022.

  7. arXiv:2110.12292

    cs.LG

    Federated Multiple Label Hashing (FedMLH): Communication Efficient Federated Learning on Extreme Classification Tasks

    Authors: Zhenwei Dai, Chen Dun, Yuxin Tang, Anastasios Kyrillidis, Anshumali Shrivastava

    Abstract: Federated learning enables many local devices to train a deep learning model jointly without sharing the local data. Currently, most federated training schemes learn a global model by averaging the parameters of local models. However, most of these training schemes suffer from high communication costs resulting from transmitting full local model parameters. Moreover, directly averaging model par…

    Submitted 23 October, 2021; originally announced October 2021.

    Comments: 10 pages, 5 figures

  8. arXiv:2107.00961

    cs.LG cs.CV cs.DC math.OC

    ResIST: Layer-Wise Decomposition of ResNets for Distributed Training

    Authors: Chen Dun, Cameron R. Wolfe, Christopher M. Jermaine, Anastasios Kyrillidis

    Abstract: We propose ResIST, a novel distributed training protocol for Residual Networks (ResNets). ResIST randomly decomposes a global ResNet into several shallow sub-ResNets that are trained independently in a distributed manner for several local iterations, before having their updates synchronized and aggregated into the global model. In the next round, new sub-ResNets are randomly generated and the proc…

    Submitted 14 March, 2022; v1 submitted 2 July, 2021; originally announced July 2021.

    Comments: 26 pages, 8 figures, pre-print under review

  9. arXiv:2102.10424

    cs.LG cs.AI cs.DC math.OC

    GIST: Distributed Training for Large-Scale Graph Convolutional Networks

    Authors: Cameron R. Wolfe, Jingkang Yang, Arindam Chowdhury, Chen Dun, Artun Bayer, Santiago Segarra, Anastasios Kyrillidis

    Abstract: The graph convolutional network (GCN) is a go-to solution for machine learning on graphs, but its training is notoriously difficult to scale both in terms of graph size and the number of model parameters. Although some work has explored training on large-scale graphs (e.g., GraphSAGE, ClusterGCN, etc.), we pioneer efficient training of large-scale GCN models (i.e., ultra-wide, overparameterized mo…

    Submitted 14 March, 2022; v1 submitted 20 February, 2021; originally announced February 2021.

    Comments: 28 pages, 5 figures, pre-print under review

    ACM Class: I.2.4

  10. arXiv:1910.02120

    cs.LG stat.ML

    Distributed Learning of Deep Neural Networks using Independent Subnet Training

    Authors: Binhang Yuan, Cameron R. Wolfe, Chen Dun, Yuxin Tang, Anastasios Kyrillidis, Christopher M. Jermaine

    Abstract: Distributed machine learning (ML) can bring more computational resources to bear than single-machine learning, thus enabling reductions in training time. Distributed learning partitions models and data over many machines, allowing model and dataset sizes beyond the available compute power and memory of a single machine. In practice though, distributed ML is challenging when distribution is mandato…

    Submitted 18 April, 2022; v1 submitted 4 October, 2019; originally announced October 2019.

  11. arXiv:1810.09202

    cs.LG cs.AI cs.MA stat.ML

    Graph Convolutional Reinforcement Learning

    Authors: Jiechuan Jiang, Chen Dun, Tiejun Huang, Zongqing Lu

    Abstract: Learning to cooperate is crucially important in multi-agent environments. The key is to understand the mutual interplay between agents. However, multi-agent environments are highly dynamic, where agents keep moving and their neighbors change quickly. This makes it hard to learn abstract representations of mutual interplay between agents. To tackle these difficulties, we propose graph convolutional…

    Submitted 11 February, 2020; v1 submitted 22 October, 2018; originally announced October 2018.

    Comments: ICLR'20