Showing 1–9 of 9 results for author: Vo, V A

Searching in archive cs.
  1. arXiv:2402.09126

    cs.DC cs.AI cs.CL cs.LG cs.SE

    MPIrigen: MPI Code Generation through Domain-Specific Language Models

    Authors: Nadav Schneider, Niranjan Hasabnis, Vy A. Vo, Tal Kadosh, Neva Krien, Mihai Capotă, Guy Tamir, Ted Willke, Nesreen Ahmed, Yuval Pinter, Timothy Mattson, Gal Oren

    Abstract: The imperative need to scale computation across numerous nodes highlights the significance of efficient parallel computing, particularly in the realm of Message Passing Interface (MPI) integration. The challenging parallel programming task of generating MPI-based parallel programs has remained unexplored. This study first investigates the performance of state-of-the-art language models in generati…

    Submitted 23 April, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  2. arXiv:2402.02018

    cs.LG

    The Landscape and Challenges of HPC Research and LLMs

    Authors: Le Chen, Nesreen K. Ahmed, Akash Dutta, Arijit Bhattacharjee, Sixing Yu, Quazi Ishtiaque Mahmud, Waqwoya Abebe, Hung Phan, Aishwarya Sarkar, Branden Butler, Niranjan Hasabnis, Gal Oren, Vy A. Vo, Juan Pablo Munoz, Theodore L. Willke, Tim Mattson, Ali Jannesari

    Abstract: Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breach…

    Submitted 6 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  3. arXiv:2312.13322

    cs.PL cs.AI cs.LG cs.SE

    Domain-Specific Code Language Models: Unraveling the Potential for HPC Codes and Tasks

    Authors: Tal Kadosh, Niranjan Hasabnis, Vy A. Vo, Nadav Schneider, Neva Krien, Mihai Capota, Abdul Wasay, Nesreen Ahmed, Ted Willke, Guy Tamir, Yuval Pinter, Timothy Mattson, Gal Oren

    Abstract: With easier access to powerful compute resources, there is a growing trend in AI for software development to develop larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size and demand expensive compute resources for training. This is partly because these LLMs for HPC tasks are obtained by…

    Submitted 20 December, 2023; originally announced December 2023.

  4. arXiv:2308.09440

    cs.CL cs.PL

    Scope is all you need: Transforming LLMs for HPC Code

    Authors: Tal Kadosh, Niranjan Hasabnis, Vy A. Vo, Nadav Schneider, Neva Krien, Abdul Wasay, Nesreen Ahmed, Ted Willke, Guy Tamir, Yuval Pinter, Timothy Mattson, Gal Oren

    Abstract: With easier access to powerful compute resources, there is a growing trend in the field of AI for software development to develop larger and larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size (e.g., billions of parameters) and demand expensive compute resources for training. We found…

    Submitted 29 September, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

  5. arXiv:2305.12248

    cs.CL cs.CV

    Brain encoding models based on multimodal transformers can transfer across language and vision

    Authors: Jerry Tang, Meng Du, Vy A. Vo, Vasudev Lal, Alexander G. Huth

    Abstract: Encoding models have been used to assess how the human brain represents concepts in language and vision. While language and vision rely on similar concept representations, current encoding models are typically trained and tested on brain responses to each modality in isolation. Recent advances in multimodal pretraining have produced transformers that can extract aligned representations of concepts…

    Submitted 20 May, 2023; originally announced May 2023.

  6. arXiv:2210.01869

    cs.CL cs.AI

    Memory in humans and deep language models: Linking hypotheses for model augmentation

    Authors: Omri Raccah, Phoebe Chen, Ted L. Willke, David Poeppel, Vy A. Vo

    Abstract: The computational complexity of the self-attention mechanism in Transformer models significantly limits their ability to generalize over long temporal durations. Memory-augmentation, or the explicit storing of past information in external memory for subsequent predictions, has become a constructive avenue for mitigating this limitation. We argue that memory-augmented Transformers can benefit subst…

    Submitted 27 November, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: 6 figures

  7. arXiv:2209.10818

    cs.LG cs.AI cs.NE

    Memory-Augmented Graph Neural Networks: A Brain-Inspired Review

    Authors: Guixiang Ma, Vy A. Vo, Theodore Willke, Nesreen K. Ahmed

    Abstract: We provide a comprehensive review of the existing literature on memory-augmented GNNs. We review these works through the lens of psychology and neuroscience, which has several established theories on how multiple memory systems and mechanisms operate in biological brains. We propose a taxonomy of memory-augmented GNNs and a set of criteria for comparing their memory mechanisms. We also provide cri…

    Submitted 14 July, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

  8. arXiv:2105.05944

    cs.LG

    Slower is Better: Revisiting the Forgetting Mechanism in LSTM for Slower Information Decay

    Authors: Hsiang-Yun Sherry Chien, Javier S. Turek, Nicole Beckage, Vy A. Vo, Christopher J. Honey, Ted L. Willke

    Abstract: Sequential information contains short- to long-range dependencies; however, learning long-timescale information has been a challenge for recurrent neural networks. Despite improvements in long short-term memory networks (LSTMs), the forgetting mechanism results in the exponential decay of information, limiting their capacity to capture long-timescale information. Here, we propose a power law forge…

    Submitted 12 May, 2021; originally announced May 2021.

    Comments: 16 pages, 10 figures

  9. arXiv:2009.12727

    cs.CL cs.LG

    Multi-timescale Representation Learning in LSTM Language Models

    Authors: Shivangi Mahto, Vy A. Vo, Javier S. Turek, Alexander G. Huth

    Abstract: Language models must capture statistical dependencies between words at timescales ranging from very short to very long. Earlier work has demonstrated that dependencies in natural language tend to decay with distance between words according to a power law. However, it is unclear how this knowledge can be used for analyzing or designing neural network language models. In this work, we derived a theo…

    Submitted 17 March, 2021; v1 submitted 26 September, 2020; originally announced September 2020.

    MSC Class: 91F20

    ACM Class: I.2.7; I.2.6

    Journal ref: International Conference on Learning Representations 2021