
Showing 1–7 of 7 results for author: Hassanpour, M

  1. arXiv:2406.19995  [pdf, other]

    cs.CL cs.AI cs.LG

    Single Parent Family: A Spectrum of Family Members from a Single Pre-Trained Foundation Model

    Authors: Habib Hajimolahoseini, Mohammad Hassanpour, Foozhan Ataiefard, Boxing Chen, Yang Liu

    Abstract: This paper introduces a novel method of Progressive Low Rank Decomposition (PLRD) tailored for the compression of large language models. Our approach leverages a pre-trained model, which is then incrementally decompressed to smaller sizes using progressively lower ranks. This method allows for significant reductions in computational overhead and energy consumption, as subsequent models are derived…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2401.15293  [pdf, other]

    cs.CV cs.AI cs.LG

    SkipViT: Speeding Up Vision Transformers with a Token-Level Skip Connection

    Authors: Foozhan Ataiefard, Walid Ahmed, Habib Hajimolahoseini, Saina Asani, Farnoosh Javadi, Mohammad Hassanpour, Omar Mohamed Awad, Austin Wen, Kangling Liu, Yang Liu

    Abstract: Vision transformers are known to be more computationally and data-intensive than CNN models. These transformer models such as ViT, require all the input image tokens to learn the relationship among them. However, many of these tokens are not informative and may contain irrelevant information such as unrelated background or unimportant scenery. These tokens are overlooked by the multi-head self-att…

    Submitted 26 January, 2024; originally announced January 2024.

  3. arXiv:2401.01577  [pdf, other]

    cs.CV

    Test-Time Personalization with Meta Prompt for Gaze Estimation

    Authors: Huan Liu, Julia Qi, Zhenhao Li, Mohammad Hassanpour, Yang Wang, Konstantinos Plataniotis, Yuanhao Yu

    Abstract: Despite the recent remarkable achievement in gaze estimation, efficient and accurate personalization of gaze estimation without labels is a practical problem but rarely touched on in the literature. To achieve efficient personalization, we take inspiration from the recent advances in Natural Language Processing (NLP) by updating a negligible number of parameters, "prompts", at the test time. Speci…

    Submitted 12 March, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  4. arXiv:2312.10610  [pdf, other]

    cs.CL

    Do LLMs Work on Charts? Designing Few-Shot Prompts for Chart Question Answering and Summarization

    Authors: Xuan Long Do, Mohammad Hassanpour, Ahmed Masry, Parsa Kavehzadeh, Enamul Hoque, Shafiq Joty

    Abstract: A number of tasks have been proposed recently to facilitate easy access to charts such as chart QA and summarization. The dominant paradigm to solve these tasks has been to fine-tune a pretrained model on the task data. However, this approach is not only expensive but also not generalizable to unseen tasks. On the other hand, large language models (LLMs) have shown impressive generalization capabi…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: 23 pages

  5. arXiv:2311.15134  [pdf, other]

    cs.LG cs.AI

    SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling

    Authors: Habib Hajimolahoseini, Omar Mohamed Awad, Walid Ahmed, Austin Wen, Saina Asani, Mohammad Hassanpour, Farnoosh Javadi, Mehdi Ahmadi, Foozhan Ataiefard, Kangling Liu, Yang Liu

    Abstract: In this paper, we present SwiftLearn, a data-efficient approach to accelerate training of deep learning models using a subset of data samples selected during the warm-up stages of training. This subset is selected based on an importance criteria measured over the entire dataset during warm-up stages, aiming to preserve the model performance with fewer examples during the rest of training. The impo…

    Submitted 25 November, 2023; originally announced November 2023.

  6. arXiv:2311.03426  [pdf, other]

    cs.LG cs.AI cs.CV

    GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values

    Authors: Farnoosh Javadi, Walid Ahmed, Habib Hajimolahoseini, Foozhan Ataiefard, Mohammad Hassanpour, Saina Asani, Austin Wen, Omar Mohamed Awad, Kangling Liu, Yang Liu

    Abstract: Massive transformer-based models face several challenges, including slow and computationally intensive pre-training and over-parametrization. This paper addresses these challenges by proposing a versatile method called GQKVA, which generalizes query, key, and value grouping techniques. GQKVA is designed to speed up transformer pre-training while reducing the model size. Our experiments with variou…

    Submitted 13 December, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

  7. arXiv:2112.12630  [pdf, other]

    cs.AR cs.LG

    A Survey of Near-Data Processing Architectures for Neural Networks

    Authors: Mehdi Hassanpour, Marc Riera, Antonio González

    Abstract: Data-intensive workloads and applications, such as machine learning (ML), are fundamentally limited by traditional computing systems based on the von-Neumann architecture. As data movement operations and energy consumption become key bottlenecks in the design of computing systems, the interest in unconventional approaches such as Near-Data Processing (NDP), machine learning, and especially neural…

    Submitted 23 December, 2021; originally announced December 2021.