Showing 1–6 of 6 results for author: Ataiefard, F

Searching in archive cs.
  1. arXiv:2406.19995  [pdf, other]

    cs.CL cs.AI cs.LG

    Single Parent Family: A Spectrum of Family Members from a Single Pre-Trained Foundation Model

    Authors: Habib Hajimolahoseini, Mohammad Hassanpour, Foozhan Ataiefard, Boxing Chen, Yang Liu

    Abstract: This paper introduces a novel method of Progressive Low Rank Decomposition (PLRD) tailored for the compression of large language models. Our approach leverages a pre-trained model, which is then incrementally decompressed to smaller sizes using progressively lower ranks. This method allows for significant reductions in computational overhead and energy consumption, as subsequent models are derived…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2401.15293  [pdf, other]

    cs.CV cs.AI cs.LG

    SkipViT: Speeding Up Vision Transformers with a Token-Level Skip Connection

    Authors: Foozhan Ataiefard, Walid Ahmed, Habib Hajimolahoseini, Saina Asani, Farnoosh Javadi, Mohammad Hassanpour, Omar Mohamed Awad, Austin Wen, Kangling Liu, Yang Liu

    Abstract: Vision transformers are known to be more computationally and data-intensive than CNN models. These transformer models, such as ViT, require all the input image tokens to learn the relationship among them. However, many of these tokens are not informative and may contain irrelevant information such as unrelated background or unimportant scenery. These tokens are overlooked by the multi-head self-att…

    Submitted 26 January, 2024; originally announced January 2024.

  3. arXiv:2311.15134  [pdf, other]

    cs.LG cs.AI

    SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling

    Authors: Habib Hajimolahoseini, Omar Mohamed Awad, Walid Ahmed, Austin Wen, Saina Asani, Mohammad Hassanpour, Farnoosh Javadi, Mehdi Ahmadi, Foozhan Ataiefard, Kangling Liu, Yang Liu

    Abstract: In this paper, we present SwiftLearn, a data-efficient approach to accelerate training of deep learning models using a subset of data samples selected during the warm-up stages of training. This subset is selected based on an importance criterion measured over the entire dataset during warm-up stages, aiming to preserve the model performance with fewer examples during the rest of training. The impo…

    Submitted 25 November, 2023; originally announced November 2023.

  4. arXiv:2311.03426  [pdf, other]

    cs.LG cs.AI cs.CV

    GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values

    Authors: Farnoosh Javadi, Walid Ahmed, Habib Hajimolahoseini, Foozhan Ataiefard, Mohammad Hassanpour, Saina Asani, Austin Wen, Omar Mohamed Awad, Kangling Liu, Yang Liu

    Abstract: Massive transformer-based models face several challenges, including slow and computationally intensive pre-training and over-parametrization. This paper addresses these challenges by proposing a versatile method called GQKVA, which generalizes query, key, and value grouping techniques. GQKVA is designed to speed up transformer pre-training while reducing the model size. Our experiments with variou…

    Submitted 13 December, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

  5. arXiv:2309.14615  [pdf, other]

    cs.LG cs.CE q-fin.TR

    Gray-box Adversarial Attack of Deep Reinforcement Learning-based Trading Agents

    Authors: Foozhan Ataiefard, Hadi Hemmati

    Abstract: In recent years, deep reinforcement learning (Deep RL) has been successfully implemented as a smart agent in many systems such as complex games, self-driving cars, and chat-bots. One of the interesting use cases of Deep RL is its application as an automated stock trading agent. In general, any automated trading agent is prone to manipulations by adversaries in the trading environment. Thus studyin…

    Submitted 25 September, 2023; originally announced September 2023.

  6. arXiv:2101.04948  [pdf, other]

    cs.LG cs.SE

    Deep State Inference: Toward Behavioral Model Inference of Black-box Software Systems

    Authors: Foozhan Ataiefard, Mohammad Jafar Mashhadi, Hadi Hemmati, Niel Walkinshaw

    Abstract: Many software engineering tasks, such as testing and anomaly detection, can benefit from the ability to infer a behavioral model of the software. Most existing inference approaches assume access to code to collect execution sequences. In this paper, we investigate a black-box scenario, where the system under analysis cannot be instrumented in this granular fashion. This scenario is particularly pre…

    Submitted 12 October, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

    Comments: 17 pages, 9 figures. arXiv admin note: text overlap with arXiv:2008.11856