Skip to main content

Showing 1–2 of 2 results for author: Vayani, A

Searching in archive cs. Search in all archives.
.
  1. arXiv:2403.14743  [pdf, other

    cs.CV

    VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding

    Authors: Ahmad Mahmood, Ashmal Vayani, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan

    Abstract: Recent studies have demonstrated the effectiveness of Large Language Models (LLMs) as reasoning modules that can deconstruct complex tasks into more manageable sub-tasks, particularly when applied to visual reasoning tasks for images. In contrast, this paper introduces a Video Understanding and Reasoning Framework (VURF) based on the reasoning power of LLMs. Ours is a novel approach to extend the… ▽ More

    Submitted 24 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

  2. arXiv:2402.16840  [pdf, other

    cs.CL

    MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

    Authors: Omkar Thawakar, Ashmal Vayani, Salman Khan, Hisham Cholakal, Rao M. Anwer, Michael Felsberg, Tim Baldwin, Eric P. Xing, Fahad Shahbaz Khan

    Abstract: "Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development. However, LLMs do not suit well for scenarios that require on-device processing, energy efficiency, low memory footprint, and response efficiency. These requisites are crucial for privacy, security, and sustainable deployment. This paper explores the "less is more" paradigm by addressing the chall… ▽ More

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: Code available at : https://github.com/mbzuai-oryx/MobiLlama