Showing 1–5 of 5 results for author: Shaikh, B

  1. From CNNs to Transformers in Multimodal Human Action Recognition: A Survey

    Authors: Muhammad Bilal Shaikh, Syed Mohammed Shamsul Islam, Douglas Chai, Naveed Akhtar

    Abstract: Due to its widespread applications, human action recognition is one of the most widely studied research problems in Computer Vision. Recent studies have shown that addressing it using multimodal data leads to superior performance as compared to relying on a single data modality. During the adoption of deep learning for visual modelling in the last decade, action recognition approaches have mainly…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 23 pages, 5 figures and 3 tables. To appear in ACM Trans. Multimedia Comput. Commun. Appl. (TOMM) 2024

    ACM Class: A.1; I.2.10

  2. arXiv:2308.03741  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    MAiVAR-T: Multimodal Audio-image and Video Action Recognizer using Transformers

    Authors: Muhammad Bilal Shaikh, Douglas Chai, Syed Mohammed Shamsul Islam, Naveed Akhtar

    Abstract: In line with the human capacity to perceive the world by simultaneously processing and integrating high-dimensional inputs from multiple modalities like vision and audio, we propose a novel model, MAiVAR-T (Multimodal Audio-Image to Video Action Recognition Transformer). This model employs an intuitive approach for the combination of audio-image and video modalities, with a primary aim to escalate…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: 6 pages, 7 figures, 4 tables. Peer reviewed and accepted at the 11th European Workshop on Visual Information Processing (EUVIP), 11th-14th September 2023, Gjøvik, Norway. arXiv admin note: text overlap with arXiv:2103.15691 by other authors

  3. MAiVAR: Multimodal Audio-Image and Video Action Recognizer

    Authors: Muhammad Bilal Shaikh, Douglas Chai, Syed Mohammed Shamsul Islam, Naveed Akhtar

    Abstract: Currently, action recognition is predominately performed on video data as processed by CNNs. We investigate if the representation process of CNNs can also be leveraged for multimodal action recognition by incorporating image-based audio representations of actions in a task. To this end, we propose Multimodal Audio-Image and Video Action Recognizer (MAiVAR), a CNN-based audio-image to video fusion…

    Submitted 10 September, 2022; originally announced September 2022.

    Comments: Peer reviewed & accepted at IEEE VCIP 2022 (http://www.vcip2022.org/)

    ACM Class: I.2.10; I.5.4; I.5.2

    Journal ref: 2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)

  4. arXiv:2203.06732  [pdf, other]

    q-bio.QM cs.CE q-bio.MN

    BioSimulators: a central registry of simulation engines and services for recommending specific tools

    Authors: Bilal Shaikh, Lucian P. Smith, Dan Vasilescu, Gnaneswara Marupilla, Michael Wilson, Eran Agmon, Henry Agnew, Steven S. Andrews, Azraf Anwar, Moritz E. Beber, Frank T. Bergmann, David Brooks, Lutz Brusch, Laurence Calzone, Kiri Choi, Joshua Cooper, John Detloff, Brian Drawert, Michel Dumontier, G. Bard Ermentrout, James R. Faeder, Andrew P. Freiburger, Fabian Fröhlich, Akira Funahashi, Alan Garny, et al. (46 additional authors not shown)

    Abstract: Computational models have great potential to accelerate bioscience, bioengineering, and medicine. However, it remains challenging to reproduce and reuse simulations, in part, because the numerous formats and methods for simulating various subsystems and scales remain siloed by different software tools. For example, each tool must be executed through a distinct interface. To help investigators find…

    Submitted 13 March, 2022; originally announced March 2022.

    Comments: 6 pages, 2 figures

  5. arXiv:2005.05227  [pdf, other]

    cs.DB q-bio.QM

    ObjTables: structured spreadsheets that promote data quality, reuse, and integration

    Authors: Jonathan R. Karr, Wolfram Liebermeister, Arthur P. Goldberg, John A. P. Sekar, Bilal Shaikh

    Abstract: A central challenge in science is to understand how systems behaviors emerge from complex networks. This often requires aggregating, reusing, and integrating heterogeneous information. Supplementary spreadsheets to articles are a key data source. Spreadsheets are popular because they are easy to read and write. However, spreadsheets are often difficult to reanalyze because they capture data ad hoc…

    Submitted 6 August, 2020; v1 submitted 11 May, 2020; originally announced May 2020.

    Comments: 5 pages, 1 figure, 18 pages of supplementary information, 3 supplementary datasets