Showing 1–6 of 6 results for author: Shirian, A

Searching in archive cs.
  1. arXiv:2307.15786  [pdf, other]

    cs.LG cs.AI cs.LO

    SAFE: Saliency-Aware Counterfactual Explanations for DNN-based Automated Driving Systems

    Authors: Amir Samadi, Amir Shirian, Konstantinos Koufos, Kurt Debattista, Mehrdad Dianati

    Abstract: A CF explainer identifies the minimum modifications in the input that would alter the model's output to its complement. In other words, a CF explainer computes the minimum modifications required to cross the model's decision boundary. Current deep generative CF models often work with user-selected features rather than focusing on the discriminative features of the black-box model. Consequently, su…

    Submitted 28 July, 2023; originally announced July 2023.

    Comments: This paper is accepted at the 26th IEEE International Conference on Intelligent Transportation Systems (ITSC 2023)

  2. arXiv:2303.02665  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Heterogeneous Graph Learning for Acoustic Event Classification

    Authors: Amir Shirian, Mona Ahmadian, Krishna Somandepalli, Tanaya Guha

    Abstract: Heterogeneous graphs provide a compact, efficient, and scalable way to model data involving multiple disparate modalities. This makes modeling audiovisual data using heterogeneous graphs an attractive option. However, graph structure does not appear naturally in audiovisual data. Graphs for audiovisual data are constructed manually which is both difficult and sub-optimal. In this work, we address…

    Submitted 12 March, 2023; v1 submitted 5 March, 2023; originally announced March 2023.

    Comments: arXiv admin note: text overlap with arXiv:2207.07935

  3. arXiv:2207.07935  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Visually-aware Acoustic Event Detection using Heterogeneous Graphs

    Authors: Amir Shirian, Krishna Somandepalli, Victor Sanchez, Tanaya Guha

    Abstract: Perception of auditory events is inherently multimodal relying on both audio and visual cues. A large number of existing multimodal approaches process each modality using modality-specific models and then fuse the embeddings to encode the joint information. In contrast, we employ heterogeneous graphs to explicitly capture the spatial and temporal relationships between the modalities and represent…

    Submitted 16 July, 2022; originally announced July 2022.

  4. Self-supervised Graphs for Audio Representation Learning with Limited Labeled Data

    Authors: Amir Shirian, Krishna Somandepalli, Tanaya Guha

    Abstract: Large scale databases with high-quality manual annotations are scarce in audio domain. We thus explore a self-supervised graph approach to learning audio representations from highly limited labelled data. Considering each audio sample as a graph node, we propose a subgraph-based framework with novel self-supervision tasks that can learn effective audio representations. During training, subgraphs a…

    Submitted 16 July, 2022; v1 submitted 31 January, 2022; originally announced February 2022.

  5. arXiv:2008.02661  [pdf, other]

    cs.CV cs.MM eess.AS

    Dynamic Emotion Modeling with Learnable Graphs and Graph Inception Network

    Authors: A. Shirian, S. Tripathi, T. Guha

    Abstract: Human emotion is expressed, perceived and captured using a variety of dynamic data modalities, such as speech (verbal), videos (facial expressions) and motion sensors (body gestures). We propose a generalized approach to emotion recognition that can adapt across modalities by modeling dynamic data as structured graphs. The motivation behind the graph approach is to build compact models without com…

    Submitted 8 February, 2021; v1 submitted 6 August, 2020; originally announced August 2020.

    Journal ref: 10.1109/TMM.2021.3059169

  6. arXiv:2008.02063  [pdf, other]

    cs.CV cs.LG eess.AS

    Compact Graph Architecture for Speech Emotion Recognition

    Authors: A. Shirian, T. Guha

    Abstract: We propose a deep graph approach to address the task of speech emotion recognition. A compact, efficient and scalable way to represent data is in the form of graphs. Following the theory of graph signal processing, we propose to model speech signal as a cycle graph or a line graph. Such graph structure enables us to construct a Graph Convolution Network (GCN)-based architecture that can perform an…

    Submitted 2 February, 2021; v1 submitted 5 August, 2020; originally announced August 2020.