Showing 1–30 of 30 results for author: Fayyaz, M

  1. arXiv:2407.00219  [pdf, other]

    cs.CL cs.AI

    Evaluating Human Alignment and Model Faithfulness of LLM Rationale

    Authors: Mohsen Fayyaz, Fan Yin, Jiao Sun, Nanyun Peng

    Abstract: We study how well large language models (LLMs) explain their generations with rationales -- a set of tokens extracted from the input texts that reflect the decision process of LLMs. We examine LLM rationales extracted with two methods: 1) attribution-based methods that use attention or gradients to locate important tokens, and 2) prompting-based methods that guide LLMs to extract rationales using…

    Submitted 28 June, 2024; originally announced July 2024.

  2. arXiv:2405.17397  [pdf, other]

    cs.CV

    Occlusion Handling in 3D Human Pose Estimation with Perturbed Positional Encoding

    Authors: Niloofar Azizi, Mohsen Fayyaz, Horst Bischof

    Abstract: Understanding human behavior fundamentally relies on accurate 3D human pose estimation. Graph Convolutional Networks (GCNs) have recently shown promising advancements, delivering state-of-the-art performance with rather lightweight architectures. In the context of graph-structured data, leveraging the eigenvectors of the graph Laplacian matrix for positional encoding is effective. Yet, the approac…

    Submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2404.11672  [pdf, other]

    cs.CL

    MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory

    Authors: Ali Modarressi, Abdullatif Köksal, Ayyoob Imani, Mohsen Fayyaz, Hinrich Schütze

    Abstract: While current large language models (LLMs) demonstrate some capabilities in knowledge-intensive tasks, they are limited by relying on their parameters as an implicit storage mechanism. As a result, they struggle with infrequent knowledge and temporal degradation. In addition, the uninterpretable nature of parametric memorization makes it challenging to understand and prevent hallucination. Paramet…

    Submitted 17 April, 2024; originally announced April 2024.

  4. arXiv:2306.02873  [pdf, other]

    cs.CL

    DecompX: Explaining Transformers Decisions by Propagating Token Decomposition

    Authors: Ali Modarressi, Mohsen Fayyaz, Ehsan Aghazadeh, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar

    Abstract: An emerging solution for explaining Transformer-based models is to use vector-based analysis on how the representations are formed. However, providing a faithful vector-based explanation for a multi-layer model could be challenging in three aspects: (1) Incorporating all components into the analysis, (2) Aggregating the layer dynamics to determine the information flow and mixture throughout the en…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted to ACL 2023 (main conference)

  5. arXiv:2305.14322  [pdf, other]

    cs.CL

    RET-LLM: Towards a General Read-Write Memory for Large Language Models

    Authors: Ali Modarressi, Ayyoob Imani, Mohsen Fayyaz, Hinrich Schütze

    Abstract: Large language models (LLMs) have significantly advanced the field of natural language processing (NLP) through their extensive parameters and comprehensive data utilization. However, existing LLMs lack a dedicated memory unit, limiting their ability to explicitly store and retrieve knowledge for various tasks. In this paper, we propose RET-LLM, a novel framework that equips LLMs with a general wri…

    Submitted 23 May, 2023; originally announced May 2023.

  6. arXiv:2211.07804  [pdf, other]

    eess.IV cs.CV

    Diffusion Models for Medical Image Analysis: A Comprehensive Survey

    Authors: Amirhossein Kazerouni, Ehsan Khodapanah Aghdam, Moein Heidari, Reza Azad, Mohsen Fayyaz, Ilker Hacihaliloglu, Dorit Merhof

    Abstract: Denoising diffusion models, a class of generative models, have garnered immense interest lately in various deep-learning problems. A diffusion probabilistic model defines a forward diffusion stage where the input data is gradually perturbed over several steps by adding Gaussian noise and then learns to reverse the diffusion process to retrieve the desired noise-free data from noisy data samples. D…

    Submitted 3 June, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: Third revision: including more papers and further discussions

  7. arXiv:2211.05610  [pdf, other]

    cs.CL

    BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning

    Authors: Mohsen Fayyaz, Ehsan Aghazadeh, Ali Modarressi, Mohammad Taher Pilehvar, Yadollah Yaghoobzadeh, Samira Ebrahimi Kahou

    Abstract: Current pre-trained language models rely on large datasets for achieving state-of-the-art performance. However, past research has shown that not all examples in a dataset are equally important during training. In fact, it is sometimes possible to prune a considerable fraction of the training set while maintaining the test performance. Established on standard vision benchmarks, two gradient-based s…

    Submitted 28 November, 2022; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: ENLSP @ NeurIPS2022

  8. arXiv:2205.03286  [pdf, other]

    cs.CL

    GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers

    Authors: Ali Modarressi, Mohsen Fayyaz, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar

    Abstract: There has been a growing interest in interpreting the underlying dynamics of Transformers. While self-attention patterns were initially deemed as the primary option, recent studies have shown that integrating other components can yield more accurate explanations. This paper introduces a novel token attribution analysis method that incorporates all the components in the encoder block and aggregates…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: Accepted to NAACL 2022 (main conference)

  9. arXiv:2203.14139  [pdf, other]

    cs.CL

    Metaphors in Pre-Trained Language Models: Probing and Generalization Across Datasets and Languages

    Authors: Ehsan Aghazadeh, Mohsen Fayyaz, Yadollah Yaghoobzadeh

    Abstract: Human languages are full of metaphorical expressions. Metaphors help people understand the world by connecting new concepts and domains to more familiar ones. Large pre-trained language models (PLMs) are therefore assumed to encode metaphorical knowledge useful for NLP systems. In this paper, we investigate this hypothesis for PLMs, by probing metaphoricity information in their encodings, and by m…

    Submitted 26 March, 2022; originally announced March 2022.

    Comments: Accepted to ACL 2022 (main conference)

  10. arXiv:2111.15667  [pdf, other]

    cs.CV

    Adaptive Token Sampling For Efficient Vision Transformers

    Authors: Mohsen Fayyaz, Soroush Abbasi Koohpayegani, Farnoush Rezaei Jafari, Sunando Sengupta, Hamid Reza Vaezi Joze, Eric Sommerlade, Hamed Pirsiavash, Juergen Gall

    Abstract: While state-of-the-art vision transformer models achieve promising results in image classification, they are computationally expensive and require many GFLOPs. Although the GFLOPs of a vision transformer can be decreased by reducing the number of tokens in the network, there is no setting that is optimal for all input images. In this work, we therefore introduce a differentiable parameter-free Ada…

    Submitted 26 July, 2022; v1 submitted 30 November, 2021; originally announced November 2021.

    Comments: ECCV 2022

  11. arXiv:2110.14392  [pdf, other]

    cs.CV

    TaylorSwiftNet: Taylor Driven Temporal Modeling for Swift Future Frame Prediction

    Authors: Saber Pourheydari, Emad Bahrami, Mohsen Fayyaz, Gianpiero Francesca, Mehdi Noroozi, Juergen Gall

    Abstract: While recurrent neural networks (RNNs) demonstrate outstanding capabilities for future video frame prediction, they model dynamics in a discrete time space, i.e., they predict the frames sequentially with a fixed temporal step. RNNs are therefore prone to accumulate the error as the number of future frames increases. In contrast, partial differential equations (PDEs) model physical phenomena like…

    Submitted 12 October, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: BMVC 2022

  12. arXiv:2109.11593  [pdf, other]

    cs.CV

    Long Short View Feature Decomposition via Contrastive Video Representation Learning

    Authors: Nadine Behrmann, Mohsen Fayyaz, Juergen Gall, Mehdi Noroozi

    Abstract: Self-supervised video representation methods typically focus on the representation of temporal attributes in videos. However, the role of stationary versus non-stationary attributes is less explored: Stationary features, which remain similar throughout the video, enable the prediction of video-level action classes. Non-stationary features, which represent temporally varying attributes, are more be…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: ICCV 2021 (Main Conference)

  13. arXiv:2109.05958  [pdf, other]

    cs.CL cs.AI

    Not All Models Localize Linguistic Knowledge in the Same Place: A Layer-wise Probing on BERToids' Representations

    Authors: Mohsen Fayyaz, Ehsan Aghazadeh, Ali Modarressi, Hosein Mohebbi, Mohammad Taher Pilehvar

    Abstract: Most of the recent works on probing representations have focused on BERT, with the presumption that the findings might be similar to the other models. In this work, we extend the probing studies to two other models in the family, namely ELECTRA and XLNet, showing that variations in the pre-training objectives or architectural choices can result in different behaviors in encoding linguistic informa…

    Submitted 15 September, 2021; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: Accepted to BlackboxNLP Workshop at EMNLP 2021

  14. arXiv:2011.08652  [pdf, other]

    cs.CV

    3D CNNs with Adaptive Temporal Feature Resolutions

    Authors: Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc Van Gool, Juergen Gall

    Abstract: While state-of-the-art 3D Convolutional Neural Networks (CNN) achieve very good results on action recognition datasets, they are computationally very expensive and require many GFLOPs. While the GFLOPs of a 3D CNN can be decreased by reducing the temporal feature resolution within the network, there is no setting that is optimal for all input clips. In this work, we therefore introduce a different…

    Submitted 11 August, 2021; v1 submitted 17 November, 2020; originally announced November 2020.

    Comments: CVPR 2021

  15. arXiv:2003.14266  [pdf, other]

    cs.CV

    SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation

    Authors: Mohsen Fayyaz, Juergen Gall

    Abstract: Temporal action segmentation is a topic of increasing interest; however, annotating each frame in a video is cumbersome and costly. Weakly supervised approaches therefore aim at learning temporal action segmentation from videos that are only weakly labeled. In this work, we assume that for each training video only the list of actions is given that occur in the video, but not when, how often, and i…

    Submitted 31 March, 2020; originally announced March 2020.

    Comments: CVPR 2020

  16. arXiv:1904.11451  [pdf, other]

    cs.CV

    Large Scale Holistic Video Understanding

    Authors: Ali Diba, Mohsen Fayyaz, Vivek Sharma, Manohar Paluri, Jurgen Gall, Rainer Stiefelhagen, Luc Van Gool

    Abstract: Video recognition has been advanced in recent years by benchmarks with rich annotations. However, research is still mainly limited to human action or sports recognition - focusing on a highly specific video understanding task and thus leaving a significant gap towards describing the overall content of a video. We fill this gap by presenting a large-scale "Holistic Video Understanding Dataset" (HVU…

    Submitted 15 December, 2020; v1 submitted 25 April, 2019; originally announced April 2019.

    Comments: ECCV 2020

  17. arXiv:1904.03116  [pdf, other]

    cs.CV cs.LG

    Fast Weakly Supervised Action Segmentation Using Mutual Consistency

    Authors: Yaser Souri, Mohsen Fayyaz, Luca Minciullo, Gianpiero Francesca, Juergen Gall

    Abstract: Action segmentation is the task of predicting the actions for each frame of a video. As obtaining the full annotation of videos for action segmentation is expensive, weakly supervised approaches that can learn only from transcripts are appealing. In this paper, we propose a novel end-to-end approach for weakly supervised action segmentation based on a two-branch neural network. The two branches of…

    Submitted 10 June, 2021; v1 submitted 5 April, 2019; originally announced April 2019.

    Comments: Accepted for publication at TPAMI (IEEE Transactions on Pattern Analysis and Machine Intelligence) in 2021. First two authors contributed equally

  18. arXiv:1806.09986  [pdf]

    cs.CV

    Online Signature Verification using Deep Representation: A new Descriptor

    Authors: Mohammad Hajizadeh Saffar, Mohsen Fayyaz, Mohammad Sabokrou, Mahmood Fathy

    Abstract: This paper presents an accurate method for verifying online signatures. The main difficulties of signature verification come from (1) a lack of training samples and (2) the requirement that the methods be invariant to spatial change. To deal with these difficulties and model the signatures efficiently, we propose a method in which a one-class classifier for each user is built on discriminative features. First, we pre…

    Submitted 23 June, 2018; originally announced June 2018.

    Comments: arXiv admin note: substantial text overlap with arXiv:1505.08153

  19. arXiv:1806.07754  [pdf, other]

    cs.CV

    Spatio-Temporal Channel Correlation Networks for Action Classification

    Authors: Ali Diba, Mohsen Fayyaz, Vivek Sharma, M. Mahdi Arzani, Rahman Yousefzadeh, Juergen Gall, Luc Van Gool

    Abstract: The work in this paper is driven by the question of whether spatio-temporal correlations are enough for 3D convolutional neural networks (CNNs). Most of the traditional 3D networks use local spatio-temporal features. We introduce a new block that models correlations between channels of a 3D CNN with respect to temporal and spatial features. This new block can be added as a residual unit to different parts…

    Submitted 7 February, 2019; v1 submitted 19 June, 2018; originally announced June 2018.

    Comments: Accepted in ECCV 2018. arXiv admin note: substantial text overlap with arXiv:1711.08200

  20. arXiv:1806.06172  [pdf]

    cs.CV

    Semantic Video Segmentation: A Review on Recent Approaches

    Authors: Mohammad Hajizadeh Saffar, Mohsen Fayyaz, Mohammad Sabokrou, Mahmood Fathy

    Abstract: This paper gives an overview of semantic segmentation, consisting of an explanation of the field, its status and relation to other fundamental vision tasks, different datasets, and common evaluation parameters that have been used by researchers. This survey also includes an overall review of a variety of recent approaches (RDF, MRF, CRF, etc.) and their advantages and challenges and shows the supe…

    Submitted 15 June, 2018; originally announced June 2018.

  21. arXiv:1805.09521  [pdf, other]

    cs.CV

    AVID: Adversarial Visual Irregularity Detection

    Authors: Mohammad Sabokrou, Masoud Pourreza, Mohsen Fayyaz, Rahim Entezari, Mahmood Fathy, Jürgen Gall, Ehsan Adeli

    Abstract: Real-time detection of irregularities in visual data is invaluable in many prospective applications including surveillance, patient monitoring systems, etc. With the surge of deep learning methods in recent years, researchers have tried a wide spectrum of methods for different applications. However, for the case of irregularity or anomaly detection in videos, training an end-to…

    Submitted 17 July, 2018; v1 submitted 24 May, 2018; originally announced May 2018.

  22. arXiv:1802.06205  [pdf, other]

    cs.CV

    Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet

    Authors: Seyyed Hossein Hasanpour, Mohammad Rouhani, Mohsen Fayyaz, Mohammad Sabokrou, Ehsan Adeli

    Abstract: Major winning Convolutional Neural Networks (CNNs), such as VGGNet, ResNet, DenseNet, etc., include tens to hundreds of millions of parameters, which impose considerable computation and memory overheads. This limits their practical usage in training and optimizing for real-world applications. On the contrary, light-weight architectures, such as SqueezeNet, are being proposed to address this issue.…

    Submitted 17 February, 2018; originally announced February 2018.

    Comments: The Submitted version to the IEEE TIP on December 2017, replaced high resolution images with low-res counterparts due to arXiv size limitation, 19 pages

  23. arXiv:1711.08200  [pdf, other]

    cs.CV

    Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification

    Authors: Ali Diba, Mohsen Fayyaz, Vivek Sharma, Amir Hossein Karami, Mohammad Mahdi Arzani, Rahman Yousefzadeh, Luc Van Gool

    Abstract: The work in this paper is driven by the question of how to exploit the temporal cues available in videos for their accurate classification, and for human action recognition in particular. Thus far, the vision community has focused on spatio-temporal approaches with fixed temporal convolution kernel depths. We introduce a new temporal layer that models variable temporal convolution kernel depths. We e…

    Submitted 22 November, 2017; originally announced November 2017.

  24. arXiv:1609.00866  [pdf, other]

    cs.CV

    Deep-Anomaly: Fully Convolutional Neural Network for Fast Anomaly Detection in Crowded Scenes

    Authors: Mohammad Sabokrou, Mohsen Fayyaz, Mahmood Fathy, Zahra Moayedd, Reinhard Klette

    Abstract: The detection of abnormal behaviours in crowded scenes has to deal with many challenges. This paper presents an efficient method for detection and localization of anomalies in videos. Using fully convolutional neural networks (FCNs) and temporal data, a pre-trained supervised FCN is transferred into an unsupervised FCN ensuring the detection of (global) anomalies in scenes. High performance in ter…

    Submitted 30 April, 2017; v1 submitted 3 September, 2016; originally announced September 2016.

  25. arXiv:1608.06037  [pdf, other]

    cs.CV cs.NE

    Lets keep it simple, Using simple architectures to outperform deeper and more complex architectures

    Authors: Seyyed Hossein Hasanpour, Mohammad Rouhani, Mohsen Fayyaz, Mohammad Sabokrou

    Abstract: Major winning Convolutional Neural Networks (CNNs), such as AlexNet, VGGNet, ResNet, GoogleNet, include tens to hundreds of millions of parameters, which impose considerable computation and memory overhead. This limits their practical use for training, optimization and memory efficiency. On the contrary, light-weight architectures, being proposed to address this issue, mainly suffer from low accur…

    Submitted 27 April, 2023; v1 submitted 21 August, 2016; originally announced August 2016.

    Comments: Added the long-overdue ImageNet results and updated the missed cifar10/100 results from 2018

  26. arXiv:1608.05971  [pdf, other]

    cs.CV

    STFCN: Spatio-Temporal FCN for Semantic Video Segmentation

    Authors: Mohsen Fayyaz, Mohammad Hajizadeh Saffar, Mohammad Sabokrou, Mahmood Fathy, Reinhard Klette, Fay Huang

    Abstract: This paper presents a novel method to involve both spatial and temporal features for semantic video segmentation. Current work on convolutional neural networks (CNNs) has shown that CNNs provide advanced spatial features supporting a very good performance of solutions for both image and video analysis, especially for the semantic segmentation task. We investigate how involving temporal features als…

    Submitted 2 September, 2016; v1 submitted 21 August, 2016; originally announced August 2016.

  27. arXiv:1508.03710  [pdf]

    cs.CV

    A Novel Approach For Finger Vein Verification Based on Self-Taught Learning

    Authors: Mohsen Fayyaz, Masoud PourReza, Mohammad Hajizadeh Saffar, Mohammad Sabokrou, Mahmood Fathy

    Abstract: In this paper, we propose a method for user Finger Vein Authentication (FVA) as a biometric system. Using discriminative features for classifying these finger veins is one of the main points that make a difference in related works. Thus, we propose to learn a set of representative features, based on autoencoders. We model the user finger vein using a Gaussian distribution. Experimental results sho…

    Submitted 15 August, 2015; originally announced August 2015.

    Comments: 4 pages, 4 figures, Submitted Iranian Conference on Machine Vision and Image Processing

  28. Feature Representation for Online Signature Verification

    Authors: Mohsen Fayyaz, Mohammad Hajizadeh Saffar, Mohammad Sabokrou, Mahmood Fathy

    Abstract: Biometrics systems have been used in a wide range of applications and have improved people authentication. Signature verification is one of the most common biometric methods with techniques that employ various specifications of a signature. Recently, deep learning has achieved great success in many fields, such as image, sound, and text processing. In this paper, a deep learning method has been used…

    Submitted 29 May, 2015; originally announced May 2015.

    Comments: 10 pages, 10 figures, Submitted to IEEE Transactions on Information Forensics and Security

  29. arXiv:1411.4670  [pdf, other]

    cs.CV

    AlexU-Word: A New Dataset for Isolated-Word Closed-Vocabulary Offline Arabic Handwriting Recognition

    Authors: Mohamed E. Hussein, Marwan Torki, Ahmed Elsallamy, Mahmoud Fayyaz

    Abstract: In this paper, we introduce the first phase of a new dataset for offline Arabic handwriting recognition. The aim is to collect a very large dataset of isolated Arabic words that covers all letters of the alphabet in all possible shapes using a small number of simple words. The end goal is to collect a very large dataset of segmented letter images, which can be used to build and evaluate Arabic han…

    Submitted 17 November, 2014; originally announced November 2014.

    Comments: 6 pages, 8 figures, and 6 tables

    ACM Class: I.5.2; I.7.5

  30. arXiv:1411.3519  [pdf, other]

    cs.CV

    Window-Based Descriptors for Arabic Handwritten Alphabet Recognition: A Comparative Study on a Novel Dataset

    Authors: Marwan Torki, Mohamed E. Hussein, Ahmed Elsallamy, Mahmoud Fayyaz, Shehab Yaser

    Abstract: This paper presents a comparative study for window-based descriptors on the application of Arabic handwritten alphabet recognition. We show a detailed experimental evaluation of different descriptors with several classifiers. The objective of the paper is to evaluate different window-based descriptors on the problem of Arabic letter recognition. Our experiments clearly show that they perform very…

    Submitted 17 November, 2014; v1 submitted 13 November, 2014; originally announced November 2014.

    ACM Class: I.5.2; I.7.5