FactoFormer: Factorized Hyperspectral Transformers with Self-Supervised Pretraining

Mohamed, Shaheer; Haghighat, Maryam; Fernando, Tharindu; Sridharan, Sridha; Fookes, Clinton; Moghadam, Peyman

arXiv:2309.09431 (cs)

[Submitted on 18 Sep 2023 (v1), last revised 4 Jan 2024 (this version, v4)]

Abstract: Hyperspectral images (HSIs) contain rich spectral and spatial information. Motivated by the success of transformers in natural language processing and computer vision, where they have shown the ability to learn long-range dependencies within input data, recent research has focused on using transformers for HSIs. However, current state-of-the-art hyperspectral transformers tokenize the input HSI sample only along the spectral dimension, which under-utilizes spatial information. Moreover, transformers are known to be data-hungry, and their performance relies heavily on large-scale pretraining, which is challenging given the limited annotated hyperspectral data. As a result, the potential of HSI transformers has not been fully realized. To overcome these limitations, we propose a novel factorized spectral-spatial transformer that incorporates factorized self-supervised pretraining procedures, leading to significant improvements in performance. The factorization of the inputs allows the spectral and spatial transformers to better capture the interactions within the hyperspectral data cubes. Inspired by masked image modeling pretraining, we also devise efficient masking strategies for pretraining each of the spectral and spatial transformers. We conduct experiments on six publicly available datasets for the HSI classification task and demonstrate that our model achieves state-of-the-art performance on all of them. The code for our model will be made available at this https URL.
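
To make the idea of factorized tokenization concrete, the following is a minimal sketch, not the authors' implementation. It assumes one plausible reading of the abstract: a small HSI patch of shape height x width x bands is turned into spectral tokens (one per band) and spatial tokens (one per pixel), and a random subset of tokens is hidden for masked pretraining. The function names, the per-pixel spatial tokenization, and the 0.5 mask ratio are all hypothetical choices made for illustration.

```python
import random

def spectral_tokens(cube):
    """One token per band: the flattened H x W spatial patch of that band."""
    height, width, bands = len(cube), len(cube[0]), len(cube[0][0])
    return [[cube[r][c][b] for r in range(height) for c in range(width)]
            for b in range(bands)]

def spatial_tokens(cube):
    """One token per pixel: the full spectrum at that spatial position.
    (Assumption: the paper may group pixels differently.)"""
    return [list(pixel) for row in cube for pixel in row]

def random_mask(num_tokens, mask_ratio, seed=0):
    """Choose which token indices to hide during masked pretraining.
    The ratio is a hypothetical value, not one taken from the paper."""
    rng = random.Random(seed)
    num_masked = int(num_tokens * mask_ratio)
    return sorted(rng.sample(range(num_tokens), num_masked))

# Toy 2 x 2 patch with 3 bands; each value encodes (row, col, band).
cube = [[[100 * r + 10 * c + b for b in range(3)] for c in range(2)]
        for r in range(2)]

spec = spectral_tokens(cube)   # 3 tokens, each of length 4
spat = spatial_tokens(cube)    # 4 tokens, each of length 3
masked = random_mask(len(spat), mask_ratio=0.5)  # 2 hidden spatial tokens
```

In this sketch the two token sequences would be fed to separate spectral and spatial transformers, which is what lets each branch be masked and pretrained on its own.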

Comments:	Accepted to IEEE Transactions on Geoscience and Remote Sensing in December 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2309.09431 [cs.CV]
	(or arXiv:2309.09431v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2309.09431

Submission history

From: Shaheer Mohamed
[v1] Mon, 18 Sep 2023 02:05:52 UTC (3,630 KB)
[v2] Mon, 11 Dec 2023 02:15:50 UTC (1,952 KB)
[v3] Tue, 12 Dec 2023 04:47:38 UTC (1,952 KB)
[v4] Thu, 4 Jan 2024 01:05:16 UTC (1,952 KB)
