MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models

Fang, Zijie; Wang, Yifeng; Wang, Zhi; Zhang, Jian; Ji, Xiangyang; Zhang, Yongbing

Computer Science > Computer Vision and Pattern Recognition

arXiv:2403.05160 (cs)

[Submitted on 8 Mar 2024]


Abstract: Recently, pathological diagnosis, the gold standard for cancer diagnosis, has achieved superior performance by combining the Transformer with the multiple instance learning (MIL) framework using whole slide images (WSIs). However, the gigapixel scale of WSIs makes it difficult to apply the Transformer's quadratic-complexity self-attention mechanism in MIL. Existing studies usually adopt linear attention to improve computational efficiency, but this inevitably introduces performance bottlenecks. To tackle this challenge, we propose MamMIL, a framework for WSI classification that integrates the selective structured state space model (i.e., Mamba) with MIL for the first time, enabling the modeling of instance dependencies while maintaining linear complexity. Specifically, because Mamba can only conduct unidirectional one-dimensional (1D) sequence modeling, we introduce a bidirectional state space model and a 2D context-aware block that enable MamMIL to learn bidirectional instance dependencies with 2D spatial relationships. Experiments on two datasets show that MamMIL can achieve advanced classification performance with a smaller memory footprint than state-of-the-art Transformer-based MIL frameworks. The code will be open-sourced if accepted.
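To make the idea of bidirectional, linear-complexity instance modeling concrete, the following is a minimal sketch and not the authors' implementation. It assumes each instance is reduced to a single scalar feature and uses a fixed scalar linear recurrence in place of Mamba's selective (input-dependent) state space parameters. The 2D context-aware block is omitted. All function names and the parameters `a`, `b`, and `c` are hypothetical.

```python
from typing import List


def ssm_scan(xs: List[float], a: float = 0.9, b: float = 1.0, c: float = 1.0) -> List[float]:
    """Scalar linear state space recurrence (assumed, non-selective):
    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t.
    One pass over the sequence, so cost is linear in the number of instances."""
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def bidirectional_scan(xs: List[float], a: float = 0.9, b: float = 1.0, c: float = 1.0) -> List[float]:
    """Run the recurrence forward and backward and sum the two outputs,
    so every instance receives context from both earlier and later instances."""
    fwd = ssm_scan(xs, a, b, c)
    # Scan the reversed sequence, then flip the result back to the original order.
    bwd = ssm_scan(xs[::-1], a, b, c)[::-1]
    return [f + g for f, g in zip(fwd, bwd)]


def bag_embedding(xs: List[float], a: float = 0.9) -> float:
    """Mean-pool the bidirectional outputs into one bag-level value
    (a stand-in for the slide-level representation fed to a classifier)."""
    if not xs:
        raise ValueError("bag must contain at least one instance")
    ys = bidirectional_scan(xs, a=a)
    return sum(ys) / len(ys)


if __name__ == "__main__":
    # A bag whose only non-zero instance is the last one.
    bag = [0.0, 0.0, 1.0]
    print(ssm_scan(bag, a=0.5))            # forward only: first instances see nothing
    print(bidirectional_scan(bag, a=0.5))  # backward pass carries the signal to them
```

In this toy setting, a forward-only scan leaves the first two instances with zero output because the informative instance comes last, while the bidirectional variant propagates it backward. Both passes visit each instance once, so the total cost grows linearly with bag size, in contrast to the pairwise comparisons of self-attention.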

Comments:	11 pages, 2 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.05160 [cs.CV]
	(or arXiv:2403.05160v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.05160

Submission history

From: Zijie Fang
[v1] Fri, 8 Mar 2024 09:02:13 UTC (459 KB)
