Skip to main content

Showing 1–2 of 2 results for author: Cid, Y D

Searching in archive cs. Search in all archives.
.
  1. arXiv:2401.08868  [pdf, other

    cs.CV

    B-Cos Aligned Transformers Learn Human-Interpretable Features

    Authors: Manuel Tran, Amal Lahiani, Yashin Dicente Cid, Melanie Boxberg, Peter Lienemann, Christian Matek, Sophia J. Wagner, Fabian J. Theis, Eldad Klaiman, Tingying Peng

    Abstract: Vision Transformers (ViTs) and Swin Transformers (Swin) are currently state-of-the-art in computational pathology. However, domain experts are still reluctant to use these models due to their lack of interpretability. This is not surprising, as critical decisions need to be transparent and understandable. The most common approach to understanding transformers is to visualize their attention. Howev… ▽ More

    Submitted 18 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted at MICCAI 2023 (oral). Camera-ready available at https://doi.org/10.1007/978-3-031-43993-3_50

  2. arXiv:2305.14243  [pdf, other

    cs.AI cs.CV

    Training Transitive and Commutative Multimodal Transformers with LoReTTa

    Authors: Manuel Tran, Yashin Dicente Cid, Amal Lahiani, Fabian J. Theis, Tingying Peng, Eldad Klaiman

    Abstract: Training multimodal foundation models is challenging due to the limited availability of multimodal datasets. While many public datasets pair images with text, few combine images with audio or text with audio. Even rarer are datasets that align all three modalities at once. Critical domains such as healthcare, infrastructure, or transportation are particularly affected by missing modalities. This m… ▽ More

    Submitted 16 January, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted at NeurIPS 2023 (poster). Camera-ready version