Showing 1–5 of 5 results for author: Kakogeorgiou, I

  1. arXiv:2405.15587

    cs.CV

    Composed Image Retrieval for Remote Sensing

    Authors: Bill Psomas, Ioannis Kakogeorgiou, Nikos Efthymiadis, Giorgos Tolias, Ondrej Chum, Yannis Avrithis, Konstantinos Karantzalos

    Abstract: This work introduces composed image retrieval to remote sensing. It allows to query a large image archive by image examples alternated by a textual description, enriching the descriptive power over unimodal queries, either visual or textual. Various attributes can be modified by the textual part, such as shape, color, or context. A novel method fusing image-to-image and text-to-image similarity is…

    Submitted 29 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted for ORAL presentation at the 2024 IEEE International Geoscience and Remote Sensing Symposium

  2. arXiv:2312.00648

    cs.CV

    SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers

    Authors: Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos, Nikos Komodakis

    Abstract: Unsupervised object-centric learning aims to decompose scenes into interpretable object entities, termed slots. Slot-based auto-encoders stand out as a prominent method for this task. Within them, crucial aspects include guiding the encoder to generate object-specific slots and ensuring the decoder utilizes them during reconstruction. This work introduces two novel techniques, (i) an attention-bas…

    Submitted 5 April, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 (Highlight). Code: https://github.com/gkakogeorgiou/spot

  3. arXiv:2309.06891

    cs.CV cs.LG

    Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?

    Authors: Bill Psomas, Ioannis Kakogeorgiou, Konstantinos Karantzalos, Yannis Avrithis

    Abstract: Convolutional networks and vision transformers have different forms of pairwise interactions, pooling across layers and pooling at the end of the network. Does the latter really need to be different? As a by-product of pooling, vision transformers provide spatial attention for free, but this is most often of low quality unless self-supervised, which is not well studied. Is supervision really the p…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: ICCV 2023. Code and models: https://github.com/billpsomas/simpool

    Journal ref: International Conference on Computer Vision (2023)

  4. What to Hide from Your Students: Attention-Guided Masked Image Modeling

    Authors: Ioannis Kakogeorgiou, Spyros Gidaris, Bill Psomas, Yannis Avrithis, Andrei Bursuc, Konstantinos Karantzalos, Nikos Komodakis

    Abstract: Transformers and masked language modeling are quickly being adopted and explored in computer vision as vision transformers and masked image modeling (MIM). In this work, we argue that image token masking differs from token masking in text, due to the amount and correlation of tokens in an image. In particular, to generate a challenging pretext task for MIM, we advocate a shift from random masking…

    Submitted 22 July, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: ECCV 2022. Codes and models are available at https://github.com/gkakogeorgiou/attmask

    Journal ref: European Conference on Computer Vision (2022)

  5. Evaluating explainable artificial intelligence methods for multi-label deep learning classification tasks in remote sensing

    Authors: Ioannis Kakogeorgiou, Konstantinos Karantzalos

    Abstract: Although deep neural networks hold the state-of-the-art in several remote sensing tasks, their black-box operation hinders the understanding of their decisions, concealing any bias and other shortcomings in datasets and model performance. To this end, we have applied explainable artificial intelligence (XAI) methods in remote sensing multi-label classification tasks towards producing human-interpr…

    Submitted 20 September, 2021; v1 submitted 3 April, 2021; originally announced April 2021.

    Journal ref: International Journal of Applied Earth Observation and Geoinformation 103 (2021) 102520