
Showing 1–23 of 23 results for author: Savakis, A

  1. arXiv:2405.00293  [pdf, other]

    cs.CV

    MoPEFT: A Mixture-of-PEFTs for the Segment Anything Model

    Authors: Rajat Sahay, Andreas Savakis

    Abstract: The emergence of foundation models, such as the Segment Anything Model (SAM), has sparked interest in Parameter-Efficient Fine-Tuning (PEFT) methods that tailor these large models to application domains outside their training data. However, different PEFT techniques modify the representation of a model differently, making it a non-trivial task to select the most appropriate method for the domain o…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: Workshop on Foundation Models, CVPR 2024

  2. arXiv:2401.11243  [pdf, ps, other]

    cs.CV

    LRP-QViT: Mixed-Precision Vision Transformer Quantization via Layer-wise Relevance Propagation

    Authors: Navin Ranjan, Andreas Savakis

    Abstract: Vision transformers (ViTs) have demonstrated remarkable performance across various visual tasks. However, ViT models suffer from substantial computational and memory requirements, making it challenging to deploy them on resource-constrained platforms. Quantization is a popular approach for reducing model size, but most studies mainly focus on equal bit-width quantization for the entire network, re…

    Submitted 20 January, 2024; originally announced January 2024.

  3. arXiv:2312.03767  [pdf, other]

    cs.CV cs.AI

    Unknown Sample Discovery for Source Free Open Set Domain Adaptation

    Authors: Chowdhury Sadman Jahan, Andreas Savakis

    Abstract: Open Set Domain Adaptation (OSDA) aims to adapt a model trained on a source domain to a target domain that undergoes distribution shift and contains samples from novel classes outside the source domain. Source-free OSDA (SF-OSDA) techniques eliminate the need to access source domain samples, but current SF-OSDA methods utilize only the known classes in the target domain for adaptation, and require…

    Submitted 5 December, 2023; originally announced December 2023.

  4. arXiv:2308.00956  [pdf, other]

    cs.CV cs.LG

    Curriculum Guided Domain Adaptation in the Dark

    Authors: Chowdhury Sadman Jahan, Andreas Savakis

    Abstract: Addressing the rising concerns of privacy and security, domain adaptation in the dark aims to adapt a black-box source trained model to an unlabeled target domain without access to any source data or source model parameters. The need for domain adaptation of black-box predictors becomes even more pronounced to protect intellectual property as deep learning based solutions are becoming increasingly…

    Submitted 2 August, 2023; originally announced August 2023.

  5. arXiv:2308.00924  [pdf, other]

    cs.CV cs.LG

    Continual Domain Adaptation on Aerial Images under Gradually Degrading Weather

    Authors: Chowdhury Sadman Jahan, Andreas Savakis

    Abstract: Domain adaptation (DA) strives to mitigate the domain gap between the source domain where a model is trained, and the target domain where the model is deployed. When a deep learning model is deployed on an aerial platform, it may face gradually degrading weather conditions during operation, leading to widening domain gaps between the training data and the encountered evaluation data. We synthesize…

    Submitted 14 August, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

  6. arXiv:2205.14474  [pdf, other]

    cs.CV

    DeepRM: Deep Recurrent Matching for 6D Pose Refinement

    Authors: Alexander Avery, Andreas Savakis

    Abstract: Precise 6D pose estimation of rigid objects from RGB images is a critical but challenging task in robotics, augmented reality and human-computer interaction. To address this problem, we propose DeepRM, a novel recurrent network architecture for 6D pose refinement. DeepRM leverages initial coarse pose estimates to render synthetic images of target objects. The rendered images are then matched with…

    Submitted 16 June, 2023; v1 submitted 28 May, 2022; originally announced May 2022.

    Comments: 9 pages, 3 figures, CVPR 2023 RHOBIN Workshop

  7. arXiv:2112.10716  [pdf, other]

    cs.CV

    BAPose: Bottom-Up Pose Estimation with Disentangled Waterfall Representations

    Authors: Bruno Artacho, Andreas Savakis

    Abstract: We propose BAPose, a novel bottom-up approach that achieves state-of-the-art results for multi-person pose estimation. Our end-to-end trainable framework leverages a disentangled multi-scale waterfall architecture and incorporates adaptive convolutions to infer keypoints more precisely in crowded scenes with occlusions. The multi-scale representations, obtained by the disentangled waterfall module…

    Submitted 20 December, 2021; originally announced December 2021.

  8. arXiv:2105.06033  [pdf, other]

    cs.CV

    Extreme Face Inpainting with Sketch-Guided Conditional GAN

    Authors: Nilesh Pandey, Andreas Savakis

    Abstract: Recovering badly damaged face images is a useful yet challenging task, especially in extreme cases where the masked or damaged region is very large. One of the major challenges is the ability of the system to generalize on faces outside the training dataset. We propose to tackle this extreme inpainting task with a conditional Generative Adversarial Network (GAN) that utilizes structural informatio…

    Submitted 12 May, 2021; originally announced May 2021.

  9. arXiv:2104.08112  [pdf, other]

    cs.CV

    Grassmann Iterative Linear Discriminant Analysis with Proxy Matrix Optimization

    Authors: Navya Nagananda, Breton Minnehan, Andreas Savakis

    Abstract: Linear Discriminant Analysis (LDA) is commonly used for dimensionality reduction in pattern recognition and statistics. It is a supervised method that aims to find the most discriminant space of reduced dimension that can be further used for classification. In this work, we present a Grassmann Iterative LDA method (GILDA) that is based on Proxy Matrix Optimization (PMO). PMO makes use of automatic…

    Submitted 16 April, 2021; originally announced April 2021.

  10. arXiv:2104.03510  [pdf, other]

    cs.CV

    SiamReID: Confuser Aware Siamese Tracker with Re-identification Feature

    Authors: Abu Md Niamul Taufique, Andreas Savakis, Michael Braun, Daniel Kubacki, Ethan Dell, Lei Qian, Sean M. O'Rourke

    Abstract: Siamese deep-network trackers have received significant attention in recent years due to their real-time speed and state-of-the-art performance. However, Siamese trackers suffer from similar-looking confusers that are prevalent in aerial imagery and create challenging conditions due to prolonged occlusions where the tracked object re-appears under different pose and illumination. Our work propose…

    Submitted 15 April, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

    Comments: 6 pages, 5 figures

  11. Benchmarking Deep Trackers on Aerial Videos

    Authors: Abu Md Niamul Taufique, Breton Minnehan, Andreas Savakis

    Abstract: In recent years, deep learning-based visual object trackers have achieved state-of-the-art performance on several visual object tracking benchmarks. However, most tracking benchmarks are focused on ground level videos, whereas aerial tracking presents a new set of challenges. In this paper, we compare ten trackers based on deep learning techniques on four aerial datasets. We choose top performing…

    Submitted 23 March, 2021; originally announced March 2021.

    Comments: 25 pages, 10 figures, 7 tables

    ACM Class: I.4; I.5

    Journal ref: Sensors 2020, 20(2), 547

  12. Visualization of Deep Transfer Learning In SAR Imagery

    Authors: Abu Md Niamul Taufique, Navya Nagananda, Andreas Savakis

    Abstract: Synthetic Aperture Radar (SAR) imagery has diverse applications in land and marine surveillance. Unlike electro-optical (EO) systems, these systems are not affected by weather conditions and can be used both day and night. With the growing importance of SAR imagery, it would be desirable if models trained on widely available EO datasets could also be used for SAR images. In this work, we con…

    Submitted 19 March, 2021; originally announced March 2021.

    Comments: 4 pages, 5 figures

    ACM Class: I.4; I.5

    Journal ref: IGARSS 2020 - 2020 IEEE International Geoscience and Remote Sensing Symposium

  13. Automatic Quantification of Facial Asymmetry using Facial Landmarks

    Authors: Abu Md Niamul Taufique, Andreas Savakis, Jonathan Leckenby

    Abstract: One-sided facial paralysis causes uneven movements of facial muscles on the sides of the face. Physicians currently assess facial asymmetry in a subjective manner based on their clinical experience. This paper proposes a novel method to provide an objective and quantitative asymmetry score for frontal faces. Our metric has the potential to help physicians with diagnosis as well as monitoring the re…

    Submitted 19 March, 2021; originally announced March 2021.

    Comments: 5 pages, 4 figures

    ACM Class: I.4; I.5

    Journal ref: 2019 IEEE Western New York Image and Signal Processing Workshop (WNYISPW)

  14. arXiv:2103.11056  [pdf, other]

    cs.CV

    ConDA: Continual Unsupervised Domain Adaptation

    Authors: Abu Md Niamul Taufique, Chowdhury Sadman Jahan, Andreas Savakis

    Abstract: Domain Adaptation (DA) techniques are important for overcoming the domain shift between the source domain used for training and the target domain where testing takes place. However, current DA methods assume that the entire target domain is available during adaptation, which may not hold in practice. This paper considers a more realistic scenario, where target data become available in smaller batc…

    Submitted 7 April, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

    Comments: 10 pages, 4 figures

    ACM Class: I.4; I.5

  15. arXiv:2103.10180  [pdf, other]

    cs.CV cs.LG eess.IV

    OmniPose: A Multi-Scale Framework for Multi-Person Pose Estimation

    Authors: Bruno Artacho, Andreas Savakis

    Abstract: We propose OmniPose, a single-pass, end-to-end trainable framework, that achieves state-of-the-art results for multi-person pose estimation. Using a novel waterfall module, the OmniPose architecture leverages multi-scale feature representations that increase the effectiveness of backbone feature extractors, without the need for post-processing. OmniPose incorporates contextual information across s…

    Submitted 18 March, 2021; originally announced March 2021.

    Comments: arXiv admin note: text overlap with arXiv:2001.08095

  16. arXiv:2011.14417  [pdf, other]

    cs.CV

    LABNet: Local Graph Aggregation Network with Class Balanced Loss for Vehicle Re-Identification

    Authors: Abu Md Niamul Taufique, Andreas Savakis

    Abstract: Vehicle re-identification is an important computer vision task where the objective is to identify a specific vehicle among a set of vehicles seen at various viewpoints. Recent methods based on deep learning utilize a global average pooling layer after the backbone feature extractor; however, this ignores any spatial reasoning on the feature map. In this paper, we propose local graph aggregation on…

    Submitted 30 January, 2021; v1 submitted 29 November, 2020; originally announced November 2020.

    Comments: 12 pages, 4 figures

  17. arXiv:2001.08095  [pdf, other]

    cs.CV

    UniPose: Unified Human Pose Estimation in Single Images and Videos

    Authors: Bruno Artacho, Andreas Savakis

    Abstract: We propose UniPose, a unified framework for human pose estimation, based on our "Waterfall" Atrous Spatial Pooling architecture, that achieves state-of-the-art results on several pose estimation metrics. Current pose estimation methods utilizing standard CNN architectures heavily rely on statistical postprocessing or predefined anchor poses for joint localization. UniPose incorporates contextual segme…

    Submitted 22 January, 2020; originally announced January 2020.

  18. Waterfall Atrous Spatial Pooling Architecture for Efficient Semantic Segmentation

    Authors: Bruno Artacho, Andreas Savakis

    Abstract: We propose a new efficient architecture for semantic segmentation, based on a "Waterfall" Atrous Spatial Pooling architecture, that achieves a considerable accuracy increase while decreasing the number of network parameters and memory footprint. The proposed Waterfall architecture leverages the efficiency of progressive filtering in the cascade architecture while maintaining multiscale fields-of-v…

    Submitted 6 December, 2019; originally announced December 2019.

    Comments: 17 pages, 11 figures

    Journal ref: Sensors, 19(24), 2019

  19. arXiv:1909.02165  [pdf, other]

    cs.CV cs.GR eess.IV

    Poly-GAN: Multi-Conditioned GAN for Fashion Synthesis

    Authors: Nilesh Pandey, Andreas Savakis

    Abstract: We present Poly-GAN, a novel conditional GAN architecture that is motivated by Fashion Synthesis, an application where garments are automatically placed on images of human models at an arbitrary pose. Poly-GAN allows conditioning on multiple inputs and is suitable for many tasks, including image alignment, image stitching, and inpainting. Existing methods have a similar pipeline where three differ…

    Submitted 4 September, 2019; originally announced September 2019.

  20. arXiv:1903.04988  [pdf, other]

    cs.CV cs.LG

    Cascaded Projection: End-to-End Network Compression and Acceleration

    Authors: Breton Minnehan, Andreas Savakis

    Abstract: We propose a data-driven approach for deep convolutional neural network compression that achieves high accuracy with high throughput and low memory requirements. Current network compression methods either find a low-rank factorization of the features that requires more memory, or select only a subset of features by pruning entire filter channels. We propose the Cascaded Projection (CaP) compressio…

    Submitted 12 March, 2019; originally announced March 2019.

  21. arXiv:1809.10274  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    Semantically Invariant Text-to-Image Generation

    Authors: Shagan Sah, Dheeraj Peri, Ameya Shringi, Chi Zhang, Miguel Dominguez, Andreas Savakis, Ray Ptucha

    Abstract: Image captioning has demonstrated models that are capable of generating plausible text given input images or videos. Further, recent work in image generation has shown significant improvements in image quality when text is used as a prior. Our work ties these concepts together by creating an architecture that can enable bidirectional generation of images and text. We call this network Multi-Modal…

    Submitted 26 September, 2018; originally announced September 2018.

    Comments: 5 pages, 5 figures, Published in 2018 25th IEEE International Conference on Image Processing (ICIP)

  22. arXiv:1806.07688  [pdf, other]

    cs.CV

    DEFRAG: Deep Euclidean Feature Representations through Adaptation on the Grassmann Manifold

    Authors: Breton Minnehan, Andreas Savakis

    Abstract: We propose a novel technique for training deep networks with the objective of obtaining feature representations that exist in a Euclidean space and exhibit strong clustering behavior. Our desired feature representations have three traits: they can be compared using a standard Euclidean distance metric, samples from the same class are tightly clustered, and samples from different classes are well…

    Submitted 20 June, 2018; originally announced June 2018.

  23. arXiv:1612.00390  [pdf]

    cs.CV

    Anomaly Detection in Video Using Predictive Convolutional Long Short-Term Memory Networks

    Authors: Jefferson Ryan Medel, Andreas Savakis

    Abstract: Automating the detection of anomalous events within long video sequences is challenging due to the ambiguity of how such events are defined. We approach the problem by learning generative models that can identify anomalies in videos using limited supervision. We propose end-to-end trainable composite Convolutional Long Short-Term Memory (Conv-LSTM) networks that are able to predict the evolution o…

    Submitted 15 December, 2016; v1 submitted 1 December, 2016; originally announced December 2016.