
Showing 1–8 of 8 results for author: Drosou, A

  1. arXiv:2406.17159  [pdf, other]

    eess.AS cs.MM cs.SD

    Exploring compressibility of transformer based text-to-music (TTM) models

    Authors: Vasileios Moschopoulos, Thanasis Kotsiopoulos, Pablo Peso Parada, Konstantinos Nikiforidis, Alexandros Stergiadis, Gerasimos Papakostas, Md Asif Jalal, Jisi Zhang, Anastasios Drosou, Karthikeyan Saravanan

    Abstract: State-of-the-art Text-To-Music (TTM) generative AI models are large and require desktop- or server-class compute, making them infeasible for deployment on mobile phones. This paper presents an analysis of trade-offs between model compression and generation performance of TTM models. We study compression through knowledge distillation and specific modifications that enable applicability over the var…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Proceedings of INTERSPEECH 2024

  2. arXiv:2403.04508  [pdf, other]

    cs.CV cs.GR

    Finding Waldo: Towards Efficient Exploration of NeRF Scene Spaces

    Authors: Evangelos Skartados, Mehmet Kerim Yucel, Bruno Manganelli, Anastasios Drosou, Albert Saà-Garriga

    Abstract: Neural Radiance Fields (NeRF) have quickly become the primary approach for 3D reconstruction and novel view synthesis in recent years due to their remarkable performance. Despite the huge interest in NeRF methods, a practical use case of NeRFs has largely been ignored: the exploration of the scene space modelled by a NeRF. In this paper, for the first time in the literature, we propose and formall…

    Submitted 8 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Accepted at ACM MMSys'24

  3. arXiv:2401.13146  [pdf, other]

    eess.AS cs.CL cs.SD

    Locality enhanced dynamic biasing and sampling strategies for contextual ASR

    Authors: Md Asif Jalal, Pablo Peso Parada, George Pavlidis, Vasileios Moschopoulos, Karthikeyan Saravanan, Chrysovalantis-Giorgos Kontoulis, Jisi Zhang, Anastasios Drosou, Gil Ho Lee, Jungin Lee, Seokyeong Jung

    Abstract: Automatic Speech Recognition (ASR) still faces challenges when recognizing time-variant rare phrases. Contextual biasing (CB) modules bias the ASR model towards such contextually relevant phrases. During training, a list of biasing phrases is selected from a large pool of phrases following a sampling strategy. In this work we first analyse different sampling strategies to provide insights into the t…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted for IEEE ASRU 2023

  4. arXiv:2306.15377  [pdf, other]

    cs.CV

    TrickVOS: A Bag of Tricks for Video Object Segmentation

    Authors: Evangelos Skartados, Konstantinos Georgiadis, Mehmet Kerim Yucel, Koskinas Ioannis, Armando Domi, Anastasios Drosou, Bruno Manganelli, Albert Saa-Garriga

    Abstract: Space-time memory (STM) network methods have been dominant in semi-supervised video object segmentation (SVOS) due to their remarkable performance. In this work, we identify three key aspects where we can improve such methods: i) supervisory signal, ii) pretraining and iii) spatial awareness. We then propose TrickVOS, a generic, method-agnostic bag of tricks addressing each aspect with i) a struct…

    Submitted 28 June, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: Accepted to ICIP 2023

  5. arXiv:2303.12862  [pdf, other]

    cs.CV

    LP-IOANet: Efficient High Resolution Document Shadow Removal

    Authors: Konstantinos Georgiadis, M. Kerim Yucel, Evangelos Skartados, Valia Dimaridou, Anastasios Drosou, Albert Saa-Garriga, Bruno Manganelli

    Abstract: Document shadow removal is an integral task in document enhancement pipelines, as it improves visibility, readability and thus the overall quality. Assuming that the majority of practical document shadow removal scenarios require real-time, accurate models that can produce high-resolution outputs in-the-wild, we propose Laplacian Pyramid with Input/Output Attention Network (LP-IOANet), a novel pip…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: ICASSP 2023

  6. arXiv:2210.16078  [pdf, other]

    cs.CV

    Adaptive Mask-based Pyramid Network for Realistic Bokeh Rendering

    Authors: Konstantinos Georgiadis, Albert Saà-Garriga, Mehmet Kerim Yucel, Anastasios Drosou, Bruno Manganelli

    Abstract: The bokeh effect highlights an object (or any part of the image) while blurring the rest of the image, creating a visually pleasant artistic effect. Due to the sensor-based limitations on mobile devices, machine learning (ML) based bokeh rendering has gained attention as a reliable alternative. In this paper, we focus on several improvements in ML-based bokeh rendering: i) on-device performance wit…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: ECCV 2022 Advances in Image Manipulation Workshop. See the workshop website for posters and recordings

  7. arXiv:2105.12053  [pdf, other]

    cs.CV

    Real-time Monocular Depth Estimation with Sparse Supervision on Mobile

    Authors: Mehmet Kerim Yucel, Valia Dimaridou, Anastasios Drosou, Albert Saà-Garriga

    Abstract: Monocular (relative or metric) depth estimation is a critical task for various applications, such as autonomous vehicles, augmented reality and image editing. In recent years, with the increasing availability of mobile devices, accurate and mobile-friendly depth models have gained importance. Increasingly accurate models typically require more computational resources, which inhibits the use of suc…

    Submitted 25 May, 2021; originally announced May 2021.

    Comments: To appear at CVPR 2021 Mobile AI (MAI) Workshop

  8. Admission and Congestion Control for 5G Network Slicing

    Authors: Bin Han, Antonio De Domenico, Ghina Dandachi, Anastasios Drosou, Dimitrios Tzovaras, Roberto Querio, Fabrizio Moggio, Ömer Bulakci, Hans D. Schotten

    Abstract: Network slicing has been widely accepted as an essential feature of future 5th Generation (5G) mobile communication networks. Given the potentially dense demand for network slices as a cloud service and the limited resources of mobile network operators (MNOs), efficient inter-slice management and orchestration plays a key role in 5G networks. This calls for advanced solutions for slice admission an…

    Submitted 31 August, 2018; originally announced September 2018.

    Comments: Submitted to 2018 IEEE Conference on Standards for Communications and Networking (CSCN)

    Journal ref: 2018 IEEE Conference on Standards for Communications and Networking (CSCN)