Showing 1–50 of 67 results for author: Cavallaro, A

  1. arXiv:2406.16384  [pdf, other]

    cs.CV

    High-resolution open-vocabulary object 6D pose estimation

    Authors: Jaime Corsetti, Davide Boscaini, Francesco Giuliari, Changjae Oh, Andrea Cavallaro, Fabio Poiesi

    Abstract: The generalisation to unseen objects in the 6D pose estimation task is very challenging. While Vision-Language Models (VLMs) enable using natural language descriptions to support 6D pose estimation of unseen objects, these solutions underperform compared to model-based methods. In this work we present Horyon, an open-vocabulary VLM-based architecture that addresses relative pose estimation between…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Technical report. Extension of CVPR paper "Open-vocabulary object 6D pose estimation". Project page: https://jcorsetti.github.io/oryon

  2. arXiv:2405.01646  [pdf, other]

    cs.CV

    Explaining models relating objects and privacy

    Authors: Alessio Xompero, Myriam Bontonou, Jean-Michel Arbona, Emmanouil Benetos, Andrea Cavallaro

    Abstract: Accurately predicting whether an image is private before sharing it online is difficult due to the vast variety of content and the subjective nature of privacy itself. In this paper, we evaluate privacy models that use objects extracted from an image to determine why the image is predicted as private. To explain the decision of these models, we use feature-attribution to identify and quantify whic…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 7 pages, 3 figures, 1 table, supplementary material included as Appendix. Paper accepted at the 3rd XAI4CV Workshop at CVPR 2024. Code: https://github.com/graphnex/ig-privacy

  3. arXiv:2405.01353  [pdf, other]

    cs.CV

    Sparse multi-view hand-object reconstruction for unseen environments

    Authors: Yik Lung Pang, Changjae Oh, Andrea Cavallaro

    Abstract: Recent works in hand-object reconstruction mainly focus on the single-view and dense multi-view settings. On the one hand, single-view methods can leverage learned shape priors to generalise to unseen objects but are prone to inaccuracies due to occlusions. On the other hand, dense multi-view methods are very accurate but cannot easily adapt to unseen objects without further data collection. In co…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Camera-ready version. Paper accepted to CVPRW 2024. 8 pages, 7 figures, 1 table

  4. arXiv:2312.00690  [pdf, other]

    cs.CV

    Open-vocabulary object 6D pose estimation

    Authors: Jaime Corsetti, Davide Boscaini, Changjae Oh, Andrea Cavallaro, Fabio Poiesi

    Abstract: We introduce the new setting of open-vocabulary object 6D pose estimation, in which a textual prompt is used to specify the object of interest. In contrast to existing approaches, in our setting (i) the object of interest is specified solely through the textual prompt, (ii) no object model (e.g., CAD or video sequence) is required at inference, and (iii) the object is imaged from two RGBD viewpoin…

    Submitted 25 June, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: Camera ready version (CVPR 2024, poster highlight). New Oryon version: arXiv:2406.16384

  5. Human-interpretable and deep features for image privacy classification

    Authors: Darya Baranouskaya, Andrea Cavallaro

    Abstract: Privacy is a complex, subjective and contextual concept that is difficult to define. Therefore, the annotation of images to train privacy classifiers is a challenging task. In this paper, we analyse privacy classification datasets and the properties of controversial images that are annotated with contrasting privacy labels by different assessors. We discuss suitable features for image privacy clas…

    Submitted 31 October, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

    Journal ref: 2023 IEEE International Conference on Image Processing (ICIP), Kuala Lumpur, Malaysia, 2023, pp. 3489-3492

  6. SOUL at LBT: commissioning results, science and future

    Authors: Enrico Pinna, Fabio Rossi, Guido Agapito, Alfio Puglisi, Cédric Plantet, Essna Ghose, Matthieu Bec, Marco Bonaglia, Runa Briguglio, Guido Brusa, Luca Carbonaro, Alessandro Cavallaro, Julian Christou, Olivier Durney, Steve Ertel, Simone Esposito, Paolo Grani, Juan Carlos Guerra, Philip Hinz, Michael Lefebvre, Tommaso Mazzoni, Brandon Mechtley, Douglas L. Miller, Manny Montoya, Jennifer Power, et al. (5 additional authors not shown)

    Abstract: The SOUL systems at the Large Binocular Telescope can be seen as precursors of the ELT SCAO systems, combining key technologies such as EMCCD, Pyramid WFS and adaptive telescopes. After the first light of the first upgraded system in September 2018, going through COVID and technical stops, we now have all 4 systems working on-sky. Here, we report on some key control improvemen…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: 13 pages, 10 figures, Adaptive Optics for Extremely Large Telescopes 7th Edition, 25-30 Jun 2023 Avignon (France)

    Journal ref: AO4ELT7 proceedings 2023

  7. arXiv:2310.00503  [pdf, other]

    cs.CV

    Black-box Attacks on Image Activity Prediction and its Natural Language Explanations

    Authors: Alina Elena Baia, Valentina Poggioni, Andrea Cavallaro

    Abstract: Explainable AI (XAI) methods aim to describe the decision process of deep neural networks. Early XAI methods produced visual explanations, whereas more recent techniques generate multimodal explanations that include textual information and visual representations. Visual XAI methods have been shown to be vulnerable to white-box and gray-box adversarial attacks, with an attacker having full or parti…

    Submitted 30 September, 2023; originally announced October 2023.

    Comments: Accepted at ICCV2023 AROW Workshop

  8. arXiv:2308.11233  [pdf, other]

    cs.CV

    Affordance segmentation of hand-occluded containers from exocentric images

    Authors: Tommaso Apicella, Alessio Xompero, Edoardo Ragusa, Riccardo Berta, Andrea Cavallaro, Paolo Gastaldo

    Abstract: Visual affordance segmentation identifies the surfaces of an object an agent can interact with. Common challenges for the identification of affordances are the variety of the geometry and physical properties of these surfaces as well as occlusions. In this paper, we focus on occlusions of an object that is hand-held by a person manipulating it. To address this challenge, we propose an affordance s…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: Paper accepted to Workshop on Assistive Computer Vision and Robotics (ACVR) in International Conference on Computer Vision (ICCV) 2023; 10 pages, 4 figures, 2 tables. Data, code, and trained models are available at https://apicis.github.io/projects/acanet.html

  9. arXiv:2306.10887  [pdf]

    cond-mat.mtrl-sci

    Ion Intercalation in Lanthanum Strontium Ferrite for Aqueous Electrochemical Energy Storage Devices

    Authors: Yunqing Tang, Francesco Chiabrera, Alex Morata, Andrea Cavallaro, Maciej O. Liedke, Hemesh Avireddy, Mar Maller, Maik Butterling, Andreas Wagner, Michel Stchakovsky, Federico Baiutti, Ainara Aguadero, Albert Tarancón

    Abstract: Ion intercalation of perovskite oxides in liquid electrolytes is a very promising method for controlling their functional properties while storing charge, which opens the potential application in different energy and information technologies. Although the role of defect chemistry in the oxygen intercalation in a gaseous environment is well established, the mechanism of ion intercalation in liquid…

    Submitted 19 June, 2023; originally announced June 2023.

    Journal ref: ACS Appl. Mater. Interfaces 2022, 14, 18486

  10. arXiv:2305.07396  [pdf, other]

    astro-ph.GA

    GMP-selected dual and lensed AGNs: selection function and classification based on near-IR colors and resolved spectra from VLT/ERIS, KECK/OSIRIS, and LBT/LUCI

    Authors: F. Mannucci, M. Scialpi, A. Ciurlo, S. Yeh, C. Marconcini, G. Tozzi, G. Cresci, A. Marconi, A. Amiri, F. Belfiore, S. Carniani, C. Cicone, E. Nardini, E. Pancino, K. Rubinur, P. Severgnini, L. Ulivi, G. Venturi, C. Vignali, M. Volonteri, E. Pinna, F. Rossi, A. Puglisi, G. Agapito, C. Plantet, et al. (22 additional authors not shown)

    Abstract: The Gaia-Multi-Peak (GMP) technique can be used to identify large numbers of dual or lensed AGN candidates at sub-arcsec separation, allowing us to study both multiple SMBHs in the same galaxy and rare, compact lensed systems. The observed samples can be used to test the predictions of the models of SMBH merging once 1) the selection function of the GMP technique is known, and 2) each system has b…

    Submitted 9 October, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: 14 pages, A&A, in press

  11. arXiv:2211.10470  [pdf, other]

    cs.CV

    A mixed-reality dataset for category-level 6D pose and size estimation of hand-occluded containers

    Authors: Xavier Weber, Alessio Xompero, Andrea Cavallaro

    Abstract: Estimating the 6D pose and size of household containers is challenging due to large intra-class variations in the object properties, such as shape, size, appearance, and transparency. The task is made more difficult when these objects are held and manipulated by a person due to varying degrees of hand occlusions caused by the type of grasps and by the viewpoint of the camera observing the person h…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: 5 pages, 4 figures, 1 table. Submitted to IEEE ICASSP 2023. Webpage at https://corsmal.eecs.qmul.ac.uk/pose.html

  12. arXiv:2210.11169  [pdf, other]

    cs.CV

    Content-based Graph Privacy Advisor

    Authors: Dimitrios Stoidis, Andrea Cavallaro

    Abstract: People may be unaware of the privacy risks of uploading an image online. In this paper, we present Graph Privacy Advisor, an image privacy classifier that uses scene information and object cardinality as cues to predict whether an image is private. Graph Privacy Advisor simplifies a state-of-the-art graph model and improves its performance by refining the relevance of the information extracted fro…

    Submitted 13 November, 2022; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: 8 pages, 3 figures, in Proceedings of IEEE BigMM 2022

  13. arXiv:2209.05077  [pdf, other]

    cs.CV eess.IV

    BON: An extended public domain dataset for human activity recognition

    Authors: Girmaw Abebe Tadesse, Oliver Bent, Komminist Weldemariam, Md. Abrar Istiak, Taufiq Hasan, Andrea Cavallaro

    Abstract: A body-worn first-person vision (FPV) camera enables the extraction of a rich source of information on the environment from the subject's viewpoint. However, the research progress in wearable camera-based egocentric office activity understanding is slow compared to other activity environments (e.g., kitchen and outdoor ambulatory), mainly due to the lack of adequate datasets to train more sophisticated (e.…

    Submitted 12 September, 2022; originally announced September 2022.

  14. arXiv:2208.11661  [pdf, other]

    cs.CV

    Cross-Camera View-Overlap Recognition

    Authors: Alessio Xompero, Andrea Cavallaro

    Abstract: We propose a decentralised view-overlap recognition framework that operates across freely moving cameras without the need of a reference 3D map. Each camera independently extracts, aggregates into a hierarchical structure, and shares feature-point descriptors over time. A view overlap is recognised by view-matching and geometric validation to discard wrongly matched views. The proposed framework i…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: 17 pages, 5 figures, 2 tables. Accepted to International Workshop on Distributed Smart Cameras (IWDSC) at the 2022 European Conference on Computer Vision (ECCV2022)

  15. arXiv:2207.08726  [pdf, other]

    physics.plasm-ph

    Radiative pulsed L-mode operation in ARC-class reactors

    Authors: S. J. Frank, C. J. Perks, A. O. Nelson, T. Qian, S. Jin, A. J. Cavallaro, A. Rutkowski, A. H. Reiman, J. P. Freidberg, P. Rodriguez-Fernandez, D. G. Whyte

    Abstract: A new ARC-class, highly-radiative, pulsed, L-mode, burning plasma scenario is developed and evaluated as a candidate for future tokamak reactors. Pulsed inductive operation alleviates the stringent current drive requirements of steady-state reactors, and operation in L-mode affords ELM-free access to $\sim90\%$ core radiation fractions, significantly reducing the divertor power handling requiremen…

    Submitted 9 September, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

  16. arXiv:2207.05470  [pdf, other]

    cs.CV eess.IV

    On the limits of perceptual quality measures for enhanced underwater images

    Authors: Chau Yi Li, Andrea Cavallaro

    Abstract: The appearance of objects in underwater images is degraded by the selective attenuation of light, which reduces contrast and causes a colour cast. This degradation depends on the water environment, and increases with depth and with the distance of the object from the camera. Despite an increasing volume of works in underwater image enhancement and restoration, the lack of a commonly accepted evalu…

    Submitted 12 July, 2022; originally announced July 2022.

    Comments: Accepted in ICIP 2022

  17. arXiv:2207.01052  [pdf, other]

    cs.SD cs.LG eess.AS

    Generating gender-ambiguous voices for privacy-preserving speech recognition

    Authors: Dimitrios Stoidis, Andrea Cavallaro

    Abstract: Our voice encodes a uniquely identifiable pattern which can be used to infer private attributes, such as gender or identity, that an individual might wish not to reveal when using a speech recognition service. To prevent attribute inference attacks alongside speech recognition tasks, we present a generative adversarial network, GenGAN, that synthesises voices that conceal the gender or identity of…

    Submitted 3 July, 2022; originally announced July 2022.

    Comments: 5 pages, 4 figures, submitted to INTERSPEECH

  18. arXiv:2206.00772  [pdf, other]

    cs.LG cs.AI cs.CR

    On the reversibility of adversarial attacks

    Authors: Chau Yi Li, Ricardo Sánchez-Matilla, Ali Shahin Shamsabadi, Riccardo Mazzon, Andrea Cavallaro

    Abstract: Adversarial attacks modify images with perturbations that change the prediction of classifiers. These modified images, known as adversarial examples, expose the vulnerabilities of deep neural network classifiers. In this paper, we investigate the predictability of the mapping between the classes predicted for original images and for their corresponding adversarial examples. This predictability rel…

    Submitted 1 June, 2022; originally announced June 2022.

  19. arXiv:2203.04027  [pdf, other]

    cs.LG cs.CV cs.RO

    Data augmentation with mixtures of max-entropy transformations for filling-level classification

    Authors: Apostolos Modas, Andrea Cavallaro, Pascal Frossard

    Abstract: We address the problem of distribution shifts in test-time data with a principled data augmentation scheme for the task of content-level classification. In such a task, properties such as shape or transparency of test-time containers (cup or drinking glass) may differ from those represented in the training data. Dealing with such distribution shifts using standard augmentation schemes is challengi…

    Submitted 8 March, 2022; originally announced March 2022.

  20. arXiv:2203.02635  [pdf, other]

    cs.CV cs.CR cs.LG

    Training privacy-preserving video analytics pipelines by suppressing features that reveal information about private attributes

    Authors: Chau Yi Li, Andrea Cavallaro

    Abstract: Deep neural networks are increasingly deployed for scene analytics, including to evaluate the attention and reaction of people exposed to out-of-home advertisements. However, the features extracted by a deep neural network that was trained to predict a specific, consensual attribute (e.g. emotion) may also encode and thus reveal information about private, protected attributes (e.g. age or gender).…

    Submitted 1 June, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

  21. arXiv:2203.01977  [pdf, other]

    cs.MM cs.CV cs.RO cs.SD eess.AS

    Audio-Visual Object Classification for Human-Robot Collaboration

    Authors: A. Xompero, Y. L. Pang, T. Patten, A. Prabhakar, B. Calli, A. Cavallaro

    Abstract: Human-robot collaboration requires the contactless estimation of the physical properties of containers manipulated by a person, for example while pouring content in a cup or moving a food box. Acoustic and visual signals can be used to estimate the physical properties of such objects, which may vary substantially in shape, material and size, and also be occluded by the hands of the person. To faci…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: 5 pages, 2 figures, 1 table; accepted at ICASSP 2022; Challenge webpage, see https://corsmal.eecs.qmul.ac.uk/challenge.html

  22. arXiv:2202.09263  [pdf, other]

    cs.LG cs.MM

    Is Cross-Attention Preferable to Self-Attention for Multi-Modal Emotion Recognition?

    Authors: Vandana Rajan, Alessio Brutti, Andrea Cavallaro

    Abstract: Humans express their emotions via facial expressions, voice intonation and word choices. To infer the nature of the underlying emotion, recognition models may use a single modality, such as vision, audio, and text, or a combination of modalities. Generally, models that fuse complementary information from multiple modalities outperform their uni-modal counterparts. However, a successful model that…

    Submitted 18 February, 2022; originally announced February 2022.

    Comments: Accepted at ICASSP 2022

  23. arXiv:2112.02381  [pdf, other]

    cs.RO eess.SY

    Active Sensing for Search and Tracking: A Review

    Authors: Luca Varotto, Angelo Cenedese, Andrea Cavallaro

    Abstract: Active Position Estimation (APE) is the task of localizing one or more targets using one or more sensing platforms. APE is a key task for search and rescue missions, wildlife monitoring, source term estimation, and collaborative mobile robotics. Success in APE depends on the level of cooperation of the sensing platforms, their number, their degrees of freedom and the quality of the information gat…

    Submitted 4 December, 2021; originally announced December 2021.

    Comments: 26 pages, 5 tables, 3 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  24. arXiv:2108.03318  [pdf, other]

    cs.RO

    OHPL: One-shot Hand-eye Policy Learner

    Authors: Changjae Oh, Yik Lung Pang, Andrea Cavallaro

    Abstract: The control of a robot for manipulation tasks generally relies on object detection and pose estimation. An attractive alternative is to learn control policies directly from raw input data. However, this approach is time-consuming and expensive since learning the policy requires many trials with robot actions in the physical environment. To reduce the training cost, the policy can be learned in sim…

    Submitted 6 August, 2021; originally announced August 2021.

    Comments: Camera-ready version. Paper accepted to IROS 2021. 7 pages, 7 figures, 2 tables

  25. arXiv:2108.00809  [pdf, other]

    eess.IV

    Cross-Modal Knowledge Transfer via Inter-Modal Translation and Alignment for Affect Recognition

    Authors: Vandana Rajan, Alessio Brutti, Andrea Cavallaro

    Abstract: Multi-modal affect recognition models leverage complementary information in different modalities to outperform their uni-modal counterparts. However, due to the unavailability of modality-specific sensors or data, multi-modal models may not be always employable. For this reason, we aim to improve the performance of uni-modal affect recognition models by transferring knowledge from a better-perform…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: Under review

  26. arXiv:2107.12719  [pdf, other]

    cs.MM cs.CV cs.SD eess.AS

    The CORSMAL benchmark for the prediction of the properties of containers

    Authors: Alessio Xompero, Santiago Donaher, Vladimir Iashin, Francesca Palermo, Gökhan Solak, Claudio Coppola, Reina Ishikawa, Yuichi Nagao, Ryo Hachiuma, Qi Liu, Fan Feng, Chuanlin Lan, Rosa H. M. Chan, Guilherme Christmann, Jyun-Ting Song, Gonuguntla Neeharika, Chinnakotla Krishna Teja Reddy, Dinesh Jain, Bakhtawar Ur Rehman, Andrea Cavallaro

    Abstract: The contactless estimation of the weight of a container and the amount of its content manipulated by a person are key pre-requisites for safe human-to-robot handovers. However, opaqueness and transparencies of the container and the content, and variability of materials, shapes, and sizes, make this estimation difficult. In this paper, we present a range of methods and an open framework to benchmar…

    Submitted 21 April, 2022; v1 submitted 27 July, 2021; originally announced July 2021.

    Comments: Authors' post-print accepted for publication in IEEE Access, see https://doi.org/10.1109/ACCESS.2022.3166906 . 14 pages, 6 tables, 7 figures

    Journal ref: IEEE Access, vol. 10, 2022, 1-15

  27. arXiv:2107.01309  [pdf, other]

    cs.RO

    Towards safe human-to-robot handovers of unknown containers

    Authors: Yik Lung Pang, Alessio Xompero, Changjae Oh, Andrea Cavallaro

    Abstract: Safe human-to-robot handovers of unknown objects require accurate estimation of hand poses and object properties, such as shape, trajectory, and weight. Accurately estimating these properties requires the use of scanned 3D object models or expensive equipment, such as motion capture systems and markers, or both. However, testing handover algorithms with robots may be dangerous for the human and, w…

    Submitted 2 July, 2021; originally announced July 2021.

    Comments: Camera-ready version. Paper accepted to RO-MAN 2021. 8 pages, 8 figures, 1 table

  28. arXiv:2106.04528  [pdf, other]

    physics.plasm-ph physics.app-ph physics.comp-ph physics.data-an

    Modeling of Particle Transport, Neutrals and Radiation in Magnetically-Confined Plasmas with Aurora

    Authors: F. Sciortino, T. Odstrčil, A. Cavallaro, S. Smith, O. Meneghini, R. Reksoatmodjo, O. Linder, J. D. Lore, N. T. Howard, E. S. Marmar, S. Mordijck

    Abstract: We present Aurora, an open-source package for particle transport, neutrals and radiation modeling in magnetic confinement fusion plasmas. Aurora's modern multi-language interface enables simulations of 1.5D impurity transport within high-performance computing frameworks, particularly for the inference of particle transport coefficients. A user-friendly Python library allows simple interaction with…

    Submitted 7 July, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: 8 pages + references, 5 figures

  29. arXiv:2104.11051  [pdf, other]

    cs.SD cs.LG eess.AS

    Protecting gender and identity with disentangled speech representations

    Authors: Dimitrios Stoidis, Andrea Cavallaro

    Abstract: Besides its linguistic content, our speech is rich in biometric information that can be inferred by classifiers. Learning privacy-preserving representations for speech signals enables downstream tasks without sharing unnecessary, private information about an individual. In this paper, we show that protecting gender information in speech is more effective than modelling speaker-identity information…

    Submitted 16 June, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

    Comments: 5 pages, 2 figures

  30. arXiv:2103.15999  [pdf, other]

    cs.SD eess.AS

    Audio classification of the content of food containers and drinking glasses

    Authors: Santiago Donaher, Alessio Xompero, Andrea Cavallaro

    Abstract: Food containers, drinking glasses and cups handled by a person generate sounds that vary with the type and amount of their content. In this paper, we propose a new model for sound-based classification of the type and amount of content in a container. The proposed model is based on the decomposition of the problem into two steps, namely action recognition and content classification. We use the scen…

    Submitted 9 June, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: Camera-ready version. Paper accepted to EUSIPCO21. 5 pages, 4 figures, 3 tables. Minor improvements to the paper presentation

  31. arXiv:2102.04057  [pdf, other]

    cs.CV cs.LG

    Improving filling level classification with adversarial training

    Authors: Apostolos Modas, Alessio Xompero, Ricardo Sanchez-Matilla, Pascal Frossard, Andrea Cavallaro

    Abstract: We investigate the problem of classifying - from a single image - the level of content in a cup or a drinking glass. This problem is made challenging by several ambiguities caused by transparencies, shape variations and partial occlusions, and by the availability of only small training datasets. In this paper, we tackle this problem with an appropriate strategy for transfer learning. Specifically,…

    Submitted 16 June, 2021; v1 submitted 8 February, 2021; originally announced February 2021.

    Comments: Accepted to the 28th IEEE International Conference on Image Processing (ICIP) 2021

  32. arXiv:2101.07091  [pdf, other]

    astro-ph.IM

    Bringing SOUL on sky

    Authors: Enrico Pinna, Fabio Rossi, Alfio Puglisi, Guido Agapito, Marco Bonaglia, Cedric Plantet, Tommaso Mazzoni, Runa Briguglio, Luca Carbonaro, Marco Xompero, Paolo Grani, Armando Riccardi, Simone Esposito, Phil Hinz, Amali Vaz, Steve Ertel, Oscar M. Montoya, Oliver Durney, Julian Christou, Doug L. Miller, Greg Taylor, Alessandro Cavallaro, Michael Lefebvre

    Abstract: The SOUL project is upgrading the 4 SCAO systems of LBT, pushing the current guide star limits of about 2 magnitudes fainter thanks to Electron Multiplied CCD detector. This improvement will open the NGS SCAO correction to a wider number of scientific cases from high contrast imaging in the visible to extra-galactic source in the NIR. The SOUL systems are today the unique case where pyramid WFS, a…

    Submitted 18 January, 2021; originally announced January 2021.

    Comments: 11 pages, 9 figures, 1 table. AO4ELT6 proceedings

    Journal ref: AO4ELT6 proceedings 2019

  33. arXiv:2101.06795  [pdf, other]

    eess.AS cs.SD eess.SP

    An embedded multichannel sound acquisition system for drone audition

    Authors: Michael Clayton, Lin Wang, Andrew McPherson, Andrea Cavallaro

    Abstract: Microphone array techniques can improve the acoustic sensing performance on drones, compared to the use of a single microphone. However, multichannel sound acquisition systems are not available in current commercial drone platforms. To encourage the research in drone audition, we present an embedded sound acquisition and recording system with eight microphones and a multichannel sound recorder mou…

    Submitted 17 January, 2021; originally announced January 2021.

  34. arXiv:2012.12258  [pdf, other]

    cs.CV

    Underwater image filtering: methods, datasets and evaluation

    Authors: Chau Yi Li, Riccardo Mazzon, Andrea Cavallaro

    Abstract: Underwater images are degraded by the selective attenuation of light that distorts colours and reduces contrast. The degradation extent depends on the water type, the distance between an object and the camera, and the depth under the water surface the object is at. Underwater image filtering aims to restore or to enhance the appearance of objects captured in an underwater image. Restoration method…

    Submitted 22 December, 2020; originally announced December 2020.

  35. arXiv:2011.10474  [pdf, other]

    cs.RO eess.SY

    Probabilistic Radio-Visual Active Sensing for Search and Tracking

    Authors: L. Varotto, A. Cenedese, A. Cavallaro

    Abstract: Active Search and Tracking for search and rescue missions or collaborative mobile robotics relies on the actuation of a sensing platform to detect and localize a target. In this paper we focus on visually detecting a radio-emitting target with an aerial robot equipped with a radio receiver and a camera. Visual-based tracking provides high accuracy, but the directionality of the sensing domain may…

    Submitted 11 April, 2021; v1 submitted 20 November, 2020; originally announced November 2020.

    Comments: 6 pages, 3 figures, 1 table, accepted at ECC 2021

  36. arXiv:2011.08483  [pdf, other]

    cs.SD cs.LG eess.AS

    FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances

    Authors: Ali Shahin Shamsabadi, Francisco Sepúlveda Teixeira, Alberto Abad, Bhiksha Raj, Andrea Cavallaro, Isabel Trancoso

    Abstract: Speaker identification models are vulnerable to carefully designed adversarial perturbations of their input signals that induce misclassification. In this work, we propose a white-box steganography-inspired adversarial attack that generates imperceptible adversarial perturbations against a speaker identification model. Our approach, FoolHD, uses a Gated Convolutional Autoencoder that operates in t…

    Submitted 20 February, 2021; v1 submitted 17 November, 2020; originally announced November 2020.

    Comments: https://fsepteixeira.github.io/FoolHD/

  37. arXiv:2011.01631  [pdf, other]

    cs.LG cs.MM

    Robust Latent Representations via Cross-Modal Translation and Alignment

    Authors: Vandana Rajan, Alessio Brutti, Andrea Cavallaro

    Abstract: Multi-modal learning relates information across observation modalities of the same physical phenomenon to leverage complementary information. Most multi-modal machine learning methods require that all the modalities used for training are also available for testing. This is a limitation when the signals from some modalities are unavailable or are severely degraded by noise. To address this limitati…

    Submitted 8 March, 2021; v1 submitted 3 November, 2020; originally announced November 2020.

    Journal ref: ICASSP 2021

  38. arXiv:2009.14684  [pdf, other]

    cs.CV

    Benchmark for Anonymous Video Analytics

    Authors: Ricardo Sanchez-Matilla, Andrea Cavallaro

    Abstract: Out-of-home audience measurement aims to count and characterize the people exposed to advertising content in the physical world. While audience measurement solutions based on computer vision are of increasing interest, no commonly accepted benchmark exists to evaluate and compare their performance. In this paper, we propose the first benchmark for digital out-of-home audience measurement that eval…

    Submitted 3 October, 2021; v1 submitted 30 September, 2020; originally announced September 2020.

  39. Semantically Adversarial Learnable Filters

    Authors: Ali Shahin Shamsabadi, Changjae Oh, Andrea Cavallaro

    Abstract: We present an adversarial framework to craft perturbations that mislead classifiers by accounting for the image content and the semantics of the labels. The proposed framework combines a structure loss and a semantic adversarial loss in a multi-task objective function to train a fully convolutional neural network. The structure loss helps generate perturbations whose type and magnitude are defined…

    Submitted 5 April, 2022; v1 submitted 13 August, 2020; originally announced August 2020.

    Comments: 13 pages

    Journal ref: IEEE Transactions on Image Processing, 2021

  40. arXiv:2008.02397  [pdf, other]

    cs.LG eess.IV

    DANA: Dimension-Adaptive Neural Architecture for Multivariate Sensor Data

    Authors: Mohammad Malekzadeh, Richard G. Clegg, Andrea Cavallaro, Hamed Haddadi

    Abstract: Motion sensors embedded in wearable and mobile devices allow for dynamic selection of sensor streams and sampling rates, enabling several applications, such as power management and data-sharing control. While deep neural networks (DNNs) achieve competitive accuracy in sensor data classification, DNNs generally process incoming data from a fixed set of sensors with a fixed sampling rate, and change…

    Submitted 12 August, 2021; v1 submitted 5 August, 2020; originally announced August 2020.

    Comments: Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., Vol. 5, No. 3, Article 120. Publication date: September 2021

  41. arXiv:2007.10115  [pdf, other]

    eess.SP cs.CR cs.LG

    Towards robust sensing for Autonomous Vehicles: An adversarial perspective

    Authors: Apostolos Modas, Ricardo Sanchez-Matilla, Pascal Frossard, Andrea Cavallaro

    Abstract: Autonomous Vehicles rely on accurate and robust sensor observations for safety-critical decision-making in a variety of conditions. Fundamental building blocks of such systems are sensors and classifiers that process ultrasound, RADAR, GPS, LiDAR and camera signals. It is of primary importance that the resulting decisions are robust to perturbations, which can take the form of diff…

    Submitted 14 July, 2020; originally announced July 2020.

    Journal ref: IEEE Signal Processing Magazine, Volume 37, Issue 4, Pages 14 - 23, July 2020

  42. Exploiting vulnerabilities of deep neural networks for privacy protection

    Authors: Ricardo Sanchez-Matilla, Chau Yi Li, Ali Shahin Shamsabadi, Riccardo Mazzon, Andrea Cavallaro

    Abstract: Adversarial perturbations can be added to images to protect their content from unwanted inferences. These perturbations may, however, be ineffective against classifiers that were not seen during the generation of the perturbation, or against defenses based on re-quantization, median filtering or JPEG compression. To address these limitations, we present an adversarial attack that is specifica…

    Submitted 19 July, 2020; originally announced July 2020.

    Journal ref: IEEE Transactions on Multimedia 2020

  43. arXiv:2007.02808  [pdf, other]

    cs.CV

    Novel-View Human Action Synthesis

    Authors: Mohamed Ilyes Lakhal, Davide Boscaini, Fabio Poiesi, Oswald Lanz, Andrea Cavallaro

    Abstract: Novel-View Human Action Synthesis aims to synthesize the movement of a body from a virtual viewpoint, given a video from a real viewpoint. We present a novel 3D reasoning to synthesize the target viewpoint. We first estimate the 3D mesh of the target body and transfer the rough textures from the 2D images to the mesh. As this transfer may generate sparse textures on the mesh due to frame resolutio…

    Submitted 8 October, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: Asian Conference on Computer Vision (ACCV) 2020

  44. arXiv:2004.05703  [pdf, other]

    cs.LG cs.CR stat.ML

    DarkneTZ: Towards Model Privacy at the Edge using Trusted Execution Environments

    Authors: Fan Mo, Ali Shahin Shamsabadi, Kleomenis Katevas, Soteris Demetriou, Ilias Leontiadis, Andrea Cavallaro, Hamed Haddadi

    Abstract: We present DarkneTZ, a framework that uses an edge device's Trusted Execution Environment (TEE) in conjunction with model partitioning to limit the attack surface against Deep Neural Networks (DNNs). Increasingly, edge devices (smartphones and consumer IoT devices) are equipped with pre-trained DNNs for a variety of applications. This trend comes with privacy risks as models can leak information a…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: 13 pages, 8 figures, accepted to ACM MobiSys 2020

  45. arXiv:2004.05574  [pdf, other]

    cs.CR cs.LG

    PrivEdge: From Local to Distributed Private Training and Prediction

    Authors: Ali Shahin Shamsabadi, Adria Gascon, Hamed Haddadi, Andrea Cavallaro

    Abstract: Machine Learning as a Service (MLaaS) operators provide model training and prediction on the cloud. MLaaS applications often rely on centralised collection and aggregation of user data, which could lead to significant privacy concerns when dealing with sensitive personal data. To address this problem, we propose PrivEdge, a technique for privacy-preserving MLaaS that safeguards the privacy of user…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: IEEE Transactions on Information Forensics and Security (TIFS)

  46. arXiv:1911.12354  [pdf, other]

    cs.CV

    Multi-view shape estimation of transparent containers

    Authors: Alessio Xompero, Ricardo Sanchez-Matilla, Apostolos Modas, Pascal Frossard, Andrea Cavallaro

    Abstract: The 3D localisation of an object and the estimation of its properties, such as shape and dimensions, are challenging under varying degrees of transparency and lighting conditions. In this paper, we propose a method for jointly localising container-like objects and estimating their dimensions using two wide-baseline, calibrated RGB cameras. Under the assumption of circular symmetry along the vertic…

    Submitted 9 March, 2020; v1 submitted 27 November, 2019; originally announced November 2019.

    Comments: Accepted to International Conference on Acoustics, Speech, and Signal Processing (ICASSP); 5 pages, 7 figures

  47. arXiv:1911.10891  [pdf, other]

    cs.CV

    ColorFool: Semantic Adversarial Colorization

    Authors: Ali Shahin Shamsabadi, Ricardo Sanchez-Matilla, Andrea Cavallaro

    Abstract: Adversarial attacks that generate small L_p-norm perturbations to mislead classifiers have limited success in black-box settings and with unseen classifiers. These attacks are also not robust to defenses that use denoising filters and to adversarial training procedures. Instead, adversarial attacks that generate unrestricted perturbations are more robust to defenses, are generally more successful…

    Submitted 12 April, 2020; v1 submitted 25 November, 2019; originally announced November 2019.

    Comments: Conference on Computer Vision and Pattern Recognition (CVPR 2020)

  48. arXiv:1911.05996  [pdf, other]

    cs.LG cs.HC eess.SP stat.ML

    Privacy and Utility Preserving Sensor-Data Transformations

    Authors: Mohammad Malekzadeh, Richard G. Clegg, Andrea Cavallaro, Hamed Haddadi

    Abstract: Sensitive inferences and user re-identification are major threats to privacy when raw sensor data from wearable or portable devices are shared with cloud-assisted applications. To mitigate these threats, we propose mechanisms to transform sensor data before sharing them with applications running on users' devices. These transformations aim at eliminating patterns that can be used for user re-ident…

    Submitted 14 November, 2019; originally announced November 2019.

    Comments: Accepted to appear in Pervasive and Mobile Computing (PMC) Journal, Elsevier

  49. arXiv:1910.12227  [pdf, other]

    cs.LG cs.CV stat.ML

    EdgeFool: An Adversarial Image Enhancement Filter

    Authors: Ali Shahin Shamsabadi, Changjae Oh, Andrea Cavallaro

    Abstract: Adversarial examples are intentionally perturbed images that mislead classifiers. These images can, however, be easily detected using denoising algorithms, when high-frequency spatial perturbations are used, or can be noticed by humans, when perturbations are large. In this paper, we propose EdgeFool, an adversarial image enhancement filter that learns structure-aware adversarial perturbations. Ed…

    Submitted 5 March, 2020; v1 submitted 27 October, 2019; originally announced October 2019.

    Journal ref: Proceedings of the 45th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020

  50. arXiv:1910.06827  [pdf, other]

    cs.CV

    Learning Generalisable Omni-Scale Representations for Person Re-Identification

    Authors: Kaiyang Zhou, Yongxin Yang, Andrea Cavallaro, Tao Xiang

    Abstract: An effective person re-identification (re-ID) model should learn feature representations that are both discriminative, for distinguishing similar-looking people, and generalisable, for deployment across datasets without any adaptation. In this paper, we develop novel CNN architectures to address both challenges. First, we present a re-ID CNN termed omni-scale network (OSNet) to learn features that…

    Submitted 29 April, 2021; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: TPAMI 2021. Journal extension of arXiv:1905.00953. Updates: added appendix. arXiv admin note: text overlap with arXiv:1905.00953