Showing 1–50 of 113 results for author: Escalera, S

  1. arXiv:2406.09073  [pdf, other]

    cs.LG

    Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition

    Authors: Eleni Triantafillou, Peter Kairouz, Fabian Pedregosa, Jamie Hayes, Meghdad Kurmanji, Kairan Zhao, Vincent Dumoulin, Julio Jacques Junior, Ioannis Mitliagkas, Jun Wan, Lisheng Sun Hosoya, Sergio Escalera, Gintare Karolina Dziugaite, Peter Triantafillou, Isabelle Guyon

    Abstract: We present the findings of the first NeurIPS competition on unlearning, which sought to stimulate the development of novel algorithms and initiate discussions on formal and robust evaluation methodologies. The competition was highly successful: nearly 1,200 teams from across the world participated, and a wealth of novel, imaginative solutions with different characteristics were contributed. In thi… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2405.14094  [pdf, other]

    cs.LG cs.AI cs.CV math.AT stat.ML

    Attending to Topological Spaces: The Cellular Transformer

    Authors: Rubén Ballester, Pablo Hernández-García, Mathilde Papillon, Claudio Battiloro, Nina Miolane, Tolga Birdal, Carles Casacuberta, Sergio Escalera, Mustafa Hajij

    Abstract: Topological Deep Learning seeks to enhance the predictive performance of neural network models by harnessing topological structures in input data. Topological neural networks operate on spaces such as cell complexes and hypergraphs, that can be seen as generalizations of graphs. In this work, we introduce the Cellular Transformer (CT), a novel architecture that generalizes graph-based transformers… ▽ More

    Submitted 26 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  3. arXiv:2405.06994  [pdf, other]

    cs.CV cs.LG

    GRASP-GCN: Graph-Shape Prioritization for Neural Architecture Search under Distribution Shifts

    Authors: Sofia Casarin, Oswald Lanz, Sergio Escalera

    Abstract: Neural Architecture Search (NAS) methods have been shown to output networks that largely outperform human-designed networks. However, conventional NAS methods have mostly tackled the single dataset scenario, incurring a large computational cost as the procedure has to be run from scratch for every new dataset. In this work, we focus on predictor-based algorithms and propose a simple and efficient way… ▽ More

    Submitted 11 May, 2024; originally announced May 2024.

  4. arXiv:2404.09988  [pdf, other]

    cs.CV

    in2IN: Leveraging individual Information to Generate Human INteractions

    Authors: Pablo Ruiz Ponce, German Barquero, Cristina Palmero, Sergio Escalera, Jose Garcia-Rodriguez

    Abstract: Generating human-human motion interactions conditioned on textual descriptions is a very useful application in many areas such as robotics, gaming, animation, and the metaverse. Alongside this utility also comes a great difficulty in modeling the highly dimensional inter-personal dynamics. In addition, properly capturing the intra-personal diversity of interactions has a lot of challenges. Current… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Project page: https://pabloruizponce.github.io/in2IN/

  5. arXiv:2404.09703  [pdf, other]

    cs.LG stat.ML

    AI Competitions and Benchmarks: Dataset Development

    Authors: Romain Egele, Julio C. S. Jacques Junior, Jan N. van Rijn, Isabelle Guyon, Xavier Baró, Albert Clapés, Prasanna Balaprakash, Sergio Escalera, Thomas Moeslund, Jun Wan

    Abstract: Machine learning is now used in many applications thanks to its ability to predict, generate, or discover patterns from large quantities of data. However, the process of collecting and transforming data for practical use is intricate. Even in today's digital era, where substantial data is generated daily, it is uncommon for it to be readily usable; most often, it necessitates meticulous manual dat… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Preprint version of the 3rd Chapter of the book: Competitions and Benchmarks, the science behind the contests (https://sites.google.com/chalearn.org/book/home)

  6. arXiv:2404.06211  [pdf, other]

    cs.CV

    Unified Physical-Digital Attack Detection Challenge

    Authors: Haocheng Yuan, Ajian Liu, Junze Zheng, Jun Wan, Jiankang Deng, Sergio Escalera, Hugo Jair Escalante, Isabelle Guyon, Zhen Lei

    Abstract: Face Anti-Spoofing (FAS) is crucial to safeguard Face Recognition (FR) Systems. In real-world scenarios, FRs are confronted with both physical and digital attacks. However, existing algorithms often address only one type of attack at a time, which poses significant limitations in real-world scenarios where FR systems face hybrid physical-digital threats. To facilitate the research of Unified Attac… ▽ More

    Submitted 18 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: 11 pages, 10 figures

  7. arXiv:2404.05392  [pdf, other]

    cs.CV

    T-DEED: Temporal-Discriminability Enhancer Encoder-Decoder for Precise Event Spotting in Sports Videos

    Authors: Artur Xarles, Sergio Escalera, Thomas B. Moeslund, Albert Clapés

    Abstract: In this paper, we introduce T-DEED, a Temporal-Discriminability Enhancer Encoder-Decoder for Precise Event Spotting in sports videos. T-DEED addresses multiple challenges in the task, including the need for discriminability among frame representations, high output temporal resolution to maintain prediction precision, and the necessity to capture information at different temporal scales to handle e… ▽ More

    Submitted 11 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  8. arXiv:2404.01891  [pdf, other]

    cs.CV

    ASTRA: An Action Spotting TRAnsformer for Soccer Videos

    Authors: Artur Xarles, Sergio Escalera, Thomas B. Moeslund, Albert Clapés

    Abstract: In this paper, we introduce ASTRA, a Transformer-based model designed for the task of Action Spotting in soccer matches. ASTRA addresses several challenges inherent in the task and dataset, including the requirement for precise action localization, the presence of a long-tail data distribution, non-visibility in certain actions, and inherent label noise. To do so, ASTRA incorporates (a) a Transfor… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

  9. arXiv:2404.01775  [pdf, other]

    cs.CV cs.AI cs.LG

    A noisy elephant in the room: Is your out-of-distribution detector robust to label noise?

    Authors: Galadrielle Humblot-Renaux, Sergio Escalera, Thomas B. Moeslund

    Abstract: The ability to detect unfamiliar or unexpected images is essential for safe deployment of computer vision systems. In the context of classification, the task of detecting images outside of a model's training domain is known as out-of-distribution (OOD) detection. While there has been a growing research interest in developing post-hoc OOD detection methods, there has been comparably little discussi… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR 2024

  10. arXiv:2403.15194  [pdf, other]

    cs.CV cs.LG

    Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion

    Authors: Sofia Casarin, Cynthia I. Ugwu, Sergio Escalera, Oswald Lanz

    Abstract: The landscape of deep learning research is moving towards innovative strategies to harness the true potential of data. Traditionally, emphasis has been on scaling model architectures, resulting in large and complex neural networks, which can be difficult to train with limited computational resources. However, independently of the model size, data quality (i.e. amount and variability) is still a ma… ▽ More

    Submitted 22 March, 2024; originally announced March 2024.

  11. arXiv:2403.14333  [pdf, other]

    cs.CV

    CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing

    Authors: Ajian Liu, Shuai Xue, Jianwen Gan, Jun Wan, Yanyan Liang, Jiankang Deng, Sergio Escalera, Zhen Lei

    Abstract: Domain generalization (DG) based Face Anti-Spoofing (FAS) aims to improve the model's performance on unseen domains. Existing methods either rely on domain labels to align domain-invariant feature spaces, or disentangle generalizable features from the whole sample, which inevitably lead to the distortion of semantic feature structures and achieve limited generalization. In this work, we make use o… ▽ More

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 11 pages, 4 figures

  12. arXiv:2402.15509  [pdf, other]

    cs.CV

    Seamless Human Motion Composition with Blended Positional Encodings

    Authors: German Barquero, Sergio Escalera, Cristina Palmero

    Abstract: Conditional human motion generation is an important topic with many applications in virtual reality, gaming, and robotics. While prior works have focused on generating motion guided by text, music, or scenes, these typically result in isolated motions confined to short durations. Instead, we address the generation of long, continuous sequences guided by a series of varying textual descriptions. In… ▽ More

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Project page: https://barquerogerman.github.io/FlowMDM/

  13. arXiv:2402.14720  [pdf]

    cs.CV

    A Transformer Model for Boundary Detection in Continuous Sign Language

    Authors: Razieh Rastgoo, Kourosh Kiani, Sergio Escalera

    Abstract: Sign Language Recognition (SLR) has garnered significant attention from researchers in recent years, particularly the intricate domain of Continuous Sign Language Recognition (CSLR), which presents heightened complexity compared to Isolated Sign Language Recognition (ISLR). One of the prominent challenges in CSLR pertains to accurately detecting the boundaries of isolated signs within a continuous… ▽ More

    Submitted 22 February, 2024; originally announced February 2024.

  14. arXiv:2402.02441  [pdf, other]

    cs.LG cs.AI cs.MS stat.CO

    TopoX: A Suite of Python Packages for Machine Learning on Topological Domains

    Authors: Mustafa Hajij, Mathilde Papillon, Florian Frantzen, Jens Agerberg, Ibrahem AlJabea, Ruben Ballester, Claudio Battiloro, Guillermo Bernárdez, Tolga Birdal, Aiden Brent, Peter Chin, Sergio Escalera, Simone Fiorellino, Odin Hoff Gardaa, Gurusankar Gopalakrishnan, Devendra Govil, Josef Hoppe, Maneel Reddy Karri, Jude Khouja, Manuel Lecha, Neal Livesay, Jan Meißner, Soham Mukherjee, Alexander Nikitin, Theodore Papamarkou , et al. (18 additional authors not shown)

    Abstract: We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order… ▽ More

    Submitted 17 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  15. arXiv:2401.17699  [pdf, other]

    cs.CV

    Unified Physical-Digital Face Attack Detection

    Authors: Hao Fang, Ajian Liu, Haocheng Yuan, Junze Zheng, Dingheng Zeng, Yanhong Liu, Jiankang Deng, Sergio Escalera, Xiaoming Liu, Jun Wan, Zhen Lei

    Abstract: Face Recognition (FR) systems can suffer from physical (i.e., print photo) and digital (i.e., DeepFake) attacks. However, previous related work rarely considers both situations at the same time. This implies the deployment of multiple models and thus more computational burden. The main reasons for this lack of an integrated model are caused by two factors: (1) The lack of a dataset including both… ▽ More

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 12 pages, 8 figures

  16. arXiv:2401.05166  [pdf, other]

    cs.CV

    REACT 2024: the Second Multiple Appropriate Facial Reaction Generation Challenge

    Authors: Siyang Song, Micol Spitale, Cheng Luo, Cristina Palmero, German Barquero, Hengde Zhu, Sergio Escalera, Michel Valstar, Tobias Baur, Fabien Ringeval, Elisabeth Andre, Hatice Gunes

    Abstract: In dyadic interactions, humans communicate their intentions and state of mind using verbal and non-verbal cues, where multiple different facial reactions might be appropriate in response to a specific speaker behaviour. Then, how to develop a machine learning (ML) model that can automatically generate multiple appropriate, diverse, realistic and synchronised human facial reactions from an previous… ▽ More

    Submitted 10 January, 2024; originally announced January 2024.

    MSC Class: 68T40

  17. arXiv:2312.13377  [pdf, other]

    cs.CV

    SADA: Semantic adversarial unsupervised domain adaptation for Temporal Action Localization

    Authors: David Pujol-Perich, Albert Clapés, Sergio Escalera

    Abstract: Temporal Action Localization (TAL) is a complex task that poses relevant challenges, particularly when attempting to generalize on new -- unseen -- domains in real-world applications. These scenarios, despite being realistic, are often neglected in the literature, exposing these solutions to important performance degradation. In this work, we tackle this issue by introducing, for the first time, an appr… ▽ More

    Submitted 5 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  18. arXiv:2312.05840  [pdf, other]

    cs.LG math.AT

    Topological Data Analysis for Neural Network Analysis: A Comprehensive Survey

    Authors: Rubén Ballester, Carles Casacuberta, Sergio Escalera

    Abstract: This survey provides a comprehensive exploration of applications of Topological Data Analysis (TDA) within neural network analysis. Using TDA tools such as persistent homology and Mapper, we delve into the intricate structures and behaviors of neural networks and their datasets. We discuss different strategies to obtain topological information from data and neural networks by means of TDA. Additio… ▽ More

    Submitted 3 January, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

    Comments: 70 pages, 7 figures. 4 references added. Minor changes in the text. Part of generative models restructured to improve generality and clarity of exposition

    MSC Class: 62R40; 55N31; 68T07 ACM Class: I.2.6

  19. arXiv:2311.05567  [pdf, other]

    cs.CV cs.HC cs.LG

    Exploring Emotion Expression Recognition in Older Adults Interacting with a Virtual Coach

    Authors: Cristina Palmero, Mikel deVelasco, Mohamed Amine Hmani, Aymen Mtibaa, Leila Ben Letaifa, Pau Buch-Cardona, Raquel Justo, Terry Amorese, Eduardo González-Fraile, Begoña Fernández-Ruanova, Jofre Tenorio-Laranga, Anna Torp Johansen, Micaela Rodrigues da Silva, Liva Jenny Martinussen, Maria Stylianou Korsnes, Gennaro Cordasco, Anna Esposito, Mounim A. El-Yacoubi, Dijana Petrovska-Delacrétaz, M. Inés Torres, Sergio Escalera

    Abstract: The EMPATHIC project aimed to design an emotionally expressive virtual coach capable of engaging healthy seniors to improve well-being and promote independent aging. One of the core aspects of the system is its human sensing capabilities, allowing for the perception of emotional states to provide a personalized experience. This paper outlines the development of the emotion expression recognition m… ▽ More

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  20. arXiv:2311.02700  [pdf, other]

    cs.CV

    A Generative Multi-Resolution Pyramid and Normal-Conditioning 3D Cloth Draping

    Authors: Hunor Laczkó, Meysam Madadi, Sergio Escalera, Jordi Gonzalez

    Abstract: RGB cloth generation has been deeply studied in the related literature, however, 3D garment generation remains an open problem. In this paper, we build a conditional variational autoencoder for 3D garment generation and draping. We propose a pyramid network to add garment details progressively in a canonical space, i.e. unposing and unshaping the garments w.r.t. the body. We study conditioning the… ▽ More

    Submitted 15 January, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: WACV24, IEEE copyright

  21. ICML 2023 Topological Deep Learning Challenge : Design and Results

    Authors: Mathilde Papillon, Mustafa Hajij, Helen Jenne, Johan Mathe, Audun Myers, Theodore Papamarkou, Tolga Birdal, Tamal Dey, Tim Doster, Tegan Emerson, Gurusankar Gopalakrishnan, Devendra Govil, Aldo Guzmán-Sáenz, Henry Kvinge, Neal Livesay, Soham Mukherjee, Shreyas N. Samaga, Karthikeyan Natesan Ramamurthy, Maneel Reddy Karri, Paul Rosen, Sophia Sanborn, Robin Walters, Jens Agerberg, Sadrodin Barikbin, Claudio Battiloro , et al. (31 additional authors not shown)

    Abstract: This paper presents the computational challenge on topological deep learning that was hosted within the ICML 2023 Workshop on Topology and Geometry in Machine Learning. The competition asked participants to provide open-source implementations of topological neural networks from the literature by contributing to the python packages TopoNetX (data processing) and TopoModelX (deep learning). The chal… ▽ More

    Submitted 18 January, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

  22. arXiv:2309.06006  [pdf, ps, other]

    cs.CV cs.AI

    SoccerNet 2023 Challenges Results

    Authors: Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim , et al. (77 additional authors not shown)

    Abstract: The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, fo… ▽ More

    Submitted 12 September, 2023; originally announced September 2023.

  23. arXiv:2308.04870  [pdf, other]

    cs.LG math.AT stat.ML

    Decorrelating neurons using persistence

    Authors: Rubén Ballester, Carles Casacuberta, Sergio Escalera

    Abstract: We propose a novel way to improve the generalisation capacity of deep learning models by reducing high correlations between neurons. For this, we present two regularisation terms computed from the weights of a minimum spanning tree of the clique whose vertices are the neurons of a given network (or a sample of those), where weights on edges are correlation dissimilarities. We provide an extensive… ▽ More

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: 15 pages, 4 figures

    MSC Class: 55N31; 68T07 ACM Class: I.2.6

  24. arXiv:2308.04657  [pdf, other]

    cs.CV

    Which Tokens to Use? Investigating Token Reduction in Vision Transformers

    Authors: Joakim Bruslund Haurum, Sergio Escalera, Graham W. Taylor, Thomas B. Moeslund

    Abstract: Since the introduction of the Vision Transformer (ViT), researchers have sought to make ViTs more efficient by removing redundant information in the processed tokens. While different methods have been explored to achieve this goal, we still lack understanding of the resulting reduction patterns and how those patterns differ across token reduction methods and datasets. To close this gap, we set out… ▽ More

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: ICCV 2023 NIVT Workshop. Project webpage https://vap.aau.dk/tokens

  25. arXiv:2307.14768  [pdf, other]

    cs.CV

    Gloss-free Sign Language Translation: Improving from Visual-Language Pretraining

    Authors: Benjia Zhou, Zhigang Chen, Albert Clapés, Jun Wan, Yanyan Liang, Sergio Escalera, Zhen Lei, Du Zhang

    Abstract: Sign Language Translation (SLT) is a challenging task due to its cross-domain nature, involving the translation of visual-gestural language to text. Many previous methods employ an intermediate representation, i.e., gloss sequences, to facilitate SLT, thus transforming it into a two-stage task of sign language recognition (SLR) followed by sign language translation (SLT). However, the scarcity of… ▽ More

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: Accepted to ICCV'23

  26. arXiv:2306.14658  [pdf, other]

    cs.CV cs.AI cs.LG

    Beyond AUROC & co. for evaluating out-of-distribution detection performance

    Authors: Galadrielle Humblot-Renaux, Sergio Escalera, Thomas B. Moeslund

    Abstract: While there has been a growing research interest in developing out-of-distribution (OOD) detection methods, there has been comparably little discussion around how these methods should be evaluated. Given their relevance for safe(r) AI, it is important to examine whether the basis for comparing OOD detection methods is consistent with practical needs. In this work, we take a closer look at the go-t… ▽ More

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: published in SAIAD CVPRW'23 (Safe Artificial Intelligence for All Domains CVPR workshop)

  27. arXiv:2306.06583  [pdf, other]

    cs.CV

    REACT2023: the first Multi-modal Multiple Appropriate Facial Reaction Generation Challenge

    Authors: Siyang Song, Micol Spitale, Cheng Luo, German Barquero, Cristina Palmero, Sergio Escalera, Michel Valstar, Tobias Baur, Fabien Ringeval, Elisabeth Andre, Hatice Gunes

    Abstract: The Multi-modal Multiple Appropriate Facial Reaction Generation Challenge (REACT2023) is the first competition event focused on evaluating multimedia processing and machine learning techniques for generating human-appropriate facial reactions in various dyadic interaction scenarios, with all participants competing strictly under the same conditions. The goal of the challenge is to provide the firs… ▽ More

    Submitted 11 June, 2023; originally announced June 2023.

    MSC Class: 68T40

  28. arXiv:2304.07580  [pdf, other]

    cs.CV

    Surveillance Face Presentation Attack Detection Challenge

    Authors: Hao Fang, Ajian Liu, Jun Wan, Sergio Escalera, Hugo Jair Escalante, Zhen Lei

    Abstract: Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, most of the studies lacked consideration of long-distance scenarios. Specifically, compared with FAS in traditional scenes such as phone unlocking, face payment, and self-service security inspection, FAS in long-distance such as station squares, parks, and self-service supermarkets are… ▽ More

    Submitted 15 April, 2023; originally announced April 2023.

    Comments: 8 pages, 7 figures

  29. arXiv:2304.05753  [pdf, other]

    cs.CV cs.AI

    Wild Face Anti-Spoofing Challenge 2023: Benchmark and Results

    Authors: Dong Wang, Jia Guo, Qiqi Shao, Haochi He, Zhian Chen, Chuanbao Xiao, Ajian Liu, Sergio Escalera, Hugo Jair Escalante, Zhen Lei, Jun Wan, Jiankang Deng

    Abstract: Face anti-spoofing (FAS) is an essential mechanism for safeguarding the integrity of automated face recognition systems. Despite substantial advancements, the generalization of existing approaches to real-world applications remains challenging. This limitation can be attributed to the scarcity and lack of diversity in publicly available FAS datasets, which often leads to overfitting during trainin… ▽ More

    Submitted 4 May, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: CVPRW2023

  30. arXiv:2303.08639  [pdf, other]

    cs.CV

    Blowing in the Wind: CycleNet for Human Cinemagraphs from Still Images

    Authors: Hugo Bertiche, Niloy J. Mitra, Kuldeep Kulkarni, Chun-Hao Paul Huang, Tuanfeng Y. Wang, Meysam Madadi, Sergio Escalera, Duygu Ceylan

    Abstract: Cinemagraphs are short looping videos created by adding subtle motions to a static image. This kind of media is popular and engaging. However, automatic generation of cinemagraphs is an underexplored area and current solutions require tedious low-level manual authoring by artists. In this paper, we present an automatic method that allows generating human cinemagraphs from single RGB images. We inv… ▽ More

    Submitted 15 March, 2023; originally announced March 2023.

  31. arXiv:2302.08909  [pdf, other]

    cs.CV

    Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification

    Authors: Ihsan Ullah, Dustin Carrión-Ojeda, Sergio Escalera, Isabelle Guyon, Mike Huisman, Felix Mohr, Jan N van Rijn, Haozhe Sun, Joaquin Vanschoren, Phan Anh Vu

    Abstract: We introduce Meta-Album, an image classification meta-dataset designed to facilitate few-shot learning, transfer learning, meta-learning, among other tasks. It includes 40 open datasets, each having at least 20 classes with 40 examples per class, with verified licences. They stem from diverse domains, such as ecology (fauna and flora), manufacturing (textures, vehicles), human actions, and optical… ▽ More

    Submitted 16 February, 2023; originally announced February 2023.

    Journal ref: 36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks., NeurIPS, Nov 2022, New Orleans, United States

  32. arXiv:2301.00975  [pdf, other]

    cs.CV

    Surveillance Face Anti-spoofing

    Authors: Hao Fang, Ajian Liu, Jun Wan, Sergio Escalera, Chenxu Zhao, Xu Zhang, Stan Z. Li, Zhen Lei

    Abstract: Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, recent research generally focuses on short-distance applications (i.e., phone unlocking) while lacking consideration of long-distance scenes (i.e., surveillance security checks). In order to promote relevant research and fill this gap in the community, we collect a large-scale Surveilla… ▽ More

    Submitted 3 January, 2023; originally announced January 2023.

    Comments: 15 pages, 9 figures

  33. arXiv:2212.11220  [pdf, other]

    cs.CV cs.GR cs.LG

    Neural Cloth Simulation

    Authors: Hugo Bertiche, Meysam Madadi, Sergio Escalera

    Abstract: We present a general framework for the garment animation problem through unsupervised deep learning inspired by physically based simulation. Existing trends in the literature already explore this possibility. Nonetheless, these approaches do not handle cloth dynamics. Here, we propose the first methodology able to learn realistic cloth dynamics unsupervisedly, and henceforth, a general formulation… ▽ More

    Submitted 13 December, 2022; originally announced December 2022.

    Journal ref: Neural Cloth Simulation. ACM Trans. Graph. 41, 6, Article 220 (December 2022), 14 pages

  34. arXiv:2212.08568  [pdf, other]

    cs.CV cs.LG

    Biomedical image analysis competitions: The state of current participation practice

    Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali, Anubha Gupta, Jan Kybic, Alison Noble, Carlos Ortiz de Solórzano, Samiksha Pachade, Caroline Petitjean, Daniel Sage, Donglai Wei, Elizabeth Wilden, Deepak Alapatt, Vincent Andrearczyk, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano , et al. (331 additional authors not shown)

    Abstract: The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis,… ▽ More

    Submitted 12 September, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

  35. arXiv:2211.14304  [pdf, other]

    cs.CV

    BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction

    Authors: German Barquero, Sergio Escalera, Cristina Palmero

    Abstract: Stochastic human motion prediction (HMP) has generally been tackled with generative adversarial networks and variational autoencoders. Most prior works aim at predicting highly diverse movements in terms of the skeleton joints' dispersion. This has led to methods predicting fast and motion-divergent movements, which are often unrealistic and incoherent with past motion. Such methods also neglect c… ▽ More

    Submitted 2 August, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: ICCV 2023 Camera-ready version. Project page: https://barquerogerman.github.io/BeLFusion/

    Journal ref: Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023

  36. SoccerNet 2022 Challenges Results

    Authors: Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao , et al. (69 additional authors not shown)

    Abstract: The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team. In 2022, the challenges were composed of 6 vision-based tasks: (1) action spotting, focusing on retrieving action timestamps in long untrimmed videos, (2) replay grounding, focusing on retrieving the live moment of an action shown in a replay, (3) pitch localization, focusing on det… ▽ More

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: Accepted at ACM MMSports 2022

  37. arXiv:2208.14686  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE

    NeurIPS'22 Cross-Domain MetaDL competition: Design and baseline results

    Authors: Dustin Carrión-Ojeda, Hong Chen, Adrian El Baz, Sergio Escalera, Chaoyu Guan, Isabelle Guyon, Ihsan Ullah, Xin Wang, Wenwu Zhu

    Abstract: We present the design and baseline results for a new challenge in the ChaLearn meta-learning series, accepted at NeurIPS'22, focusing on "cross-domain" meta-learning. Meta-learning aims to leverage experience gained from previous tasks to solve new tasks efficiently (i.e., with better performance, little training data, and/or modest computational resources). While previous challenges in the series… ▽ More

    Submitted 31 August, 2022; originally announced August 2022.

    Comments: Meta-Knowledge Transfer/Communication in Different Systems, Sep 2022, Grenoble, France

  38. arXiv:2207.07619  [pdf, other]

    cs.CV

    A Non-Anatomical Graph Structure for isolated hand gesture separation in continuous gesture sequences

    Authors: Razieh Rastgoo, Kourosh Kiani, Sergio Escalera

    Abstract: Continuous Hand Gesture Recognition (CHGR) has been extensively studied by researchers in the last few decades. Recently, one model has been presented to deal with the challenge of the boundary detection of isolated gestures in a continuous gesture video [17]. To enhance the model performance and also replace the handcrafted feature extractor in the presented model in [17], we propose a GCN model… ▽ More

    Submitted 15 July, 2022; originally announced July 2022.

  39. arXiv:2206.10903  [pdf, ps, other]

    cs.CV

    UniUD-FBK-UB-UniBZ Submission to the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge 2022

    Authors: Alex Falcon, Giuseppe Serra, Sergio Escalera, Oswald Lanz

    Abstract: This report presents the technical details of our submission to the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge 2022. To participate in the challenge, we designed an ensemble consisting of different models trained with two recently developed relevance-augmented versions of the widely used triplet loss. Our submission, visible on the public leaderboard, obtains an average score of 61.02% n…

    Submitted 22 June, 2022; originally announced June 2022.

    Comments: Ranked joint 1st place in the Multi-Instance Action Retrieval Challenge organized at EPIC@CVPR2022

  40. Relevance-based Margin for Contrastively-trained Video Retrieval Models

    Authors: Alex Falcon, Swathikiran Sudhakaran, Giuseppe Serra, Sergio Escalera, Oswald Lanz

    Abstract: Video retrieval using natural language queries has attracted increasing interest due to its relevance in real-world applications, from intelligent access in private media galleries to web-scale video search. Learning the cross-similarity of video and text in a joint embedding space is the dominant approach. To do so, a contrastive loss is usually employed because it organizes the embedding space b…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: Accepted for presentation at International Conference on Multimedia Retrieval (ICMR '22)

  41. arXiv:2204.00923  [pdf]

    cs.CV

    Word separation in continuous sign language using isolated signs and post-processing

    Authors: Razieh Rastgoo, Kourosh Kiani, Sergio Escalera

    Abstract: Continuous Sign Language Recognition (CSLR) is a long challenging task in Computer Vision due to the difficulties in detecting the explicit boundaries between the words in a sign sentence. To deal with this challenge, we propose a two-stage model. In the first stage, the predictor model, which includes a combination of CNN, SVD, and LSTM, is trained with the isolated signs. In the second stage,…

    Submitted 1 June, 2023; v1 submitted 2 April, 2022; originally announced April 2022.

  42. arXiv:2203.12330  [pdf, other]

    cs.LG math.AT

    Predicting the generalization gap in neural networks using topological data analysis

    Authors: Rubén Ballester, Xavier Arnal Clemente, Carles Casacuberta, Meysam Madadi, Ciprian A. Corneanu, Sergio Escalera

    Abstract: Understanding how neural networks generalize on unseen data is crucial for designing more robust and reliable models. In this paper, we study the generalization gap of neural networks using methods from topological data analysis. For this purpose, we compute homological persistence diagrams of weighted graphs constructed from neuron activation correlations after a training phase, aiming to capture…

    Submitted 12 August, 2023; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: 24 pages, 7 figures. The Related Work section has been updated and the experiments have been executed anew including a 5x2-fold cross-validation scheme. Figure 4.3 has been crucially improved thanks to the discovery that the clusters of neural networks that appear in that figure correspond to different depths of the corresponding architectures

    MSC Class: 55N31; 68T07 ACM Class: I.2.6

  43. arXiv:2203.10974  [pdf, other]

    cs.CV

    Towards Self-Supervised Gaze Estimation

    Authors: Arya Farkhondeh, Cristina Palmero, Simone Scardapane, Sergio Escalera

    Abstract: Recent joint embedding-based self-supervised methods have surpassed standard supervised approaches on various image recognition tasks such as image classification. These self-supervised methods aim at maximizing agreement between features extracted from two differently transformed views of the same image, which results in learning an invariant representation with respect to appearance and geometri…

    Submitted 23 November, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: BMVC 2022. For code and pre-trained models, visit https://github.com/aryafarkhondeh/SwAT

  44. arXiv:2203.08897  [pdf, other]

    cs.CV

    Gate-Shift-Fuse for Video Action Recognition

    Authors: Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz

    Abstract: Convolutional Neural Networks are the de facto models for image recognition. However, 3D CNNs, the straightforward extension of 2D CNNs for video recognition, have not achieved the same success on standard action recognition benchmarks. One of the main reasons for this reduced performance of 3D CNNs is the increased computational complexity requiring large scale annotated datasets to train them in…

    Submitted 15 April, 2023; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: Accepted to TPAMI. arXiv admin note: text overlap with arXiv:1912.00381

  45. arXiv:2203.03245  [pdf, other]

    cs.CV cs.RO

    Comparison of Spatio-Temporal Models for Human Motion and Pose Forecasting in Face-to-Face Interaction Scenarios

    Authors: German Barquero, Johnny Núñez, Zhen Xu, Sergio Escalera, Wei-Wei Tu, Isabelle Guyon, Cristina Palmero

    Abstract: Human behavior forecasting during human-human interactions is of utmost importance to provide robotic or virtual agents with social intelligence. This problem is especially challenging for scenarios that are highly driven by interpersonal dynamics. In this work, we present the first systematic comparison of state-of-the-art approaches for behavior forecasting. To do so, we leverage whole-body anno…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: single column, 27 pages, 7 figures, 7 tables

    Journal ref: Proceedings of Machine Learning Research, 2022

  46. arXiv:2203.02480  [pdf, other]

    cs.CV cs.RO

    Didn't see that coming: a survey on non-verbal social human behavior forecasting

    Authors: German Barquero, Johnny Núñez, Sergio Escalera, Zhen Xu, Wei-Wei Tu, Isabelle Guyon, Cristina Palmero

    Abstract: Non-verbal social human behavior forecasting has increasingly attracted the interest of the research community in recent years. Its direct applications to human-robot interaction and socially-aware human motion generation make it a very attractive field. In this survey, we define the behavior forecasting problem for multiple interactive agents in a generic way that aims at unifying the fields of s…

    Submitted 4 March, 2022; originally announced March 2022.

    Comments: single column, 27 pages, 4 figures, 3 tables

    Journal ref: Proceedings of Machine Learning Research, 2022

  47. Video Transformers: A Survey

    Authors: Javier Selva, Anders S. Johansen, Sergio Escalera, Kamal Nasrollahi, Thomas B. Moeslund, Albert Clapés

    Abstract: Transformer models have shown great success handling long-range interactions, making them a promising tool for modeling video. However, they lack inductive biases and scale quadratically with input length. These limitations are further exacerbated when dealing with the high dimensionality introduced by the temporal dimension. While there are surveys analyzing the advances of Transformers for visio…

    Submitted 13 February, 2023; v1 submitted 16 January, 2022; originally announced January 2022.

  48. arXiv:2201.03801  [pdf, other]

    cs.LG cs.AI

    Winning solutions and post-challenge analyses of the ChaLearn AutoDL challenge 2019

    Authors: Zhengying Liu, Adrien Pavao, Zhen Xu, Sergio Escalera, Fabio Ferreira, Isabelle Guyon, Sirui Hong, Frank Hutter, Rongrong Ji, Julio C. S. Jacques Junior, Ge Li, Marius Lindauer, Zhipeng Luo, Meysam Madadi, Thomas Nierhoff, Kangning Niu, Chunguang Pan, Danny Stoll, Sebastien Treguer, ** Wang, Peng Wang, Chenglin Wu, Youcheng Xiong, Arber Zela, Yang Zhang

    Abstract: This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sorting out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification…

    Submitted 11 January, 2022; originally announced January 2022.

    Comments: The first three authors contributed equally; This is only a draft version

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI) 2021

  49. arXiv:2201.03186  [pdf, other]

    eess.IV cs.CV

    MyoPS: A Benchmark of Myocardial Pathology Segmentation Combining Three-Sequence Cardiac Magnetic Resonance Images

    Authors: Lei Li, Fu** Wu, Sihan Wang, Xinzhe Luo, Carlos Martin-Isla, Shuwei Zhai, Jianpeng Zhang, Yanfei Liu, Zhen Zhang, Markus J. Ankenbrand, Haochuan Jiang, Xiaoran Zhang, Linhong Wang, Tewodros Weldebirhan Arega, Elif Altunok, Zhou Zhao, Feiyan Li, Jun Ma, ** Yang, Elodie Puybareau, Ilkay Oksuz, Stephanie Bricq, Weisheng Li, Kumaradevan Punithakumar, Sotirios A. Tsaftaris, et al. (7 additional authors not shown)

    Abstract: Assessment of myocardial viability is essential in diagnosis and treatment management of patients suffering from myocardial infarction, and classification of pathology on myocardium is the key to this assessment. This work defines a new task of medical image analysis, i.e., to perform myocardial pathology segmentation (MyoPS) combining three-sequence cardiac magnetic resonance (CMR) images, which…

    Submitted 10 January, 2022; originally announced January 2022.

  50. CrossMoDA 2021 challenge: Benchmark of Cross-Modality Domain Adaptation techniques for Vestibular Schwannoma and Cochlea Segmentation

    Authors: Reuben Dorent, Aaron Kujawa, Marina Ivory, Spyridon Bakas, Nicola Rieke, Samuel Joutard, Ben Glocker, Jorge Cardoso, Marc Modat, Kayhan Batmanghelich, Arseniy Belkov, Maria Baldeon Calisto, Jae Won Choi, Benoit M. Dawant, Hexin Dong, Sergio Escalera, Yubo Fan, Lasse Hansen, Mattias P. Heinrich, Smriti Joshi, Victoriya Kashtanova, Hyeon Gyu Kim, Satoshi Kondo, Christian N. Kruse, Susana K. Lai-Yuen, et al. (15 additional authors not shown)

    Abstract: Domain Adaptation (DA) has recently raised strong interests in the medical imaging community. While a large variety of DA techniques has been proposed for image segmentation, most of these techniques have been validated either on private datasets or on small publicly available datasets. Moreover, these datasets mostly addressed single-class problems. To tackle these limitations, the Cross-Modality…

    Submitted 14 December, 2022; v1 submitted 8 January, 2022; originally announced January 2022.

    Comments: In Medical Image Analysis