-
Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation
Authors:
Elham Amin Mansour,
Ozan Unal,
Suman Saha,
Benjamin Bejar,
Luc Van Gool
Abstract:
The increasing relevance of panoptic segmentation is tied to the advancements in autonomous driving and AR/VR applications. However, the deployment of such models has been limited due to the expensive nature of dense data annotation, giving rise to unsupervised domain adaptation (UDA). A key challenge in panoptic UDA is reducing the domain gap between a labeled source and an unlabeled target domain while harmonizing the subtasks of semantic and instance segmentation to limit catastrophic interference. While considerable progress has been achieved, existing approaches mainly focus on the adaptation of semantic segmentation. In this work, we focus on incorporating instance-level adaptation via a novel instance-aware cross-domain mixing strategy IMix. IMix significantly enhances the panoptic quality by improving instance segmentation performance. Specifically, we propose inserting high-confidence predicted instances from the target domain onto source images, retaining the exhaustiveness of the resulting pseudo-labels while reducing the injected confirmation bias. Nevertheless, such an enhancement comes at the cost of degraded semantic performance, attributed to catastrophic forgetting. To mitigate this issue, we regularize our semantic branch by employing CLIP-based domain alignment (CDA), exploiting the domain-robustness of natural language prompts. Finally, we present an end-to-end model incorporating these two mechanisms called LIDAPS, achieving state-of-the-art results on all popular panoptic UDA benchmarks.
Submitted 4 April, 2024;
originally announced April 2024.
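The instance-level mixing idea in the abstract above (pasting high-confidence predicted target instances onto source images while keeping the source labels elsewhere) can be illustrated with a toy sketch. This is not the authors' IMix implementation: the function name, the dictionary layout of an instance, and the `min_conf` threshold are all assumptions made for illustration, and images are plain nested lists rather than tensors.

```python
def paste_confident_instances(src_img, src_lbl, tgt_img, tgt_instances, min_conf=0.9):
    """Toy cross-domain mixing: copy confident target instances onto a source image.

    src_img, tgt_img: 2D lists of pixel values with the same shape.
    src_lbl: 2D list of ground-truth source labels.
    tgt_instances: list of dicts with keys "mask" (2D list of 0/1),
        "label", and "confidence" (hypothetical layout).
    min_conf: hypothetical confidence threshold for keeping a predicted instance.
    """
    # Copy so the source image and labels are left untouched.
    mixed_img = [row[:] for row in src_img]
    mixed_lbl = [row[:] for row in src_lbl]
    for inst in tgt_instances:
        if inst["confidence"] < min_conf:
            continue  # skip low-confidence predictions
        for r, mask_row in enumerate(inst["mask"]):
            for c, inside in enumerate(mask_row):
                if inside:
                    mixed_img[r][c] = tgt_img[r][c]   # target pixels
                    mixed_lbl[r][c] = inst["label"]   # pseudo-label from prediction
    return mixed_img, mixed_lbl


# Usage: only the confident "car" instance is pasted; the "person" one is dropped.
source_image = [[0, 0], [0, 0]]
source_labels = [["road", "road"], ["road", "road"]]
target_image = [[9, 9], [9, 9]]
predictions = [
    {"mask": [[1, 0], [0, 0]], "label": "car", "confidence": 0.95},
    {"mask": [[0, 0], [0, 1]], "label": "person", "confidence": 0.40},
]
mixed_image, mixed_labels = paste_confident_instances(
    source_image, source_labels, target_image, predictions
)
```

Because every pixel outside the pasted masks keeps its ground-truth source label, the mixed label map stays exhaustive, which is the property the abstract highlights.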
-
Semantic-aware Video Representation for Few-shot Action Recognition
Authors:
Yutao Tang,
Benjamin Bejar,
Rene Vidal
Abstract:
Recent work on action recognition leverages 3D features and textual information to achieve state-of-the-art performance. However, most of the current few-shot action recognition methods still rely on 2D frame-level representations, often require additional components to model temporal relations, and employ complex distance functions to achieve accurate alignment of these representations. In addition, existing methods struggle to effectively integrate textual semantics, some resorting to concatenation or addition of textual and visual features, and some using text merely as additional supervision without truly achieving feature fusion and information transfer across modalities. In this work, we propose a simple yet effective Semantic-Aware Few-Shot Action Recognition (SAFSAR) model to address these issues. We show that directly leveraging a 3D feature extractor combined with an effective feature-fusion scheme, and a simple cosine similarity for classification, can yield better performance without the need for extra components for temporal modeling or complex distance functions. We introduce an innovative scheme to encode the textual semantics into the video representation which adaptively fuses features from text and video, and encourages the visual encoder to extract more semantically consistent features. In this scheme, SAFSAR achieves alignment and fusion in a compact way. Experiments on five challenging few-shot action recognition benchmarks under various settings demonstrate that the proposed SAFSAR model significantly improves the state-of-the-art performance.
Submitted 10 November, 2023;
originally announced November 2023.
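The abstract above mentions classifying with a simple cosine similarity. A minimal sketch of that idea follows, assuming each class is summarized by one prototype feature vector; the prototype construction, the function names, and the toy two-dimensional features are illustrative assumptions, not details taken from SAFSAR.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for an all-zero vector
    return dot / (norm_a * norm_b)


def classify(query, prototypes):
    """Return the label whose prototype is most similar to the query.

    prototypes: dict mapping a class label to its feature vector
    (hypothetical layout; e.g. a mean of the support-set features).
    """
    return max(prototypes, key=lambda label: cosine_similarity(query, prototypes[label]))


# Usage with toy 2D features standing in for fused video-text representations.
class_prototypes = {"jump": [1.0, 0.0], "run": [0.0, 1.0]}
predicted = classify([0.9, 0.1], class_prototypes)
```

Cosine similarity ignores feature magnitude and compares only direction, which is one reason it can stand in for more elaborate distance functions once the features are well aligned.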
-
Facial Tic Detection in Untrimmed Videos of Tourette Syndrome Patients
Authors:
Yutao Tang,
Benjamín Béjar,
Joey K. -Y. Essoe,
Joseph F. McGuire,
René Vidal
Abstract:
Tourette Syndrome (TS) is a behavior disorder with onset in childhood, characterized by the expression of involuntary movements and sounds commonly referred to as tics. Behavioral therapy is the first-line treatment for patients with TS, and it helps patients raise awareness about tic occurrence as well as develop tic inhibition strategies. However, the limited availability of therapists and the difficulties of in-home follow-up work limit its effectiveness. An automatic tic detection system that is easy to deploy could alleviate the difficulties of home therapy by providing feedback to the patients while they exercise tic awareness. In this work, we propose a novel architecture (T-Net) for automatic tic detection and classification from untrimmed videos. T-Net combines temporal detection and segmentation and operates on features that are interpretable to a clinician. We compare T-Net to several state-of-the-art systems working on deep features extracted from the raw videos, and T-Net achieves comparable performance in terms of average precision while relying on interpretable features needed in clinical practice.
Submitted 7 November, 2022;
originally announced November 2022.
-
Trend estimation and short-term forecasting of COVID-19 cases and deaths worldwide
Authors:
Ekaterina Krymova,
Benjamín Béjar,
Dorina Thanou,
Tao Sun,
Elisa Manetti,
Gavin Lee,
Kristen Namigai,
Christine Choirat,
Antoine Flahault,
Guillaume Obozinski
Abstract:
Since the beginning of the COVID-19 pandemic, many dashboards have emerged as useful tools to monitor the evolution of the pandemic, inform the public, and assist governments in decision making. Our goal is to develop a globally applicable method, integrated in a dashboard updated twice daily, that provides an estimate of the trend in the evolution of the number of cases and deaths from reported data of more than 200 countries and territories, as well as a seven-day forecast. One of the significant difficulties in managing a quickly propagating epidemic is that the details of the dynamics needed to forecast its evolution are obscured by the delays in the identification of cases and deaths and by irregular reporting. Our forecasting methodology substantially relies on estimating the underlying trend in the observed time series using robust seasonal trend decomposition techniques. This allows us to obtain forecasts with simple yet effective extrapolation methods in linear or log scale. We present the results of an assessment of our forecasting methodology and discuss its application to the production of global and regional risk maps.
Submitted 24 March, 2022; v1 submitted 18 June, 2021;
originally announced June 2021.
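The abstract above describes forecasting by extrapolating an estimated trend in linear or log scale. The sketch below illustrates only that last step, assuming the trend has already been estimated by some decomposition method (which is not implemented here). The least-squares line fit over a trailing window, the function name, and the default `window` value are assumptions for illustration, not the authors' procedure.

```python
import math


def extrapolate_trend(trend, horizon=7, window=7, log_scale=False):
    """Fit a straight line to the last `window` trend values and extend it.

    trend: list of trend estimates (must be strictly positive if log_scale=True).
    horizon: number of future steps to forecast (seven days in the abstract).
    window: hypothetical number of trailing points used for the fit (at least 2).
    log_scale: fit in log space, i.e. assume locally exponential growth or decay.
    """
    tail = trend[-window:]
    if log_scale:
        tail = [math.log(v) for v in tail]
    n = len(tail)
    # Ordinary least squares for y = intercept + slope * t, with t = 0..n-1.
    t_mean = (n - 1) / 2.0
    y_mean = sum(tail) / n
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(tail)) / denom
    intercept = y_mean - slope * t_mean
    # Evaluate the fitted line at the next `horizon` time steps.
    forecast = [intercept + slope * (n - 1 + h) for h in range(1, horizon + 1)]
    if log_scale:
        forecast = [math.exp(v) for v in forecast]
    return forecast


# Usage: a linearly growing trend and an exponentially growing one.
linear_forecast = extrapolate_trend([10, 12, 14, 16, 18, 20, 22])
log_forecast = extrapolate_trend([1, 2, 4, 8, 16, 32, 64], log_scale=True)
```

Fitting in log scale suits phases where counts grow or shrink multiplicatively, while the linear fit is the more conservative choice when the trend is close to flat.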
-
The fastest $\ell_{1,\infty}$ prox in the west
Authors:
Benjamín Béjar,
Ivan Dokmanić,
René Vidal
Abstract:
Proximal operators are of particular interest in optimization problems dealing with non-smooth objectives because in many practical cases they lead to optimization algorithms whose updates can be computed in closed form or very efficiently. A well-known example is the proximal operator of the vector $\ell_1$ norm, which is given by the soft-thresholding operator. In this paper we study the proximal operator of the mixed $\ell_{1,\infty}$ matrix norm and show that it can be computed in closed form by applying the well-known soft-thresholding operator to each column of the matrix. However, unlike the vector $\ell_1$ norm case where the threshold is constant, in the mixed $\ell_{1,\infty}$ norm case each column of the matrix might require a different threshold and all thresholds depend on the given matrix. We propose a general iterative algorithm for computing these thresholds, as well as two efficient implementations that further exploit easy-to-compute lower bounds for the mixed norm of the optimal solution. Experiments on large-scale synthetic and real data indicate that the proposed methods can be orders of magnitude faster than state-of-the-art methods.
Submitted 8 October, 2019;
originally announced October 2019.
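The soft-thresholding operator named in the abstract above is the proximal operator of $\lambda\|x\|_1$ and acts entrywise as $\operatorname{sign}(x_i)\max(|x_i| - \lambda, 0)$. The sketch below implements it, plus a helper that applies a separate threshold to each column of a matrix. The per-column thresholds are taken as given inputs here; computing them is the contribution of the paper and is not reproduced, and the function names and the list-of-columns layout are assumptions for illustration.

```python
import math


def soft_threshold(x, lam):
    """Prox of lam * ||x||_1: shrink each entry toward zero by lam."""
    out = []
    for v in x:
        magnitude = max(abs(v) - lam, 0.0)
        # Keep the original sign; entries within [-lam, lam] become exactly 0.0.
        out.append(math.copysign(magnitude, v) if magnitude > 0.0 else 0.0)
    return out


def soft_threshold_columns(columns, thresholds):
    """Apply soft-thresholding to each column with its own threshold.

    columns: matrix stored as a list of columns (hypothetical layout).
    thresholds: one non-negative threshold per column, assumed already known.
    """
    return [soft_threshold(col, lam) for col, lam in zip(columns, thresholds)]


# Usage: one vector, then a two-column matrix with different thresholds.
shrunk = soft_threshold([3.0, -0.5, -2.0], 1.0)  # → [2.0, 0.0, -1.0]
shrunk_cols = soft_threshold_columns([[3.0, -1.0], [0.5, 4.0]], [1.0, 2.0])
```

With a constant threshold this reduces to the familiar vector $\ell_1$ case; the column-specific thresholds are what distinguish the mixed-norm setting described in the abstract.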
-
The regularized tau estimator: A robust and efficient solution to ill-posed linear inverse problems
Authors:
Marta Martinez-Camara,
Michael Muma,
Benjamin Bejar,
Abdelhak M. Zoubir,
Martin Vetterli
Abstract:
Linear inverse problems are ubiquitous. Often the measurements do not follow a Gaussian distribution. Additionally, a model matrix with a large condition number can complicate the problem further by making it ill-posed. In this case, the performance of popular estimators may deteriorate significantly. We have developed a new estimator that is both nearly optimal in the presence of Gaussian errors and robust against outliers. Furthermore, it obtains meaningful estimates when the problem is ill-posed through the inclusion of $\ell_1$ and $\ell_2$ regularizations. The computation of our estimate involves minimizing a non-convex objective function. Hence, we are not guaranteed to find the global minimum in a reasonable amount of time. Thus, we propose two algorithms that converge to a good local minimum in a reasonable (and adjustable) amount of time, as an approximation of the global minimum. We also analyze how the introduction of the regularization term affects the statistical properties of our estimator. We confirm high robustness against outliers and asymptotic efficiency for Gaussian distributions by deriving measures of robustness such as the influence function, sensitivity curve, bias, asymptotic variance, and mean square error. We verify the theoretical results using numerical experiments and show that the proposed estimator outperforms recently proposed methods, especially for increasing amounts of outlier contamination. Python code for all of the algorithms is available online in the spirit of reproducible research.
Submitted 23 May, 2016;
originally announced June 2016.