Showing 1–29 of 29 results for author: Cardinaux, F

  1. arXiv:2407.03036  [pdf, other]

    cs.CV

    SAFT: Towards Out-of-Distribution Generalization in Fine-Tuning

    Authors: Bac Nguyen, Stefan Uhlich, Fabien Cardinaux, Lukas Mauch, Marzieh Edraki, Aaron Courville

    Abstract: Handling distribution shifts from training data, known as out-of-distribution (OOD) generalization, poses a significant challenge in the field of machine learning. While a pre-trained vision-language model like CLIP has demonstrated remarkable zero-shot performance, further adaptation of the model to downstream tasks leads to undesirable degradation for OOD data. In this work, we introduce Sparse…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2404.00675  [pdf, other]

    cs.CV cs.AI

    LLM meets Vision-Language Models for Zero-Shot One-Class Classification

    Authors: Yassir Bendou, Giulia Lioi, Bastien Pasdeloup, Lukas Mauch, Ghouthi Boukli Hacene, Fabien Cardinaux, Vincent Gripon

    Abstract: We consider the problem of zero-shot one-class visual classification, extending traditional one-class classification to scenarios where only the label of the target class is available. This method aims to discriminate between positive and negative query samples without requiring examples from the target class. We propose a two-step solution that first queries large language models for visually con…

    Submitted 27 May, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

  3. arXiv:2401.11311  [pdf, other]

    cs.CV

    A Novel Benchmark for Few-Shot Semantic Segmentation in the Era of Foundation Models

    Authors: Reda Bensaid, Vincent Gripon, François Leduc-Primeau, Lukas Mauch, Ghouthi Boukli Hacene, Fabien Cardinaux

    Abstract: In recent years, the rapid evolution of computer vision has seen the emergence of various foundation models, each tailored to specific data types and tasks. In this study, we explore the adaptation of these models for few-shot semantic segmentation. Specifically, we conduct a comprehensive comparative analysis of four prominent foundation models: DINO V2, Segment Anything, CLIP, Masked AutoEncoder…

    Submitted 2 April, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

  4. arXiv:2311.14544  [pdf, other]

    cs.CV cs.AI

    Inferring Latent Class Statistics from Text for Robust Visual Few-Shot Learning

    Authors: Yassir Bendou, Vincent Gripon, Bastien Pasdeloup, Giulia Lioi, Lukas Mauch, Fabien Cardinaux, Ghouthi Boukli Hacene

    Abstract: In the realm of few-shot learning, foundation models like CLIP have proven effective but exhibit limitations in cross-domain robustness, especially in few-shot settings. Recent works add text as an extra modality to enhance the performance of these models. Most of these approaches treat text as an auxiliary modality without fully exploring its potential to elucidate the underlying class visual feat…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: R0-FoMo: Workshop on Robustness of Few-shot and Zero-shot Learning in Foundation Models at NeurIPS 2023

  5. arXiv:2309.03974  [pdf, other]

    cs.LG

    DBsurf: A Discrepancy Based Method for Discrete Stochastic Gradient Estimation

    Authors: Pau Mulet Arabi, Alec Flowers, Lukas Mauch, Fabien Cardinaux

    Abstract: Computing gradients of an expectation with respect to the distributional parameters of a discrete distribution is a problem arising in many fields of science and engineering. Typically, this problem is tackled using Reinforce, which frames the problem of gradient estimation as a Monte Carlo simulation. Unfortunately, the Reinforce estimator is especially sensitive to discrepancies between the true…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: 22 pages, 7 figures

    ACM Class: I.2.0

  6. arXiv:2306.01442  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Towards Robust FastSpeech 2 by Modelling Residual Multimodality

    Authors: Fabian Kögel, Bac Nguyen, Fabien Cardinaux

    Abstract: State-of-the-art non-autoregressive text-to-speech (TTS) models based on FastSpeech 2 can efficiently synthesise high-fidelity and natural speech. For expressive speech datasets, however, we observe characteristic audio distortions. We demonstrate that such artefacts are introduced to the vocoder reconstruction by over-smooth mel-spectrogram predictions, which are induced by the choice of mean-squa…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: Accepted at INTERSPEECH 2023

  7. arXiv:2303.03717  [pdf, other]

    cs.SD eess.AS

    Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation

    Authors: Bac Nguyen, Stefan Uhlich, Fabien Cardinaux

    Abstract: Self-supervised learning (SSL) has recently shown remarkable results in closing the gap between supervised and unsupervised learning. The idea is to learn robust features that are invariant to distortions of the input data. Despite its success, this idea can suffer from a collapsing issue where the network produces a constant representation. To this end, we introduce SELFIE, a novel Self-supervise…

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: ICASSP 2023

  8. arXiv:2212.06461  [pdf, ps, other]

    cs.LG cs.AI cs.CV stat.ML

    A Statistical Model for Predicting Generalization in Few-Shot Classification

    Authors: Yassir Bendou, Vincent Gripon, Bastien Pasdeloup, Lukas Mauch, Stefan Uhlich, Fabien Cardinaux, Ghouthi Boukli Hacene, Javier Alonso Garcia

    Abstract: The estimation of the generalization error of classifiers often relies on a validation set. Such a set is hardly available in few-shot learning scenarios, a highly disregarded shortcoming in the field. In these scenarios, it is common to rely on features extracted from pre-trained neural networks combined with distance-based classifiers such as nearest class mean. In this work, we introduce a Gaus…

    Submitted 28 March, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

  9. arXiv:2205.12918  [pdf, other]

    cs.CV

    A Low Memory Footprint Quantized Neural Network for Depth Completion of Very Sparse Time-of-Flight Depth Maps

    Authors: Xiaowen Jiang, Valerio Cambareri, Gianluca Agresti, Cynthia Ifeyinwa Ugwu, Adriano Simonetto, Fabien Cardinaux, Pietro Zanuttigh

    Abstract: Sparse active illumination enables precise time-of-flight depth sensing as it maximizes signal-to-noise ratio for low power budgets. However, depth completion is required to produce dense depth maps for 3D perception. We address this task with realistic illumination and sensor resolution constraints by simulating ToF datasets for indoor 3D perception with challenging sparsity levels. We propose a…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2022. Presented at the 5th Efficient Deep Learning for Computer Vision Workshop

  10. arXiv:2203.11049  [pdf, other]

    cs.SD cs.LG eess.AS

    AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling

    Authors: Bac Nguyen, Fabien Cardinaux, Stefan Uhlich

    Abstract: Parallel text-to-speech (TTS) models have recently enabled fast and highly-natural speech synthesis. However, they typically require external alignment models, which are not necessarily optimized for the decoder as they are not jointly trained. In this paper, we propose a differentiable duration method for learning monotonic alignments between input and output sequences. Our method is based on a s…

    Submitted 7 March, 2023; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: ICASSP 2023

  11. arXiv:2106.00992  [pdf, other]

    cs.SD cs.AI eess.AS

    NVC-Net: End-to-End Adversarial Voice Conversion

    Authors: Bac Nguyen, Fabien Cardinaux

    Abstract: Voice conversion has gained increasing popularity in many applications of speech synthesis. The idea is to change the voice identity from one speaker into another while keeping the linguistic content unchanged. Many voice conversion approaches rely on the use of a vocoder to reconstruct the speech from acoustic features, and as a consequence, the speech quality heavily depends on such a vocoder. I…

    Submitted 2 June, 2021; originally announced June 2021.

  12. arXiv:2103.13322  [pdf, other]

    cs.CV cs.CC

    DNN Quantization with Attention

    Authors: Ghouthi Boukli Hacene, Lukas Mauch, Stefan Uhlich, Fabien Cardinaux

    Abstract: Low-bit quantization of network weights and activations can drastically reduce the memory footprint, complexity, energy consumption and latency of Deep Neural Networks (DNNs). However, low-bit quantization can also cause a considerable drop in accuracy, in particular when we apply it to complex learning tasks or lightweight DNN architectures. In this paper, we propose a training procedure that rel…

    Submitted 24 March, 2021; originally announced March 2021.

  13. arXiv:2102.06725  [pdf, other]

    cs.LG cs.CV

    Neural Network Libraries: A Deep Learning Framework Designed from Engineers' Perspectives

    Authors: Takuya Narihira, Javier Alonsogarcia, Fabien Cardinaux, Akio Hayakawa, Masato Ishii, Kazunori Iwaki, Thomas Kemp, Yoshiyuki Kobayashi, Lukas Mauch, Akira Nakamura, Yukio Obuchi, Andrew Shin, Kenji Suzuki, Stephen Tiedmann, Stefan Uhlich, Takuya Yashima, Kazuki Yoshiyama

    Abstract: While there exist a plethora of deep learning tools and frameworks, the fast-growing complexity of the field brings new demands and challenges, such as more flexible network design, speedy computation on distributed setting, and compatibility between different tools. In this paper, we introduce Neural Network Libraries (https://nnabla.org), a deep learning framework designed from engineer's perspe…

    Submitted 21 June, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: https://nnabla.org

  14. arXiv:2011.12043  [pdf, other]

    cs.LG cs.AI cs.NE

    Efficient Sampling for Predictor-Based Neural Architecture Search

    Authors: Lukas Mauch, Stephen Tiedemann, Javier Alonso Garcia, Bac Nguyen Cong, Kazuki Yoshiyama, Fabien Cardinaux, Thomas Kemp

    Abstract: Recently, predictor-based algorithms emerged as a promising approach for neural architecture search (NAS). For NAS, we typically have to calculate the validation accuracy of a large number of Deep Neural Networks (DNNs), which is computationally complex. Predictor-based NAS algorithms address this problem. They train a proxy model that can infer the validation accuracy of DNNs directly from their n…

    Submitted 24 November, 2020; originally announced November 2020.

  15. arXiv:2005.07810  [pdf, other]

    eess.AS cs.LG cs.SD

    Unsupervised Cross-Domain Speech-to-Speech Conversion with Time-Frequency Consistency

    Authors: Mohammad Asif Khan, Fabien Cardinaux, Stefan Uhlich, Marc Ferras, Asja Fischer

    Abstract: In recent years, generative adversarial network (GAN) based models have been successfully applied for unsupervised speech-to-speech conversion. The rich compact harmonic view of the magnitude spectrogram is considered a suitable choice for training these models with audio data. To reconstruct the speech signal, first a magnitude spectrogram is generated by the neural network, which is then utilized b…

    Submitted 18 May, 2020; v1 submitted 15 May, 2020; originally announced May 2020.

  16. Iteratively Training Look-Up Tables for Network Quantization

    Authors: Fabien Cardinaux, Stefan Uhlich, Kazuki Yoshiyama, Javier Alonso Garcia, Lukas Mauch, Stephen Tiedemann, Thomas Kemp, Akira Nakamura

    Abstract: Operating deep neural networks (DNNs) on devices with limited resources requires the reduction of their memory as well as computational footprint. Popular reduction methods are network quantization or pruning, which either reduce the word length of the network parameters or remove weights from the network if they are not needed. In this article we discuss a general framework for network reduction…

    Submitted 12 November, 2019; originally announced November 2019.

    Comments: Copyright 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  17. arXiv:1905.11452  [pdf]

    cs.LG cs.CV stat.ML

    Mixed Precision DNNs: All you need is a good parametrization

    Authors: Stefan Uhlich, Lukas Mauch, Fabien Cardinaux, Kazuki Yoshiyama, Javier Alonso Garcia, Stephen Tiedemann, Thomas Kemp, Akira Nakamura

    Abstract: Efficient deep neural network (DNN) inference on mobile or embedded devices typically involves quantization of the network parameters and activations. In particular, mixed precision networks achieve better performance than networks with homogeneous bitwidth for the same size constraint. Since choosing the optimal bitwidths is not straightforward, training methods, which can learn them, are desira…

    Submitted 22 May, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: International Conference on Learning Representations (ICLR) 2020; Source code at https://github.com/sony/ai-research-code

  18. arXiv:1811.05355  [pdf, ps, other]

    cs.LG stat.ML

    Iteratively Training Look-Up Tables for Network Quantization

    Authors: Fabien Cardinaux, Stefan Uhlich, Kazuki Yoshiyama, Javier Alonso García, Stephen Tiedemann, Thomas Kemp, Akira Nakamura

    Abstract: Operating deep neural networks on devices with limited resources requires the reduction of their memory footprints and computational requirements. In this paper we introduce a training method, called look-up table quantization, LUT-Q, which learns a dictionary and assigns each weight to one of the dictionary's values. We show that this method is very flexible and that many other techniques can be…

    Submitted 13 November, 2018; originally announced November 2018.

    Comments: NIPS 2018 workshop on Compact Deep Neural Networks with industrial applications

  19. arXiv:1807.02710  [pdf, other]

    cs.SD cs.LG eess.AS

    Improving DNN-based Music Source Separation using Phase Features

    Authors: Joachim Muth, Stefan Uhlich, Nathanael Perraudin, Thomas Kemp, Fabien Cardinaux, Yuki Mitsufuji

    Abstract: Music source separation with deep neural networks typically relies only on amplitude features. In this paper we show that additional phase features can improve the separation performance. Using the theoretical relationship between STFT phase and amplitude, we conjecture that derivatives of the phase are a good feature representation opposed to the raw phase. We verify this conjecture experimentall…

    Submitted 16 July, 2018; v1 submitted 7 July, 2018; originally announced July 2018.

    Comments: 7 pages, 9 figures, Joint Workshop on Machine Learning for Music at ICML, IJCAI/ECAI and AAMAS, 2018

  20. Structure of marginally jammed polydisperse packings of frictionless spheres

    Authors: Chi Zhang, Cathal B. O'Donovan, Eric I. Corwin, Frédéric Cardinaux, Thomas G. Mason, Matthias E. Möbius, Frank Scheffold

    Abstract: We model the packing structure of a marginally jammed bulk ensemble of polydisperse spheres using an extended granocentric model explicitly taking into account rattlers. This leads to a relationship between the characteristic parameters of the packing, such as the mean number of neighbors and the fraction of rattlers, and the radial distribution function g(r). We find excellent agreement between…

    Submitted 2 November, 2014; originally announced November 2014.

    Comments: submitted

    Journal ref: Phys. Rev. E 91, 032302 (2015)

  21. Accounting for inertia effects to access the high-frequency microrheology of viscoelastic fluids

    Authors: P. Domínguez-García, Frédéric Cardinaux, Elena Bertseva, László Forró, Frank Scheffold, Sylvia Jeney

    Abstract: We study the Brownian motion of microbeads immersed in water and in a viscoelastic wormlike micelles solution by optical trapping interferometry and diffusing wave spectroscopy. Through the mean-square displacement obtained from both techniques, we deduce the mechanical properties of the fluids at high frequencies by explicitly accounting for inertia effects of the particle and the surrounding flu…

    Submitted 3 December, 2014; v1 submitted 18 August, 2014; originally announced August 2014.

    Journal ref: Physical Review E 90, 060301(R), 2014

  22. arXiv:1305.5182  [pdf, other]

    cond-mat.soft cond-mat.mtrl-sci cond-mat.stat-mech

    Linear and nonlinear rheology of dense emulsions: Identifying the glass and jamming regimes

    Authors: Frank Scheffold, Frédéric Cardinaux, Thomas G. Mason

    Abstract: We discuss the linear and non-linear rheology of concentrated (sub)microscale emulsions, amorphous disordered solids composed of repulsive and deformable soft colloidal spheres. Based on recent results from simulation and theory, we derive quantitative predictions for the dependences of the elastic shear modulus and the yield stress on the droplet volume fraction. The remarkable agreement with exp…

    Submitted 22 May, 2013; originally announced May 2013.

    Comments: submitted

    Journal ref: J. Phys.: Condens. Matter 25, 502101 (2013)

  23. arXiv:1209.3362  [pdf]

    physics.data-an cond-mat.soft physics.optics

    Quasi-real-time analysis of dynamic near field scattering data using a graphics processing unit

    Authors: Giovanni Cerchiari, Fabrizio Croccolo, Frédéric Cardinaux, Frank Scheffold

    Abstract: We present an implementation of the analysis of dynamic near field scattering (NFS) data using a graphics processing unit (GPU). We introduce an optimized data management scheme thereby limiting the number of operations required. Overall, we reduce the processing time from hours to minutes, for typical experimental conditions. Previously the limiting step in such experiments, the processing time i…

    Submitted 15 September, 2012; originally announced September 2012.

    Comments: accepted for publication in Review of Scientific Instruments (Note), supplementary material not included

    Journal ref: Rev. Sci. Instrum. 83, 106101 (2012)

  24. arXiv:1112.2510  [pdf, ps, other]

    cond-mat.soft physics.bio-ph physics.chem-ph physics.med-ph

    Effect of glycerol and dimethyl sulfoxide on the phase behavior of lysozyme: Theory and experiments

    Authors: Christoph Gögelein, Dana Wagner, Frédéric Cardinaux, Gerhard Nägele, Stefan U. Egelhaaf

    Abstract: Salt, glycerol and dimethyl sulfoxide (DMSO) are used to modify the properties of protein solutions. We experimentally determined the effect of these additives on the phase behavior of lysozyme solutions. Upon the addition of glycerol and DMSO, the fluid-solid transition and the gas-liquid coexistence curve (binodal) shift to lower temperatures and the gap between them increases. The experimentall…

    Submitted 23 December, 2011; v1 submitted 12 December, 2011; originally announced December 2011.

    Comments: Manuscript accepted for publication in The Journal of Chemical Physics

    Journal ref: The Journal of Chemical Physics 136: 015102, 2012

  25. arXiv:1101.4447  [pdf, other]

    cond-mat.soft

    Phase separation and dynamical arrest for particles interacting with mixed potentials--the case of globular proteins revisited

    Authors: Thomas Gibaud, Frederic Cardinaux, Johan Bergenholtz, Anna Stradner, Peter Schurtenberger

    Abstract: We examine the applicability of the extended law of corresponding states (ELCS) to equilibrium and non equilibrium features of the state diagram of the globular protein lysozyme. We provide compelling evidence that the ELCS correctly reproduces the location of the binodal for different ionic strengths, but fails in describing the location of the arrest line. We subsequently use Mode Coupling Theor…

    Submitted 24 January, 2011; originally announced January 2011.

    Journal ref: Soft Matter 7, 857 (2011)

  26. arXiv:0902.0310  [pdf]

    cond-mat.soft cond-mat.mtrl-sci

    Interplay between Spinodal Decomposition and Glass Formation in Proteins Exhibiting Short-Range Attractions

    Authors: Frederic Cardinaux, Thomas Gibaud, Anna Stradner, Peter Schurtenberger

    Abstract: We investigate the competition between spinodal decomposition and dynamical arrest using aqueous solutions of the globular protein lysozyme as a model system for colloids with short-range attractions. We show that quenches below a temperature Ta lead to gel formation as a result of a local arrest of the protein-dense phase during spinodal decomposition. The rheological properties of these gels al…

    Submitted 2 February, 2009; originally announced February 2009.

    Journal ref: PRL 99, 118301 (2007)

  27. arXiv:0812.0952  [pdf, ps, other]

    cond-mat.soft

    Rheology, Structure and Dynamics of Colloid-Polymer Mixtures: from Liquids to Gels

    Authors: M. Laurati, G. Petekidis, N. Koumakis, F. Cardinaux, A. B. Schofield, J. M. Brader, M. Fuchs, S. U. Egelhaaf

    Abstract: We investigated the viscoelastic properties of colloid-polymer mixtures at intermediate colloid volume fraction and varying polymer concentrations, thereby tuning the attractive interactions. Within the examined range of polymer concentrations, the samples ranged from fluids to gels. Already in the liquid phase the viscoelastic properties significantly changed when approaching the gelation bound…

    Submitted 4 December, 2008; originally announced December 2008.

    Comments: 34 pages, 11 figures, To be submitted to JCP

  28. Modeling Equilibrium Clusters in Lysozyme Solutions

    Authors: Frédéric Cardinaux, Anna Stradner, Peter Schurtenberger, Francesco Sciortino, Emanuela Zaccarelli

    Abstract: We present a combined experimental and numerical study of the equilibrium cluster formation in globular protein solutions under no-added salt conditions. We show that a cluster phase emerges as a result of a competition between a long-range screened Coulomb repulsion and a short-range attraction. A simple effective potential, in which only depth and width of the attractive part of the potential…

    Submitted 4 August, 2006; v1 submitted 11 July, 2006; originally announced July 2006.

    Comments: 4 pages, 4 figures

    Journal ref: Europhys. Lett. 77, 48004 (2007)

  29. Multi-speckle diffusing wave spectroscopy with a single mode detection scheme

    Authors: P. Zakharov, F. Cardinaux, F. Scheffold

    Abstract: We present a detection scheme for diffusing wave spectroscopy (DWS) based on a two cell geometry that allows efficient ensemble averaging. This is achieved by putting a fast rotating diffuser in the optical path between laser and sample. We show that the recorded (multi-speckle) correlation echoes provide an ensemble averaged signal that does not require additional time averaging. We find the pe…

    Submitted 27 September, 2005; v1 submitted 26 September, 2005; originally announced September 2005.

    Comments: Submitted to PRE