-
Cooperative learning of Pl@ntNet's Artificial Intelligence algorithm: how does it work and how can we improve it?
Authors:
Tanguy Lefort,
Antoine Affouard,
Benjamin Charlier,
Jean-Christophe Lombardo,
Mathias Chouet,
Hervé Goëau,
Joseph Salmon,
Pierre Bonnet,
Alexis Joly
Abstract:
Deep learning models for plant species identification rely on large annotated datasets. The PlantNet system enables global data collection by allowing users to upload and annotate plant observations, leading to noisy labels due to diverse user skills. Achieving consensus is crucial for training, but the vast scale of collected data makes traditional label aggregation strategies challenging. Existing methods either retain all observations, resulting in noisy training data, or selectively keep those with sufficient votes, discarding valuable information. Additionally, as many species are rarely observed, user expertise cannot be evaluated as an inter-user agreement: otherwise, botanical experts would have a lower weight in the AI training step than the average user. Our proposed label aggregation strategy aims to cooperatively train plant identification AI models. This strategy estimates user expertise as a trust score per user based on their ability to identify plant species from crowdsourced data. The trust score is recursively estimated from correctly identified species given the current estimated labels. This interpretable score exploits botanical experts' knowledge and the heterogeneity of users. Subsequently, our strategy removes unreliable observations but retains those with limited trusted annotations, unlike other approaches. We evaluate PlantNet's strategy on a released large subset of the PlantNet database focused on European flora, comprising over 6M observations and 800K users. We demonstrate that estimating users' skills based on the diversity of their expertise enhances labeling performance. Our findings emphasize the synergy of human annotation and data filtering in improving AI performance for a refined dataset. We explore incorporating AI-based votes alongside human input. This can further enhance human-AI interactions to detect unreliable observations.
Submitted 5 June, 2024;
originally announced June 2024.
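The recursive trust estimation described in this abstract can be illustrated with a small sketch. It is not the PlantNet implementation: the function name `aggregate_with_trust`, the update rule (trust = 1 + number of distinct species identified in agreement with the current labels), and the toy votes are all assumptions chosen only to show why weighting by diversity of expertise can let a single expert outvote a majority.

```python
from collections import defaultdict


def aggregate_with_trust(votes, n_iter=10):
    """Alternate between weighted voting and trust re-estimation.

    votes: list of (user, observation, species) triples.
    Returns (labels, trust). The trust rule below is a hypothetical stand-in.
    """
    users = {u for u, _, _ in votes}
    trust = {u: 1.0 for u in users}  # every user starts with the same weight
    labels = {}
    for _ in range(n_iter):
        # Step 1: weighted vote per observation under the current trust scores.
        scores = defaultdict(lambda: defaultdict(float))
        for user, obs, species in votes:
            scores[obs][species] += trust[user]
        new_labels = {obs: max(sorted(s), key=s.get) for obs, s in scores.items()}
        # Step 2 (assumed rule): trust grows with the number of distinct
        # species a user identified in agreement with the current labels.
        correct = defaultdict(set)
        for user, obs, species in votes:
            if new_labels[obs] == species:
                correct[user].add(species)
        trust = {u: 1.0 + len(correct[u]) for u in users}
        if new_labels == labels:  # labels stopped changing
            break
        labels = new_labels
    return labels, trust


# Toy data: one expert labels four rare species alone, then disagrees
# with two novices on observation "o5".
votes = [
    ("expert", "o1", "A"), ("expert", "o2", "B"),
    ("expert", "o3", "C"), ("expert", "o4", "F"),
    ("expert", "o5", "D"), ("n1", "o5", "E"), ("n2", "o5", "E"),
]
labels, trust = aggregate_with_trust(votes)
print(labels["o5"], trust)
```

On this toy input a plain majority vote would label "o5" as "E"; once trust reflects the expert's four uncontested species, the weighted vote flips to the expert's answer.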
-
Directional drying of a colloidal dispersion: quantitative description with water potential measurements using water clusters in a poly(dimethylsiloxane) microfluidic chip
Authors:
Hrishikesh Pingulkar,
Sonia Maréchal,
Jean-Baptiste Salmon
Abstract:
We have developed a poly(dimethylsiloxane) (PDMS) microfluidic chip to study the directional drying of a colloidal dispersion confined in a channel. Our measurements on a dispersion of silica nanoparticles once again revealed the phenomenology commonly observed for such systems: the formation of a porous solid with linear growth in the channel at short times, slowing down at longer times as the evaporation rate decreases. The growth of the solid is also accompanied by mechanical stresses that are released by the delamination of the solid from the channel walls and the formation of cracks. In addition to these observations, we report original measurements using hydrophilic filler in the PDMS formulation used (Sylgard-184). When the PDMS matrix is in contact with water, water molecules pool around these hydrophilic sites, resulting in the formation of microscopic water clusters whose size depends on the water potential $ψ$. In our work, we have used these water clusters to estimate the water potential profile in the channel as the porous solid grows. Using a transport model that also takes into account solid delamination in the channel, we then linked these water potential measurements to the hydraulic permeability of the porous solid. These measurements finally enabled us to show that the slowdown in the evaporation rate is due to the invasion of the porous solid by air/water nanomenisci at a critical capillary pressure $ψ_\text{cap}$.
Submitted 10 January, 2024;
originally announced January 2024.
-
Angular momentum transport by magnetic fields in main sequence stars with Gamma Doradus pulsators
Authors:
F. D. Moyano,
P. Eggenberger,
S. J. A. J. Salmon,
J. S. G. Mombarg,
S. Ekström
Abstract:
Context. Asteroseismic studies showed that cores of post main-sequence stars rotate more slowly than theoretically predicted by stellar models with purely hydrodynamical transport processes. Recent studies on main sequence stars, particularly Gamma Doradus ($γ$ Dor) stars, revealed the internal rotation rates of hundreds of stars, offering a counterpart on the main sequence for studies of angular momentum transport. Aims. We investigate whether such a disagreement between observed and predicted internal rotation rates is present in main sequence stars by studying angular momentum transport in $γ$ Dor stars. Furthermore, we test whether models of rotating stars with internal magnetic fields can reproduce their rotational properties. Methods. We compute rotating models with the Geneva stellar evolution code taking into account meridional circulation and the shear instability. We also compute models with internal magnetic fields using a general formalism for transport by the Tayler-Spruit dynamo. We then compare these models to observational constraints for $γ$ Dor stars that we compiled from the literature, thereby combining the core rotation rates, projected rotational velocities from spectroscopy, and constraints on their fundamental parameters. Results. We show that combining the different observational constraints available for $γ$ Dor stars makes it possible to clearly distinguish the different scenarios for internal angular momentum transport. Stellar models with purely hydrodynamical processes are in disagreement with the data whereas models with internal magnetic fields can reproduce both core and surface constraints simultaneously. Conclusions. Similarly to results obtained for subgiant and red giant stars, angular momentum transport in radiative regions of $γ$ Dor stars is highly efficient, in good agreement with predictions of models with internal magnetic fields.
Submitted 3 July, 2023; v1 submitted 2 April, 2023;
originally announced April 2023.
-
A two-head loss function for deep Average-K classification
Authors:
Camille Garcin,
Maximilien Servajean,
Alexis Joly,
Joseph Salmon
Abstract:
Average-K classification is an alternative to top-K classification in which the number of labels returned varies with the ambiguity of the input image but must average to K over all the samples. A simple method to solve this task is to threshold the softmax output of a model trained with the cross-entropy loss. This approach is theoretically proven to be asymptotically consistent, but it is not guaranteed to be optimal for a finite set of samples. In this paper, we propose a new loss function based on a multi-label classification head in addition to the classical softmax. This second head is trained using pseudo-labels generated by thresholding the softmax head while guaranteeing that K classes are returned on average. We show that this approach allows the model to better capture ambiguities between classes and, as a result, to return more consistent sets of possible classes. Experiments on two datasets from the literature demonstrate that our approach outperforms the softmax baseline, as well as several other loss functions more generally designed for weakly supervised multi-label classification. The gains are larger the higher the uncertainty, especially for classes with few samples.
Submitted 31 March, 2023;
originally announced March 2023.
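The softmax-thresholding baseline mentioned in this abstract can be sketched in a few lines. This is only the simple baseline, not the two-head loss the paper proposes; the helper names (`average_k_threshold`, `predict_sets`) and the toy logits are assumptions for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_k_threshold(probs, k):
    """Global threshold such that k labels are returned per sample on average.

    probs: list of per-sample probability lists; k: integer label budget.
    Ties at the threshold may return slightly more than k labels on average.
    """
    flat = sorted((p for row in probs for p in row), reverse=True)
    budget = k * len(probs)  # total number of labels to return
    return flat[budget - 1]


def predict_sets(probs, threshold):
    """Return, for each sample, the classes whose probability clears the threshold."""
    return [[c for c, p in enumerate(row) if p >= threshold] for row in probs]


# One confident sample and one ambiguous sample, with K = 2 on average.
probs = [softmax([5.0, 1.0, 0.0]), softmax([2.0, 1.9, 1.8])]
tau = average_k_threshold(probs, k=2)
print(predict_sets(probs, tau))
```

The confident sample receives a single label and the ambiguous one receives three, so the set size adapts to ambiguity while the average stays at K = 2.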
-
Identify ambiguous tasks combining crowdsourced labels by weighting Areas Under the Margin
Authors:
Tanguy Lefort,
Benjamin Charlier,
Alexis Joly,
Joseph Salmon
Abstract:
In supervised learning - for instance in image classification - modern massive datasets are commonly labeled by a crowd of workers. The obtained labels in this crowdsourcing setting are then aggregated for training, generally leveraging a per-worker trust score. Yet, such worker-oriented approaches discard the tasks' ambiguity. Ambiguous tasks might fool expert workers, which is often harmful for the learning step. In standard supervised learning settings - with one label per task - the Area Under the Margin (AUM) was tailored to identify mislabeled data. We adapt the AUM to identify ambiguous tasks in crowdsourced learning scenarios, introducing the Weighted Areas Under the Margin (WAUM). The WAUM is an average of AUMs weighted according to task-dependent scores. We show that the WAUM can help discard ambiguous tasks from the training set, leading to better generalization performance. We report improvements over existing strategies for learning with a crowd, both on simulated settings and on real datasets such as CIFAR-10H (a crowdsourced dataset with a high number of answered labels), LabelMe and Music (two datasets with few answered votes).
Submitted 30 November, 2023; v1 submitted 30 September, 2022;
originally announced September 2022.
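As a rough illustration of the quantities named in this abstract, the sketch below computes an AUM as the margin (score of the assigned label minus the largest other score) averaged over training epochs, then a weighted average of per-worker AUMs. How the task-dependent weights are actually obtained is not stated in the abstract, so `weights` is a hypothetical input here, as are the function names and toy scores.

```python
def margin(scores, label):
    """Score of the assigned label minus the largest score among the other classes."""
    other = max(s for c, s in enumerate(scores) if c != label)
    return scores[label] - other


def aum(scores_per_epoch, label):
    """Area Under the Margin: the margin averaged over recorded training epochs."""
    return sum(margin(s, label) for s in scores_per_epoch) / len(scores_per_epoch)


def waum(scores_per_epoch, worker_labels, weights):
    """Weighted average of per-worker AUMs for one task.

    worker_labels: {worker: label given to this task}
    weights: {worker: non-negative score}; assumed given, not derived here.
    """
    total = sum(weights[w] for w in worker_labels)
    return sum(
        weights[w] * aum(scores_per_epoch, lab) for w, lab in worker_labels.items()
    ) / total


# Classifier scores for one task at two epochs; two workers disagree on its label.
scores_per_epoch = [[2.0, 1.0, 0.0], [3.0, 1.0, 0.0]]
print(waum(scores_per_epoch, {"a": 0, "b": 1}, {"a": 3.0, "b": 1.0}))
```

A low value flags a task whose assigned labels the classifier consistently struggles to separate, which is the kind of task the abstract proposes to discard.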
-
Microfluidic free interface diffusion: measurement of diffusion coefficients and evidence of interfacial-driven transport phenomena
Authors:
Hoang-Thanh Nguyen,
Anne Bouchaudy,
Jean-Baptiste Salmon
Abstract:
We have developed a microfluidic tool to measure the diffusion coefficient $D$ of solutes in an aqueous solution, by following the temporal relaxation of an initially steep concentration gradient in a microchannel. Our chip exploits multilayer soft lithography and the opening of a pneumatic microvalve to trigger the interdiffusion of pure water and the solution initially separated in the channel by the valve, the so-called free interface diffusion technique. Another microvalve at a distance from the diffusion zone closes the channel and thus suppresses convection. Using this chip, we have measured diffusion coefficients of solutes in water with a broad size range, from small molecules to polymers and colloids, with values in the range $D \in [10^{-13}- 10^{-9}]$~m$^2$/s. The same experiments but with added colloidal tracers also revealed diffusio-phoresis and diffusio-osmosis phenomena due to the presence of the solute concentration gradient. We nevertheless show that these interfacial-driven transport phenomena do not affect the measurements of the solute diffusion coefficients in the explored concentration range.
Submitted 22 August, 2022;
originally announced August 2022.
-
High-Dimensional Private Empirical Risk Minimization by Greedy Coordinate Descent
Authors:
Paul Mangold,
Aurélien Bellet,
Joseph Salmon,
Marc Tommasi
Abstract:
In this paper, we study differentially private empirical risk minimization (DP-ERM). It has been shown that the worst-case utility of DP-ERM degrades polynomially as the dimension increases. This is a major obstacle to privately learning large machine learning models. In high dimension, it is common for some of a model's parameters to carry more information than others. To exploit this, we propose a differentially private greedy coordinate descent (DP-GCD) algorithm. At each iteration, DP-GCD privately performs a coordinate-wise gradient step along the gradient's (approximately) greatest entry. We show theoretically that DP-GCD can achieve a logarithmic dependence on the dimension for a wide range of problems by naturally exploiting their structural properties (such as quasi-sparse solutions). We illustrate this behavior numerically, both on synthetic and real datasets.
Submitted 9 April, 2023; v1 submitted 4 July, 2022;
originally announced July 2022.
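The iteration described in this abstract (pick the approximately largest gradient entry, then take a step along that coordinate) can be sketched on a least-squares problem. This is an illustration of the general idea only: `noise_scale` is a free parameter that is not calibrated to any privacy budget, so the code carries no differential privacy guarantee, and the function names, objective, and step sizes are assumptions rather than the authors' algorithm.

```python
import random


def laplace(rng):
    """Standard Laplace sample as the difference of two unit exponentials."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def loss(X, y, w):
    """Least-squares objective f(w) = ||Xw - y||^2 / (2n)."""
    n = len(X)
    return sum((sum(xi * wi for xi, wi in zip(row, w)) - yi) ** 2
               for row, yi in zip(X, y)) / (2 * n)


def noisy_greedy_cd(X, y, n_iter=50, noise_scale=0.0, seed=0):
    """Greedy coordinate descent with Laplace-perturbed gradients.

    Assumes X has no all-zero column. noise_scale is NOT tied to a privacy budget.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    # Coordinate-wise smoothness constants L_j = ||X_j||^2 / n.
    L = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        residual = [sum(X[i][j] * w[j] for j in range(d)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * residual[i] for i in range(n)) / n for j in range(d)]
        # Noisy selection of the (approximately) largest gradient entry.
        select = [g + noise_scale * laplace(rng) for g in grad]
        j = max(range(d), key=lambda c: abs(select[c]))
        # Noisy gradient step on that single coordinate, with fresh noise.
        w[j] -= (grad[j] + noise_scale * laplace(rng)) / L[j]
    return w


# Sparse ground truth w* = [2, 0, 0]; only one coordinate carries signal.
X = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [2.0, 0.0, 2.0, 0.0]
w = noisy_greedy_cd(X, y, n_iter=200)
print(w, loss(X, y, w))
```

With `noise_scale=0.0` each update exactly minimizes the objective along the chosen coordinate; increasing it shows how noise perturbs both the selection and the step.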
-
Benchopt: Reproducible, efficient and collaborative optimization benchmarks
Authors:
Thomas Moreau,
Mathurin Massias,
Alexandre Gramfort,
Pierre Ablin,
Pierre-Antoine Bannier,
Benjamin Charlier,
Mathieu Dagréou,
Tom Dupré la Tour,
Ghislain Durif,
Cassio F. Dantas,
Quentin Klopfenstein,
Johan Larsson,
En Lai,
Tanguy Lefort,
Benoit Malézieux,
Badr Moufad,
Binh T. Nguyen,
Alain Rakotomamonjy,
Zaccharie Ramzi,
Joseph Salmon,
Samuel Vaiter
Abstract:
Numerical validation is at the core of machine learning research as it makes it possible to assess the actual impact of new methods, and to confirm the agreement between theory and practice. Yet, the rapid development of the field poses several challenges: researchers are confronted with a profusion of methods to compare, limited transparency and consensus on best practices, as well as tedious re-implementation work. As a result, validation is often very partial, which can lead to wrong conclusions that slow down the progress of research. We propose Benchopt, a collaborative framework to automate, reproduce and publish optimization benchmarks in machine learning across programming languages and hardware architectures. Benchopt simplifies benchmarking for the community by providing an off-the-shelf tool for running, sharing and extending experiments. To demonstrate its broad usability, we showcase benchmarks on three standard learning tasks: $\ell_2$-regularized logistic regression, Lasso, and ResNet18 training for image classification. These benchmarks highlight key practical findings that give a more nuanced view of the state-of-the-art for these problems, showing that for practical evaluation, the devil is in the details. We hope that Benchopt will foster collaborative work in the community, hence improving the reproducibility of research findings.
Submitted 28 October, 2022; v1 submitted 27 June, 2022;
originally announced June 2022.
-
Asteroseismology of evolved stars to constrain the internal transport of angular momentum. V. Efficiency of the transport on the red giant branch and in the red clump
Authors:
F. D. Moyano,
P. Eggenberger,
G. Meynet,
C. Gehan,
B. Mosser,
G. Buldgen,
S. J. A. J. Salmon
Abstract:
Thanks to asteroseismology, constraints on the core rotation rate are available for hundreds of low- and intermediate-mass stars in evolved phases. Current physical processes tested in stellar evolution models cannot reproduce the evolution of these core rotation rates. We investigate the efficiency of the internal angular momentum redistribution in red giants during the hydrogen shell and core-helium burning phases based on the asteroseismic determinations of their core rotation rates. We compute stellar evolution models with rotation and model the transport of angular momentum by the action of a sole dominant diffusive process parametrized by an additional viscosity. We constrain the values of this viscosity to match the mean core rotation rates of red giants and their behaviour with mass and evolution along the red giant branch and in the red clump. For red giants in the hydrogen shell-burning phase the transport of angular momentum must be more efficient in more massive stars. The additional viscosity is found to vary by approximately two orders of magnitude in the mass range M $\sim$ 1 - 2.5 M$_{\odot}$. As stars evolve along the red giant branch, the efficiency of the internal transport of angular momentum must increase for low-mass stars (M $\lesssim$ 2 M$_{\odot}$) and remain approximately constant for slightly higher masses (2.0 M$_{\odot}$ $\lesssim$ M $\lesssim$ 2.5 M$_{\odot}$). In red-clump stars, the additional viscosities must be an order of magnitude higher than in younger red giants of similar mass during the hydrogen shell-burning phase. In combination with previous efforts, we obtain a clear picture of how the physical processes acting in stellar interiors should redistribute angular momentum from the end of the main sequence until the core-helium burning phase for low- and intermediate-mass stars to satisfy the asteroseismic constraints.
Submitted 6 May, 2022;
originally announced May 2022.
-
Thorough characterisation of the 16 Cygni system. Part II. Seismic inversions of the internal structure
Authors:
G. Buldgen,
M. Farnir,
P. Eggenberger,
J. Bétrisey,
C. Pezzotti,
C. Pinçon,
M. Deal,
S. J. A. J. Salmon
Abstract:
The advent of space-based photometry observations provided high-quality asteroseismic data for a large number of stars. These observations enabled the adaptation of advanced techniques, until then restricted to helioseismology, to study the best asteroseismic targets. Amongst these, the 16Cyg binary system holds a special place, being the brightest solar twins observed by Kepler. For this system, modellers have access to high-quality asteroseismic, spectroscopic and interferometric data, making it the perfect testbed for the limitations of stellar models. We aim to further constrain the internal structure and fundamental parameters of 16CygA&B using linear seismic inversion techniques of both global indicators and localised corrections of the internal structure. We start from the models defined by detailed modelling in our previous paper and extend our analysis by applying variational inversions to these models. We carried out inversions of so-called seismic indicators and provided local corrections of the internal structure of the two stars. Our results indicate that linear seismic inversions alone are not able to discriminate between standard and non-standard models for 16CygA&B. We confirm the results of our previous studies that used linear inversion techniques, but consider that the differences could be linked to small fundamental parameters variations rather than to a missing process in the models. We confirm the robustness and reliability of the results of the modelling performed in our previous paper. We conclude that non-linear inversions are likely required to further investigate the properties of 16CygA&B from a seismic point of view, but that these inversions should be coupled to analyses of the depletion of light elements such as lithium and beryllium to constrain the macroscopic transport of chemicals and potential non-standard evolutionary paths.
Submitted 21 February, 2022;
originally announced February 2022.
-
Stochastic smoothing of the top-K calibrated hinge loss for deep imbalanced classification
Authors:
Camille Garcin,
Maximilien Servajean,
Alexis Joly,
Joseph Salmon
Abstract:
In modern classification tasks, the number of labels is getting larger and larger, as is the size of the datasets encountered in practice. As the number of classes increases, class ambiguity and class imbalance become more and more problematic to achieve high top-1 accuracy. Meanwhile, Top-K metrics (metrics allowing K guesses) have become popular, especially for performance reporting. Yet, proposing top-K losses tailored for deep learning remains a challenge, both theoretically and practically. In this paper we introduce a stochastic top-K hinge loss inspired by recent developments on top-K calibrated losses. Our proposal is based on the smoothing of the top-K operator building on the flexible "perturbed optimizer" framework. We show that our loss function performs very well in the case of balanced datasets, while benefiting from a significantly lower computational time than the state-of-the-art top-K loss function. In addition, we propose a simple variant of our loss for the imbalanced case. Experiments on a heavy-tailed dataset show that our loss function significantly outperforms other baseline loss functions.
Submitted 17 July, 2022; v1 submitted 4 February, 2022;
originally announced February 2022.
-
Microfluidic osmotic compression of a charge-stabilized colloidal dispersion: Equation of state and collective diffusion coefficient
Authors:
Camille Keita,
Yannick Hallez,
Jean-Baptiste Salmon
Abstract:
We show, using a model coupling mass transport and liquid theory calculations for a charge-stabilized colloidal dispersion, that diffusion significantly limits measurement times of its Equation Of State (EOS), osmotic pressure vs composition, using the osmotic compression technique. Following this result, we present a microfluidic chip allowing one to measure the entire EOS of a charged dispersion at the nanoliter scale in a few hours. We also show that time-resolved analyses of relaxation to equilibrium in this microfluidic experiment lead to direct estimates of the collective diffusion coefficient of the dispersion in Donnan equilibrium with a salt reservoir.
Submitted 2 February, 2022;
originally announced February 2022.
-
Electromagnetic neural source imaging under sparsity constraints with SURE-based hyperparameter tuning
Authors:
Pierre-Antoine Bannier,
Quentin Bertrand,
Joseph Salmon,
Alexandre Gramfort
Abstract:
Estimators based on non-convex sparsity-promoting penalties were shown to yield state-of-the-art solutions to the magneto-/electroencephalography (M/EEG) brain source localization problem. In this paper we tackle the model selection problem of these estimators: we propose to use a proxy of the Stein's Unbiased Risk Estimator (SURE) to automatically select their regularization parameters. The effectiveness of the method is demonstrated on realistic simulations and $30$ subjects from the Cam-CAN dataset. To our knowledge, this is the first time that sparsity promoting estimators are automatically calibrated at such a scale. Results show that the proposed SURE approach outperforms cross-validation strategies and state-of-the-art Bayesian statistics methods both computationally and statistically.
Submitted 28 October, 2021;
originally announced December 2021.
-
Supervised learning of analysis-sparsity priors with automatic differentiation
Authors:
Hashem Ghanem,
Joseph Salmon,
Nicolas Keriven,
Samuel Vaiter
Abstract:
Sparsity priors are commonly used in denoising and image reconstruction. For analysis-type priors, a dictionary defines a representation of signals that is likely to be sparse. In most situations, this dictionary is not known, and is to be recovered from pairs of ground-truth signals and measurements, by minimizing the reconstruction error. This defines a hierarchical optimization problem, which can be cast as a bi-level optimization. Yet, this problem is unsolvable, as reconstructions and their derivative wrt the dictionary have no closed-form expression. However, reconstructions can be iteratively computed using the Forward-Backward splitting (FB) algorithm. In this paper, we approximate reconstructions by the output of the aforementioned FB algorithm. Then, we leverage automatic differentiation to evaluate the gradient of this output wrt the dictionary, which we learn with projected gradient descent. Experiments show that our algorithm successfully learns the 1D Total Variation (TV) dictionary from piecewise constant signals. For the same case study, we propose to constrain our search to dictionaries of 0-centered columns, which removes undesired local minima and improves numerical stability.
Submitted 15 December, 2021;
originally announced December 2021.
-
Kepler-93: a testbed for detailed seismic modelling and orbital evolution of super-earths around solar-like stars
Authors:
J. Bétrisey,
C. Pezzotti,
G. Buldgen,
S. Khan,
P. Eggenberger,
S. J. A. J. Salmon,
A. Miglio
Abstract:
The advent of space-based photometry missions such as CoRoT, Kepler and TESS has sparked the development of asteroseismology and exoplanetology. The advent of PLATO will further strengthen such multi-disciplinary studies. Testing asteroseismic modelling and its importance for our understanding of planetary systems is crucial. We carried out a detailed modelling of Kepler-93, an exoplanet host star observed by Kepler. This star is particularly interesting as it is very similar to the PLATO benchmark target (G spectral type, ~ 6000K, ~ 1 Msun and ~ 1 Rsun) and provides a real-life testbed for potential procedures to be used for PLATO. We use global and local minimization techniques for the seismic modelling of Kepler-93, varying the ingredients of our stellar models. We compute seismic inversions of the mean density. We use these revised stellar parameters to provide new planetary parameters and simulate the orbital evolution of the system under the effects of tides and atmospheric evaporation. Our fundamental parameters for Kepler-93 are: mean density = 1.654 +/- 0.004 g/cm3, M = 0.907 +/- 0.023 Msun, R = 0.918 +/- 0.008 Rsun and Age = 6.78 +/- 0.32 Gyr. The uncertainties we report for this benchmark are within the requirements of PLATO. For the exoplanet Kepler-93b, we find Mp = 4.01 +/- 0.67 Mearth, Rp = 1.478 +/- 0.014 Rearth and semi-major axis a = 0.0533 +/- 0.0005 AU. According to our simulations, it seems unlikely that Kepler-93b formed with a mass large enough to be impacted by stellar tides. For the benchmark of PLATO, detailed asteroseismic modelling procedures will be able to provide fundamental stellar parameters within the requirements. We illustrate what synergies can be achieved regarding the orbital evolution and atmospheric evaporation of exoplanets. We note the importance of high-quality radial velocity follow-up to constrain the formation scenarios of exoplanets.
Submitted 25 November, 2021;
originally announced November 2021.
-
LassoBench: A High-Dimensional Hyperparameter Optimization Benchmark Suite for Lasso
Authors:
Kenan Šehić,
Alexandre Gramfort,
Joseph Salmon,
Luigi Nardi
Abstract:
While Weighted Lasso sparse regression has appealing statistical guarantees that would entail a major real-world impact in finance, genomics, and brain imaging applications, it is typically scarcely adopted due to its complex high-dimensional space composed of thousands of hyperparameters. On the other hand, the latest progress with high-dimensional hyperparameter optimization (HD-HPO) methods for black-box functions demonstrates that high-dimensional applications can indeed be efficiently optimized. Despite this initial success, HD-HPO approaches are mostly applied to synthetic problems with a moderate number of dimensions, which limits their impact in scientific and engineering applications. We propose LassoBench, the first benchmark suite tailored for Weighted Lasso regression. LassoBench consists of benchmarks for both well-controlled synthetic setups (number of samples, noise level, ambient and effective dimensionalities, and multiple fidelities) and real-world datasets, which enables many flavors of HPO algorithms to be studied and extended to the high-dimensional Lasso setting. We evaluate 6 state-of-the-art HPO methods and 3 Lasso baselines, and demonstrate that Bayesian optimization and evolutionary strategies can improve over the methods commonly used for sparse regression while highlighting limitations of these frameworks in very high-dimensional and noisy settings.
Submitted 10 June, 2022; v1 submitted 4 November, 2021;
originally announced November 2021.
-
Differentially Private Coordinate Descent for Composite Empirical Risk Minimization
Authors:
Paul Mangold,
Aurélien Bellet,
Joseph Salmon,
Marc Tommasi
Abstract:
Machine learning models can leak information about the data used to train them. To mitigate this issue, Differentially Private (DP) variants of optimization algorithms like Stochastic Gradient Descent (DP-SGD) have been designed to trade off utility for privacy in Empirical Risk Minimization (ERM) problems. In this paper, we propose Differentially Private proximal Coordinate Descent (DP-CD), a new method to solve composite DP-ERM problems. We derive utility guarantees through a novel theoretical analysis of inexact coordinate descent. Our results show that, thanks to larger step sizes, DP-CD can exploit imbalance in gradient coordinates to outperform DP-SGD. We also prove new lower bounds for composite DP-ERM under coordinate-wise regularity assumptions that are nearly matched by DP-CD. For practical implementations, we propose to clip gradients using coordinate-wise thresholds that emerge from our theory, avoiding costly hyperparameter tuning. Experiments on real and synthetic data support our results, and show that DP-CD compares favorably with DP-SGD.
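To make the ingredients named in this abstract concrete (a proximal coordinate step, coordinate-wise gradient clipping, and Gaussian noise), here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the objective is taken to be an L1-regularized least-squares problem, coordinates are drawn uniformly at random, and the function name `noisy_prox_cd` and the arguments `clip`, `sigma`, and `step` are hypothetical. In particular, `sigma` is a free input here and is NOT calibrated to any privacy budget, so the sketch carries no differential privacy guarantee by itself.

```python
import random


def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def noisy_prox_cd(X, y, lam, clip, sigma, step, n_iters, seed=0):
    """Proximal coordinate descent with clipped, Gaussian-perturbed gradients.

    Assumed objective: (1 / (2n)) * ||y - X w||^2 + lam * ||w||_1, X given as rows.
    clip[j], sigma[j], step[j] are per-coordinate clipping thresholds, noise
    standard deviations, and step sizes (all hypothetical inputs; sigma is not
    calibrated to a privacy budget in this sketch).
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iters):
        j = rng.randrange(p)  # assumption: uniform coordinate sampling
        grad_j = 0.0
        for i in range(n):
            pred = sum(X[i][k] * w[k] for k in range(p))
            g = (pred - y[i]) * X[i][j]  # per-sample gradient, coordinate j
            grad_j += max(-clip[j], min(clip[j], g))  # coordinate-wise clipping
        grad_j = grad_j / n + rng.gauss(0.0, sigma[j])  # perturb the averaged gradient
        # Gradient step on coordinate j followed by the L1 proximal step.
        w[j] = soft_threshold(w[j] - step[j] * grad_j, step[j] * lam)
    return w
```

With `sigma` set to zero and a large clipping threshold, the routine reduces to ordinary proximal coordinate descent, which is a convenient sanity check before adding noise.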
Submitted 21 October, 2022; v1 submitted 22 October, 2021;
originally announced October 2021.
-
Asteroseismology of evolved stars to constrain the internal transport of angular momentum. IV. Internal rotation of Kepler 56 from an MCMC analysis of the rotational splittings
Authors:
L. Fellay,
G. Buldgen,
P. Eggenberger,
S. Khan,
S. J. A. J. Salmon,
A. Miglio,
J. Montalbán
Abstract:
Observations of global stellar oscillations of post-main-sequence stars by space-based photometry missions have made it possible to directly determine their internal rotation. These constraints have pointed towards the existence of angular momentum transport processes unaccounted for in theoretical models. Constraining the properties of their internal rotation thus appears as the golden path to determine the physical nature of these missing dynamical processes. We wish to determine the robustness of a new approach to study the internal rotation of post-main-sequence stars, using parametric rotation profiles coupled to a global optimization technique. We test our methodology on Kepler 56, a red giant observed by the Kepler mission. First, we carry out an extensive modelling of the star using global and local minimization techniques, and seismic inversions. Then, using our best model, we study its internal rotation profile in detail: we adopt a Bayesian approach to constrain predetermined parametric rotation profiles using a Monte Carlo Markov Chain analysis of the rotational splittings of mixed modes. This analysis allows us to determine the core and envelope rotation of Kepler 56 and gives hints about the location of the transition between the slowly rotating envelope and the fast rotating core. We are able to discard a rigid rotation profile in the radiative regions followed by a power-law in the convective zone and show that the data favour a transition located in the radiative region, as predicted by processes of a turbulent nature. Our analysis of Kepler 56 indicates that turbulent processes whose transport efficiency is reduced by chemical gradients are favoured, while large scale fossil magnetic fields are disfavoured as a solution to the missing angular momentum transport.
Submitted 6 August, 2021; v1 submitted 5 August, 2021;
originally announced August 2021.
-
Score-Based Change Detection for Gradient-Based Learning Machines
Authors:
Lang Liu,
Joseph Salmon,
Zaid Harchaoui
Abstract:
The widespread use of machine learning algorithms calls for automatic change detection algorithms to monitor their behavior over time. As a machine learning algorithm learns from a continuous, possibly evolving, stream of data, it is desirable and often critical to supplement it with a companion change detection algorithm to facilitate its monitoring and control. We present a generic score-based change detection method that can detect a change in any number of components of a machine learning model trained via empirical risk minimization. This proposed statistical hypothesis test can be readily implemented for such models designed within a differentiable programming framework. We establish the consistency of the hypothesis test and show how to calibrate it to achieve a prescribed false alarm rate. We illustrate the versatility of the approach on synthetic and real data.
Submitted 26 June, 2021;
originally announced June 2021.
-
Spatially relaxed inference on high-dimensional linear models
Authors:
Jérôme-Alexis Chevalier,
Tuan-Binh Nguyen,
Bertrand Thirion,
Joseph Salmon
Abstract:
We consider the inference problem for high-dimensional linear models, when covariates have an underlying spatial organization reflected in their correlation. A typical example of such a setting is high-resolution imaging, in which neighboring pixels are usually very similar. Accurate point and confidence interval estimation is not possible in this context, with many more covariates than samples and, furthermore, high correlation between covariates. This calls for a reformulation of the statistical inference problem that takes into account the underlying spatial structure: if covariates are locally correlated, it is acceptable to detect them up to a given spatial uncertainty. We thus propose to rely on the $δ$-FWER, that is, the probability of making a false discovery at a distance greater than $δ$ from any true positive. With this target measure in mind, we study the properties of ensembled clustered inference algorithms which combine three techniques: spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions. We show that ensembled clustered inference algorithms control the $δ$-FWER under standard assumptions for $δ$ equal to the largest cluster diameter. We complement the theoretical analysis with empirical results, demonstrating accurate $δ$-FWER control and decent power achieved by such inference algorithms.
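The $δ$-FWER defined in this abstract can be estimated empirically by counting how many repeated experiments contain at least one discovery lying farther than $δ$ from every true covariate. The sketch below does this for covariates placed on a 1-D grid. The 1-D absolute-difference distance and the function names are assumptions made for illustration only; the paper's setting uses a general spatial structure.

```python
def has_delta_false_discovery(discovered, true_support, delta):
    """True if some discovered position is farther than delta from every true position."""
    for d in discovered:
        if all(abs(d - t) > delta for t in true_support):
            return True
    return False


def empirical_delta_fwer(runs, true_support, delta):
    """Fraction of runs that contain at least one delta-false discovery."""
    if not runs:
        raise ValueError("need at least one run")
    hits = sum(has_delta_false_discovery(r, true_support, delta) for r in runs)
    return hits / len(runs)


# Four hypothetical runs; the true covariates sit at positions 10 and 20.
runs = [[9, 21], [13], [], [10, 25]]
print(empirical_delta_fwer(runs, true_support=[10, 20], delta=2))  # 0.5
```

With `delta=0` the same function returns the usual empirical FWER, since every discovery that misses the true support exactly then counts as false.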
Submitted 4 June, 2021;
originally announced June 2021.
-
Implicit differentiation for fast hyperparameter selection in non-smooth convex learning
Authors:
Quentin Bertrand,
Quentin Klopfenstein,
Mathurin Massias,
Mathieu Blondel,
Samuel Vaiter,
Alexandre Gramfort,
Joseph Salmon
Abstract:
Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but non-smooth. We show that the forward-mode differentiation of proximal gradient descent and proximal coordinate descent yields sequences of Jacobians converging toward the exact Jacobian. Using implicit differentiation, we show it is possible to leverage the non-smoothness of the inner problem to speed up the computation. Finally, we provide a bound on the error made on the hypergradient when the inner optimization problem is solved approximately. Results on regression and classification problems reveal computational benefits for hyperparameter optimization, especially when multiple hyperparameters are required.
Submitted 8 August, 2022; v1 submitted 4 May, 2021;
originally announced May 2021.
-
Origin of the Moon
Authors:
Robin M. Canup,
Kevin Righter,
Nicolas Dauphas,
Kaveh Pahlevan,
Matija Ćuk,
Simon J. Lock,
Sarah T. Stewart,
Julien Salmon,
Raluca Rufu,
Miki Nakajima,
Tomáš Magna
Abstract:
The Earth-Moon system is unusual in several respects. The Moon is roughly 1/4 the radius of the Earth - a larger satellite-to-planet size ratio than all known satellites other than Pluto's Charon. The Moon has a tiny core, perhaps with only ~1% of its mass, in contrast to Earth whose core contains nearly 30% of its mass. The Earth-Moon system has a high total angular momentum, implying a rapidly spinning Earth when the Moon formed. In addition, the early Moon was hot and at least partially molten with a deep magma ocean. Identification of a model for lunar origin that can satisfactorily explain all of these features has been the focus of decades of research.
Submitted 2 March, 2021;
originally announced March 2021.
-
Role of solutal free convection on interdiffusion in a horizontal microfluidic channel
Authors:
Jean-Baptiste Salmon,
Laurent Soucasse,
Frédéric Doumenc
Abstract:
We theoretically investigate the role of solutal free convection on the diffusion of a buoyant solute at the microfluidic scales, $\simeq 5$--$500~μ$m. We first consider a horizontal microfluidic slit, one half of which is initially filled with a binary solution (solute and solvent), and the other half with pure solvent. The buoyant forces generate a gravity current that couples to the diffusion of the solute. We solve numerically the 2D model describing the transport of the solute in the slit. This study allows us to highlight different regimes as a function of a single parameter, the Rayleigh number $\text{Ra}$, which compares gravity-induced advection to solute diffusion. We then derive asymptotic analytical solutions to quantify the width of the mixing zone as a function of time in each regime and establish a diagram that makes it possible to identify the range of $\text{Ra}$ and times for which buoyancy does not impact diffusion. In a second step, we present numerical solutions of the same model but for a 3D microfluidic channel with a square cross-section. We observe the same regimes as in the 2D case, and focus on the dispersion regime at long time scales. We then derive the expression of the 1D dispersion coefficient for a channel with a rectangular section, and analyse the role of the transverse flow in the particular case of a square section. Finally, we show that the impact of this transverse flow on the solute transport can be neglected for most of the microfluidic experimental configurations.
Submitted 18 February, 2021;
originally announced February 2021.
-
Model identification and local linear convergence of coordinate descent
Authors:
Quentin Klopfenstein,
Quentin Bertrand,
Alexandre Gramfort,
Joseph Salmon,
Samuel Vaiter
Abstract:
For composite nonsmooth optimization problems, the Forward-Backward algorithm achieves model identification (e.g. support identification for the Lasso) after a finite number of iterations, provided the objective function is regular enough. Results concerning coordinate descent are scarcer and model identification has only been shown for specific estimators, the support-vector machine for instance. In this work, we show that cyclic coordinate descent achieves model identification in finite time for a wide class of functions. In addition, we prove explicit local linear convergence rates for coordinate descent. Extensive experiments on various estimators and on real datasets demonstrate that these rates match empirical results well.
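Support identification for the Lasso, the example mentioned in this abstract, can be observed directly by running cyclic coordinate descent and recording which coefficients are nonzero after each pass. The following is a minimal sketch in plain Python, assuming the objective $\frac{1}{2}\|y - X\beta\|^2 + \lambda\|\beta\|_1$; the function names and the per-epoch bookkeeping are illustrative choices, not code from the paper.

```python
def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cyclic_cd(X, y, lam, n_epochs=50):
    """Cyclic coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1.

    X is a list of rows. Returns the final coefficients and the support
    (indices of nonzero coefficients) recorded after every epoch.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X beta, kept up to date
    col_sq_norms = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    supports = []
    for _ in range(n_epochs):
        for j in range(p):
            if col_sq_norms[j] == 0.0:
                continue  # an all-zero column never enters the model
            old = beta[j]
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq_norms[j] * old
            beta[j] = soft_threshold(rho, lam) / col_sq_norms[j]
            delta = beta[j] - old
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
        supports.append(tuple(j for j in range(p) if beta[j] != 0.0))
    return beta, supports
```

On a 3-by-3 identity design with `y = [3.0, 0.5, -2.0]` and `lam = 1.0`, the call returns `[2.0, 0.0, -1.0]` and the recorded support is `(0, 2)` from the first epoch onward. On correlated designs the recorded support may change over the first few epochs before it stops moving, which is the finite-time identification behaviour the abstract describes.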
Submitted 22 October, 2020;
originally announced October 2020.
-
Thorough characterisation of the 16 Cygni system Part I: Forward seismic modelling with WhoSGlAd
Authors:
M. Farnir,
M. -A. Dupret,
G. Buldgen,
S. J. A. J. Salmon,
A. Noels,
C. Pinçon,
C. Pezzotti,
P. Eggenberger
Abstract:
Context: Being among the brightest solar-like stars, and close solar analogues, the components of the 16 Cygni system are of great interest to the scientific community and may provide insight into the past and future evolution of our Sun. The system has been observed thoroughly by the Kepler satellite, which provided us with data of an unprecedented quality. Aims: This paper is the first of a series aiming to extensively characterise the system. We test several choices of micro- and macro-physics to highlight their effects on optimal stellar parameters and provide realistic stellar parameter ranges. Methods: We used a recently developed method, WhoSGlAd, that takes the utmost advantage of the whole oscillation spectrum of solar-like stars by simultaneously adjusting the acoustic glitches and the smoothly varying trend. For each choice of input physics, we computed models that best account for a set of seismic indicators that are representative of the stellar structure and are as uncorrelated as possible. The search for optimal models was carried out through a Levenberg-Marquardt minimisation. First, we found individual optimal models for both stars. We then selected the best candidates to fit both stars while imposing a common age and composition. Results: We computed realistic ranges of stellar parameters for individual stars. We also provide two models of the system regarded as a whole. We were not able to build binary models with the whole set of choices of input physics considered for individual stars as our constraints seem too stringent. We may need to include additional parameters in the optimal model search or invoke non-standard physical processes.
Submitted 13 October, 2020;
originally announced October 2020.
-
Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso
Authors:
Jérôme-Alexis Chevalier,
Alexandre Gramfort,
Joseph Salmon,
Bertrand Thirion
Abstract:
Detecting where and when brain regions activate in a cognitive task or in a given clinical condition is the promise of non-invasive techniques like magnetoencephalography (MEG) or electroencephalography (EEG). This problem, referred to as source localization or source imaging, nevertheless poses a high-dimensional statistical inference challenge. While sparsity-promoting regularizations have been proposed to address the regression problem, it remains unclear how to ensure statistical control of false detections. Moreover, M/EEG source imaging requires working with spatio-temporal data and autocorrelated noise. To deal with this, we adapt the desparsified Lasso estimator -- an estimator tailored for high-dimensional linear models that asymptotically follows a Gaussian distribution under sparsity and moderate feature correlation assumptions -- to temporal data corrupted with autocorrelated noise. We call it the desparsified multi-task Lasso (d-MTLasso). We combine d-MTLasso with spatially constrained clustering to reduce data dimension and with ensembling to mitigate the arbitrary choice of clustering; the resulting estimator is called ensemble of clustered desparsified multi-task Lasso (ecd-MTLasso). With respect to the current procedures, the two advantages of ecd-MTLasso are that i) it offers statistical guarantees and ii) it allows spatial specificity to be traded for sensitivity, leading to a powerful adaptive method. Extensive simulations on realistic head geometries, as well as empirical results on various MEG datasets, demonstrate the high recovery performance of ecd-MTLasso and its primary practical benefit: offering a statistically principled way to threshold MEG/EEG source maps.
Submitted 25 November, 2020; v1 submitted 29 September, 2020;
originally announced September 2020.
-
Screening Rules and its Complexity for Active Set Identification
Authors:
Eugene Ndiaye,
Olivier Fercoq,
Joseph Salmon
Abstract:
Screening rules were recently introduced as a technique for explicitly identifying active structures, such as sparsity, in optimization problems arising in machine learning. This has led to new methods of acceleration based on a substantial dimension reduction. We show that screening rules stem from a combination of natural properties of subdifferential sets and optimality conditions, and can hence be understood in a unified way. Under mild assumptions, we analyze the number of iterations needed to identify the optimal active set for any converging algorithm. We show that it only depends on its convergence rate.
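As one concrete instance of a screening rule, the sketch below applies a duality-gap-based sphere test (often called a Gap Safe rule) to the Lasso. The abstract does not single out this rule, so it is a swapped-in example, chosen because it shows how optimality conditions and a converging iterate combine to certify zero coefficients. The objective $\frac{1}{2}\|y - X\beta\|^2 + \lambda\|\beta\|_1$ and the function name `gap_safe_screen` are assumptions.

```python
import math


def gap_safe_screen(X, y, beta, lam):
    """Return (indices certified inactive by the sphere test, duality gap).

    Assumed objective: 0.5 * ||y - X b||^2 + lam * ||b||_1, with X a list of rows.
    """
    n, p = len(X), len(X[0])
    residual = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    corr = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(p)]
    # Rescale the residual so that theta is dual feasible: max_j |x_j^T theta| <= 1.
    scale = max(lam, max(abs(c) for c in corr))
    theta = [r / scale for r in residual]
    primal = 0.5 * sum(r * r for r in residual) + lam * sum(abs(b) for b in beta)
    dual = 0.5 * sum(v * v for v in y) - 0.5 * lam ** 2 * sum(
        (theta[i] - y[i] / lam) ** 2 for i in range(n)
    )
    gap = max(primal - dual, 0.0)
    # The dual objective is lam^2-strongly concave, so the optimal dual point
    # lies within this radius of theta.
    radius = math.sqrt(2.0 * gap) / lam
    screened = []
    for j in range(p):
        col_norm = math.sqrt(sum(X[i][j] ** 2 for i in range(n)))
        # If |x_j^T theta'| < 1 for every theta' in the ball, coefficient j is zero at the optimum.
        if abs(corr[j]) / scale + radius * col_norm < 1.0:
            screened.append(j)
    return screened, gap
```

As the iterate `beta` approaches the solution, the gap and hence the radius shrink, so more inactive features pass the test. This is the link between convergence rate and identification time that the abstract analyzes.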
Submitted 6 September, 2020;
originally announced September 2020.
-
Collective diffusion coefficient of a charged colloidal dispersion: interferometric measurements in a drying drop
Authors:
Benjamin Sobac,
Sam Dehaeck,
Anne Bouchaudy,
Jean-Baptiste Salmon
Abstract:
In the present work, we use Mach-Zehnder interferometry to thoroughly investigate the drying dynamics of a 2D confined drop of a charged colloidal dispersion. This technique makes it possible to measure the colloid concentration field during the drying of the drop with high accuracy (about 0.5%) and with a high temporal and spatial resolution (about 1 frame/s and 5 $μ$m/pixel). These features allow us to probe mass transport of the charged dispersion in this out-of-equilibrium situation. In particular, our experiments provide evidence that mass transport within the drop can be described by a purely diffusive process for some range of parameters for which the buoyancy-driven convection is negligible. We are then able to extract from these experiments the collective diffusion coefficient of the dispersion $D(\varphi)$ over a wide concentration range $\varphi=0.24$-$0.5$, i.e. from the liquid dispersed state to the solid glass regime, with high accuracy. The measured values of $D(\varphi)\simeq 5$-$12 D_0$ are significantly larger than the simple estimate $D_0$ given by the Stokes-Einstein relation, thus highlighting the important role played by the colloidal interactions in such dispersions.
Submitted 12 August, 2020; v1 submitted 10 August, 2020;
originally announced August 2020.
-
Seismic Solar Models from Ledoux discriminant inversions
Authors:
G. Buldgen,
P. Eggenberger,
V. A. Baturin,
T. Corbard,
J. Christensen-Dalsgaard,
S. J. A. J. Salmon,
A. Noels,
A. V. Oreshina,
R. Scuflaire
Abstract:
The Sun constitutes an excellent laboratory of fundamental physics. With the advent of helioseismology, we were able to probe its internal layers with unprecedented precision. However, the current state of solar modelling is still marred by persistent issues. One of these problems is related to the disagreement between models computed with recent photospheric abundances and helioseismic constraints. We use solar evolutionary models as initial conditions for reintegrations of their structure using Ledoux discriminant inversions. The resulting models are defined as seismic solar models, satisfying the equations of hydrostatic equilibrium. They will allow us to better constrain the internal structure of the Sun and provide complementary information to that of evolutionary models. These seismic models were computed using various reference models with different equations of state, abundances and opacity tables. We check the robustness of our approach by confirming the good agreement of our seismic models in terms of sound speed, density and entropy proxy inversions as well as frequency-separation ratios of low-degree pressure modes. Our method allows us to determine with excellent accuracy the Ledoux discriminant profile of the Sun and compute full profiles of this quantity. Our models show an agreement with seismic data of ~0.1% in sound speed, density and entropy proxy as well as with the observed frequency-separation ratios. They surpass all standard and non-standard evolutionary models including ad-hoc changes aiming at reproducing helioseismic constraints. The obtained seismic Ledoux discriminant profile as well as the consistent structure obtained from our procedure pave the way for renewed attempts at constraining the solar modelling problem and the missing physical processes acting in the solar interior by breaking free from the hypotheses of evolutionary models.
Submitted 20 July, 2020;
originally announced July 2020.
-
Provably Convergent Working Set Algorithm for Non-Convex Regularized Regression
Authors:
Alain Rakotomamonjy,
Rémi Flamary,
Gilles Gasso,
Joseph Salmon
Abstract:
Owing to their statistical properties, non-convex sparse regularizers have attracted much interest for estimating a sparse linear model from high dimensional data. Given that the solution is sparse, for accelerating convergence, a working set strategy addresses the optimization problem through an iterative algorithm by incrementing the number of variables to optimize until the identification of the solution support. While those methods have been well-studied and theoretically supported for convex regularizers, this paper proposes a working set algorithm for non-convex sparse regularizers with convergence guarantees. The algorithm, named FireWorks, is based on a non-convex reformulation of a recent primal-dual approach and leverages the geometry of the residuals. Our theoretical guarantees derive from a lower bound of the objective function decrease between two inner solver iterations and show the convergence to a stationary point of the full problem. More importantly, we also show that convergence is preserved even when the inner solver is inexact, under sufficient decay of the error across iterations. Our experimental results demonstrate high computational gain when using our working set strategy compared to the full problem solver for both block-coordinate descent and proximal gradient solvers.
Submitted 20 October, 2021; v1 submitted 24 June, 2020;
originally announced June 2020.
-
First evidence of inertial modes in $γ$ Doradus stars: The core rotation revealed
Authors:
R-M. Ouazzani,
F. Lignières,
M-A. Dupret,
S. J. A. J. Salmon,
J. Ballot,
S. Christophe,
M. Takata
Abstract:
Gamma Doradus stars present incredibly rich pulsation spectra, with gravito-inertial modes, in some cases supplemented with delta Scuti-like pressure modes and in numerous cases with Rossby modes. The present paper aims at showing that, in addition to these modes established in the radiative envelope, pure inertial modes, trapped in the convective core, can be detected in Kepler observations of gamma Doradus stars, thanks to their resonance with the gravito-inertial modes.
We start by using a simplified model of perturbations in a full sphere of uniform density. Under these conditions, the spectrum of pure inertial modes is known from analytical solutions of the so-called Poincaré equation. We then compute coupling factors which help select the pure inertial modes which interact best with the surrounding dipolar gravito-inertial modes. Using complete calculations of gravito-inertial modes in realistic models of gamma Doradus stars, we are able to show that the pure inertial/gravito-inertial resonances appear as dips in the gravito-inertial mode period spacing series at spin parameters close to those predicted by the simple model. We find the first evidence of such dips in the Kepler gamma Doradus star KIC5608334. Finally, using complete calculations in isolated convective cores, we find that the spin parameters of the pure inertial/gravito-inertial resonances are also sensitive to the density stratification of the convective core.
In conclusion, we have discovered that certain dips in gravito-inertial mode period spacings observed in some Kepler stars are in fact the signatures of resonances with pure-inertial modes that are trapped in the convective core.
This holds the promise of finally accessing the central conditions, i.e. rotation and density stratification, of intermediate-mass stars on the main sequence.
Submitted 16 June, 2020;
originally announced June 2020.
-
Implicit differentiation of Lasso-type models for hyperparameter optimization
Authors:
Quentin Bertrand,
Quentin Klopfenstein,
Mathieu Blondel,
Samuel Vaiter,
Alexandre Gramfort,
Joseph Salmon
Abstract:
Setting regularization parameters for Lasso-type estimators is notoriously difficult, though crucial in practice. The most popular hyperparameter optimization approach is grid-search using held-out validation data. Grid-search, however, requires choosing a predefined grid for each parameter, and the size of the resulting search scales exponentially in the number of parameters. Another approach is to cast hyperparameter optimization as a bi-level optimization problem that one can solve by gradient descent. The key challenge for these methods is the estimation of the gradient with respect to the hyperparameters. Computing this gradient via forward or backward automatic differentiation is possible yet usually suffers from high memory consumption. Alternatively, implicit differentiation typically involves solving a linear system, which can be prohibitive and numerically unstable in high dimension. In addition, implicit differentiation usually assumes smooth loss functions, which is not the case for Lasso-type problems. This work introduces an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems. Our approach scales to high-dimensional data by leveraging the sparsity of the solutions. Experiments demonstrate that the proposed method outperforms a large number of standard methods to optimize the error on held-out data, or the Stein Unbiased Risk Estimator (SURE).
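For orientation, the textbook closed form behind implicit differentiation of the plain Lasso is sketched below. It assumes the objective $\frac{1}{2n}\|y - X\beta\|^2 + \lambda\|\beta\|_1$, a support $S$ and sign pattern that stay constant in a neighbourhood of $\lambda$, and $X_S$ of full column rank; the notation is chosen here for illustration. This closed form involves exactly the kind of linear system the abstract says the proposed algorithm avoids inverting, so it should be read as background rather than as the paper's method.

```latex
\begin{aligned}
&\tfrac{1}{n} X_S^\top (X_S \hat\beta_S - y) + \lambda\, \operatorname{sign}(\hat\beta_S) = 0
&&\text{(optimality condition on the support } S\text{)} \\
&\tfrac{1}{n} X_S^\top X_S \, \frac{\partial \hat\beta_S}{\partial \lambda} + \operatorname{sign}(\hat\beta_S) = 0
&&\text{(differentiate with respect to } \lambda\text{)} \\
&\frac{\partial \hat\beta_S}{\partial \lambda} = -\,n\,(X_S^\top X_S)^{-1} \operatorname{sign}(\hat\beta_S),
\qquad \frac{\partial \hat\beta_{S^c}}{\partial \lambda} = 0 .
\end{aligned}
```

The linear system has the size of the support rather than the number of features, which is why sparsity of the solution helps in high dimension.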
Submitted 3 September, 2020; v1 submitted 20 February, 2020;
originally announced February 2020.
-
Asteroseismology of evolved stars to constrain the internal transport of angular momentum II. Test of a revised prescription for transport by the Tayler instability
Authors:
P. Eggenberger,
J. W. den Hartogh,
G. Buldgen,
G. Meynet,
S. J. A. J. Salmon,
S. Deheuvels
Abstract:
Context: Asteroseismic measurements reveal that an unknown efficient angular momentum (AM) transport mechanism is needed for subgiant and red giant stars. A revised prescription for AM transport by the magnetic Tayler instability has been recently proposed as a possible candidate for such a missing mechanism.
Results: The revised prescription for the transport by the Tayler instability leads to low core rotation rates after the main sequence that are in better global agreement with asteroseismic measurements than those predicted by models with purely hydrodynamic processes or with the original Tayler-Spruit dynamo. A detailed comparison with asteroseismic data shows that the rotational properties of at most two of the six subgiants can be correctly reproduced by models accounting for this revised magnetic transport process. This result is obtained independently of the value adopted for the calibration parameter in this prescription. We also find that this transport by the Tayler instability faces difficulties in simultaneously reproducing asteroseismic measurements available for subgiant and red giant stars. The low values of the calibration parameter needed to correctly reproduce the rotational properties of two of the six subgiants lead to core rotation rates during the red giant phase that are too high. Inversely, the higher values of this parameter needed to reproduce the core rotation rates of red giants lead to a very low degree of radial differential rotation before the red giant phase, which is in contradiction with the internal rotation of subgiant stars.
Conclusions: In its present form, the revised prescription for the transport by the Tayler instability does not provide a complete solution to the missing AM transport revealed by asteroseismology of evolved stars.
Submitted 30 January, 2020;
originally announced January 2020.
-
Support recovery and sup-norm convergence rates for sparse pivotal estimation
Authors:
Mathurin Massias,
Quentin Bertrand,
Alexandre Gramfort,
Joseph Salmon
Abstract:
In high dimensional sparse regression, pivotal estimators are estimators for which the optimal regularization parameter is independent of the noise level. The canonical pivotal estimator is the square-root Lasso, formulated along with its derivatives as a "non-smooth + non-smooth" optimization problem. Modern techniques to solve these include smoothing the datafitting term, to benefit from fast efficient proximal algorithms. In this work we show minimax sup-norm convergence rates for non smoothed and smoothed, single task and multitask square-root Lasso-type estimators. Thanks to our theoretical analysis, we provide some guidelines on how to set the smoothing hyperparameter, and illustrate on synthetic data the interest of such guidelines.
Submitted 3 September, 2020; v1 submitted 15 January, 2020;
originally announced January 2020.
-
Buoyancy-driven dispersion in confined drying of liquid binary mixtures
Authors:
Jean-Baptiste Salmon,
Frédéric Doumenc
Abstract:
We investigate the impact of buoyancy on the solute mass transport in an evaporating liquid mixture (non-volatile solute $+$ solvent) confined in a slit perpendicular to the gravity. Solvent evaporation at one end of the slit induces a solute concentration gradient which in turn drives free convection due to the difference between the densities of the solutes and the solvent. From the complete model coupling mass transport and hydrodynamics, we first use a standard Taylor-like approach to derive a one dimensional non-linear advection-dispersion equation describing the solute concentration process for a dilute mixture. We then perform a complete analysis of the expected regimes using both scaling analysis and asymptotic solutions of this equation. The validity of this approach is confirmed using a thorough comparison with the numerical resolution of both the complete model and the 1D advection-dispersion equation. Our results show that buoyancy-driven free convection always impacts solute mass transport at long time scales, dispersing solutes in a steadily increasing length scale along the slit. Beyond this confined drying configuration, our work also provides an easy way for evaluating the relevance of buoyancy on mass transport in any other microfluidic configuration involving concentration gradients.
Submitted 14 January, 2020;
originally announced January 2020.
-
The prototype star $γ$ Doradus observed by TESS
Authors:
S. Christophe,
V. Antoci,
E. Brunsden,
R. -M. Ouazzani,
S. J. A. J. Salmon
Abstract:
$γ$ Doradus is the prototype star for the eponymous class of pulsating stars that consists of late A-early F main-sequence stars oscillating in low-frequency gravito-inertial modes. Being among the brightest stars of its kind (V = 4.2), $γ$ Dor benefits from a large set of observational data that has been recently completed by high-quality space photometry from the TESS mission. With these new data, we propose to study $γ$ Dor as an example of possibilities offered by synergies between multi-technical ground and space-based observations. Here, we present the preliminary results of our investigations.
Submitted 10 January, 2020;
originally announced January 2020.
-
Size and Shape Constraints of (486958) Arrokoth from Stellar Occultations
Authors:
Marc W. Buie,
Simon B. Porter,
Peter Tamblyn,
Dirk Terrell,
Alex Harrison Parker,
David Baratoux,
Maram Kaire,
Rodrigo Leiva,
Anne J. Verbiscer,
Amanda M. Zangari,
François Colas,
Baïdy Demba Diop,
Joseph I. Samaniego,
Lawrence H. Wasserman,
Susan D. Benecchi,
Amir Caspi,
Stephen Gwyn,
J. J. Kavelaars,
Adriana C. Ocampo Uría,
Jorge Rabassa,
M. F. Skrutskie,
Alejandro Soto,
Paolo Tanga,
Eliot F. Young,
S. Alan Stern
, et al. (108 additional authors not shown)
Abstract:
We present the results from four stellar occultations by (486958) Arrokoth, the flyby target of the New Horizons extended mission. Three of the four efforts led to positive detections of the body, and all constrained the presence of rings and other debris, finding none. Twenty-five mobile stations were deployed for 2017 June 3 and augmented by fixed telescopes. There were no positive detections from this effort. The event on 2017 July 10 was observed by SOFIA with one very short chord. Twenty-four deployed stations on 2017 July 17 resulted in five chords that clearly showed a complicated shape consistent with a contact binary with rough dimensions of 20 by 30 km for the overall outline. A visible albedo of 10% was derived from these data. Twenty-two systems were deployed for the fourth event on 2018 Aug 4 and resulted in two chords. The combination of the occultation data and the flyby results provides a significant refinement of the rotation period, now estimated to be 15.9380 $\pm$ 0.0005 hours. The occultation data also provided high-precision astrometric constraints on the position of the object that were crucial for supporting the navigation for the New Horizons flyby. This work demonstrates an effective method for obtaining detailed size and shape information and probing for rings and dust on distant Kuiper Belt objects as well as being an important source of positional data that can aid in spacecraft navigation that is particularly useful for small and distant bodies.
Submitted 31 December, 2019;
originally announced January 2020.
-
From the Sun to solar-like stars: how does the solar modelling problem affect our studies of solar-like oscillators?
Authors:
G. Buldgen,
C. Pezzotti,
M. Farnir,
S. J. A. J. Salmon,
P. Eggenberger
Abstract:
Since the first observations of solar oscillations in 1962, helioseismology has probably been one of the most successful fields of astrophysics. Besides the improvement of observational data, solar seismologists developed sophisticated techniques to infer the internal structure of the Sun. Back in the 1990s, these comparisons showed a very high agreement between solar models and the Sun. However, the downward revision of the CNO surface abundances in the Sun in 2005, confirmed in 2009, induced a drastic reduction of this agreement, leading to the so-called solar modelling problem. More than ten years later, in the era of the space-based photometry missions which have established asteroseismology of solar-like stars as a standard approach to obtain their masses, radii and ages, the solar modelling problem still awaits a solution. We will briefly present the results of new helioseismic inversions, discuss the current uncertainties of solar models and possible solutions to the solar modelling problem. We will also discuss how the solar problem can have significant implications for asteroseismology as a whole by discussing the modelling of the exoplanet-host star Kepler-444, thus impacting the fields requiring a precise and accurate knowledge of stellar masses, radii and ages, such as Galactic archaeology and exoplanetology.
Submitted 4 December, 2019;
originally announced December 2019.
-
Rotation rate of the solar core as a key constraint to magnetic angular momentum transport in stellar interiors
Authors:
P. Eggenberger,
G. Buldgen,
S. J. A. J. Salmon
Abstract:
Context: The internal rotation of the Sun constitutes a fundamental constraint when modelling angular momentum transport in stellar interiors. In addition to the more external regions of the solar radiative zone probed by pressure modes, measurements of rotational splittings of gravity modes would offer an invaluable constraint on the rotation of the solar core. Aims: We study the constraints that a measurement of the core rotation rate of the Sun could bring on magnetic angular momentum transport in stellar radiative zones. Results: We first show that models computed with angular momentum transport by magnetic instabilities and a recent prescription for the braking of the stellar surface by magnetized winds can reproduce the observations of surface velocities of stars in open clusters. These solar models predict both a flat rotation profile in the external part of the solar radiative zone probed by pressure modes and an increase in the rotation rate in the solar core, where the stabilizing effect of chemical gradients plays a key role. A rapid rotation of the core of the Sun, as suggested by reported detections of gravity modes, is thus found to be compatible with angular momentum transport by magnetic instabilities. Moreover, we show that the efficiency of magnetic angular momentum transport in regions of strong chemical gradients can be calibrated by the solar core rotation rate independently from the unknown rotational history of the Sun. In particular, we find that a recent revised prescription for the transport of angular momentum by the Tayler instability can be easily distinguished from the original Tayler-Spruit dynamo, with a faster rotating solar core supporting the original prescription.
Submitted 14 November, 2019;
originally announced November 2019.
-
Block based refitting in $\ell_{12}$ sparse regularisation
Authors:
Charles-Alban Deledalle,
Nicolas Papadakis,
Joseph Salmon,
Samuel Vaiter
Abstract:
In many linear regression problems, including ill-posed inverse problems in image restoration, the data exhibit some sparse structures that can be used to regularize the inversion. To this end, a classical path is to use $\ell_{12}$ block based regularization. While efficient at retrieving the inherent sparsity patterns of the data -- the support -- the estimated solutions are known to suffer from a systematic bias. We propose a general framework for removing this artifact by refitting the solution towards the data while preserving key features of its structure such as the support. This is done through the use of refitting block penalties that only act on the support of the estimated solution. Based on an analysis of related works in the literature, we introduce a new penalty that is well suited for refitting purposes. We also present a new algorithm to obtain the refitted solution along with the original (biased) solution for any convex refitting block penalty. Experiments illustrate the good behavior of the proposed block penalty for refitting solutions of Total Variation and Total Generalized Variation models.
Submitted 29 September, 2020; v1 submitted 22 October, 2019;
originally announced October 2019.
-
Comprehensive stellar seismic analysis: A preliminary application of Whosglad to 16 Cygni system
Authors:
M. Farnir,
M-A. Dupret,
S. J. A. J. Salmon,
A. Noels,
G. Buldgen
Abstract:
We present a first application of the Whosglad method to the components A and B of the 16 Cygni system. The method was developed to provide a comprehensive analysis of stellar oscillation spectra. It defines new seismic indicators which are as uncorrelated and precise as possible and hold detailed information about stellar interiors. Such indicators, as illustrated in the present paper, may be used to generate stellar models via forward seismic modeling. Finally, seismic constraints retrieved by the method provide realistic stellar parameters.
Submitted 10 October, 2019;
originally announced October 2019.
-
Revisiting Kepler-444 Part I. Seismic modelling and inversions of stellar structure
Authors:
G. Buldgen,
M. Farnir,
C. Pezzotti,
P. Eggenberger,
S. J. A. J. Salmon,
J. Montalban,
J. W. Ferguson,
S. Khan,
V. Bourrier,
B. M. Rendle,
G. Meynet,
A. Miglio,
A. Noels
Abstract:
Context. The CoRoT and Kepler missions have paved the way for synergies between exoplanetology and asteroseismology. The use of seismic data helps provide stringent constraints on the stellar properties which directly impact the results of planetary studies. Amongst the most interesting planetary systems discovered by Kepler, Kepler-444 is unique by the quality of its seismic and classical stellar constraints. Its magnitude, age and the presence of 5 small-sized planets orbiting this target make it an exceptional testbed for exoplanetology. Aims. We aim at providing a detailed characterization of Kepler-444, focusing on the dependency of the results on variations of key ingredients of the theoretical stellar models. This thorough study will serve as a basis for future investigations of the planetary evolution of the system orbiting Kepler-444. Methods. We use local and global minimization techniques to study the internal structure of the exoplanet-host star Kepler-444. We combine seismic observations from the Kepler mission, Gaia DR2 data and revised spectroscopic parameters to precisely constrain its internal structure and evolution. Results. We provide updated robust and precise determinations of the fundamental parameters of Kepler-444 and demonstrate that this low-mass star bore a convective core during a significant portion of its life on the main sequence. Using seismic data, we are able to estimate the lifetime of the convective core at approximately 8 Gyr out of the 11 Gyr of the evolution of Kepler-444. The revised stellar parameters found by our thorough study are $M = 0.754 \pm 0.03\,M_\odot$, $R = 0.753 \pm 0.01\,R_\odot$, Age $= 11 \pm 1$ Gyr.
Submitted 24 July, 2019;
originally announced July 2019.
-
Dual Extrapolation for Sparse Generalized Linear Models
Authors:
Mathurin Massias,
Samuel Vaiter,
Alexandre Gramfort,
Joseph Salmon
Abstract:
Generalized Linear Models (GLM) form a wide class of regression and classification models, where prediction is a function of a linear combination of the input variables. For statistical inference in high dimension, sparsity inducing regularizations have proven to be useful while offering statistical guarantees. However, solving the resulting optimization problems can be challenging: even for popular iterative algorithms such as coordinate descent, one needs to loop over a large number of variables. To mitigate this, techniques known as screening rules and working sets diminish the size of the optimization problem at hand, either by progressively removing variables, or by solving a growing sequence of smaller problems. For both techniques, significant variables are identified thanks to convex duality arguments. In this paper, we show that the dual iterates of a GLM exhibit a Vector AutoRegressive (VAR) behavior after sign identification, when the primal problem is solved with proximal gradient descent or cyclic coordinate descent. Exploiting this regularity, one can construct dual points that offer tighter certificates of optimality, enhancing the performance of screening rules and helping to design competitive working set algorithms.
Submitted 24 August, 2022; v1 submitted 12 July, 2019;
originally announced July 2019.
-
HydroSyMBA: a 1D hydrocode coupled with an N-body symplectic integrator
Authors:
Julien Salmon,
Robin M. Canup
Abstract:
The numerical modeling of co-existing circumplanetary disks/rings and satellites is particularly challenging because each part of the system requires a very different approach. Disks are generally well represented by a fluid-like dense medium, whose evolution can be calculated by a hydrocode. On the other hand, the orbital evolution of satellites is generally performed using N-body integrators. We have developed a new numerical model that combines a 1-dimensional hydrocode with the N-body integrator SyMBA. The disk evolves due to its viscosity, and resonant torques from satellites. The latter is applied to the satellites as an additional "kick" to their accelerations. The integrator also includes the ability to spawn new moonlets at the disk's outer edge if the latter expands beyond a material-dependent Roche limit, as well as the effects of tidal dissipation in the planet and/or the satellite on the satellite orbits. The resulting integrator allows one to accurately model the evolution of an inner circumplanetary disk, and the formation of satellites by accumulation of disk material, all within a single self-consistent framework. Potential applications include the formation of Earth's Moon, the evolution of the inner Saturn system, the martian and uranian moons, and compact exoplanetary systems.
Submitted 19 June, 2019;
originally announced June 2019.
-
Evolutionary models for ultracool dwarfs
Authors:
Catarina S. Fernandes,
Valerie Van Grootel,
Sebastien J. A. J. Salmon,
Bernhard Aringer,
Adam J. Burgasser,
Richard Scuflaire,
Pierre Brassard,
Gilles Fontaine
Abstract:
Ultracool dwarfs have emerged as key targets for searches of transiting exoplanets. Precise estimates of the host parameters (including mass, age, and radius) are fundamental to constrain the physical properties of orbiting exoplanets. We have extended our evolutionary code CLES (Code Liégeois d'Evolution Stellaire) to the ultracool dwarf regime. We include relevant equations of state for H, He, as well as C and O elements to cover the temperature-density regime of ultracool dwarf interiors. For various metallicities, we couple the interior models to two sets of model atmospheres as surface boundary conditions. We show that including C and O in the EOS has a significant effect close to the H-burning limit mass. The typical systematic error associated with uncertainties in input physics in evolutionary models is $\sim 0.0005 M_\odot$. We test model results against observations for objects whose parameters have been determined from independent techniques. We are able to reproduce dynamical mass measurements of LSPM J1314+1320AB within $1σ$ with the condition of varying the metallicity (determined from calibrations) up to $2.5σ$. For GJ 65AB, a $2σ$ agreement is obtained between individual masses from differential astrometry and those from evolutionary models. We provide tables of ultracool dwarf models for various masses and metallicities that can be used as reference when estimating parameters for ultracool objects.
Submitted 13 June, 2019;
originally announced June 2019.
-
Refitting solutions promoted by $\ell_{12}$ sparse analysis regularization with block penalties
Authors:
Charles-Alban Deledalle,
Nicolas Papadakis,
Joseph Salmon,
Samuel Vaiter
Abstract:
In inverse problems, the use of an $\ell_{12}$ analysis regularizer induces a bias in the estimated solution. We propose a general refitting framework for removing this artifact while keeping information of interest contained in the biased solution. This is done through the use of refitting block penalties that only act on the co-support of the estimation. Based on an analysis of related works in the literature, we propose a new penalty that is well suited for refitting purposes. We also present an efficient algorithmic method to obtain the refitted solution along with the original (biased) solution for any convex refitting block penalty. Experiments illustrate the good behavior of the proposed block penalty for refitting.
Submitted 5 March, 2019; v1 submitted 2 March, 2019;
originally announced March 2019.
-
Combining multiple structural inversions to constrain the Solar modelling problem
Authors:
G. Buldgen,
S. J. A. J. Salmon,
A. Noels,
V. A. Baturin,
P. Eggenberger,
G. Meynet,
A. Miglio
Abstract:
The Sun is the most studied of stars and a laboratory of fundamental physics. However, the understanding of our star is stained by the solar modelling problem, which can stem from various causes. We combine inversions of sound speed, an entropy proxy and the Ledoux discriminant with the position of the base of the convective zone and the photospheric helium abundance to test combinations of ingredients such as equation of state, abundance and opacity tables. We study the potential of the inversions to constrain ad-hoc opacity modifications and additional mixing in the Sun. We show that they provide constraints on these modifications to the ingredients, that the solar problem likely arises from various sources, and that using phase shifts with our approach is the next step to take.
Submitted 27 February, 2019;
originally announced February 2019.
-
Integer programming on the junction tree polytope for influence diagrams
Authors:
Axel Parmentier,
Victor Cohen,
Vincent Leclère,
Guillaume Obozinski,
Joseph Salmon
Abstract:
Influence Diagrams (ID) are a flexible tool to represent discrete stochastic optimization problems, including Markov Decision Processes (MDP) and Partially Observable MDPs as standard examples. More precisely, given random variables considered as vertices of an acyclic digraph, a probabilistic graphical model defines a joint distribution via the conditional distributions of vertices given their parents. In ID, the random variables are represented by a probabilistic graphical model whose vertices are partitioned into three types: chance, decision and utility vertices. The user chooses the distribution of the decision vertices conditionally on their parents in order to maximize the expected utility. Leveraging the notion of rooted junction tree, we present a mixed integer linear formulation for solving an ID, as well as valid inequalities, which lead to a computationally efficient algorithm. We also show that the linear relaxation yields an optimal integer solution for instances that can be solved by the "single policy update", the default algorithm for addressing IDs.
Submitted 5 July, 2019; v1 submitted 19 February, 2019;
originally announced February 2019.
-
Screening Rules for Lasso with Non-Convex Sparse Regularizers
Authors:
Alain Rakotomamonjy,
Gilles Gasso,
Joseph Salmon
Abstract:
Leveraging the convexity of the Lasso problem, screening rules help accelerate solvers by discarding irrelevant variables during the optimization process. However, because they provide better theoretical guarantees in identifying relevant variables, several non-convex regularizers for the Lasso have been proposed in the literature. This work is the first that introduces a screening rule strategy into a non-convex Lasso solver. The approach we propose is based on an iterative majorization-minimization (MM) strategy that includes a screening rule in the inner solver and a condition for propagating screened variables between iterations of MM. In addition to improving the efficiency of solvers, we also provide guarantees that the inner solver is able to identify the zero components of its critical point in finite time. Our experimental analysis illustrates the significant computational gain brought by the new screening rule compared to classical coordinate-descent or proximal gradient descent methods.
Submitted 19 February, 2019; v1 submitted 16 February, 2019;
originally announced February 2019.
-
Handling correlated and repeated measurements with the smoothed multivariate square-root Lasso
Authors:
Quentin Bertrand,
Mathurin Massias,
Alexandre Gramfort,
Joseph Salmon
Abstract:
Sparsity promoting norms are frequently used in high dimensional regression. A limitation of such Lasso-type estimators is that the optimal regularization parameter depends on the unknown noise level. Estimators such as the concomitant Lasso address this dependence by jointly estimating the noise level and the regression coefficients. Additionally, in many applications, the data is obtained by averaging multiple measurements: this reduces the noise variance, but it dramatically reduces sample sizes and prevents refined noise modeling. In this work, we propose a concomitant estimator that can cope with complex noise structure by using non-averaged measurements. The resulting optimization problem is convex and amenable, thanks to smoothing theory, to state-of-the-art optimization techniques that leverage the sparsity of the solutions. Practical benefits are demonstrated on toy datasets, realistic simulated data and real neuroimaging data.
Submitted 3 September, 2020; v1 submitted 7 February, 2019;
originally announced February 2019.