-
Harnessing pattern-by-pattern linear classifiers for prediction with missing data
Authors:
Angel D Reyero Lobo,
Alexis Ayme,
Claire Boyer,
Erwan Scornet
Abstract:
Missing values have been thoroughly analyzed in the context of linear models, where the final aim is to build coefficient estimates. However, estimating coefficients does not directly solve the problem of prediction with missing entries: a way to handle the missing components must be designed. Major approaches to prediction with missing values are empirically driven and fall into two families: imputation (filling in empty fields) and pattern-by-pattern prediction, where a predictor is built on each missing pattern. Unfortunately, most simple imputation techniques used in practice (such as constant imputation) are not consistent when combined with linear models. In this paper, we focus on the more flexible pattern-by-pattern approaches and study their predictive performance on Missing Completely At Random (MCAR) data. We first show that a pattern-by-pattern logistic regression model is intrinsically ill-defined, implying that even classical logistic regression is impossible to apply to missing data. We then analyze the perceptron model and show how the linear separability property extends to partially-observed inputs. Finally, we use Linear Discriminant Analysis to prove that pattern-by-pattern LDA is consistent in a high-dimensional regime. We extend our analysis to more complex MNAR data.
Submitted 15 May, 2024;
originally announced May 2024.
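The pattern-by-pattern idea in the abstract above can be illustrated with a minimal sketch: rows are grouped by their missingness pattern, and one linear classifier is fit per pattern on the observed coordinates only. This is not the authors' code. The function names (`group_by_pattern`, `fit_perceptron`, `fit_pattern_by_pattern`), the NaN encoding of missing entries, the epoch count, and the toy data are all assumptions made for illustration.

```python
import numpy as np
from collections import defaultdict


def group_by_pattern(X):
    """Map each missingness pattern (tuple of bools, True = missing) to row indices."""
    groups = defaultdict(list)
    for i, row in enumerate(np.isnan(X)):
        groups[tuple(bool(m) for m in row)].append(i)
    return dict(groups)


def fit_perceptron(X, y, epochs=50):
    """Plain perceptron with a bias term; labels are assumed to be in {-1, +1}."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * (xi @ w) <= 0:  # misclassified (or on the boundary)
                w += yi * xi
    return w


def fit_pattern_by_pattern(X, y):
    """Fit one perceptron per missingness pattern, using observed columns only."""
    models = {}
    for pattern, idx in group_by_pattern(X).items():
        observed = [j for j, missing in enumerate(pattern) if not missing]
        models[pattern] = fit_perceptron(X[np.ix_(idx, observed)], y[idx])
    return models


# Toy data (hypothetical): two patterns, first feature sometimes missing.
X = np.array([[1.0, 2.0], [np.nan, 1.0], [-1.0, -2.0], [np.nan, -1.0]])
y = np.array([1, 1, -1, -1])
models = fit_pattern_by_pattern(X, y)
```

Each entry of `models` is a weight vector whose length is the number of observed features for that pattern plus one bias term. The number of patterns can grow exponentially with the number of features, which is why the statistical behavior of this approach needs dedicated analysis.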
-
Physics-informed machine learning as a kernel method
Authors:
Nathan Doumèche,
Francis Bach,
Gérard Biau,
Claire Boyer
Abstract:
Physics-informed machine learning combines the expressiveness of data-based approaches with the interpretability of physical models. In this context, we consider a general regression problem where the empirical risk is regularized by a partial differential equation that quantifies the physical inconsistency. We prove that for linear differential priors, the problem can be formulated as a kernel regression task. Taking advantage of kernel theory, we derive convergence rates for the minimizer of the regularized risk and show that it converges at least at the Sobolev minimax rate. However, faster rates can be achieved, depending on the physical error. This principle is illustrated with a one-dimensional example, supporting the claim that regularizing the empirical risk with physical information can be beneficial to the statistical performance of estimators.
Submitted 19 June, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
An analysis of the noise schedule for score-based generative models
Authors:
Stanislas Strasman,
Antonio Ocello,
Claire Boyer,
Sylvain Le Corff,
Vincent Lemaire
Abstract:
Score-based generative models (SGMs) aim at estimating a target data distribution by learning score functions using only noise-perturbed samples from the target. Recent literature has focused extensively on assessing the error between the target and estimated distributions, gauging the generative quality through the Kullback-Leibler (KL) divergence and Wasserstein distances. Under mild assumptions on the data distribution, we establish an upper bound for the KL divergence between the target and the estimated distributions, explicitly depending on any time-dependent noise schedule. Under additional regularity assumptions, taking advantage of favorable underlying contraction mechanisms, we provide a tighter error bound in Wasserstein distance compared to state-of-the-art results. In addition to being tractable, this upper bound jointly incorporates properties of the target distribution and SGM hyperparameters that need to be tuned during training.
Submitted 24 May, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Random features models: a way to study the success of naive imputation
Authors:
Alexis Ayme,
Claire Boyer,
Aymeric Dieuleveut,
Erwan Scornet
Abstract:
Constant (naive) imputation is still widely used in practice, as it is an easy-to-use first technique for dealing with missing data. Yet, this simple method could be expected to induce a large bias for prediction purposes, as the imputed input may strongly differ from the true underlying data. However, recent works suggest that this bias is low in the context of high-dimensional linear predictors when data is supposed to be missing completely at random (MCAR). This paper completes the picture for linear predictors by confirming the intuition that the bias is negligible and that, surprisingly, naive imputation also remains relevant in very low dimension. To this end, we consider a unique underlying random features model, which offers a rigorous framework for studying predictive performance while the dimension of the observed features varies. Building on these theoretical results, we establish finite-sample bounds on stochastic gradient (SGD) predictors applied to zero-imputed data, a strategy particularly well suited for large-scale learning. Although the MCAR assumption may appear strong, we show that similar favorable behaviors occur for more complex missing data scenarios.
Submitted 6 February, 2024;
originally announced February 2024.
-
Sasakian Geometry on Sphere Bundles II: Constant Scalar Curvature
Authors:
Charles P. Boyer,
Christina W. Tønnesen-Friedman
Abstract:
In a previous paper [BTF21] the authors employed the fiber join construction of Yamazaki [Yam99] together with the admissible construction of Apostolov, Calderbank, Gauduchon, and Tønnesen-Friedman [ACGTF08a] to construct new extremal Sasaki metrics on odd dimensional sphere bundles over smooth projective algebraic varieties. In the present paper we continue this study by applying a recent existence theorem [BHLTF23] that shows that under certain conditions one can always obtain a constant scalar curvature Sasaki metric in the Sasaki cone. Moreover, we explicitly describe this construction for certain sphere bundles of dimension 5 and 7.
Submitted 11 September, 2023;
originally announced September 2023.
-
Assessing model performance for counterfactual predictions
Authors:
Christopher B. Boyer,
Issa J. Dahabreh,
Jon A. Steingrimsson
Abstract:
Counterfactual prediction methods are required when a model will be deployed in a setting where treatment policies differ from the setting where the model was developed, or when the prediction question is explicitly counterfactual. However, estimating and evaluating counterfactual prediction models is challenging because one does not observe the full set of potential outcomes for all individuals. Here, we discuss how to tailor a model to a counterfactual estimand, how to assess the model's performance, and how to perform model and tuning parameter selection. We also provide identifiability results for measures of performance for a potentially misspecified counterfactual prediction model based on training and test data from the same (factual) source population. Last, we illustrate the methods using simulation and apply them to the task of developing a statin-naïve risk prediction model for cardiovascular disease.
Submitted 6 September, 2023; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Coronal Heating as Determined by the Solar Flare Frequency Distribution Obtained by Aggregating Case Studies
Authors:
James Paul Mason,
Alexandra Werth,
Colin G. West,
Allison A. Youngblood,
Donald L. Woodraska,
Courtney Peck,
Kevin Lacjak,
Florian G. Frick,
Moutamen Gabir,
Reema A. Alsinan,
Thomas Jacobsen,
Mohammad Alrubaie,
Kayla M. Chizmar,
Benjamin P. Lau,
Lizbeth Montoya Dominguez,
David Price,
Dylan R. Butler,
Connor J. Biron,
Nikita Feoktistov,
Kai Dewey,
N. E. Loomis,
Michal Bodzianowski,
Connor Kuybus,
Henry Dietrick,
Aubrey M. Wolfe
, et al. (977 additional authors not shown)
Abstract:
Flare frequency distributions represent a key approach to addressing one of the largest problems in solar and stellar physics: determining the mechanism that counter-intuitively heats coronae to temperatures that are orders of magnitude hotter than the corresponding photospheres. It is widely accepted that the magnetic field is responsible for the heating, but there are two competing mechanisms that could explain it: nanoflares or Alfvén waves. To date, neither can be directly observed. Nanoflares are, by definition, extremely small, but their aggregate energy release could represent a substantial heating mechanism, presuming they are sufficiently abundant. One way to test this presumption is via the flare frequency distribution, which describes how often flares of various energies occur. If the slope of the power law fitting the flare frequency distribution is above a critical threshold, $α=2$ as established in prior literature, then there should be a sufficient abundance of nanoflares to explain coronal heating. We performed $>$600 case studies of solar flares, made possible by an unprecedented number of data analysts via three semesters of an undergraduate physics laboratory course. This allowed us to include two crucial, but nontrivial, analysis methods: pre-flare baseline subtraction and computation of the flare energy, which requires determining flare start and stop times. We aggregated the results of these analyses into a statistical study to determine that $α= 1.63 \pm 0.03$. This is below the critical threshold, suggesting that Alfvén waves are an important driver of coronal heating.
Submitted 9 May, 2023;
originally announced May 2023.
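The role of the critical slope $α=2$ in the abstract above can be made explicit with a short worked integral. The notation here ($A$, $E_{\min}$, $E_{\max}$, $W$) is assumed for illustration and is not taken from the abstract. For a power-law flare frequency distribution, the total energy released by flares with energies between $E_{\min}$ and $E_{\max}$ is:

```latex
\frac{dN}{dE} = A\,E^{-\alpha}
\quad\Longrightarrow\quad
W = \int_{E_{\min}}^{E_{\max}} E\,\frac{dN}{dE}\,dE
  = \frac{A}{2-\alpha}\left(E_{\max}^{\,2-\alpha} - E_{\min}^{\,2-\alpha}\right),
\qquad \alpha \neq 2 .
```

When $α > 2$ the exponent $2-α$ is negative, so the $E_{\min}$ term dominates and the smallest flares carry most of the energy. When $α < 2$ the $E_{\max}$ term dominates instead. The reported value $α = 1.63$ gives $2-α = 0.37 > 0$, which is the second case.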
-
Convergence and error analysis of PINNs
Authors:
Nathan Doumèche,
Gérard Biau,
Claire Boyer
Abstract:
Physics-informed neural networks (PINNs) are a promising approach that combines the power of neural networks with the interpretability of physical modeling. PINNs have shown good practical performance in solving partial differential equations (PDEs) and in hybrid modeling scenarios, where physical models enhance data-driven approaches. However, it is essential to establish their theoretical properties in order to fully understand their capabilities and limitations. In this study, we highlight that classical training of PINNs can suffer from systematic overfitting. This problem can be addressed by adding a ridge regularization to the empirical risk, which ensures that the resulting estimator is risk-consistent for both linear and nonlinear PDE systems. However, the strong convergence of PINNs to a solution satisfying the physical constraints requires a more involved analysis using tools from functional analysis and calculus of variations. In particular, for linear PDE systems, an implementable Sobolev-type regularization allows one to reconstruct a solution that not only achieves statistical accuracy but also maintains consistency with the underlying physics.
Submitted 2 May, 2023;
originally announced May 2023.
-
Temperature Evolution of Domains and Intradomain Chirality in 1T-TaS$_{2}$
Authors:
Boning Yu,
Ghilles Ainouche,
Manoj Singh,
Bishnu Sharma,
James Huber,
Michael C. Boyer
Abstract:
We use scanning tunneling microscopy to study the temperature evolution of the atomic-scale properties of the nearly-commensurate charge density wave (NC-CDW) state of the low-dimensional material 1T-TaS$_2$. Our measurements at 203 K, 300 K, and 354 K, roughly spanning the temperature range of the NC-CDW state, show that while the average CDW periodicity is temperature independent, domaining and the local evolution of the CDW lattice within a domain are temperature dependent. Further, we characterize the temperature evolution of the displacement field associated with the recently-discovered intradomain chirality of the NC-CDW state by calculating the local rotation vector. Intradomain chirality throughout the NC-CDW phase is likely driven by a strong coupling of the CDW lattice to the atomic lattice.
Submitted 26 April, 2023;
originally announced April 2023.
-
The James Webb Space Telescope Mission
Authors:
Jonathan P. Gardner,
John C. Mather,
Randy Abbott,
James S. Abell,
Mark Abernathy,
Faith E. Abney,
John G. Abraham,
Roberto Abraham,
Yasin M. Abul-Huda,
Scott Acton,
Cynthia K. Adams,
Evan Adams,
David S. Adler,
Maarten Adriaensen,
Jonathan Albert Aguilar,
Mansoor Ahmed,
Nasif S. Ahmed,
Tanjira Ahmed,
Rüdeger Albat,
Loïc Albert,
Stacey Alberts,
David Aldridge,
Mary Marsha Allen,
Shaune S. Allen,
Martin Altenburg
, et al. (983 additional authors not shown)
Abstract:
Twenty-six years ago a small committee report, building on earlier studies, expounded a compelling and poetic vision for the future of astronomy, calling for an infrared-optimized space telescope with an aperture of at least $4m$. With the support of their governments in the US, Europe, and Canada, 20,000 people realized that vision as the $6.5m$ James Webb Space Telescope. A generation of astronomers will celebrate their accomplishments for the life of the mission, potentially as long as 20 years, and beyond. This report and the scientific discoveries that follow are extended thank-you notes to the 20,000 team members. The telescope is working perfectly, with much better image quality than expected. In this and accompanying papers, we give a brief history, describe the observatory, outline its objectives and current observing program, and discuss the inventions and people who made it possible. We cite detailed reports on the design and the measured performance on orbit.
Submitted 10 April, 2023;
originally announced April 2023.
-
Naive imputation implicitly regularizes high-dimensional linear models
Authors:
Alexis Ayme,
Claire Boyer,
Aymeric Dieuleveut,
Erwan Scornet
Abstract:
Two different approaches exist to handle missing values for prediction: either imputation, prior to fitting any predictive algorithms, or dedicated methods able to natively incorporate missing values. While imputation is widely (and easily) used, it is unfortunately biased when low-capacity predictors (such as linear models) are applied afterward. However, in practice, naive imputation exhibits good predictive performance. In this paper, we study the impact of imputation in a high-dimensional linear model with MCAR missing data. We prove that zero imputation performs an implicit regularization closely related to the ridge method, often used in high-dimensional problems. Leveraging this connection, we establish that the imputation bias is controlled by a ridge bias, which vanishes in high dimension. As a predictor, we argue in favor of the averaged SGD strategy, applied to zero-imputed data. We establish an upper bound on its generalization error, highlighting that imputation is benign in the $d \gg \sqrt{n}$ regime. Experiments illustrate our findings.
Submitted 31 January, 2023;
originally announced January 2023.
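The strategy named in the abstract above, averaged SGD applied to zero-imputed data, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`zero_impute`, `averaged_sgd`), the constant step size, the single pass over the data, the 30% MCAR missingness rate, and the synthetic data are all hypothetical choices.

```python
import numpy as np


def zero_impute(X):
    """Replace missing entries (encoded as NaN) with 0."""
    return np.where(np.isnan(X), 0.0, X)


def averaged_sgd(X, y, step=0.01, seed=0):
    """One pass of SGD on the squared loss; returns the average of the iterates."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    theta_bar = np.zeros(d)
    for t, i in enumerate(rng.permutation(n), start=1):
        grad = (X[i] @ theta - y[i]) * X[i]   # gradient of 0.5 * (x.theta - y)^2
        theta = theta - step * grad
        theta_bar += (theta - theta_bar) / t  # running mean of the iterates
    return theta_bar


# Synthetic linear model with MCAR missingness (all values are assumptions).
rng = np.random.default_rng(0)
n, d = 500, 20
X_full = rng.normal(size=(n, d))
beta = rng.normal(size=d) / np.sqrt(d)
y = X_full @ beta + 0.1 * rng.normal(size=n)
mask = rng.random((n, d)) < 0.3               # each entry missing with prob. 0.3
X_obs = np.where(mask, np.nan, X_full)

X_imp = zero_impute(X_obs)
theta_hat = averaged_sgd(X_imp, y)
```

The predictor is then `zero_impute(x_new) @ theta_hat` for a new, partially observed input `x_new`. The step size would need tuning in practice.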
-
Existence and Non-Existence of Constant Scalar Curvature and Extremal Sasaki Metrics
Authors:
Charles P. Boyer,
Hongnian Huang,
Eveline Legendre,
Christina W. Tønnesen-Friedman
Abstract:
We discuss the existence and non-existence of constant scalar curvature, as well as extremal, Sasaki metrics. We prove that the natural Sasaki-Boothby-Wang manifold over the admissible projective bundles over local products of non-negative CSC Kähler metrics, as described in https://link-springer-com.libproxy.unm.edu/article/10.1007/s00222-008-0126-x, always has a constant scalar curvature (CSC) Sasaki metric in its Sasaki-Reeb cone. Moreover, we give examples that show that the extremal Sasaki--Reeb cone, defined as the set of Sasaki--Reeb vector fields admitting a compatible extremal Sasaki metric, is not necessarily connected in the Sasaki--Reeb cone, and it can be empty even in the non-Gorenstein case. We also show by example that a non-empty extremal Sasaki--Reeb cone need not contain a (CSC) Sasaki metric which answers a question posed in https://mathscinet-ams-org.libproxy.unm.edu/mathscinet-getitem?mr=4420789. The paper also contains an appendix where we explore the existence of Kähler metrics of constant weighted scalar curvature, as defined in https://londmathsoc-onlinelibrary-wiley-com.libproxy.unm.edu/doi/full/10.1112/plms.12255, on admissible manifolds over local products of non-negative CSC Kähler metrics.
Submitted 27 June, 2023; v1 submitted 21 October, 2022;
originally announced October 2022.
-
Sparse tree-based initialization for neural networks
Authors:
Patrick Lutz,
Ludovic Arnould,
Claire Boyer,
Erwan Scornet
Abstract:
Dedicated neural network (NN) architectures have been designed to handle specific data types (such as CNN for images or RNN for text), which ranks them among state-of-the-art methods for dealing with these data. Unfortunately, no architecture has been found for dealing with tabular data yet, for which tree ensemble methods (tree boosting, random forests) usually show the best predictive performance. In this work, we propose a new sparse initialization technique for (potentially deep) multilayer perceptrons (MLP): we first train a tree-based procedure to detect feature interactions and use the resulting information to initialize the network, which is subsequently trained via standard stochastic gradient strategies. Numerical experiments on several tabular data sets show that this new, simple and easy-to-use method is a solid competitor, both in terms of generalization capacity and computation time, to default MLP initialization and even to existing complex deep learning solutions. In fact, this wise MLP initialization raises the resulting NN methods to the level of a valid competitor to gradient boosting when dealing with tabular data. Besides, such initializations are able to preserve the sparsity of weights introduced in the first layers of the network through training. This fact suggests that this new initializer performs an implicit regularization during the NN training, and emphasizes that the first layers act as a sparse feature extractor (as for convolutional layers in CNN).
Submitted 30 September, 2022;
originally announced September 2022.
-
Outcome coding choice in randomized trials of programs to reduce violence
Authors:
Christopher Boyer,
Sangeeta Chatterji,
Jasper Cooper,
Lori Heise
Abstract:
Over the last decade, the number of randomized trials of programs to reduce intimate partner violence (IPV) has grown precipitously. However, most trials continue to measure and code violence using standards originally designed for global prevalence surveys. This choice may have consequences in terms of bias, power, and efficiency of trial estimates and may limit what we can learn about how programs are working. In this paper, we return to first principles to develop a generative model for violence reduction. We then use this model to better understand trade-offs in outcome coding choices via simulation. We re-analyze results from seven recent trials in Southern and Eastern Africa to highlight some of our findings. We conclude with a discussion of key take-aways for trialists.
Submitted 27 September, 2022; v1 submitted 26 April, 2022;
originally announced April 2022.
-
Is interpolation benign for random forest regression?
Authors:
Ludovic Arnould,
Claire Boyer,
Erwan Scornet
Abstract:
Statistical wisdom suggests that very complex models, interpolating training data, will be poor at predicting unseen examples. Yet, this aphorism has been recently challenged by the identification of benign overfitting regimes, especially studied in the case of parametric models: generalization capabilities may be preserved despite high model complexity. While it is widely known that fully-grown decision trees interpolate and, in turn, have bad predictive performance, the same behavior is yet to be analyzed for Random Forests (RF). In this paper, we study the trade-off between interpolation and consistency for several types of RF algorithms. Theoretically, we prove that interpolation regimes and consistency cannot be achieved simultaneously for several non-adaptive RF. Since adaptivity seems to be the cornerstone to bring together interpolation and consistency, we study interpolating Median RF, which are proved to be consistent in the interpolating regime. This is the first result reconciling interpolation and consistency for RF, highlighting that the averaging effect introduced by feature randomization is a key mechanism, sufficient to ensure consistency in the interpolation regime and beyond. Numerical experiments show that Breiman's RF are consistent while exactly interpolating, when no bootstrap step is involved. We theoretically control the size of the interpolation area, which converges fast enough to zero, giving a necessary condition for exact interpolation and consistency to occur in conjunction.
Submitted 9 February, 2023; v1 submitted 8 February, 2022;
originally announced February 2022.
-
Minimax rate of consistency for linear models with missing values
Authors:
Alexis Ayme,
Claire Boyer,
Aymeric Dieuleveut,
Erwan Scornet
Abstract:
Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys...). In fact, the very nature of missing values usually prevents us from running standard learning algorithms. In this paper, we focus on the extensively-studied linear models, but in the presence of missing values, which turns out to be quite a challenging task. Indeed, the Bayes rule can be decomposed as a sum of predictors corresponding to each missing pattern. This eventually requires solving a number of learning tasks that is exponential in the number of input features, which makes predictions impossible for current real-world datasets. First, we propose a rigorous setting to analyze a least-square type estimator and establish a bound on the excess risk which increases exponentially in the dimension. Consequently, we leverage the missing data distribution to propose a new algorithm, and derive associated adaptive risk bounds that turn out to be minimax optimal. Numerical experiments highlight the benefits of our method compared to state-of-the-art algorithms used for predictions with missing values.
Submitted 3 February, 2022;
originally announced February 2022.
-
Model-based Clustering with Missing Not At Random Data
Authors:
Aude Sportisse,
Matthieu Marbac,
Fabien Laporte,
Gilles Celeux,
Claire Boyer,
Julie Josse,
Christophe Biernacki
Abstract:
Model-based unsupervised learning, like any learning task, stalls as soon as missing data occur. This is all the more true when the missing data are informative, that is, missing not at random (MNAR). In this paper, we propose model-based clustering algorithms designed to handle very general types of missing data, including MNAR data. To do so, we introduce a mixture model for different types of data (continuous, count, categorical and mixed) to jointly model the data distribution and the MNAR mechanism, remaining vigilant to the relative degrees of freedom of each. Several MNAR models are discussed, for which the cause of the missingness can depend on both the values of the missing variables themselves and on the class membership. However, we focus on a specific MNAR model, called MNARz, for which the missingness only depends on the class membership. We first underline its ease of estimation, by showing that the statistical inference can be carried out on the data matrix concatenated with the missing mask, ultimately considering a standard MAR mechanism. Consequently, we propose to perform clustering using the Expectation Maximization algorithm, specially developed for this simplified reinterpretation. Finally, we assess the numerical performance of the proposed methods on synthetic data and on the real medical registry TraumaBase as well.
Submitted 22 December, 2023; v1 submitted 20 December, 2021;
originally announced December 2021.
-
Lattice-Driven Chiral Charge Density Wave State in 1T-TaS$_{2}$
Authors:
Manoj Singh,
Boning Yu,
James Huber,
Bishnu Sharma,
Ghilles Ainouche,
Ling Fu,
Jasper van Wezel,
Michael C. Boyer
Abstract:
We use scanning tunneling microscopy to study the domain structure of the nearly-commensurate charge density wave (NC-CDW) state of 1T-TaS$_2$. In our sub-angstrom characterization of the state, we find a continual evolution of the CDW lattice from domain wall to domain center, instead of a fixed CDW arrangement within a domain. Further, we uncover an intradomain chirality characterizing the NC-CDW state. Unlike the orbital-driven chirality previously observed in related transition metal dichalcogenides, the chiral nature of the NC-CDW state in 1T-TaS$_2$ appears driven by a strong coupling of the NC-CDW state to the lattice.
Submitted 6 December, 2021;
originally announced December 2021.
-
Constant Scalar Curvature Sasaki Metrics and Projective Bundles
Authors:
Charles P. Boyer,
Christina W. Tønnesen-Friedman
Abstract:
In this paper we consider the Boothby-Wang construction over twist 1 stage 3 Bott orbifolds given in terms of the log pair $(S_{\bf n},Δ_{\bf m})$. We give explicit constant scalar curvature (CSC) Sasaki metrics either directly from CSC Kähler orbifold metrics or by using the weighted extremal approach of Apostolov and Calderbank. The Sasaki 7-manifolds (orbifolds) are finitely covered by compact simply connected manifolds (orbifolds) with the rational homology of the 2-fold connected sum of $S^2\times S^5$.
Submitted 13 May, 2022; v1 submitted 2 December, 2021;
originally announced December 2021.
-
Transverse Kähler holonomy in Sasaki Geometry and $\mathcal{S}$-Stability
Authors:
Charles P. Boyer,
Hongnian Huang,
Christina V. Tønnesen-Friedman
Abstract:
We study the transverse Kähler holonomy groups on Sasaki manifolds $(M,\mathcal{S})$ and their stability properties under transverse holomorphic deformations of the characteristic foliation by the Reeb vector field. In particular, we prove that when the first Betti number $b_1(M)$ and the basic Hodge number $h^{0,2}_B(\mathcal{S})$ vanish, then $\mathcal{S}$ is stable under deformations of the transverse Kähler flow. In addition, we show that an irreducible transverse hyperkähler Sasakian structure is $\mathcal{S}$-unstable, whereas an irreducible transverse Calabi-Yau Sasakian structure is $\mathcal{S}$-stable when $\dim M\geq 7$. Finally, we prove that the standard Sasaki join operation (transverse holonomy $U(n_1)\times U(n_2)$) as well as the fiber join operation preserve $\mathcal{S}$-stability.
Submitted 5 October, 2021; v1 submitted 1 March, 2021;
originally announced March 2021.
-
On the asymptotic rate of convergence of Stochastic Newton algorithms and their Weighted Averaged versions
Authors:
Claire Boyer,
Antoine Godichon-Baggioni
Abstract:
The majority of machine learning methods can be regarded as the minimization of an unavailable risk function. To optimize the latter, given samples provided in a streaming fashion, we define a general stochastic Newton algorithm and its weighted average version. In several use cases, both implementations will be shown not to require the inversion of a Hessian estimate at each iteration; a direct update of the estimate of the inverse Hessian will be favored instead. This generalizes a trick introduced in [2] for the specific case of logistic regression. Under mild assumptions such as local strong convexity at the optimum, we establish almost sure convergence and rates of convergence of the algorithms, as well as central limit theorems for the constructed parameter estimates. The unified framework considered in this paper covers the case of linear, logistic or softmax regressions, to name a few. Numerical experiments on simulated data give empirical evidence of the pertinence of the proposed methods, which outperform popular competitors, particularly in case of bad initializations.
Submitted 29 June, 2023; v1 submitted 19 November, 2020;
originally announced November 2020.
-
Analyzing the tree-layer structure of Deep Forests
Authors:
Ludovic Arnould,
Claire Boyer,
Erwan Scornet,
Sorbonne Lpsm
Abstract:
Random forests on the one hand, and neural networks on the other hand, have met great success in the machine learning community for their predictive performance. Combinations of both have been proposed in the literature, notably leading to the so-called deep forests (DF) (Zhou \& Feng, 2019). In this paper, our aim is not to benchmark DF performances but to investigate instead their underlying mechanisms. Additionally, the DF architecture can generally be simplified into simpler and more computationally efficient shallow forest networks. Despite some instability, the latter may outperform standard predictive tree-based methods. We exhibit a theoretical framework in which a shallow tree network is shown to enhance the performance of classical decision trees. In such a setting, we provide tight theoretical lower and upper bounds on its excess risk. These theoretical results show the interest of tree-network architectures for well-structured data provided that the first layer, acting as a data encoder, is rich enough.
Submitted 14 October, 2021; v1 submitted 29 October, 2020;
originally announced October 2020.
-
Sasakian Geometry on Sphere Bundles
Authors:
Charles P. Boyer,
Christina W. Tønnesen-Friedman
Abstract:
The purpose of this paper is to study the Sasakian geometry on odd dimensional sphere bundles over a smooth projective algebraic variety $N$ with the ultimate, but probably unachievable, goal of understanding the existence and non-existence of extremal and constant scalar curvature Sasaki metrics. We apply the fiber join construction of Yamazaki \cite{Yam99} for K-contact manifolds to the Sasaki case. This construction depends on the choice of $d+1$ integral Kähler classes $[ω_j]$ on $N$ that are not necessarily colinear in the Kähler cone. We show that the colinear case is equivalent to a subclass of a different join construction originally described in \cite{BG00a,BGO06}, applied to the spherical case by the authors in \cite{BoTo13,BoTo14a} when $d=1$, and known as cone decomposable \cite{BHLT16}. The non-colinear case gives rise to infinite families of new inequivalent cone indecomposable Sasaki contact CR structures on certain sphere bundles. We prove that the Sasaki cone for some of these structures contains an open set of extremal Sasaki metrics and, for certain specialized cases, the regular ray within this cone is shown to have constant scalar curvature. We also compute the cohomology groups of all such sphere bundles over a product of Riemann surfaces.
Submitted 20 July, 2020;
originally announced July 2020.
-
Iterated $S^3$ Sasaki Joins and Bott Orbifolds
Authors:
Charles P Boyer,
Christina Tønnesen-Friedman
Abstract:
We present a categorical relationship between iterated $S^3$ Sasaki-joins and Bott orbifolds. Then we show how to construct smooth Sasaki-Einstein (SE) structures on the iterated joins. These become increasingly complicated as dimension grows. We give an explicit construction of (infinitely many) smooth SE structures up through dimension eleven, and conjecture the existence of smooth SE structures in all odd dimensions.
Submitted 11 June, 2020;
originally announced June 2020.
-
Robust Lasso-Zero for sparse corruption and model selection with missing covariates
Authors:
Pascaline Descloux,
Claire Boyer,
Julie Josse,
Aude Sportisse,
Sylvain Sardy
Abstract:
We propose Robust Lasso-Zero, an extension of the Lasso-Zero methodology, initially introduced for sparse linear models, to the sparse corruptions problem. We give theoretical guarantees on the sign recovery of the parameters for a slightly simplified version of the estimator, called Thresholded Justice Pursuit. The use of Robust Lasso-Zero is showcased for variable selection with missing values in the covariates. In addition to not requiring the specification of a model for the covariates, nor estimating their covariance matrix or the noise variance, the method has the great advantage of handling missing-not-at-random values without specifying a parametric model. Numerical experiments and a medical application underline the relevance of Robust Lasso-Zero in such a context with few available competitors. The method is easy to use and implemented in the R library lass0.
Submitted 23 March, 2022; v1 submitted 12 May, 2020;
originally announced May 2020.
-
Sampling Rates for $\ell^1$-Synthesis
Authors:
Maximilian März,
Claire Boyer,
Jonas Kahn,
Pierre Weiss
Abstract:
This work investigates the problem of signal recovery from undersampled noisy sub-Gaussian measurements under the assumption of a synthesis-based sparsity model. Solving the $\ell^1$-synthesis basis pursuit allows for a simultaneous estimation of a coefficient representation as well as the sought-for signal. However, due to linear dependencies within redundant dictionary atoms it might be impossible to identify a specific representation vector, although the actual signal is still successfully recovered. The present manuscript studies both estimation problems from a non-uniform, signal-dependent perspective. By utilizing recent results on the convex geometry of linear inverse problems, the sampling rates describing the phase transitions of each formulation are identified. In both cases, they are given by the conic Gaussian mean width of an $\ell^1$-descent cone that is linearly transformed by the dictionary. In general, this expression does not allow a simple calculation by following the polarity-based approach commonly found in the literature. Hence, two upper bounds involving the sparsity of coefficient representations are provided: the first one is based on a local condition number and the second one on a geometric analysis that makes use of the thinness of high-dimensional polyhedral cones with not too many generators. It is furthermore revealed that both recovery problems can differ dramatically with respect to robustness to measurement noise -- a fact that seems to have gone unnoticed in most of the related literature. All insights are carefully underpinned by numerical simulations.
Submitted 15 April, 2020;
originally announced April 2020.
-
Debiasing Stochastic Gradient Descent to handle missing values
Authors:
Julie Josse,
Aude Sportisse,
Claire Boyer,
Aymeric Dieuleveut
Abstract:
The stochastic gradient algorithm is a key ingredient of many machine learning methods, particularly appropriate for large-scale learning. However, a major caveat of large data is their incompleteness. We propose an averaged stochastic gradient algorithm handling missing values in linear models. This approach has the merit of being free from the need for any data distribution modeling and of accounting for heterogeneous missing proportions. In both streaming and finite-sample settings, we prove that this algorithm achieves a convergence rate of $\mathcal{O}(\frac{1}{n})$ at iteration $n$, the same as without missing values. We show the convergence behavior and the relevance of the algorithm not only on synthetic data but also on real data sets, including those collected from a medical register.
Submitted 8 June, 2020; v1 submitted 21 February, 2020;
originally announced February 2020.
-
Missing Data Imputation using Optimal Transport
Authors:
Boris Muzellec,
Julie Josse,
Claire Boyer,
Marco Cuturi
Abstract:
Missing data is a crucial issue when applying machine learning algorithms to real-world datasets. Starting from the simple assumption that two batches extracted randomly from the same dataset should share the same distribution, we leverage optimal transport distances to quantify that criterion and turn it into a loss function to impute missing data values. We propose practical methods to minimize these losses using end-to-end learning, that can exploit or not parametric assumptions on the underlying distributions of values. We evaluate our methods on datasets from the UCI repository, in MCAR, MAR and MNAR settings. These experiments show that OT-based methods match or out-perform state-of-the-art imputation methods, even for high percentages of missing values.
Submitted 1 July, 2020; v1 submitted 10 February, 2020;
originally announced February 2020.
-
The $S^3_{\mathbf{w}}$ Sasaki Join Construction
Authors:
Charles P. Boyer,
Christina W. Tønnesen-Friedman
Abstract:
The main purpose of this work is to generalize the $S^3_{\mathbf{w}}$ Sasaki join construction $M\star_{\mathbf{l}} S^3_{\mathbf{w}}$ described in \cite{BoTo14a} when the Sasakian structure on $M$ is regular, to the general case where the Sasakian structure is only quasi-regular. This gives one of the main results, Theorem 3.2, which describes an inductive procedure for constructing Sasakian metrics of constant scalar curvature. In the Gorenstein case ($c_1(\mathcal{D})=0$) we construct a polynomial whose coefficients are linear in the components of $\mathbf{w}$ and whose unique root in the interval $(1,\infty)$ completely determines the Sasaki-Einstein metric. In the more general case we apply our results to prove that there exist infinitely many smooth 7-manifolds each of which admits infinitely many inequivalent contact structures of Sasaki type admitting constant scalar curvature Sasaki metrics (see Corollary 6.15). We also discuss the relationship with a recent paper \cite{ApCa18} of Apostolov and Calderbank as well as the relation with K-stability.
Submitted 2 October, 2021; v1 submitted 25 November, 2019;
originally announced November 2019.
-
Interplay of Charge Density Wave States and Strain at the Surface of CeTe$_{2}$
Authors:
Bishnu Sharma,
Manoj Singh,
Burhan Ahmed,
Boning Yu,
Philip Walmsley,
Ian R. Fisher,
Michael C. Boyer
Abstract:
We use scanning tunneling microscopy (STM) to study charge density wave (CDW) states in the rare-earth di-telluride, CeTe$_{2}$. In contrast to previous experimental and first-principles studies of the rare-earth di-tellurides, our STM measurements surprisingly detect a unidirectional CDW with $q \approx 0.28\,a^*$, which is very close to what is found in experimental measurements of the related rare-earth tri-tellurides. Furthermore, in the vicinity of an extended sub-surface defect, we find spatially-separated as well as spatially-coexisting unidirectional CDWs at the surface of CeTe$_{2}$. We quantify the nanoscale strain and its variations induced by this defect, and establish a correlation between local lattice strain and the locally-established CDW states. Our measurements probe the fundamental properties of a weakly-bound two-dimensional Te-sheet, which experimental and theoretical work has previously established as the fundamental component driving much of the essential physics in both the rare-earth di- and tri-telluride compounds.
Submitted 13 November, 2019;
originally announced November 2019.
-
Some Open Problems in Sasaki Geometry
Authors:
Charles P. Boyer,
Hongnian Huang,
Eveline Legendre,
Christina W. Tønnesen-Friedman
Abstract:
This paper has been submitted to the Proceedings of the Australian-German Workshop on Differential Geometry in the Large held at the mathematical research institute MATRIX in Creswick, Victoria, Australia, Feb. 2 to Feb. 14, 2019. We describe and discuss two important open problems in Sasaki geometry.
Submitted 11 June, 2019;
originally announced June 2019.
-
Estimation and imputation in Probabilistic Principal Component Analysis with Missing Not At Random data
Authors:
Aude Sportisse,
Claire Boyer,
Julie Josse
Abstract:
Missing Not At Random (MNAR) values lead to significant biases in the data, since the probability of missingness depends on the unobserved values. They are "not ignorable" in the sense that they often require defining a model for the missing data mechanism, which makes inference or imputation tasks more complex. Furthermore, this implies a strong \textit{a priori} on the parametric form of the distribution. However, some works have obtained guarantees on the estimation of parameters in the presence of MNAR data, without specifying the distribution of missing data \citep{mohan2018estimation, tang2003analysis}. This is very useful in practice, but is limited to simple cases such as self-masked MNAR values in data generated according to linear regression models. We continue this line of research, but extend it to a more general MNAR mechanism, in a more general model of the probabilistic principal component analysis (PPCA), \textit{i.e.}, a low-rank model with random effects. We prove identifiability of the PPCA parameters. We then propose an estimation of the loading coefficients and a data imputation method. They are based on estimators of means, variances and covariances of missing variables, for which consistency is discussed. These estimators have the great advantage of being calculated using only the observed data, leveraging the underlying low-rank structure of the data. We illustrate the relevance of the method with numerical experiments on synthetic data and also on real data collected from a medical register.
Submitted 10 June, 2020; v1 submitted 6 June, 2019;
originally announced June 2019.
-
Imputation and low-rank estimation with Missing Not At Random data
Authors:
Aude Sportisse,
Claire Boyer,
Julie Josse
Abstract:
Missing values challenge data analysis because many supervised and unsupervised learning methods cannot be applied directly to incomplete data. Matrix completion methods based on low-rank assumptions are very powerful solutions for dealing with missing values. However, existing methods do not consider the case of informative missing values, which are widely encountered in practice. This paper proposes matrix completion methods to recover Missing Not At Random (MNAR) data. Our first contribution is to suggest a model-based estimation strategy by modelling the missing mechanism distribution. An EM algorithm is then implemented, involving a Fast Iterative Soft-Thresholding Algorithm (FISTA). Our second contribution is to suggest a computationally efficient surrogate estimation by implicitly taking into account the joint distribution of the data and the missing mechanism: the data matrix is concatenated with the mask coding for the missing values; a low-rank structure for exponential family is assumed on this new matrix, in order to encode links between variables and missing mechanisms. The methodology, which has the great advantage of handling different missing value mechanisms, is robust to model specification errors. The performances of our methods are assessed on the real data collected from a trauma registry (TraumaBase) containing clinical information about over twenty thousand severely traumatized patients in France. The aim is then to predict whether doctors should administer tranexamic acid, which would limit excessive bleeding, to patients with traumatic brain injury.
Submitted 29 January, 2020; v1 submitted 29 December, 2018;
originally announced December 2018.
-
Convex Regularization and Representer Theorems
Authors:
Claire Boyer,
Antonin Chambolle,
Yohann de Castro,
Vincent Duval,
Frédéric de Gournay,
Pierre Weiss
Abstract:
We establish a result which states that regularizing an inverse problem with the gauge of a convex set $C$ yields solutions which are linear combinations of a few extreme points or elements of the extreme rays of $C$. These can be understood as the \textit{atoms} of the regularizer. We then make that general principle explicit through a few popular applications. In particular, we relate it to the common wisdom that total gradient variation minimization favors the reconstruction of piecewise constant images.
Submitted 11 December, 2018;
originally announced December 2018.
-
Sasaki-Einstein Metrics on a class of 7-Manifolds
Authors:
Charles P. Boyer,
Christina Tønnesen-Friedman
Abstract:
In this note we give an explicit construction of Sasaki-Einstein metrics on a class of simply connected 7-manifolds with the rational cohomology of the 2-fold connected sum of $S^2\times S^5$. The homotopy types are distinguished by torsion in $H^4$.
Submitted 20 November, 2018;
originally announced November 2018.
-
Density wave probes cuprate quantum phase transition
Authors:
Tatiana A. Webb,
Michael C. Boyer,
Yi Yin,
Debanjan Chowdhury,
Yang He,
Takeshi Kondo,
T. Takeuchi,
H. Ikuta,
Eric W. Hudson,
Jennifer E. Hoffman,
Mohammad H. Hamidian
Abstract:
In cuprates, the strong correlations in proximity to the antiferromagnetic Mott insulating state give rise to an array of unconventional phenomena beyond high temperature superconductivity. Developing a complete description of the ground state evolution is crucial to decoding the complex phase diagram. Here we use the structure of broken translational symmetry, namely $d$-form factor charge modulations in (Bi,Pb)$_2$(Sr,La)$_2$CuO$_{6+δ}$, as a probe of the ground state reorganization that occurs at the transition from truncated Fermi arcs to a large Fermi surface. We use real space imaging of nanoscale electronic inhomogeneity as a tool to access a range of dopings within each sample, and we definitively validate the spectral gap $Δ$ as a proxy for local hole doping. From the $Δ$-dependence of the charge modulation wavevector, we discover a commensurate to incommensurate transition that is coincident with the Fermi surface transition from arcs to large hole pocket, demonstrating the qualitatively distinct nature of the electronic correlations governing the two sides of this quantum phase transition. Furthermore, the doping dependence of the incommensurate wavevector on the overdoped side is at odds with a simple Fermi surface driven instability.
Submitted 15 May, 2019; v1 submitted 14 November, 2018;
originally announced November 2018.
-
Contact Structures of Sasaki Type and their Associated Moduli
Authors:
Charles P. Boyer
Abstract:
This article is based on a talk at the RIEMain in Contact conference in Cagliari, Italy in honor of the 78th birthday of David Blair, one of the founders of modern Riemannian contact geometry. The present article is a survey of a special type of Riemannian contact structure known as Sasakian geometry. An ultimate goal of this survey is to understand the moduli of classes of Sasakian structures as well as the moduli of extremal and constant scalar curvature Sasaki metrics, and in particular the moduli of Sasaki-Einstein metrics.
Submitted 17 October, 2018;
originally announced October 2018.
-
Proximal boosting: aggregating weak learners to minimize non-differentiable losses
Authors:
Erwan Fouillen,
Claire Boyer,
Maxime Sangnier
Abstract:
Gradient boosting is a prediction method that iteratively combines weak learners to produce a complex and accurate model. From an optimization point of view, the learning procedure of gradient boosting mimics a gradient descent on a functional variable. This paper proposes to build upon the proximal point algorithm, when the empirical risk to minimize is not differentiable, in order to introduce a novel boosting approach, called proximal boosting. It comes with a companion algorithm inspired by [1] and called residual proximal boosting, which is aimed at better controlling the approximation error. Theoretical convergence is proved for these two procedures under different hypotheses on the empirical risk and advantages of leveraging proximal methods for boosting are illustrated by numerical experiments on simulated and real-world data. In particular, we exhibit a favorable comparison over gradient boosting regarding convergence rate and prediction accuracy.
Submitted 29 November, 2022; v1 submitted 29 August, 2018;
originally announced August 2018.
-
Magnetic and structural properties of Co$_2$MnSi based Heusler compound
Authors:
S. J. Ahmed,
C. Boyer,
M. Niewczas
Abstract:
The influence of antisite disorder occupancies on the magnetic properties of the half-metallic Co$_2$MnSi compound was studied by experimental techniques and first-principles calculations. The neutron diffraction studies show almost identical amounts of Mn and Co disorder, 6.5\% and 7.6\% respectively, in good agreement with density functional theory (DFT) calculations of the stable Co$_2$MnSi system with the corresponding disorders. DFT studies reveal that antiferromagnetic interactions introduced by Mn disorder lead to a reduction of the net magnetic moment. The results are discussed in conjunction with neutron diffraction and magnetization measurements. Transport property measurements under a magnetic field of up to 9 T revealed a positive magnetoresistance for bulk Co$_2$MnSi that persists up to room temperature. A Curie temperature of $\sim$1014 K was determined for the compound by high temperature electrical resistivity and dilatometry measurements.
Submitted 23 November, 2018; v1 submitted 16 August, 2018;
originally announced August 2018.
-
Magnetic and structural studies of G-phase compound Mn$_6$Ni$_{16}$Si$_7$
Authors:
S. J. Ahmed,
J. E. Greedan,
C. Boyer,
M. Niewczas
Abstract:
Transition metal compounds with complex crystal structures tend to demonstrate interesting magnetic coupling resulting in unusual magnetic properties. In this work, the structural and magnetic characterization of a single crystal of the Ni-Mn-Si based G-phase compound, Mn$_6$Ni$_{16}$Si$_7$, grown by the Czochralski method, is reported. In this structure isolated octahedral Mn$_6$ clusters form a f.c.c. lattice. As each octahedron consists of eight edge-sharing equilateral triangles, the possibility for geometric frustration exists. Magnetization and specific heat measurements showed two magnetic phase transitions at 197 K and 50 K, respectively. At 100 K neutron diffraction on powder samples shows a magnetic structure with k = (001) in which only four of the six Mn spins per cluster order along $<100>$ directions giving a two dimensional magnetic structure consistent with intra-cluster frustration. Below the 50 K phase transition the Mn spins cant away from $<100>$ directions and a weak moment develops on the two remaining Mn octahedral sites.
Submitted 6 September, 2018; v1 submitted 16 August, 2018;
originally announced August 2018.
-
On Positivity in Sasaki Geometry
Authors:
Charles P. Boyer,
Christina W. Tønnesen-Friedman
Abstract:
It is well known that if the dimension of the Sasaki cone is greater than one, then all Sasakian structures are either positive or indefinite. We discuss the phenomenon of type changing within a fixed Sasaki cone. Assuming henceforth that the dimension of the Sasaki cone is greater than one, there are three possibilities, either all elements are positive, all are indefinite, or both positive and indefinite Sasakian structures occur. We illustrate by examples how the type can change as we move in the Sasaki cone. If there exists a Sasakian structure in the cone whose total transverse scalar curvature is non-positive, then all elements of the Sasaki cone are indefinite. Furthermore, we prove that if the first Chern class is a torsion class or represented by a positive definite (1,1) form, then all elements of the Sasaki cone are positive.
Submitted 13 August, 2018; v1 submitted 9 August, 2018;
originally announced August 2018.
-
On Representer Theorems and Convex Regularization
Authors:
Claire Boyer,
Antonin Chambolle,
Yohann De Castro,
Vincent Duval,
Frédéric De Gournay,
Pierre Weiss
Abstract:
We establish a general principle which states that regularizing an inverse problem with a convex function yields solutions which are convex combinations of a small number of atoms. These atoms are identified with the extreme points and elements of the extreme rays of the regularizer level sets. An extension to a broader class of quasi-convex regularizers is also discussed. As a side result, we characterize the minimizers of the total gradient variation, which was still an unresolved problem.
Submitted 26 November, 2018; v1 submitted 26 June, 2018;
originally announced June 2018.
-
On oracle-type local recovery guarantees in compressed sensing
Authors:
Ben Adcock,
Claire Boyer,
Simone Brugiapaglia
Abstract:
We present improved sampling complexity bounds for stable and robust sparse recovery in compressed sensing. Our unified analysis based on l1 minimization encompasses the case where (i) the measurements are block-structured samples in order to reflect the structured acquisition that is often encountered in applications; (ii) the signal has an arbitrary structured sparsity, by results depending on its support S. Within this framework and under a random sign assumption, the number of measurements needed by l1 minimization can be shown to be of the same order as that required by an oracle least-squares estimator. Moreover, these bounds can be minimized by adapting the variable density sampling to a given prior on the signal support and to the coherence of the measurements. We illustrate both numerically and analytically that our results can be successfully applied to recover Haar wavelet coefficients that are sparse in levels from random Fourier measurements in dimensions one and two, which can be of particular interest in imaging problems. Finally, a preliminary numerical investigation shows the potential of this theory for devising adaptive sampling strategies in sparse polynomial approximation.
Submitted 10 December, 2019; v1 submitted 10 June, 2018;
originally announced June 2018.
-
The Kähler geometry of Bott manifolds
Authors:
Charles P. Boyer,
David M. J. Calderbank,
Christina W. Tønnesen-Friedman
Abstract:
We study the Kähler geometry of stage n Bott manifolds, which can be viewed as $n$-dimensional generalizations of Hirzebruch surfaces. We show, using a simple induction argument and the generalized Calabi construction from [ACGT04,ACGT11], that any stage n Bott manifold $M_n$ admits an extremal Kähler metric. We also give necessary conditions for $M_n$ to admit a constant scalar curvature Kähler metric. We obtain more precise results for stage 3 Bott manifolds, including in particular some interesting relations with c-projective geometry and some explicit examples of almost Kähler structures.
To place these results in context, we review and develop the topology, complex geometry and symplectic geometry of Bott manifolds. In particular, we study the Kähler cone, the automorphism group and the Fano condition. We also relate the number of conjugacy classes of maximal tori in the symplectomorphism group to the number of biholomorphism classes compatible with the symplectic structure.
Submitted 18 April, 2019; v1 submitted 29 January, 2018;
originally announced January 2018.
-
An application of the Duistermaat--Heckman Theorem and its extensions in Sasaki Geometry
Authors:
Charles P. Boyer,
Hongnian Huang,
Eveline Legendre
Abstract:
Building on an idea laid out by Martelli--Sparks--Yau, we use the Duistermaat--Heckman localization formula and an extension of it to give rational and explicit expressions of the volume, the total transversal scalar curvature and the Einstein--Hilbert functional, seen as functionals on the Sasaki cone (Reeb cone). Studying the leading terms, we prove they are all proper. Among the consequences, we get that the Einstein--Hilbert functional attains its minimal value and each Sasaki cone possesses at least one Reeb vector field with vanishing transverse Futaki invariant.
Submitted 9 August, 2017;
originally announced August 2017.
-
Relative K-stability and Extremal Sasaki metrics
Authors:
Charles P. Boyer,
Craig van Coevering
Abstract:
We define K-stability of a polarized Sasakian manifold relative to a maximal torus of automorphisms. The existence of a Sasaki-extremal metric in the polarization is shown to imply that the polarization is K-semistable. Computing this invariant for the deformation to the normal cone gives an extension of the Lichnerowicz obstruction, due to Gauntlett, Martelli, Sparks, and Yau, to an obstruction of Sasaki-extremal metrics. We use this to give a list of examples of Sasakian manifolds whose Sasaki cone contains no extremal representatives. These give the first examples of Sasaki cones of dimension greater than one that contain no extremal Sasaki metrics whatsoever. In the process we compute the unreduced Sasaki cone for an arbitrary smooth link of a weighted homogeneous polynomial.
Submitted 20 December, 2016; v1 submitted 22 August, 2016;
originally announced August 2016.
-
Multiple Charge Density Wave States at the Surface of TbTe$_3$
Authors:
Ling Fu,
Aaron M. Kraft,
Bishnu Sharma,
Manoj Singh,
Philip Walmsley,
Ian R. Fisher,
Michael C. Boyer
Abstract:
We studied TbTe$_{3}$ using scanning tunneling microscopy (STM) in the temperature range of 298 - 355 K. As seen in previous STM measurements on RTe$_{3}$ compounds, our measurements detect a unidirectional charge density wave state (CDW) in the surface Te-layer with a wavevector consistent with that of the bulk, q$_{cdw}$ = 0.30 $\pm$ 0.01c$^{*}$. However, unlike previous STM measurements, and differing from measurements probing the bulk, we detect two perpendicular orientations for the unidirectional CDWs with no directional preference for the in-plane crystal axes (a- or c-axis) and no noticeable difference in wavevector magnitude. In addition, we find regions in which the bidirectional CDW states coexist. We propose that observation of two CDW states indicates a decoupling of the surface Te-layer from the rare-earth block layer below, and that strain variations in the Te surface layer drive the local CDW direction to the specific unidirectional or, in rare occurrences, bidirectional CDW orders observed. This indicates that similar driving mechanisms for CDW formation in the bulk, where anisotropic lattice strain energy is important, are at play at the surface. In our bias-dependent measurements, we find no contrast inversion for the CDW state between occupied and empty states. This finding differs from other quasi 2-dimensional materials containing a hidden 1-dimensional character which leads to a favorable Fermi surface nesting scenario. Our temperature-dependent measurements provide evidence for localized CDW formation above the bulk transition temperature, T$_{cdw}$.
Submitted 26 July, 2016;
originally announced July 2016.
-
Suppression of Superfluid Density and the Pseudogap State in the Cuprates by Impurities
Authors:
Unurbat Erdenemunkh,
Brian Koopman,
Ling Fu,
Kamalesh Chatterjee,
W. D. Wise,
G. D. Gu,
E. W. Hudson,
Michael C. Boyer
Abstract:
We use scanning tunneling microscopy (STM) to study magnetic Fe impurities intentionally doped into the high-temperature superconductor Bi$_{2}$Sr$_{2}$Ca$_{2}$CuO$_{8+δ}$. Our spectroscopic measurements reveal that Fe impurities introduce low-lying resonances in the density of states at Ω$_{1}$ $\approx$ 4 meV and Ω$_{2}$ $\approx$ 15 meV, allowing us to determine that, despite having a large magnetic moment, potential scattering of quasiparticles by Fe impurities dominates magnetic scattering. In addition, using high-resolution spatial characterizations of the local density of states near and away from Fe impurities, we detail the spatial extent of impurity affected regions as well as provide a local view of impurity-induced effects on the superconducting and pseudogap states. Our studies of Fe impurities, when combined with a reinterpretation of earlier STM work in the context of a two-gap scenario, allow us to present a unified view of the atomic-scale effects of elemental impurities on the pseudogap and superconducting states in hole-doped cuprates; this may help resolve a previously assumed dichotomy between the effects of magnetic and non-magnetic impurities in these materials.
Submitted 18 July, 2016;
originally announced July 2016.
-
Reducibility in Sasakian Geometry
Authors:
Charles P. Boyer,
Hongnian Huang,
Eveline Legendre,
Christina W. Tønnesen-Friedman
Abstract:
The purpose of this paper is to study reducibility properties in Sasakian geometry. First we give the Sasaki version of the de Rham Decomposition Theorem; however, we need a mild technical assumption on the Sasaki automorphism group which includes the toric case. Next we introduce the concept of {\it cone reducible} and consider $S^3$ bundles over a smooth projective algebraic variety where we give a classification result concerning contact structures admitting the action of a 2-torus of Reeb type. In particular, we can classify all such Sasakian structures up to contact isotopy on $S^3$ bundles over a Riemann surface of genus greater than zero. Finally, we show that in the toric case an extremal Sasaki metric on a Sasaki join always splits.
Submitted 12 February, 2018; v1 submitted 15 June, 2016;
originally announced June 2016.
-
Adapting to unknown noise level in sparse deconvolution
Authors:
Claire Boyer,
Yohann De Castro,
Joseph Salmon
Abstract:
In this paper, we study sparse spike deconvolution over the space of complex-valued measures when the input measure is a finite sum of Dirac masses. We introduce a modified version of the Beurling Lasso (BLasso), a semi-definite program that we refer to as the Concomitant Beurling Lasso (CBLasso). This new procedure estimates the target measure and the unknown noise level simultaneously. Contrary to previous estimators in the literature, theory holds for a tuning parameter that depends only on the sample size, so that it can be used for unknown noise level problems. Consistent noise level estimation is standardly proved. As for Radon measure estimation, theoretical guarantees match the previous state-of-the-art results in Super-Resolution regarding minimax prediction and localization. The proofs are based on a bound on the noise level given by a new tail estimate of the supremum of a stationary non-Gaussian process through the Rice method.
Submitted 19 October, 2016; v1 submitted 15 June, 2016;
originally announced June 2016.