Search | arXiv e-print repository

Sliced-Wasserstein Estimation with Spherical Harmonics as Control Variates

Authors: Rémi Leluc, Aymeric Dieuleveut, François Portier, Johan Segers, Aigerim Zhuman

Abstract: The Sliced-Wasserstein (SW) distance between probability measures is defined as the average of the Wasserstein distances resulting for the associated one-dimensional projections. As a consequence, the SW distance can be written as an integral with respect to the uniform measure on the sphere and the Monte Carlo framework can be employed for calculating the SW distance. Spherical harmonics are polynomials on the sphere that form an orthonormal basis of the set of square-integrable functions on the sphere. Putting these two facts together, a new Monte Carlo method, hereby referred to as Spherical Harmonics Control Variates (SHCV), is proposed for approximating the SW distance using spherical harmonics as control variates. The resulting approach is shown to have good theoretical properties, e.g., a no-error property for Gaussian measures under a certain form of linear dependency between the variables. Moreover, an improved rate of convergence, compared to Monte Carlo, is established for general measures. The convergence analysis relies on the Lipschitz property associated to the SW integrand. Several numerical experiments demonstrate the superior performance of SHCV against state-of-the-art methods for SW distance computation.

Submitted 15 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted to ICML 2024

MSC Class: 65C05 (Primary) 65D30; 68Txx; 68Wxx (Secondary)
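
As a rough illustration of the plain Monte Carlo baseline that the abstract above starts from (not the SHCV method itself), the sketch below averages one-dimensional Wasserstein distances over random directions drawn uniformly on the sphere. It assumes two empirical measures with the same number of equally weighted points; the function name `sliced_wasserstein_mc` and its parameters are hypothetical and do not come from the paper.

```python
import numpy as np

def sliced_wasserstein_mc(x, y, n_proj=500, p=2, seed=0):
    """Plain Monte Carlo estimate of the SW_p distance between two point clouds.

    Assumption: x and y have shape (n, d) with the same n and uniform weights,
    so the 1D Wasserstein distance reduces to comparing sorted projections.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Normalized Gaussian vectors are uniformly distributed on the unit sphere.
    theta = rng.standard_normal((n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both samples on every direction and sort each column.
    proj_x = np.sort(x @ theta.T, axis=0)
    proj_y = np.sort(y @ theta.T, axis=0)
    # W_p^p of each pair of 1D empirical measures, one value per direction.
    w_pp = np.mean(np.abs(proj_x - proj_y) ** p, axis=0)
    # Average over directions, then take the p-th root.
    return float(np.mean(w_pp) ** (1.0 / p))
```

A control-variate method such as the one described in the abstract would replace the final plain average over directions with a variance-reduced one; that step is deliberately left out here.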

arXiv:2312.09969 [pdf, other]

Nearest Neighbor Sampling for Covariate Shift Adaptation

Authors: François Portier, Lionel Truquet, Ikko Yamane

Abstract: Many existing covariate shift adaptation methods estimate sample weights given to loss values to mitigate the gap between the source and the target distribution. However, estimating the optimal weights typically involves computationally expensive matrix inversion and hyper-parameter tuning. In this paper, we propose a new covariate shift adaptation method which avoids estimating the weights. The basic idea is to directly work on unlabeled target data, labeled according to the $k$-nearest neighbors in the source dataset. Our analysis reveals that setting $k = 1$ is an optimal choice. This property removes the necessity of tuning the only hyper-parameter $k$ and leads to a running time quasi-linear in the sample size. Our results include sharp rates of convergence for our estimator, with a tight control of the mean square error and explicit constants. In particular, the variance of our estimators has the same rate of convergence as for standard parametric estimation despite their non-parametric nature. The proposed estimator shares similarities with some matching-based treatment effect estimators used, e.g., in biostatistics, econometrics, and epidemiology. Our experiments show that it achieves drastic reduction in the running time with remarkable accuracy.

Submitted 28 June, 2024; v1 submitted 15 December, 2023; originally announced December 2023.
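
The labeling idea in the abstract above, with $k = 1$, might be sketched as follows. This is only an illustration under stated assumptions: the helper names `one_nn_transfer_labels` and `target_risk_estimate` are hypothetical, the risk estimate is one plausible reading of the abstract rather than the authors' exact estimator, and the brute-force distance computation is quadratic in the sample size, whereas the quasi-linear running time mentioned in the abstract would need a tree-based neighbor search.

```python
import numpy as np

def one_nn_transfer_labels(x_source, y_source, x_target):
    """Give each unlabeled target point the response of its nearest source point."""
    # Pairwise squared Euclidean distances, shape (n_target, n_source).
    diffs = x_target[:, None, :] - x_source[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=2)
    nearest = np.argmin(sq_dist, axis=1)
    return y_source[nearest]

def target_risk_estimate(loss, predict, x_source, y_source, x_target):
    """Average loss on target inputs, using 1-NN labels borrowed from the source."""
    y_borrowed = one_nn_transfer_labels(x_source, y_source, x_target)
    return float(np.mean(loss(predict(x_target), y_borrowed)))
```

No importance weights appear anywhere in this sketch, which is the point the abstract makes: the target sample is used directly, with the source sample serving only as a label lookup.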

arXiv:2310.14826 [pdf, other]

Sharp error bounds for imbalanced classification: how many examples in the minority class?

Authors: Anass Aghbalou, François Portier, Anne Sabourin

Abstract: When dealing with imbalanced classification data, reweighting the loss function is a standard procedure allowing to equilibrate between the true positive and true negative rates within the risk measure. Despite significant theoretical work in this area, existing results do not adequately address a main challenge within the imbalanced classification framework, which is the negligible size of one class in relation to the full sample size and the need to rescale the risk function by a probability tending to zero. To address this gap, we present two novel contributions in the setting where the rare class probability approaches zero: (1) a non asymptotic fast rate probability bound for constrained balanced empirical risk minimization, and (2) a consistent upper bound for balanced nearest neighbors estimates. Our findings provide a clearer understanding of the benefits of class-weighting in realistic settings, opening new avenues for further research in this field.

Submitted 16 April, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

arXiv:2305.06151 [pdf, other]

Speeding up Monte Carlo Integration: Control Neighbors for Optimal Convergence

Authors: Rémi Leluc, François Portier, Johan Segers, Aigerim Zhuman

Abstract: A novel linear integration rule called $\textit{control neighbors}$ is proposed in which nearest neighbor estimates act as control variates to speed up the convergence rate of the Monte Carlo procedure on metric spaces. The main result is the $\mathcal{O}(n^{-1/2} n^{-s/d})$ convergence rate -- where $n$ stands for the number of evaluations of the integrand and $d$ for the dimension of the domain -- of this estimate for Hölder functions with regularity $s \in (0,1]$, a rate which, in some sense, is optimal. Several numerical experiments validate the complexity bound and highlight the good performance of the proposed estimator.

Submitted 4 April, 2024; v1 submitted 10 May, 2023; originally announced May 2023.

Comments: Accepted to Bernoulli (2024)

arXiv:2205.11890 [pdf, other]

A Quadrature Rule combining Control Variates and Adaptive Importance Sampling

Authors: Rémi Leluc, François Portier, Johan Segers, Aigerim Zhuman

Abstract: Driven by several successful applications such as in stochastic gradient descent or in Bayesian computation, control variates have become a major tool for Monte Carlo integration. However, standard methods do not allow the distribution of the particles to evolve during the algorithm, as is the case in sequential simulation methods. Within the standard adaptive importance sampling framework, a simple weighted least squares approach is proposed to improve the procedure with control variates. The procedure takes the form of a quadrature rule with adapted quadrature weights to reflect the information brought in by the control variates. The quadrature points and weights do not depend on the integrand, a computational advantage in case of multiple integrands. Moreover, the target density needs to be known only up to a multiplicative constant. Our main result is a non-asymptotic bound on the probabilistic error of the procedure. The bound proves that for improving the estimate's accuracy, the benefits from adaptive importance sampling and control variates can be combined. The good behavior of the method is illustrated empirically on synthetic examples and real-world data for Bayesian linear regression.

Submitted 5 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

Journal ref: Advances in Neural Information Processing Systems (NeurIPS), 2022
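
To make the least-squares view of control variates in the abstract above concrete, here is a minimal sketch of the unweighted, non-adaptive case only: the integral estimate is the intercept of an ordinary least squares fit of the integrand on control functions with known zero mean. The adaptive importance sampling weights of the paper are not included, and the function name `control_variate_ols`, the integrand, and the two polynomial controls are all illustrative assumptions.

```python
import numpy as np

def control_variate_ols(f_vals, h_vals):
    """Control-variate estimate of E[f(X)] as the OLS intercept.

    f_vals: shape (n,), integrand evaluated at the sample points.
    h_vals: shape (n, m), control functions with known mean zero under the
            sampling distribution, evaluated at the same points.
    """
    n = f_vals.shape[0]
    design = np.column_stack([np.ones(n), h_vals])
    coef, *_ = np.linalg.lstsq(design, f_vals, rcond=None)
    return float(coef[0])

# Example: integrate exp(x) over [0, 1] with X uniform on [0, 1].
rng = np.random.default_rng(0)
x = rng.uniform(size=2000)
f_vals = np.exp(x)
# x - 1/2 and x^2 - 1/3 both integrate to zero on [0, 1].
h_vals = np.column_stack([x - 0.5, x ** 2 - 1.0 / 3.0])
estimate = control_variate_ols(f_vals, h_vals)
naive = float(np.mean(f_vals))
```

Because the least-squares solution is linear in `f_vals`, the estimate is a weighted sum of integrand evaluations whose weights depend only on the sample points and the controls. That is the sense in which such a procedure behaves like a quadrature rule that can be reused across several integrands.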

arXiv:2204.05792 [pdf, ps, other]

High-dimensional nonconvex lasso-type $M$-estimators

Authors: Jad Beyhum, François Portier

Abstract: This paper proposes a theory for $\ell_1$-norm penalized high-dimensional $M$-estimators, with nonconvex risk and unrestricted domain. Under high-level conditions, the estimators are shown to attain the rate of convergence $s_0\sqrt{\log(nd)/n}$, where $s_0$ is the number of nonzero coefficients of the parameter of interest. Sufficient conditions for our main assumptions are then developed and finally used in several examples including robust linear regression, binary classification and nonlinear least squares.

Submitted 13 April, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

arXiv:2202.10211 [pdf, ps, other]

On the bias of K-fold cross validation with stable learners

Authors: Anass Aghbalou, François Portier, Anne Sabourin

Abstract: This paper investigates the efficiency of the K-fold cross-validation (CV) procedure and a debiased version thereof as a means of estimating the generalization risk of a learning algorithm. We work under the general assumption of uniform algorithmic stability. We show that the K-fold risk estimate may not be consistent under such general stability assumptions, by constructing non vanishing lower bounds on the error in realistic contexts such as regularized empirical risk minimisation and stochastic gradient descent. We thus advocate the use of a debiased version of the K-fold and prove an error bound with exponential tail decay regarding this version. Our result is applicable to the large class of uniformly stable algorithms, contrarily to earlier works focusing on specific tasks such as density estimation. We illustrate the relevance of the debiased K-fold CV on a simple model selection problem and demonstrate empirically the usefulness of the promoted approach on real world classification and regression datasets.

Submitted 11 June, 2023; v1 submitted 21 February, 2022; originally announced February 2022.

arXiv:2202.00488 [pdf, other]

Cross-validation for Extreme Value Analysis

Authors: Anass Aghbalou, Patrice Bertail, François Portier, Anne Sabourin

Abstract: We conduct a non asymptotic study of the Cross Validation (CV) estimate of the generalization risk for learning algorithms dedicated to extreme regions of the covariates space. In this Extreme Value Analysis context, the risk function measures the algorithm's error given that the norm of the input exceeds a high quantile. The main challenge within this framework is the negligible size of the extreme training sample with respect to the full sample size and the necessity to re-scale the risk function by a probability tending to zero. We open the road to a finite sample understanding of CV for extreme values by establishing two new results: an exponential probability bound on the K-fold CV error and a polynomial probability bound on the leave-p-out CV. Our bounds are sharp in the sense that they match state-of-the-art guarantees for standard CV estimates while extending them to encompass a conditioning event of small probability. We illustrate the significance of our results regarding high dimensional classification in extreme regions via a Lasso-type logistic regression algorithm. The tightness of our bounds is investigated in numerical experiments.

Submitted 11 June, 2023; v1 submitted 1 February, 2022; originally announced February 2022.

arXiv:2111.09604 [pdf, other]

doi 10.1103/PhysRevX.12.021006

Emission of photon multiplets by a dc-biased superconducting circuit

Authors: G. C. Ménard, A. Peugeot, C. Padurariu, C. Rolland, B. Kubala, Y. Mukharsky, Z. Iftikhar, C. Altimiras, P. Roche, H. le Sueur, P. Joyez, D. Vion, D. Esteve, J. Ankerhold, F. Portier

Abstract: We observe the emission of bunches of $k \geqslant 1$ photons by a circuit made of a microwave resonator in series with a voltage-biased tunable Josephson junction. The bunches are emitted at specific values $V_k$ of the bias voltage, for which each Cooper pair tunneling across the junction creates exactly $k$ photons in the resonator. The latter is a micro-fabricated spiral coil which resonates and leaks photons at 4.4 GHz in a measurement line. Its characteristic impedance of 1.97 kΩ is high enough to reach a strong junction-resonator coupling and a bright emission of the $k$-photon bunches. We show that a RWA treatment of the system accounts quantitatively for the observed radiation intensity, from $k=1$ to $6$, and over three orders of magnitude when varying the Josephson energy $E_J$. We also measure the second order correlation function of the radiated microwave to determine its Fano factor $F_k$, which in the low $E_J$ limit, confirms with $F_k=k$ the emission of $k$ photon bunches. At larger $E_J$, a more complex behavior is observed in quantitative agreement with numerical simulations.

Submitted 8 March, 2022; v1 submitted 18 November, 2021; originally announced November 2021.

Journal ref: Phys. Rev. X 12, 021006 (2022)

arXiv:2110.15590 [pdf, other]

Adaptive Importance Sampling meets Mirror Descent: a Bias-variance tradeoff

Authors: Anna Korba, François Portier

Abstract: Adaptive importance sampling is a widely spread Monte Carlo technique that uses a re-weighting strategy to iteratively estimate the so-called target distribution. A major drawback of adaptive importance sampling is the large variance of the weights which is known to badly impact the accuracy of the estimates. This paper investigates a regularization strategy whose basic principle is to raise the importance weights at a certain power. This regularization parameter, that might evolve between zero and one during the algorithm, is shown (i) to balance between the bias and the variance and (ii) to be connected to the mirror descent framework. Using a kernel density estimate to build the sampling policy, the uniform convergence is established under mild conditions. Finally, several practical ways to choose the regularization parameter are discussed and the benefits of the proposed approach are illustrated empirically.

Submitted 29 October, 2021; originally announced October 2021.

Comments: 35 pages, 5 figures

MSC Class: 62L20

arXiv:2110.15083 [pdf, ps, other]

Nearest neighbor empirical processes

Authors: François Portier

Abstract: In the regression framework, the empirical measure based on the responses resulting from the nearest neighbors, among the covariates, to a given point $x$ is introduced and studied as a central statistical quantity. First, the associated empirical process is shown to satisfy a uniform central limit theorem under a local bracketing entropy condition on the underlying class of functions reflecting the localizing nature of the nearest neighbor algorithm. Second a uniform non-asymptotic bound is established under a well-known condition, often referred to as Vapnik-Chervonenkis, on the uniform entropy numbers. The covariance of the Gaussian limit obtained in the uniform central limit theorem is simply equal to the conditional covariance operator given the covariate value. This suggests the possibility of using standard formulas to estimate the variance by using only the nearest neighbors instead of the full data. This is illustrated on two problems: the estimation of the conditional cumulative distribution function and local linear regression.

Submitted 10 April, 2024; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: 34 pages

MSC Class: 62G05

arXiv:2108.01432 [pdf, other]

Tail inverse regression for dimension reduction with extreme response

Authors: Anass Aghbalou, François Portier, Anne Sabourin, Chen Zhou

Abstract: We consider the problem of supervised dimension reduction with a particular focus on extreme values of the target $Y\in\mathbb{R}$ to be explained by a covariate vector $X \in \mathbb{R}^p$. The general purpose is to define and estimate a projection on a lower dimensional subspace of the covariate space which is sufficient for predicting exceedances of the target above high thresholds. We propose an original definition of Tail Conditional Independence which matches this purpose. Inspired by Sliced Inverse Regression (SIR) methods, we develop a novel framework (TIREX, Tail Inverse Regression for EXtreme response) in order to estimate an extreme sufficient dimension reduction (SDR) space of potentially smaller dimension than that of a classical SDR space. We prove the weak convergence of tail empirical processes involved in the estimation procedure and we illustrate the relevance of the proposed approach on simulated and real world data.

Submitted 24 February, 2023; v1 submitted 30 July, 2021; originally announced August 2021.

Comments: main paper: 31 pages + supplementary material: 16 pages

MSC Class: 62G32; 62H25; 62G08; 62G30

arXiv:2107.12825 [pdf]

Individual Survival Curves with Conditional Normalizing Flows

Authors: Guillaume Ausset, Tom Ciffreo, Francois Portier, Stephan Clémençon, Timothée Papin

Abstract: Survival analysis, or time-to-event modelling, is a classical statistical problem that has garnered a lot of interest for its practical use in epidemiology, demographics or actuarial sciences. Recent advances on the subject from the point of view of machine learning have been concerned with precise per-individual predictions instead of population studies, driven by the rise of individualized medicine. We introduce here a conditional normalizing flow based estimate of the time-to-event density as a way to model highly flexible and individualized conditional survival distributions. We use a novel hierarchical formulation of normalizing flows to enable efficient fitting of flexible conditional distributions without overfitting and show how the normalizing flow formulation can be efficiently adapted to the censored setting. We experimentally validate the proposed approach on a synthetic dataset as well as four open medical datasets and an example of a common financial problem.

Submitted 27 July, 2021; originally announced July 2021.

Comments: IEEE DSAA '21

arXiv:2105.11818 [pdf, other]

SGD with Coordinate Sampling: Theory and Practice

Authors: Rémi Leluc, François Portier

Abstract: While classical forms of stochastic gradient descent algorithm treat the different coordinates in the same way, a framework allowing for adaptive (non uniform) coordinate sampling is developed to leverage structure in data. In a non-convex setting and including zeroth order gradient estimate, almost sure convergence as well as non-asymptotic bounds are established. Within the proposed framework, we develop an algorithm, MUSKETEER, based on a reinforcement strategy: after collecting information on the noisy gradients, it samples the most promising coordinate (all for one); then it moves along the one direction yielding an important decrease of the objective (one for all). Numerical experiments on both synthetic and real data examples confirm the effectiveness of MUSKETEER in large scale problems.

Submitted 15 October, 2022; v1 submitted 25 May, 2021; originally announced May 2021.

Comments: Journal of Machine Learning Research 2022

arXiv:2010.03376 [pdf, other]

doi 10.1103/PhysRevX.11.031008

Generating two continuous entangled microwave beams using a dc-biased Josephson junction

Authors: A. Peugeot, G. Ménard, S. Dambach, M. Westig, B. Kubala, Y. Mukharsky, C. Altimiras, P. Joyez, D. Vion, P. Roche, D. Esteve, P. Milman, J. Leppäkangas, G. Johansson, M. Hofheinz, J. Ankerhold, F. Portier

Abstract: We show experimentally that a dc-biased Josephson junction in series with two microwave resonators emits entangled beams of microwaves leaking out of the resonators. In the absence of a stationary phase reference for characterizing the entanglement of the outgoing beams, we measure second-order coherence functions for proving entanglement up to an emission rate of 2.5 billion photon pairs per second. The experimental results are found in quantitative agreement with theory, proving that the low frequency noise of the dc bias is the main limitation for the coherence time of the entangled beams. This agreement allows us to evaluate the entropy of entanglement of the resonators, and to identify the improvements that could bring this device closer to a useful bright source of entangled microwaves for quantum-technological applications.

Submitted 7 October, 2020; originally announced October 2020.

Journal ref: Phys. Rev. X 11, 031008 (2021)

arXiv:2006.15043 [pdf, other]

Nearest Neighbour Based Estimates of Gradients: Sharp Nonasymptotic Bounds and Applications

Authors: Guillaume Ausset, Stephan Clémençon, François Portier

Abstract: Motivated by a wide variety of applications, ranging from stochastic optimization to dimension reduction through variable selection, the problem of estimating gradients accurately is of crucial importance in statistics and learning theory. We consider here the classic regression setup, where a real valued square integrable r.v. $Y$ is to be predicted upon observing a (possibly high dimensional) random vector $X$ by means of a predictive function $f(X)$ as accurately as possible in the mean-squared sense and study a nearest-neighbour-based pointwise estimate of the gradient of the optimal predictive function, the regression function $m(x)=\mathbb{E}[Y\mid X=x]$. Under classic smoothness conditions combined with the assumption that the tails of $Y-m(X)$ are sub-Gaussian, we prove nonasymptotic bounds improving upon those obtained for alternative estimation methods. Beyond the novel theoretical results established, several illustrative numerical experiments have been carried out. The latter provide strong empirical evidence that the estimation method proposed works very well for various statistical problems involving gradient estimation, namely dimensionality reduction, stochastic gradient descent optimization and quantifying disentanglement.

Submitted 26 June, 2020; originally announced June 2020.

arXiv:2006.12839 [pdf, other]

Conditional independence testing via weighted partial copulas and nearest neighbors

Authors: Pascal Bianchi, Kevin Elgui, François Portier

Abstract: This paper introduces the \textit{weighted partial copula} function for testing conditional independence. The proposed test procedure results from these two ingredients: (i) the test statistic is an explicit Cramer-von Mises transformation of the \textit{weighted partial copula}, (ii) the regions of rejection are computed using a bootstrap procedure which mimics conditional independence by generating samples from the product measure of the estimated conditional marginals. Under conditional independence, the weak convergence of the \textit{weighted partial copula} process is established when the marginals are estimated using a smoothed local linear estimator. Finally, an experimental section demonstrates that the proposed test has competitive power compared to recent state-of-the-art methods such as kernel-based test.

Submitted 12 February, 2021; v1 submitted 23 June, 2020; originally announced June 2020.

arXiv:2006.09223 [pdf, ps, other]

Risk bounds when learning infinitely many response functions by ordinary linear regression

Authors: Vincent Plassier, François Portier, Johan Segers

Abstract: Consider the problem of learning a large number of response functions simultaneously based on the same input variables. The training data consist of a single independent random sample of the input variables drawn from a common distribution together with the associated responses. The input variables are mapped into a high-dimensional linear space, called the feature space, and the response functions are modelled as linear functionals of the mapped features, with coefficients calibrated via ordinary least squares. We provide convergence guarantees on the worst-case excess prediction risk by controlling the convergence rate of the excess risk uniformly in the response function. The dimension of the feature map is allowed to tend to infinity with the sample size. The collection of response functions, although potentially infinite, is supposed to have a finite Vapnik-Chervonenkis dimension. The bound derived can be applied when building multiple surrogate models in a reasonable computing time.

Submitted 27 November, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

Comments: 27 pages

arXiv:2006.02745 [pdf, other]

Asymptotic Analysis of Conditioned Stochastic Gradient Descent

Authors: Rémi Leluc, François Portier

Abstract: In this paper, we investigate a general class of stochastic gradient descent (SGD) algorithms, called Conditioned SGD, based on a preconditioning of the gradient direction. Using a discrete-time approach with martingale tools, we establish under mild assumptions the weak convergence of the rescaled sequence of iterates for a broad class of conditioning matrices including stochastic first-order and second-order methods. Almost sure convergence results, which may be of independent interest, are also presented. Interestingly, the asymptotic normality result consists in a stochastic equicontinuity property so when the conditioning matrix is an estimate of the inverse Hessian, the algorithm is asymptotically optimal.

Submitted 15 October, 2023; v1 submitted 4 June, 2020; originally announced June 2020.

Comments: Accepted to Transactions on Machine Learning Research 2023

MSC Class: 62L20 (Primary) 60F05; 60G46; 68W40 (Secondary)

Journal ref: Transactions on Machine Learning Research (2023)

arXiv:2005.10618 [pdf, other]

doi 10.1214/20-AOS2035

Infinite-dimensional gradient-based descent for alpha-divergence minimisation

Authors: Kamélia Daudel, Randal Douc, François Portier

Abstract: This paper introduces the $(α, Γ)$-descent, an iterative algorithm which operates on measures and performs $α$-divergence minimisation in a Bayesian framework. This gradient-based procedure extends the commonly-used variational approximation by adding a prior on the variational parameters in the form of a measure. We prove that for a rich family of functions $Γ$, this algorithm leads at each step to a systematic decrease in the $α$-divergence and derive convergence results. Our framework recovers the Entropic Mirror Descent algorithm and provides an alternative algorithm that we call the Power Descent. Moreover, in its stochastic formulation, the $(α, Γ)$-descent makes it possible to optimise the mixture weights of any given mixture model without any information on the underlying distribution of the variational parameters. This renders our method compatible with many choices of parameter updates and applicable to a wide range of Machine Learning tasks. We demonstrate empirically on both toy and real-world examples the benefit of using the Power Descent and going beyond the Entropic Mirror Descent framework, which fails as the dimension grows.

Submitted 15 October, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

Journal ref: Ann. Statist. 49(4): 2250-2270 (August 2021)

arXiv:1910.11095 [pdf, other]

High dimensional regression for regenerative time-series: an application to road traffic modeling

Authors: Mohammed Bouchouia, François Portier

Abstract: A statistical predictive model in which a high-dimensional time-series regenerates at the end of each day is used to model road traffic. Due to the regeneration, prediction is based on a daily modeling using a vector autoregressive model that combines linearly the past observations of the day. Due to the high dimension, the learning algorithm follows from an L1-penalization of the regression coefficients. Excess risk bounds are established under the high-dimensional framework in which the number of road sections goes to infinity with the number of observed days. Considering floating car data observed in an urban area, the approach is compared to state-of-the-art methods including neural networks. In addition to being highly competitive in terms of prediction, it enables the identification of the most determinant sections of the road network.

Submitted 26 January, 2021; v1 submitted 24 October, 2019; originally announced October 2019.

MSC Class: Primary 62J05; 62J07; secondary 62P30

arXiv:1909.12239

The $f$-Divergence Expectation Iteration Scheme

Authors: Kamélia Daudel, Randal Douc, François Portier, François Roueff

Abstract: This paper introduces the $f$-EI$(φ)$ algorithm, a novel iterative algorithm which operates on measures and performs $f$-divergence minimisation in a Bayesian framework. We prove that for a rich family of values of $(f,φ)$ this algorithm leads at each step to a systematic decrease in the $f$-divergence and show that we achieve an optimum. In the particular case where we consider a weighted sum of Dirac measures and the $α$-divergence, we obtain that the calculations involved in the $f$-EI$(φ)$ algorithm simplify to gradient-based computations. Empirical results support the claim that the $f$-EI$(φ)$ algorithm serves as a powerful tool to assist Variational methods.

Submitted 15 March, 2021; v1 submitted 26 September, 2019; originally announced September 2019.

Comments: This content ended up being split into the papers arXiv:2005.10618 and arXiv:2103.05684, which correspond to two separate and more in-depth approaches

arXiv:1906.10920 [pdf, other]

Control variate selection for Monte Carlo integration

Authors: Rémi Leluc, François Portier, Johan Segers

Abstract: Monte Carlo integration with variance reduction by means of control variates can be implemented by the ordinary least squares estimator for the intercept in a multiple linear regression model with the integrand as response and the control variates as covariates. Even without special knowledge on the integrand, significant efficiency gains can be obtained if the control variate space is sufficiently large. Incorporating a large number of control variates in the ordinary least squares procedure may however result in (i) a certain instability of the ordinary least squares estimator and (ii) a possibly prohibitive computation time. Regularizing the ordinary least squares estimator by preselecting appropriate control variates via the Lasso turns out to increase the accuracy without additional computational cost. The findings in the numerical experiment are confirmed by concentration inequalities for the integration error.

Submitted 1 April, 2021; v1 submitted 26 June, 2019; originally announced June 2019.

Comments: Accepted to Statistics and Computing
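
The regression view of control variates described in this abstract can be illustrated with a minimal sketch. It assumes a single control variate with known mean zero and uses only the Python standard library; the function name `mc_and_cv_estimates`, the integrand and the control variate are illustrative choices, not taken from the paper, and the Lasso preselection step is not shown.

```python
import math
import random


def mc_and_cv_estimates(f, h, xs):
    """Return (plain Monte Carlo estimate, control-variate estimate).

    h is assumed to have mean zero under the sampling distribution.
    The control-variate estimate is the OLS intercept obtained when
    regressing f(x) on h(x) over the sample xs.
    """
    n = len(xs)
    fx = [f(x) for x in xs]
    hx = [h(x) for x in xs]
    f_bar = sum(fx) / n
    h_bar = sum(hx) / n
    # Slope of the simple linear regression of f on h.
    s_hh = sum((b - h_bar) ** 2 for b in hx)
    s_fh = sum((a - f_bar) * (b - h_bar) for a, b in zip(fx, hx))
    beta = s_fh / s_hh
    # OLS intercept: sample mean of f corrected by the fitted control variate.
    return f_bar, f_bar - beta * h_bar


# Illustrative setup: integrate exp(x) over [0, 1] with uniform draws,
# using h(x) = x - 0.5 as the (mean-zero) control variate.
rng = random.Random(0)
xs = [rng.random() for _ in range(2000)]
mc, cv = mc_and_cv_estimates(math.exp, lambda x: x - 0.5, xs)
exact = math.e - 1.0

# An integrand lying in the span of the constant and the control variate
# is integrated without sampling error, up to floating-point rounding.
_, cv_linear = mc_and_cv_estimates(lambda x: 2.0 * x + 1.0, lambda x: x - 0.5, xs)
```

With several control variates the same intercept would come from a multiple regression; the single-covariate case is used here only because it has a closed form.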

arXiv:1906.01908 [pdf, other]

Empirical Risk Minimization under Random Censorship: Theory and Practice

Authors: Guillaume Ausset, Stéphan Clémençon, François Portier

Abstract: We consider the classic supervised learning problem, where a continuous non-negative random label $Y$ (i.e. a random duration) is to be predicted based upon observing a random vector $X$ valued in $\mathbb{R}^d$ with $d\geq 1$ by means of a regression rule with minimum least square error. In various applications, ranging from industrial quality control to public health through credit risk analysis for instance, training observations can be right censored, meaning that, rather than on independent copies of $(X,Y)$, statistical learning relies on a collection of $n\geq 1$ independent realizations of the triplet $(X, \; \min\{Y,\; C\},\; δ)$, where $C$ is a nonnegative r.v. with unknown distribution, modeling censorship and $δ=\mathbb{I}\{Y\leq C\}$ indicates whether the duration is right censored or not. As ignoring censorship in the risk computation may clearly lead to a severe underestimation of the target duration and jeopardize prediction, we propose to consider a plug-in estimate of the true risk based on a Kaplan-Meier estimator of the conditional survival function of the censorship $C$ given $X$, referred to as Kaplan-Meier risk, in order to perform empirical risk minimization. It is established, under mild conditions, that the learning rate of minimizers of this biased/weighted empirical risk functional is of order $O_{\mathbb{P}}(\sqrt{\log(n)/n})$ when ignoring model bias issues inherent to plug-in estimation, as can be attained in absence of censorship. Beyond theoretical results, numerical experiments are presented in order to illustrate the relevance of the approach developed.

Submitted 5 June, 2019; originally announced June 2019.

Comments: Submitted to JMLR. 18 pages + Appendix

arXiv:1905.01161 [pdf, other]

doi 10.1103/PhysRevX.10.021003

Absence of a dissipative quantum phase transition in Josephson junctions

Authors: Anil Murani, Nicolas Bourlet, Hélène le Sueur, Fabien Portier, Carles Altimiras, Daniel Esteve, Hermann Grabert, Jürgen Stockburger, Joachim Ankerhold, Philippe Joyez

Abstract: Half a century after its discovery, the Josephson junction has become the most important nonlinear quantum electronic component at our disposal. It has helped reshape the SI system around quantum effects and is used in scores of quantum devices. By itself, the use of Josephson junctions in the volt metrology seems to imply an exquisite understanding of the component in every aspect. Yet, surprisingly, there have been long-standing subtle issues regarding the modeling of the interaction of a junction with its electromagnetic environment. Here, we find that a Josephson junction connected to a resistor does not become insulating beyond a given value of the resistance due to a dissipative quantum phase transition, as is commonly believed. Our work clarifies how this key quantum component behaves in the presence of a dissipative environment and provides a comprehensive and consistent picture, notably regarding the treatment of its phase.

Submitted 4 April, 2020; v1 submitted 3 May, 2019; originally announced May 2019.

Journal ref: Phys. Rev. X 10, 021003 (2020)

arXiv:1903.08507 [pdf, other]

Safe and adaptive importance sampling: a mixture approach

Authors: Bernard Delyon, François Portier

Abstract: This paper investigates adaptive importance sampling algorithms for which the policy, the sequence of distributions used to generate the particles, is a mixture distribution between a flexible kernel density estimate (based on the previous particles), and a "safe" heavy-tailed density. When the share of samples generated according to the safe density goes to zero but not too quickly, two results are established: (i) uniform convergence rates are derived for the policy toward the target density; (ii) a central limit theorem is obtained for the resulting integral estimates. The fact that the asymptotic variance is the same as the variance of an "oracle" procedure with variance-optimal policy, illustrates the benefits of the approach. In addition, a subsampling step (among the particles) can be conducted before constructing the kernel estimate in order to decrease the computational effort without altering the performance of the method. The practical behavior of the algorithms is illustrated in a simulation study.

Submitted 20 March, 2020; v1 submitted 20 March, 2019; originally announced March 2019.

Comments: 35 pages, 4 figures

MSC Class: 62G05; 62G07; 65C05
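
The role of the "safe" heavy-tailed component in this abstract can be sketched for a single, non-adaptive stage. The example below is a simplification under stated assumptions: the kernel density estimate is replaced by a fixed Gaussian, the safe density is taken to be a standard Cauchy, and all names (`safe_mixture_estimate`, `safe_share`, the integrand `phi`) are hypothetical rather than drawn from the paper.

```python
import math
import random


def normal_pdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def cauchy_pdf(x):
    return 1.0 / (math.pi * (1.0 + x * x))


def safe_mixture_estimate(phi, n, safe_share, mu, sigma, rng):
    """Importance-sampling estimate of the integral of phi over the real line.

    Proposal density (an assumption for this sketch):
        (1 - safe_share) * N(mu, sigma^2) + safe_share * Cauchy(0, 1).
    """
    total = 0.0
    for _ in range(n):
        if rng.random() < safe_share:
            # Inverse-CDF draw from the standard Cauchy ("safe") component.
            x = math.tan(math.pi * (rng.random() - 0.5))
        else:
            x = rng.gauss(mu, sigma)
        q = (1.0 - safe_share) * normal_pdf(x, mu, sigma) + safe_share * cauchy_pdf(x)
        total += phi(x) / q
    return total / n


def phi(x):
    # x^2 times the standard normal density; its integral over the real line is 1.
    return x * x * normal_pdf(x)


rng = random.Random(1)
estimate = safe_mixture_estimate(phi, 20000, safe_share=0.2, mu=0.0, sigma=0.5, rng=rng)
```

The Gaussian component here is deliberately too narrow for the integrand; the Cauchy share keeps the importance weights bounded, which is the intuition behind mixing in a safe density. In the adaptive setting of the abstract, the Gaussian would be re-estimated from previous particles and `safe_share` would shrink across stages.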

arXiv:1903.05919 [pdf, other]

doi 10.1038/s41467-020-16331-4

Relaxation and revival of quasiparticles injected in an interacting quantum Hall liquid

Authors: R. H. Rodriguez, F. D. Parmentier, D. Ferraro, P. Roulleau, U. Gennser, A. Cavanna, M. Sassetti, F. Portier, D. Mailly, P. Roche

Abstract: The one-dimensional, chiral edge channels of the quantum Hall effect are a promising platform in which to implement electron quantum optics experiments; however, Coulomb interactions between edge channels are a major source of decoherence and energy relaxation. It is therefore of large interest to understand the range and limitations of the simple quantum electron optics picture. Here we confirm experimentally for the first time the predicted relaxation and revival of electrons injected at finite energy into an edge channel. The observed decay of the injected electrons is reproduced theoretically within a Tomonaga-Luttinger liquid framework, including an important dissipation towards external degrees of freedom. This gives us a quantitative empirical understanding of the strength of the interaction and the dissipation.

Submitted 16 May, 2020; v1 submitted 14 March, 2019; originally announced March 2019.

Comments: Includes supplementary information

Journal ref: Nature Communications volume 11, Article number: 2426 (2020)

arXiv:1810.06217 [pdf, other]

doi 10.1103/PhysRevLett.122.186804

Antibunched photons emitted by a dc biased Josephson junction

Authors: C. Rolland, A. Peugeot, S. Dambach, M. Westig, B. Kubala, Y. Mukharsky, C. Altimiras, H. le Sueur, P. Joyez, D. Vion, P. Roche, D. Esteve, J. Ankerhold, F. Portier

Abstract: We show experimentally that a dc biased Josephson junction in series with a high-enough impedance microwave resonator emits antibunched photons. Our resonator is made of a simple micro-fabricated spiral coil that resonates at 4.4 GHz and reaches a 1.97 k$Ω$ characteristic impedance. The second order correlation function of the power leaking out of the resonator drops down to 0.3 at zero delay, which demonstrates the antibunching of the photons emitted by the circuit at a rate of $6 \times 10^7$ photons per second. Results are found in quantitative agreement with our theoretical predictions. This simple scheme could offer an efficient and bright single-photon source in the microwave domain.

Submitted 13 May, 2019; v1 submitted 15 October, 2018; originally announced October 2018.

Journal ref: Phys. Rev. Lett. 122, 186804 (2019)

arXiv:1806.05830 [pdf, other]

Parametric versus nonparametric: the fitness coefficient

Authors: Gildas Mazo, François Portier

Abstract: The fitness coefficient, introduced in this paper, results from a competition between parametric and nonparametric density estimators within the likelihood of the data. As illustrated on several real datasets, the fitness coefficient generally agrees with p-values but is easier to compute and interpret. Namely, the fitness coefficient can be interpreted as the proportion of data coming from the parametric model. Moreover, the fitness coefficient can be used to build a semiparametric compromise which improves inference over the parametric and nonparametric approaches. From a theoretical perspective, the fitness coefficient is shown to converge in probability to one if the model is true and to zero if the model is false. From a practical perspective, the utility of the fitness coefficient is illustrated on real and simulated datasets.

Submitted 15 June, 2018; originally announced June 2018.

arXiv:1806.02107 [pdf, ps, other]

Rademacher complexity for Markov chains : Applications to kernel smoothing and Metropolis-Hasting

Authors: Patrice Bertail, François Portier

Abstract: Following the seminal approach by Talagrand, the concept of Rademacher complexity for independent sequences of random variables is extended to Markov chains. The proposed notion of "block Rademacher complexity" (of a class of functions) follows from renewal theory and makes it possible to control the expected values of suprema (over the class of functions) of empirical processes based on Harris Markov chains as well as the excess probability. For classes of Vapnik-Chervonenkis type, bounds on the "block Rademacher complexity" are established. These bounds depend essentially on the sample size and the probability tails of the regeneration times. The proposed approach is employed to obtain convergence rates for the kernel density estimator of the stationary measure and to derive concentration inequalities for the Metropolis-Hastings algorithm.

Submitted 6 July, 2018; v1 submitted 6 June, 2018; originally announced June 2018.

Comments: 22 pages

arXiv:1806.01082 [pdf, other]

On an extension of the promotion time cure model

Authors: François Portier, Ingrid Van Keilegom, Anouar El Ghouch

Abstract: We consider the problem of estimating the distribution of time-to-event data that are subject to censoring and for which the event of interest might never occur, i.e., some subjects are cured. To model this kind of data in the presence of covariates, one of the leading semiparametric models is the promotion time cure model \citep{yakovlev1996}, which adapts the Cox model to the presence of cured subjects. Estimating the conditional distribution results in a complicated constrained optimization problem, and inference is difficult as no closed formula for the variance is available. We propose a new model, inspired by the Cox model, that leads to a simple estimation procedure and that presents a closed formula for the variance. We derive some asymptotic properties of the estimators and we show the practical behaviour of our procedure by means of simulations. We also apply our model and estimation method to a breast cancer data set.

Submitted 4 June, 2018; originally announced June 2018.

Comments: 41 pages, 5 figures

arXiv:1806.00989 [pdf, other]

Asymptotic optimality of adaptive importance sampling

Authors: Bernard Delyon, François Portier

Abstract: Adaptive importance sampling (AIS) uses past samples to update the sampling policy $q_t$ at each stage $t$. Each stage $t$ is formed with two steps: (i) to explore the space with $n_t$ points according to $q_t$ and (ii) to exploit the current amount of information to update the sampling policy. The very fundamental question raised in this paper concerns the behavior of empirical sums based on AIS. Without making any assumption on the allocation policy $n_t$, the theory developed involves no restriction on the split of computational resources between the explore (i) and the exploit (ii) step. It is shown that AIS is asymptotically optimal: the asymptotic behavior of AIS is the same as some "oracle" strategy that knows the targeted sampling policy from the beginning. From a practical perspective, weighted AIS is introduced, a new method that makes it possible to forget poor samples from early stages.

Submitted 3 October, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

Comments: 19 pages, 3 figures

arXiv:1804.10596 [pdf, other]

doi 10.1103/PhysRevX.9.021016

A bright on-demand source of anti-bunched microwave photons based on inelastic Cooper pair tunneling

Authors: Alexander Grimm, Florian Blanchet, Romain Albert, Juha Leppäkangas, Salha Jebari, Dibyendu Hazra, Frédéric Gustavo, Jean-Luc Thomassin, Eva Dupont-Ferrier, Fabien Portier, Max Hofheinz

Abstract: The ability to generate single photons is not only a ubiquitous tool for scientific exploration with applications ranging from spectroscopy and metrology to quantum computing, but also an important proof of the underlying quantum nature of a physical process. In the microwave regime, emission of anti-bunched radiation has so far relied on coherent control of Josephson qubits, where precisely calibrated microwave pulses are needed, and the achievable bandwidth is limited by the anharmonicity of the qubit. Here, we demonstrate the operation of a bright on-demand source of quantum microwave radiation capable of emitting anti-bunched photons based on inelastic Cooper pair tunneling and driven by a simple DC voltage bias. It is characterized by its normalized second order correlation function of $g^{(2)}(0)\approx0.43$ corresponding to anti-bunching in the single photon regime. Our source can be triggered and its emission rate is tunable in situ exceeding rates obtained with current microwave single photon sources by more than one order of magnitude.

Submitted 31 July, 2018; v1 submitted 27 April, 2018; originally announced April 2018.

Journal ref: Phys. Rev. X 9, 021016 (2019)

arXiv:1802.07323 [pdf, other]

doi 10.1103/PhysRevApplied.11.034035

Parametric amplification and squeezing with an ac- and dc-voltage biased superconducting junction

Authors: Udson C. Mendes, Sébastien Jezouin, Philippe Joyez, Bertrand Reulet, Alexandre Blais, Fabien Portier, Christophe Mora, Carles Altimiras

Abstract: We theoretically investigate a near-quantum-limited parametric amplifier based on the nonlinear dynamics of quasiparticles flowing through a superconducting-insulator-superconducting junction. Photon-assisted tunneling, resulting from the combination of dc- and ac-voltage bias, gives rise to a strong parametric interaction for the electromagnetic modes reflected by the junction coupled to a transmission line. We show phase-sensitive and phase-preserving amplification, together with single- and two-mode squeezing. For an aluminum junction pumped at twice the center frequency, $ω_0/2π=6$ GHz, we predict narrow-band phase-sensitive amplification of microwave signals to more than 20 dB, and broadband phase-preserving amplification of 20 dB over a 1.2 GHz 3-dB bandwidth. We also predict single- and two-mode squeezing reaching more than -12 dB over 5.3 GHz 3-dB bandwidth. Moreover, with a simple impedance matching circuit, we demonstrate 3 dB bandwidth reaching 4.3 GHz for 20 dB of gain. A key feature of the device is that its performance can be controlled in-situ with the applied dc- and ac-voltage biases.

Submitted 6 March, 2019; v1 submitted 20 February, 2018; originally announced February 2018.

Comments: Accepted for publication at the Physical Review Applied. 12 pages and 9 figures

Journal ref: Phys. Rev. Applied 11, 034035 (2019)

arXiv:1801.07497 [pdf, other]

doi 10.1088/1361-6633/aaa98a

Coherent control of single electrons: a review of current progress

Authors: Christopher Bauerle, D. Christian Glattli, Tristan Meunier, Fabien Portier, Patrice Roche, Preden Roulleau, Shintaro Takada, Xavier Waintal

Abstract: In this report we review the present state of the art of the control of propagating quantum states at the single-electron level and its potential application to quantum information processing. We give an overview of the different approaches which have been developed over the last ten years in order to gain full control over a propagating single electron in a solid state system. After a brief introduction of the basic concepts, we present experiments on flying qubit circuits for ensemble of electrons measured in the low frequency (DC) limit. We then present the basic ingredients necessary to realise such experiments at the single-electron level. This includes a review of the various single electron sources which are compatible with integrated single electron circuits. This is followed by a review of recent key experiments on electron quantum optics with single electrons. Finally we will present recent developments about the new physics that emerges using ultrashort voltage pulses. We conclude our review with an outlook and future challenges in the field.

Submitted 1 February, 2018; v1 submitted 23 January, 2018; originally announced January 2018.

Comments: 35 pages, 24 figures, This is an author-created, un-copyedited version of an article accepted for publication in Rep. Prog. Phys. (references added)

Journal ref: Rep. Prog. Phys. 81 056503 (pp33) (2018)

arXiv:1801.01797 [pdf, ps, other]

Monte Carlo integration with a growing number of control variates

Authors: François Portier, Johan Segers

Abstract: It is well known that Monte Carlo integration with variance reduction by means of control variates can be implemented by the ordinary least squares estimator for the intercept in a multiple linear regression model. A central limit theorem is established for the integration error if the number of control variates tends to infinity. The integration error is scaled by the standard deviation of the error term in the regression model. If the linear span of the control variates is dense in a function space that contains the integrand, the integration error tends to zero at a rate which is faster than the square root of the number of Monte Carlo replicates. Depending on the situation, increasing the number of control variates may or may not be computationally more efficient than increasing the Monte Carlo sample size.

Submitted 9 October, 2019; v1 submitted 5 January, 2018; originally announced January 2018.

Comments: 22 pages. Numerical experiments in earlier version

MSC Class: 60F05; 62J05; 65C05

arXiv:1704.04432 [pdf, other]

doi 10.1038/s41928-018-0055-7

Quantum limited amplification from inelastic Cooper pair tunneling

Authors: S. Jebari, F. Blanchet, A. Grimm, D. Hazra, R. Albert, P. Joyez, D. Vion, D. Esteve, F. Portier, M. Hofheinz

Abstract: Nature sets fundamental limits regarding how accurate the amplification of analog signals may be. For instance, a linear amplifier unavoidably adds some noise which amounts to half a photon at best. While for most applications much higher noise levels are acceptable, the readout of microwave quantum systems, such as spin or superconducting qubits, requires noise as close as possible to this ultimate limit. To date, it is approached only by parametric amplifiers exploiting non-linearities in superconducting circuits and driven by a strong microwave pump tone. However, this microwave drive makes them much more difficult to implement and operate than conventional DC powered amplifiers, which so far suffer from much higher noise. Here we present the first experimental proof that a simple DC-powered setup allows for amplification close to the quantum limit. Our amplification scheme is based on the stimulated microwave photon emission accompanying inelastic Cooper pair tunneling through a DC-biased Josephson junction, with the key to low noise lying in a well defined auxiliary idler mode, in analogy to parametric amplifiers.

Submitted 22 May, 2017; v1 submitted 14 April, 2017; originally announced April 2017.

Comments: 6 pages, 4 figures

Journal ref: Nat Electron 1, 223 (2018)

arXiv:1703.05009 [pdf, other]

doi 10.1103/PhysRevLett.119.137001

Emission of non-classical radiation by inelastic Cooper pair tunneling

Authors: M. Westig, B. Kubala, O. Parlavecchio, Y. Mukharsky, C. Altimiras, P. Joyez, D. Vion, P. Roche, M. Hofheinz, D. Esteve, M. Trif, P. Simon, J. Ankerhold, F. Portier

Abstract: We show that a properly dc-biased Josephson junction in series with two microwave resonators of different frequencies emits photon pairs in the resonators. By measuring auto- and inter-correlations of the power leaking out of the resonators, we demonstrate two-mode amplitude squeezing below the classical limit. This non-classical microwave light emission is found to be in quantitative agreement with our theoretical predictions, up to an emission rate of 2 billion photon pairs per second.

Submitted 15 March, 2017; originally announced March 2017.

Journal ref: Phys. Rev. Lett. 119, 137001 (2017)

arXiv:1609.01165 [pdf, other]

Integral estimation based on Markovian design

Authors: Romain Azaïs, Bernard Delyon, François Portier

Abstract: Suppose that a mobile sensor describes a Markovian trajectory in the ambient space. At each time the sensor measures an attribute of interest, e.g., the temperature. Using only the location history of the sensor and the associated measurements, the aim is to estimate the average value of the attribute over the space. In contrast to classical probabilistic integration methods, e.g., Monte Carlo, the proposed approach does not require any knowledge on the distribution of the sensor trajectory. Probabilistic bounds on the convergence rates of the estimator are established. These rates are better than the traditional "root n"-rate, where n is the sample size, attached to other probabilistic integration methods. For finite sample sizes, the good behaviour of the procedure is demonstrated through simulations and an application to the evaluation of the average temperature of oceans is considered.

Submitted 29 September, 2017; v1 submitted 5 September, 2016; originally announced September 2016.

Comments: 45 pages

MSC Class: 62G07; G2M05

arXiv:1605.00098 [pdf]

doi 10.1103/PhysRevX.6.031002

Interacting electrodynamics of short coherent conductors in quantum circuits

Authors: C. Altimiras, F. Portier, P. Joyez

Abstract: When combining lumped mesoscopic electronic components to form a circuit, quantum fluctuations of electrical quantities lead to a non-linear electromagnetic interaction between the components that is not generally understood. The Landauer-Büttiker formalism that is frequently used to describe non-interacting coherent mesoscopic components is not directly suited to describe such circuits since it assumes perfect voltage bias, i.e. the absence of fluctuations. Here, we show that for short coherent conductors of arbitrary transmission, the Landauer-Büttiker formalism can be extended to take into account quantum voltage fluctuations similarly to what is done for tunnel junctions. The electrodynamics of the whole circuit is then formally worked out disregarding the non-Gaussianity of fluctuations. This reveals how the aforementioned non-linear interaction operates in short coherent conductors: voltage fluctuations induce a reduction of conductance through the phenomenon of dynamical Coulomb blockade but they also modify their internal density of states leading to an additional electrostatic modification of the transmission. Using this approach we can account quantitatively for conductance measurements performed on Quantum Point Contacts in series with impedances of the order of $R_K = h / e^2$. Our work should enable a better engineering of quantum circuits with targeted properties.

Submitted 30 April, 2016; originally announced May 2016.

Journal ref: Phys. Rev. X 6, 031002 (2016)

arXiv:1512.05812 [pdf, other]

doi 10.1103/PhysRevB.95.125311

Quantum Properties of the radiation emitted by a conductor in the Coulomb Blockade Regime

Authors: Christophe Mora, Carles Altimiras, Philippe Joyez, Fabien Portier

Abstract: We present an input-output formalism describing a tunnel junction strongly coupled to its electromagnetic environment. We exploit it in order to investigate the dynamics of the radiation being emitted and scattered by the junction. We find that the non-linearity imprinted in the electronic transport by a properly designed environment generates strongly squeezed radiation. Our results show that the interaction between a quantum conductor and electromagnetic fields can be exploited as a resource to design simple sources of non-classical radiation.

Submitted 28 March, 2017; v1 submitted 17 December, 2015; originally announced December 2015.

Comments: 14 pages, 4 figures, includes Supplementary

Journal ref: Phys. Rev. B 95, 125311 (2017)

arXiv:1511.06544 [pdf, other]

On the weak convergence of the empirical conditional copula under a simplifying assumption

Authors: François Portier, Johan Segers

Abstract: When the copula of the conditional distribution of two random variables given a covariate does not depend on the value of the covariate, two conflicting intuitions arise about the best possible rate of convergence attainable by nonparametric estimators of that copula. In the end, any such estimator must be based on the marginal conditional distribution functions of the two dependent variables given the covariate, and the best possible rate for estimating such localized objects is slower than the parametric one. However, the invariance of the conditional copula given the value of the covariate suggests the possibility of parametric convergence rates. The more optimistic intuition is shown to be correct, confirming a conjecture supported by extensive Monte Carlo simulations by I. Hobaek Haff and J. Segers [Computational Statistics and Data Analysis 84:1--13, 2015] and improving upon the nonparametric rate obtained theoretically by I. Gijbels, M. Omelka and N. Veraverbeke [Scandinavian Journal of Statistics 2015, to appear]. The novelty of the proposed approach lies in the double smoothing procedure employed for the estimator of the marginal cumulative distribution functions. Under mild conditions on the bandwidth sequence, the estimator is shown to take values in a certain class of smooth functions, the class having sufficiently small entropy for empirical process arguments to work. The copula estimator itself is asymptotically indistinguishable from a kind of oracle empirical copula, making it appear as if the marginal conditional distribution functions were known.

Submitted 16 May, 2017; v1 submitted 20 November, 2015; originally announced November 2015.

Comments: 36 pages

MSC Class: 62G20; 62G30

arXiv:1509.04413 [pdf, ps, other]

Efficiency of Z-estimators indexed by the objective functions

Authors: François Portier

Abstract: We study the convergence of $Z$-estimators $\widehat θ(η)\in \mathbb R^p$ for which the objective function depends on a parameter $η$ that belongs to a Banach space $\mathcal H$. Our results include the uniform consistency over $\mathcal H$ and the weak convergence in the space of bounded $\mathbb R^p$-valued functions defined on $\mathcal H$. Furthermore when $η$ is a tuning parameter optimally selected at $η_0$, we provide conditions under which an estimated $\widehat η$ can be replaced by $η_0$ without affecting the asymptotic variance. Interestingly, these conditions are free from any rate of convergence of $\widehat η$ to $η_0$ but they require the space described by $\widehat η$ to be not too large. We highlight several applications of our results and we study in detail the case where $η$ is the weight function in weighted regression.

Submitted 15 September, 2015; originally announced September 2015.

Comments: 25 pages, 4 figures

MSC Class: 62F12; 62F35; 62G20

arXiv:1503.05057 [pdf, other]

doi 10.1103/PhysRevB.93.035420

Robust quantum coherence above the Fermi sea

Authors: S. Tewari, P. Roulleau, C. Grenier, F. Portier, A. Cavanna, U. Gennser, D. Mailly, P. Roche

Abstract: In this paper we present an experiment where we measured the quantum coherence of a quasiparticle injected at a well-defined energy above the Fermi sea into the edge states of the integer quantum Hall regime. Electrons are introduced in an electronic Mach-Zehnder interferometer after passing through a quantum dot that plays the role of an energy filter. Measurements show that above a threshold injection energy, the visibility of the quantum interferences is almost independent of the energy. This is true even for high energies, up to 130~$μ$eV, well above the thermal energy of the measured sample. This result is in strong contradiction with our theoretical predictions, which instead predict a continuous decrease of the interference visibility with increasing energy. This experiment raises serious questions concerning the understanding of excitations in the integer quantum Hall regime.

Submitted 17 March, 2015; originally announced March 2015.

Journal ref: Phys. Rev. B 93, 035420 (2016)

arXiv:1409.6696 [pdf, other]

doi 10.1103/PhysRevLett.114.126801

Fluctuation-Dissipation Relations of a Tunnel Junction Driven by a Quantum Circuit

Authors: O. Parlavecchio, C. Altimiras, J.-R. Souquet, P. Simon, I. Safi, P. Joyez, D. Vion, P. Roche, D. Esteve, F. Portier

Abstract: We derive fluctuation-dissipation relations for a tunnel junction driven by a high impedance microwave resonator, displaying strong quantum fluctuations. We find that the fluctuation-dissipation relations derived for classical forces hold, provided the effect of the circuit's quantum fluctuations is incorporated into a modified non-linear $I(V)$ curve. We also demonstrate that all quantities measured under a coherent time dependent bias can be reconstructed from their dc counterpart with a photo-assisted tunneling relation. We confirm these predictions by implementing the circuit and measuring the dc current through the junction, its high frequency admittance and its current noise at the frequency of the resonator.

Submitted 30 March, 2015; v1 submitted 23 September, 2014; originally announced September 2014.

Comments: Published as Physical Review Letters, 114, 126801

Journal ref: Physical Review Letters, American Physical Society, 2015, 114 (12)

arXiv:1409.0752 [pdf, ps, other]

Continuous inverse regression

Authors: François Portier

Abstract: We provide new theoretical results in the field of inverse regression methods for dimension reduction. Our approach is based on the study of some empirical processes that lie close to a certain dimension reduction subspace, called the central subspace. The study of these processes essentially includes weak convergence results and the consistency of some general bootstrap procedures. While such properties are used to obtain new results about sliced inverse regression, they mainly allow us to define a natural family of methods for dimension reduction. First, the estimation methods are shown to have root $n$ rates and the bootstrap is proved to be valid. Second, we describe a family of Cramér-von Mises test statistics that can be used in testing structural properties of the central subspace or the significance of some sets of predictors. We show that the quantiles of those tests can be computed by bootstrap. Most of the existing methods related to inverse regression involve a slicing of the response that is difficult to select in practice. While our approach guarantees a comprehensive estimation, the slicing is no longer needed.

Submitted 1 June, 2015; v1 submitted 2 September, 2014; originally announced September 2014.

Comments: 22 pages

arXiv:1409.0733 [pdf, ps, other]

doi 10.3150/15-BEJ725

Integral approximation by kernel smoothing

Authors: Bernard Delyon, François Portier

Abstract: Let $(X_1,\ldots,X_n)$ be an i.i.d. sequence of random variables in $\mathbb{R}^d$, $d\geq 1$. We show that, for any function $\varphi :\mathbb{R}^d\rightarrow\mathbb{R}$, under regularity conditions, \[n^{1/2}\Biggl(n^{-1}\sum_{i=1}^n\frac{\varphi(X_i)}{\widehat{f}(X_i)}- \int \varphi(x)\,dx\Biggr)\stackrel{\mathbb{P}}{\longrightarrow}0,\] where $\widehat{f}$ is the classical kernel estimator of the density of $X_1$. This result is striking because it speeds up traditional rates, in root $n$, derived from the central limit theorem when $\widehat{f}=f$. Although this paper highlights some applications, we mainly address theoretical issues related to the latter result. We derive upper bounds for the rate of convergence in probability. These bounds depend on the regularity of the functions $\varphi$ and $f$, the dimension $d$ and the bandwidth of the kernel estimator $\widehat{f}$. Moreover, they are shown to be accurate since they are used as renormalizing sequences in two central limit theorems each reflecting different degrees of smoothness of $\varphi$. As an application to regression modelling with random design, we provide the asymptotic normality of the estimation of the linear functionals of a regression function. As a consequence of the above result, the asymptotic variance does not depend on the regression function. Finally, we debate the choice of the bandwidth for integral approximation and we highlight the good behavior of our procedure through simulations.

Submitted 6 June, 2016; v1 submitted 2 September, 2014; originally announced September 2014.

Comments: Published at http://dx.doi.org/10.3150/15-BEJ725 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm). arXiv admin note: text overlap with arXiv:1312.4497

Report number: IMS-BEJ-BEJ725

Journal ref: Bernoulli 2016, Vol. 22, No. 4, 2177-2208

arXiv:1404.1792 [pdf, other]

doi 10.1063/1.4832074

Tunable microwave impedance matching to a high impedance source using a Josephson metamaterial

Authors: Carles Altimiras, Olivier Parlavecchio, Philippe Joyez, Denis Vion, Patrice Roche, Daniel Esteve, Fabien Portier

Abstract: We report the efficient coupling of a $50\,Ω$ microwave circuit to a high impedance conductor. We use an impedance transformer consisting of a $λ/4$ co-planar resonator whose inner conductor contains an array of superconducting quantum interference devices (SQUIDs), providing the resonator with a large and tunable lineic inductance $\mathcal{L}\sim 80 μ_0$, resulting in a large characteristic impedance $Z_C\sim 1\,\mathrm{k}Ω$. The impedance matching efficiency is characterized by measuring the shot noise power emitted by a dc biased high resistance tunnel junction connected to the resonator. We demonstrate matching to impedances in the $15$ to $35\,\mathrm{k}Ω$ range with bandwidths above $100\,\mathrm{MHz}$ around a resonant frequency tunable in the $4$ to $6\,\mathrm{GHz}$ range.

Submitted 7 April, 2014; originally announced April 2014.

Comments: Published in Applied Physics Letters

Journal ref: Appl. Phys. Lett. 103, 212601 (2013)

arXiv:1403.5999 [pdf, other]

doi 10.1103/PhysRevLett.112.236803

Dynamical Coulomb Blockade of Shot Noise

Authors: C. Altimiras, O. Parlavecchio, P. Joyez, D. Vion, P. Roche, D. Esteve, F. Portier

Abstract: We observe the suppression of the finite frequency shot-noise produced by a voltage biased tunnel junction due to its interaction with a single electromagnetic mode of high impedance. The tunnel junction is embedded in a quarter wavelength resonator containing a dense SQUID array providing it with a characteristic impedance in the kOhms range and a resonant frequency tunable in the 4-6 GHz range. Such high impedance gives rise to a sizeable Coulomb blockade on the tunnel junction (roughly 30% reduction in the differential conductance) and allows an efficient measurement of the spectral density of the current fluctuations at the resonator frequency. The observed blockade of shot-noise is found in agreement with an extension of the dynamical Coulomb blockade theory.

Submitted 23 October, 2014; v1 submitted 24 March, 2014; originally announced March 2014.

Journal ref: Phys. Rev. Lett. 112, 236803 (2014)

arXiv:1312.4497 [pdf, ps, other]

On the acceleration of some empirical means with application to nonparametric regression

Authors: Bernard Delyon, François Portier

Abstract: Let $(X_1,\ldots ,X_n)$ be an i.i.d. sequence of random variables in $\mathbb{R}^d$, $d\geq 1$. For some function $\varphi:\mathbb{R}^d\rightarrow \mathbb{R}$, under regularity conditions, we show that \[n^{1/2} \left(n^{-1} \sum_{i=1}^n \frac{\varphi(X_i)}{\widehat{f}^{(i)}(X_i)}-\int \varphi(x)\,dx \right) \stackrel{\mathbb{P}}{\longrightarrow} 0,\] where $\widehat{f}^{(i)}$ is the classical leave-one-out kernel estimator of the density of $X_1$. This result is striking because it speeds up traditional rates, in root $n$, derived from the central limit theorem when $\widehat{f}^{(i)}=f$. As a consequence, it improves the classical Monte Carlo procedure for integral approximation. The paper mainly addresses theoretical issues related to the latter result (rates of convergence, bandwidth choice, regularity of $\varphi$) but also considers some statistical applications dealing with random design regression. In particular, we provide the asymptotic normality of the estimation of the linear functionals of a regression function on which the only requirement is the Hölder regularity. This leads us to a new version of the average derivative estimator introduced by Härdle and Stoker (1989), which allows for dimension reduction by estimating the index space of a regression.

Submitted 16 December, 2013; originally announced December 2013.

Showing 1–50 of 68 results for author: Portier, F