Search | arXiv e-print repository

Proxy-Normalizing Activations to Match Batch Normalization while Removing Batch Dependence

Authors: Antoine Labatie, Dominic Masters, Zach Eaton-Rosen, Carlo Luschi

Abstract: We investigate the reasons for the performance degradation incurred with batch-independent normalization. We find that the prototypical techniques of layer normalization and instance normalization both induce the appearance of failure modes in the neural network's pre-activations: (i) layer normalization induces a collapse towards channel-wise constant functions; (ii) instance normalization induces a lack of variability in instance statistics, symptomatic of an alteration of the expressivity. To alleviate failure mode (i) without aggravating failure mode (ii), we introduce the technique "Proxy Normalization" that normalizes post-activations using a proxy distribution. When combined with layer normalization or group normalization, this batch-independent normalization emulates batch normalization's behavior and consistently matches or exceeds its performance.

Submitted 3 April, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

Comments: NeurIPS 2021 camera-ready
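
The abstract above contrasts layer normalization and instance normalization by the statistics each one fixes. The following is a minimal sketch of that difference only, not the paper's Proxy Normalization. The toy tensor layout `x[n][c][s]` (instance, channel, spatial position), the helper names and the `eps` value are assumptions made for illustration.

```python
import random
from statistics import fmean, pvariance

def instance_norm(x, eps=1e-5):
    """Normalize each (instance, channel) slice over its spatial values."""
    out = []
    for instance in x:
        channels = []
        for ch in instance:
            m, v = fmean(ch), pvariance(ch)
            channels.append([(t - m) / (v + eps) ** 0.5 for t in ch])
        out.append(channels)
    return out

def layer_norm(x, eps=1e-5):
    """Normalize each instance over all of its channels and spatial values."""
    out = []
    for instance in x:
        flat = [t for ch in instance for t in ch]
        m, v = fmean(flat), pvariance(flat)
        out.append([[(t - m) / (v + eps) ** 0.5 for t in ch] for ch in instance])
    return out

def channel_means(x):
    """Per-(instance, channel) mean over spatial positions."""
    return [[fmean(ch) for ch in instance] for instance in x]

random.seed(0)
# Toy pre-activations: 4 instances, 3 channels, 16 spatial positions,
# each (instance, channel) slice shifted by its own random offset.
x = []
for _ in range(4):
    instance = []
    for _ in range(3):
        offset = random.gauss(0.0, 2.0)
        instance.append([offset + random.gauss(0.0, 1.0) for _ in range(16)])
    x.append(instance)

in_means = channel_means(instance_norm(x))  # identical (zero) for every instance
ln_means = channel_means(layer_norm(x))     # still vary across instances and channels
```

In this toy setting, instance normalization forces every per-instance channel mean to zero, so those statistics carry no variability across instances, whereas layer normalization leaves them free to differ.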

arXiv:2106.03640

Making EfficientNet More Efficient: Exploring Batch-Independent Normalization, Group Convolutions and Reduced Resolution Training

Authors: Dominic Masters, Antoine Labatie, Zach Eaton-Rosen, Carlo Luschi

Abstract: Much recent research has been dedicated to improving the efficiency of training and inference for image classification. This effort has commonly focused on explicitly improving theoretical efficiency, often measured as ImageNet validation accuracy per FLOP. These theoretical savings have, however, proven challenging to achieve in practice, particularly on high-performance training accelerators. In this work, we focus on improving the practical efficiency of the state-of-the-art EfficientNet models on a new class of accelerator, the Graphcore IPU. We do this by extending this family of models in the following ways: (i) generalising depthwise convolutions to group convolutions; (ii) adding proxy-normalized activations to match batch normalization performance with batch-independent statistics; (iii) reducing compute by lowering the training resolution and inexpensively fine-tuning at higher resolution. We find that these three methods improve the practical efficiency for both training and inference. Code available at https://github.com/graphcore/graphcore-research/tree/main/Making_EfficientNet_More_Efficient.

Submitted 26 August, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

arXiv:1811.03087

Characterizing Well-Behaved vs. Pathological Deep Neural Networks

Authors: Antoine Labatie

Abstract: We introduce a novel approach, requiring only mild assumptions, for the characterization of deep neural networks at initialization. Our approach applies both to fully-connected and convolutional networks and easily incorporates batch normalization and skip-connections. Our key insight is to consider the evolution with depth of statistical moments of signal and noise, thereby characterizing the presence or absence of pathologies in the hypothesis space encoded by the choice of hyperparameters. We establish: (i) for feedforward networks, with and without batch normalization, the multiplicativity of layer composition inevitably leads to ill-behaved moments and pathologies; (ii) for residual networks with batch normalization, on the other hand, skip-connections induce power-law rather than exponential behaviour, leading to well-behaved moments and no pathology.

Submitted 19 June, 2019; v1 submitted 7 November, 2018; originally announced November 2018.

Comments: Proceedings of ICML 2019 (with contact info updated and formatting issues fixed). Code available at https://github.com/alabatie/moments-dnns

arXiv:1211.6211

doi 10.1051/0004-6361/201220790

An optimized correlation function estimator for galaxy surveys

Authors: M. Vargas-Magaña, Julian E. Bautista, J. -Ch. Hamilton, N. G. Busca, É. Aubourg, A. Labatie, J. -M. Le Goff, Stephanie Escoffier, Marc Manera, Cameron K. McBride, Donald P. Schneider, Christopher N. A. Willmer

Abstract: Measuring the two-point correlation function of the galaxies in the Universe gives access to the underlying dark matter distribution, which is related to cosmological parameters and to the physics of the primordial Universe. The estimation of the correlation function for current galaxy surveys makes use of the Landy-Szalay estimator, which is supposed to reach minimal variance. This is only true, however, for a vanishing correlation function. We study the Landy-Szalay estimator when these conditions are not fulfilled and propose a new estimator that provides the smallest variance for a given survey geometry. Our estimator is a linear combination of ratios between paircounts of data and/or random catalogues (DD, RR and DR). The optimal combination for a given geometry is determined by using lognormal mock catalogues. The resulting estimator is biased in a model-dependent way, but we propose a simple iterative procedure for obtaining an unbiased model-independent estimator. Our method can be easily applied to any dataset and requires few extra mock catalogues compared to the standard Landy-Szalay analysis. Using various sets of simulated data (lognormal, second-order LPT and N-Body), we obtain a 20-25% gain on the error bars on the two-point correlation function for the SDSS geometry and $Λ$CDM correlation function. When applied to SDSS data (DR7 and DR9), we achieve a similar gain on the correlation functions, which translates into a 10-15% improvement over the estimation of the densities of matter $Ω_m$ and dark energy $Ω_Λ$ in an open $Λ$CDM model. The constraints derived from DR7 data with our estimator are similar to those obtained with the DR9 data and the Landy-Szalay estimator, which covers a volume twice as large and has a density that is three times higher.

Submitted 16 May, 2013; v1 submitted 26 November, 2012; originally announced November 2012.

Comments: Accepted for publication in A&A
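
For orientation, the standard Landy-Szalay estimator that the abstract above takes as its baseline combines normalized pair counts as (DD - 2 DR + RR) / RR. The sketch below shows that baseline for a single separation bin only; it is not the optimized estimator proposed in the paper. The function name, argument names and the pair counts in the usage lines are illustrative assumptions, not survey values.

```python
def landy_szalay(dd, dr, rr, n_data, n_random):
    """Landy-Szalay estimate of the two-point correlation function in one bin.

    dd, dr, rr are raw pair counts in the bin (data-data, data-random,
    random-random); n_data and n_random are the catalogue sizes.
    """
    # Normalize each count by the total number of possible pairs of that type.
    dd_norm = dd / (n_data * (n_data - 1) / 2)
    dr_norm = dr / (n_data * n_random)
    rr_norm = rr / (n_random * (n_random - 1) / 2)
    return (dd_norm - 2 * dr_norm + rr_norm) / rr_norm

# Hypothetical counts: data pairs as frequent as random pairs, then twice as frequent.
xi_unclustered = landy_szalay(dd=49.5, dr=200.0, rr=199.0, n_data=100, n_random=200)
xi_clustered = landy_szalay(dd=99.0, dr=200.0, rr=199.0, n_data=100, n_random=200)
```

With equal normalized counts the estimate is zero, and doubling the data-data count in the bin raises it to one.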

arXiv:1210.0878

doi 10.1088/0004-637X/760/2/97

Effect of model-dependent covariance matrix for studying Baryon Acoustic Oscillations

Authors: A. Labatie, J. -L. Starck, M. Lachièze-Rey

Abstract: Large-scale structures in the Universe are a powerful tool to test cosmological models and constrain cosmological parameters. A particular feature of interest comes from Baryon Acoustic Oscillations (BAOs), which are sound waves traveling in the hot plasma of the early Universe that stopped at the recombination time. This feature can be observed as a localized bump in the correlation function at the scale of the sound horizon $r_s$. As such, it provides a standard ruler and a lot of constraining power in the correlation function analysis of galaxy surveys. Moreover, the detection of BAOs at the expected scale gives strong support to cosmological models. Both of these studies (BAO detection and parameter constraints) rely on a statistical modeling of the measured correlation function $\hat{ξ}$. Usually $\hat{ξ}$ is assumed to be Gaussian, with a mean $ξ_θ$ depending on the cosmological model and a covariance matrix $C$ generally approximated as a constant (i.e. independent of the model). In this article we study whether a realistic model-dependent $C_θ$ changes the results of cosmological parameter constraints compared to the approximation of a constant covariance matrix $C$. For this purpose, we use a new procedure to generate lognormal realizations of the Luminous Red Galaxies sample of the Sloan Digital Sky Survey Data Release 7 to obtain a model-dependent $C_θ$ in a reasonable time. The approximation of $C_θ$ as a constant creates small changes in the cosmological parameter constraints on our sample. We quantify this modeling error using a lot of simulations and find that it only has a marginal influence on cosmological parameter constraints for current and next-generation galaxy surveys. It can be approximately taken into account by extending the $1σ$ intervals by a factor $\approx 1.3$.

Submitted 9 November, 2012; v1 submitted 2 October, 2012; originally announced October 2012.

Comments: 14 pages, 12 figures

arXiv:1203.6616

doi 10.1111/j.1365-2966.2012.21502.x

The clustering of galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: cosmological implications of the large-scale two-point correlation function

Authors: Ariel G. Sanchez, C. G. Scoccola, A. J. Ross, W. Percival, M. Manera, F. Montesano, X. Mazzalay, A. J. Cuesta, D. J. Eisenstein, E. Kazin, C. K. McBride, K. Mehta, A. D. Montero-Dorta, N. Padmanabhan, F. Prada, J. A. Rubino-Martin, R. Tojeiro, X. Xu, M. Vargas Magana, E. Aubourg, N. A. Bahcall, S. Bailey, D. Bizyaev, A. S. Bolton, H. Brewington, et al. (31 additional authors not shown)

Abstract: We obtain constraints on cosmological parameters from the spherically averaged redshift-space correlation function of the CMASS Data Release 9 (DR9) sample of the Baryon Oscillation Spectroscopic Survey (BOSS). We combine this information with additional data from recent CMB, SN and BAO measurements. Our results show no significant evidence of deviations from the standard flat-Lambda CDM model, whose basic parameters can be specified by Omega_m = 0.285 +- 0.009, 100 Omega_b = 4.59 +- 0.09, n_s = 0.96 +- 0.009, H_0 = 69.4 +- 0.8 km/s/Mpc and sigma_8 = 0.80 +- 0.02. The CMB+CMASS combination sets tight constraints on the curvature of the Universe, with Omega_k = -0.0043 +- 0.0049, and the tensor-to-scalar amplitude ratio, for which we find r < 0.16 at the 95 per cent confidence level (CL). These data show a clear signature of a deviation from scale-invariance also in the presence of tensor modes, with n_s < 1 at the 99.7 per cent CL. We derive constraints on the fraction of massive neutrinos of f_nu < 0.049 (95 per cent CL), implying a limit of sum m_nu < 0.51 eV. We find no signature of a deviation from a cosmological constant from the combination of all datasets, with a constraint of w_DE = -1.033 +- 0.073 when this parameter is assumed time-independent, and no evidence of a departure from this value when it is allowed to evolve as w_DE(a) = w_0 + w_a (1 - a). The achieved accuracy on our cosmological constraints is a clear demonstration of the constraining power of current cosmological observations.

Submitted 13 June, 2012; v1 submitted 29 March, 2012; originally announced March 2012.

Comments: 26 pages, 15 figures. Minor changes to match version accepted by MNRAS

arXiv:1203.6594

doi 10.1111/j.1365-2966.2012.22066.x

The clustering of galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: Baryon Acoustic Oscillations in the Data Release 9 Spectroscopic Galaxy Sample

Authors: Lauren Anderson, Eric Aubourg, Stephen Bailey, Dmitry Bizyaev, Michael Blanton, Adam S. Bolton, J. Brinkmann, Joel R. Brownstein, Angela Burden, Antonio J. Cuesta, Luiz N. A. da Costa, Kyle S. Dawson, Roland de Putter, Daniel J. Eisenstein, James E. Gunn, Hong Guo, Jean-Christophe Hamilton, Paul Harding, Shirley Ho, Klaus Honscheid, Eyal Kazin, D. Kirkby, Jean-Paul Kneib, Antoine Labatie, Craig Loomis, et al. (51 additional authors not shown)

Abstract: We present measurements of galaxy clustering from the Baryon Oscillation Spectroscopic Survey (BOSS), which is part of the Sloan Digital Sky Survey III (SDSS-III). These use the Data Release 9 (DR9) CMASS sample, which contains 264,283 massive galaxies covering 3275 square degrees with an effective redshift z=0.57 and redshift range 0.43 < z < 0.7. Assuming a concordance Lambda-CDM cosmological model, this sample covers an effective volume of 2.2 Gpc^3, and represents the largest sample of the Universe ever surveyed at this density, n = 3 x 10^-4 h^3 Mpc^-3. We measure the angle-averaged galaxy correlation function and power spectrum, including density-field reconstruction of the baryon acoustic oscillation (BAO) feature. The acoustic features are detected at a significance of 5σ in both the correlation function and power spectrum. Combining with the SDSS-II Luminous Red Galaxy Sample, the detection significance increases to 6.7σ. Fitting for the position of the acoustic features measures the distance to z=0.57 relative to the sound horizon D_V/r_s = 13.67 +/- 0.22 at z=0.57. Assuming a fiducial sound horizon of 153.19 Mpc, which matches cosmic microwave background constraints, this corresponds to a distance D_V(z=0.57) = 2094 +/- 34 Mpc. At 1.7 per cent, this is the most precise distance constraint ever obtained from a galaxy survey. We place this result alongside previous BAO measurements in a cosmological distance ladder and find excellent agreement with the current supernova measurements. We use these distance measurements to constrain various cosmological models, finding continuing support for a flat Universe with a cosmological constant.

Submitted 29 March, 2012; originally announced March 2012.

Comments: 33 pages
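
The conversion quoted in the abstract above, from the measured ratio of the distance to the sound horizon (13.67 +/- 0.22) to an absolute distance, is a multiplication by the fiducial sound horizon of 153.19 Mpc. The short check below reproduces that arithmetic. The variable names are illustrative, and scaling the uncertainty by the same factor is an assumption that treats the fiducial sound horizon as exact.

```python
# Values quoted in the abstract.
ratio = 13.67       # distance relative to the sound horizon at z = 0.57
ratio_err = 0.22    # quoted uncertainty on that ratio
r_s_fid = 153.19    # fiducial sound horizon in Mpc

# Scale the ratio and its uncertainty by the fiducial sound horizon.
d_v = ratio * r_s_fid
d_v_err = ratio_err * r_s_fid

print(f"distance = {d_v:.0f} +/- {d_v_err:.0f} Mpc")
```

Rounded to the nearest Mpc, this gives the 2094 +/- 34 Mpc stated in the abstract.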

arXiv:1112.0980

doi 10.1088/0004-637X/746/2/172

Detecting Baryon Acoustic Oscillations

Authors: A. Labatie, J. -L. Starck, M. Lachièze-Rey

Abstract: Baryon Acoustic Oscillations are a feature imprinted in the galaxy distribution by acoustic waves traveling in the plasma of the early universe. Their detection at the expected scale in large-scale structures strongly supports current cosmological models with a nearly linear evolution from redshift approximately 1000, and the existence of dark energy. Besides, BAOs provide a standard ruler for studying cosmic expansion. In this paper we focus on methods for BAO detection using the correlation function measurement. For each method, we want to understand the tested hypothesis (the hypothesis H0 to be rejected) and the underlying assumptions. We first present wavelet methods which are mildly model-dependent and mostly sensitive to the BAO feature. Then we turn to fully model-dependent methods. We present the most often used method based on the chi^2 statistic, but we find it has limitations. In general the assumptions of the chi^2 method are not verified, and it only gives a rough estimate of the significance. The estimate can become very wrong when considering more realistic hypotheses, where the covariance matrix of the measurement depends on cosmological parameters. Instead we propose to use a new method based on two modifications: we modify the procedure for computing the significance and make it rigorous, and we modify the statistic to obtain better results in the case of varying covariance matrix. We verify with simulations that correct significances are different from the ones obtained using the classical chi^2 procedure. We also test a simple example of varying covariance matrix. In this case we find that our modified statistic outperforms the classical chi^2 statistic when both significances are correctly computed. Finally we find that taking into account variations of the covariance matrix can change both BAO detection levels and cosmological parameter constraints.

Submitted 5 December, 2011; originally announced December 2011.

arXiv:1101.1911

doi 10.1051/0004-6361/201118017

Wavelet analysis of baryon acoustic structures in the galaxy distribution

Authors: P. Arnalte-Mur, A. Labatie, N. Clerc, V. J. Martínez, J. -L. Starck, M. Lachièze-Rey, E. Saar, S. Paredes

Abstract: Baryon Acoustic Oscillations (BAO) are a feature imprinted in the density field by acoustic waves travelling in the plasma of the early universe. Their fixed scale can be used as a standard ruler to study the geometry of the universe. BAO have been previously detected using correlation functions and power spectra of the galaxy distribution. In this work, we present a new method for the detection of the real-space structures associated with this feature. These baryon acoustic structures are spherical shells with a relatively small density contrast, surrounding high density central regions. We design a specific wavelet adapted to the search for shells, and exploit the physics of the process by making use of two different mass tracers, introducing a specific statistic to detect the BAO features. We show the effect of the BAO signal in this new statistic when applied to the Lambda - Cold Dark Matter (LCDM) model, using an analytical approximation to the transfer function. We confirm the reliability and stability of our method by using cosmological N-body simulations from the MareNostrum Institut de Ciències de l'Espai (MICE). We apply our method to the detection of BAO in a galaxy sample drawn from the Sloan Digital Sky Survey (SDSS). We use the 'Main' catalogue to trace the shells, and the Luminous Red Galaxies (LRG) as tracers of the high density central regions. Using this new method, we detect, with a high significance, that the LRGs in our sample are preferentially located close to the centres of shell-like structures in the density field, with characteristics similar to those expected from BAOs. We show that, by stacking selected shells, we can find their characteristic density profile. We have delineated a new feature of the cosmic web, the BAO shells. As these are real spatial structures, the BAO phenomenon can be studied in detail by examining those shells.

Submitted 14 March, 2012; v1 submitted 10 January, 2011; originally announced January 2011.

Comments: 12 pages, 10 figures, 1 table. Accepted for publication in A&A. v3: General revision of the paper. Added Sect. 3 discussing expected signal in LCDM model, using MICE simulations. Added illustration of localisation and stacking possibilities in Sect. 5. Main results and conclusions unchanged

Journal ref: Astronomy & Astrophysics, Volume 542, id.A34 (2012)

arXiv:1009.1232

doi 10.1016/j.stamet.2011.05.001

Uncertainty in 2-point correlation function estimators and BAO detection in SDSS DR7

Authors: Antoine Labatie, Jean-Luc Starck, Marc Lachièze-Rey, Pablo Arnalte-Mur

Abstract: We study the uncertainty in different two-point correlation function (2PCF) estimators in currently available galaxy surveys. This is motivated by the active subject of using the baryon acoustic oscillations (BAOs) feature in the correlation function as a tool to constrain cosmological parameters, which requires a fine analysis of the statistical significance. We discuss how estimators are affected by both the uncertainty in the mean density $\bar{n}$ and the integral constraint $\frac{1}{V^2}\int_{V^2} \hat{ξ}(r) d^3r = 0$ which necessarily causes a bias. We quantify both effects for currently available galaxy samples using simulated mock catalogues of the Sloan Digital Sky Survey (SDSS) following a lognormal model, with a Lambda-Cold Dark Matter ($Λ\text{CDM}$) correlation function and similar properties as the samples (number density, mean redshift for the $Λ\text{CDM}$ correlation function, survey geometry, mass-luminosity bias). Because we need extensive simulations to quantify small statistical effects, we cannot use realistic N-body simulations and some physical effects are neglected. Our simulations still enable a comparison of the different estimators by looking at their biases and variances. We also test the reliability of the BAO detection in the SDSS samples and study the compatibility of the data results with our $Λ\text{CDM}$ simulations.

Submitted 22 June, 2011; v1 submitted 7 September, 2010; originally announced September 2010.

Comments: 14 pages, 6 figures, 3 tables

Showing 1–10 of 10 results for author: Labatie, A