-
Generalization Bounds for Gradient Methods via Discrete and Continuous Prior
Authors:
Xuanyuan Luo,
Luo Bei,
Jian Li
Abstract:
Proving algorithm-dependent generalization error bounds for gradient-type optimization methods has attracted significant attention recently in learning theory. However, most existing trajectory-based analyses require either restrictive assumptions on the learning rate (e.g., a fast decreasing learning rate) or continuous injected noise (such as the Gaussian noise in Langevin dynamics). In this paper, we introduce a new discrete data-dependent prior to the PAC-Bayesian framework, and prove a high probability generalization bound of order $O(\frac{1}{n}\cdot \sum_{t=1}^T(\gamma_t/\varepsilon_t)^2\left\|{\mathbf{g}_t}\right\|^2)$ for Floored GD (i.e., a version of gradient descent with precision level $\varepsilon_t$), where $n$ is the number of training samples, $\gamma_t$ is the learning rate at step $t$, and $\mathbf{g}_t$ is roughly the difference between the gradient computed using all samples and that computed using only prior samples. $\left\|{\mathbf{g}_t}\right\|$ is upper bounded by, and typically much smaller than, the gradient norm $\left\|{\nabla f(W_t)}\right\|$. We remark that our bound holds for nonconvex and nonsmooth scenarios. Moreover, our theoretical results provide numerically favorable upper bounds of testing errors (e.g., $0.037$ on MNIST). Using a similar technique, we can also obtain new generalization bounds for certain variants of SGD. Furthermore, we study the generalization bounds for gradient Langevin Dynamics (GLD). Using the same framework with a carefully constructed continuous prior, we show a new high probability generalization bound of order $O(\frac{1}{n} + \frac{L^2}{n^2}\sum_{t=1}^T(\gamma_t/\sigma_t)^2)$ for GLD. The new $1/n^2$ rate is due to the concentration of the difference between the gradient of training samples and that of the prior.
Submitted 11 October, 2022; v1 submitted 27 May, 2022;
originally announced May 2022.
-
Spitzer observations of MAMBO galaxies: weeding out active nuclei in starbursting proto-ellipticals
Authors:
R. J. Ivison,
T. R. Greve,
S. Serjeant,
F. Bertoldi,
E. Egami,
A. M. J. Mortier,
A. Alonso-Herrero,
P. Barmby,
L. Bei,
H. Dole,
C. W. Engelbracht,
G. G. Fazio,
D. T. Frayer,
K. D. Gordon,
D. C. Hines,
J. -S. Huang,
E. Le Floch,
K. A. Misselt,
S. Miyazaki,
J. E. Morrison,
C. Papovich,
P. G. Perez-Gonzalez,
M. J. Rieke,
G. H. Rieke,
J. Rigby
, et al. (4 additional authors not shown)
Abstract:
We present Spitzer observations in five wavebands between 3.6 and 24um of an unbiased sample of 9 luminous, dusty galaxies selected at 1200um by the MAMBO camera on the IRAM 30-m telescope, a population akin to the well-known submm or `SCUBA' galaxies (hereafter SMGs). Owing to the coarse resolution of submm/mm instrumentation, SMGs have traditionally been difficult to identify at other wavelengths. We compare our multi-wavelength catalogs to show that the overlap between 24 and 1200um must be close to complete at these flux levels. We find that all (4/4) of the most secure >=4sigma SMGs have robust >=4sigma counterparts at 1.4GHz, while the fraction drops to 7/9 using all >=3sigma SMGs. We show that combining mid-IR and marginal (>=3sigma) radio detections provides plausible identifications in the remaining cases, enabling us to identify the complete sample. Accretion onto an obscured central engine is betrayed by the shape of the mid-IR continuum emission for several sources, confirming Spitzer's potential to weed out active galaxies. We demonstrate the power of a S(24um)/S(8um) vs S(8um)/S(4.5um) color-color plot as a diagnostic for this purpose. However, we conclude that the majority (~75%) of SMGs have rest-frame mid-/far-IR SEDs commensurate with obscured starbursts. Sensitive 24-um observations are clearly a useful route to identify and characterize reliable counterparts to high-redshift far-IR-bright galaxies, complementing what is possible via deep radio imaging.
Submitted 7 June, 2004;
originally announced June 2004.
-
Identification of luminous infrared galaxies at 1<z<2.5
Authors:
E. Le Floc'h,
P. G. Perez-Gonzalez,
G. H. Rieke,
C. Papovich,
J. -S. Huang,
P. Barmby,
H. Dole,
E. Egami,
A. Alonso-Herrero,
G. Wilson,
S. Miyazaki,
J. R. Rigby,
L. Bei,
M. Blaylock,
C. W. Engelbracht,
G. G. Fazio,
D. T. Frayer,
K. D. Gordon,
D. C. Hines,
K. A. Misselt,
J. E. Morrison,
J. Muzerolle,
M. J. Rieke,
D. Rigopoulou,
K. Y. L. Su
, et al. (2 additional authors not shown)
Abstract:
We present preliminary results on 24micron detections of luminous infrared galaxies at z>1 with the Multiband Imaging Photometer for Spitzer (MIPS). Observations were performed in the Lockman Hole and the Extended Groth Strip (EGS), and were supplemented by data obtained with the Infrared Array Camera (IRAC) between 3 and 9microns. The positional accuracy of ~2arcsec for most MIPS/IRAC detections provides unambiguous identifications of their optical counterparts. Using spectroscopic redshifts from the Deep Extragalactic Evolutionary Probe survey, we identify 24micron sources at z>1 in the EGS, while the combination of the MIPS/IRAC observations with $BVRIJHK$ ancillary data in the Lockman Hole also shows very clear cases of galaxies with photometric redshifts at 1<z<2.5.
The observed 24micron fluxes indicate infrared luminosities greater than 10^11 L_sol, while the data at shorter wavelengths reveal rather red and probably massive (M>=M*) galaxy counterparts. This is the first time that this population of luminous objects has been detected up to z~2.5 in the infrared. Our work demonstrates the ability of the MIPS instrument to probe the dusty Universe at very high redshift, and illustrates how the forthcoming Spitzer deep surveys will offer a unique opportunity to illuminate a dark side of cosmic history not explored by previous infrared experiments.
Submitted 6 June, 2004;
originally announced June 2004.
-
Sub-millimeter detections of Spitzer Space Telescope galaxy populations
Authors:
S. Serjeant,
A. M. J. Mortier,
R. J. Ivison,
E. Egami,
G. H. Rieke,
S. P. Willner,
D. Rigopoulou,
A. Alonso-Herrero,
P. Barmby,
L. Bei,
H. Dole,
C. W. Engelbracht,
G. G. Fazio,
E. Le Floc'h,
K. D. Gordon,
T. R. Greve,
D. C. Hines,
J. -S. Huang,
K. A. Misselt,
S. Miyazaki,
J. E. Morrison,
C. Papovich,
P. G. Perez-Gonzalez,
M. J. Rieke,
J. Rigby
, et al. (1 additional author not shown)
Abstract:
We present sub-millimeter statistical detections of galaxies discovered in the 5'x5' Spitzer Early Release Observations (to 4-15 microJy 5 sigma at 3.6-8 microns, 170 microJy at 24 microns) through a stacking analysis of our reanalysed SCUBA 8mJy survey maps, and a Spitzer identification of a new sub-millimeter point source in the 8mJy survey region. For sources detected at 5.8 or 8 microns (154 and 111 sources respectively), we detect positive skews in the sub-millimeter flux distributions at 99.2-99.8% confidence using Kolmogorov-Smirnov tests, at both 850 microns and 450 microns. We also marginally detect the Spitzer 24 micron galaxies at 850 microns at 97% confidence, and place limits on the mean sub-millimeter fluxes of the 3.6 and 4.5 micron sources. Integrating the sub-millimeter fluxes of the Spitzer populations, we find the 5.8 micron galaxies contribute 0.12 +/- 0.05 nW/m^2/sr to the 850 micron background, and 2.4 +/- 0.7 nW/m^2/sr to the 450 micron background; similar contributions are made by the 8 micron-selected sample. We infer that the populations dominating the 5.8 and 8 micron extragalactic background light also contribute around a quarter of the 850 micron background and the majority of the 450 micron background.
Submitted 1 June, 2004;
originally announced June 2004.