-
The K2-OjOS Project: New and revisited planets and candidates in K2 campaigns 5, 16, & 18
Authors:
A. Castro-González,
E. Díez Alonso,
J. Menéndez Blanco,
J. Livingston,
J. P. de Leon,
J. Lillo-Box,
J. Korth,
S. Fernández Menéndez,
J. M. Recio,
F. Izquierdo-Ruiz,
A. Coya Lozano,
F. García de la Cuesta,
N. Gómez Hernández,
J. R. Vidal Blanco,
R. Hevia Díaz,
R. Pardo Silva,
S. Pérez Acevedo,
J. Polancos Ruiz,
P. Padilla Tijerín,
D. Vázquez García,
S. L. Suárez Gómez,
F. García Riesgo,
C. González Gutiérrez,
L. Bonavera,
J. González-Nuevo, et al. (6 additional authors not shown)
Abstract:
We present the first results of K2-OjOS, a collaborative project between professional and amateur astronomers primarily aimed to detect, characterize, and validate new extrasolar planets. For this work, 10 amateur astronomers looked for planetary signals by visually inspecting the 20 427 light curves of K2 campaign 18 (C18). They found 42 planet candidates, of which 18 are new detections and 24 had been detected in the overlapping C5 by previous works. We used archival photometric and spectroscopic observations, as well as new high-spatial resolution images in order to carry out a complete analysis of the candidates found, including a homogeneous characterization of the host stars, transit modelling, search for transit timing variations and statistical validation. As a result, we report four new planets (K2-355 b, K2-356 b, K2-357 b, and K2-358 b) and 14 planet candidates. Besides, we refine the transit ephemeris of the previously published planets and candidates by modelling C5, C16 (when available) and C18 photometric data jointly, largely improving the period and mid-transit time precision. Regarding individual systems, we highlight the new planet K2-356 b and candidate EPIC 211537087.02 being near a 2:1 period commensurability, the detection of significant TTVs in the bright star K2-184 (V = 10.35), the location of K2-103 b inside the habitable zone according to optimistic models, the detection of a new single transit in the known system K2-274, and the disposition reassignment of K2-120 b, which we consider as a planet candidate as the origin of the signal cannot be ascertained.
Submitted 20 November, 2021; v1 submitted 7 September, 2021;
originally announced September 2021.
-
A General Family of Stochastic Proximal Gradient Methods for Deep Learning
Authors:
Jihun Yun,
Aurelie C. Lozano,
Eunho Yang
Abstract:
We study the training of regularized neural networks where the regularizer can be non-smooth and non-convex. We propose a unified framework for stochastic proximal gradient descent, which we term ProxGen, that allows for arbitrary positive preconditioners and lower semi-continuous regularizers. Our framework encompasses standard stochastic proximal gradient methods without preconditioners as special cases, which have been extensively studied in various settings. Not only that, we present two important update rules beyond the well-known standard methods as a byproduct of our approach: (i) the first closed-form proximal mappings of $\ell_q$ regularization ($0 \leq q \leq 1$) for adaptive stochastic gradient methods, and (ii) a revised version of ProxQuant that fixes a caveat of the original approach for quantization-specific regularizers. We analyze the convergence of ProxGen and show that the whole family of ProxGen enjoys the same convergence rate as stochastic proximal gradient descent without preconditioners. We also empirically show the superiority of proximal methods compared to subgradient-based approaches via extensive experiments. Interestingly, our results indicate that proximal methods with non-convex regularizers are more effective than those with convex regularizers.
Submitted 15 July, 2020;
originally announced July 2020.
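As a rough illustration of the kind of update the ProxGen abstract above describes, the sketch below performs one proximal gradient step with a diagonal positive preconditioner and an $\ell_1$ regularizer, for which the proximal mapping is coordinate-wise soft-thresholding. This is a minimal sketch and not the authors' implementation: the function names, the choice of $\ell_1$ (a convex special case of the regularizers the abstract covers), and the restriction to a diagonal preconditioner are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal mapping of tau * |.|, applied coordinate-wise."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def preconditioned_prox_l1_step(theta, grad, precond, lr, lam):
    """One proximal gradient step with a diagonal preconditioner (hypothetical helper).

    Solves, coordinate by coordinate,
        argmin_x  grad . x + lam * ||x||_1
                  + (1 / (2 * lr)) * sum_i precond_i * (x_i - theta_i)^2
    where every entry of `precond` must be strictly positive.
    """
    theta = np.asarray(theta, dtype=float)
    grad = np.asarray(grad, dtype=float)
    precond = np.asarray(precond, dtype=float)
    # Preconditioned gradient step.
    z = theta - lr * grad / precond
    # The threshold is scaled per coordinate by the same preconditioner.
    return soft_threshold(z, lr * lam / precond)
```

With `precond` set to all ones, the step reduces to the standard proximal gradient update without a preconditioner.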
-
A Revision of Neural Tangent Kernel-based Approaches for Neural Networks
Authors:
Kyung-Su Kim,
Aurélie C. Lozano,
Eunho Yang
Abstract:
Recent theoretical works based on the neural tangent kernel (NTK) have shed light on the optimization and generalization of over-parameterized networks, and partially bridge the gap between their practical success and classical learning theory. In particular, using the NTK-based approach, the following three representative results were obtained: (1) A training error bound was derived to show that networks can fit any finite training sample perfectly by reflecting a tighter characterization of training speed depending on the data complexity. (2) A generalization error bound invariant of network size was derived by using a data-dependent complexity measure (CMD). It follows from this CMD bound that networks can generalize arbitrary smooth functions. (3) A simple and analytic kernel function was derived as indeed equivalent to a fully-trained network. This kernel outperforms its corresponding network and the existing gold standard, Random Forests, in few-shot learning. For all of these results to hold, the network scaling factor $κ$ should decrease w.r.t. sample size n. In this case of decreasing $κ$, however, we prove that the aforementioned results are surprisingly erroneous. This is because the output value of the trained network decreases to zero when $κ$ decreases w.r.t. n. To solve this problem, we tighten key bounds by essentially removing $κ$-affected values. Our tighter analysis resolves the scaling problem and enables the validation of the original NTK-based results.
Submitted 6 August, 2020; v1 submitted 2 July, 2020;
originally announced July 2020.
-
Stochastic Gradient Methods with Block Diagonal Matrix Adaptation
Authors:
Jihun Yun,
Aurelie C. Lozano,
Eunho Yang
Abstract:
Adaptive gradient approaches that automatically adjust the learning rate on a per-feature basis have been very popular for training deep networks. This rich class of algorithms includes Adagrad, RMSprop, Adam, and recent extensions. All these algorithms have adopted diagonal matrix adaptation, due to the prohibitive computational burden of manipulating full matrices in high dimensions. In this paper, we show that block-diagonal matrix adaptation can be a practical and powerful solution that can effectively utilize structural characteristics of deep learning architectures, and significantly improve convergence and out-of-sample generalization. We present a general framework with block-diagonal matrix updates via coordinate grouping, which includes counterparts of the aforementioned algorithms, prove their convergence in non-convex optimization, highlighting benefits compared to diagonal versions. In addition, we propose an efficient spectrum-clipping scheme that benefits from superior generalization performance of SGD. Extensive experiments reveal that block-diagonal approaches achieve state-of-the-art results on several deep learning tasks, and can outperform adaptive diagonal methods, vanilla SGD, as well as a modified version of full-matrix adaptation proposed very recently.
Submitted 26 May, 2019;
originally announced May 2019.
-
Multitask Learning using Task Clustering with Applications to Predictive Modeling and GWAS of Plant Varieties
Authors:
Ming Yu,
Addie M. Thompson,
Karthikeyan Natesan Ramamurthy,
Eunho Yang,
Aurélie C. Lozano
Abstract:
Inferring predictive maps between multiple input and multiple output variables or tasks has innumerable applications in data science. Multi-task learning attempts to learn the maps to several output tasks simultaneously with information sharing between them. We propose a novel multi-task learning framework for sparse linear regression, where a full task hierarchy is automatically inferred from the data, with the assumption that the task parameters follow a hierarchical tree structure. The leaves of the tree are the parameters for individual tasks, and the root is the global model that approximates all the tasks. We apply the proposed approach to develop and evaluate: (a) predictive models of plant traits using large-scale and automated remote sensing data, and (b) GWAS methodologies mapping such derived phenotypes in lieu of hand-measured traits. We demonstrate the superior performance of our approach compared to other methods, as well as the usefulness of discovering hierarchical groupings between tasks. Our results suggest that richer genetic mapping can indeed be obtained from the remote sensing data. In addition, our discovered groupings reveal interesting insights from a plant science perspective.
Submitted 4 October, 2017;
originally announced October 2017.
-
Learning task structure via sparsity grouped multitask learning
Authors:
Meghana Kshirsagar,
Eunho Yang,
Aurélie C. Lozano
Abstract:
Sparse mapping has been a key methodology in many high-dimensional scientific problems. When multiple tasks share the set of relevant features, learning them jointly in a group drastically improves the quality of relevant feature selection. However, in practice this technique is used limitedly since such grouping information is usually hidden. In this paper, our goal is to recover the group structure on the sparsity patterns and leverage that information in the sparse learning. Toward this, we formulate a joint optimization problem in the task parameter and the group membership, by constructing an appropriate regularizer to encourage sparse learning as well as correct recovery of task groups. We further demonstrate that our proposed method recovers groups and the sparsity patterns in the task parameters accurately by extensive experiments.
Submitted 14 September, 2017; v1 submitted 13 May, 2017;
originally announced May 2017.
-
Removing Clouds and Recovering Ground Observations in Satellite Image Sequences via Temporally Contiguous Robust Matrix Completion
Authors:
Jialei Wang,
Peder A. Olsen,
Andrew R. Conn,
Aurelie C. Lozano
Abstract:
We consider the problem of removing and replacing clouds in satellite image sequences, which has a wide range of applications in remote sensing. Our approach first detects and removes the cloud-contaminated part of the image sequences. It then recovers the missing scenes from the clean parts using the proposed "TECROMAC" (TEmporally Contiguous RObust MAtrix Completion) objective. The objective function balances temporal smoothness with a low-rank solution while staying close to the original observations. The matrix whose rows are pixels and whose columns are days corresponding to the image has low rank because the pixels reflect land types such as vegetation, roads and lakes, and there are relatively few variations as a result. We provide efficient optimization algorithms for TECROMAC, so we can exploit images containing millions of pixels. Empirical results on real satellite image sequences, as well as simulated data, demonstrate that our approach is able to recover underlying images from heavily cloud-contaminated observations.
Submitted 13 April, 2016;
originally announced April 2016.
-
Robust Gaussian Graphical Modeling with the Trimmed Graphical Lasso
Authors:
Eunho Yang,
Aurélie C. Lozano
Abstract:
Gaussian Graphical Models (GGMs) are popular tools for studying network structures. However, many modern applications such as gene network discovery and social interactions analysis often involve high-dimensional noisy data with outliers or heavier tails than the Gaussian distribution. In this paper, we propose the Trimmed Graphical Lasso for robust estimation of sparse GGMs. Our method guards against outliers by an implicit trimming mechanism akin to the popular Least Trimmed Squares method used for linear regression. We provide a rigorous statistical analysis of our estimator in the high-dimensional setting. In contrast, existing approaches for robust sparse GGMs estimation lack statistical guarantees. Our theoretical results are complemented by experiments on simulated and real gene expression data which further demonstrate the value of our approach.
Submitted 28 October, 2015;
originally announced October 2015.
-
Sparse Quantile Huber Regression for Efficient and Robust Estimation
Authors:
Aleksandr Y. Aravkin,
Anju Kambadur,
Aurelie C. Lozano,
Ronny Luss
Abstract:
We consider new formulations and methods for sparse quantile regression in the high-dimensional setting. Quantile regression plays an important role in many applications, including outlier-robust exploratory analysis in gene selection. In addition, the sparsity consideration in quantile regression enables the exploration of the entire conditional distribution of the response variable given the predictors and therefore yields a more comprehensive view of the important predictors. We propose a generalized OMP algorithm for variable selection, taking the misfit loss to be either the traditional quantile loss or a smooth version we call quantile Huber, and compare the resulting greedy approaches with convex sparsity-regularized formulations. We apply a recently proposed interior point methodology to efficiently solve all convex formulations as well as convex subproblems in the generalized OMP setting, provide theoretical guarantees of consistent estimation, and demonstrate the performance of our approach using empirical studies of simulated and genomic datasets.
Submitted 19 February, 2014;
originally announced February 2014.
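The two misfit losses named in the abstract above can be sketched as follows. The quantile (pinball) loss is standard. The smoothed version shown here replaces the kink at zero with a quadratic piece of width controlled by `kappa`; it is one plausible smoothing written for illustration, and the threshold parameter `kappa`, the function names, and the exact piecewise form are assumptions that may differ from the paper's definition of quantile Huber.

```python
import numpy as np

def quantile_loss(r, tau):
    """Standard quantile (pinball) loss for residual r at quantile level tau in (0, 1)."""
    r = np.asarray(r, dtype=float)
    return np.maximum(tau * r, (tau - 1.0) * r)

def quantile_huber_loss(r, tau, kappa):
    """Smoothed quantile loss (assumed form): quadratic near zero, linear in the tails.

    The quadratic piece r^2 / (2 * kappa) is used on [-(1 - tau) * kappa, tau * kappa];
    outside that interval the loss is linear with slopes tau and -(1 - tau), shifted
    so that the pieces join continuously.
    """
    r = np.asarray(r, dtype=float)
    upper = tau * kappa
    lower = -(1.0 - tau) * kappa
    quad = r ** 2 / (2.0 * kappa)
    lin_pos = tau * r - kappa * tau ** 2 / 2.0
    lin_neg = (tau - 1.0) * r - kappa * (1.0 - tau) ** 2 / 2.0
    return np.where(r > upper, lin_pos, np.where(r < lower, lin_neg, quad))
```

As `kappa` shrinks toward zero, the quadratic region vanishes and the smoothed loss approaches the plain quantile loss.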
-
Minimum Distance Estimation for Robust High-Dimensional Regression
Authors:
Aurélie C. Lozano,
Nicolai Meinshausen
Abstract:
We propose a minimum distance estimation method for robust regression in sparse high-dimensional settings. The traditional likelihood-based estimators lack resilience against outliers, a critical issue when dealing with high-dimensional noisy data. Our method, Minimum Distance Lasso (MD-Lasso), combines minimum distance functionals, customarily used in nonparametric estimation for their robustness, with l1-regularization for high-dimensional regression. The geometry of MD-Lasso is key to its consistency and robustness. The estimator is governed by a scaling parameter that caps the influence of outliers: the loss per observation is locally convex and close to quadratic for small squared residuals, and flattens for squared residuals larger than the scaling parameter. As the parameter approaches infinity, the estimator becomes equivalent to least-squares Lasso. MD-Lasso enjoys fast convergence rates under mild conditions on the model error distribution, which hold for any of the solutions in a convexity region around the true parameter and in certain cases for every solution. Remarkably, a first-order optimization method is able to produce iterates very close to the consistent solutions, with geometric convergence and regardless of the initialization. A connection is established with re-weighted least-squares that intuitively explains MD-Lasso robustness. The merits of our method are demonstrated through simulation and eQTL data analysis.
Submitted 11 July, 2013;
originally announced July 2013.
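The shape of the per-observation loss described in the abstract above (close to quadratic for small residuals, flat for large ones, and quadratic in the limit of an infinite scaling parameter) can be illustrated with a simple stand-in. The exponential form below is chosen only because it has these three properties; it is not claimed to be the MD-Lasso functional itself, and the name `capped_loss` and the parameter `scale` are hypothetical.

```python
import numpy as np

def capped_loss(residual, scale):
    """Illustrative bounded loss: scale * (1 - exp(-residual^2 / (2 * scale))).

    - For residual^2 much smaller than scale it behaves like residual^2 / 2.
    - For residual^2 much larger than scale it flattens out at `scale`.
    - As scale grows without bound it tends to the squared-error loss residual^2 / 2.
    """
    residual = np.asarray(residual, dtype=float)
    # expm1 keeps precision when the exponent is tiny (large scale).
    return -scale * np.expm1(-residual ** 2 / (2.0 * scale))
```

Because the loss is bounded by `scale`, a single gross outlier can contribute at most that amount to the objective, which is the sense in which the scaling parameter caps the influence of outliers.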
-
Scalable Matrix-valued Kernel Learning for High-dimensional Nonlinear Multivariate Regression and Granger Causality
Authors:
Vikas Sindhwani,
Minh Ha Quang,
Aurelie C. Lozano
Abstract:
We propose a general matrix-valued multiple kernel learning framework for high-dimensional nonlinear multivariate regression problems. This framework allows a broad class of mixed norm regularizers, including those that induce sparsity, to be imposed on a dictionary of vector-valued Reproducing Kernel Hilbert Spaces. We develop a highly scalable and eigendecomposition-free algorithm that orchestrates two inexact solvers for simultaneously learning both the input and output components of separable matrix-valued kernels. As a key application enabled by our framework, we show how high-dimensional causal inference tasks can be naturally cast as sparse function estimation problems, leading to novel nonlinear extensions of a class of Graphical Granger Causality techniques. Our algorithmic developments and extensive empirical studies are complemented by theoretical analyses in terms of Rademacher generalization bounds.
Submitted 7 March, 2013; v1 submitted 17 October, 2012;
originally announced October 2012.