Spectral Regularization Allows Data-frugal Learning over Combinatorial Spaces
Authors:
Amirali Aghazadeh,
Nived Rajaraman,
Tony Tu,
Kannan Ramchandran
Abstract:
Data-driven machine learning models are increasingly employed in important inference problems in biology, chemistry, and physics that require learning over combinatorial spaces. Recent empirical evidence (see, e.g., [1], [2], [3]) suggests that regularizing the spectral representation of such models improves their generalization power when labeled data is scarce. However, despite these empirical studies, the theoretical underpinning of when and how spectral regularization enables improved generalization is poorly understood. In this paper, we focus on learning pseudo-Boolean functions and demonstrate that regularizing the empirical mean squared error by the L_1 norm of the spectral transform of the learned function reshapes the loss landscape and allows for data-frugal learning, under a restricted secant condition on the learner's empirical error measured against the ground truth function. Under a weaker quadratic growth condition, we show that stationary points that also approximately interpolate the training data points achieve statistically optimal generalization performance. Complementing our theory, we empirically demonstrate that running gradient descent on the regularized loss results in better generalization performance than baseline algorithms in several data-scarce real-world problems.
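The regularized objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a tabular model parameterized directly by its Walsh-Hadamard coefficients, and it swaps plain gradient descent for proximal gradient descent (ISTA), which handles the non-smooth L_1 term with a soft-threshold step. The names `hadamard`, `objective`, `fit_sparse_spectrum`, and the parameters `lam` and `steps` are hypothetical.

```python
import numpy as np

def hadamard(n):
    """Sylvester-construction Hadamard matrix of size 2^n x 2^n (entries +/-1)."""
    H = np.array([[1.0]])
    for _ in range(n):
        H = np.block([[H, H], [H, -H]])
    return H

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(alpha, A, y, lam):
    """Empirical mean squared error plus lam times the L_1 norm of the spectrum."""
    return np.mean((A @ alpha - y) ** 2) + lam * np.sum(np.abs(alpha))

def fit_sparse_spectrum(train_idx, y, n, lam=0.05, steps=2000):
    """Assumed setup: f = H @ alpha on all 2^n inputs, observed only at train_idx."""
    H = hadamard(n)
    A = H[train_idx]                      # rows of H for the labeled inputs
    m = len(train_idx)
    # Lipschitz constant of the gradient of the squared-error term.
    L = 2.0 / m * np.linalg.norm(A, 2) ** 2
    alpha = np.zeros(2 ** n)
    for _ in range(steps):
        grad = 2.0 / m * A.T @ (A @ alpha - y)      # gradient of the MSE term
        alpha = soft_threshold(alpha - grad / L, lam / L)
    return alpha, H @ alpha               # fitted spectrum and predictions on all inputs
```

In this sketch, `train_idx` holds the integer indices of the labeled inputs among the 2^n binary strings, and the second return value gives predictions for every input, including the unlabeled ones.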
Submitted 5 October, 2022;
originally announced October 2022.
Unequal Covariance Awareness for Fisher Discriminant Analysis and Its Variants in Classification
Authors:
Thu Nguyen,
Quang M. Le,
Son N. T. Tu,
Binh T. Nguyen
Abstract:
Fisher Discriminant Analysis (FDA) is one of the essential tools for feature extraction and classification, and it has motivated many improved techniques that adapt FDA to different problems or data types. However, none of these approaches accounts for the fact that FDA's assumption of equal covariance matrices is usually not satisfied in practice. We therefore propose a novel classification rule for FDA that accounts for this fact and mitigates the effect of unequal covariance matrices. Furthermore, since we modify only the classification rule, the same rule can be applied to many FDA variants, improving these algorithms further. Theoretical analysis reveals that the new classification rule allows the implicit use of the class covariance matrices while increasing the number of parameters to be estimated by only a small amount compared to going from FDA to Quadratic Discriminant Analysis. Experiments illustrate the idea and show that the modified algorithms based on our new classification rule outperform the original ones.
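For context on the comparison with Quadratic Discriminant Analysis, the following sketch counts free covariance parameters under a single shared covariance matrix versus one matrix per class. The abstract does not specify the proposed classification rule, so this does not compute its parameter count; it only shows the size of the jump to per-class covariances that the abstract uses as a reference point. The function name `covariance_param_counts` is hypothetical.

```python
def covariance_param_counts(d, k):
    """Free parameters in symmetric d x d covariance matrices for k classes."""
    per_matrix = d * (d + 1) // 2          # upper triangle including the diagonal
    return {
        "shared": per_matrix,              # one pooled covariance matrix
        "per_class": k * per_matrix,       # one covariance matrix per class
    }
```

The gap between the two counts grows linearly in the number of classes and quadratically in the feature dimension, which is why a rule that adds only a few parameters is attractive when data is limited.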
Submitted 26 May, 2022;
originally announced May 2022.