
Piecewise Polynomial Regression of Tame Functions via Integer Programming

Authors: Gilles Bareilles, Johannes Aspman, Jiri Nemecek, Jakub Marecek

Abstract: Tame functions are a class of nonsmooth, nonconvex functions, which feature in a wide range of applications: functions encountered in the training of deep neural networks with all common activations, value functions of mixed-integer programs, or wave functions of small molecules. We consider approximating tame functions with piecewise polynomial functions. We bound the quality of approximation of a tame function by a piecewise polynomial function with a given number of segments on any full-dimensional cube. We also present the first mixed-integer programming formulation of piecewise polynomial regression. Together, these can be used to estimate tame functions. We demonstrate promising computational results.

Submitted 4 June, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

arXiv:2310.04469

Taming Binarized Neural Networks and Mixed-Integer Programs

Authors: Johannes Aspman, Georgios Korpas, Jakub Marecek

Abstract: There has been a great deal of recent interest in binarized neural networks, especially because of their explainability. At the same time, automatic differentiation algorithms such as backpropagation fail for binarized neural networks, which limits their applicability. By reformulating the problem of training binarized neural networks as a subadditive dual of a mixed-integer program, we show that binarized neural networks admit a tame representation. This, in turn, makes it possible to use the framework of Bolte et al. for implicit differentiation, which offers the possibility for practical implementation of backpropagation in the context of binarized neural networks. This approach could also be used for a broader class of mixed-integer programs, beyond the training of binarized neural networks, as encountered in symbolic approaches to AI and beyond.

Submitted 20 December, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

Comments: 9 pages, 4 figures

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 2024

arXiv:2302.00709

Riemannian Stochastic Approximation for Minimizing Tame Nonsmooth Objective Functions

Authors: Johannes Aspman, Vyacheslav Kungurtsev, Reza Roohi Seraji

Abstract: In many learning applications, the parameters in a model are structurally constrained in a way that can be modeled as them lying on a Riemannian manifold. Riemannian optimization, wherein procedures to enforce an iterative minimizing sequence to be constrained to the manifold, is used to train such models. At the same time, tame geometry has become a significant topological description of nonsmooth functions that appear in the landscapes of training neural networks and other important models with structural compositions of continuous nonlinear functions with nonsmooth maps. In this paper, we study the properties of such stratifiable functions on a manifold and the behavior of retracted stochastic gradient descent, with diminishing stepsizes, for minimizing such functions.

Submitted 8 February, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

Showing 1–3 of 3 results for author: Aspman, J