Standalone Neural ODEs with Sensitivity Analysis

Authors: Rym Jaroudi, Lukáš Malý, Gabriel Eilertsen, B. Tomas Johansson, Jonas Unger, George Baravdish

Abstract: This paper presents the Standalone Neural ODE (sNODE), a continuous-depth neural ODE model capable of describing a full deep neural network. This uses a novel nonlinear conjugate gradient (NCG) descent optimization scheme for training, where the Sobolev gradient can be incorporated to improve smoothness of model weights. We also present a general formulation of the neural sensitivity problem and show how it is used in the NCG training. The sensitivity analysis provides a reliable measure of uncertainty propagation throughout a network, and can be used to study model robustness and to generate adversarial attacks. Our evaluations demonstrate that our novel formulations lead to increased robustness and performance as compared to ResNet models, and that they open up new opportunities for designing and developing machine learning with improved explainability.

Submitted 8 June, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

Comments: 25 pages, 15 figures; typos corrected

MSC Class: 68T07; 34H05; 49N45

arXiv:2202.05766 [pdf, other]

Learning via nonlinear conjugate gradients and depth-varying neural ODEs

Authors: George Baravdish, Gabriel Eilertsen, Rym Jaroudi, B. Tomas Johansson, Lukáš Malý, Jonas Unger

Abstract: The inverse problem of supervised reconstruction of depth-variable (time-dependent) parameters in a neural ordinary differential equation (NODE) is considered, that means finding the weights of a residual network with time continuous layers. The NODE is treated as an isolated entity describing the full network as opposed to earlier research, which embedded it between pre- and post-appended layers trained by conventional methods. The proposed parameter reconstruction is done for a general first order differential equation by minimizing a cost functional covering a variety of loss functions and penalty terms. A nonlinear conjugate gradient method (NCG) is derived for the minimization. Mathematical properties are stated for the differential equation and the cost functional. The adjoint problem needed is derived together with a sensitivity problem. The sensitivity problem can estimate changes in the network output under perturbation of the trained parameters. To preserve smoothness during the iterations the Sobolev gradient is calculated and incorporated. As a proof-of-concept, numerical results are included for a NODE and two synthetic datasets, and compared with standard gradient approaches (not based on NODEs). The results show that the proposed method works well for deep learning with infinite numbers of layers, and has built-in stability and smoothness.

Submitted 11 February, 2022; originally announced February 2022.

Comments: 26 pages, 3 figures

MSC Class: 68T07; 34H05; 49N45

arXiv:2012.11953 [pdf, ps, other]

Hamilton cycles in weighted Erdős-Rényi graphs

Authors: Tony Johansson

Abstract: Given a symmetric $n\times n$ matrix $P$ with $0 \le P(u, v)\le 1$, we define a random graph $G_{n, P}$ on $[n]$ by independently including any edge $\{u, v\}$ with probability $P(u, v)$. For $k\ge 1$ let $\mathcal{A}_k$ be the property of containing $\lfloor k/2 \rfloor$ Hamilton cycles, and one perfect matching if $k$ is odd, all edge-disjoint. With an eigenvalue condition on $P$, and conditions on its row sums, $G_{n, P}\in \mathcal{A}_k$ happens with high probability if and only if $G_{n, P}$ has minimum degree $k$ whp. We also provide a hitting time version. As a special case, the random graph process on pseudorandom $(n, d, μ)$-graphs with $μ\le d(d/n)^α$ for some constant $α> 0$ has property $\mathcal{A}_k$ as soon as it acquires minimum degree $k$ with high probability.

Submitted 22 December, 2020; originally announced December 2020.

arXiv:2001.05258 [pdf, ps, other]

A condition for Hamiltonicity in Sparse Random Graphs with a Fixed Degree Sequence

Authors: Tony Johansson

Abstract: We consider the random graph $G_{n, {\bf d}}$ chosen uniformly at random from the set of all graphs with a given sparse degree sequence ${\bf d}$. We assume ${\bf d}$ has minimum degree at least 4, at most a power law tail, and place one more condition on its tail. For $k\ge 2$ define $β_k(G) = \max e(A, B) + k(|A|-|B|) - d(A)$, with the maximum taken over disjoint vertex sets $A, B$. It is shown that the problem of determining if $G_{n, {\bf d}}$ contains a Hamilton cycle reduces to calculating $β_2(G_{n, {\bf d}})$. If $k\ge 2$ and $δ\ge k+2$, the problem of determining if $G_{n, {\bf d}}$ contains a $k$-factor reduces to calculating $β_k(G_{n, {\bf d}})$.

Submitted 15 January, 2020; originally announced January 2020.

arXiv:1901.02328 [pdf, ps, other]

doi 10.1007/s00453-019-00667-5

Embedding small digraphs and permutations in binary trees and split trees

Authors: Michael Albert, Cecilia Holmgren, Tony Johansson, Fiona Skerman

Abstract: We investigate the number of permutations that occur in random labellings of trees. This is a generalisation of the number of subpermutations occurring in a random permutation. It also generalises some recent results on the number of inversions in randomly labelled trees. We consider complete binary trees as well as random split trees, a large class of random trees of logarithmic height introduced by Devroye in 1998. Split trees consist of nodes (bags) which can contain balls and are generated by a random trickle down process of balls through the nodes. For complete binary trees we show that asymptotically the cumulants of the number of occurrences of a fixed permutation in the random node labelling have explicit formulas. Our other main theorem is to show that for a random split tree, with high probability the cumulants of the number of occurrences are asymptotically an explicit parameter of the split tree. For the proof of the second theorem we show some results on the number of embeddings of digraphs into split trees which may be of independent interest.

Submitted 6 May, 2019; v1 submitted 8 January, 2019; originally announced January 2019.

arXiv:1811.03501 [pdf, ps, other]

On Hamilton cycles in Erdős-Rényi subgraphs of large graphs

Authors: Tony Johansson

Abstract: Given a graph $Γ= (V, E)$ on $n$ vertices and $m$ edges, we define the Erdős-Rényi graph process with host $Γ$ as follows. A permutation $e_1,\dots,e_m$ of $E$ is chosen uniformly at random, and for $t\leq m$ we let $Γ_t = (V, \{e_1,\dots,e_t\})$. Suppose the minimum degree of $Γ$ is $δ(Γ) \geq (1/2 + \varepsilon)n$ for some constant $\varepsilon > 0$. Then with high probability, $Γ_t$ becomes Hamiltonian at the same moment that its minimum degree becomes at least two. Given $0\leq p\leq 1$ we let $Γ_p$ be the Erdős-Rényi subgraph of $Γ$, obtained by retaining each edge independently with probability $p$. When $δ(Γ)\geq (1/2 + \varepsilon)n$, we provide a threshold function $p_0$ for Hamiltonicity, such that if $(p-p_0)n\to -\infty$ then $Γ_p$ is not Hamiltonian whp, and if $(p-p_0)n\to\infty$ then $Γ_p$ is Hamiltonian whp.

Submitted 8 November, 2018; originally announced November 2018.

arXiv:1809.11012 [pdf, other]

doi 10.1016/j.cam.2019.112463

On a boundary integral solution of a lateral planar Cauchy problem in elastodynamics

Authors: Roman Chapko, B. Tomas Johansson, Leonidas Mindrinos

Abstract: A boundary integral based method for the stable reconstruction of missing boundary data is presented for the governing hyperbolic equation of elastodynamics in annular planar domains. Cauchy data in the form of the solution and traction is reconstructed on the inner boundary curve from the similar data given on the outer boundary. The ill-posed data reconstruction problem is reformulated as a sequence of boundary integral equations using the Laguerre transform with respect to time and employing a single-layer approach for the stationary problem. Singularities of the involved kernels in the integrals are analysed and made explicit, and standard quadrature rules are used for discretisation. Tikhonov regularization is employed for the stable solution of the obtained linear system. Numerical results are included showing that the outlined approach can be turned into a practical working method for finding the missing data.

Submitted 28 September, 2018; originally announced September 2018.

Comments: 26 pages, 2 Figures, 8 Tables

Journal ref: Journal of Computational and Applied Mathematics, 367, 112463, 2020

arXiv:1805.05780 [pdf, ps, other]

The cover time of a biased random walk on a random regular graph of odd degree

Authors: Tony Johansson

Abstract: We consider a random walk process which prefers to visit previously unvisited edges, on the random $r$-regular graph $G_r$ for any odd $r\geq 3$. We show that this random walk process has asymptotic vertex and edge cover times $\frac{1}{r-2}n\log n$ and $\frac{r}{2(r-2)}n\log n$, respectively, generalizing the result from Cooper, Frieze and Johansson from $r = 3$ to any larger odd $r$. This completes the study of the vertex cover time for fixed $r\geq 3$, with Berenbrink, Cooper and Friedetzky having previously shown that $G_r$ has vertex cover time asymptotic to $\frac{rn}{2}$ when $r\geq 4$ is even.

Submitted 12 May, 2018; originally announced May 2018.

Comments: arXiv admin note: text overlap with arXiv:1801.00760

arXiv:1801.00760 [pdf, other]

The cover time of a biased random walk on a random cubic graph

Authors: Colin Cooper, Alan Frieze, Tony Johansson

Abstract: We study a random walk that prefers to use unvisited edges in the context of random cubic graphs. We establish asymptotically correct estimates for the vertex and edge cover times, these being $\approx n\log n$ and $\approx \frac32n\log n$ respectively.

Submitted 3 January, 2018; v1 submitted 2 January, 2018; originally announced January 2018.

arXiv:1709.00216 [pdf, other]

doi 10.1017/S0963548318000512

Inversions in split trees and conditional Galton--Watson trees

Authors: Xing Shi Cai, Cecilia Holmgren, Svante Janson, Tony Johansson, Fiona Skerman

Abstract: We study $I(T)$, the number of inversions in a tree $T$ with its vertices labeled uniformly at random, which is a generalization of inversions in permutations. We first show that the cumulants of $I(T)$ have explicit formulas involving the $k$-total common ancestors of $T$ (an extension of the total path length). Then we consider $X_n$, the normalized version of $I(T_n)$, for a sequence of trees $T_n$. For fixed $T_{n}$'s, we prove a sufficient condition for $X_n$ to converge in distribution. As an application, we identify the limit of $X_n$ for complete $b$-ary trees. For $T_n$ being split trees, we show that $X_n$ converges to the unique solution of a distributional equation. Finally, when $T_n$'s are conditional Galton--Watson trees, we show that $X_n$ converges to a random variable defined in terms of Brownian excursions. By exploiting the connection between inversions and the total path length, we are able to give results that are stronger and much broader compared to previous work by Panholzer and Seitz.

Submitted 3 September, 2018; v1 submitted 1 September, 2017; originally announced September 2017.

Comments: 28 pages, 1 figure

MSC Class: 60C05

arXiv:1610.04588 [pdf, other]

Deletion of oldest edges in a preferential attachment graph

Authors: Tony Johansson

Abstract: We consider a variation on the Barabási-Albert random graph process with fixed parameters $m\in \mathbb{N}$ and $1/2 < p < 1$. With probability $p$ a vertex is added along with $m$ edges, randomly chosen proportional to vertex degrees. With probability $1 - p$, the oldest vertex still holding its original $m$ edges loses those edges. It is shown that the degree of any vertex either is zero or follows a geometric distribution. If $p$ is above a certain threshold, this leads to a power law for the degree sequence, while a smaller $p$ gives exponential tails. It is also shown that the graph contains a unique giant component whp if and only if $m\geq 2$.

Submitted 17 November, 2016; v1 submitted 14 October, 2016; originally announced October 2016.

Comments: 38 pages. Slightly improved results

arXiv:1602.04652 [pdf, ps, other]

On the insertion time of random walk cuckoo hashing

Authors: Alan Frieze, Tony Johansson

Abstract: Cuckoo Hashing is a hashing scheme invented by Pagh and Rodler. It uses $d\geq 2$ distinct hash functions to insert items into the hash table. It has been an open question for some time as to the expected time for Random Walk Insertion to add items. We show that if the number of hash functions $d=O(1)$ is sufficiently large, then the expected insertion time is $O(1)$ per item.

Submitted 8 January, 2017; v1 submitted 15 February, 2016; originally announced February 2016.

Comments: 9 pages

arXiv:1505.03429 [pdf, ps, other]

On edge disjoint spanning trees in a randomly weighted complete graph

Authors: Alan Frieze, Tony Johansson

Abstract: Assume that the edges of the complete graph $K_n$ are given independent uniform $[0,1]$ edge weights. We consider the expected minimum total weight $μ_k$ of $k\geq 2$ edge disjoint spanning trees. When $k$ is large we show that $μ_k\approx k^2$. Most of the paper is concerned with the case $k=2$. We show that $μ_2$ tends to an explicitly defined constant and that $μ_2\approx 4.1704288\ldots$.

Submitted 22 June, 2017; v1 submitted 13 May, 2015; originally announced May 2015.

Comments: Fixed minor issues

arXiv:1504.00312 [pdf, ps, other]

Minimum-cost matching in a random graph with random costs

Authors: Alan Frieze, Tony Johansson

Abstract: Let $G_{n,p}$ be the standard Erdős-Rényi-Gilbert random graph and let $G_{n,n,p}$ be the random bipartite graph on $n+n$ vertices, where each $e\in [n]^2$ appears as an edge independently with probability $p$. For a graph $G=(V,E)$, suppose that each edge $e\in E$ is given an independent uniform exponential rate one cost. Let $C(G)$ denote the random variable equal to the length of the minimum cost perfect matching, assuming that $G$ contains at least one. We show that if $d=np\gg(\log n)^2$ then w.h.p. ${\bf E}[C(G_{n,n,p})] =(1+o(1))\frac{\pi^2}{6p}$. This generalises the well-known result for the case $G=K_{n,n}$. We also show that w.h.p. ${\bf E}[C(G_{n,p})] =(1+o(1))\frac{\pi^2}{12p}$ along with concentration results for both types of random graph.

Submitted 17 November, 2015; v1 submitted 1 April, 2015; originally announced April 2015.

Comments: Replaces an earlier paper where $G$ was an arbitrary regular bipartite graph

arXiv:1405.2129 [pdf, other]

On random k-out sub-graphs of large graphs

Authors: Alan Frieze, Tony Johansson

Abstract: We consider random sub-graphs of a fixed graph $G=(V,E)$ with large minimum degree. We fix a positive integer $k$ and let $G_k$ be the random sub-graph where each $v\in V$ independently chooses $k$ random neighbors, making $kn$ edges in all. When the minimum degree $δ(G)\geq (\frac12+ε)n,\,n=|V|$ then $G_k$ is $k$-connected w.h.p. for $k=O(1)$; Hamiltonian for $k$ sufficiently large. When $δ(G) \geq m$, then $G_k$ has a cycle of length $(1-ε)m$ for $k\geq k_ε$. By w.h.p. we mean that the probability of non-occurrence can be bounded by a function $φ(n)$ (or $φ(m)$) where $\lim_{n\to\infty}φ(n)=0$.

Submitted 8 May, 2014; originally announced May 2014.

arXiv:1210.7346 [pdf, ps, other]

A note on uniqueness in the identification of a spacewise dependent source and diffusion coefficient for the heat equation

Authors: Adriano De Cezaro, B. Tomas Johansson

Abstract: We investigate uniqueness in the inverse problem of reconstructing simultaneously a spacewise conductivity function and a heat source in the parabolic heat equation from the usual conditions of the direct problem and additional information from a supplementary temperature measurement at a given single instant of time. In the multi-dimensional case, we use Carleman estimates for parabolic equations to obtain a uniqueness result. The given data and the solution domain are sufficiently smooth such that the required norms and the derivatives of the conductivity, the source and the solution of the parabolic heat equation exist and are continuous throughout the solution domain. These assumptions can be further relaxed using more involved estimates and techniques but these lengthy details are not included. Instead, in the special case of the one-dimensional heat equation, we give an alternative and rather straightforward proof of uniqueness of the inverse problem, based on integral representations of the solution together with density results for solutions of the corresponding adjoint problem. In this case, the required regularity conditions on the conductivity, source and the solution of the parabolic heat equation are weakened to classes of integrable functions. Keywords: uniqueness; spacewise conductivity and source; final time measurements; heat equation; Carleman estimates.

Submitted 27 October, 2012; originally announced October 2012.

Comments: Submitted in October of 2012

MSC Class: 35R30; 35A02; 65M32; 65L09; 74F05; 35K05

Showing 1–16 of 16 results for author: Johansson, T