-
Subspace embedding with random Khatri-Rao products and its application to eigensolvers
Authors:
Zvonimir Bujanović,
Luka Grubišić,
Daniel Kressner,
Hei Yin Lam
Abstract:
Various iterative eigenvalue solvers have been developed to compute parts of the spectrum for a large sparse matrix, including the power method, Krylov subspace methods, contour integral methods, and preconditioned solvers such as the so-called LOBPCG method. All of these solvers rely on random matrices to determine, e.g., starting vectors that have, with high probability, a non-negligible overlap with the eigenvectors of interest. For this purpose, unstructured Gaussian random matrices are a safe and common choice. In this work, we investigate the use of random Khatri-Rao products in eigenvalue solvers. On the one hand, we establish a novel subspace embedding property that provides theoretical justification for the use of such structured random matrices. On the other hand, we highlight the potential algorithmic benefits when solving eigenvalue problems with Kronecker product structure, as they arise frequently from the discretization of eigenvalue problems for differential operators on tensor product domains. In particular, we consider the use of random Khatri-Rao products within a contour integral method and LOBPCG. Numerical experiments indicate that the gains for the contour integral method strongly depend on the ability to efficiently and accurately solve (shifted) matrix equations with low-rank right-hand side. The flexibility of LOBPCG to directly employ preconditioners makes it easier to benefit from Khatri-Rao product structure, at the expense of having less theoretical justification.
Submitted 20 May, 2024;
originally announced May 2024.
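Illustration (not from the paper): a minimal NumPy sketch of what a random Khatri-Rao product looks like and why it pairs well with Kronecker-structured operators. The helper name `khatri_rao`, the sizes, and the dense test matrices `A` and `B` are assumptions chosen for the example; the paper's actual algorithms may differ.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product: column j equals kron(X[:, j], Y[:, j])."""
    n1, k = X.shape
    n2, k2 = Y.shape
    if k != k2:
        raise ValueError("X and Y must have the same number of columns")
    # Entry (i*n2 + j, c) of the result is X[i, c] * Y[j, c].
    return (X[:, None, :] * Y[None, :, :]).reshape(n1 * n2, k)

rng = np.random.default_rng(0)
n1, n2, k = 6, 5, 4  # hypothetical sizes

# Random Khatri-Rao product built from two small Gaussian factors.
X = rng.standard_normal((n1, k))
Y = rng.standard_normal((n2, k))
Omega = khatri_rao(X, Y)  # shape (n1 * n2, k)

# For a Kronecker-structured operator, (A kron B)(x kron y) = (A x) kron (B y),
# so the product can be formed from the small factors alone.
A = rng.standard_normal((n1, n1))
B = rng.standard_normal((n2, n2))
dense = np.kron(A, B) @ Omega            # forms the large matrix explicitly
structured = khatri_rao(A @ X, B @ Y)    # works only with the small factors
```

Storing the factors `X` and `Y` takes `(n1 + n2) * k` numbers instead of `n1 * n2 * k`, which is the kind of saving that motivates structured random matrices for problems on tensor product domains.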
-
The Past, Present, and Future of Plant Stress Research
Authors:
Eugene Koh,
Rohan Shawn Sunil,
Hilbert Yuen In Lam,
Malavika SujaMdharan,
Monika Chodasiewicz,
Marek Mutwil
Abstract:
Life finds a way. For sessile organisms like plants, the need to adapt to changes in the environment is even more poignant. For humanity, the need to develop crops that can grow in diverse environments and feed our growing population is an existential one. The development of fast-growing, high-yielding crop varieties sparked the Green Revolution, and the advent of the genomics era enabled the development of customized transgenic crops enhanced for specific traits or resistances. Today, the proliferation of artificial intelligence (AI) allows scientists to rapidly screen through massive and complex datasets to uncover elusive patterns in the data, enabling us to create more robust and faster models for prediction and hypothesis generation in a bid to develop more stress-resilient plants. This review aims to provide an overview of the evolution of environmental stress research across the plant kingdom over the past fifty years. It will cover historical landmark concepts and discoveries that were seminal in advancing the field, provide a global snapshot of our current scientific progress, and conclude with a discussion on the advent of AI tools that would help accelerate scientific discovery.
Submitted 24 April, 2024;
originally announced April 2024.
-
Large Language Models in Plant Biology
Authors:
Hilbert Yuen In Lam,
Xing Er Ong,
Marek Mutwil
Abstract:
Large Language Models (LLMs), such as ChatGPT, have taken the world by storm and have passed certain forms of the Turing test. However, LLMs are not limited to human language and can also analyze sequential data, such as DNA, protein, and gene expression. The resulting foundation models can be repurposed to identify the complex patterns within the data, resulting in powerful, multi-purpose prediction tools able to explain cellular systems. This review outlines the different types of LLMs and showcases their recent uses in biology. Since LLMs have not yet been embraced by the plant community, we also cover how these models can be deployed for the plant kingdom.
Submitted 5 January, 2024;
originally announced January 2024.
-
Randomized low-rank approximation of parameter-dependent matrices
Authors:
Daniel Kressner,
Hei Yin Lam
Abstract:
This work considers the low-rank approximation of a matrix $A(t)$ depending on a parameter $t$ in a compact set $D \subset \mathbb{R}^d$. Application areas that give rise to such problems include computational statistics and dynamical systems. Randomized algorithms are an increasingly popular approach for performing low-rank approximation and they usually proceed by multiplying the matrix with random dimension reduction matrices (DRMs). Applying such algorithms directly to $A(t)$ would involve different, independent DRMs for every $t$, which is not only expensive but also leads to inherently non-smooth approximations. In this work, we propose to use constant DRMs, that is, $A(t)$ is multiplied with the same DRM for every $t$. The resulting parameter-dependent extensions of two popular randomized algorithms, the randomized singular value decomposition and the generalized Nyström method, are computationally attractive, especially when $A(t)$ admits an affine linear decomposition with respect to $t$. We perform a probabilistic analysis for both algorithms, deriving bounds on the expected value as well as failure probabilities for the approximation error when using Gaussian random DRMs. Both the theoretical results and the numerical experiments show that the use of constant DRMs does not impair their effectiveness; our methods reliably return quasi-best low-rank approximations.
Submitted 17 April, 2024; v1 submitted 24 February, 2023;
originally announced February 2023.
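Illustration (not from the paper): a minimal NumPy sketch of a randomized range finder that reuses one Gaussian DRM for every parameter value. The affine family `A(t) = A0 + t * A1`, the sizes, and the function name `rsvd_constant_drm` are assumptions made for the example. The test matrices are exactly low rank, so the error here sits at roundoff level; this says nothing about the general error bounds derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r, p = 60, 50, 5, 5  # hypothetical sizes; p is the oversampling

# Hypothetical affine family A(t) = A0 + t * A1 with rank-r terms,
# so A(t) has rank at most 2 * r for every t.
A0 = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
A1 = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

def A(t):
    return A0 + t * A1

# One constant DRM, drawn once and shared by all parameter values.
Omega = rng.standard_normal((n, 2 * r + p))

# With an affine decomposition, each term is sketched only once.
S0, S1 = A0 @ Omega, A1 @ Omega

def rsvd_constant_drm(t):
    """Approximate A(t) by Q Q^T A(t), where Q spans the range of A(t) @ Omega."""
    Q, _ = np.linalg.qr(S0 + t * S1)  # equals a QR factorization of A(t) @ Omega
    return Q @ (Q.T @ A(t))

# Relative Frobenius-norm errors on a few sample parameters.
errors = [
    np.linalg.norm(A(t) - rsvd_constant_drm(t)) / np.linalg.norm(A(t))
    for t in np.linspace(0.0, 1.0, 5)
]
```

Because `S0` and `S1` are computed once, evaluating the sketch for a new `t` costs only a scaled sum of two small matrices rather than a fresh product with a new random matrix.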