Search | arXiv e-print repository

Physics-Informed Heterogeneous Graph Neural Networks for DC Blocker Placement

Authors: Hongwei **, Prasanna Balaprakash, Allen Zou, Pieter Ghysels, Aditi S. Krishnapriyan, Adam Mate, Arthur Barnes, Russell Bent

Abstract: The threat of geomagnetic disturbances (GMDs) to the reliable operation of the bulk energy system has spurred the development of effective strategies for mitigating their impacts. One such approach involves placing transformer neutral blocking devices, which interrupt the path of geomagnetically induced currents (GICs) to limit their impact. The high cost of these devices and the sparsity of trans… ▽ More The threat of geomagnetic disturbances (GMDs) to the reliable operation of the bulk energy system has spurred the development of effective strategies for mitigating their impacts. One such approach involves placing transformer neutral blocking devices, which interrupt the path of geomagnetically induced currents (GICs) to limit their impact. The high cost of these devices and the sparsity of transformers that experience high GICs during GMD events, however, calls for a sparse placement strategy that involves high computational cost. To address this challenge, we developed a physics-informed heterogeneous graph neural network (PIHGNN) for solving the graph-based dc-blocker placement problem. Our approach combines a heterogeneous graph neural network (HGNN) with a physics-informed neural network (PINN) to capture the diverse types of nodes and edges in ac/dc networks and incorporates the physical laws of the power grid. We train the PIHGNN model using a surrogate power flow model and validate it using case studies. Results demonstrate that PIHGNN can effectively and efficiently support the deployment of GIC dc-current blockers, ensuring the continued supply of electricity to meet societal demands. Our approach has the potential to contribute to the development of more reliable and resilient power grids capable of withstanding the growing threat that GMDs pose. △ Less

Submitted 16 May, 2024; originally announced May 2024.

Comments: Paper is accepted by PSCC 2024

arXiv:2110.08614 [pdf, other]

Deep Learning and Spectral Embedding for Graph Partitioning

Authors: Alice Gatti, Zhixiong Hu, Tess Smidt, Esmond G. Ng, Pieter Ghysels

Abstract: We present a graph bisection and partitioning algorithm based on graph neural networks. For each node in the graph, the network outputs probabilities for each of the partitions. The graph neural network consists of two modules: an embedding phase and a partitioning phase. The embedding phase is trained first by minimizing a loss function inspired by spectral graph theory. The partitioning module i… ▽ More We present a graph bisection and partitioning algorithm based on graph neural networks. For each node in the graph, the network outputs probabilities for each of the partitions. The graph neural network consists of two modules: an embedding phase and a partitioning phase. The embedding phase is trained first by minimizing a loss function inspired by spectral graph theory. The partitioning module is trained through a loss function that corresponds to the expected value of the normalized cut. Both parts of the neural network rely on SAGE convolutional layers and graph coarsening using heavy edge matching. The multilevel structure of the neural network is inspired by the multigrid algorithm. Our approach generalizes very well to bigger graphs and has partition quality comparable to METIS, Scotch and spectral partitioning, with shorter runtime compared to METIS and spectral partitioning. △ Less

Submitted 8 December, 2021; v1 submitted 16 October, 2021; originally announced October 2021.

arXiv:2104.03546 [pdf, other]

Graph Partitioning and Sparse Matrix Ordering using Reinforcement Learning and Graph Neural Networks

Authors: Alice Gatti, Zhixiong Hu, Tess Smidt, Esmond G. Ng, Pieter Ghysels

Abstract: We present a novel method for graph partitioning, based on reinforcement learning and graph convolutional neural networks. Our approach is to recursively partition coarser representations of a given graph. The neural network is implemented using SAGE graph convolution layers, and trained using an advantage actor critic (A2C) agent. We present two variants, one for finding an edge separator that mi… ▽ More We present a novel method for graph partitioning, based on reinforcement learning and graph convolutional neural networks. Our approach is to recursively partition coarser representations of a given graph. The neural network is implemented using SAGE graph convolution layers, and trained using an advantage actor critic (A2C) agent. We present two variants, one for finding an edge separator that minimizes the normalized cut or quotient cut, and one that finds a small vertex separator. The vertex separators are then used to construct a nested dissection ordering to permute a sparse matrix so that its triangular factorization will incur less fill-in. The partitioning quality is compared with partitions obtained using METIS and SCOTCH, and the nested dissection ordering is evaluated in the sparse solver SuperLU. Our results show that the proposed method achieves similar partitioning quality as METIS and SCOTCH. Furthermore, the method generalizes across different classes of graphs, and works well on a variety of graphs from the SuiteSparse sparse matrix collection. △ Less

Submitted 28 June, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

arXiv:2007.00202 [pdf, other]

doi 10.1137/20M1349667

Sparse Approximate Multifrontal Factorization with Butterfly Compression for High Frequency Wave Equations

Authors: Yang Liu, Pieter Ghysels, Lisa Claus, Xiaoye Sherry Li

Abstract: We present a fast and approximate multifrontal solver for large-scale sparse linear systems arising from finite-difference, finite-volume or finite-element discretization of high-frequency wave equations. The proposed solver leverages the butterfly algorithm and its hierarchical matrix extension for compressing and factorizing large frontal matrices via graph-distance guided entry evaluation or ra… ▽ More We present a fast and approximate multifrontal solver for large-scale sparse linear systems arising from finite-difference, finite-volume or finite-element discretization of high-frequency wave equations. The proposed solver leverages the butterfly algorithm and its hierarchical matrix extension for compressing and factorizing large frontal matrices via graph-distance guided entry evaluation or randomized matrix-vector multiplication-based schemes. Complexity analysis and numerical experiments demonstrate $\mathcal{O}(N\log^2 N)$ computation and $\mathcal{O}(N)$ memory complexity when applied to an $N\times N$ sparse system arising from 3D high-frequency Helmholtz and Maxwell problems. △ Less

Submitted 18 October, 2021; v1 submitted 30 June, 2020; originally announced July 2020.

MSC Class: 15A23; 65F50; 65R10; 65R20

Journal ref: SIAM Journal on Scientific Computing. 2021(0):S367-91

arXiv:2002.03400 [pdf, other]

Butterfly factorization via randomized matrix-vector multiplications

Authors: Yang Liu, Xin Xing, Han Guo, Eric Michielssen, Pieter Ghysels, Xiaoye Sherry Li

Abstract: This paper presents an adaptive randomized algorithm for computing the butterfly factorization of a $m\times n$ matrix with $m\approx n$ provided that both the matrix and its transpose can be rapidly applied to arbitrary vectors. The resulting factorization is composed of $O(\log n)$ sparse factors, each containing $O(n)$ nonzero entries. The factorization can be attained using $O(n^{3/2}\log n)$… ▽ More This paper presents an adaptive randomized algorithm for computing the butterfly factorization of a $m\times n$ matrix with $m\approx n$ provided that both the matrix and its transpose can be rapidly applied to arbitrary vectors. The resulting factorization is composed of $O(\log n)$ sparse factors, each containing $O(n)$ nonzero entries. The factorization can be attained using $O(n^{3/2}\log n)$ computation and $O(n\log n)$ memory resources. The proposed algorithm applies to matrices with strong and weak admissibility conditions arising from surface integral equation solvers with a rigorous error bound, and is implemented in parallel. △ Less

Submitted 9 February, 2020; originally announced February 2020.

arXiv:1905.06850 [pdf, other]

Improving strong scaling of the Conjugate Gradient method for solving large linear systems using global reduction pipelining

Authors: Siegfried Cools, Jeffrey Cornelis, Pieter Ghysels, Wim Vanroose

Abstract: This paper presents performance results comparing MPI-based implementations of the popular Conjugate Gradient (CG) method and several of its communication hiding (or 'pipelined') variants. Pipelined CG methods are designed to efficiently solve SPD linear systems on massively parallel distributed memory hardware, and typically display significantly improved strong scaling compared to classic CG. Th… ▽ More This paper presents performance results comparing MPI-based implementations of the popular Conjugate Gradient (CG) method and several of its communication hiding (or 'pipelined') variants. Pipelined CG methods are designed to efficiently solve SPD linear systems on massively parallel distributed memory hardware, and typically display significantly improved strong scaling compared to classic CG. This increase in parallel performance is achieved by overlap** the global reduction phase (MPI_Iallreduce) required to compute the inner products in each iteration by (chiefly local) computational work such as the matrix-vector product as well as other global communication. This work includes a brief introduction to the deep pipelined CG method for readers that may be unfamiliar with the specifics of the method. A brief overview of implementation details provides the practical tools required for implementation of the algorithm. Subsequently, easily reproducible strong scaling results on the US Department of Energy (DoE) NERSC machine 'Cori' (Phase I - Haswell nodes) on up to 1024 nodes with 16 MPI ranks per node are presented using an implementation of p(l)-CG that is available in the open source PETSc library. Observations on the staggering and overlap of the asynchronous, non-blocking global communication phases with communication and computational kernels are drawn from the experiments. △ Less

Submitted 15 May, 2019; originally announced May 2019.

Comments: EuroMPI 2019, 10 - 13 September 2019, ETH Zurich, Switzerland, 11 pages, 4 figures, 1 table. arXiv admin note: substantial text overlap with arXiv:1902.03100

MSC Class: 65F10; 65N12; 65G50; 65Y05; 65N22

arXiv:1810.04125 [pdf, other]

Matrix-free construction of HSS representation using adaptive randomized sampling

Authors: Christopher Gorman, Gustavo Chávez, Pieter Ghysels, Théo Mary, François-Henry Rouet, Xiaoye Sherry Li

Abstract: We present new algorithms for the randomized construction of hierarchically semi-separable matrices, addressing several practical issues. The HSS construction algorithms use a partially matrix-free, adaptive randomized projection scheme to determine the maximum off-diagonal block rank. We develop both relative and absolute stop** criteria to determine the minimum dimension of the random projecti… ▽ More We present new algorithms for the randomized construction of hierarchically semi-separable matrices, addressing several practical issues. The HSS construction algorithms use a partially matrix-free, adaptive randomized projection scheme to determine the maximum off-diagonal block rank. We develop both relative and absolute stop** criteria to determine the minimum dimension of the random projection matrix that is sufficient for the desired accuracy. Two strategies are discussed to adaptively enlarge the random sample matrix: repeated doubling of the number of random vectors, and iteratively incrementing the number of random vectors by a fixed number. The relative and absolute stop** criteria are based on probabilistic bounds for the Frobenius norm of the random projection of the Hankel blocks of the input matrix. We discuss parallel implementation and computation and communication cost of both variants. Parallel numerical results for a range of applications, including boundary element method matrices and quantum chemistry Toeplitz matrices, show the effectiveness, scalability and numerical robustness of the proposed algorithms. △ Less

Submitted 11 October, 2018; v1 submitted 9 October, 2018; originally announced October 2018.

Comments: 24 pages, 4 figures, 20 references

arXiv:1803.10274 [pdf, other]

A Study of Clustering Techniques and Hierarchical Matrix Formats for Kernel Ridge Regression

Authors: Elizaveta Rebrova, Gustavo Chavez, Yang Liu, Pieter Ghysels, Xiaoye Sherry Li

Abstract: We present memory-efficient and scalable algorithms for kernel methods used in machine learning. Using hierarchical matrix approximations for the kernel matrix the memory requirements, the number of floating point operations, and the execution time are drastically reduced compared to standard dense linear algebra routines. We consider both the general $\mathcal{H}$ matrix hierarchical format as we… ▽ More We present memory-efficient and scalable algorithms for kernel methods used in machine learning. Using hierarchical matrix approximations for the kernel matrix the memory requirements, the number of floating point operations, and the execution time are drastically reduced compared to standard dense linear algebra routines. We consider both the general $\mathcal{H}$ matrix hierarchical format as well as Hierarchically Semi-Separable (HSS) matrices. Furthermore, we investigate the impact of several preprocessing and clustering techniques on the hierarchical matrix compression. Effective clustering of the input leads to a ten-fold increase in efficiency of the compression. The algorithms are implemented using the STRUMPACK solver library. These results confirm that --- with correct tuning of the hyperparameters --- classification using kernel ridge regression with the compressed matrix does not lose prediction accuracy compared to the exact --- not compressed --- kernel matrix and that our approach can be extended to $\mathcal{O}(1M)$ datasets, for which computation with the full kernel matrix becomes prohibitively expensive. We present numerical experiments in a distributed memory environment up to 1,024 processors of the NERSC's Cori supercomputer using well-known datasets to the machine learning community that range from dimension 8 up to 784. △ Less

Submitted 27 March, 2018; originally announced March 2018.

Comments: 10 pages, 8 figures

Journal ref: IPDPS workshops 2018

arXiv:1503.05464 [pdf, ps, other]

A distributed-memory package for dense Hierarchically Semi-Separable matrix computations using randomization

Authors: François-Henry Rouet, Xiaoye S. Li, Pieter Ghysels, Artem Napov

Abstract: We present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable representations (HSS). Such matrices appear in many applications, e.g., finite element methods, boundary element methods, etc. Exploiti… ▽ More We present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable representations (HSS). Such matrices appear in many applications, e.g., finite element methods, boundary element methods, etc. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, that computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. This work is part of a more global effort, the STRUMPACK (STRUctured Matrices PACKage) software package for computations with sparse and dense structured matrices. Hence, although useful on their own right, the routines also represent a step in the direction of a distributed-memory sparse solver. △ Less

Submitted 26 June, 2015; v1 submitted 18 March, 2015; originally announced March 2015.

arXiv:1502.07405 [pdf, ps, other]

An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

Authors: Pieter Ghysels, Xiaoye S. Li, Francois-Henry Rouet, Samuel Williams, Artem Napov

Abstract: We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination, and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with inte… ▽ More We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination, and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7 fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK -- STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices. △ Less

Submitted 25 February, 2015; originally announced February 2015.

Showing 1–10 of 10 results for author: Ghysels, P