Search | arXiv e-print repository

Improving Antibody Design with Force-Guided Sampling in Diffusion Models

Authors: Paulina Kulytė, Francisco Vargas, Simon Valentin Mathis, Yu Guang Wang, José Miguel Hernández-Lobato, Pietro Liò

Abstract: Antibodies, crucial for immune defense, primarily rely on complementarity-determining regions (CDRs) to bind and neutralize antigens, such as viruses. The design of these CDRs determines the antibody's affinity and specificity towards its target. Generative models, particularly denoising diffusion probabilistic models (DDPMs), have shown potential to advance the structure-based design of CDR regio… ▽ More Antibodies, crucial for immune defense, primarily rely on complementarity-determining regions (CDRs) to bind and neutralize antigens, such as viruses. The design of these CDRs determines the antibody's affinity and specificity towards its target. Generative models, particularly denoising diffusion probabilistic models (DDPMs), have shown potential to advance the structure-based design of CDR regions. However, only a limited dataset of bound antibody-antigen structures is available, and generalization to out-of-distribution interfaces remains a challenge. Physics based force-fields, which approximate atomic interactions, offer a coarse but universal source of information to better mold designs to target interfaces. Integrating this foundational information into diffusion models is, therefore, highly desirable. Here, we propose a novel approach to enhance the sampling process of diffusion models by integrating force field energy-based feedback. Our model, DiffForce, employs forces to guide the diffusion sampling process, effectively blending the two distributions. Through extensive experiments, we demonstrate that our method guides the model to sample CDRs with lower energy, enhancing both the structure and sequence of the generated antibodies. △ Less

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.05540 [pdf, other]

A Fine-tuning Dataset and Benchmark for Large Language Models for Protein Understanding

Authors: Yiqing Shen, Zan Chen, Michail Mamalakis, Luhan He, Haiyang Xia, Tianbin Li, Yanzhou Su, Junjun He, Yu Guang Wang

Abstract: The parallels between protein sequences and natural language in their sequential structures have inspired the application of large language models (LLMs) to protein understanding. Despite the success of LLMs in NLP, their effectiveness in comprehending protein sequences remains an open question, largely due to the absence of datasets linking protein sequences to descriptive text. Researchers have… ▽ More The parallels between protein sequences and natural language in their sequential structures have inspired the application of large language models (LLMs) to protein understanding. Despite the success of LLMs in NLP, their effectiveness in comprehending protein sequences remains an open question, largely due to the absence of datasets linking protein sequences to descriptive text. Researchers have then attempted to adapt LLMs for protein understanding by integrating a protein sequence encoder with a pre-trained LLM. However, this adaptation raises a fundamental question: "Can LLMs, originally designed for NLP, effectively comprehend protein sequences as a form of language?" Current datasets fall short in addressing this question due to the lack of a direct correlation between protein sequences and corresponding text descriptions, limiting the ability to train and evaluate LLMs for protein understanding effectively. To bridge this gap, we introduce ProteinLMDataset, a dataset specifically designed for further self-supervised pretraining and supervised fine-tuning (SFT) of LLMs to enhance their capability for protein sequence comprehension. Specifically, ProteinLMDataset includes 17.46 billion tokens for pretraining and 893,000 instructions for SFT. Additionally, we present ProteinLMBench, the first benchmark dataset consisting of 944 manually verified multiple-choice questions for assessing the protein understanding capabilities of LLMs. ProteinLMBench incorporates protein-related details and sequences in multiple languages, establishing a new standard for evaluating LLMs' abilities in protein comprehension. The large language model InternLM2-7B, pretrained and fine-tuned on the ProteinLMDataset, outperforms GPT-4 on ProteinLMBench, achieving the highest accuracy score. The dataset and the benchmark are available at https://huggingface.co/datasets/tsynbio/ProteinLMBench. △ Less

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2405.12474 [pdf, other]

How Universal Polynomial Bases Enhance Spectral Graph Neural Networks: Heterophily, Over-smoothing, and Over-squashing

Authors: Keke Huang, Yu Guang Wang, Ming Li, and Pietro Liò

Abstract: Spectral Graph Neural Networks (GNNs), alternatively known as graph filters, have gained increasing prevalence for heterophily graphs. Optimal graph filters rely on Laplacian eigendecomposition for Fourier transform. In an attempt to avert prohibitive computations, numerous polynomial filters have been proposed. However, polynomials in the majority of these filters are predefined and remain fixed… ▽ More Spectral Graph Neural Networks (GNNs), alternatively known as graph filters, have gained increasing prevalence for heterophily graphs. Optimal graph filters rely on Laplacian eigendecomposition for Fourier transform. In an attempt to avert prohibitive computations, numerous polynomial filters have been proposed. However, polynomials in the majority of these filters are predefined and remain fixed across different graphs, failing to accommodate the varying degrees of heterophily. Addressing this gap, we demystify the intrinsic correlation between the spectral property of desired polynomial bases and the heterophily degrees via thorough theoretical analyses. Subsequently, we develop a novel adaptive heterophily basis wherein the basis vectors mutually form angles reflecting the heterophily degree of the graph. We integrate this heterophily basis with the homophily basis to construct a universal polynomial basis UniBasis, which devises a polynomial filter based graph neural network - UniFilter. It optimizes the convolution and propagation in GNN, thus effectively limiting over-smoothing and alleviating over-squashing. Our extensive experiments, conducted on a diverse range of real-world and synthetic datasets with varying degrees of heterophily, support the superiority of UniFilter. These results not only demonstrate the universality of UniBasis but also highlight its proficiency in graph explanation. △ Less

Submitted 20 May, 2024; originally announced May 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2311.18177

arXiv:2405.06658 [pdf, other]

ProteinEngine: Empower LLM with Domain Knowledge for Protein Engineering

Authors: Yiqing Shen, Outongyi Lv, Houying Zhu, Yu Guang Wang

Abstract: Large language models (LLMs) have garnered considerable attention for their proficiency in tackling intricate tasks, particularly leveraging their capacities for zero-shot and in-context learning. However, their utility has been predominantly restricted to general tasks due to an absence of domain-specific knowledge. This constraint becomes particularly pertinent in the realm of protein engineerin… ▽ More Large language models (LLMs) have garnered considerable attention for their proficiency in tackling intricate tasks, particularly leveraging their capacities for zero-shot and in-context learning. However, their utility has been predominantly restricted to general tasks due to an absence of domain-specific knowledge. This constraint becomes particularly pertinent in the realm of protein engineering, where specialized expertise is required for tasks such as protein function prediction, protein evolution analysis, and protein design, with a level of specialization that existing LLMs cannot furnish. In response to this challenge, we introduce \textsc{ProteinEngine}, a human-centered platform aimed at amplifying the capabilities of LLMs in protein engineering by seamlessly integrating a comprehensive range of relevant tools, packages, and software via API calls. Uniquely, \textsc{ProteinEngine} assigns three distinct roles to LLMs, facilitating efficient task delegation, specialized task resolution, and effective communication of results. This design fosters high extensibility and promotes the smooth incorporation of new algorithms, models, and features for future development. Extensive user studies, involving participants from both the AI and protein engineering communities across academia and industry, consistently validate the superiority of \textsc{ProteinEngine} in augmenting the reliability and precision of deep learning in protein engineering tasks. Consequently, our findings highlight the potential of \textsc{ProteinEngine} to bride the disconnected tools for future research in the protein engineering domain. △ Less

Submitted 20 April, 2024; originally announced May 2024.

arXiv:2403.11408 [pdf, other]

Layer-diverse Negative Sampling for Graph Neural Networks

Authors: Wei Duan, Jie Lu, Yu Guang Wang, Junyu Xuan

Abstract: Graph neural networks (GNNs) are a powerful solution for various structure learning applications due to their strong representation capabilities for graph data. However, traditional GNNs, relying on message-passing mechanisms that gather information exclusively from first-order neighbours (known as positive samples), can lead to issues such as over-smoothing and over-squashing. To mitigate these i… ▽ More Graph neural networks (GNNs) are a powerful solution for various structure learning applications due to their strong representation capabilities for graph data. However, traditional GNNs, relying on message-passing mechanisms that gather information exclusively from first-order neighbours (known as positive samples), can lead to issues such as over-smoothing and over-squashing. To mitigate these issues, we propose a layer-diverse negative sampling method for message-passing propagation. This method employs a sampling matrix within a determinantal point process, which transforms the candidate set into a space and selectively samples from this space to generate negative samples. To further enhance the diversity of the negative samples during each forward pass, we develop a space-squeezing method to achieve layer-wise diversity in multi-layer GNNs. Experiments on various real-world graph datasets demonstrate the effectiveness of our approach in improving the diversity of negative samples and overall learning performance. Moreover, adding negative samples dynamically changes the graph's topology, thus with the strong potential to improve the expressiveness of GNNs and reduce the risk of over-squashing. △ Less

Submitted 17 March, 2024; originally announced March 2024.

Comments: Published in Transactions on Machine Learning Research (03/2024)

arXiv:2311.05767 [pdf, other]

Dirichlet Energy Enhancement of Graph Neural Networks by Framelet Augmentation

Authors: Jialin Chen, Yuelin Wang, Cristian Bodnar, Rex Ying, Pietro Lio, Yu Guang Wang

Abstract: Graph convolutions have been a pivotal element in learning graph representations. However, recursively aggregating neighboring information with graph convolutions leads to indistinguishable node features in deep layers, which is known as the over-smoothing issue. The performance of graph neural networks decays fast as the number of stacked layers increases, and the Dirichlet energy associated with… ▽ More Graph convolutions have been a pivotal element in learning graph representations. However, recursively aggregating neighboring information with graph convolutions leads to indistinguishable node features in deep layers, which is known as the over-smoothing issue. The performance of graph neural networks decays fast as the number of stacked layers increases, and the Dirichlet energy associated with the graph decreases to zero as well. In this work, we introduce a framelet system into the analysis of Dirichlet energy and take a multi-scale perspective to leverage the Dirichlet energy and alleviate the over-smoothing issue. Specifically, we develop a Framelet Augmentation strategy by adjusting the update rules with positive and negative increments for low-pass and high-passes respectively. Based on that, we design the Energy Enhanced Convolution (EEConv), which is an effective and practical operation that is proved to strictly enhance Dirichlet energy. From a message-passing perspective, EEConv inherits multi-hop aggregation property from the framelet transform and takes into account all hops in the multi-scale representation, which benefits the node classification tasks over heterophilous graphs. Experiments show that deep GNNs with EEConv achieve state-of-the-art performance over various node classification datasets, especially for heterophilous graphs, while also lifting the Dirichlet energy as the network goes deeper. △ Less

Submitted 9 November, 2023; originally announced November 2023.

arXiv:2306.16819 [pdf, other]

Graph Denoising Diffusion for Inverse Protein Folding

Authors: Kai Yi, Bingxin Zhou, Yiqing Shen, Pietro Liò, Yu Guang Wang

Abstract: Inverse protein folding is challenging due to its inherent one-to-many map** characteristic, where numerous possible amino acid sequences can fold into a single, identical protein backbone. This task involves not only identifying viable sequences but also representing the sheer diversity of potential solutions. However, existing discriminative models, such as transformer-based auto-regressive mo… ▽ More Inverse protein folding is challenging due to its inherent one-to-many map** characteristic, where numerous possible amino acid sequences can fold into a single, identical protein backbone. This task involves not only identifying viable sequences but also representing the sheer diversity of potential solutions. However, existing discriminative models, such as transformer-based auto-regressive models, struggle to encapsulate the diverse range of plausible solutions. In contrast, diffusion probabilistic models, as an emerging genre of generative approaches, offer the potential to generate a diverse set of sequence candidates for determined protein backbones. We propose a novel graph denoising diffusion model for inverse protein folding, where a given protein backbone guides the diffusion process on the corresponding amino acid residue types. The model infers the joint distribution of amino acids conditioned on the nodes' physiochemical properties and local environment. Moreover, we utilize amino acid replacement matrices for the diffusion forward process, encoding the biologically-meaningful prior knowledge of amino acids from their spatial and sequential neighbors as well as themselves, which reduces the sampling space of the generative process. Our model achieves state-of-the-art performance over a set of popular baseline methods in sequence recovery and exhibits great potential in generating diverse protein sequences for a determined protein backbone structure. △ Less

Submitted 7 November, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

Comments: NeurIPS 2023

arXiv:2306.04899 [pdf, other]

Multi-level Protein Representation Learning for Blind Mutational Effect Prediction

Authors: Yang Tan, Bingxin Zhou, Yuanhong Jiang, Yu Guang Wang, Liang Hong

Abstract: Directed evolution plays an indispensable role in protein engineering that revises existing protein sequences to attain new or enhanced functions. Accurately predicting the effects of protein variants necessitates an in-depth understanding of protein structure and function. Although large self-supervised language models have demonstrated remarkable performance in zero-shot inference using only pro… ▽ More Directed evolution plays an indispensable role in protein engineering that revises existing protein sequences to attain new or enhanced functions. Accurately predicting the effects of protein variants necessitates an in-depth understanding of protein structure and function. Although large self-supervised language models have demonstrated remarkable performance in zero-shot inference using only protein sequences, these models inherently do not interpret the spatial characteristics of protein structures, which are crucial for comprehending protein folding stability and internal molecular interactions. This paper introduces a novel pre-training framework that cascades sequential and geometric analyzers for protein primary and tertiary structures. It guides mutational directions toward desired traits by simulating natural selection on wild-type proteins and evaluates the effects of variants based on their fitness to perform the function. We assess the proposed approach using a public database and two new databases for a variety of variant effect prediction tasks, which encompass a diverse set of proteins and assays from different taxa. The prediction results achieve state-of-the-art performance over other zero-shot learning methods for both single-site mutations and deep mutations. △ Less

Submitted 7 June, 2023; originally announced June 2023.

arXiv:2304.08299 [pdf, other]

Accurate and Definite Mutational Effect Prediction with Lightweight Equivariant Graph Neural Networks

Authors: Bingxin Zhou, Outongyi Lv, Kai Yi, Xinye Xiong, Pan Tan, Liang Hong, Yu Guang Wang

Abstract: Directed evolution as a widely-used engineering strategy faces obstacles in finding desired mutants from the massive size of candidate modifications. While deep learning methods learn protein contexts to establish feasible searching space, many existing models are computationally demanding and fail to predict how specific mutational tests will affect a protein's sequence or function. This research… ▽ More Directed evolution as a widely-used engineering strategy faces obstacles in finding desired mutants from the massive size of candidate modifications. While deep learning methods learn protein contexts to establish feasible searching space, many existing models are computationally demanding and fail to predict how specific mutational tests will affect a protein's sequence or function. This research introduces a lightweight graph representation learning scheme that efficiently analyzes the microenvironment of wild-type proteins and recommends practical higher-order mutations exclusive to the user-specified protein and function of interest. Our method enables continuous improvement of the inference model by limited computational resources and a few hundred mutational training samples, resulting in accurate prediction of variant effects that exhibit near-perfect correlation with the ground truth across deep mutational scanning assays of 19 proteins. With its affordability and applicability to both computer scientists and biochemical laboratories, our solution offers a wide range of benefits that make it an ideal choice for the community. △ Less

Submitted 13 April, 2023; originally announced April 2023.

arXiv:2304.02656 [pdf, other]

Graph Representation Learning for Interactive Biomolecule Systems

Authors: Xinye Xiong, Bingxin Zhou, Yu Guang Wang

Abstract: Advances in deep learning models have revolutionized the study of biomolecule systems and their mechanisms. Graph representation learning, in particular, is important for accurately capturing the geometric information of biomolecules at different levels. This paper presents a comprehensive review of the methodologies used to represent biological molecules and systems as computer-recognizable objec… ▽ More Advances in deep learning models have revolutionized the study of biomolecule systems and their mechanisms. Graph representation learning, in particular, is important for accurately capturing the geometric information of biomolecules at different levels. This paper presents a comprehensive review of the methodologies used to represent biological molecules and systems as computer-recognizable objects, such as sequences, graphs, and surfaces. Moreover, it examines how geometric deep learning models, with an emphasis on graph-based techniques, can analyze biomolecule data to enable drug discovery, protein characterization, and biological system analysis. The study concludes with an overview of the current state of the field, highlighting the challenges that exist and the potential future research directions. △ Less

Submitted 5 April, 2023; originally announced April 2023.

arXiv:2303.10876 [pdf, other]

EqMotion: Equivariant Multi-agent Motion Prediction with Invariant Interaction Reasoning

Authors: Chenxin Xu, Robby T. Tan, Yuhong Tan, Siheng Chen, Yu Guang Wang, Xinchao Wang, Yanfeng Wang

Abstract: Learning to predict agent motions with relationship reasoning is important for many applications. In motion prediction tasks, maintaining motion equivariance under Euclidean geometric transformations and invariance of agent interaction is a critical and fundamental principle. However, such equivariance and invariance properties are overlooked by most existing methods. To fill this gap, we propose… ▽ More Learning to predict agent motions with relationship reasoning is important for many applications. In motion prediction tasks, maintaining motion equivariance under Euclidean geometric transformations and invariance of agent interaction is a critical and fundamental principle. However, such equivariance and invariance properties are overlooked by most existing methods. To fill this gap, we propose EqMotion, an efficient equivariant motion prediction model with invariant interaction reasoning. To achieve motion equivariance, we propose an equivariant geometric feature learning module to learn a Euclidean transformable feature through dedicated designs of equivariant operations. To reason agent's interactions, we propose an invariant interaction reasoning module to achieve a more stable interaction modeling. To further promote more comprehensive motion features, we propose an invariant pattern feature learning module to learn an invariant pattern feature, which cooperates with the equivariant geometric feature to enhance network expressiveness. We conduct experiments for the proposed model on four distinct scenarios: particle dynamics, molecule dynamics, human skeleton motion prediction and pedestrian trajectory prediction. Experimental results show that our method is not only generally applicable, but also achieves state-of-the-art prediction performances on all the four tasks, improving by 24.0/30.1/8.6/9.2%. Code is available at https://github.com/MediaBrain-SJTU/EqMotion. △ Less

Submitted 27 March, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

Comments: Accepted to CVPR 2023

arXiv:2302.14806 [pdf, other]

Framelet Message Passing

Authors: Xinliang Liu, Bingxin Zhou, Chutian Zhang, Yu Guang Wang

Abstract: Graph neural networks (GNNs) have achieved champion in wide applications. Neural message passing is a typical key module for feature propagation by aggregating neighboring features. In this work, we propose a new message passing based on multiscale framelet transforms, called Framelet Message Passing. Different from traditional spatial methods, it integrates framelet representation of neighbor nod… ▽ More Graph neural networks (GNNs) have achieved champion in wide applications. Neural message passing is a typical key module for feature propagation by aggregating neighboring features. In this work, we propose a new message passing based on multiscale framelet transforms, called Framelet Message Passing. Different from traditional spatial methods, it integrates framelet representation of neighbor nodes from multiple hops away in node message update. We also propose a continuous message passing using neural ODE solvers. It turns both discrete and continuous cases can provably achieve network stability and limit oversmoothing due to the multiscale property of framelets. Numerical experiments on real graph datasets show that the continuous version of the framelet message passing significantly outperforms existing methods when learning heterogeneous graphs and achieves state-of-the-art performance on classic node classification tasks with low computational costs. △ Less

Submitted 28 February, 2023; originally announced February 2023.

arXiv:2301.00004 [pdf]

doi 10.1186/s13321-023-00688-x

SESNet: sequence-structure feature-integrated deep learning method for data-efficient protein engineering

Authors: Mingchen Li, Liqi Kang, Yi Xiong, Yu Guang Wang, Guisheng Fan, Pan Tan, Liang Hong

Abstract: Deep learning has been widely used for protein engineering. However, it is limited by the lack of sufficient experimental data to train an accurate model for predicting the functional fitness of high-order mutants. Here, we develop SESNet, a supervised deep-learning model to predict the fitness for protein mutants by leveraging both sequence and structure information, and exploiting attention mech… ▽ More Deep learning has been widely used for protein engineering. However, it is limited by the lack of sufficient experimental data to train an accurate model for predicting the functional fitness of high-order mutants. Here, we develop SESNet, a supervised deep-learning model to predict the fitness for protein mutants by leveraging both sequence and structure information, and exploiting attention mechanism. Our model integrates local evolutionary context from homologous sequences, the global evolutionary context encoding rich semantic from the universal protein sequence space and the structure information accounting for the microenvironment around each residue in a protein. We show that SESNet outperforms state-of-the-art models for predicting the sequence-function relationship on 26 deep mutational scanning datasets. More importantly, we propose a data augmentation strategy by leveraging the data from unsupervised models to pre-train our model. After that, our model can achieve strikingly high accuracy in prediction of the fitness of protein mutants, especially for the higher order variants (> 4 mutation sites), when finetuned by using only a small number of experimental mutation data (<50). The strategy proposed is of great practical value as the required experimental effort, i.e., producing a few tens of experimental mutation data on a given protein, is generally affordable by an ordinary biochemical group and can be applied on almost any protein. △ Less

Submitted 28 December, 2022; originally announced January 2023.

Journal ref: Journal of Cheminformatics (2023) 15:12

arXiv:2208.03471 [pdf, other]

Oversquashing in GNNs through the lens of information contraction and graph expansion

Authors: Pradeep Kr. Banerjee, Kedar Karhadkar, Yu Guang Wang, Uri Alon, Guido Montúfar

Abstract: The quality of signal propagation in message-passing graph neural networks (GNNs) strongly influences their expressivity as has been observed in recent works. In particular, for prediction tasks relying on long-range interactions, recursive aggregation of node features can lead to an undesired phenomenon called "oversquashing". We present a framework for analyzing oversquashing based on informatio… ▽ More The quality of signal propagation in message-passing graph neural networks (GNNs) strongly influences their expressivity as has been observed in recent works. In particular, for prediction tasks relying on long-range interactions, recursive aggregation of node features can lead to an undesired phenomenon called "oversquashing". We present a framework for analyzing oversquashing based on information contraction. Our analysis is guided by a model of reliable computation due to von Neumann that lends a new insight into oversquashing as signal quenching in noisy computation graphs. Building on this, we propose a graph rewiring algorithm aimed at alleviating oversquashing. Our algorithm employs a random local edge flip primitive motivated by an expander graph construction. We compare the spectral expansion properties of our algorithm with that of an existing curvature-based non-local rewiring strategy. Synthetic experiments show that while our algorithm in general has a slower rate of expansion, it is overall computationally cheaper, preserves the node degrees exactly and never disconnects the graph. △ Less

Submitted 6 August, 2022; originally announced August 2022.

Comments: 8 pages, 5 figures; Accepted at the 58th Annual Allerton Conference on Communication, Control, and Computing

arXiv:2206.10385 [pdf, other]

Approximate Equivariance SO(3) Needlet Convolution

Authors: Kai Yi, Jialin Chen, Yu Guang Wang, Bingxin Zhou, Pietro Liò, Yanan Fan, Jan Hamann

Abstract: This paper develops a rotation-invariant needlet convolution for rotation group SO(3) to distill multiscale information of spherical signals. The spherical needlet transform is generalized from $\mathbb{S}^2$ onto the SO(3) group, which decomposes a spherical signal to approximate and detailed spectral coefficients by a set of tight framelet operators. The spherical signal during the decomposition… ▽ More This paper develops a rotation-invariant needlet convolution for rotation group SO(3) to distill multiscale information of spherical signals. The spherical needlet transform is generalized from $\mathbb{S}^2$ onto the SO(3) group, which decomposes a spherical signal to approximate and detailed spectral coefficients by a set of tight framelet operators. The spherical signal during the decomposition and reconstruction achieves rotation invariance. Based on needlet transforms, we form a Needlet approximate Equivariance Spherical CNN (NES) with multiple SO(3) needlet convolutional layers. The network establishes a powerful tool to extract geometric-invariant features of spherical signals. The model allows sufficient network scalability with multi-resolution representation. A robust signal embedding is learned with wavelet shrinkage activation function, which filters out redundant high-pass representation while maintaining approximate rotation invariance. The NES achieves state-of-the-art performance for quantum chemistry regression and Cosmic Microwave Background (CMB) delensing reconstruction, which shows great potential for solving scientific challenges with high-resolution and multi-scale spherical signal representation. △ Less

Submitted 17 June, 2022; originally announced June 2022.

arXiv:2206.07599 [pdf, other]

How GNNs Facilitate CNNs in Mining Geometric Information from Large-Scale Medical Images

Authors: Yiqing Shen, Bingxin Zhou, Xinye Xiong, Ruitian Gao, Yu Guang Wang

Abstract: Gigapixel medical images provide massive data, both morphological textures and spatial information, to be mined. Due to the large data scale in histology, deep learning methods play an increasingly significant role as feature extractors. Existing solutions heavily rely on convolutional neural networks (CNNs) for global pixel-level analysis, leaving the underlying local geometric structure such as… ▽ More Gigapixel medical images provide massive data, both morphological textures and spatial information, to be mined. Due to the large data scale in histology, deep learning methods play an increasingly significant role as feature extractors. Existing solutions heavily rely on convolutional neural networks (CNNs) for global pixel-level analysis, leaving the underlying local geometric structure such as the interaction between cells in the tumor microenvironment unexplored. The topological structure in medical images, as proven to be closely related to tumor evolution, can be well characterized by graphs. To obtain a more comprehensive representation for downstream oncology tasks, we propose a fusion framework for enhancing the global image-level representation captured by CNNs with the geometry of cell-level spatial information learned by graph neural networks (GNN). The fusion layer optimizes an integration between collaborative features of global images and cell graphs. Two fusion strategies have been developed: one with MLP which is simple but turns out efficient through fine-tuning, and the other with Transformer gains a champion in fusing multiple networks. We evaluate our fusion strategies on histology datasets curated from large patient cohorts of colorectal and gastric cancers for three biomarker prediction tasks. Both two models outperform plain CNNs or GNNs, reaching a consistent AUC improvement of more than 5% on various network backbones. The experimental results yield the necessity for combining image-level morphological features with cell spatial relations in medical image analysis. Codes are available at https://github.com/yiqings/HEGnnEnhanceCnn. △ Less

Submitted 15 June, 2022; originally announced June 2022.

Comments: 21 pages

arXiv:2206.05437 [pdf, other]

ACMP: Allen-Cahn Message Passing for Graph Neural Networks with Particle Phase Transition

Authors: Yuelin Wang, Kai Yi, Xinliang Liu, Yu Guang Wang, Shi **

Abstract: Neural message passing is a basic feature extraction unit for graph-structured data considering neighboring node features in network propagation from one layer to the next. We model such process by an interacting particle system with attractive and repulsive forces and the Allen-Cahn force arising in the modeling of phase transition. The dynamics of the system is a reaction-diffusion process which… ▽ More Neural message passing is a basic feature extraction unit for graph-structured data considering neighboring node features in network propagation from one layer to the next. We model such process by an interacting particle system with attractive and repulsive forces and the Allen-Cahn force arising in the modeling of phase transition. The dynamics of the system is a reaction-diffusion process which can separate particles without blowing up. This induces an Allen-Cahn message passing (ACMP) for graph neural networks where the numerical iteration for the particle system solution constitutes the message passing propagation. ACMP which has a simple implementation with a neural ODE solver can propel the network depth up to one hundred of layers with theoretically proven strictly positive lower bound of the Dirichlet energy. It thus provides a deep model of GNNs circumventing the common GNN problem of oversmoothing. GNNs with ACMP achieve state of the art performance for real-world node classification tasks on both homophilic and heterophilic datasets. △ Less

Submitted 23 April, 2023; v1 submitted 11 June, 2022; originally announced June 2022.

Comments: 26 pages, 5 figures. NeurIPS 2022 Workshop on GLFrontiers (Oral). ICLR 2023 (Spotlight)

arXiv:2206.00228 [pdf, other]

Lower and Upper Bounds for Numbers of Linear Regions of Graph Convolutional Networks

Authors: Hao Chen, Yu Guang Wang, Huan Xiong

Abstract: The research for characterizing GNN expressiveness attracts much attention as graph neural networks achieve a champion in the last five years. The number of linear regions has been considered a good measure for the expressivity of neural networks with piecewise linear activation. In this paper, we present some estimates for the number of linear regions of the classic graph convolutional networks (… ▽ More The research for characterizing GNN expressiveness attracts much attention as graph neural networks achieve a champion in the last five years. The number of linear regions has been considered a good measure for the expressivity of neural networks with piecewise linear activation. In this paper, we present some estimates for the number of linear regions of the classic graph convolutional networks (GCNs) with one layer and multiple-layer scenarios. In particular, we obtain an optimal upper bound for the maximum number of linear regions for one-layer GCNs, and the upper and lower bounds for multi-layer GCNs. The simulated estimate shows that the true maximum number of linear regions is possibly closer to our estimated lower bound. These results imply that the number of linear regions of multi-layer GCNs is exponentially greater than one-layer GCNs per parameter in general. This suggests that deeper GCNs have more expressivity than shallow GCNs. △ Less

Submitted 1 June, 2022; originally announced June 2022.

arXiv:2205.15068 [pdf, other]

doi 10.1016/j.neunet.2022.05.001

Embedding Graphs on Grassmann Manifold

Authors: Bingxin Zhou, Xuebin Zheng, Yu Guang Wang, Ming Li, Junbin Gao

Abstract: Learning efficient graph representation is the key to favorably addressing downstream tasks on graphs, such as node or graph property prediction. Given the non-Euclidean structural property of graphs, preserving the original graph data's similarity relationship in the embedded space needs specific tools and a similarity metric. This paper develops a new graph representation learning scheme, namely… ▽ More Learning efficient graph representation is the key to favorably addressing downstream tasks on graphs, such as node or graph property prediction. Given the non-Euclidean structural property of graphs, preserving the original graph data's similarity relationship in the embedded space needs specific tools and a similarity metric. This paper develops a new graph representation learning scheme, namely EGG, which embeds approximated second-order graph characteristics into a Grassmann manifold. The proposed strategy leverages graph convolutions to learn hidden representations of the corresponding subspace of the graph, which is then mapped to a Grassmann point of a low dimensional manifold through truncated singular value decomposition (SVD). The established graph embedding approximates denoised correlationship of node attributes, as implemented in the form of a symmetric matrix space for Euclidean calculation. The effectiveness of EGG is demonstrated using both clustering and classification tasks at the node level and graph level. It outperforms baseline models on various benchmarks. △ Less

Submitted 30 May, 2022; originally announced May 2022.

arXiv:2202.04936 [pdf, other]

Robust Graph Representation Learning for Local Corruption Recovery

Authors: Bingxin Zhou, Yuanhong Jiang, Yu Guang Wang, **gwei Liang, Junbin Gao, Shirui Pan, Xiaoqun Zhang

Abstract: The performance of graph representation learning is affected by the quality of graph input. While existing research usually pursues a globally smoothed graph embedding, we believe the rarely observed anomalies are as well harmful to an accurate prediction. This work establishes a graph learning scheme that automatically detects (locally) corrupted feature attributes and recovers robust embedding f… ▽ More The performance of graph representation learning is affected by the quality of graph input. While existing research usually pursues a globally smoothed graph embedding, we believe the rarely observed anomalies are as well harmful to an accurate prediction. This work establishes a graph learning scheme that automatically detects (locally) corrupted feature attributes and recovers robust embedding for prediction tasks. The detection operation leverages a graph autoencoder, which does not make any assumptions about the distribution of the local corruptions. It pinpoints the positions of the anomalous node attributes in an unbiased mask matrix, where robust estimations are recovered with sparsity promoting regularizer. The optimizer approaches a new embedding that is sparse in the framelet domain and conditionally close to input observations. Extensive experiments are provided to validate our proposed model can recover a robust graph representation from black-box poisoning and achieve excellent performance. △ Less

Submitted 11 August, 2023; v1 submitted 10 February, 2022; originally announced February 2022.

Comments: WWW '23: Proceedings of the ACM Web Conference 2023

arXiv:2111.03264 [pdf, other]

Graph Denoising with Framelet Regularizer

Authors: Bingxin Zhou, Ruikun Li, Xuebin Zheng, Yu Guang Wang, Junbin Gao

Abstract: As graph data collected from the real world is merely noise-free, a practical representation of graphs should be robust to noise. Existing research usually focuses on feature smoothing but leaves the geometric structure untouched. Furthermore, most work takes L2-norm that pursues a global smoothness, which limits the expressivity of graph neural networks. This paper tailors regularizers for graph… ▽ More As graph data collected from the real world is merely noise-free, a practical representation of graphs should be robust to noise. Existing research usually focuses on feature smoothing but leaves the geometric structure untouched. Furthermore, most work takes L2-norm that pursues a global smoothness, which limits the expressivity of graph neural networks. This paper tailors regularizers for graph data in terms of both feature and structure noises, where the objective function is efficiently solved with the alternating direction method of multipliers (ADMM). The proposed scheme allows to take multiple layers without the concern of over-smoothing, and it guarantees convergence to the optimal solutions. Empirical study proves that our model achieves significantly better performance compared with popular graph convolutions even when the graph is heavily contaminated. △ Less

Submitted 5 November, 2021; originally announced November 2021.

arXiv:2106.12575 [pdf, other]

Weisfeiler and Lehman Go Cellular: CW Networks

Authors: Cristian Bodnar, Fabrizio Frasca, Nina Otter, Yu Guang Wang, Pietro Liò, Guido Montúfar, Michael Bronstein

Abstract: Graph Neural Networks (GNNs) are limited in their expressive power, struggle with long-range interactions and lack a principled way to model higher-order structures. These problems can be attributed to the strong coupling between the computational graph and the input graph structure. The recently proposed Message Passing Simplicial Networks naturally decouple these elements by performing message p… ▽ More Graph Neural Networks (GNNs) are limited in their expressive power, struggle with long-range interactions and lack a principled way to model higher-order structures. These problems can be attributed to the strong coupling between the computational graph and the input graph structure. The recently proposed Message Passing Simplicial Networks naturally decouple these elements by performing message passing on the clique complex of the graph. Nevertheless, these models can be severely constrained by the rigid combinatorial structure of Simplicial Complexes (SCs). In this work, we extend recent theoretical results on SCs to regular Cell Complexes, topological objects that flexibly subsume SCs and graphs. We show that this generalisation provides a powerful set of graph "lifting" transformations, each leading to a unique hierarchical message passing procedure. The resulting methods, which we collectively call CW Networks (CWNs), are strictly more powerful than the WL test and not less powerful than the 3-WL test. In particular, we demonstrate the effectiveness of one such scheme, based on rings, when applied to molecular graph problems. The proposed architecture benefits from provably larger expressivity than commonly used GNNs, principled modelling of higher-order signals and from compressing the distances between nodes. We demonstrate that our model achieves state-of-the-art results on a variety of molecular datasets. △ Less

Submitted 31 January, 2022; v1 submitted 23 June, 2021; originally announced June 2021.

Comments: NeurIPS 2021. Contains 28 pages, 9 figures

arXiv:2106.09876 [pdf, other]

doi 10.1109/TKDE.2021.3124061

Anomaly Detection in Dynamic Graphs via Transformer

Authors: Yixin Liu, Shirui Pan, Yu Guang Wang, Fei Xiong, Liang Wang, Qingfeng Chen, Vincent CS Lee

Abstract: Detecting anomalies for dynamic graphs has drawn increasing attention due to their wide applications in social networks, e-commerce, and cybersecurity. Recent deep learning-based approaches have shown promising results over shallow methods. However, they fail to address two core challenges of anomaly detection in dynamic graphs: the lack of informative encoding for unattributed nodes and the diffi… ▽ More Detecting anomalies for dynamic graphs has drawn increasing attention due to their wide applications in social networks, e-commerce, and cybersecurity. Recent deep learning-based approaches have shown promising results over shallow methods. However, they fail to address two core challenges of anomaly detection in dynamic graphs: the lack of informative encoding for unattributed nodes and the difficulty of learning discriminate knowledge from coupled spatial-temporal dynamic graphs. To overcome these challenges, in this paper, we present a novel Transformer-based Anomaly Detection framework for DYnamic graphs (TADDY). Our framework constructs a comprehensive node encoding strategy to better represent each node's structural and temporal roles in an evolving graphs stream. Meanwhile, TADDY captures informative representation from dynamic graphs with coupled spatial-temporal patterns via a dynamic graph transformer model. The extensive experimental results demonstrate that our proposed TADDY framework outperforms the state-of-the-art methods by a large margin on six real-world datasets. △ Less

Submitted 27 October, 2021; v1 submitted 17 June, 2021; originally announced June 2021.

Comments: 13 pages, 5 figures

arXiv:2103.03212 [pdf, other]

Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks

Authors: Cristian Bodnar, Fabrizio Frasca, Yu Guang Wang, Nina Otter, Guido Montúfar, Pietro Liò, Michael Bronstein

Abstract: The pairwise interaction paradigm of graph machine learning has predominantly governed the modelling of relational systems. However, graphs alone cannot capture the multi-level interactions present in many complex systems and the expressive power of such schemes was proven to be limited. To overcome these limitations, we propose Message Passing Simplicial Networks (MPSNs), a class of models that p… ▽ More The pairwise interaction paradigm of graph machine learning has predominantly governed the modelling of relational systems. However, graphs alone cannot capture the multi-level interactions present in many complex systems and the expressive power of such schemes was proven to be limited. To overcome these limitations, we propose Message Passing Simplicial Networks (MPSNs), a class of models that perform message passing on simplicial complexes (SCs). To theoretically analyse the expressivity of our model we introduce a Simplicial Weisfeiler-Lehman (SWL) colouring procedure for distinguishing non-isomorphic SCs. We relate the power of SWL to the problem of distinguishing non-isomorphic graphs and show that SWL and MPSNs are strictly more powerful than the WL test and not less powerful than the 3-WL test. We deepen the analysis by comparing our model with traditional graph neural networks (GNNs) with ReLU activations in terms of the number of linear regions of the functions they can represent. We empirically support our theoretical claims by showing that MPSNs can distinguish challenging strongly regular graphs for which GNNs fail and, when equipped with orientation equivariant layers, they can improve classification accuracy in oriented SCs compared to a GNN baseline. △ Less

Submitted 14 June, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

Comments: ICML 2021. Contains 27 pages, 9 figures

arXiv:2102.06986 [pdf, other]

How Framelets Enhance Graph Neural Networks

Authors: Xuebin Zheng, Bingxin Zhou, Junbin Gao, Yu Guang Wang, Pietro Lio, Ming Li, Guido Montufar

Abstract: This paper presents a new approach for assembling graph neural networks based on framelet transforms. The latter provides a multi-scale representation for graph-structured data. We decompose an input graph into low-pass and high-pass frequencies coefficients for network training, which then defines a framelet-based graph convolution. The framelet decomposition naturally induces a graph pooling str… ▽ More This paper presents a new approach for assembling graph neural networks based on framelet transforms. The latter provides a multi-scale representation for graph-structured data. We decompose an input graph into low-pass and high-pass frequencies coefficients for network training, which then defines a framelet-based graph convolution. The framelet decomposition naturally induces a graph pooling strategy by aggregating the graph feature into low-pass and high-pass spectra, which considers both the feature values and geometry of the graph data and conserves the total information. The graph neural networks with the proposed framelet convolution and pooling achieve state-of-the-art performance in many node and graph prediction tasks. Moreover, we propose shrinkage as a new activation for the framelet convolution, which thresholds high-frequency information at different scales. Compared to ReLU, shrinkage activation improves model performance on denoising and signal compression: noises in both node and structure can be significantly reduced by accurately cutting off the high-pass coefficients from framelet decomposition, and the signal can be compressed to less than half its original size with well-preserved prediction performance. △ Less

Submitted 20 June, 2021; v1 submitted 13 February, 2021; originally announced February 2021.

Comments: 24 pages, 17 figures, 8 tables, ICML2021 (fix typos)

MSC Class: 68T07; 05C85; 42C40 ACM Class: I.2.4; I.2.6

arXiv:2012.06922 [pdf, other]

Decimated Framelet System on Graphs and Fast G-Framelet Transforms

Authors: Xuebin Zheng, Bingxin Zhou, Yu Guang Wang, Xiaosheng Zhuang

Abstract: Graph representation learning has many real-world applications, from super-resolution imaging, 3D computer vision to drug repurposing, protein classification, social networks analysis. An adequate representation of graph data is vital to the learning performance of a statistical or machine learning model for graph-structured data. In this paper, we propose a novel multiscale representation system… ▽ More Graph representation learning has many real-world applications, from super-resolution imaging, 3D computer vision to drug repurposing, protein classification, social networks analysis. An adequate representation of graph data is vital to the learning performance of a statistical or machine learning model for graph-structured data. In this paper, we propose a novel multiscale representation system for graph data, called decimated framelets, which form a localized tight frame on the graph. The decimated framelet system allows storage of the graph data representation on a coarse-grained chain and processes the graph data at multi scales where at each scale, the data is stored at a subgraph. Based on this, we then establish decimated G-framelet transforms for the decomposition and reconstruction of the graph data at multi resolutions via a constructive data-driven filter bank. The graph framelets are built on a chain-based orthonormal basis that supports fast graph Fourier transforms. From this, we give a fast algorithm for the decimated G-framelet transforms, or FGT, that has linear computational complexity O(N) for a graph of size N. The theory of decimated framelets and FGT is verified with numerical examples for random graphs. The effectiveness is demonstrated by real-world applications, including multiresolution analysis for traffic network, and graph neural networks for graph classification tasks. △ Less

Submitted 16 December, 2021; v1 submitted 12 December, 2020; originally announced December 2020.

Comments: 69 pages, 10 figures, Published in JMLR

arXiv:2008.08427 [pdf, other]

How Powerful are Shallow Neural Networks with Bandlimited Random Weights?

Authors: Ming Li, Sho Sonoda, Feilong Cao, Yu Guang Wang, Jiye Liang

Abstract: We investigate the expressive power of depth-2 bandlimited random neural networks. A random net is a neural network where the hidden layer parameters are frozen with random assignment, and only the output layer parameters are trained by loss minimization. Using random weights for a hidden layer is an effective method to avoid non-convex optimization in standard gradient descent learning. It has al… ▽ More We investigate the expressive power of depth-2 bandlimited random neural networks. A random net is a neural network where the hidden layer parameters are frozen with random assignment, and only the output layer parameters are trained by loss minimization. Using random weights for a hidden layer is an effective method to avoid non-convex optimization in standard gradient descent learning. It has also been adopted in recent deep learning theories. Despite the well-known fact that a neural network is a universal approximator, in this study, we mathematically show that when hidden parameters are distributed in a bounded domain, the network may not achieve zero approximation error. In particular, we derive a new nontrivial approximation error lower bound. The proof utilizes the technique of ridgelet analysis, a harmonic analysis method designed for neural networks. This method is inspired by fundamental principles in classical signal processing, specifically the idea that signals with limited bandwidth may not always be able to perfectly recreate the original signal. We corroborate our theoretical results with various simulation studies, and generally, two main take-home messages are offered: (i) Not any distribution for selecting random weights is feasible to build a universal approximator; (ii) A suitable assignment of random weights exists but to some degree is associated with the complexity of the target function. △ Less

Submitted 31 May, 2023; v1 submitted 19 August, 2020; originally announced August 2020.

Comments: Published as a conference paper at ICML 2023

arXiv:2007.11202 [pdf, other]

MathNet: Haar-Like Wavelet Multiresolution-Analysis for Graph Representation and Learning

Authors: Xuebin Zheng, Bingxin Zhou, Ming Li, Yu Guang Wang, Junbin Gao

Abstract: Graph Neural Networks (GNNs) have recently caught great attention and achieved significant progress in graph-level applications. In this paper, we propose a framework for graph neural networks with multiresolution Haar-like wavelets, or MathNet, with interrelated convolution and pooling strategies. The underlying method takes graphs in different structures as input and assembles consistent graph r… ▽ More Graph Neural Networks (GNNs) have recently caught great attention and achieved significant progress in graph-level applications. In this paper, we propose a framework for graph neural networks with multiresolution Haar-like wavelets, or MathNet, with interrelated convolution and pooling strategies. The underlying method takes graphs in different structures as input and assembles consistent graph representations for readout layers, which then accomplishes label prediction. To achieve this, the multiresolution graph representations are first constructed and fed into graph convolutional layers for processing. The hierarchical graph pooling layers are then involved to downsample graph resolution while simultaneously remove redundancy within graph signals. The whole workflow could be formed with a multi-level graph analysis, which not only helps embed the intrinsic topological information of each graph into the GNN, but also supports fast computation of forward and adjoint graph transforms. We show by extensive experiments that the proposed framework obtains notable accuracy gains on graph classification and regression tasks with performance stability. The proposed MathNet outperforms various existing GNN models, especially on big data sets. △ Less

Submitted 24 January, 2021; v1 submitted 22 July, 2020; originally announced July 2020.

Comments: 32 pages, 6 figures, 6 tables

MSC Class: 68T07; 05C85; 42C40 ACM Class: I.2.4; I.2.6

arXiv:2007.09392 [pdf, other]

Distributed Learning via Filtered Hyperinterpolation on Manifolds

Authors: Guido Montúfar, Yu Guang Wang

Abstract: Learning map**s of data on manifolds is an important topic in contemporary machine learning, with applications in astrophysics, geophysics, statistical physics, medical diagnosis, biochemistry, 3D object analysis. This paper studies the problem of learning real-valued functions on manifolds through filtered hyperinterpolation of input-output data pairs where the inputs may be sampled determinist… ▽ More Learning map**s of data on manifolds is an important topic in contemporary machine learning, with applications in astrophysics, geophysics, statistical physics, medical diagnosis, biochemistry, 3D object analysis. This paper studies the problem of learning real-valued functions on manifolds through filtered hyperinterpolation of input-output data pairs where the inputs may be sampled deterministically or at random and the outputs may be clean or noisy. Motivated by the problem of handling large data sets, it presents a parallel data processing approach which distributes the data-fitting task among multiple servers and synthesizes the fitted sub-models into a global estimator. We prove quantitative relations between the approximation quality of the learned function over the entire manifold, the type of target function, the number of servers, and the number and type of available samples. We obtain the approximation rates of convergence for distributed and non-distributed approaches. For the non-distributed case, the approximation order is optimal. △ Less

Submitted 18 July, 2020; originally announced July 2020.

Comments: 50 pages, 4 figures, 2 tables

MSC Class: 65Y05; 41A05; 65D05; 65D30; 68Q32; 94A12; 58C05; 58C35; 65T60; 42C05 ACM Class: F.2.1; G.1.1; G.1.4; G.1.2; I.2.6

arXiv:2006.16811 [pdf, other]

doi 10.1088/1742-5468/ac3ae4

Path Integral Based Convolution and Pooling for Graph Neural Networks

Authors: Zheng Ma, Junyu Xuan, Yu Guang Wang, Ming Li, Pietro Lio

Abstract: Graph neural networks (GNNs) extends the functionality of traditional neural networks to graph-structured data. Similar to CNNs, an optimized design of graph convolution and pooling is key to success. Borrowing ideas from physics, we propose a path integral based graph neural networks (PAN) for classification and regression tasks on graphs. Specifically, we consider a convolution operation that in… ▽ More Graph neural networks (GNNs) extends the functionality of traditional neural networks to graph-structured data. Similar to CNNs, an optimized design of graph convolution and pooling is key to success. Borrowing ideas from physics, we propose a path integral based graph neural networks (PAN) for classification and regression tasks on graphs. Specifically, we consider a convolution operation that involves every path linking the message sender and receiver with learnable weights depending on the path length, which corresponds to the maximal entropy random walk. It generalizes the graph Laplacian to a new transition matrix we call maximal entropy transition (MET) matrix derived from a path integral formalism. Importantly, the diagonal entries of the MET matrix are directly related to the subgraph centrality, thus providing a natural and adaptive pooling mechanism. PAN provides a versatile framework that can be tailored for different graph data with varying sizes and structures. We can view most existing GNN architectures as special cases of PAN. Experimental results show that PAN achieves state-of-the-art performance on various graph classification/regression tasks, including a new benchmark dataset from statistical mechanics we propose to boost applications of GNN in physical sciences. △ Less

Submitted 8 July, 2020; v1 submitted 29 June, 2020; originally announced June 2020.

Comments: 15 pages, 4 figures, 6 tables. arXiv admin note: text overlap with arXiv:1904.10996

Journal ref: NeurIPS 2020

arXiv:2001.11653 [pdf, other]

Deep Learning Based Unsupervised and Semi-supervised Classification for Keratoconus

Authors: Nicole Hallett, Kai Yi, Josef Dick, Christopher Hodge, Gerard Sutton, Yu Guang Wang, **g**g You

Abstract: The transparent cornea is the window of the eye, facilitating the entry of light rays and controlling focusing the movement of the light within the eye. The cornea is critical, contributing to 75% of the refractive power of the eye. Keratoconus is a progressive and multifactorial corneal degenerative disease affecting 1 in 2000 individuals worldwide. Currently, there is no cure for keratoconus oth… ▽ More The transparent cornea is the window of the eye, facilitating the entry of light rays and controlling focusing the movement of the light within the eye. The cornea is critical, contributing to 75% of the refractive power of the eye. Keratoconus is a progressive and multifactorial corneal degenerative disease affecting 1 in 2000 individuals worldwide. Currently, there is no cure for keratoconus other than corneal transplantation for advanced stage keratoconus or corneal cross-linking, which can only halt KC progression. The ability to accurately identify subtle KC or KC progression is of vital clinical significance. To date, there has been little consensus on a useful model to classify KC patients, which therefore inhibits the ability to predict disease progression accurately. In this paper, we utilised machine learning to analyse data from 124 KC patients, including topographical and clinical variables. Both supervised multilayer perceptron and unsupervised variational autoencoder models were used to classify KC patients with reference to the existing Amsler-Krumeich (A-K) classification system. Both methods result in high accuracy, with the unsupervised method showing better performance. The result showed that the unsupervised method with a selection of 29 variables could be a powerful tool to provide an automatic classification tool for clinicians. These outcomes provide a platform for additional analysis for the progression and treatment of keratoconus. △ Less

Submitted 30 January, 2020; originally announced January 2020.

Comments: 7 pages, 8 figures, 3 tables

MSC Class: 62F15; 47N30; 62P10; 92C50; 46N60; 92B20 ACM Class: I.2.6; I.2.1; J.3

arXiv:2001.11651 [pdf, other]

CosmoVAE: Variational Autoencoder for CMB Image Inpainting

Authors: Kai Yi, Yi Guo, Yanan Fan, Jan Hamann, Yu Guang Wang

Abstract: Cosmic microwave background radiation (CMB) is critical to the understanding of the early universe and precise estimation of cosmological constants. Due to the contamination of thermal dust noise in the galaxy, the CMB map that is an image on the two-dimensional sphere has missing observations, mainly concentrated on the equatorial region. The noise of the CMB map has a significant impact on the e… ▽ More Cosmic microwave background radiation (CMB) is critical to the understanding of the early universe and precise estimation of cosmological constants. Due to the contamination of thermal dust noise in the galaxy, the CMB map that is an image on the two-dimensional sphere has missing observations, mainly concentrated on the equatorial region. The noise of the CMB map has a significant impact on the estimation precision for cosmological parameters. Inpainting the CMB map can effectively reduce the uncertainty of parametric estimation. In this paper, we propose a deep learning-based variational autoencoder --- CosmoVAE, to restoring the missing observations of the CMB map. The input and output of CosmoVAE are square images. To generate training, validation, and test data sets, we segment the full-sky CMB map into many small images by Cartesian projection. CosmoVAE assigns physical quantities to the parameters of the VAE network by using the angular power spectrum of the Gaussian random field as latent variables. CosmoVAE adopts a new loss function to improve the learning performance of the model, which consists of $\ell_1$ reconstruction loss, Kullback-Leibler divergence between the posterior distribution of encoder network and the prior distribution of latent variables, perceptual loss, and total-variation regularizer. The proposed model achieves state of the art performance for Planck \texttt{Commander} 2018 CMB map inpainting. △ Less

Submitted 30 January, 2020; originally announced January 2020.

Comments: 7 pages, 6 figures

MSC Class: 62F15; 92B20; 97K80; 83F05; 47A52 ACM Class: I.2.6; J.2; I.2.4

arXiv:1910.02434 [pdf, other]

Distributed filtered hyperinterpolation for noisy data on the sphere

Authors: Shao-Bo Lin, Yu Guang Wang, Ding-Xuan Zhou

Abstract: Problems in astrophysics, space weather research and geophysics usually need to analyze noisy big data on the sphere. This paper develops distributed filtered hyperinterpolation for noisy data on the sphere, which assigns the data fitting task to multiple servers to find a good approximation of the map** of input and output data. For each server, the approximation is a filtered hyperinterpolatio… ▽ More Problems in astrophysics, space weather research and geophysics usually need to analyze noisy big data on the sphere. This paper develops distributed filtered hyperinterpolation for noisy data on the sphere, which assigns the data fitting task to multiple servers to find a good approximation of the map** of input and output data. For each server, the approximation is a filtered hyperinterpolation on the sphere by a small proportion of quadrature nodes. The distributed strategy allows parallel computing for data processing and model selection and thus reduces computational cost for each server while preserves the approximation capability compared to the filtered hyperinterpolation. We prove quantitative relation between the approximation capability of distributed filtered hyperinterpolation and the numbers of input data and servers. Numerical examples show the efficiency and accuracy of the proposed method. △ Less

Submitted 6 October, 2019; originally announced October 2019.

Comments: 26 pages, 4 figures

MSC Class: 68Q32; 65D05; 41A50; 33C55; 65T60

arXiv:1909.11580 [pdf, other]

Haar Graph Pooling

Authors: Yu Guang Wang, Ming Li, Zheng Ma, Guido Montufar, Xiaosheng Zhuang, Yanan Fan

Abstract: Deep Graph Neural Networks (GNNs) are useful models for graph classification and graph-based regression tasks. In these tasks, graph pooling is a critical ingredient by which GNNs adapt to input graphs of varying size and structure. We propose a new graph pooling operation based on compressive Haar transforms -- HaarPooling. HaarPooling implements a cascade of pooling operations; it is computed by… ▽ More Deep Graph Neural Networks (GNNs) are useful models for graph classification and graph-based regression tasks. In these tasks, graph pooling is a critical ingredient by which GNNs adapt to input graphs of varying size and structure. We propose a new graph pooling operation based on compressive Haar transforms -- HaarPooling. HaarPooling implements a cascade of pooling operations; it is computed by following a sequence of clusterings of the input graph. A HaarPooling layer transforms a given input graph to an output graph with a smaller node number and the same feature dimension; the compressive Haar transform filters out fine detail information in the Haar wavelet domain. In this way, all the HaarPooling layers together synthesize the features of any given input graph into a feature vector of uniform size. Such transforms provide a sparse characterization of the data and preserve the structure information of the input graph. GNNs implemented with standard graph convolution layers and HaarPooling layers achieve state of the art performance on diverse graph classification and regression problems. △ Less

Submitted 24 June, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

Comments: 14 pages, 4 figures, 7 tables; Published in ICML2020

MSC Class: 42C40; 05C85; 11Y16 ACM Class: I.2.4; I.2.6

arXiv:1908.00041 [pdf, other]

FaVeST: Fast Vector Spherical Harmonic Transforms

Authors: Quoc T. Le Gia, Ming Li, Yu Guang Wang

Abstract: Vector spherical harmonics on the unit sphere of $\mathbb{R}^3$ have broad applications in geophysics, quantum mechanics and astrophysics. In the representation of a tangent vector field, one needs to evaluate the expansion and the Fourier coefficients of vector spherical harmonics. In this paper, we develop fast algorithms (FaVeST) for vector spherical harmonic transforms on these evaluations. Th… ▽ More Vector spherical harmonics on the unit sphere of $\mathbb{R}^3$ have broad applications in geophysics, quantum mechanics and astrophysics. In the representation of a tangent vector field, one needs to evaluate the expansion and the Fourier coefficients of vector spherical harmonics. In this paper, we develop fast algorithms (FaVeST) for vector spherical harmonic transforms on these evaluations. The forward FaVeST evaluates the Fourier coefficients and has a computational cost proportional to $N\log \sqrt{N}$ for $N$ number of evaluation points. The adjoint FaVeST which evaluates a linear combination of vector spherical harmonics with a degree up to $\sqrt{M}$ for $M$ evaluation points has cost proportional to $M\log\sqrt{M}$. Numerical examples of simulated tangent fields illustrate the accuracy, efficiency and stability of FaVeST. △ Less

Submitted 24 March, 2021; v1 submitted 31 July, 2019; originally announced August 2019.

Comments: 23 pages, 6 figures, 3 tables

MSC Class: 65T50; 37C10; 33C55; 65D30 ACM Class: F.2.1; G.4; G.1.4; G.1.2

arXiv:1907.04786 [pdf, other]

Fast Haar Transforms for Graph Neural Networks

Authors: Ming Li, Zheng Ma, Yu Guang Wang, Xiaosheng Zhuang

Abstract: Graph Neural Networks (GNNs) have become a topic of intense research recently due to their powerful capability in high-dimensional classification and regression tasks for graph-structured data. However, as GNNs typically define the graph convolution by the orthonormal basis for the graph Laplacian, they suffer from high computational cost when the graph size is large. This paper introduces Haar ba… ▽ More Graph Neural Networks (GNNs) have become a topic of intense research recently due to their powerful capability in high-dimensional classification and regression tasks for graph-structured data. However, as GNNs typically define the graph convolution by the orthonormal basis for the graph Laplacian, they suffer from high computational cost when the graph size is large. This paper introduces Haar basis which is a sparse and localized orthonormal system for a coarse-grained chain on graph. The graph convolution under Haar basis, called Haar convolution, can be defined accordingly for GNNs. The sparsity and locality of the Haar basis allow Fast Haar Transforms (FHTs) on graph, by which a fast evaluation of Haar convolution between graph data and filters can be achieved. We conduct experiments on GNNs equipped with Haar convolution, which demonstrates state-of-the-art results on graph-based regression and node classification tasks. △ Less

Submitted 30 October, 2019; v1 submitted 10 July, 2019; originally announced July 2019.

Comments: 24 pages, 5 figures, 3 tables

MSC Class: 42C40; 05C85; 11Y16

arXiv:1701.01595 [pdf, other]

Analysis of Framelet Transforms on a Simplex

Authors: Yu Guang Wang, Houying Zhu

Abstract: In this paper, we construct framelets associated with a sequence of quadrature rules on the simplex $T^{2}$ in $\mathbb{R}^{2}$. We give the framelet transforms -- decomposition and reconstruction of the coefficients for framelets of a function on $T^{2}$. We prove that the reconstruction is exact when the framelets are tight. We give an example of construction of framelets and show that the frame… ▽ More In this paper, we construct framelets associated with a sequence of quadrature rules on the simplex $T^{2}$ in $\mathbb{R}^{2}$. We give the framelet transforms -- decomposition and reconstruction of the coefficients for framelets of a function on $T^{2}$. We prove that the reconstruction is exact when the framelets are tight. We give an example of construction of framelets and show that the framelet transforms can be computed as fast as FFT. △ Less

Submitted 14 September, 2017; v1 submitted 6 January, 2017; originally announced January 2017.

Comments: 14 pages, 4 figures

MSC Class: 42C15; 42C40; 94A08; 65T60; 65T50; 41A55; 58C35; 42B05

Showing 1–37 of 37 results for author: Wang, Y G