Search | arXiv e-print repository

DCI: An Accurate Quality Assessment Criteria for Protein Complex Structure Models

Authors: Wenda Wang, Jiaqi Zhai, He Huang, Xinqi Gong

Abstract: The structure of proteins is the basis for studying protein function and drug design. The emergence of AlphaFold 2 has greatly promoted the prediction of protein 3D structures, and it is of great significance to give an overall and accurate evaluation of the predicted models, especially the complex models. Among the existing methods for evaluating multimer structures, DockQ is the most commonly used. However, as a metric more suited to complex docking, DockQ cannot provide a unique and accurate evaluation in the non-docking situation. Therefore, it is necessary to propose an evaluation strategy that can directly evaluate the whole complex without limitation and achieve good results. In this work, we propose the DCI score, a new evaluation strategy for protein complex structure models that is based only on the distance map and the CI (contact-interface) map. DCI focuses on the prediction accuracy of the contact interface based on the overall evaluation of complex structure, is not inferior to DockQ in evaluation accuracy according to CAPRI classification, and handles the non-docking situation better than DockQ. In addition, we calculated the DCI score on CASP datasets and compared it with the CASP official assessment, obtaining good results. We also found that DCI can better evaluate the overall structure deviation caused by interface prediction errors in the multi-chain case. DCI is available at https://gitee.com/WendaWang/DCI-score.git, and the online server is available at http://mialab.ruc.edu.cn/DCIServer/.

Submitted 29 June, 2024; originally announced July 2024.

arXiv:2406.14307 [pdf, other]

QuST-LLM: Integrating Large Language Models for Comprehensive Spatial Transcriptomics Analysis

Authors: Chao Hui Huang

Abstract: In this paper, we introduce QuST-LLM, an innovative extension of QuPath that utilizes the capabilities of large language models (LLMs) to analyze and interpret spatial transcriptomics (ST) data. In addition to simplifying the intricate and high-dimensional nature of ST data by offering a comprehensive workflow that includes data loading, region selection, gene expression analysis, and functional annotation, QuST-LLM employs LLMs to transform complex ST data into understandable and detailed biological narratives based on gene ontology annotations, thereby significantly improving the interpretability of ST data. Consequently, users can interact with their own ST data using natural language. Hence, QuST-LLM provides researchers with a potent functionality to unravel the spatial and functional complexities of tissues, fostering novel insights and advancements in biomedical research. QuST-LLM is a part of QuST project. The source code is hosted on GitHub and documentation is available at (https://github.com/huangch/qust).

Submitted 1 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

Comments: 12 pages, 7 figures

arXiv:2404.13631 [pdf, other]

Fermi-Bose Machine

Authors: Mingshan Xie, Yuchen Wang, Haiping Huang

Abstract: Distinct from human cognitive processing, deep neural networks trained by backpropagation can be easily fooled by adversarial examples. To design a semantically meaningful representation learning, we discard backpropagation, and instead, propose a local contrastive learning, where the representations for the inputs bearing the same label shrink (akin to boson) in hidden layers, while those of different labels repel (akin to fermion). This layer-wise learning is local in nature, being biologically plausible. A statistical mechanics analysis shows that the target fermion-pair-distance is a key parameter. Moreover, the application of this local contrastive learning to MNIST benchmark dataset demonstrates that the adversarial vulnerability of standard perceptron can be greatly mitigated by tuning the target distance, i.e., controlling the geometric separation of prototype manifolds.

Submitted 21 April, 2024; originally announced April 2024.

Comments: 17 pages, 6 figures, a physics inspired machine without backpropagation and enhanced adversarial robustness

arXiv:2404.11199 [pdf, other]

RiboDiffusion: Tertiary Structure-based RNA Inverse Folding with Generative Diffusion Models

Authors: Han Huang, Ziqian Lin, Dongchen He, Liang Hong, Yu Li

Abstract: RNA design shows growing applications in synthetic biology and therapeutics, driven by the crucial role of RNA in various biological processes. A fundamental challenge is to find functional RNA sequences that satisfy given structural constraints, known as the inverse folding problem. Computational approaches have emerged to address this problem based on secondary structures. However, designing RNA sequences directly from 3D structures is still challenging, due to the scarcity of data, the non-unique structure-sequence mapping, and the flexibility of RNA conformation. In this study, we propose RiboDiffusion, a generative diffusion model for RNA inverse folding that can learn the conditional distribution of RNA sequences given 3D backbone structures. Our model consists of a graph neural network-based structure module and a Transformer-based sequence module, which iteratively transforms random sequences into desired sequences. By tuning the sampling weight, our model allows for a trade-off between sequence recovery and diversity to explore more candidates. We split test sets based on RNA clustering with different cut-offs for sequence or structure similarity. Our model outperforms baselines in sequence recovery, with an average relative improvement of $11\%$ for sequence similarity splits and $16\%$ for structure similarity splits. Moreover, RiboDiffusion performs consistently well across various RNA length categories and RNA types. We also apply in-silico folding to validate whether the generated sequences can fold into the given 3D RNA backbones. Our method could be a powerful tool for RNA design that explores the vast sequence space and finds novel solutions to 3D structural constraints.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 15 pages

arXiv:2403.07920 [pdf, other]

ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training

Authors: Le Zhuo, Zewen Chi, Minghao Xu, Heyan Huang, Heqi Zheng, Conghui He, Xian-Ling Mao, Wentao Zhang

Abstract: We propose ProtLLM, a versatile cross-modal large language model (LLM) for both protein-centric and protein-language tasks. ProtLLM features a unique dynamic protein mounting mechanism, enabling it to handle complex inputs where the natural language text is interspersed with an arbitrary number of proteins. Besides, we propose the protein-as-word language modeling approach to train ProtLLM. By developing a specialized protein vocabulary, we equip the model with the capability to predict not just natural language but also proteins from a vast pool of candidates. Additionally, we construct a large-scale interleaved protein-text dataset, named InterPT, for pre-training. This dataset comprehensively encompasses both (1) structured data sources like protein annotations and (2) unstructured data sources like biological research papers, thereby endowing ProtLLM with crucial knowledge for understanding proteins. We evaluate ProtLLM on classic supervised protein-centric tasks and explore its novel protein-language applications. Experimental results demonstrate that ProtLLM not only achieves superior performance against protein-specialized baselines on protein-centric tasks but also induces zero-shot and in-context learning capabilities on protein-language tasks.

Submitted 27 February, 2024; originally announced March 2024.

Comments: https://protllm.github.io/project/

arXiv:2402.17796 [pdf, other]

An Allosteric Model for the Influence of $\text{H}^+$ and $\text{CO}_2$ on Oxygen-Hemoglobin Binding

Authors: Heming Huang, Charles S. Peskin

Abstract: In the physiology of oxygen-hemoglobin binding, an important role is played by the influence of $\text{H}^+$ and $\text{CO}_2$ on the affinity of hemoglobin for $\text{O}_2$. Here we extend the allosteric model of hemoglobin to include these effects. We assume purely allosteric modulation, i.e., that the modulatory effects of $\text{H}^+$ and $\text{CO}_2$ on oxygen binding occur only because of their influence on the T $\leftrightarrow$ R transition, in which all four subunits of the hemoglobin molecule participate simultaneously. We assume, moreover, that these modulatory influences occur only through the interaction of $\text{H}^+$ and $\text{CO}_2$ with the amino group at the N-terminal of each of the four polypeptide chains of the hemoglobin molecule. We fit the model to experimental data and obtain reasonable agreement with the observed shifts in oxygen-hemoglobin binding that occur when the concentrations of $\text{H}^+$ and $\text{CO}_2$ are changed.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2401.10009 [pdf, other]

An optimization-based equilibrium measure describes non-equilibrium steady state dynamics: application to edge of chaos

Authors: Junbin Qiu, Haiping Huang

Abstract: Understanding neural dynamics is a central topic in machine learning, non-linear physics and neuroscience. However, the dynamics is non-linear, stochastic and particularly non-gradient, i.e., the driving force can not be written as gradient of a potential. These features make analytic studies very challenging. The common tool is the path integral approach or dynamical mean-field theory, but the drawback is that one has to solve the integro-differential or dynamical mean-field equations, which is computationally expensive and has no closed form solutions in general. From the aspect of associated Fokker-Planck equation, the steady state solution is generally unknown. Here, we treat searching for the steady states as an optimization problem, and construct an approximate potential related to the speed of the dynamics, and find that searching for the ground state of this potential is equivalent to running an approximate stochastic gradient dynamics or Langevin dynamics. Only in the zero temperature limit, the distribution of the original steady states can be achieved. The resultant stationary state of the dynamics follows exactly the canonical Boltzmann measure. Within this framework, the quenched disorder intrinsic in the neural networks can be averaged out by applying the replica method, which leads naturally to order parameters for the non-equilibrium steady states. Our theory reproduces the well-known result of edge-of-chaos, and further the order parameters characterizing the continuous transition are derived, and the order parameters are explained as fluctuations and responses of the steady states. Our method thus opens the door to analytically study the steady state landscape of the deterministic or stochastic high dimensional dynamics.

Submitted 7 June, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

Comments: 21 pages, 9 figures, revised version 2

arXiv:2312.17670 [pdf, other]

Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA

Authors: Kaiyuan Yang, Fabio Musio, Yihui Ma, Norman Juchler, Johannes C. Paetzold, Rami Al-Maskari, Luciano Höher, Hongwei Bran Li, Ibrahim Ethem Hamamci, Anjany Sekuboyina, Suprosanna Shit, Houjing Huang, Chinmay Prabhakar, Ezequiel de la Rosa, Diana Waldmannstetter, Florian Kofler, Fernando Navarro, Martin Menten, Ivan Ezhov, Daniel Rueckert, Iris Vos, Ynte Ruigrok, Birgitta Velthuis, Hugo Kuijf, Julien Hämmerli, et al. (59 additional authors not shown)

Abstract: The Circle of Willis (CoW) is an important network of arteries connecting major circulations of the brain. Its vascular architecture is believed to affect the risk, severity, and clinical outcome of serious neuro-vascular diseases. However, characterizing the highly variable CoW anatomy is still a manual and time-consuming expert task. The CoW is usually imaged by two angiographic imaging modalities, magnetic resonance angiography (MRA) and computed tomography angiography (CTA), but there exist limited public datasets with annotations on CoW anatomy, especially for CTA. Therefore we organized the TopCoW Challenge in 2023 with the release of an annotated CoW dataset. The TopCoW dataset was the first public dataset with voxel-level annotations for thirteen possible CoW vessel components, enabled by virtual-reality (VR) technology. It was also the first large dataset with paired MRA and CTA from the same patients. TopCoW challenge formalized the CoW characterization problem as a multiclass anatomical segmentation task with an emphasis on topological metrics. We invited submissions worldwide for the CoW segmentation task, which attracted over 140 registered participants from four continents. The top performing teams managed to segment many CoW components to Dice scores around 90%, but with lower scores for communicating arteries and rare variants. There were also topological mistakes for predictions with high Dice scores. Additional topological analysis revealed further areas for improvement in detecting certain CoW components and matching CoW variant topology accurately. TopCoW represented a first attempt at benchmarking the CoW anatomical segmentation task for MRA and CTA, both morphologically and topologically.

Submitted 29 April, 2024; v1 submitted 29 December, 2023; originally announced December 2023.

Comments: 24 pages, 11 figures, 9 tables. Summary Paper for the MICCAI TopCoW 2023 Challenge

arXiv:2312.03292 [pdf, other]

Enhancing Molecular Property Prediction via Mixture of Collaborative Experts

Authors: Xu Yao, Shuang Liang, Songqiao Han, Hailiang Huang

Abstract: Molecular Property Prediction (MPP) task involves predicting biochemical properties based on molecular features, such as molecular graph structures, contributing to the discovery of lead compounds in drug development. To address data scarcity and imbalance in MPP, some studies have adopted Graph Neural Networks (GNN) as an encoder to extract commonalities from molecular graphs. However, these approaches often use a separate predictor for each task, neglecting the shared characteristics among predictors corresponding to different tasks. In response to this limitation, we introduce the GNN-MoCE architecture. It employs the Mixture of Collaborative Experts (MoCE) as predictors, exploiting task commonalities while confronting the homogeneity issue in the expert pool and the decision dominance dilemma within the expert group. To enhance expert diversity for collaboration among all experts, the Expert-Specific Projection method is proposed to assign a unique projection perspective to each expert. To balance decision-making influence for collaboration within the expert group, the Expert-Specific Loss is presented to integrate individual expert loss into the weighted decision loss of the group for more equitable training. Benefiting from the enhancements of MoCE in expert creation, dynamic expert group formation, and experts' collaboration, our model demonstrates superior performance over traditional methods on 24 MPP datasets, especially in tasks with limited data or high imbalance.

Submitted 6 December, 2023; originally announced December 2023.

Comments: 11 pages, 8 figures

arXiv:2310.14621 [pdf, other]

Spiking mode-based neural networks

Authors: Zhanghan Lin, Haiping Huang

Abstract: Spiking neural networks play an important role in brain-like neuromorphic computations and in studying working mechanisms of neural circuits. One drawback of training a large scale spiking neural network is that updating all weights is quite expensive. Furthermore, after training, all information related to the computational task is hidden in the weight matrix, prohibiting us from a transparent understanding of circuit mechanisms. Therefore, in this work, we address these challenges by proposing a spiking mode-based training protocol, where the recurrent weight matrix is explained as a Hopfield-like multiplication of three matrices: input, output modes and a score matrix. The first advantage is that the weight is interpreted by input and output modes and their associated scores characterizing the importance of each decomposition term. The number of modes is thus adjustable, allowing more degrees of freedom for modeling the experimental data. This significantly reduces the training cost because of significantly reduced space complexity for learning. Training spiking networks is thus carried out in the mode-score space. The second advantage is that one can project the high dimensional neural activity (filtered spike train) in the state space onto the mode space which is typically of a low dimension, e.g., a few modes are sufficient to capture the shape of the underlying neural manifolds. We successfully apply our framework in two computational tasks -- digit classification and selective sensory integration tasks. Our method accelerates the training of spiking neural networks by a Hopfield-like decomposition, and moreover this training leads to low-dimensional attractor structures of high-dimensional neural dynamics.

Submitted 3 June, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

Comments: 29 pages, 9 figures, a significantly revised version

arXiv:2310.00185 [pdf]

CARLA: Adjusted common average referencing for cortico-cortical evoked potential data

Authors: Harvey Huang, Gabriela Ojeda Valencia, Nicholas M. Gregg, Gamaleldin M. Osman, Morgan N. Montoya, Gregory A. Worrell, Kai J. Miller, Dora Hermes

Abstract: Human brain connectivity can be mapped by single pulse electrical stimulation during intracranial EEG measurements. The raw cortico-cortical evoked potentials (CCEP) are often contaminated by noise. Common average referencing (CAR) removes common noise and preserves response shapes but can introduce bias from responsive channels. We address this issue with an adjusted, adaptive CAR algorithm termed "CAR by Least Anticorrelation (CARLA)". CARLA was tested on simulated CCEP data and real CCEP data collected from four human participants. In CARLA, the channels are ordered by increasing mean cross-trial covariance, and iteratively added to the common average until anticorrelation between any single channel and all re-referenced channels reaches a minimum, as a measure of shared noise. We simulated CCEP data with true responses in 0 to 45 of 50 total channels. We quantified CARLA's error and found that it erroneously included 0 (median) truly responsive channels in the common average with less than or equal to 42 responsive channels, and erroneously excluded less than or equal to 2.5 (median) unresponsive channels at all responsiveness levels. On real CCEP data, signal quality was quantified with the mean R-squared between all pairs of channels, which represents inter-channel dependency and is low for well-referenced data. CARLA re-referencing produced significantly lower mean R-squared than standard CAR, CAR using a fixed bottom quartile of channels by covariance, and no re-referencing. CARLA minimizes bias in re-referenced CCEP data by adaptively selecting the optimal subset of non-responsive channels. It showed high specificity and sensitivity on simulated CCEP data and lowered inter-channel dependency compared to CAR on real CCEP data.

Submitted 29 September, 2023; originally announced October 2023.

Comments: 29 pages, 8 main figures, 3 supplemental figures. For associated code, see https://github.com/hharveygit/CARLA_JNM

arXiv:2309.08478 [pdf, other]

Current and future directions in network biology

Authors: Marinka Zitnik, Michelle M. Li, Aydin Wells, Kimberly Glass, Deisy Morselli Gysi, Arjun Krishnan, T. M. Murali, Predrag Radivojac, Sushmita Roy, Anaïs Baudot, Serdar Bozdag, Danny Z. Chen, Lenore Cowen, Kapil Devkota, Anthony Gitter, Sara Gosline, Pengfei Gu, Pietro H. Guzzi, Heng Huang, Meng Jiang, Ziynet Nesibe Kesimoglu, Mehmet Koyuturk, Jian Ma, Alexander R. Pico, Nataša Pržulj, et al. (12 additional authors not shown)

Abstract: Network biology is an interdisciplinary field bridging computational and biological sciences that has proved pivotal in advancing the understanding of cellular functions and diseases across biological systems and scales. Although the field has been around for two decades, it remains nascent. It has witnessed rapid evolution, accompanied by emerging challenges. These challenges stem from various factors, notably the growing complexity and volume of data together with the increased diversity of data types describing different tiers of biological organization. We discuss prevailing research directions in network biology and highlight areas of inference and comparison of biological networks, multimodal data integration and heterogeneous networks, higher-order network analysis, machine learning on networks, and network-based personalized medicine. Following the overview of recent breakthroughs across these five areas, we offer a perspective on the future directions of network biology. Additionally, we offer insights into scientific communities, educational initiatives, and the importance of fostering diversity within the field. This paper establishes a roadmap for an immediate and long-term vision for network biology.

Submitted 11 June, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: 52 pages, 6 figures, 1 table

arXiv:2309.04106 [pdf, other]

doi 10.1103/PhysRevE.109.044309

Meta predictive learning model of languages in neural circuits

Authors: Chan Li, Junbin Qiu, Haiping Huang

Abstract: Large language models based on self-attention mechanisms have achieved astonishing performances not only in natural language itself, but also in a variety of tasks of different nature. However, regarding processing language, our human brain may not operate using the same principle. Then, a debate is established on the connection between brain computation and artificial self-supervision adopted in large language models. One of the most influential hypotheses in brain computation is the predictive coding framework, which proposes to minimize the prediction error by local learning. However, the role of predictive coding and the associated credit assignment in language processing remains unknown. Here, we propose a mean-field learning model within the predictive coding framework, assuming that the synaptic weight of each connection follows a spike and slab distribution, and only the distribution, rather than specific weights, is trained. This meta predictive learning is successfully validated on classifying handwritten digits where pixels are input to the network in sequence, and moreover on the toy and real language corpus. Our model reveals that most of the connections become deterministic after learning, while the output connections have a higher level of variability. The performance of the resulting network ensemble changes continuously with data load, further improving with more training data, in analogy with the emergent behavior of large language models. Therefore, our model provides a starting point to investigate the connection among brain computation, next-token prediction and general intelligence.

Submitted 9 October, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

Comments: 31 pages, 6 figures, codes are available in the main text with the link

Journal ref: Phys. Rev. E 109, 044309 (2024)

arXiv:2306.11232 [pdf, other]

Eight challenges in developing theory of intelligence

Authors: Haiping Huang

Abstract: A good theory of mathematical beauty is more practical than any current observation, as new predictions of physical reality can be verified self-consistently. This belief applies to the current status of understanding deep neural networks including large language models and even the biological intelligence. Toy models provide a metaphor of physical reality, allowing mathematically formulating that reality (i.e., the so-called theory), which can be updated as more conjectures are justified or refuted. One does not need to pack all details into a model, but rather, more abstract models are constructed, as complex systems like brains or deep networks have many sloppy dimensions but much less stiff dimensions that strongly impact macroscopic observables. This kind of bottom-up mechanistic modeling is still promising in the modern era of understanding the natural or artificial intelligence. Here, we shed light on eight challenges in developing theory of intelligence following this theoretical paradigm. These challenges are representation learning, generalization, adversarial robustness, continual learning, causal learning, internal model of the brain, next-token prediction, and finally the mechanics of subjective experience.

Submitted 21 June, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

Comments: 24 pages, 131 references, revised version to journal

arXiv:2305.16222 [pdf, ps, other]

Incomplete Multimodal Learning for Complex Brain Disorders Prediction

Authors: Reza Shirkavand, Liang Zhan, Heng Huang, Li Shen, Paul M. Thompson

Abstract: Recent advancements in the acquisition of various brain data sources have created new opportunities for integrating multimodal brain data to assist in early detection of complex brain disorders. However, current data integration approaches typically need a complete set of biomedical data modalities, which may not always be feasible, as some modalities are only available in large-scale research cohorts and are prohibitive to collect in routine clinical practice. Especially in studies of brain diseases, research cohorts may include both neuroimaging data and genetic data, but for practical clinical diagnosis, we often need to make disease predictions only based on neuroimages. As a result, it is desired to design machine learning models which can use all available data (different data could provide complementary information) during training but conduct inference using only the most common data modality. We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks to effectively exploit auxiliary modalities available during training in order to improve the performance of a unimodal model at inference. We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. Experimental results demonstrate that our approach outperforms the related machine learning and deep learning methods by a significant margin.

Submitted 25 May, 2023; originally announced May 2023.

arXiv:2305.12347 [pdf, other]

Learning Joint 2D & 3D Diffusion Models for Complete Molecule Generation

Authors: Han Huang, Leilei Sun, Bowen Du, Weifeng Lv

Abstract: Designing new molecules is essential for drug discovery and material science. Recently, deep generative models that aim to model molecule distribution have made promising progress in narrowing down the chemical research space and generating high-fidelity molecules. However, current generative models only focus on modeling either 2D bonding graphs or 3D geometries, which are two complementary descriptors for molecules. The lack of ability to jointly model both limits the improvement of generation quality and further downstream applications. In this paper, we propose a new joint 2D and 3D diffusion model (JODO) that generates complete molecules with atom types, formal charges, bond information, and 3D coordinates. To capture the correlation between molecular graphs and geometries in the diffusion process, we develop a Diffusion Graph Transformer to parameterize the data prediction model that recovers the original data from noisy data. The Diffusion Graph Transformer interacts node and edge representations based on our relational attention mechanism, while simultaneously propagating and updating scalar features and geometric vectors. Our model can also be extended for inverse molecular design targeting single or multiple quantum properties. In our comprehensive evaluation pipeline for unconditional joint generation, the results of the experiment show that JODO remarkably outperforms the baselines on the QM9 and GEOM-Drugs datasets. Furthermore, our model excels in few-step fast sampling, as well as in inverse molecule design and molecular graph generation.
Our code is provided in https://github.com/GRAPH-0/JODO.

Submitted 4 June, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

arXiv:2305.08459 [pdf, other]

doi 10.21468/SciPostPhysLectNotes.79

Introduction to dynamical mean-field theory of randomly connected neural networks with bidirectionally correlated couplings

Authors: Wenxuan Zou, Haiping Huang

Abstract: Dynamical mean-field theory is a powerful physics tool used to analyze the typical behavior of neural networks, where neurons can be recurrently connected, or multiple layers of neurons can be stacked. However, it is not easy for beginners to access the essence of this tool and the underlying physics. Here, we give a pedagogical introduction of this method in a particular example of random neural networks, where neurons are randomly and fully connected by correlated synapses and therefore the network exhibits rich emergent collective dynamics. We also review related past and recent important works applying this tool. In addition, a physically transparent and alternative method, namely the dynamical cavity method, is also introduced to derive exactly the same results. The numerical implementation of solving the integro-differential mean-field equations is also detailed, with an illustration of exploring the fluctuation dissipation theorem.

Submitted 7 October, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

Comments: 27 pages, 5 figures, 44 references, revised version for SciPost Physics Lecture Notes

Journal ref: SciPost Phys. Lect. Notes 79 (2024)

arXiv:2304.01347 [pdf]

Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Diagnosis and Lateralization Analysis

Authors: Cheng Zhu, Ying Tan, Shuqi Yang, Jiaqing Miao, Jiayi Zhu, Huan Huang, Dezhong Yao, Cheng Luo

Abstract: The available evidence suggests that dynamic functional connectivity (dFC) can capture time-varying abnormalities in brain activity in resting-state cerebral functional magnetic resonance imaging (rs-fMRI) data and has a natural advantage in uncovering mechanisms of abnormal brain activity in schizophrenia (SZ) patients. Hence, an advanced dynamic brain network analysis model called the temporal brain category graph convolutional network (Temporal-BCGCN) was employed. Firstly, a unique dynamic brain network analysis module, DSF-BrainNet, was designed to construct dynamic synchronization features. Subsequently, a revolutionary graph convolution method, TemporalConv, was proposed, based on the synchronous temporal properties of feature. Finally, the first modular abnormal hemispherical lateralization test tool in deep learning based on rs-fMRI data, named CategoryPool, was proposed. This study was validated on COBRE and UCLA datasets and achieved 83.62% and 89.71% average accuracies, respectively, outperforming the baseline model and other state-of-the-art methods. The ablation results also demonstrate the advantages of TemporalConv over the traditional edge feature graph convolution approach and the improvement of CategoryPool over the classical graph pooling approach. Interestingly, this study showed that the lower order perceptual system and higher order network regions in the left hemisphere are more severely dysfunctional than in the right hemisphere in SZ and reaffirms the importance of the left medial superior frontal gyrus in SZ.
Our core code is available at: https://github.com/swfen/Temporal-BCGCN.

Submitted 11 September, 2023; v1 submitted 30 March, 2023; originally announced April 2023.

arXiv:2302.08062 [pdf]

doi 10.1111/2041-210X.14229

Fossil Image Identification using Deep Learning Ensembles of Data Augmented Multiviews

Authors: Chengbin Hou, Xinyu Lin, Hanhui Huang, Sheng Xu, Junxuan Fan, Yukun Shi, Hairong Lv

Abstract: Identification of fossil species is crucial to evolutionary studies. Recent advances from deep learning have shown promising prospects in fossil image identification. However, the quantity and quality of labeled fossil images are often limited due to fossil preservation, conditioned sampling, and expensive and inconsistent label annotation by domain experts, which pose great challenges to training deep learning based image classification models. To address these challenges, we follow the idea of the wisdom of crowds and propose a multiview ensemble framework, which collects Original (O), Gray (G), and Skeleton (S) views of each fossil image reflecting its different characteristics to train multiple base models, and then makes the final decision via soft voting. Experiments on the largest fusulinid dataset with 2400 images show that the proposed OGS consistently outperforms baselines (using a single model for each view), and obtains superior or comparable performance compared to OOO (using three base models for three the same Original views). Besides, as the training data decreases, the proposed framework achieves more gains. While considering the identification consistency estimation with respect to human experts, OGS receives the highest agreement with the original labels of dataset and with the re-identifications of two human experts. The validation performance provides a quantitative estimation of consistency across different experts and genera.
We conclude that the proposed framework can present state-of-the-art performance in the fusulinid fossil identification case study. This framework is designed for general fossil identification and it is expected to see applications to other fossil datasets in future work. The source code is publicly available at https://github.com/houchengbin/Fossil-Image-Identification to benefit future research in fossil image identification.

Submitted 1 February, 2024; v1 submitted 15 February, 2023; originally announced February 2023.

Comments: published in Methods in Ecology and Evolution

Journal ref: Methods in Ecology and Evolution, 14, 3020-3034 (2023)

arXiv:2301.00427 [pdf, other]

Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation

Authors: Han Huang, Leilei Sun, Bowen Du, Weifeng Lv

Abstract: Learning the underlying distribution of molecular graphs and generating high-fidelity samples is a fundamental research problem in drug discovery and material science. However, accurately modeling distribution and rapidly generating novel molecular graphs remain crucial and challenging goals. To accomplish these goals, we propose a novel Conditional Diffusion model based on discrete Graph Structures (CDGS) for molecular graph generation. Specifically, we construct a forward graph diffusion process on both graph structures and inherent features through stochastic differential equations (SDE) and derive discrete graph structures as the condition for reverse generative processes. We present a specialized hybrid graph noise prediction model that extracts the global context and the local node-edge dependency from intermediate graph states. We further utilize ordinary differential equation (ODE) solvers for efficient graph sampling, based on the semi-linear structure of the probability flow ODE. Experiments on diverse datasets validate the effectiveness of our framework. Particularly, the proposed method still generates high-quality molecular graphs in a limited number of steps. Our code is provided in https://github.com/GRAPH-0/CDGS.

Submitted 23 May, 2023; v1 submitted 1 January, 2023; originally announced January 2023.

Comments: Accepted by AAAI 2023

arXiv:2212.02846 [pdf, other]

doi 10.1103/PhysRevE.108.014309

Statistical mechanics of continual learning: variational principle and mean-field potential

Authors: Chan Li, Zhenye Huang, Wenxuan Zou, Haiping Huang

Abstract: An obstacle to artificial general intelligence is set by continual learning of multiple tasks of different nature. Recently, various heuristic tricks, both from machine learning and from neuroscience angles, were proposed, but they lack a unified theory ground. Here, we focus on continual learning in single-layered and multi-layered neural networks of binary weights. A variational Bayesian learning setting is thus proposed, where the neural networks are trained in a field-space, rather than gradient-ill-defined discrete-weight space, and furthermore, weight uncertainty is naturally incorporated, and modulates synaptic resources among tasks. From a physics perspective, we translate the variational continual learning into Franz-Parisi thermodynamic potential framework, where previous task knowledge acts as a prior and a reference as well. We thus interpret the continual learning of the binary perceptron in a teacher-student setting as a Franz-Parisi potential computation. The learning performance can then be analytically studied with mean-field order parameters, whose predictions coincide with numerical experiments using stochastic gradient descent methods. Based on the variational principle and Gaussian field approximation of internal preactivations in hidden layers, we also derive the learning algorithm considering weight uncertainty, which solves the continual learning with binary weights using multi-layered neural networks, and performs better than the currently available metaplasticity algorithm.
Our proposed principled frameworks also connect to elastic weight consolidation, weight-uncertainty modulated learning, and neuroscience inspired metaplasticity, providing a theory-grounded method for the real-world multi-task learning with deep networks.

Submitted 20 June, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

Comments: 48 pages, 8 figures, final version to Phys Rev E

Journal ref: Phys. Rev. E 108, 014309 (2023)

arXiv:2211.14939 [pdf, other]

doi 10.1016/j.physa.2022.128395

Applying Deep Reinforcement Learning to the HP Model for Protein Structure Prediction

Authors: Kaiyuan Yang, Houjing Huang, Olafs Vandans, Adithya Murali, Fujia Tian, Roland H. C. Yap, Liang Dai

Abstract: A central problem in computational biophysics is protein structure prediction, i.e., finding the optimal folding of a given amino acid sequence. This problem has been studied in a classical abstract model, the HP model, where the protein is modeled as a sequence of H (hydrophobic) and P (polar) amino acids on a lattice. The objective is to find conformations maximizing H-H contacts. It is known that even in this reduced setting, the problem is intractable (NP-hard). In this work, we apply deep reinforcement learning (DRL) to the two-dimensional HP model. We can obtain the conformations of best known energies for benchmark HP sequences with lengths from 20 to 50. Our DRL is based on a deep Q-network (DQN). We find that a DQN based on long short-term memory (LSTM) architecture greatly enhances the RL learning ability and significantly improves the search process. DRL can sample the state space efficiently, without the need of manual heuristics. Experimentally we show that it can find multiple distinct best-known solutions per trial. This study demonstrates the effectiveness of deep reinforcement learning in the HP model for protein folding.

Submitted 9 December, 2022; v1 submitted 27 November, 2022; originally announced November 2022.

Comments: Published at Physica A: Statistical Mechanics and its Applications, available online 7 December 2022. Extended abstract accepted by the Machine Learning and the Physical Sciences workshop, NeurIPS 2022

arXiv:2209.13371 [pdf]

Recommendations and guidelines from the ISMRM Diffusion Study Group for preclinical diffusion MRI: Part 2 -- Ex vivo imaging

Authors: Kurt G Schilling, Francesco Grussu, Andrada Ianus, Brian Hansen, Rachel L C Barrett, Manisha Aggarwal, Stijn Michielse, Fatima Nasrallah, Warda Syeda, Nian Wang, Jelle Veraart, Alard Roebroeck, Andrew F Bagdasarian, Cornelius Eichner, Farshid Sepehrband, Jan Zimmermann, Lucas Soustelle, Christien Bowman, Benjamin C Tendler, Andreea Hertanu, Ben Jeurissen, Lucio Frydman, Yohan van de Looij, David Hike, Jeff F Dunn, et al. (31 additional authors not shown)

Abstract: The value of preclinical diffusion MRI (dMRI) is substantial. While dMRI enables in vivo non-invasive characterization of tissue, ex vivo dMRI is increasingly being used to probe tissue microstructure and brain connectivity. Ex vivo dMRI has several experimental advantages including higher signal-to-noise ratio and spatial resolution compared to in vivo studies, and more advanced diffusion contrasts for improved microstructure and connectivity characterization. Another major advantage is direct comparison with histological data as a crucial methodological validation. However, there are a number of considerations that must be made when performing ex vivo experiments. The steps from tissue preparation, image acquisition and processing, and interpretation of results are complex, with many decisions that not only differ dramatically from in vivo imaging, but ultimately affect what questions can be answered using the data. This work represents 'Part 2' of a series of recommendations and considerations for preclinical dMRI, where we focus on best practices for dMRI of ex vivo tissue. We first describe the value that ex vivo imaging adds to the field of dMRI, followed by general considerations and foundational knowledge that must be considered when designing experiments. We then give guidelines for ex vivo protocols, including tissue preparation, imaging sequences and data processing including pre-processing, model-fitting, and tractography.
Finally, we provide an online resource which lists publicly available ex vivo dMRI datasets and dedicated software packages. In each section, we attempt to provide guidelines and recommendations, but also highlight areas for which no guidelines exist, and where future work should lie. An overarching goal herein is to enhance the rigor and reproducibility of ex vivo dMRI acquisitions and analyses, and thereby advance biomedical knowledge.

Submitted 7 February, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

Comments: 59 pages, 12 figures, part of ongoing efforts on ISMRM Diffusion Study Group initiative 'Best Practices (Consensus) for diffusion MRI'. arXiv admin note: text overlap with arXiv:2209.12994

arXiv:2209.12994 [pdf]

Recommendations and guidelines from the ISMRM Diffusion Study Group for preclinical diffusion MRI: Part 1 -- In vivo small-animal imaging

Authors: Ileana O Jelescu, Francesco Grussu, Andrada Ianus, Brian Hansen, Rachel L C Barrett, Manisha Aggarwal, Stijn Michielse, Fatima Nasrallah, Warda Syeda, Nian Wang, Jelle Veraart, Alard Roebroeck, Andrew F Bagdasarian, Cornelius Eichner, Farshid Sepehrband, Jan Zimmermann, Lucas Soustelle, Christien Bowman, Benjamin C Tendler, Andreea Hertanu, Ben Jeurissen, Marleen Verhoye, Lucio Frydman, Yohan van de Looij, David Hike, et al. (32 additional authors not shown)

Abstract: The value of in vivo preclinical diffusion MRI (dMRI) is substantial. Small-animal dMRI has been used for methodological development and validation, characterizing the biological basis of diffusion phenomena, and comparative anatomy. Many of the influential works in this field were first performed in small animals or ex vivo samples. The steps from animal setup and monitoring, to acquisition, analysis, and interpretation are complex, with many decisions that may ultimately affect what questions can be answered using the data. This work aims to serve as a reference, presenting selected recommendations and guidelines from the diffusion community, on best practices for preclinical dMRI of in vivo animals. In each section, we also highlight areas for which no guidelines exist (and why), and where future work should focus. We first describe the value that small animal imaging adds to the field of dMRI, followed by general considerations and foundational knowledge that must be considered when designing experiments. We briefly describe differences in animal species and disease models and discuss how they are appropriate for different studies. We then give guidelines for in vivo acquisition protocols, including decisions on hardware, animal preparation, imaging sequences and data processing, including pre-processing, model-fitting, and tractography. Finally, we provide an online resource which lists publicly available preclinical dMRI datasets and software packages, to promote responsible and reproducible research.
An overarching goal herein is to enhance the rigor and reproducibility of small animal dMRI acquisitions and analyses, and thereby advance biomedical knowledge.

Submitted 21 April, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

Comments: 69 pages, 6 figures, 1 table

arXiv:2208.11411 [pdf, other]

doi 10.1103/PhysRevResearch.5.013090

Spectrum of non-Hermitian deep-Hebbian neural networks

Authors: Zijian Jiang, Ziming Chen, Tianqi Hou, Haiping Huang

Abstract: Neural networks with recurrent asymmetric couplings are important to understand how episodic memories are encoded in the brain. Here, we integrate the experimental observation of wide synaptic integration window into our model of sequence retrieval in the continuous time dynamics. The model with non-normal neuron-interactions is theoretically studied by deriving a random matrix theory of the Jacobian matrix in neural dynamics. The spectra bears several distinct features, such as breaking rotational symmetry about the origin, and the emergence of nested voids within the spectrum boundary. The spectral density is thus highly non-uniformly distributed in the complex plane. The random matrix theory also predicts a transition to chaos. In particular, the edge of chaos provides computational benefits for the sequential retrieval of memories. Our work provides a systematic study of time-lagged correlations with arbitrary time delays, and thus can inspire future studies of a broad class of memory models, and even big data analysis of biological time series.

Submitted 16 January, 2023; v1 submitted 24 August, 2022; originally announced August 2022.

Comments: 65 pages, 12 figures, revised version for publication

Journal ref: Phys. Rev. Research 5, 013090 (2023)

arXiv:2208.09859 [pdf, other]

doi 10.1103/PhysRevResearch.5.L022011

Emergence of hierarchical modes from deep learning

Authors: Chan Li, Haiping Huang

Abstract: Large-scale deep neural networks consume expensive training costs, but the training results in less-interpretable weight matrices constructing the networks. Here, we propose a mode decomposition learning that can interpret the weight matrices as a hierarchy of latent modes. These modes are akin to patterns in physics studies of memory networks, but the least number of modes increases only logarithmically with the network width, and becomes even a constant when the width further grows. The mode decomposition learning not only saves a significant large amount of training costs, but also explains the network performance with the leading modes, displaying a striking piecewise power-law behavior. The modes specify a progressively compact latent space across the network hierarchy, making a more disentangled subspaces compared to standard training. Our mode decomposition learning is also studied in an analytic on-line learning setting, which reveals multi-stage of learning dynamics with a continuous specialization of hidden nodes. Therefore, the proposed mode decomposition learning points to a cheap and interpretable route towards the magical deep learning.

Submitted 27 February, 2023; v1 submitted 21 August, 2022; originally announced August 2022.

Comments: 5 pages +11 pages (SM), 4+10 figures, revised version to the journal

Journal ref: Phys. Rev. Research 5, L022011 (2023)

arXiv:2208.04944 [pdf]

doi 10.1021/acs.jcim.2c01180

Bridging the gap between target-based and cell-based drug discovery with a graph generative multi-task model

Authors: Fan Hu, Dongqi Wang, Huazhen Huang, Yishen Hu, Peng Yin

Abstract: Drug discovery is vitally important for protecting human against disease. Target-based screening is one of the most popular methods to develop new drugs in the past several decades. This method efficiently screens candidate drugs inhibiting target protein in vitro, but it often fails due to inadequate activity of the selected drugs in vivo. Accurate computational methods are needed to bridge this gap. Here, we propose a novel graph multi task deep learning model to identify compounds carrying both target inhibitory and cell active (MATIC) properties. On a carefully curated SARS-CoV-2 dataset, the proposed MATIC model shows advantages comparing with traditional method in screening effective compounds in vivo. Next, we explored the model interpretability and found that the learned features for target inhibition (in vitro) or cell active (in vivo) tasks are different with molecular property correlations and atom functional attentions. Based on these findings, we utilized a monte carlo based reinforcement learning generative model to generate novel multi-property compounds with both in vitro and in vivo efficacy, thus bridging the gap between target-based and cell-based drug discovery.

Submitted 8 August, 2022; originally announced August 2022.

Journal ref: Journal of Chemical Information and Modeling, 2022

arXiv:2207.07650 [pdf, other]

Contrastive Brain Network Learning via Hierarchical Signed Graph Pooling Model

Authors: Haoteng Tang, Guixiang Ma, Lei Guo, Xiyao Fu, Heng Huang, Liang Zhang

Abstract: Recently brain networks have been widely adopted to study brain dynamics, brain development and brain diseases. Graph representation learning techniques on brain functional networks can facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases. However, current graph learning techniques have several issues on brain network mining. Firstly, most current graph learning models are designed for unsigned graph, which hinders the analysis of many signed network data (e.g., brain functional networks). Meanwhile, the insufficiency of brain network data limits the model performance on clinical phenotypes predictions. Moreover, few of current graph learning model is interpretable, which may not be capable to provide biological insights for model outcomes. Here, we propose an interpretable hierarchical signed graph representation learning model to extract graph-level representations from brain functional networks, which can be used for different prediction tasks. In order to further improve the model performance, we also propose a new strategy to augment functional brain network data for contrastive learning. We evaluate this framework on different classification and regression tasks using the data from HCP and OASIS. Our results from extensive experiments demonstrate the superiority of the proposed model compared to several state-of-the-art techniques. Additionally, we use graph saliency maps, derived from these prediction tasks, to demonstrate detection and interpretation of phenotypic biomarkers.

Submitted 14 July, 2022; originally announced July 2022.

arXiv:2207.01813 [pdf, other]

Stochastic Variational Methods in Generalized Hidden Semi-Markov Models to Characterize Functionality in Random Heteropolymers

Authors: Yun Zhou, Boying Gong, Tao Jiang, Ting Xu, Haiyan Huang

Abstract: Recent years have seen substantial advances in the development of biofunctional materials using synthetic polymers. The growing problem of elusive sequence-functionality relations for most biomaterials has driven researchers to seek more effective tools and analysis methods. In this study, statistical models are used to study sequence features of the recently reported random heteropolymers (RHP), which transport protons across lipid bilayers selectively and rapidly like natural proton channels. We utilized the probabilistic graphical model framework and developed a generalized hidden semi-Markov model (GHSMM-RHP) to extract the function-determining sequence features, including the transmembrane segments within a chain and the sequence heterogeneity among different chains. We developed stochastic variational methods for efficient inference on parameter estimation and predictions, and empirically studied their computational performance from a comparative perspective on Bayesian (i.e., stochastic variational Bayes) versus frequentist (i.e., stochastic variational expectation-maximization) frameworks that have been studied separately before. The real data results agree well with the laboratory experiments, and suggest GHSMM-RHP's potential in predicting protein-like behavior at the polymer-chain level.

Submitted 5 July, 2022; originally announced July 2022.

arXiv:2206.06035 [pdf, other]

doi 10.1016/j.cag.2022.07.005

SHREC 2022: Protein-ligand binding site recognition

Authors: Luca Gagliardi, Andrea Raffo, Ulderico Fugacci, Silvia Biasotti, Walter Rocchia, Hao Huang, Boulbaba Ben Amor, Yi Fang, Yuanyuan Zhang, Xiao Wang, Charles Christoffer, Daisuke Kihara, Apostolos Axenopoulos, Stelios Mylonas, Petros Daras

Abstract: This paper presents the methods that have participated in the SHREC 2022 contest on protein-ligand binding site recognition. The prediction of protein-ligand binding regions is an active research domain in computational biophysics and structural biology and plays a relevant role for molecular docking and drug design. The goal of the contest is to assess the effectiveness of computational methods in recognizing ligand binding sites in a protein based on its geometrical structure. Performances of the segmentation algorithms are analyzed according to two evaluation scores describing the capacity of a putative pocket to contact a ligand and to pinpoint the correct binding region. Although some methods perform remarkably well, we show that simple non-machine-learning approaches remain very competitive against data-driven algorithms. In general, the task of pocket detection remains a challenging learning problem which suffers from intrinsic difficulties due to the lack of negative examples (data imbalance problem).

Submitted 24 August, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

Journal ref: Computers & Graphics 107 (2022) 20-31

arXiv:2205.07854 [pdf, other]

Functional2Structural: Cross-Modality Brain Networks Representation Learning

Authors: Haoteng Tang, Xiyao Fu, Lei Guo, Yalin Wang, Scott Mackin, Olusola Ajilore, Alex Leow, Paul Thompson, Heng Huang, Liang Zhan

Abstract: MRI-based modeling of brain networks has been widely used to understand functional and structural interactions and connections among brain regions, and factors that affect them, such as brain development and disease. Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases. Since brain networks derived from functional and structural MRI describe the brain topology from different perspectives, exploring a representation that combines these cross-modality brain networks is non-trivial. Most current studies aim to extract a fused representation of the two types of brain network by projecting the structural network to the functional counterpart. Since the functional network is dynamic and the structural network is static, mapping a static object to a dynamic object is suboptimal. However, mapping in the opposite direction is not feasible due to the non-negativity requirement of current graph learning techniques. Here, we propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder that, from an opposite perspective, learns the cross-modality representations by projecting the functional network to the structural counterpart. We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets (HCP and OASIS). The experimental results clearly demonstrate the advantages of our model compared to several state-of-the-art methods.

Submitted 5 May, 2022; originally announced May 2022.

arXiv:2110.14329 [pdf]

doi 10.1186/s13059-021-02544-3

Feature selection revisited in the single-cell era

Authors: Pengyi Yang, Hao Huang, Chunlei Liu

Abstract: Feature selection techniques are essential for high-dimensional data analysis. In the last two decades, their popularity has been fuelled by the increasing availability of high-throughput biomolecular data where high-dimensionality is a common data property. Recent advances in biotechnologies enable global profiling of various molecular and cellular features at single-cell resolution, resulting in large-scale datasets with increased complexity. These technological developments have led to a resurgence in feature selection research and application in the single-cell field. Here, we revisit feature selection techniques and summarise recent developments. We review their versatile application to a range of single-cell data types including those generated from traditional cytometry and imaging technologies and the latest array of single-cell omics technologies. We highlight some of the challenges and future directions on which feature selection could have a significant impact. Finally, we consider the scalability and make general recommendations on the utility of each type of feature selection method. We hope this review serves as a reference point to stimulate future research and application of feature selection in the single-cell era.

Submitted 27 October, 2021; originally announced October 2021.

Journal ref: Genome Biology 22, 321 (2021)

arXiv:2103.14324 [pdf, other]

doi 10.1103/PhysRevE.104.064307

Eigenvalue spectrum of neural networks with arbitrary Hebbian length

Authors: Jianwen Zhou, Zijian Jiang, Tianqi Hou, Ziming Chen, K Y Michael Wong, Haiping Huang

Abstract: Associative memory is a fundamental function in the brain. Here, we generalize the standard associative memory model to include long-range Hebbian interactions at the learning stage, corresponding to a large synaptic integration window. In our model, the Hebbian length can be arbitrarily large. The spectral density of the coupling matrix is derived using the replica method, which is also shown to be consistent with the results obtained by applying the free probability method. The maximal eigenvalue is then obtained by an iterative equation, related to the paramagnetic to spin glass transition in the model. Altogether, this work establishes the connection between the associative memory with arbitrary Hebbian length and the asymptotic eigen-spectrum of the neural-coupling matrix.

Submitted 26 March, 2021; originally announced March 2021.

Comments: 19 pages, 4 figures

Journal ref: Physical Review E 104, 064307 (2021)

arXiv:2103.14317 [pdf, other]

doi 10.1103/PhysRevE.104.064306

Associative memory model with arbitrary Hebbian length

Authors: Zijian Jiang, Jianwen Zhou, Tianqi Hou, K. Y. Michael Wong, Haiping Huang

Abstract: Conversion of temporal to spatial correlations in the cortex is one of the most intriguing functions in the brain. The learning at synapses triggering the correlation conversion can take place in a wide integration window, whose influence on the correlation conversion remains elusive. Here, we propose a generalized associative memory model with arbitrary Hebbian length. The model can be analytically solved, and predicts that a small Hebbian length can already significantly enhance the correlation conversion, i.e., the stimulus-induced attractor can be highly correlated with a significant number of patterns in the stored sequence, thereby facilitating state transitions in the neural representation space. Moreover, an anti-Hebbian component is able to reshape the energy landscape of memories, akin to the function of sleep. Our work thus establishes the fundamental connection between associative memory, Hebbian length, and correlation conversion in the brain.

Submitted 26 March, 2021; originally announced March 2021.

Comments: 6 pages, 2 figures

Journal ref: Physical Review E 104, 064306 (2021)

arXiv:2103.11075 [pdf, other]

doi 10.1007/s10441-022-09442-6

Limited Cognitive Abilities and Dominance Hierarchy

Authors: Hanyuan Huang, Jiabin Wu

Abstract: We propose a novel model to explain the mechanisms underlying dominance hierarchical structures. Guided by a predetermined social convention, agents with limited cognitive abilities optimize their strategies in a Hawk-Dove game. We find that several commonly observed hierarchical structures in nature, such as linear hierarchy and despotism, emerge as the total fitness-maximizing social structures given different levels of cognitive abilities.

Submitted 18 June, 2022; v1 submitted 19 March, 2021; originally announced March 2021.

Comments: 26 pages, 17 figures

Journal ref: Acta Biotheoretica 2022, 70: 17

arXiv:2102.03740 [pdf, other]

doi 10.1103/PhysRevE.107.024307

Ensemble perspective for understanding temporal credit assignment

Authors: Wenxuan Zou, Chan Li, Haiping Huang

Abstract: Recurrent neural networks are widely used for modeling spatio-temporal sequences in both natural language processing and neural population dynamics. However, understanding the temporal credit assignment is hard. Here, we propose that each individual connection in the recurrent computation is modeled by a spike and slab distribution, rather than a precise weight value. We then derive the mean-field algorithm to train the network at the ensemble level. The method is then applied to classify handwritten digits when pixels are read in sequence, and to the multisensory integration task that is a fundamental cognitive function of animals. Our model reveals important connections that determine the overall performance of the network. The model also shows how spatio-temporal information is processed through the hyperparameters of the distribution, and moreover reveals distinct types of emergent neural selectivity. To provide a mechanistic analysis of the ensemble learning, we first derive an analytic solution of the learning at the infinitely-large-network limit. We then carry out a low-dimensional projection of both neural and synaptic dynamics, analyze symmetry breaking in the parameter space, and finally demonstrate the role of stochastic plasticity in the recurrent computation. Therefore, our study sheds light on mechanisms of how weight uncertainty impacts the temporal credit assignment in recurrent neural networks from the ensemble perspective.

Submitted 7 March, 2022; v1 submitted 7 February, 2021; originally announced February 2021.

Comments: 61 pages, 33 figures, revised version to journal

Journal ref: Phys. Rev. E 107, 024307 (2023)

arXiv:2012.03303 [pdf, other]

doi 10.1016/j.bpj.2021.06.020

A Tridomain Model for Potassium Clearance in Optic Nerve of Necturus

Authors: Yi Zhu, Shixin Xu, Robert S. Eisenberg, Huaxiong Huang

Abstract: The accumulation of potassium in the narrow space outside nerve cells is a classical subject of biophysics that has received much attention recently. It may be involved in potassium accumulation including spreading depression, perhaps migraine and some kinds of epilepsy, even (speculatively) learning. Quantitative analysis is likely to help evaluate the role of potassium clearance from the extracellular space after a train of action potentials. Clearance involves three structures that extend down the length of the nerve: glia, extracellular space, and axon, and so need to be described as systems distributed in space in the tradition used for electrical potential in the 'cable equations' of nerve since the work of Hodgkin in 1937. A three-compartment model is proposed here for the optic nerve and is used to study the accumulation of potassium and its clearance. The model allows the convection, diffusion, and electrical migration of water and ions. We depend on the data of Orkand et al. to ensure the relevance of our model and align its parameters with the anatomy and properties of membranes, channels, and transporters: our model fits their experimental data quite well. The aligned model shows that glia has an important role in buffering potassium, as expected. The model shows that potassium is cleared mostly by convective flow through the syncytia of glia driven by osmotic pressure differences. A simplified model might be possible, but it must involve flow down the length of the optic nerve. It is easy for compartment models to neglect this flow. Our model can be used for structures quite different from the optic nerve that might have different distributions of channels and transporters in its three compartments. It can be generalized to include a fourth (distributed) compartment representing blood vessels to deal with the glymphatic flow into the circulatory system.

Submitted 16 May, 2021; v1 submitted 6 December, 2020; originally announced December 2020.

Comments: 35 pages, 13 figures

MSC Class: 92C05 92C37 35Q92

arXiv:2007.02047 [pdf, other]

doi 10.1088/1674-1056/abd68e

Relationship between manifold smoothness and adversarial vulnerability in deep learning with local errors

Authors: Zijian Jiang, Jianwen Zhou, Haiping Huang

Abstract: Artificial neural networks can achieve impressive performances, and even outperform humans in some specific tasks. Nevertheless, unlike biological brains, the artificial neural networks suffer from tiny perturbations in sensory input, under various kinds of adversarial attacks. It is therefore necessary to study the origin of the adversarial vulnerability. Here, we establish a fundamental relationship between geometry of hidden representations (manifold perspective) and the generalization capability of the deep networks. For this purpose, we choose a deep neural network trained by local errors, and then analyze emergent properties of trained networks through the manifold dimensionality, manifold smoothness, and the generalization capability. To explore effects of adversarial examples, we consider independent Gaussian noise attacks and fast-gradient-sign-method (FGSM) attacks. Our study reveals that a high generalization accuracy requires a relatively fast power-law decay of the eigen-spectrum of hidden representations. Under Gaussian attacks, the relationship between generalization accuracy and power-law exponent is monotonic, while a non-monotonic behavior is observed for FGSM attacks. Our empirical study provides a route towards a final mechanistic interpretation of adversarial vulnerability under adversarial attacks.

Submitted 23 December, 2020; v1 submitted 4 July, 2020; originally announced July 2020.

Comments: 10 pages, 8 figures, to appear in Chin. Phys. B (2021)

Journal ref: Chin. Phys. B Vol. 30, No. 4 (2021) 048702

arXiv:2006.12148 [pdf]

doi 10.1101/2020.06.20.160564

Identification of Neuronal Polarity by Node-Based Machine Learning

Authors: Chen-Zhi Su, Kuan-Ting Chou, Hsuan-Pei Huang, Chung-Chuan Lo, Daw-Wei Wang

Abstract: Identifying the directions of signal flows in neural networks is one of the most important stages for understanding the intricate information dynamics of a living brain. Using a dataset of 213 projection neurons distributed in different regions of the Drosophila brain, we develop a powerful machine learning algorithm: node-based polarity identifier of neurons (NPIN). The proposed model is trained by nodal information only and includes both Soma Features (which contain spatial information from a given node to a soma) and Local Features (which contain morphological information of a given node). After including the spatial correlations between nodal polarities, our NPIN provided extremely high accuracy (>96.0%) for the classification of neuronal polarity, even for complex neurons with more than two dendrite/axon clusters. Finally, we further apply NPIN to classify the neuronal polarity of the blowfly, which has much less neuronal data available. Our results demonstrate that NPIN is a powerful tool to identify the neuronal polarity of insects and to map out the signal flows in the brain's neural networks.

Submitted 22 June, 2020; originally announced June 2020.

Comments: Manuscript: 18 pages and 9 figures; Appendix: 14 pages, 5 figures, and 2 tables

arXiv:2006.11569 [pdf, other]

doi 10.1103/PhysRevE.103.012315

Weakly-correlated synapses promote dimension reduction in deep neural networks

Authors: Jianwen Zhou, Haiping Huang

Abstract: By controlling synaptic and neural correlations, deep learning has achieved empirical successes in improving classification performances. How synaptic correlations affect neural correlations to produce disentangled hidden representations remains elusive. Here we propose a simplified model of dimension reduction, taking into account pairwise correlations among synapses, to reveal the mechanism underlying how the synaptic correlations affect dimension reduction. Our theory determines the synaptic-correlation scaling form requiring only mathematical self-consistency, for both binary and continuous synapses. The theory also predicts that weakly-correlated synapses encourage dimension reduction compared to their orthogonal counterparts. In addition, these synapses slow down the decorrelation process along the network depth. These two computational roles are explained by the proposed mean-field equation. The theoretical predictions are in excellent agreement with numerical simulations, and the key features are also captured by a deep learning with Hebbian rules.

Submitted 20 June, 2020; originally announced June 2020.

Comments: 21 pages, 8 figures

Journal ref: Phys. Rev. E 103, 012315 (2021)

arXiv:2006.00639 [pdf]

Ontology-based systematic classification and analysis of coronaviruses, hosts, and host-coronavirus interactions towards deep understanding of COVID-19

Authors: Hong Yu, Li Li, Hsin-hui Huang, Yang Wang, Yingtong Liu, Edison Ong, Anthony Huffman, Tao Zeng, Jingsong Zhang, Pengpai Li, Zhiping Liu, Xiangyan Zhang, Xianwei Ye, Samuel K. Handelman, Gerry Higgins, Gilbert S. Omenn, Brian Athey, Junguk Hur, Luonan Chen, Yongqun He

Abstract: Given the existing COVID-19 pandemic worldwide, it is critical to systematically study the interactions between hosts and coronaviruses including SARS-CoV, MERS-CoV, and SARS-CoV-2 (cause of COVID-19). We first created four host-pathogen interaction (HPI)-Outcome postulates, and generated an HPI-Outcome model as the basis for understanding host-coronavirus interactions (HCI) and their relations with the disease outcomes. We hypothesized that ontology can be used as an integrative platform to classify and analyze HCI and disease outcomes. Accordingly, we annotated and categorized different coronaviruses, hosts, and phenotypes using ontologies and identified their relations. Various COVID-19 phenotypes are hypothesized to be caused by the backend HCI mechanisms. To further identify the causal HCI-outcome relations, we collected 35 experimentally-verified HCI protein-protein interactions (PPIs), and applied literature mining to identify additional host PPIs in response to coronavirus infections. The results were formulated in a logical ontology representation for integrative HCI-outcome understanding. Using known PPIs as baits, we also developed and applied a domain-inferred prediction method to predict new PPIs and identified their pathological targets on multiple organs. Overall, our proposed ontology-based integrative framework combined with computational predictions can be used to support fundamental understanding of the intricate interactions between human patients and coronaviruses (including SARS-CoV-2) and their association with various disease outcomes.

Submitted 31 May, 2020; originally announced June 2020.

Comments: 32 pages, 1 table, 6 figures

arXiv:2001.03354 [pdf, other]

doi 10.1103/PhysRevLett.125.178301

Learning credit assignment

Authors: Chan Li, Haiping Huang

Abstract: Deep learning has achieved impressive prediction accuracies in a variety of scientific and industrial domains. However, the nested non-linear feature of deep learning makes the learning highly non-transparent, i.e., it is still unknown how the learning coordinates a huge number of parameters to achieve decision making. To explain this hierarchical credit assignment, we propose a mean-field learning model by assuming that an ensemble of sub-networks, rather than a single network, is trained for a classification task. Surprisingly, our model reveals that apart from some deterministic synaptic weights connecting two neurons at neighboring layers, there exist a large number of connections that can be absent, and other connections can allow for a broad distribution of their weight values. Therefore, synaptic connections can be classified into three categories: very important ones, unimportant ones, and those of variability that may partially encode nuisance factors. Therefore, our model learns the credit assignment leading to the decision, and predicts an ensemble of sub-networks that can accomplish the same task, thereby providing insights toward understanding the macroscopic behavior of deep learning through the lens of distinct roles of synaptic weights.

Submitted 3 October, 2020; v1 submitted 10 January, 2020; originally announced January 2020.

Comments: 5 pages, 4 figures, a generalized BackProp proposed to learn credit assignment from a network ensemble perspective, to appear in Phys Rev Lett (2020)

Journal ref: Phys. Rev. Lett. 125, 178301 (2020)

arXiv:1911.07662 [pdf, other]

doi 10.1103/PhysRevE.102.030301

Variational mean-field theory for training restricted Boltzmann machines with binary synapses

Authors: Haiping Huang

Abstract: Unsupervised learning requiring only raw data is not only a fundamental function of the cerebral cortex, but also a foundation for a next generation of artificial neural networks. However, a unified theoretical framework to treat sensory inputs, synapses and neural activity together is still lacking. The computational obstacle originates from the discrete nature of synapses, and complex interactions among these three essential elements of learning. Here, we propose a variational mean-field theory in which the distribution of synaptic weights is considered. The unsupervised learning can then be decomposed into two intertwined steps: a maximization step is carried out as a gradient ascent of the lower-bound on the data log-likelihood, in which the synaptic weight distribution is determined by updating variational parameters, and an expectation step is carried out as a message passing procedure on an equivalent or dual neural network whose parameter is specified by the variational parameters of the weight distribution. Therefore, our framework provides insights on how data (or sensory inputs), synapses and neural activities interact with each other to achieve the goal of extracting statistical regularities in sensory inputs. This variational framework is verified in restricted Boltzmann machines with planted synaptic weights and handwritten-digits learning.

Submitted 17 August, 2020; v1 submitted 10 November, 2019; originally announced November 2019.

Comments: 9 pages, 2 figures, a mean-field framework proposed for unsupervised learning in RBM with discrete synapses, which was previously out of reach

Journal ref: Phys. Rev. E 102, 030301 (2020)

arXiv:1910.01726 [pdf, other]

A machine learning method correlating pulse pressure wave data with pregnancy

Authors: Jianhong Chen, Huang Huang, Wenrui Hao, Jinchao Xu

Abstract: Pulse feeling, representing the tactile arterial palpation of the heartbeat, has been widely used in traditional Chinese medicine (TCM) to diagnose various diseases. The quantitative relationship between the pulse wave and health conditions, however, has not been investigated in modern medicine. In this paper, we explored the correlation between pulse pressure wave (PPW), rather than the pulse key features in TCM, and pregnancy by using deep learning technology. This computational approach shows that the accuracy of pregnancy detection by the PPW is 84% with an AUC of 91%. Our study is a proof of concept of pulse diagnosis and will also motivate further sophisticated investigations on pulse waves.

Submitted 3 October, 2019; originally announced October 2019.

arXiv:1904.13052 [pdf, other]

doi 10.1088/1751-8121/ab3f3f

Minimal model of permutation symmetry in unsupervised learning

Authors: Tianqi Hou, K. Y. Michael Wong, Haiping Huang

Abstract: Permutation of any two hidden units yields invariant properties in typical deep generative neural networks. This permutation symmetry plays an important role in understanding the computation performance of a broad class of neural networks with two or more hidden units. However, a theoretical study of the permutation symmetry is still lacking. Here, we propose a minimal model with only two hidden units in a restricted Boltzmann machine, which aims to address how the permutation symmetry affects the critical learning data size at which the concept-formation (or spontaneous symmetry breaking in physics language) starts, and moreover semi-rigorously prove a conjecture that the critical data size is independent of the number of hidden units once this number is finite. Remarkably, we find that the embedded correlation between two receptive fields of hidden units reduces the critical data size. In particular, the weakly-correlated receptive fields have the benefit of significantly reducing the minimal data size that triggers the transition, given less noisy data. Inspired by the theory, we also propose an efficient fully-distributed algorithm to infer the receptive fields of hidden units. Furthermore, our minimal model reveals that the permutation symmetry can also be spontaneously broken following the spontaneous symmetry breaking. Overall, our results demonstrate that the unsupervised learning is a progressive combination of spontaneous symmetry breaking and permutation symmetry breaking which are both spontaneous processes driven by data streams (observations). All these effects can be analytically probed based on the minimal model, providing theoretical insights towards understanding unsupervised learning in a more general context.

Submitted 11 August, 2019; v1 submitted 30 April, 2019; originally announced April 2019.

Comments: 36 pages, 110 equations, 5 figures; a complete picture about physical laws of unsupervised learning in neural networks presented

Journal ref: 2019 J. Phys. A: Math. Theor. 52 414001

arXiv:1810.04162 [pdf, other]

doi 10.1016/j.bpj.2019.02.007

A Bidomain Model for Lens Microcirculation

Authors: Yi Zhu, Shixin Xu, Robert S. Eisenberg, Huaxiong Huang

Abstract: There exists a large body of research on the lens of mammalian eye over the past several decades. The objective of the current work is to provide a link between the most recent computational models to some of the pioneering work in the 1970s and 80s. We introduce a general non-electro-neutral model to study the microcirculation in lens of eyes. It describes the steady state relationships among ion fluxes, water flow and electric field inside cells, and in the narrow extracellular spaces between cells in the lens. Using asymptotic analysis, we derive a simplified model based on physiological data and compare our results with those in the literature. We show that our simplified model can be reduced further to the first generation models while our full model is consistent with the most recent computational models. In addition, our simplified model captures the main features of the full model. Our results serve as a useful link intermediate between the computational models and the first generation analytical models.

Submitted 22 May, 2019; v1 submitted 9 October, 2018; originally announced October 2018.

Journal ref: Biophysical Journal, 2019, 116, (6), pp. 1171-1184

arXiv:1806.00646 [pdf]

Osmosis through a Semi-permeable Membrane: a Consistent Approach to Interactions

Authors: Shixin Xu, Bob Eisenberg, Zilong Song, Huaxiong Huang

Abstract: The movement of ionic solutions is an essential part of biology and technology. Fluidics, from nano- to micro- to microfluidics, is a burgeoning area of technology which is all about the movement of ionic solutions, on various scales. Many cells, tissues, and organs of animals and plants depend on osmosis, as the movement of fluids is called in biology. Indeed, the movement of fluids through channel proteins (that have a hole down their middle) is fluidics on an atomic scale. Ionic fluids are complex fluids, with energy stored in many ways. Ionic fluids flow driven by gradients of concentration, chemical and electrical potential, and hydrostatic pressure. Each flow is classically described by its own field theory, independent of the others, but of course, in reality every gradient drives every kind of flow to a varying extent. Combining field equations is tricky and so the theory of complex fluids derives the equations, rather than assumes their interactions. When field equations are derived, rather than assumed, their variables are consistent. That is to say all variables satisfy all equations under all conditions with one set of parameters. Here we treat a classical osmotic cell in this spirit, using a sharp interface method to derive boundary conditions consistent with all flows and fields. We allow volume to change with concentration, since changes of volume are a property of ionic solutions known to all who make them in the laboratory. We consider flexible and inflexible membranes. We show how to combine the energetics of the membrane with the energetics of the surrounding complex fluids. The results seem general but need application to specific situations of technological, biological and experimental importance before the consequences of consistency can be understood.

Submitted 7 June, 2018; v1 submitted 2 June, 2018; originally announced June 2018.

Comments: typos corrected; equations reformatted a bit; masking of part of Fig.1 corrected

arXiv:1804.05175 [pdf]

Amino-acid network clique analysis of protein mutation correlation effects: a case study of lysozyme

Authors: Rui Chen, Dengming Ming, He Huang

Abstract: Optimizing amino-acid mutations has been a most challenging task in modern bio-industrial enzyme design. It is well known that many successful designs hinge on extensive correlations among mutations at different sites within the enzyme; however, the underlying mechanism for these correlations is far from clear. Here, we present a topology-based model to quantitatively characterize correlation effects between mutations. The method is based on molecular dynamics simulations and an amino-acid network clique analysis that simply examines whether two single mutation sites belong to some 3-clique. We analyzed 13 dual mutations of T4 phage lysozyme and found that the clique-based model successfully distinguishes highly correlated or non-additive double-site mutations from those with less correlation or additive mutations. We also applied the model to the protein Eglin c, whose topology is significantly distinct from that of T4 phage lysozyme, and found that the model can, to some extent, still distinguish non-additive mutations from additive ones. Our calculations showed that mutation correlation effects may heavily depend on the topological relationship among mutation sites, which can be quantitatively characterized using amino-acid network k-cliques. We also showed that double-site mutation correlations can be significantly altered by introducing a third mutation, indicating that more detailed physico-chemical interactions might need to be considered with the network model for a better understanding of the elusive mutation-correlation principle.

Submitted 14 April, 2018; originally announced April 2018.

Comments: 12 pages, 3 figures, 5 tables

arXiv:1802.02540 [pdf]

Molecular Regulation of Histamine Synthesis

Authors: Hua Huang, Yapeng Li, **yi Liang, Fred D. Finkelman

Abstract: Histamine is a critical mediator of IgE/ cell-mediated anaphylaxis, a neurotransmitter and a regulator of gastric acid secretion. Histamine is a monoamine synthesized from the amino acid histidine through a reaction catalyzed by the enzyme histidine decarboxylase (HDC), which removes carboxyl group from histidine. Despite the importance of histamine, transcriptional regulation of HDC gene expression in mammals is still poorly understood. In this Review, we focus on discussing advances in the understanding of molecular regulation of mammalian histamine synthesis.

Submitted 31 May, 2018; v1 submitted 7 February, 2018; originally announced February 2018.

Comments: 1.added references for introduction section; 2.added references and typos added for histamine-producing cells in mammals and stimuli that trigger histamine release; 3.typos added for section of histidine decarboxylase and histamine synthesis in mammals; 4.added references and typos added for section of hdc gene expression and histamine synthesis in basophils and mast cells. 5. added 2 figures

arXiv:1710.08961 [pdf]

Fast and Scalable Distributed Deep Convolutional Autoencoder for fMRI Big Data Analytics

Authors: Milad Makkie, Heng Huang, Yu Zhao, Athanasios V. Vasilakos, Tianming Liu

Abstract: In recent years, analyzing task-based fMRI (tfMRI) data has become an essential tool for understanding brain function and networks. However, due to the sheer size of tfMRI data, its intrinsic complex structure, and lack of ground truth of underlying neural activities, modeling tfMRI data is hard and challenging. Previously proposed data-modeling methods including Independent Component Analysis (ICA) and Sparse Dictionary Learning only provided a weakly established model based on blind source separation under the strong assumption that original fMRI signals could be linearly decomposed into time series components with corresponding spatial maps. Meanwhile, analyzing and learning a large amount of tfMRI data from a variety of subjects has been shown to be very demanding but yet challenging even with technological advances in computational hardware. Given the Convolutional Neural Network (CNN), a robust method for learning high-level abstractions from low-level data such as tfMRI time series, in this work we propose a fast and scalable novel framework for distributed deep Convolutional Autoencoder model. This model aims to both learn the complex hierarchical structure of the tfMRI data and to leverage the processing power of multiple GPUs in a distributed fashion. To implement such a model, we have created an enhanced processing pipeline on the top of Apache Spark and Tensorflow library, leveraging from a very large cluster of GPU machines. Experimental data from applying the model on the Human Connectome Project (HCP) show that the proposed model is efficient and scalable toward tfMRI big data analytics, thus enabling data-driven extraction of hierarchical neuroscientific information from massive fMRI big data in the future.

Submitted 4 March, 2018; v1 submitted 24 October, 2017; originally announced October 2017.

Comments: This work is submitted to SIGKDD 2018

Showing 1–50 of 66 results for author: Huang, H