-
Spectral theory of $p$-adic Hermite operator
Authors:
Tianhong Zhao
Abstract:
We give the definition of $p$-adic Hermite operator and set up the $p$-adic spectral measure. We compare the Archimedean case with the non-Archimedean case. The structure of Hermite conjugate in a $C^{*}$-algebra corresponds to three canonical structures of $p$-adic ultrametric Banach algebras: 1. mod $p$ reduction 2. Frobenius map 3. Teichmüller lift. There is a natural connection between Galois theory and Hermite operator spectral decomposition. The Galois group $\mathrm{Gal}(\bar{\mathbb{F}}_p|\mathbb{F}_p)$ generates the $p$-adic spectral measure. We point out some relationships with $p$-adic quantum mechanics: 1. creation operator and annihilation operator 2. $p$-adic uncertainty principle.
Submitted 19 October, 2022;
originally announced October 2022.
-
MuGER$^2$: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering
Authors:
Yingyao Wang,
Junwei Bao,
Chaoqun Duan,
Youzheng Wu,
Xiaodong He,
Tiejun Zhao
Abstract:
Hybrid question answering (HQA) aims to answer questions over heterogeneous data, including tables and passages linked to table cells. The heterogeneous data can provide evidence of different granularities to HQA models, e.g., column, row, cell, and link. Conventional HQA models usually retrieve coarse- or fine-grained evidence to reason about the answer. Through comparison, we find that coarse-grained evidence is easier to retrieve but contributes less to the reasoner, while fine-grained evidence is the opposite. To preserve the advantage and eliminate the disadvantage of evidence at different granularities, we propose MuGER$^2$, a Multi-Granularity Evidence Retrieval and Reasoning approach. In evidence retrieval, a unified retriever is designed to learn the multi-granularity evidence from the heterogeneous data. In answer reasoning, an evidence selector is proposed to navigate the fine-grained evidence for the answer reader based on the learned multi-granularity evidence. Experiment results on the HybridQA dataset show that MuGER$^2$ significantly boosts the HQA performance. Further ablation analysis verifies the effectiveness of both the retrieval and reasoning designs.
Submitted 19 October, 2022;
originally announced October 2022.
-
Backward error analysis of the Lanczos bidiagonalization with reorthogonalization
Authors:
Haibo Li,
Guangming Tan,
Tong Zhao
Abstract:
The $k$-step Lanczos bidiagonalization reduces a matrix $A\in\mathbb{R}^{m\times n}$ to a bidiagonal form $B_k\in\mathbb{R}^{(k+1)\times k}$ while generating two orthonormal matrices $U_{k+1}\in\mathbb{R}^{m\times (k+1)}$ and $V_{k+1}\in\mathbb{R}^{n\times {(k+1)}}$. However, any practical implementation of the algorithm suffers from loss of orthogonality of $U_{k+1}$ and $V_{k+1}$ due to the presence of rounding errors, and several reorthogonalization strategies have been proposed to maintain some level of orthogonality. In this paper, by writing various reorthogonalization strategies in a general form, we make a backward error analysis of the Lanczos bidiagonalization with reorthogonalization (LBRO). Our results show that the $B_k$ computed by the $k$-step LBRO of $A$ with starting vector $b$ is the exact one generated by the $k$-step Lanczos bidiagonalization of $A+E$ with starting vector $b+δ_{b}$ (denoted by LB($A+E,b+δ_{b}$)), where the 2-norms of the perturbation vector $δ_{b}$ and the perturbation matrix $E$ depend on the roundoff unit and the orthogonality levels of $U_{k+1}$ and $V_{k+1}$. The results also show that the 2-norms of $U_{k+1}-\bar{U}_{k+1}$ and $V_{k+1}-\bar{V}_{k+1}$ are controlled by the orthogonality levels of $U_{k+1}$ and $V_{k+1}$, respectively, where $\bar{U}_{k+1}$ and $\bar{V}_{k+1}$ are the two orthonormal matrices generated by the $k$-step LB($A+E,b+δ_{b}$) in exact arithmetic. Thus the $k$-step LBRO is mixed forward-backward stable as long as the orthogonality of $U_{k+1}$ and $V_{k+1}$ is good enough. We use this result to investigate the backward stability of the LBRO-based SVD computation algorithm and the LSQR algorithm. Numerical experiments are made to confirm our results.
Submitted 19 October, 2022;
originally announced October 2022.
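As a rough illustration of the procedure analyzed in the preceding abstract, the sketch below runs $k$ steps of the lower-bidiagonal (Golub-Kahan) Lanczos recurrence and reorthogonalizes each new vector against all earlier ones. Full reorthogonalization is only one of the strategies the abstract alludes to, and the helper names (`lanczos_bidiag`, `reorthogonalize`) are assumptions made for this example, not the authors' code. The sketch also assumes no breakdown occurs (no zero norm within the first $k$ steps).

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def matvec(A, x):
    # A @ x for a dense matrix stored as a list of rows
    return [dot(row, x) for row in A]

def rmatvec(A, y):
    # A^T @ y
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]

def reorthogonalize(w, basis):
    # One Gram-Schmidt sweep of w against every vector in basis
    for q in basis:
        c = dot(q, w)
        w = [wi - c * qi for wi, qi in zip(w, q)]
    return w

def lanczos_bidiag(A, b, k):
    """k-step lower bidiagonalization with full reorthogonalization.

    Returns U (k+1 vectors), V (k+1 vectors), alphas (k+1 values) and
    betas (k values). B_k has alphas[0..k-1] on its diagonal and
    betas[0..k-1] on its subdiagonal, so that
    A @ V[i] = alphas[i] * U[i] + betas[i] * U[i+1] up to rounding.
    """
    beta = norm(b)
    U = [[x / beta for x in b]]
    w = rmatvec(A, U[0])
    alpha = norm(w)
    V = [[x / alpha for x in w]]
    alphas, betas = [alpha], []
    for i in range(k):
        # Next left vector: A v_i - alpha_i u_i, then reorthogonalize
        w = [p - alphas[i] * q for p, q in zip(matvec(A, V[i]), U[i])]
        w = reorthogonalize(w, U)
        beta = norm(w)
        U.append([x / beta for x in w])
        betas.append(beta)
        # Next right vector: A^T u_{i+1} - beta_{i+1} v_i, then reorthogonalize
        w = [p - beta * q for p, q in zip(rmatvec(A, U[i + 1]), V[i])]
        w = reorthogonalize(w, V)
        alpha = norm(w)
        V.append([x / alpha for x in w])
        alphas.append(alpha)
    return U, V, alphas, betas
```

Calling `lanczos_bidiag(A, b, k)` on a small dense matrix returns the vectors of $U_{k+1}$ and $V_{k+1}$ as Python lists, which makes it easy to inspect how close `dot(U[i], U[j])` stays to zero with and without the `reorthogonalize` calls.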
-
Ultrasensitive atomic comagnetometer with enhanced nuclear spin coherence
Authors:
Kai Wei,
Tian Zhao,
Xiujie Fang,
Zitong Xu,
Chang Liu,
Qian Cao,
Arne Wickenbrock,
Yanhui Hu,
Wei Ji,
Dmitry Budker
Abstract:
Achieving high energy resolution in spin systems is important for fundamental physics research and precision measurements, with alkali-noble-gas comagnetometers being among the best available sensors. We found a new relaxation mechanism in such devices, the gradient of the Fermi-contact-interaction field that dominates the relaxation of hyperpolarized nuclear spins. We report on precise control over spin distribution, demonstrating a tenfold increase of nuclear spin hyperpolarization and transverse coherence time with optimal hybrid optical pumping. Operating in the self-compensation regime, our $^{21}$Ne-Rb-K comagnetometer achieves an ultrahigh inertial rotation sensitivity of $3\times10^{-8}$\,rad/s/Hz$^{1/2}$ in the frequency range from 0.2 to 1.0 Hz, which is equivalent to the energy resolution of $3.1\times 10^{-23}$\,eV/Hz$^{1/2}$. We propose to use this comagnetometer to search for exotic spin-dependent interactions involving proton and neutron spins. The projected sensitivity surpasses the previous experimental and astrophysical limits by more than four orders of magnitude.
Submitted 17 October, 2022;
originally announced October 2022.
-
UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation
Authors:
Yongwei Zhou,
Junwei Bao,
Chaoqun Duan,
Youzheng Wu,
Xiaodong He,
Tiejun Zhao
Abstract:
Question answering requiring discrete reasoning, e.g., arithmetic computing, comparison, and counting, over knowledge is a challenging task. In this paper, we propose UniRPG, a semantic-parsing-based approach advanced in interpretability and scalability, to perform unified discrete reasoning over heterogeneous knowledge resources, i.e., table and text, as program generation. Concretely, UniRPG consists of a neural programmer and a symbolic program executor, where a program is the composition of a set of pre-defined general atomic and higher-order operations and arguments extracted from table and text. First, the programmer parses a question into a program by generating operations and copying arguments, and then the executor derives answers from table and text based on the program. To alleviate the costly program annotation issue, we design a distant supervision approach for programmer learning, where pseudo programs are automatically constructed without annotated derivations. Extensive experiments on the TAT-QA dataset show that UniRPG achieves tremendous improvements and enhances interpretability and scalability compared with state-of-the-art methods, even without derivation annotation. Moreover, it achieves promising performance on the textual dataset DROP without derivations.
Submitted 15 October, 2022;
originally announced October 2022.
-
Linkless Link Prediction via Relational Distillation
Authors:
Zhichun Guo,
William Shiao,
Shichang Zhang,
Yozen Liu,
Nitesh V. Chawla,
Neil Shah,
Tong Zhao
Abstract:
Graph Neural Networks (GNNs) have shown exceptional performance in the task of link prediction. Despite their effectiveness, the high latency brought by non-trivial neighborhood data dependency limits GNNs in practical deployments. Conversely, the known efficient MLPs are much less effective than GNNs due to the lack of relational knowledge. In this work, to combine the advantages of GNNs and MLPs, we start with exploring direct knowledge distillation (KD) methods for link prediction, i.e., predicted logit-based matching and node representation-based matching. Upon observing direct KD analogs do not perform well for link prediction, we propose a relational KD framework, Linkless Link Prediction (LLP), to distill knowledge for link prediction with MLPs. Unlike simple KD methods that match independent link logits or node representations, LLP distills relational knowledge that is centered around each (anchor) node to the student MLP. Specifically, we propose rank-based matching and distribution-based matching strategies that complement each other. Extensive experiments demonstrate that LLP boosts the link prediction performance of MLPs with significant margins, and even outperforms the teacher GNNs on 7 out of 8 benchmarks. LLP also achieves a 70.68x speedup in link prediction inference compared to GNNs on the large-scale OGB dataset.
Submitted 5 June, 2023; v1 submitted 11 October, 2022;
originally announced October 2022.
-
LMQFormer: A Laplace-Prior-Guided Mask Query Transformer for Lightweight Snow Removal
Authors:
Junhong Lin,
Nanfeng Jiang,
Zhentao Zhang,
Weiling Chen,
Tiesong Zhao
Abstract:
Snow removal aims to locate snow areas and recover clean images without repairing traces. Unlike the regularity and semitransparency of rain, snow with various patterns and degradations seriously occludes the background. As a result, state-of-the-art snow removal methods usually retain a large parameter size. In this paper, we propose a lightweight but highly efficient snow removal network called Laplace Mask Query Transformer (LMQFormer). Firstly, we present a Laplace-VQVAE to generate a coarse mask as prior knowledge of snow. Instead of using the mask in the dataset, we aim at reducing both the information entropy of snow and the computational cost of recovery. Secondly, we design a Mask Query Transformer (MQFormer) to remove snow with the coarse mask, where we use two parallel encoders and a hybrid decoder to learn extensive snow features under lightweight requirements. Thirdly, we develop a Duplicated Mask Query Attention (DMQA) that converts the coarse mask into a specific number of queries, which constrain the attention areas of MQFormer with reduced parameters. Experimental results on popular datasets have demonstrated the efficiency of our proposed model, which achieves state-of-the-art snow removal quality with significantly reduced parameters and the lowest running time.
Submitted 5 April, 2023; v1 submitted 10 October, 2022;
originally announced October 2022.
-
Empowering Graph Representation Learning with Test-Time Graph Transformation
Authors:
Wei Jin,
Tong Zhao,
Jiayuan Ding,
Yozen Liu,
Jiliang Tang,
Neil Shah
Abstract:
As powerful tools for representation learning on graphs, graph neural networks (GNNs) have facilitated various applications from drug discovery to recommender systems. Nevertheless, the effectiveness of GNNs is immensely challenged by issues related to data quality, such as distribution shift, abnormal features and adversarial attacks. Recent efforts have been made on tackling these issues from a modeling perspective which requires additional cost of changing model architectures or re-training model parameters. In this work, we provide a data-centric view to tackle these issues and propose a graph transformation framework named GTrans which adapts and refines graph data at test time to achieve better performance. We provide theoretical analysis on the design of the framework and discuss why adapting graph data works better than adapting the model. Extensive experiments have demonstrated the effectiveness of GTrans on three distinct scenarios for eight benchmark datasets where suboptimal data is presented. Remarkably, GTrans performs the best in most cases with improvements up to 2.8%, 8.2% and 3.8% over the best baselines on three experimental settings. Code is released at https://github.com/ChandlerBang/GTrans.
Submitted 26 February, 2023; v1 submitted 7 October, 2022;
originally announced October 2022.
-
Grape: Knowledge Graph Enhanced Passage Reader for Open-domain Question Answering
Authors:
Mingxuan Ju,
Wenhao Yu,
Tong Zhao,
Chuxu Zhang,
Yanfang Ye
Abstract:
A common thread of open-domain question answering (QA) models employs a retriever-reader pipeline that first retrieves a handful of relevant passages from Wikipedia and then peruses the passages to produce an answer. However, even state-of-the-art readers fail to capture the complex relationships between entities appearing in questions and retrieved passages, leading to answers that contradict the facts. In light of this, we propose a novel knowledge Graph enhanced passage reader, namely Grape, to improve the reader performance for open-domain QA. Specifically, for each pair of question and retrieved passage, we first construct a localized bipartite graph, attributed to entity embeddings extracted from the intermediate layer of the reader model. Then, a graph neural network learns relational knowledge while fusing graph and contextual representations into the hidden states of the reader model. Experiments on three open-domain QA benchmarks show Grape can improve the state-of-the-art performance by up to 2.2 exact match score with a negligible overhead increase, with the same retriever and retrieved passages. Our code is publicly available at https://github.com/jumxglhf/GRAPE.
Submitted 9 October, 2022; v1 submitted 6 October, 2022;
originally announced October 2022.
-
Multi-task Self-supervised Graph Neural Networks Enable Stronger Task Generalization
Authors:
Mingxuan Ju,
Tong Zhao,
Qianlong Wen,
Wenhao Yu,
Neil Shah,
Yanfang Ye,
Chuxu Zhang
Abstract:
Self-supervised learning (SSL) for graph neural networks (GNNs) has attracted increasing attention from the graph machine learning community in recent years, owing to its capability to learn performant node embeddings without costly label information. One weakness of conventional SSL frameworks for GNNs is that they learn through a single philosophy, such as mutual information maximization or generative reconstruction. When applied to various downstream tasks, these frameworks rarely perform equally well for every task, because one philosophy may not span the extensive knowledge required for all tasks. To enhance the task generalization across tasks, as an important first step forward in exploring fundamental graph models, we introduce PARETOGNN, a multi-task SSL framework for node representation learning over graphs. Specifically, PARETOGNN is self-supervised by manifold pretext tasks observing multiple philosophies. To reconcile different philosophies, we explore a multiple-gradient descent algorithm, such that PARETOGNN actively learns from every pretext task while minimizing potential conflicts. We conduct comprehensive experiments over four downstream tasks (i.e., node classification, node clustering, link prediction, and partition prediction), and our proposal achieves the best overall performance across tasks on 11 widely adopted benchmark datasets. Besides, we observe that learning from multiple philosophies enhances not only the task generalization but also the single task performances, demonstrating that PARETOGNN achieves better task generalization via the disjoint yet complementary knowledge learned from different philosophies. Our code is publicly available at https://github.com/jumxglhf/ParetoGNN.
Submitted 27 February, 2023; v1 submitted 5 October, 2022;
originally announced October 2022.
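To make the "multiple-gradient descent" idea in the preceding abstract more concrete, here is a minimal sketch of the classic two-task case: find the minimum-norm convex combination of two task gradients, which yields a common descent direction for both tasks (or the zero vector at a Pareto-stationary point). The function name `min_norm_direction` and the toy gradients are assumptions for illustration only; the abstract does not specify how PARETOGNN combines the gradients of its pretext tasks.

```python
def min_norm_direction(g1, g2):
    """Return (gamma, d) where d = gamma*g1 + (1-gamma)*g2 has minimum norm
    over gamma in [0, 1]. Closed form for the two-gradient case."""
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # Identical gradients: any convex combination equals g1
        return 1.0, list(g1)
    # Unconstrained minimizer of ||g2 + gamma*(g1 - g2)||^2, then clip to [0, 1]
    gamma = sum((b - a) * b for a, b in zip(g1, g2)) / denom
    gamma = min(1.0, max(0.0, gamma))
    direction = [gamma * a + (1.0 - gamma) * b for a, b in zip(g1, g2)]
    return gamma, direction
```

For orthogonal gradients of equal length the two tasks are weighted equally; when one gradient is a scaled-up copy of the other, the shorter one is returned; and for exactly opposing gradients of equal length the result is the zero vector, signalling that no step improves both tasks.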
-
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
Authors:
Chen Liang,
Simiao Zuo,
Qingru Zhang,
Pengcheng He,
Weizhu Chen,
Tuo Zhao
Abstract:
Layer-wise distillation is a powerful tool to compress large models (i.e., teacher models) into small ones (i.e., student models). The student distills knowledge from the teacher by mimicking the hidden representations of the teacher at every intermediate layer. However, layer-wise distillation is difficult. Since the student has a smaller model capacity than the teacher, it is often under-fitted. Furthermore, the hidden representations of the teacher contain redundant information that the student does not necessarily need for the target task's learning. To address these challenges, we propose a novel Task-aware layEr-wise Distillation (TED). TED designs task-aware filters to align the hidden representations of the student and the teacher at each layer. The filters select the knowledge that is useful for the target task from the hidden representations. As such, TED reduces the knowledge gap between the two models and helps the student to fit better on the target task. We evaluate TED in two scenarios: continual pre-training and fine-tuning. TED demonstrates significant and consistent improvements over existing distillation methods in both scenarios. Code is available at https://github.com/cliang1453/task-aware-distillation.
Submitted 5 June, 2023; v1 submitted 3 October, 2022;
originally announced October 2022.
-
MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP Initialization
Authors:
Xiaotian Han,
Tong Zhao,
Yozen Liu,
Xia Hu,
Neil Shah
Abstract:
Training graph neural networks (GNNs) on large graphs is complex and extremely time consuming. This is attributed to overheads caused by sparse matrix multiplication, which are sidestepped when training multi-layer perceptrons (MLPs) with only node features. MLPs, by ignoring graph context, are simple and faster for graph data; however, they usually sacrifice prediction accuracy, limiting their applications for graph data. We observe that for most message passing-based GNNs, we can trivially derive an analog MLP (we call this a PeerMLP) with an equivalent weight space, by setting the trainable parameters with the same shapes, making us curious about the question: how do GNNs using weights from a fully trained PeerMLP perform? Surprisingly, we find that GNNs initialized with such weights significantly outperform their PeerMLPs, motivating us to use PeerMLP training as a precursor, initialization step to GNN training. To this end, we propose an embarrassingly simple, yet hugely effective initialization method for GNN training acceleration, called MLPInit. Our extensive experiments on multiple large-scale graph datasets with diverse GNN architectures validate that MLPInit can accelerate the training of GNNs (up to 33X speedup on OGB-Products) and often improve prediction performance (e.g., up to $7.97\%$ improvement for GraphSAGE across $7$ datasets for node classification, and up to $17.81\%$ improvement across $4$ datasets for link prediction on metric Hits@10). The code is available at https://github.com/snap-research/MLPInit-for-GNNs.
Submitted 8 April, 2023; v1 submitted 30 September, 2022;
originally announced October 2022.
-
Development of a Full Monte Carlo Therapeutic Dose Calculation Toolkit for Halcyon Using Geant4
Authors:
Ruirui Liu,
Zhen Ji,
Xiandong Zhao,
Tianyu Zhao,
Abhishek Sethi,
Daren Sawkey,
Bin Cai
Abstract:
Purpose: To develop a Monte Carlo (MC) therapeutic dose calculation toolkit for a recently released ring gantry linac in Geant4 (Version 10.7) for secondary dose validation of radiotherapy plans. Methods: For the Halcyon (Varian Medical Systems), the DSMLC was modeled and radiation transport in the DSMLC and patient phantom was simulated using Geant4. The radiation source was sampled from a phase space file for the linac head above the DSMLC. The phase space file was obtained using a cloud-based MC simulator, VirtuaLinac (VL), provided by Varian. Dosimetric profiles for different square field widths (2x2, 4x4, 6x6, 8x8, 10x10, 20x20, and 28x28 cm$^2$), i.e., percent depth dose (PDD) curves and lateral profiles, were simulated and compared against the experimental profiles. IMRT (intensity modulated radiation therapy) plans in two anatomical sites (prostate and brain) were also calculated using the developed toolkit and compared against the TPS calculated dose (Acuros, Eclipse 15.6). 3D dose difference and 3D gamma analysis were used to evaluate the simulation accuracy compared against the TPS calculated dose. Results: The simulated lateral dose profiles and PDD curves in water phantom match well with the measured ones for all the simulated field sizes, with relative differences within ±2%. For the prostate and brain IMRT plans, the simulated dose showed good agreement with the TPS calculated dose. The 3D gamma pass rates (3%/3mm) are 98.08% and 95.4% for the prostate and brain plans, respectively. Conclusion: The developed full MC dose calculation toolkit for Halcyon performs well in dose calculations in water phantom and patient CT phantom. The developed toolkit shows promise for future secondary dose calculation for IMRT and for serving as a clinical quality assurance (QA) tool for Halcyon.
Submitted 30 September, 2022;
originally announced October 2022.
-
Assortment Optimization Under the Multivariate MNL Model
Authors:
Xin Chen,
Jiachun Li,
Menglong Li,
Tiancheng Zhao,
Yuan Zhou
Abstract:
We study an assortment optimization problem under a multi-purchase choice model in which customers choose a bundle of up to one product from each of two product categories. Different bundles have different utilities and the bundle price is the summation of the prices of products in it. For the uncapacitated setting where any set of products can be offered, we prove that this problem is strongly NP-hard. We show that an adjusted-revenue-ordered assortment provides a 1/2-approximation. Furthermore, we develop an approximation framework based on a linear programming relaxation of the problem and obtain a 0.74-approximation algorithm. This approximation ratio almost matches the integrality gap of the linear program, which is proven to be at most 0.75. For the capacitated setting, we prove that there does not exist a constant-factor approximation algorithm assuming the Exponential Time Hypothesis. The same hardness result holds for settings with general bundle prices or more than two categories. Finally, we conduct numerical experiments on randomly generated problem instances. The average approximation ratios of our algorithms are over 99%.
Submitted 10 October, 2022; v1 submitted 30 September, 2022;
originally announced September 2022.
-
First-order Policy Optimization for Robust Markov Decision Process
Authors:
Yan Li,
Guanghui Lan,
Tuo Zhao
Abstract:
We consider the problem of solving robust Markov decision process (MDP), which involves a set of discounted, finite state, finite action space MDPs with uncertain transition kernels. The goal of planning is to find a robust policy that optimizes the worst-case values against the transition uncertainties, and thus encompasses the standard MDP planning as a special case. For $(\mathbf{s},\mathbf{a})$-rectangular uncertainty sets, we establish several structural observations on the robust objective, which facilitate the development of a policy-based first-order method, namely the robust policy mirror descent (RPMD). An $\mathcal{O}(\log(1/ε))$ iteration complexity for finding an $ε$-optimal policy is established with linearly increasing stepsizes. We further develop a stochastic variant of the robust policy mirror descent method, named SRPMD, when the first-order information is only available through online interactions with the nominal environment. We show that the optimality gap converges linearly up to the noise level, and consequently establish an $\tilde{\mathcal{O}}(1/ε^2)$ sample complexity by developing a temporal difference learning method for policy evaluation. Both iteration and sample complexities are also discussed for RPMD with a constant stepsize. To the best of our knowledge, all the aforementioned results appear to be new for policy-based first-order methods applied to the robust MDP problem.
Submitted 10 June, 2023; v1 submitted 21 September, 2022;
originally announced September 2022.
-
Flashlight: Scalable Link Prediction with Effective Decoders
Authors:
Yiwei Wang,
Bryan Hooi,
Yozen Liu,
Tong Zhao,
Zhichun Guo,
Neil Shah
Abstract:
Link prediction (LP) has been recognized as an important task in graph learning with its broad practical applications. A typical application of LP is to retrieve the top scoring neighbors for a given source node, such as the friend recommendation. These services desire the high inference scalability to find the top scoring neighbors from many candidate nodes at low latencies. There are two popular decoders that the recent LP models mainly use to compute the edge scores from node embeddings: the HadamardMLP and Dot Product decoders. After theoretical and empirical analysis, we find that the HadamardMLP decoders are generally more effective for LP. However, HadamardMLP lacks the scalability for retrieving top scoring neighbors on large graphs, since to the best of our knowledge, there does not exist an algorithm to retrieve the top scoring neighbors for HadamardMLP decoders in sublinear complexity. To make HadamardMLP scalable, we propose the Flashlight algorithm to accelerate the top scoring neighbor retrievals for HadamardMLP: a sublinear algorithm that progressively applies approximate maximum inner product search (MIPS) techniques with adaptively adjusted query embeddings. Empirical results show that Flashlight improves the inference speed of LP by more than 100 times on the large OGBL-CITATION2 dataset without sacrificing effectiveness. Our work paves the way for large-scale LP applications with the effective HadamardMLP decoders by greatly accelerating their inference.
Submitted 3 December, 2022; v1 submitted 16 September, 2022;
originally announced September 2022.
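As a rough illustration of the two decoders named in the abstract above, the sketch below scores a candidate edge with a dot product and with an MLP applied to the Hadamard (element-wise) product of the two node embeddings. The layer sizes, the ReLU activation, the parameter names (`W1`, `b1`, `w2`, `b2`), and the helper `top_k_neighbors` are assumptions made for illustration and are not taken from the paper. The retrieval helper is the plain linear scan over all candidates, i.e. the baseline whose cost the abstract says Flashlight is designed to avoid; it is not the Flashlight algorithm.

```python
import numpy as np

def dot_product_score(h_src, h_dst):
    """Dot Product decoder: the edge score is the inner product of the embeddings."""
    return float(np.dot(h_src, h_dst))

def hadamard_mlp_score(h_src, h_dst, W1, b1, w2, b2):
    """HadamardMLP decoder (assumed one hidden layer with ReLU):
    an MLP applied to the element-wise product of the embeddings."""
    z = h_src * h_dst                       # Hadamard product
    hidden = np.maximum(0.0, W1 @ z + b1)   # hidden layer with ReLU
    return float(w2 @ hidden + b2)          # scalar edge score

def top_k_neighbors(h_src, candidates, k, score_fn):
    """Linear scan: score every candidate, return indices of the k best."""
    scores = np.array([score_fn(h_src, h) for h in candidates])
    return np.argsort(-scores)[:k].tolist()
```

With a dot-product decoder the scan can be replaced by an off-the-shelf MIPS index, which is why that decoder scales easily; the MLP on top of the Hadamard product breaks that direct correspondence.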
-
Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites
Authors:
Simiao Zuo,
Qingyu Yin,
Haoming Jiang,
Shaohui Xi,
Bing Yin,
Chao Zhang,
Tuo Zhao
Abstract:
E-commerce queries are often short and ambiguous. Consequently, query understanding often uses query rewriting to disambiguate user-input queries. While using e-commerce search tools, users tend to enter multiple searches, which we call context, before purchasing. These history searches contain contextual insights about users' true shopping intents. Therefore, modeling such contextual information is critical to a better query rewriting model. However, existing query rewriting models ignore users' history behaviors and consider only the instant search query, which is often a short string offering limited information about the true shopping intent.
We propose an end-to-end context-aware query rewriting model to bridge this gap, which takes the search context into account. Specifically, our model builds a session graph using the history search queries and their contained words. We then employ a graph attention mechanism that models cross-query relations and computes contextual information of the session. The model subsequently calculates session representations by combining the contextual information with the instant search query using an aggregation network. The session representations are then decoded to generate rewritten queries. Empirically, we demonstrate the superiority of our method to state-of-the-art approaches under various metrics. On in-house data from an online shopping platform, by introducing contextual information, our model achieves 11.6% improvement under the MRR (Mean Reciprocal Rank) metric and 20.1% improvement under the HIT@16 metric (a hit rate metric), in comparison with the best baseline method (Transformer-based model).
Submitted 24 September, 2022; v1 submitted 15 September, 2022;
originally announced September 2022.
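The MRR and HIT@16 metrics reported in the abstract above follow standard definitions, sketched below. The function names and the list/set data layout are assumptions for illustration; the paper's exact evaluation protocol (for example, how ties or multiple relevant rewrites are handled) is not described in the abstract.

```python
def reciprocal_rank(ranked, relevant):
    """Return 1/rank of the first relevant item in `ranked`, or 0.0 if none appears."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(all_ranked, all_relevant):
    """Average the reciprocal rank over all queries."""
    total = sum(reciprocal_rank(r, rel) for r, rel in zip(all_ranked, all_relevant))
    return total / len(all_ranked)

def hit_at_k(all_ranked, all_relevant, k):
    """Fraction of queries with at least one relevant item in the top k results."""
    hits = sum(
        1 for r, rel in zip(all_ranked, all_relevant)
        if any(item in rel for item in r[:k])
    )
    return hits / len(all_ranked)
```

Calling `hit_at_k(..., k=16)` gives the HIT@16 variant named in the abstract.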
-
DiP-GNN: Discriminative Pre-Training of Graph Neural Networks
Authors:
Simiao Zuo,
Haoming Jiang,
Qingyu Yin,
Xianfeng Tang,
Bing Yin,
Tuo Zhao
Abstract:
Graph neural network (GNN) pre-training methods have been proposed to enhance the power of GNNs. Specifically, a GNN is first pre-trained on a large-scale unlabeled graph and then fine-tuned on a separate small labeled graph for downstream applications, such as node classification. One popular pre-training method is to mask out a proportion of the edges, and a GNN is trained to recover them. However, such a generative method suffers from graph mismatch. That is, the masked graph inputted to the GNN deviates from the original graph. To alleviate this issue, we propose DiP-GNN (Discriminative Pre-training of Graph Neural Networks). Specifically, we train a generator to recover identities of the masked edges, and simultaneously, we train a discriminator to distinguish the generated edges from the original graph's edges. In our framework, the graph seen by the discriminator better matches the original graph because the generator can recover a proportion of the masked edges. Extensive experiments on large-scale homogeneous and heterogeneous graphs demonstrate the effectiveness of the proposed framework.
Submitted 15 September, 2022;
originally announced September 2022.
-
Differentially Private Estimation of Hawkes Process
Authors:
Simiao Zuo,
Tianyi Liu,
Tuo Zhao,
Hongyuan Zha
Abstract:
Point process models are of great importance in real world applications. In certain critical applications, estimation of point process models involves large amounts of sensitive personal data from users. Privacy concerns naturally arise which have not been addressed in the existing literature. To bridge this glaring gap, we propose the first general differentially private estimation procedure for point process models. Specifically, we take the Hawkes process as an example, and introduce a rigorous definition of differential privacy for event stream data based on a discretized representation of the Hawkes process. We then propose two differentially private optimization algorithms, which can efficiently estimate Hawkes process models with the desired privacy and utility guarantees under two different settings. Experiments are provided to back up our theoretical analysis.
Submitted 15 September, 2022;
originally announced September 2022.
-
OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network
Authors:
Tiancheng Zhao,
Peng Liu,
Kyusong Lee
Abstract:
The advancement of object detection (OD) in open-vocabulary and open-world scenarios is a critical challenge in computer vision. This work introduces OmDet, a novel language-aware object detection architecture, and an innovative training mechanism that harnesses continual learning and multi-dataset vision-language pre-training. Leveraging natural language as a universal knowledge representation, OmDet accumulates a "visual vocabulary" from diverse datasets, unifying the task as a language-conditioned detection framework. Our multimodal detection network (MDN) overcomes the challenges of multi-dataset joint training and generalizes to numerous training datasets without manual label taxonomy merging. We demonstrate superior performance of OmDet over strong baselines in object detection in the wild, open-vocabulary detection, and phrase grounding, achieving state-of-the-art results. Ablation studies reveal the impact of scaling the pre-training visual vocabulary, indicating a promising direction for further expansion to larger datasets. The effectiveness of our deep fusion approach is underscored by its ability to learn jointly from multiple datasets, enhancing performance through knowledge sharing.
Submitted 25 February, 2024; v1 submitted 10 September, 2022;
originally announced September 2022.
-
Explanation Guided Contrastive Learning for Sequential Recommendation
Authors:
Lei Wang,
Ee-Peng Lim,
Zhiwei Liu,
Tianxiang Zhao
Abstract:
Recently, contrastive learning has been applied to the sequential recommendation task to address data sparsity caused by users with few item interactions and items with few user adoptions. Nevertheless, the existing contrastive learning-based methods fail to ensure that the positive (or negative) sequence obtained by some random augmentation (or sequence sampling) on a given anchor user sequence remains semantically similar (or different). When the positive and negative sequences turn out to be false positives and false negatives respectively, recommendation performance may degrade. In this work, we address the above problem by proposing Explanation Guided Augmentations (EGA) and the Explanation Guided Contrastive Learning for Sequential Recommendation (EC4SRec) model framework. The key idea behind EGA is to utilize explanation method(s) to determine items' importance in a user sequence and derive the positive and negative sequences accordingly. EC4SRec then combines both self-supervised and supervised contrastive learning over the positive and negative sequences generated by EGA operations to improve sequence representation learning for more accurate recommendation results. Extensive experiments on four real-world benchmark datasets demonstrate that EC4SRec outperforms the state-of-the-art sequential recommendation methods and two recent contrastive learning-based sequential recommendation methods, CL4SRec and DuoRec. Our experiments also show that EC4SRec can be easily adapted to different sequence encoder backbones (e.g., GRU4Rec and Caser) and improves their recommendation performance.
Submitted 3 September, 2022;
originally announced September 2022.
-
The analytical solution to the migration of an epithelial monolayer with a circular spreading front and its implications in the gap closure process
Authors:
Tiankai Zhao,
Hongyan Yuan
Abstract:
The coordinated behaviors of epithelial cells are widely observed in tissue development, such as re-epithelialization, tumor growth, and morphogenesis. In these processes, cells either migrate collectively or organize themselves into specific structures to serve certain purposes. In this work, we study a spreading epithelial monolayer whose migrating front encloses a circular gap in the monolayer center. Such tissue is usually used to mimic the wound healing process in vitro. We model the epithelial sheet as a layer of active viscous polar fluid. With an axisymmetric assumption, the model can be analytically solved under two special conditions, suggesting two possible spreading modes for the epithelial monolayer. Based on these two sets of analytical solutions, we assess how the velocity of the spreading front is affected by the gap size, the active intercellular contractility, and the purse-string contraction acting on the spreading edge. Several critical values exist in the model parameters for the initiation of the gap closure process, and the purse-string contraction plays a vital role in governing the gap closure kinetics. Finally, the instability of the morphology of the spreading front is studied. Numerical calculations show how the perturbed velocities and the growth rates vary with respect to different model parameters.
Submitted 29 August, 2022;
originally announced August 2022.
-
Polyhedral Specification and Code Generation of Sparse Tensor Contraction with Co-Iteration
Authors:
Tuowen Zhao,
Tobi Popoola,
Mary Hall,
Catherine Olschanowsky,
Michelle Mills Strout
Abstract:
This paper presents a code generator for sparse tensor contraction computations. It leverages a mathematical representation of loop nest computations in the sparse polyhedral framework (SPF), which extends the polyhedral model to support non-affine computations, such as arise in sparse tensors. SPF is extended to perform layout specification, optimization, and code generation of sparse tensor code: 1) we develop a polyhedral layout specification that decouples iteration spaces for layout and computation; and, 2) we develop efficient co-iteration of sparse tensors by combining polyhedra scanning over the layout of one sparse tensor with the synthesis of code to find corresponding elements in other tensors through an SMT solver.
We compare the generated code with that produced by a state-of-the-art tensor compiler, TACO. We achieve on average 1.63$\times$ faster parallel performance than TACO on sparse-sparse co-iteration and describe how to improve that to 2.72$\times$ average speedup by switching the find algorithms. We also demonstrate that decoupling iteration spaces of layout and computation enables additional layout and computation combinations to be supported.
Submitted 24 August, 2022;
originally announced August 2022.
-
RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN
Authors:
Huy Phan,
Cong Shi,
Yi Xie,
Tianfang Zhang,
Zhuohang Li,
Tianming Zhao,
Jian Liu,
Yan Wang,
Yingying Chen,
Bo Yuan
Abstract:
Recently, backdoor attacks have become an emerging threat to the security of deep neural network (DNN) models. To date, most existing studies focus on backdoor attacks against uncompressed models, while the vulnerability of compressed DNNs, which are widely used in practical applications, remains little explored. In this paper, we propose to study and develop Robust and Imperceptible Backdoor Attack against Compact DNN models (RIBAC). By performing systematic analysis and exploration of the important design knobs, we propose a framework that can learn the proper trigger patterns, model parameters and pruning masks in an efficient way, thereby achieving high trigger stealthiness, high attack success rate and high model efficiency simultaneously. Extensive evaluations across different datasets, including tests against state-of-the-art defense mechanisms, demonstrate the high robustness, stealthiness and model efficiency of RIBAC. Code is available at https://github.com/huyvnphan/ECCV2022-RIBAC
Submitted 22 August, 2022;
originally announced August 2022.
-
Data Augmentation is a Hyperparameter: Cherry-picked Self-Supervision for Unsupervised Anomaly Detection is Creating the Illusion of Success
Authors:
Jaemin Yoo,
Tiancheng Zhao,
Leman Akoglu
Abstract:
Self-supervised learning (SSL) has emerged as a promising alternative for creating supervisory signals for real-world problems, avoiding the extensive cost of manual labeling. SSL is particularly attractive for unsupervised tasks such as anomaly detection (AD), where labeled anomalies are rare or often nonexistent. A large catalog of augmentation functions has been used for SSL-based AD (SSAD) on image data, and recent works have reported that the type of augmentation has a significant impact on accuracy. Motivated by these findings, this work sets out to put image-based SSAD under a larger lens and investigate the role of data augmentation in SSAD. Through extensive experiments on 3 different detector models and across 420 AD tasks, we provide comprehensive numerical and visual evidence that the alignment between data augmentation and the anomaly-generating mechanism is the key to the success of SSAD, and that in its absence, SSL may even impair accuracy. To the best of our knowledge, this is the first meta-analysis on the role of data augmentation in SSAD.
Submitted 27 July, 2023; v1 submitted 16 August, 2022;
originally announced August 2022.
-
Scalable neural quantum states architecture for quantum chemistry
Authors:
Tianchen Zhao,
James Stokes,
Shravan Veerapaneni
Abstract:
Variational optimization of neural-network representations of quantum states has been successfully applied to solve interacting fermionic problems. Despite rapid developments, significant scalability challenges arise when considering molecules of large scale, which correspond to non-locally interacting quantum spin Hamiltonians consisting of sums of thousands or even millions of Pauli operators. In this work, we introduce scalable parallelization strategies to improve neural-network-based variational quantum Monte Carlo calculations for ab-initio quantum chemistry applications. We establish GPU-supported local energy parallelism to compute the optimization objective for Hamiltonians of potentially complex molecules. Using autoregressive sampling techniques, we demonstrate systematic improvement in wall-clock timings required to achieve CCSD baseline target energies. The performance is further enhanced by accommodating the structure of resultant spin Hamiltonians into the autoregressive sampling ordering. The algorithm achieves promising performance in comparison with the classical approximate methods and exhibits both running time and scalability advantages over existing neural-network based methods.
Submitted 11 August, 2022;
originally announced August 2022.
-
Computer Vision Methods for the Microstructural Analysis of Materials: The State-of-the-art and Future Perspectives
Authors:
Khaled Alrfou,
Amir Kordijazi,
Tian Zhao
Abstract:
Finding quantitative descriptors representing the microstructural features of a given material is an ongoing research area in the paradigm of Materials-by-Design. Historically, microstructural analysis mostly relies on qualitative descriptions. However, to build a robust and accurate process-structure-properties relationship, which is required for designing new advanced high-performance materials, the extraction of quantitative and meaningful statistical data from the microstructural analysis is a critical step. In recent years, computer vision (CV) methods, especially those which are centered around convolutional neural network (CNN) algorithms have shown promising results for this purpose. This review paper focuses on the state-of-the-art CNN-based techniques that have been applied to various multi-scale microstructural image analysis tasks, including classification, object detection, segmentation, feature extraction, and reconstruction. Additionally, we identified the main challenges with regard to the application of these methods to materials science research. Finally, we discussed some possible future directions of research in this area. In particular, we emphasized the application of transformer-based models and their capabilities to improve the microstructural analysis of materials.
Submitted 29 July, 2022;
originally announced August 2022.
-
Quantum-inspired variational algorithms for partial differential equations: Application to financial derivative pricing
Authors:
Tianchen Zhao,
Chuhao Sun,
Asaf Cohen,
James Stokes,
Shravan Veerapaneni
Abstract:
Variational quantum Monte Carlo (VMC) combined with neural-network quantum states offers a novel angle of attack on the curse-of-dimensionality encountered in a particular class of partial differential equations (PDEs); namely, the real- and imaginary time-dependent Schrödinger equation. In this paper, we present a simple generalization of VMC applicable to arbitrary time-dependent PDEs, showcasing the technique in the multi-asset Black-Scholes PDE for pricing European options contingent on many correlated underlying assets.
Submitted 21 July, 2022;
originally announced July 2022.
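For orientation, the multi-asset Black-Scholes PDE mentioned in the abstract above is commonly written in the following textbook form. The notation is an assumption and is not taken from the paper: $V(t, S_1, \dots, S_d)$ is the option value, $S_i$ are the asset prices, $\sigma_i$ their volatilities, $\rho_{ij}$ their correlations, $r$ the risk-free rate, $T$ the maturity, and $g$ the payoff. The assets are assumed to pay no dividends.

```latex
\frac{\partial V}{\partial t}
  + \frac{1}{2} \sum_{i=1}^{d} \sum_{j=1}^{d}
      \rho_{ij}\, \sigma_i \sigma_j\, S_i S_j\,
      \frac{\partial^2 V}{\partial S_i\, \partial S_j}
  + r \sum_{i=1}^{d} S_i\, \frac{\partial V}{\partial S_i}
  - r V = 0,
\qquad
V(T, S_1, \dots, S_d) = g(S_1, \dots, S_d).
```

The number of spatial dimensions equals the number of underlying assets $d$, which is where the curse of dimensionality referred to in the abstract enters.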
-
Space-based gravitational wave signal detection and extraction with deep neural network
Authors:
Tianyu Zhao,
Ruoxi Lyu,
He Wang,
Zhoujian Cao,
Zhixiang Ren
Abstract:
Space-based gravitational wave (GW) detectors will be able to observe signals from sources that are nearly impossible to observe with current ground-based detectors. Consequently, the well-established signal detection method, matched filtering, will require a complex template bank, leading to a computational cost that is too expensive in practice. Here, we develop a high-accuracy GW signal detection and extraction method for all space-based GW sources. As a proof of concept, we show that a science-driven and uniform multi-stage self-attention-based deep neural network can identify synthetic signals that are submerged in Gaussian noise. Our method exhibits a detection rate exceeding 99% in identifying signals from various sources, with the signal-to-noise ratio at 50, at a false alarm rate of 1%, while obtaining at least 95% similarity compared with target signals. We further demonstrate the interpretability and strong generalization behavior for several extended scenarios.
Submitted 15 August, 2023; v1 submitted 15 July, 2022;
originally announced July 2022.
-
Telecom-band Multi-Type Spontaneous Parametric Downconversion in Periodically Polarized Nonlinear Materials
Authors:
Xi-Yu Liu,
Ya-Fei Yu,
Zheng-Jun Wei,
Tian-Ming Zhao,
Jin-Dong Wang
Abstract:
Spontaneous parametric downconversion is an essential technique in quantum optics experiments. In this paper, various quasi-phase-matching processes in several typical periodically polarized nonlinear materials are analyzed and calculated. Furthermore, a general method for realizing multiple types of quasi-phase-matching in a monolithic material is presented. Finally, a novel design to prepare multiple entangled photon pairs based on the Sagnac interferometer is discussed. This technology can be applied to tiny optical paths in the telecom C band, saving both cost and space.
Submitted 11 July, 2022;
originally announced July 2022.
-
RWT-SLAM: Robust Visual SLAM for Highly Weak-textured Environments
Authors:
Qihao Peng,
Zhiyu Xiang,
YuanGang Fan,
Tengqi Zhao,
Xijun Zhao
Abstract:
As a fundamental task for intelligent robots, visual SLAM has made great progress over the past decades. However, robust SLAM in highly weak-textured environments remains very challenging. In this paper, we propose a novel visual SLAM system named RWT-SLAM to tackle this problem. We modify the LoFTR network, which is able to produce dense point matching in low-textured scenes, to generate feature descriptors. To integrate the new features into the popular ORB-SLAM framework, we develop feature masks to filter out unreliable features and employ a KNN strategy to strengthen the matching robustness. We also retrain the visual vocabulary on the new descriptors for efficient loop closing. The resulting RWT-SLAM is tested on various public datasets such as TUM and OpenLORIS, as well as our own data. The results show very promising performance in highly weak-textured environments.
Submitted 7 July, 2022;
originally announced July 2022.
-
Aspect-Based Sentiment Analysis using Local Context Focus Mechanism with DeBERTa
Authors:
Tianyu Zhao,
Junping Du,
Zhe Xue,
Ang Li,
Zeli Guan
Abstract:
Text sentiment analysis, also known as opinion mining, is the computational study of people's views, evaluations, attitudes and emotions expressed toward entities. Text sentiment analysis can be divided into text-level sentiment analysis, sentence-level sentiment analysis and aspect-level sentiment analysis. Aspect-Based Sentiment Analysis (ABSA) is a fine-grained task in the field of sentiment analysis, which aims to predict the polarity of aspects. Research on pre-trained neural models has significantly improved the performance of many natural language processing tasks. In recent years, pre-trained models (PTMs) have been applied to ABSA. This raises the question of whether PTMs contain sufficient syntactic information for ABSA. In this paper, we explore the recent DeBERTa model (Decoding-enhanced BERT with disentangled attention) to solve the Aspect-Based Sentiment Analysis problem. DeBERTa is a transformer-based neural language model that uses self-supervised learning to pre-train on a large number of raw text corpora. Based on the Local Context Focus (LCF) mechanism, and integrating the DeBERTa model, we propose a multi-task learning model for aspect-based sentiment analysis. Experimental results on the most commonly used laptop and restaurant datasets of SemEval-2014 and on the ACL Twitter dataset show that the LCF mechanism with DeBERTa yields significant improvement.
Submitted 7 July, 2022; v1 submitted 5 July, 2022;
originally announced July 2022.
-
VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations
Authors:
Tiancheng Zhao,
Tianqi Zhang,
Mingwei Zhu,
Haozhan Shen,
Kyusong Lee,
Xiaopeng Lu,
Jianwei Yin
Abstract:
Vision-Language Pretraining (VLP) models have recently successfully facilitated many cross-modal downstream tasks. Most existing works evaluated their systems by comparing the fine-tuned downstream task performance. However, average downstream task accuracy alone provides little information about the pros and cons of each VLP method, let alone insights into how the community can improve the systems in the future. Inspired by CheckList for testing natural language processing, we present VL-CheckList, a novel framework to understand the capabilities of VLP models. The proposed method divides the image-texting ability of a VLP model into three categories: objects, attributes, and relations, and uses a novel taxonomy to further break down these three aspects. We conduct comprehensive studies to analyze seven recently popular VLP models via the proposed framework. Results confirm the effectiveness of the proposed method by revealing fine-grained differences among the compared models that were not visible from downstream task-only evaluation. Further results point to promising research directions for building better VLP models. Our data and code are available at: https://github.com/om-ai-lab/VL-CheckList.
Submitted 22 June, 2023; v1 submitted 1 July, 2022;
originally announced July 2022.
-
Active Coding Piezoelectric Metasurfaces
Authors:
Zhaoxi Li,
Chunlong Fei,
Shenghui Yang,
Chenxue Hou,
Jianxin Zhao,
Yi Li,
Chenxi Zheng,
Heping Wu,
Yi Quan,
Tianlong Zhao,
Dongdong Chen,
Di Li,
Gang Niu,
Wei Ren,
Meng Xiao,
Yintang Yang
Abstract:
The manipulation of acoustic waves plays an important role in a wide range of applications. Currently, acoustic wave manipulation typically relies on either acoustic metasurfaces or phased array transducers. The elements of metasurfaces are designed and optimized for a target frequency, which thus limits their bandwidth. Phased array transducers, suffering from high-cost and complex control circuits, are usually limited by the array size and the filling ratio of the control units. In this work, we introduce active coding piezoelectric metasurfaces; demonstrate commonly implemented acoustic wave manipulation functionalities such as beam steering, beam focusing and vortex beam focusing, acoustic tweezers; and eventually realize ultrasound imaging. The information coded on the piezoelectric metasurfaces herein is frequency independent and originates from the polarization directions, pointing either up or down, of the piezoelectric materials. Such a piezoelectric metasurface is driven by a single electrode and acts as a controllable active sound source, which combines the advantages of acoustic metasurfaces and phased array transducers while keeping the devices structurally simple and compact. Our coding piezoelectric metasurfaces can lead to potential technological innovations in underwater acoustic wave modulation, acoustic tweezers, biomedical imaging, industrial non-destructive testing and neural regulation.
Submitted 29 June, 2022;
originally announced June 2022.
-
ECG Heartbeat classification using deep transfer learning with Convolutional Neural Network and STFT technique
Authors:
Minh Cao,
Tianqi Zhao,
Yanxun Li,
Wenhao Zhang,
Peyman Benharash,
Ramin Ramezani
Abstract:
The electrocardiogram (ECG) is a simple non-invasive measure for identifying heart-related issues such as irregular heartbeats, known as arrhythmias. Artificial intelligence and machine learning are being utilized in a wide range of healthcare-related applications and datasets, and many arrhythmia classifiers using deep learning methods have been proposed in recent years. However, the available datasets from which to build and assess machine learning models are often very small, and the lack of well-annotated public ECG datasets is evident. In this paper, we propose a deep transfer learning framework that aims to perform classification on a small training dataset. The proposed method fine-tunes a general-purpose image classifier, ResNet-18, on the MIT-BIH arrhythmia dataset in accordance with the AAMI EC57 standard. This paper further investigates many existing deep learning models that have failed to avoid data leakage, contrary to AAMI recommendations. We compare how different data split methods impact model performance. This comparison study implies that future work in arrhythmia classification should follow the AAMI EC57 standard when using any dataset, including the MIT-BIH arrhythmia dataset.
Submitted 7 July, 2022; v1 submitted 28 June, 2022;
originally announced June 2022.
-
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
Authors:
Qingru Zhang,
Simiao Zuo,
Chen Liang,
Alexander Bukharin,
Pengcheng He,
Weizhu Chen,
Tuo Zhao
Abstract:
Large Transformer-based models have exhibited superior performance in various natural language processing and computer vision tasks. However, these models contain enormous amounts of parameters, which restrict their deployment to real-world applications. To reduce the model size, researchers prune these models based on the weights' importance scores. However, such scores are usually estimated on mini-batches during training, which incurs large variability/uncertainty due to mini-batch sampling and complicated training dynamics. As a result, some crucial weights could be pruned by commonly used pruning methods because of such uncertainty, which makes training unstable and hurts generalization. To resolve this issue, we propose PLATON, which captures the uncertainty of importance scores by upper confidence bound (UCB) of importance estimation. In particular, for the weights with low importance scores but high uncertainty, PLATON tends to retain them and explores their capacity. We conduct extensive experiments with several Transformer-based models on natural language understanding, question answering and image classification to validate the effectiveness of PLATON. Results demonstrate that PLATON manifests notable improvement under different sparsity levels. Our code is publicly available at https://github.com/QingruZhang/PLATON.
Submitted 25 June, 2022;
originally announced June 2022.
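The PLATON abstract above describes retaining weights whose importance scores are low but uncertain. A minimal Python sketch of that general idea follows. It uses a generic upper confidence bound (mean plus a multiple of the standard deviation of per-mini-batch importance estimates), which is an assumption made for illustration and not necessarily the estimator used in PLATON; the names `ucb_scores`, `weights_to_keep`, and the parameter `c` are hypothetical.

```python
from statistics import mean, pstdev


def ucb_scores(importance_samples, c=1.0):
    """Return a generic UCB score per weight: mean + c * std of its estimates.

    importance_samples maps a weight name to the list of importance
    estimates observed on different mini-batches (illustrative input format).
    """
    return {
        name: mean(values) + c * pstdev(values)
        for name, values in importance_samples.items()
    }


def weights_to_keep(importance_samples, keep, c=1.0):
    """Keep the `keep` weights with the highest UCB score; the rest are pruned."""
    scores = ucb_scores(importance_samples, c)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:keep])


if __name__ == "__main__":
    samples = {
        "w_stable": [0.30, 0.30, 0.30],  # moderate importance, no variability
        "w_noisy": [0.00, 0.10, 0.60],   # lower mean, high variability
        "w_small": [0.05, 0.05, 0.05],   # low importance, no variability
    }
    # Ranking by the mean alone (c=0) keeps the stable weight.
    print(weights_to_keep(samples, keep=1, c=0.0))
    # Adding the uncertainty bonus (c=1) keeps the noisy weight instead.
    print(weights_to_keep(samples, keep=1, c=1.0))
```

With `c=0` the ranking reduces to the plain mean importance, so the comparison between the two calls isolates the effect of the uncertainty term.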
-
End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue
Authors:
Kentaro Mitsui,
Tianyu Zhao,
Kei Sawada,
Yukiya Hono,
Yoshihiko Nankaku,
Keiichi Tokuda
Abstract:
Recent text-to-speech (TTS) has achieved quality comparable to that of humans; however, its application in spoken dialogue has not been widely studied. This study aims to realize a TTS that closely resembles human dialogue. First, we record and transcribe actual spontaneous dialogues. Then, the proposed dialogue TTS is trained in two stages. In the first stage, variational autoencoder (VAE)-VITS or Gaussian mixture variational autoencoder (GMVAE)-VITS is trained, which introduces an utterance-level latent variable into variational inference with adversarial learning for end-to-end text-to-speech (VITS), a recently proposed end-to-end TTS model. A style encoder that extracts a latent speaking style representation from speech is trained jointly with TTS. In the second stage, a style predictor is trained to predict the speaking style to be synthesized from dialogue history. During inference, by passing the speaking style representation predicted by the style predictor to VAE/GMVAE-VITS, speech can be synthesized in a style appropriate to the context of the dialogue. Subjective evaluation results demonstrate that the proposed method outperforms the original VITS in terms of dialogue-level naturalness.
Submitted 23 June, 2022;
originally announced June 2022.
-
Computer-aided quantization and numerical analysis of superconducting circuits
Authors:
Sai Pavan Chitta,
Tianpu Zhao,
Ziwen Huang,
Ian Mondragon-Shem,
Jens Koch
Abstract:
The development of new superconducting circuits and the improvement of existing ones rely on the accurate modeling of spectral properties which are key to achieving the needed advances in qubit performance. Systematic circuit analysis at the lumped-element level, starting from a circuit network and culminating in a Hamiltonian appropriately describing the quantum properties of the circuit, is a well-established procedure, yet cumbersome to carry out manually for larger circuits. We present work utilizing symbolic computer algebra and numerical diagonalization routines versatile enough to tackle a variety of circuits. Results from this work are accessible through a newly released module of the scqubits package.
Submitted 2 July, 2022; v1 submitted 16 June, 2022;
originally announced June 2022.
-
RefCrowd: Grounding the Target in Crowd with Referring Expressions
Authors:
Heqian Qiu,
Hongliang Li,
Taijin Zhao,
Lanxiao Wang,
Qingbo Wu,
Fanman Meng
Abstract:
Crowd understanding has aroused widespread interest in the vision domain due to its practical significance. Unfortunately, there has been no effort to explore crowd understanding in the multi-modal domain that bridges natural language and computer vision. Referring expression comprehension (REF) is a representative multi-modal task of this kind. Current REF studies focus more on grounding the target object from multiple distinctive categories in general scenarios, and they are difficult to apply to complex real-world crowd understanding. To fill this gap, we propose a new challenging dataset, called RefCrowd, which aims at finding the target person in a crowd with referring expressions. It requires not only sufficiently mining the natural language information, but also carefully focusing on subtle differences between the target and a crowd of persons with similar appearance, so as to realize the fine-grained mapping from language to vision. Furthermore, we propose a Fine-grained Multi-modal Attribute Contrastive Network (FMAC) to deal with REF in crowd understanding. It first decomposes the intricate visual and language features into attribute-aware multi-modal features, and then captures discriminative but robust fine-grained attribute features to effectively distinguish these subtle differences between similar persons. The proposed method outperforms existing state-of-the-art (SoTA) methods on our RefCrowd dataset and existing REF datasets. In addition, we implement an end-to-end REF toolbox for deeper research in the multi-modal domain. Our dataset and code are available at https://qiuheqian.github.io/datasets/refcrowd/.
Submitted 16 June, 2022;
originally announced June 2022.
-
Implementing two-qubit gates at the quantum speed limit
Authors:
Joel Howard,
Alexander Lidiak,
Casey Jameson,
Bora Basyildiz,
Kyle Clark,
Tongyu Zhao,
Mustafa Bal,
Junling Long,
David P. Pappas,
Meenakshi Singh,
Zhexuan Gong
Abstract:
The speed of elementary quantum gates, particularly two-qubit gates, ultimately sets the limit on the speed at which quantum circuits can operate. In this work, we experimentally demonstrate commonly used two-qubit gates at nearly the fastest possible speed allowed by the physical interaction strength between two superconducting transmon qubits. We achieve this quantum speed limit by implementing experimental gates designed using a machine learning inspired optimal control method. Importantly, our method only requires the single-qubit drive strength to be moderately larger than the interaction strength to achieve an arbitrary two-qubit gate close to its analytical speed limit with high fidelity. Thus, the method is applicable to a variety of platforms including those with comparable single-qubit and two-qubit gate speeds, or those with always-on interactions. We expect our method to offer significant speedups for non-native two-qubit gates that are typically achieved with a long sequence of single-qubit and native two-qubit gates.
Submitted 1 December, 2023; v1 submitted 15 June, 2022;
originally announced June 2022.
-
Synthetic Over-sampling for Imbalanced Node Classification with Graph Neural Networks
Authors:
Tianxiang Zhao,
Xiang Zhang,
Suhang Wang
Abstract:
In recent years, graph neural networks (GNNs) have achieved state-of-the-art performance for node classification. However, most existing GNNs suffer from the graph imbalance problem. In many real-world scenarios, node classes are imbalanced, with some majority classes making up most of the graph. The message propagation mechanism in GNNs further amplifies the dominance of those majority classes, resulting in sub-optimal classification performance. In this work, we seek to address this problem by generating pseudo instances of minority classes to balance the training data, extending previous over-sampling-based techniques. This task is non-trivial, as those techniques are designed with the assumption that instances are independent. Neglecting relation information would complicate this over-sampling process. Furthermore, the node classification task typically takes the semi-supervised setting with only a few labeled nodes, providing insufficient supervision for the generation of minority instances. Newly generated nodes of low quality would harm the trained classifier. In this work, we address these difficulties by synthesizing new nodes in a constructed embedding space, which encodes both node attributes and topology information. Furthermore, an edge generator is trained simultaneously to model the graph structure and provide relations for new samples. To further improve data efficiency, we also explore synthesizing mixed "in-between" nodes to utilize nodes from the majority class in this over-sampling process. Experiments on real-world datasets validate the effectiveness of our proposed framework.
Submitted 10 June, 2022;
originally announced June 2022.
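The abstract above builds on earlier over-sampling techniques and synthesizes minority-class nodes in an embedding space. The Python sketch below shows only the classic interpolation step of such techniques (SMOTE-style: move a minority sample part of the way toward its nearest same-class neighbour). It is an illustration under stated assumptions, not the authors' framework: the embedding construction and the edge generator are omitted, and the function names and toy data are hypothetical.

```python
import random


def nearest_same_class(idx, embeddings, labels):
    """Index of the closest other embedding sharing labels[idx], or None."""
    best, best_dist = None, float("inf")
    for j, (emb, label) in enumerate(zip(embeddings, labels)):
        if j == idx or label != labels[idx]:
            continue
        dist = sum((a - b) ** 2 for a, b in zip(embeddings[idx], emb))
        if dist < best_dist:
            best, best_dist = j, dist
    return best


def synthesize(idx, embeddings, labels, rng):
    """Interpolate between sample idx and its nearest same-class neighbour."""
    j = nearest_same_class(idx, embeddings, labels)
    if j is None:
        raise ValueError("no other sample of the same class to interpolate with")
    lam = rng.random()  # interpolation weight in [0, 1)
    return [a + lam * (b - a) for a, b in zip(embeddings[idx], embeddings[j])]


if __name__ == "__main__":
    # Toy 2-D embeddings: class 1 is the minority class.
    embeddings = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
    labels = [1, 1, 0, 0, 0]
    new_node = synthesize(0, embeddings, labels, random.Random(0))
    print(new_node)  # a point on the segment between the two minority samples
```

Because the new point lies on the segment between two existing minority samples, it inherits the minority label; attaching it to the graph still requires predicted edges, which is the role the abstract assigns to the edge generator.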
-
Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint
Authors:
Hao Liu,
Minshuo Chen,
Siawpeng Er,
Wenjing Liao,
Tong Zhang,
Tuo Zhao
Abstract:
Overparameterized neural networks enjoy great representation power on complex data, and more importantly yield sufficiently smooth output, which is crucial to their generalization and robustness. Most existing function approximation theories suggest that with sufficiently many parameters, neural networks can approximate certain classes of functions well in terms of the function value. The neural networks themselves, however, can be highly nonsmooth. To bridge this gap, we take convolutional residual networks (ConvResNets) as an example, and prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness. Moreover, we extend our theory to approximating functions supported on a low-dimensional manifold. Our theory partially justifies the benefits of using deep and wide networks in practice. Numerical experiments on adversarially robust image classification are provided to support our theory.
Submitted 9 June, 2022;
originally announced June 2022.
-
Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks
Authors:
Xiang Ji,
Minshuo Chen,
Mengdi Wang,
Tuo Zhao
Abstract:
We consider the off-policy evaluation problem of reinforcement learning using deep convolutional neural networks. We analyze the deep fitted Q-evaluation method for estimating the expected cumulative reward of a target policy, when the data are generated from an unknown behavior policy. We show that, by choosing network size appropriately, one can leverage any low-dimensional manifold structure in the Markov decision process and obtain a sample-efficient estimator without suffering from the curse of high data ambient dimensionality. Specifically, we establish a sharp error bound for fitted Q-evaluation, which depends on the intrinsic dimension of the state-action space, the smoothness of Bellman operator, and a function class-restricted $\chi^2$-divergence. It is noteworthy that the restricted $\chi^2$-divergence measures the behavior and target policies' mismatch in the function space, which can be small even if the two policies are not close to each other in their tabular forms. We also develop a novel approximation result for convolutional neural networks in Q-function estimation. Numerical experiments are provided to support our theoretical analysis.
Submitted 3 October, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Graph Rationalization with Environment-based Augmentations
Authors:
Gang Liu,
Tong Zhao,
Jiaxin Xu,
Tengfei Luo,
Meng Jiang
Abstract:
A rationale is defined as a subset of input features that best explains or supports the prediction made by a machine learning model. Rationale identification has improved the generalizability and interpretability of neural networks on vision and language data. In graph applications such as molecule and polymer property prediction, identifying representative subgraph structures, called graph rationales, plays an essential role in the performance of graph neural networks. Existing graph pooling and/or distribution intervention methods suffer from a lack of examples from which to learn to identify optimal graph rationales. In this work, we introduce a new augmentation operation called environment replacement that automatically creates virtual data examples to improve rationale identification. We propose an efficient framework that performs rationale-environment separation and representation learning on the real and augmented examples in latent spaces to avoid the high complexity of explicit graph decoding and encoding. In comparison with recent techniques, experiments on seven molecular and four polymer real datasets demonstrate the effectiveness and efficiency of the proposed augmentation-based graph rationalization framework.
Submitted 26 September, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
-
An Unbiased Quantum Random Number Generator Based on Boson Sampling
Authors:
Jinjing Shi,
Tongge Zhao,
Yizhi Wang,
Chunlin Yu,
Yuhu Lu,
Ronghua Shi,
Shichao Zhang,
Junjie Wu
Abstract:
Boson sampling has been shown to be a highly promising model of optical quantum computation, and it has been applied successfully to the design of quantum computers such as "Jiuzhang". However, the meaningful randomness of Boson sampling results, whose correctness and significance were proved from a specific quantum mechanical distribution, has not been utilized or exploited. In this research, Boson sampling is applied to design a novel Quantum Random Number Generator (QRNG) by fully exploiting the randomness of Boson sampling results, and a prototype system is constructed with a programmable silicon photonic processor. The prototype can generate uniform and unbiased random sequences and overcomes shortcomings of existing discrete QRNGs, such as being source-related, placing a high demand on the photon number resolution capability of the detector, and having a slow self-detection generation speed. Boson sampling is implemented as a random entropy source, and random bit strings with satisfactory randomness and uniformity can be obtained after post-processing the sampling results. This is the first approach that applies the randomness of Boson sampling results to develop a practical prototype system for actual tasks. The experimental results demonstrate that the designed Boson sampling-based QRNG prototype passes 15 tests of the NIST SP 800-22 statistical test suite, which shows that Boson sampling has great potential for practical applications with desirable performance beyond quantum advantage.
Submitted 5 June, 2022;
originally announced June 2022.
-
Revisiting excitation gaps in the fractional quantum Hall effect
Authors:
Tongzhou Zhao,
Koji Kudo,
W. N. Faugno,
Ajit C. Balram,
J. K. Jain
Abstract:
Recent systematic measurements of the quantum well width dependence of the excitation gaps of fractional quantum Hall states in high mobility samples [Villegas Rosales et al., Phys. Rev. Lett. 127, 056801 (2021)] open the possibility of a better quantitative understanding of this important issue. We present what we believe to be accurate theoretical gaps including the effects of finite width and Landau level (LL) mixing. While theory captures the width dependence, there still remains a deviation between the calculated and the measured gaps, presumably caused by disorder. It is customary to model the experimental gaps of the $n/(2n\pm 1)$ states as $\Delta_{n/(2n\pm 1)} = Ce^2/[(2n\pm 1)\varepsilon l]-\Gamma$, where $\varepsilon$ is the dielectric constant of the background semiconductor and $l$ is the magnetic length; the first term is interpreted as the cyclotron energy of composite fermions and $\Gamma$ as a disorder-induced broadening of composite-fermion LLs. Fitting the gaps for various fractional quantum Hall states, we find that $\Gamma$ can be nonzero even in the absence of disorder.
Submitted 31 May, 2022;
originally announced May 2022.
-
A full contraction-reaction-diffusion model for pattern formation in geometrically confined microtissues
Authors:
Tiankai Zhao,
Hongyan Yuan
Abstract:
Reaction-diffusion models have been extensively applied to explain the mechanism of pattern formation in early embryogenesis based on geometrically confined microtissues consisting of human pluripotent stem cells. Recently, mechanical cues, such as cellular stresses and strains, have been found to dictate pattern formation in human stem cell differentiation. As a result, the traditional reaction-diffusion models have been modified by adding mechanically related terms to account for the role played by mechanical cues. However, these models either do not consider the activeness of the cellular tissues or neglect their poroelastic nature, namely that biological tissues are made of both cells and interstitial fluid. Hence, the current models lack biophysical relevance. Here we propose a modified reaction-diffusion model that couples with the active contraction of cellular tissues. The cellular tissue is modelled as a piece of biphasic poroelastic material, where mechanical forces naturally regulate the transport of chemical cues. Such chemical cues direct cell fate and hence yield certain types of pattern formation observed in previous experiments.
Submitted 30 May, 2022;
originally announced May 2022.
-
Towards Faithful and Consistent Explanations for Graph Neural Networks
Authors:
Tianxiang Zhao,
Dongsheng Luo,
Xiang Zhang,
Suhang Wang
Abstract:
Uncovering rationales behind predictions of graph neural networks (GNNs) has received increasing attention over recent years. Instance-level GNN explanation aims to discover critical input elements, like nodes or edges, that the target GNN relies upon for making predictions. Though various algorithms have been proposed, most of them formalize this task as searching for the minimal subgraph that preserves the original predictions. However, an inductive bias is deep-rooted in this framework: several subgraphs can result in the same or similar outputs as the original graphs. Consequently, these algorithms risk providing spurious explanations and fail to provide consistent explanations. Applying them to explain weakly performing GNNs would further amplify these issues. To address this problem, we theoretically examine the predictions of GNNs from the causality perspective. Two typical reasons for spurious explanations are identified: the confounding effect of latent variables like distribution shift, and causal factors distinct from the original input. Observing that both confounding effects and diverse causal rationales are encoded in internal representations, we propose a simple yet effective countermeasure by aligning embeddings. Concretely, concerning potential shifts in the high-dimensional space, we design a distribution-aware alignment algorithm based on anchors. This new objective is easy to compute and can be incorporated into existing techniques with little or no effort. Theoretical analysis shows that it is in effect optimizing a more faithful explanation objective by design, which further justifies the proposed approach.
Submitted 16 December, 2022; v1 submitted 26 May, 2022;
originally announced May 2022.
-
Exploiting Dynamic and Fine-grained Semantic Scope for Extreme Multi-label Text Classification
Authors:
Yuan Wang,
Huiling Song,
Peng Huo,
Tao Xu,
Jucheng Yang,
Yarui Chen,
Tingting Zhao
Abstract:
Extreme multi-label text classification (XMTC) refers to the problem of tagging a given text with the most relevant subset of labels from a large label set. Because of the large label dimensionality in XMTC, a majority of labels have only a few training instances. To address this data sparsity issue, most existing XMTC methods use fixed label clusters obtained at an early stage to balance performance on tail labels and head labels. However, such label clusters provide a static and coarse-grained semantic scope for every text, which ignores the distinct characteristics of different texts and makes it difficult to model an accurate semantic scope for texts with tail labels. In this paper, we propose a novel framework, TReaderXML, for XMTC, which adopts a dynamic and fine-grained semantic scope from teacher knowledge for each individual text to optimize text-conditional prior category semantic ranges. TReaderXML dynamically obtains teacher knowledge for each text from similar texts and hierarchical label information in the training sets, yielding a distinctly fine-grained label-oriented semantic scope. TReaderXML then benefits from a novel dual cooperative network that first learns features of a text and its corresponding label-oriented semantic scope with parallel Encoding and Reading Modules, then embeds the two parts with an Interaction Module to regularize the text's representation by the dynamic and fine-grained label-oriented semantic scope, and finally finds target labels with a Prediction Module. Experimental results on three XMTC benchmark datasets show that our method achieves new state-of-the-art results and performs especially well on severely imbalanced and sparse datasets.
Submitted 24 May, 2022;
originally announced May 2022.
-
Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training
Authors:
Yifan Gao,
Qingyu Yin,
Zheng Li,
Rui Meng,
Tong Zhao,
Bing Yin,
Irwin King,
Michael R. Lyu
Abstract:
Keyphrase generation is the task of automatically predicting keyphrases given a piece of long text. Despite its recent flourishing, keyphrase generation for non-English languages has not been widely investigated. In this paper, we call attention to a new setting named multilingual keyphrase generation and we contribute two new datasets, EcommerceMKP and AcademicMKP, covering six languages. Technically, we propose a retrieval-augmented method for multilingual keyphrase generation to mitigate the data shortage problem in non-English languages. The retrieval-augmented model leverages keyphrase annotations in English datasets to facilitate generating keyphrases in low-resource languages. Given a non-English passage, a cross-lingual dense passage retrieval module finds relevant English passages. Then the associated English keyphrases serve as external knowledge for keyphrase generation in the current language. Moreover, we develop a retriever-generator iterative training algorithm to mine pseudo parallel passage pairs to strengthen the cross-lingual passage retriever. Comprehensive experiments and ablations show that the proposed approach outperforms all baselines.
Submitted 1 June, 2022; v1 submitted 20 May, 2022;
originally announced May 2022.