-
Epithelial-substrate coupling strength regulates the landscape of the traction in cohesive monolayers: a parametric study and a revisit to "size effect"
Authors:
Tiankai Zhao,
Hongyan Yuan
Abstract:
Epithelial cells can assemble into cohesive colonies and collectively interact with substrates by generating extracellular forces through focal adhesions. Recently, a molecularly based thermodynamic model, which integrates both the monolayer elasticity and force-mediated focal adhesion formation, has been developed to elucidate the regulation of the cellular force landscape induced by the active epithelial-substrate coupling. However, how the epithelial-substrate coupling strength mediates the landscapes of the traction, the cellular displacement, and the focal adhesion distribution in a cohesive monolayer has not been examined in detail. In this work, we follow the procedure of the previous work to reformulate the free energy of the epithelial-substrate system and obtain the thermodynamic steady-state equations. We then derive a simplified form of the complete equation system and solve it both semi-analytically and numerically. We find that the parameter that characterizes the epithelial-substrate coupling strength can significantly affect the landscapes of the traction, the cellular displacement, and the focal adhesion distribution. We also revisit the "size effect" addressed by previous works and demonstrate that this effect is the natural outcome of a strong epithelial-substrate coupling, without introducing any extra factors. For epithelial-substrate coupling that is not strong enough, the currently observed "size effect" does not hold. Based on our model, we propose a scaling law that determines whether the previously observed "size effect" holds.
Submitted 8 May, 2023;
originally announced May 2023.
-
A lower bound for the beta function
Authors:
Tiehong Zhao,
Miaokun Wang
Abstract:
We present a new lower bound for Euler's beta function, $B(x,y)$, which states that the inequality \begin{equation*}
B(x,y)>\frac{x+y}{xy}\left(1-\frac{2xy}{x+y+1}\right) \end{equation*} holds on $(0,1]\times(0,1]$, which improves a lower bound obtained by P. Ivády [12, Theorem, (3.2)] in the case of $0<x+y<1$.
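The stated inequality can be spot-checked numerically. The following is a minimal sketch, not part of the paper: it evaluates $B(x,y)=\Gamma(x)\Gamma(y)/\Gamma(x+y)$ with the standard-library `math.gamma` and compares it with the right-hand side on a coarse grid over $(0,1]\times(0,1]$. The helper names `beta`, `lower_bound`, and `min_gap` and the grid resolution are assumptions made for illustration; a grid check is only a sanity test, not a proof.

```python
import math


def beta(x: float, y: float) -> float:
    """Euler's beta function via B(x, y) = Gamma(x) * Gamma(y) / Gamma(x + y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


def lower_bound(x: float, y: float) -> float:
    """Right-hand side of the inequality quoted in the abstract."""
    return (x + y) / (x * y) * (1.0 - 2.0 * x * y / (x + y + 1.0))


def min_gap(steps: int = 20) -> float:
    """Smallest value of B(x, y) - lower_bound(x, y) on a uniform grid in (0, 1]^2.

    The grid (hypothetical choice) uses the points k / steps for k = 1..steps.
    """
    grid = [k / steps for k in range(1, steps + 1)]
    return min(beta(x, y) - lower_bound(x, y) for x in grid for y in grid)


if __name__ == "__main__":
    # A positive gap means the bound held at every sampled grid point.
    print(min_gap())
```

At $x=y=1$ the two sides are $B(1,1)=1$ and $2\,(1-2/3)=2/3$, which gives a quick hand check of the formula as transcribed here.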
Submitted 4 May, 2023;
originally announced May 2023.
-
Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware
Authors:
Tony Z. Zhao,
Vikash Kumar,
Sergey Levine,
Chelsea Finn
Abstract:
Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously difficult for robots because they require precision, careful coordination of contact forces, and closed-loop visual feedback. Performing these tasks typically requires high-end robots, accurate sensors, or careful calibration, which can be expensive and difficult to set up. Can learning enable low-cost and imprecise hardware to perform these fine manipulation tasks? We present a low-cost system that performs end-to-end imitation learning directly from real demonstrations, collected with a custom teleoperation interface. Imitation learning, however, presents its own challenges, particularly in high-precision domains: errors in the policy can compound over time, and human demonstrations can be non-stationary. To address these challenges, we develop a simple yet novel algorithm, Action Chunking with Transformers (ACT), which learns a generative model over action sequences. ACT allows the robot to learn 6 difficult tasks in the real world, such as opening a translucent condiment cup and slotting a battery with 80-90% success, with only 10 minutes worth of demonstrations. Project website: https://tonyzhaozh.github.io/aloha/
Submitted 23 April, 2023;
originally announced April 2023.
-
Multi-label Node Classification On Graph-Structured Data
Authors:
Tianqi Zhao,
Ngan Thi Dong,
Alan Hanjalic,
Megha Khosla
Abstract:
Graph Neural Networks (GNNs) have shown state-of-the-art improvements in node classification tasks on graphs. While these improvements have been largely demonstrated in a multi-class classification scenario, a more general and realistic scenario in which each node could have multiple labels has so far received little attention. The first challenge in conducting focused studies on multi-label node classification is the limited number of publicly available multi-label graph datasets. Therefore, as our first contribution, we collect and release three real-world biological datasets and develop a multi-label graph generator to generate datasets with tunable properties. While the success of GNNs is usually attributed to high label similarity (high homophily), we argue that a multi-label scenario does not follow the usual semantics of homophily and heterophily so far defined for a multi-class scenario. As our second contribution, we define homophily and Cross-Class Neighborhood Similarity for the multi-label scenario and provide a thorough analysis of the collected $9$ multi-label datasets. Finally, we perform a large-scale comparative study with $8$ methods and $9$ datasets and analyse the performance of the methods to assess the progress made by the current state of the art in the multi-label node classification scenario. We release our benchmark at https://github.com/Tianqi-py/MLGNC.
Submitted 29 February, 2024; v1 submitted 20 April, 2023;
originally announced April 2023.
-
A Multi-robot Coverage Path Planning Algorithm Based on Improved DARP Algorithm
Authors:
Yufan Huang,
Man Li,
Tao Zhao
Abstract:
The research on multi-robot coverage path planning (CPP) has been attracting more and more attention. In order to achieve efficient coverage, this paper proposes an improved DARP coverage algorithm. The improved DARP algorithm based on A* algorithm is used to assign tasks to robots and then combined with STC algorithm based on Up-First algorithm to achieve full coverage of the task area. Compared with the initial DARP algorithm, this algorithm has higher efficiency and higher coverage rate.
Submitted 19 April, 2023;
originally announced April 2023.
-
A high-efficiency proton-boron fusion scheme taking into account the effects of quantum degeneracy
Authors:
S. J. Liu,
D. Wu,
T. X. Hu,
T. Y. Liang,
X. C. Ning,
J. H. Liang,
Y. C. Liu,
P. Liu,
X. Liu,
Z. M. Sheng,
Y. T. Zhao,
D. H. H. Hoffmann,
X. T. He,
J. Zhang
Abstract:
The proton-boron (p-$^{11}$B) reaction is regarded as the holy grail of advanced fusion fuels, since the primary reaction produces three $α$ particles with few neutrons and induced radio-activities from second order reactions. Compared to the Deuterium-Tritium reaction, a much higher reaction temperature is required. Moreover, bremsstrahlung energy losses due to the high nuclear charge of boron make it seem apparent that a fusion reactor based on Deuterium-Tritium plasma in equilibrium is, to say the least, very difficult. It is becoming more appealing to collide intense laser beams or accelerated proton beams with a boron target to produce p-$^{11}$B reactions. The fusion yield of p-$^{11}$B reactions is closely related to proton beam parameters and boron target conditions such as density, temperature, and ingredients. Quantum degeneracy will increase fusion yields by reducing the stopping power of injected protons. In this work, we suggest a high-efficiency scheme for beam-target p-$^{11}$B fusion via injecting a MeV proton beam into a highly compressed quantum degenerated boron target. Such a boron target can be achieved via quasi-isentropic compression of solid boron by using precisely shaped laser pulses. Our results indicate that for densities ranging from $10^3$ to $10^4ρ_s$, where $ρ_s$ is the density of solid boron, contributions of bound and free electrons to the stopping of protons can be completely disregarded and dramatically reduced, respectively. The result is an increase in fusion yield by orders of magnitude. Furthermore, in order to achieve a multiplication factor $F$ greater than one, with $F$ defined as the ratio of output fusion energy to the energy of injected protons, it is found that there exists a minimum possible density of the boron target, which is $2.15 \times 10^4 ρ_s$ when the kinetic energy of injected protons is $0.8$ MeV.
Submitted 17 April, 2023;
originally announced April 2023.
-
A Survey on Distributed Evolutionary Computation
Authors:
Wei-Neng Chen,
Feng-Feng Wei,
Tian-Fang Zhao,
Kay Chen Tan,
Jun Zhang
Abstract:
The rapid development of parallel and distributed computing paradigms has brought about great revolution in computing. Thanks to the intrinsic parallelism of evolutionary computation (EC), it is natural to implement EC on parallel and distributed computing systems. On the one hand, the computing power provided by parallel computing systems can significantly improve the efficiency and scalability of EC. On the other hand, data are collected and processed in a distributed manner, which brings a novel development direction and new challenges to EC. In this paper, we intend to give a systematic review on distributed EC (DEC). First, a new taxonomy for DEC is proposed from top design mechanism to bottom implementation mechanism. Based on this taxonomy, existing studies on DEC are reviewed in terms of purpose, parallel structure of the algorithm, parallel model for implementation, and the implementation environment. Second, we clarify two major purposes of DEC, i.e., improving efficiency through parallel processing for centralized optimization and cooperating distributed individuals/sub-populations with partial information to perform distributed optimization. Third, noting that the latter purpose of DEC is an emerging and attractive trend for EC with the booming of spatially distributed paradigms, this paper gives a systematic definition of the distributed optimization and classifies it into dimension distributed-, data distributed-, and objective distributed-optimization problems. Formal formulations for these problems are provided and various DEC studies on these problems are reviewed. We also discuss challenges and potential research directions, aiming to enlighten the design of DEC and pave the way for future developments.
Submitted 12 April, 2023;
originally announced April 2023.
-
Open-shell Tensor Hypercontraction
Authors:
Tingting Zhao,
Megan Simons,
Devin A. Matthews
Abstract:
The extension of least-squares tensor hypercontracted second- and third-order Møller-Plesset perturbation theory (LS-THC-MP2 and LS-THC-MP3) to open-shell systems is an important development due to the scaling reduction afforded by THC and the ubiquity of molecular ions, radicals, and other open-shell reactive species. The complexity of wavefunction-based quantum chemical methods such as Møller-Plesset and coupled cluster theory is reflected in the steep scaling of the computational costs with the molecular size. The least-squares tensor hypercontraction (LS-THC) method is an efficient, single-step factorization for the two-electron integral tensor, but can also be used to factorize the double excitation amplitudes, leading to significant scaling reduction. Here, we extend this promising method to open-shell variants of LS-THC-MP2 and -MP3 using diagrammatic techniques and explicit spin-summation. The accuracy of the resulting methods for open-shell species is benchmarked on standard test systems such as regular alkanes, as well as realistic systems involving bond breaking, radical stabilization, and other effects. We find that open-shell LS-THC-MP$n$ methods exhibit errors highly comparable to those produced by closed-shell LS-THC-MP$n$, and are highly insensitive to particular chemical interactions, geometries, or even to moderate spin contamination.
Submitted 27 May, 2023; v1 submitted 8 April, 2023;
originally announced April 2023.
-
High-quality NiFe thin films on oxide/non-oxide platforms via pulsed laser deposition at room temperature
Authors:
H. Yan,
G. J. Omar,
Z. T. Zhao,
Lim Zhi Shiuh,
A. Ariando
Abstract:
Soft ferromagnetic NiFe thin films are promising for applications in spintronic devices because of their constituent electrical and magnetic properties. Electron beam evaporation and sputtering techniques have been used to deposit NiFe thin films. For in-situ stacking of NiFe with functional complex oxides, the pulsed laser deposition (PLD) method is highly desirable. However, the growth of high-quality NiFe (and non-oxide thin films in general) by PLD remains a formidable task. Here, we report high-quality NiFe thin films of various thicknesses on oxide/non-oxide substrates with desirable magnetic properties by PLD at room temperature. The magnetic properties are found to be strongly dependent on the laser fluence of the deposition process. The laser fluence of 4 Joule/cm$^2$ produces the highest magnetization of ~547 emu/cc. The small coercivity (few Oersted) and sharp ferromagnetic switching behaviour indicate uniaxial anisotropy with an easy axis along the in-plane direction. In addition, thickness-dependent magnetodynamics characterizations are studied via ferromagnetic resonance. Our findings indicate the ferromagnetic characteristics are sensitive to the quality of the oxide/non-oxide substrate surface. These results offer significant insight into the PLD-based development of thin metal magnetic films.
Submitted 1 April, 2023;
originally announced April 2023.
-
Efficient Deep Learning of Robust, Adaptive Policies using Tube MPC-Guided Data Augmentation
Authors:
Tong Zhao,
Andrea Tagliabue,
Jonathan P. How
Abstract:
The deployment of agile autonomous systems in challenging, unstructured environments requires adaptation capabilities and robustness to uncertainties. Existing robust and adaptive controllers, such as those based on model predictive control (MPC), can achieve impressive performance at the cost of heavy online onboard computations. Strategies that efficiently learn robust and onboard-deployable policies from MPC have emerged, but they still lack fundamental adaptation capabilities. In this work, we extend an existing efficient Imitation Learning (IL) algorithm for robust policy learning from MPC with the ability to learn policies that adapt to challenging model/environment uncertainties. The key idea of our approach consists in modifying the IL procedure by conditioning the policy on a learned lower-dimensional model/environment representation that can be efficiently estimated online. We tailor our approach to the task of learning an adaptive position and attitude control policy to track trajectories under challenging disturbances on a multirotor. Evaluations in simulation show that a high-quality adaptive policy can be obtained in about $1.3$ hours. We additionally empirically demonstrate rapid adaptation to in- and out-of-training-distribution uncertainties, achieving a $6.1$ cm average position error under wind disturbances that correspond to about $50\%$ of the weight of the robot, and that are $36\%$ larger than the maximum wind seen during training.
Submitted 2 October, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
-
GAN-RXA: A Practical Scalable Solution to Receiver-Agnostic Transmitter Fingerprinting
Authors:
Tianyi Zhao,
Shamik Sarkar,
Enes Krijestorac,
Danijela Cabric
Abstract:
Radio frequency fingerprinting has been proposed for device identification. However, experimental studies also demonstrated its sensitivity to deployment changes. Recent works have addressed channel impacts by developing robust algorithms accounting for time and location variability, but the impacts of receiver impairments on transmitter fingerprints are yet to be solved. In this work, we investigate the receiver-agnostic transmitter fingerprinting problem, and propose a novel two-stage supervised learning framework (RXA) to address it. In the first stage, our approach calibrates a receiver-agnostic transmitter feature-extractor. We also propose two deep-learning approaches (SD-RXA and GAN-RXA) in this first stage to improve the receiver-agnostic property of the RXA framework. In the second stage, the calibrated feature-extractor is utilized to train a transmitter classifier with only one receiver. We evaluate the proposed approaches on the transmitter identification problem using a large-scale WiFi dataset. We show that when a trained transmitter-classifier is deployed on new receivers, the RXA framework can improve the classification accuracy by 19.5%, and the outlier detection ROC AUC by 12.0% compared to a naive approach without calibration. Moreover, GAN-RXA can further increase the closed-set classification accuracy by 5.0%, and the outlier detection ROC AUC by 12.5% compared to the RXA approach.
Submitted 17 February, 2024; v1 submitted 24 March, 2023;
originally announced March 2023.
-
AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
Authors:
Qingru Zhang,
Minshuo Chen,
Alexander Bukharin,
Nikos Karampatziakis,
Pengcheng He,
Yu Cheng,
Weizhu Chen,
Tuo Zhao
Abstract:
Fine-tuning large pre-trained language models on downstream tasks has become an important paradigm in NLP. However, common practice fine-tunes all of the parameters in a pre-trained model, which becomes prohibitive when a large number of downstream tasks are present. Therefore, many fine-tuning methods are proposed to learn incremental updates of pre-trained weights in a parameter efficient way, e.g., low-rank increments. These methods often evenly distribute the budget of incremental updates across all pre-trained weight matrices, and overlook the varying importance of different weight parameters. As a consequence, the fine-tuning performance is suboptimal. To bridge this gap, we propose AdaLoRA, which adaptively allocates the parameter budget among weight matrices according to their importance score. In particular, AdaLoRA parameterizes the incremental updates in the form of singular value decomposition. Such a novel approach allows us to effectively prune the singular values of unimportant updates, which is essentially to reduce their parameter budget but circumvent intensive exact SVD computations. We conduct extensive experiments with several pre-trained models on natural language processing, question answering, and natural language generation to validate the effectiveness of AdaLoRA. Results demonstrate that AdaLoRA manifests notable improvement over baselines, especially in the low budget settings. Our code is publicly available at https://github.com/QingruZhang/AdaLoRA .
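The budget-allocation idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes each weight matrix's incremental update is kept in SVD-like form with one importance score per singular value, and that a global budget keeps only the highest-scoring singular values across all matrices. The function names `allocate_budget` and `prune`, the matrix names, and the score values are hypothetical; how the importance scores are computed is not described in the abstract and is left as an input.

```python
from typing import Dict, List


def allocate_budget(scores: Dict[str, List[float]], budget: int) -> Dict[str, List[bool]]:
    """Keep the `budget` highest-scoring singular values across all matrices.

    `scores` maps a (hypothetical) matrix name to one importance score per
    singular value of that matrix's incremental update.
    """
    # Rank every (score, matrix, index) entry globally, highest score first.
    ranked = sorted(
        ((s, name, i) for name, vals in scores.items() for i, s in enumerate(vals)),
        reverse=True,
    )
    keep = {name: [False] * len(vals) for name, vals in scores.items()}
    for _, name, i in ranked[:budget]:
        keep[name][i] = True
    return keep


def prune(singular_values: List[float], mask: List[bool]) -> List[float]:
    """Zero out the singular values that were not allocated any budget."""
    return [v if m else 0.0 for v, m in zip(singular_values, mask)]


if __name__ == "__main__":
    # Illustrative scores only; not taken from the paper.
    example_scores = {"query": [0.9, 0.1], "value": [0.5, 0.4]}
    masks = allocate_budget(example_scores, budget=2)
    print(masks)
    print(prune([2.0, 1.0], masks["query"]))
```

Because pruning only masks entries of the diagonal factor, the effective rank of each update changes without recomputing an exact SVD of the weight matrix, which is the property the abstract highlights.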
Submitted 20 December, 2023; v1 submitted 18 March, 2023;
originally announced March 2023.
-
Data-Centric Learning from Unlabeled Graphs with Diffusion Model
Authors:
Gang Liu,
Eric Inae,
Tong Zhao,
Jiaxin Xu,
Tengfei Luo,
Meng Jiang
Abstract:
Graph property prediction tasks are important and numerous. While each task offers only a small number of labeled examples, unlabeled graphs have been collected from various sources and at a large scale. A conventional approach is to train a model with the unlabeled graphs on self-supervised tasks and then fine-tune the model on the prediction tasks. However, the self-supervised task knowledge may not be aligned, or may sometimes conflict, with what the predictions need. In this paper, we propose to extract the knowledge underlying the large set of unlabeled graphs as a specific set of useful data points to augment each property prediction model. We use a diffusion model to fully utilize the unlabeled graphs and design two new objectives to guide the model's denoising process with each task's labeled data to generate task-specific graph examples and their labels. Experiments demonstrate that our data-centric approach performs significantly better than fifteen existing methods on fifteen tasks. Unlike in self-supervised learning, the performance improvement brought by unlabeled data is visible in the form of the generated labeled examples.
Submitted 12 October, 2023; v1 submitted 17 March, 2023;
originally announced March 2023.
-
Local Search for Solving Satisfiability of Polynomial Formulas
Authors:
Haokun Li,
Bican Xia,
Tianqi Zhao
Abstract:
Satisfiability Modulo the Theory of Nonlinear Real Arithmetic, SMT(NRA) for short, concerns the satisfiability of polynomial formulas, which are quantifier-free Boolean combinations of polynomial equations and inequalities with integer coefficients and real variables. In this paper, we propose a local search algorithm for a special subclass of SMT(NRA), where all constraints are strict inequalities. An important fact is that, given a polynomial formula with $n$ variables, the zero level set of the polynomials in the formula decomposes the $n$-dimensional real space into finitely many components (cells) and every polynomial has constant sign in each cell. The key point of our algorithm is a new operation based on real root isolation, called cell-jump, which updates the current assignment along a given direction such that the assignment can `jump' from one cell to another. One cell-jump may adjust the values of several variables while traditional local search operations, such as flip for SAT and critical move for SMT(LIA), only change that of one variable. We also design a two-level operation selection to balance the success rate and efficiency. Furthermore, our algorithm can be easily generalized to a wider subclass of SMT(NRA) where polynomial equations linear with respect to some variable are allowed. Experiments show the algorithm is competitive with state-of-the-art SMT solvers, and performs particularly well on those formulas with high-degree polynomials.
Submitted 21 March, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
Geometry-based spherical JND modeling for 360$^\circ$ display
Authors:
Hongan Wei,
Jiaqi Liu,
Bo Chen,
Liqun Lin,
Weiling Chen,
Tiesong Zhao
Abstract:
360$^\circ$ videos have received widespread attention due to their realistic and immersive experiences for users. To date, how to accurately model user perception on 360$^\circ$ displays is still a challenging issue. In this paper, we exploit the visual characteristics of 360$^\circ$ projection and display and extend the popular just noticeable difference (JND) model to spherical JND (SJND). First, we propose a quantitative 2D-JND model by jointly considering spatial contrast sensitivity, luminance adaptation and texture masking effect. In particular, our model introduces an entropy-based region classification and utilizes different parameters for different types of regions for better modeling performance. Second, we extend our 2D-JND model to SJND by jointly exploiting latitude projection and field of view during 360$^\circ$ display. With this operation, SJND reflects the characteristics of both the human visual system and the 360$^\circ$ display. Third, our SJND model is more consistent with user perception in subjective tests and also shows more tolerance to distortions at lower bit rates during 360$^\circ$ video compression. To further examine the effectiveness of our SJND model, we embed it in Versatile Video Coding (VVC) compression. Compared with the state of the art, our SJND-VVC framework significantly reduces the bit rate with negligible loss in visual quality.
Submitted 4 June, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds
Authors:
Biraj Dahal,
Alex Havrilla,
Minshuo Chen,
Tuo Zhao,
Wenjing Liao
Abstract:
Generative networks have experienced great empirical successes in distribution learning. Many existing experiments have demonstrated that generative networks can generate high-dimensional complex data from a low-dimensional easy-to-sample distribution. However, this phenomenon cannot be justified by existing theories. The widely held manifold hypothesis speculates that real-world data sets, such as natural images and signals, exhibit low-dimensional geometric structures. In this paper, we take such low-dimensional data structures into consideration by assuming that data distributions are supported on a low-dimensional manifold. We prove statistical guarantees of generative networks under the Wasserstein-1 loss. We show that the Wasserstein-1 loss converges to zero at a fast rate depending on the intrinsic dimension instead of the ambient data dimension. Our theory leverages the low-dimensional geometric structures in data sets and justifies the practical power of generative networks. We require no smoothness assumptions on the data distribution, which is desirable in practice.
Submitted 25 February, 2023;
originally announced February 2023.
-
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
Authors:
Chen Liang,
Haoming Jiang,
Zheng Li,
Xianfeng Tang,
Bin Yin,
Tuo Zhao
Abstract:
Knowledge distillation has been shown to be a powerful model compression approach to facilitate the deployment of pre-trained language models in practice. This paper focuses on task-agnostic distillation. It produces a compact pre-trained model that can be easily fine-tuned on various tasks with small computational costs and memory footprints. Despite the practical benefits, task-agnostic distillation is challenging. Since the teacher model has a significantly larger capacity and stronger representation power than the student model, it is very difficult for the student to produce predictions that match the teacher's over a massive amount of open-domain training data. Such a large prediction discrepancy often diminishes the benefits of knowledge distillation. To address this challenge, we propose Homotopic Distillation (HomoDistil), a novel task-agnostic distillation approach equipped with iterative pruning. Specifically, we initialize the student model from the teacher model, and iteratively prune the student's neurons until the target width is reached. Such an approach maintains a small discrepancy between the teacher's and student's predictions throughout the distillation process, which ensures the effectiveness of knowledge transfer. Extensive experiments demonstrate that HomoDistil achieves significant improvements on existing baselines.
Submitted 19 February, 2023;
originally announced February 2023.
-
Microscopic Energy Storage Mechanism of Dielectric Polymer-Coated Supercapacitors
Authors:
Weihang Gao,
Teng Zhao,
Shian Dong,
Xingyi Huang,
Zhenli Xu
Abstract:
Supercapacitors have been attracting significant attention as promising energy storage devices. However, the voltage window limitation associated with electrolyte solutions has hindered the improvement of their capacitance. To address this issue and enhance the energy storage capabilities of traditional supercapacitors, we put forward the dipole-induced effects observed in the theoretical framework of the electric double-layer structure. The molecular dynamics results demonstrate that, compared to traditional systems, an improvement of over 50% in integral capacitance at low voltages is achieved. Moreover, new material-based experimental results obtained from a dielectric supercapacitor employing a hydrated electrolyte solution corroborate the effectiveness of our proposed model, yielding consistent outcomes. We attribute the large capacitance variation to the reorientation of the dipoles, which induces the neutral-to-bilayer transition and the overscreening-to-steric transition, consistent with the polarization process of the polymer in the experiment. We further investigate the capacitance variations under different dipole parameters, such as the number of layers, the number density, and the spacing, thereby enriching the experimental results with additional conclusions not previously obtained. This work presents a novel approach that exploits dipole-induced capacitance effects, paving the way for further advances in the field of energy storage technology.
Submitted 25 June, 2023; v1 submitted 19 February, 2023;
originally announced February 2023.
-
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data
Authors:
Minshuo Chen,
Kaixuan Huang,
Tuo Zhao,
Mengdi Wang
Abstract:
Diffusion models achieve state-of-the-art performance in various generation tasks. However, their theoretical foundations fall far behind. This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace. Our result provides sample complexity bounds for distribution estimation using diffusion models. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. Furthermore, the generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution. The convergence rate depends on the subspace dimension, indicating that diffusion models can circumvent the curse of data ambient dimensionality.
Submitted 14 February, 2023;
originally announced February 2023.
-
Framework for phase transitions between the Maxwell and Gibbs constructions
Authors:
Constantinos Constantinou,
Tianqi Zhao,
Sophia Han,
Madappa Prakash
Abstract:
By taking the nucleon-to-quark phase transition within a neutron star as an example, we present a thermodynamically consistent method to calculate the equation of state of ambient matter so that transitions that are intermediate to those of the familiar Maxwell and Gibbs constructions can be described. This method does not address the poorly known surface tension between the two phases microscopically (as, for example, in the calculation of the core pasta phases via the Wigner-Seitz approximation) but instead combines the local and global charge neutrality conditions characteristic of the Maxwell and Gibbs constructions, respectively. Overall charge neutrality is achieved by dividing the leptons into those that obey local charge neutrality (Maxwell) and those that maintain global charge neutrality (Gibbs). The equation of state is obtained by using equilibrium constraints derived from minimizing the total energy density. The results of this minimization are then used to calculate neutron star mass-radius curves, tidal deformabilities, equilibrium and adiabatic sound speeds, and nonradial $g$-mode oscillation frequencies for several intermediate constructions. Various quantities of interest transform smoothly from their Gibbs structures to those of Maxwell as the local-to-total electron ratio $η$, introduced to mimic the hadron-to-quark interface tension from $0$ (Gibbs) to $\infty$ (Maxwell), is raised from $0$ to $1$. A notable exception is the $g$-mode frequency for the specific case of $η=1$, for which a gap appears between the quark and hadronic branches.
Submitted 16 April, 2023; v1 submitted 8 February, 2023;
originally announced February 2023.
-
Dynamic Ensemble of Low-fidelity Experts: Mitigating NAS "Cold-Start"
Authors:
Junbo Zhao,
Xuefei Ning,
Enshu Liu,
Binxin Ru,
Zixuan Zhou,
Tianchen Zhao,
Chen Chen,
Jia** Zhang,
Qingmin Liao,
Yu Wang
Abstract:
Predictor-based Neural Architecture Search (NAS) employs an architecture performance predictor to improve the sample efficiency. However, predictor-based NAS suffers from the severe "cold-start" problem, since a large amount of architecture-performance data is required to get a working predictor. In this paper, we focus on exploiting information in cheaper-to-obtain performance estimations (i.e., low-fidelity information) to mitigate the large data requirements of predictor training. Despite the intuitiveness of this idea, we observe that using inappropriate low-fidelity information can even damage the prediction ability, and that different search spaces have different preferences for low-fidelity information types. To solve the problem and better fuse beneficial information provided by different types of low-fidelity information, we propose a novel dynamic ensemble predictor framework that comprises two steps. In the first step, we train different sub-predictors on different types of available low-fidelity information to extract beneficial knowledge as low-fidelity experts. In the second step, we learn a gating network to dynamically output a set of weighting coefficients conditioned on each input neural architecture, which will be used to combine the predictions of different low-fidelity experts in a weighted sum. The overall predictor is optimized on a small set of actual architecture-performance data to fuse the knowledge from different low-fidelity experts to make the final prediction. We conduct extensive experiments across five search spaces with different architecture encoders under various experimental settings. Our method can easily be incorporated into existing predictor-based NAS frameworks to discover better architectures.
Submitted 2 February, 2023;
originally announced February 2023.
-
On Approximating the Dynamic Response of Synchronous Generators via Operator Learning: A Step Towards Building Deep Operator-based Power Grid Simulators
Authors:
Christian Moya,
Guang Lin,
Tianqiao Zhao,
Meng Yue
Abstract:
This paper designs an Operator Learning framework to approximate the dynamic response of synchronous generators. One can use such a framework to (i) design a neural-based generator model that can interact with a numerical simulator of the rest of the power grid or (ii) shadow the generator's transient response. To this end, we design a data-driven Deep Operator Network (DeepONet) that approximates the generators' infinite-dimensional solution operator. Then, we develop a DeepONet-based numerical scheme to simulate a given generator's dynamic response over a short/medium-term horizon. The proposed numerical scheme recursively employs the trained DeepONet to simulate the response for a given multi-dimensional input, which describes the interaction between the generator and the rest of the system. Furthermore, we develop a residual DeepONet numerical scheme that incorporates information from mathematical models of synchronous generators. We accompany this residual DeepONet scheme with an estimate for the prediction's cumulative error. We also design a data aggregation (DAgger) strategy that allows (i) employing supervised learning to train the proposed DeepONets and (ii) fine-tuning the DeepONet using aggregated training data that the DeepONet is likely to encounter during interactive simulations with other grid components. Finally, as a proof of concept, we demonstrate that the proposed DeepONet frameworks can effectively approximate the transient model of a synchronous generator.
Submitted 29 January, 2023;
originally announced January 2023.
-
Understanding and Improving Deep Graph Neural Networks: A Probabilistic Graphical Model Perspective
Authors:
Jiayuan Chen,
Xiang Zhang,
Yinfei Xu,
Tianli Zhao,
Renjie Xie,
Wei Xu
Abstract:
Recently, graph-based models designed for downstream tasks have significantly advanced research on graph neural networks (GNNs). GNN baselines based on neural message-passing mechanisms such as GCN and GAT perform worse as the network deepens. Therefore, numerous GNN variants have been proposed to tackle this performance degradation problem, including many deep GNNs. However, a unified framework is still lacking to connect these existing models and interpret their effectiveness at a high level. In this work, we focus on deep GNNs and propose a novel view for understanding them. We establish a theoretical framework via inference on a probabilistic graphical model. Given the fixed point equation (FPE) derived from the variational inference on the Markov random fields, the deep GNNs, including JKNet, GCNII, DGCN, and the classical GNNs, such as GCN, GAT, and APPNP, can be regarded as different approximations of the FPE. Moreover, given this framework, more accurate approximations of FPE are brought, guiding us to design a more powerful GNN: coupling graph neural network (CoGNet). Extensive experiments are carried out on citation networks and natural language processing downstream tasks. The results demonstrate that the CoGNet outperforms the SOTA models.
Submitted 25 January, 2023;
originally announced January 2023.
-
Flow cytometry with anti-diffraction light sheet (ADLS) by spatial light modulation
Authors:
Yanyan Gong,
Ming Zeng,
Yueqiang Zhu,
Shangyu Li,
Wei Zhao,
Ce Zhang,
Tianyun Zhao,
Kaige Wang,
Jiangcun Yang,
**tao Bai
Abstract:
Flow cytometry is a widespread and powerful technique, whose resolution is determined by its capacity to accurately distinguish fluorescently positive populations from negative ones. However, most informative results are discarded while performing the measurements of conventional flow cytometry, e.g., the cell size, shape, morphology, and distribution or location of labeled exosomes within the unpurified biological samples. We, herein, propose a novel approach using an anti-diffraction light sheet with an anisotropic feature to excite fluorescent tags. Constituted by an anti-diffraction Bessel-Gaussian beam array, the light sheet is 12 $μ$m wide, 12 $μ$m high, with a thickness of $\sim 0.8$ $μ$m. The intensity profile of the excited fluorescent signal can, therefore, reflect the size and allow samples in the range from O(100 nm) to 10 $μ$m (e.g., blood cells) to be transported via hydrodynamic focusing in a microfluidic chip. The sampling rate of 500 kHz provides a capability of high throughput without sacrificing the spatial resolution. Consequently, the proposed anti-diffraction light-sheet flow cytometry (ADLSFC) can obtain more informative results than the conventional methodologies, and is able to provide multiple characteristics (e.g., the size and distribution of fluorescent signal) helping to distinguish the target samples from the complex backgrounds.
Submitted 23 January, 2023;
originally announced January 2023.
-
Quad-cascade picture of turbulence
Authors:
Wei Zhao,
Yanxia Shi,
Yueqiang Zhu,
Ming Zeng,
Guangyin **g,
Keyi Nan,
Yu Chen,
Chen Zhang,
Tianyun Zhao,
Kaige Wang,
**tao Bai
Abstract:
Despite its ubiquitous emergence in nature and in a variety of systems, turbulence exhibits spatio-temporally chaotic, intermittent fluctuations, which make it impossible to predict precisely. Persistent attempts over almost a century have been devoted to capturing the invariant laws and deeply hidden universality within the vast disorder and chaotic nature of turbulence. The celebrated Kolmogorov -5/3 law is robust, but not comprehensive enough to describe diverse turbulent flows, especially turbulence driven by external volume forces, e.g., thermal convection and electrokinetic turbulence. Here, we reveal that the fluxes of kinetic energy and scalar variance must be highly coupled to establish a universal conservation law, and consequently we successfully unify a wide diversity of scaling laws. As an example, in a microfluidic electrokinetic turbulence, additional scaling exponents of -5/3, -9/5 and -7/3 are experimentally found in the power spectra of concentration. With this proposed model, a full quad-cascade picture is eventually completed to unify the various scaling laws for the most complicated physical problem of turbulence.
Submitted 19 January, 2023; v1 submitted 18 January, 2023;
originally announced January 2023.
-
PTA-Det: Point Transformer Associating Point cloud and Image for 3D Object Detection
Authors:
Rui Wan,
Tianyun Zhao,
Wei Zhao
Abstract:
In autonomous driving, 3D object detection based on multi-modal data has become an indispensable approach when facing complex environments around the vehicle. During multi-modal detection, LiDAR and camera are simultaneously applied for capturing and modeling. However, due to the intrinsic discrepancies between the LiDAR point and camera image, the fusion of the data for object detection encounters a series of problems. Most multi-modal detection methods perform even worse than LiDAR-only methods. In this investigation, we propose a method named PTA-Det to improve the performance of multi-modal detection. Accompanied by PTA-Det, a Pseudo Point Cloud Generation Network is proposed, which can represent image information, including texture and semantic features, by pseudo points. Thereafter, through a transformer-based Point Fusion Transition (PFT) module, the features of LiDAR points and pseudo points from the image can be deeply fused under a unified point-based representation. The combination of these modules can overcome the major obstacle in feature fusion across modalities and realize a complementary and discriminative representation for proposal generation. Extensive experiments on the KITTI dataset show that PTA-Det achieves competitive results, supporting its effectiveness.
Submitted 17 January, 2023;
originally announced January 2023.
-
Isolating Bounded and Unbounded Real Roots of a Mixed Trigonometric-Polynomial
Authors:
Rizeng Chen,
Haokun Li,
Bican Xia,
Tianqi Zhao,
Tao Zheng
Abstract:
Mixed trigonometric-polynomials (MTPs) are functions of the form $f(x,\sin{x}, \cos{x})$ with $f\in\mathbb{Q}[x_1,x_2,x_3]$. In this paper, an algorithm "isolating" all the real roots of an MTP is provided and implemented. It automatically divides the real roots into two parts: one consists of finitely many "bounded" roots in an interval $[μ_-,μ_+]$ while the other consists of possibly countably many "periodic" roots in $\mathbb{R}\backslash[μ_-,μ_+]$. For bounded roots, the algorithm returns isolating intervals and corresponding multiplicities while for periodic roots, it returns finitely many mutually disjoint small intervals $I_i\subset[-π,π]$, integers $c_i>0$ and multisets of root multiplicity $\{m_{j,i}\}_{j=1}^{c_i}$ such that any periodic root $t>μ_+$ is in the set $(\sqcup_i\cup_{k\in\mathbb{N}}(I_i+2kπ))$ and any interval $I_i+2kπ\subset(μ_+,\infty)$ contains exactly $c_i$ periodic roots with multiplicities $m_{1,i},...,m_{c_i,i}$, respectively. The effectiveness and efficiency of the algorithm are shown by experiments. In particular, our results indicate that the "distributions" of the roots of an MTP in the "periods" $(-π,π]+2kπ$ sufficiently far from $0$ share the same pattern. Besides, the method used to isolate the roots in $[μ_-,μ_+]$ is applicable to any other bounded interval as well. The algorithm takes advantage of the weak Fourier sequence technique and deals with the intervals period-by-period without scaling the coordinate so as to keep the length of the sequence short. The new approaches can easily be modified to decide whether there is any root, or whether there are infinitely many roots in unbounded intervals of the form $(-\infty,a)$ or $(a,\infty)$ with $a\in\mathbb{Q}$.
Submitted 14 January, 2023;
originally announced January 2023.
-
Mind Reasoning Manners: Enhancing Type Perception for Generalized Zero-shot Logical Reasoning over Text
Authors:
Fangzhi Xu,
Jun Liu,
Qika Lin,
Tianzhe Zhao,
Jian Zhang,
Lingling Zhang
Abstract:
The logical reasoning task involves diverse types of complex reasoning over text, in the form of multiple-choice question answering. Given the context, question and a set of options as the input, previous methods achieve superior performance in the full-data setting. However, the current benchmark dataset makes the ideal assumption that the reasoning type distribution on the train split is close to that of the test split, which is inconsistent with many real application scenarios. To address it, there remain two problems to be studied: (1) What is the zero-shot capability of the models (trained on seen types and tested on unseen types)? (2) How can the perception of reasoning types be enhanced for the models? For problem 1, we propose a new benchmark for generalized zero-shot logical reasoning, named ZsLR. It includes six splits based on the three type sampling strategies. For problem 2, a type-aware model TaCo is proposed. It utilizes both heuristic input reconstruction and contrastive learning to improve the type perception in the global representation. Extensive experiments on both the zero-shot and full-data settings demonstrate the superiority of TaCo over the state-of-the-art methods. We also experimentally verify the generalization capability of TaCo on other logical reasoning datasets.
Submitted 8 January, 2023;
originally announced January 2023.
-
Taiji Data Challenge for Exploring Gravitational Wave Universe
Authors:
Zhixiang Ren,
Tianyu Zhao,
Zhoujian Cao,
Zong-Kuan Guo,
Wen-Biao Han,
Hong-Bo **,
Yue-Liang Wu
Abstract:
The direct observation of gravitational waves (GWs) opens a new window for exploring new physics from quanta to cosmos and provides a new tool for probing the evolution of the universe. GW detection in space covers a broad spectrum ranging over more than four orders of magnitude and enables us to study rich physical and astronomical phenomena. Taiji is a proposed space-based GW detection mission that will be launched in the 2030s. Taiji will be exposed to numerous overlapping and persistent GW signals buried in the foreground and background, posing various data analysis challenges. In order to empower potential scientific discoveries, the Mock LISA Data Challenge and the LISA Data Challenge (LDC) were developed. While LDC provides a baseline framework, the first LDC needs to be updated with more realistic simulations and adjusted detector responses for Taiji's constellation. In this paper, we review the scientific objectives and the roadmap for Taiji, as well as the technical difficulties in data analysis and the data generation strategy, and present the associated data challenges. In contrast to LDC, we utilize second-order Keplerian orbit and second-generation time delay interferometry techniques. Additionally, we employ a new model for the extreme-mass-ratio inspiral waveform and stochastic GW background spectrum, which enables us to test general relativity and measure the non-Gaussianity of curvature perturbations. Furthermore, we present a comprehensive showcase of parameter estimation using a toy dataset. This showcase not only demonstrates the scientific potential of the Taiji Data Challenge but also serves to validate the effectiveness of the pipeline. As the first data challenge for Taiji, we aim to build an open ground for data analysis related to Taiji sources and sciences. More details can be found on the official website at http://taiji-tdc.ictp-ap.org.
Submitted 15 August, 2023; v1 submitted 7 January, 2023;
originally announced January 2023.
-
Faithful and Consistent Graph Neural Network Explanations with Rationale Alignment
Authors:
Tianxiang Zhao,
Dongsheng Luo,
Xiang Zhang,
Suhang Wang
Abstract:
Uncovering rationales behind predictions of graph neural networks (GNNs) has received increasing attention over recent years. Instance-level GNN explanation aims to discover critical input elements, like nodes or edges, that the target GNN relies upon for making predictions. These identified sub-structures can provide interpretations of the GNN's behavior. Though various algorithms have been proposed, most of them formalize this task as searching for the minimal subgraph that preserves the original predictions. However, an inductive bias is deep-rooted in this framework: several subgraphs can result in the same or similar outputs as the original graphs. Consequently, they risk providing spurious explanations and failing to provide consistent explanations. Applying them to explain weakly performing GNNs would further amplify these issues. To address this problem, we theoretically examine the predictions of GNNs from the causality perspective. Two typical reasons for spurious explanations are identified: the confounding effect of latent variables like distribution shift, and causal factors distinct from the original input. Observing that both confounding effects and diverse causal rationales are encoded in internal representations, we propose a new explanation framework with an auxiliary alignment loss, which is theoretically proven to be optimizing a more faithful explanation objective intrinsically. Concretely for this alignment loss, a set of different perspectives are explored: anchor-based alignment, distributional alignment based on Gaussian mixture models, mutual-information-based alignment, etc. A comprehensive study is conducted both on the effectiveness of this new framework in terms of explanation faithfulness/consistency and on the advantages of these variants.
Submitted 2 September, 2023; v1 submitted 7 January, 2023;
originally announced January 2023.
-
Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment
Authors:
Liqun Lin,
Yang Zheng,
Weiling Chen,
Chengdong Lan,
Tiesong Zhao
Abstract:
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.
Submitted 3 January, 2023;
originally announced January 2023.
-
Maximum-Likelihood-Estimate Hamiltonian learning via efficient and robust quantum likelihood gradient
Authors:
Tian-Lun Zhao,
Shi-Xin Hu,
Yi Zhang
Abstract:
Given the recent developments in quantum techniques, modeling the physical Hamiltonian of a target quantum many-body system is becoming an increasingly practical and vital research direction. Here, we propose an efficient strategy combining maximum likelihood estimation, gradient descent, and quantum many-body algorithms. Given the measurement outcomes, we optimize the target model Hamiltonian and density operator via a series of descents along the quantum likelihood gradient, which we prove is negative semi-definite with respect to the negative-log-likelihood function. In addition to such optimization efficiency, our maximum-likelihood-estimate Hamiltonian learning respects the locality of a given quantum system and therefore extends readily to larger systems with available quantum many-body algorithms. Compared with previous approaches, it also exhibits better accuracy and overall stability toward noise, fluctuations, and temperature ranges, which we demonstrate with various examples.
Submitted 23 June, 2023; v1 submitted 28 December, 2022;
originally announced December 2022.
-
A Method to Load Tellurium in Liquid Scintillator for the Study of Neutrinoless Double Beta Decay
Authors:
D. J. Auty,
D. Bartlett,
S. D. Biller,
D. Chauhan,
M. Chen,
O. Chkvorets,
S. Connolly,
X. Dai,
E. Fletcher,
K. Frankiewicz,
D. Gooding,
C. Grant,
S. Hall,
D. Horne,
S. Hans,
B. Hreljac,
T. Kaptanoglu,
B. Krar,
C. Kraus,
T. Kroupová,
I. Lam,
Y. Liu,
S. Maguire,
C. Miller,
S. Manecki
, et al. (12 additional authors not shown)
Abstract:
A method has been developed to load tellurium into liquid scintillator so as to permit searches for neutrinoless double beta decay with high sensitivity. The approach involves the synthesis of an oil-soluble tellurium compound from telluric acid and an organic diol. The process utilises distillable chemicals that can be safely handled underground and affords low radioactive backgrounds, low optical absorption and high light yields at loading levels of at least several percent Te by weight.
Submitted 4 April, 2023; v1 submitted 23 December, 2022;
originally announced December 2022.
-
EarSpy: Spying Caller Speech and Identity through Tiny Vibrations of Smartphone Ear Speakers
Authors:
Ahmed Tanvir Mahdad,
Cong Shi,
Zhengkun Ye,
Tianming Zhao,
Yan Wang,
Yingying Chen,
Nitesh Saxena
Abstract:
Eavesdropping from the user's smartphone is a well-known threat to the user's safety and privacy. Existing studies show that loudspeaker reverberation can inject speech into motion sensor readings, leading to speech eavesdropping. However, more devastating attacks on ear speakers, which produce much smaller-scale vibrations, were believed to be impossible with zero-permission motion sensors. In this work, we revisit this important line of research. We explore recent trends in smartphone manufacturers that include extra/powerful speakers in place of small ear speakers, and demonstrate the feasibility of using motion sensors to capture such tiny speech vibrations. We investigate the impacts of these new ear speakers on built-in motion sensors and examine the potential to elicit private speech information from the minute vibrations. Our designed system EarSpy can successfully detect word regions, time, and frequency domain features and generate a spectrogram for each word region. We train and test the extracted data using classical machine learning algorithms and convolutional neural networks. We found up to 98.66% accuracy in gender detection, 92.6% accuracy in speaker detection, and 56.42% accuracy in digit detection (more than 5X the 10% accuracy of random selection). Our result unveils the potential threat of eavesdropping on phone conversations from ear speakers using motion sensors.
Submitted 23 December, 2022;
originally announced December 2022.
-
TopoImb: Toward Topology-level Imbalance in Learning from Graphs
Authors:
Tianxiang Zhao,
Dongsheng Luo,
Xiang Zhang,
Suhang Wang
Abstract:
Graphs serve as a powerful tool for modeling data that has an underlying structure in non-Euclidean space, by encoding relations as edges and entities as nodes. Despite developments in learning from graph-structured data over the years, one obstacle persists: graph imbalance. Although several attempts have been made to target this problem, they are limited to considering only class-level imbalance. In this work, we argue that for graphs, the imbalance is likely to exist at the sub-class topology group level. Due to the flexibility of topology structures, graphs could be highly diverse, and learning a generalizable classification boundary would be difficult. Therefore, several majority topology groups may dominate the learning process, rendering others under-represented. To address this problem, we propose a new framework, TopoImb, and design (1) a topology extractor, which automatically identifies the topology group for each instance with explicit memory cells, and (2) a training modulator, which modulates the learning process of the target GNN model to prevent topology-group-wise under-representation. TopoImb can be used as a key component in GNN models to improve their performance under the data imbalance setting. Analyses of both topology-level imbalance and the proposed TopoImb are provided theoretically, and we empirically verify its effectiveness with both node-level and graph-level classification as the target tasks.
Submitted 16 December, 2022;
originally announced December 2022.
-
JUNO Sensitivity on Proton Decay $p\to \bar{\nu}K^+$ Searches
Authors:
JUNO Collaboration,
Angel Abusleme,
Thomas Adam,
Shakeel Ahmad,
Rizwan Ahmed,
Sebastiano Aiello,
Muhammad Akram,
Fengpeng An,
Qi An,
Giuseppe Andronico,
Nikolay Anfimov,
Vito Antonelli,
Tatiana Antoshkina,
Burin Asavapibhop,
João Pedro Athayde Marcondes de André,
Didier Auguste,
Nikita Balashov,
Wander Baldini,
Andrea Barresi,
Davide Basilico,
Eric Baussan,
Marco Bellato,
Antonio Bergnoli,
Thilo Birkenfeld,
Sylvie Blin
, et al. (586 additional authors not shown)
Abstract:
The Jiangmen Underground Neutrino Observatory (JUNO) is a large liquid scintillator detector designed to explore many topics in fundamental physics. In this paper, the potential of searching for proton decay in the $p\to \bar{\nu}K^+$ mode with JUNO is investigated. The kaon and its decay particles feature a clear three-fold coincidence signature that results in a high efficiency for identification. Moreover, the excellent energy resolution of JUNO makes it possible to suppress the sizable background caused by other delayed signals. Based on these advantages, the detection efficiency for the proton decay via $p\to \bar{\nu}K^+$ is 36.9% with a background level of 0.2 events after 10 years of data taking. The estimated sensitivity based on 200 kton-years exposure is $9.6 \times 10^{33}$ years, competitive with the current best limits on the proton lifetime in this channel.
Submitted 26 October, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
A Survey on Biometrics Authentication
Authors:
Fangshi Zhou,
Tianming Zhao
Abstract:
Nowadays, traditional authentication methods are vulnerable to attacks that often exploit inherent security issues. Professional attackers leverage adversarial techniques against these security holes. Biometrics has intrinsic advantages over traditional authentication methods in security, success rates, efficiency, and accessibility. Biometrics has broad prospects for applications in various fields. Whether in authentication security or clinical medicine, biometrics is one of the mainstream areas of study. In this paper, we survey and review related studies of biometrics that are outstanding and significant in driving the development and popularization of biometrics. Although these approaches still have some inherent disadvantages that restrict popularization, these obstacles cannot conceal the promising future of biometrics. Multi-factor continuous biometric authentication has become the mainstream trend of development. We report the findings as well as the challenges of the studies in this survey paper.
Submitted 15 December, 2022;
originally announced December 2022.
-
Efficient Long Sequence Modeling via State Space Augmented Transformer
Authors:
Simiao Zuo,
Xiaodong Liu,
Jian Jiao,
Denis Charles,
Eren Manavoglu,
Tuo Zhao,
Jianfeng Gao
Abstract:
Transformer models have achieved superior performance in various natural language processing tasks. However, the quadratic computational cost of the attention mechanism limits its practicality for long sequences. There are existing attention variants that improve the computational efficiency, but they have limited ability to effectively compute global information. In parallel to Transformer models, state space models (SSMs) are tailored for long sequences, but they are not flexible enough to capture complicated local information. We propose SPADE, short for $\underline{\textbf{S}}$tate s$\underline{\textbf{P}}$ace $\underline{\textbf{A}}$ugmente$\underline{\textbf{D}}$ Transform$\underline{\textbf{E}}$r. Specifically, we augment an SSM into the bottom layer of SPADE, and we employ efficient local attention methods for the other layers. The SSM augments global information, which compensates for the lack of long-range dependency in local attention methods. Experimental results on the Long Range Arena benchmark and language modeling tasks demonstrate the effectiveness of the proposed method. To further demonstrate the scalability of SPADE, we pre-train large encoder-decoder models and present fine-tuning results on natural language understanding and natural language generation tasks.
Submitted 15 December, 2022;
originally announced December 2022.
-
RLEKF: An Optimizer for Deep Potential with Ab Initio Accuracy
Authors:
Siyu Hu,
Wentao Zhang,
Qiuchen Sha,
Feng Pan,
Lin-Wang Wang,
Weile Jia,
Guangming Tan,
Tong Zhao
Abstract:
It is imperative to accelerate the training of neural network force fields such as Deep Potential, which usually requires thousands of images based on first-principles calculation and a couple of days to generate an accurate potential energy surface. To this end, we propose a novel optimizer named reorganized layer extended Kalman filtering (RLEKF), an optimized version of global extended Kalman filtering (GEKF) with a strategy of splitting big and gathering small layers to overcome the $O(N^2)$ computational cost of GEKF. This strategy provides an approximation of the dense weights error covariance matrix with a sparse diagonal block matrix for GEKF. We implement both RLEKF and the baseline Adam in our $α$Dynamics package, and numerical experiments are performed on 13 unbiased datasets. Overall, RLEKF converges faster with slightly better accuracy. For example, a test on a typical system, bulk copper, shows that RLEKF converges faster in both the number of training epochs ($\times$11.67) and wall-clock time ($\times$1.19). Besides, we theoretically prove that the updates of weights converge and are thus robust against the exploding-gradient problem. Experimental results verify that RLEKF is not sensitive to the initialization of weights. RLEKF sheds light on other AI-for-science applications where training a large neural network (with tens of thousands of parameters) is a bottleneck.
Submitted 13 December, 2022;
originally announced December 2022.
-
High Dimensional Binary Classification under Label Shift: Phase Transition and Regularization
Authors:
Jiahui Cheng,
Minshuo Chen,
Hao Liu,
Tuo Zhao,
Wenjing Liao
Abstract:
Label shift has been widely believed to be harmful to the generalization performance of machine learning models. Researchers have proposed many approaches to mitigate the impact of label shift, e.g., balancing the training data. However, these methods often consider the underparametrized regime, where the sample size is much larger than the data dimension. Research under the overparametrized regime is very limited. To bridge this gap, we propose a new asymptotic analysis of the Fisher Linear Discriminant classifier for binary classification with label shift. Specifically, we prove that there exists a phase transition phenomenon: under a certain overparametrized regime, the classifier trained using imbalanced data outperforms the counterpart trained with reduced balanced data. Moreover, we investigate the impact of regularization on label shift: the aforementioned phase transition vanishes as the regularization becomes strong.
Submitted 7 December, 2022; v1 submitted 1 December, 2022;
originally announced December 2022.
-
Revealing temperature evolution of the Dirac band in ZrTe$_5$ via magneto-infrared spectroscopy
Authors:
Yuxuan Jiang,
Tianhao Zhao,
Luojia Zhang,
Qiang Chen,
Haidong Zhou,
Mykhaylo Ozerov,
Dmitry Smirnov,
Zhigang Jiang
Abstract:
We report the temperature evolution of the Dirac band in semiconducting zirconium pentatelluride (ZrTe$_5$) using magneto-infrared spectroscopy. We find that the band gap is temperature independent at low temperatures and increases with temperature at elevated temperatures. Although such an observation seems to support a weak topological insulator phase at all temperatures and defy the previously reported topological phase transition (TPT) at an intermediate temperature in ZrTe$_5$, we show that it is also possible to explain the observation by considering the effect of conduction-valence band mixing and band inversion with a strong topological insulator phase at low temperatures. Our work provides an alternative picture of the band gap evolution across TPT.
Submitted 27 July, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Link Prediction with Non-Contrastive Learning
Authors:
William Shiao,
Zhichun Guo,
Tong Zhao,
Evangelos E. Papalexakis,
Yozen Liu,
Neil Shah
Abstract:
A recent focal area in the space of graph neural networks (GNNs) is graph self-supervised learning (SSL), which aims to derive useful node representations without labeled data. Notably, many state-of-the-art graph SSL methods are contrastive methods, which use a combination of positive and negative samples to learn node representations. Owing to challenges in negative sampling (slowness and model sensitivity), recent literature introduced non-contrastive methods, which instead only use positive samples. Though such methods have shown promising performance in node-level tasks, their suitability for link prediction tasks, which are concerned with predicting link existence between pairs of nodes (and have broad applicability to recommendation systems contexts) is yet unexplored. In this work, we extensively evaluate the performance of existing non-contrastive methods for link prediction in both transductive and inductive settings. While most existing non-contrastive methods perform poorly overall, we find that, surprisingly, BGRL generally performs well in transductive settings. However, it performs poorly in the more realistic inductive settings where the model has to generalize to links to/from unseen nodes. We find that non-contrastive models tend to overfit to the training graph and use this analysis to propose T-BGRL, a novel non-contrastive framework that incorporates cheap corruptions to improve the generalization ability of the model. This simple modification strongly improves inductive performance in 5/6 of our datasets, with up to a 120% improvement in Hits@50--all with comparable speed to other non-contrastive baselines and up to 14x faster than the best-performing contrastive baseline. Our work imparts interesting findings about non-contrastive learning for link prediction and paves the way for future researchers to further expand upon this area.
Submitted 28 March, 2023; v1 submitted 25 November, 2022;
originally announced November 2022.
-
Representation Learning for Continuous Action Spaces is Beneficial for Efficient Policy Learning
Authors:
Tingting Zhao,
Ying Wang,
Wei Sun,
Yarui Chen,
Gang Niu,
Masashi Sugiyama
Abstract:
Deep reinforcement learning (DRL) breaks through the bottlenecks of traditional reinforcement learning (RL) with the help of the perception capability of deep learning and has been widely applied in real-world problems. Model-free RL, as a class of efficient DRL methods, performs the learning of state representations simultaneously with policy learning in an end-to-end manner when facing large-scale continuous state and action spaces. However, training such a large policy model requires a large number of trajectory samples and long training time. Moreover, the learned policy often fails to generalize to large-scale action spaces, especially continuous action spaces. To address this issue, in this paper we propose an efficient policy learning method in latent state and action spaces. More specifically, we extend the idea of state representations to action representations for better policy generalization capability. Meanwhile, we divide the whole learning task into learning with the large-scale representation models in an unsupervised manner and learning with the small-scale policy model in the RL manner. The small policy model facilitates policy learning, while not sacrificing generalization and expressiveness thanks to the large representation model. Finally, the effectiveness of the proposed method is demonstrated by MountainCar, CarRacing, and Cheetah experiments.
Submitted 23 November, 2022;
originally announced November 2022.
-
Radial Oscillations of Dark Matter admixed Neutron Stars
Authors:
Pinku Routray,
H. C. Das,
Souhardya Sen,
Bharat Kumar,
Grigoris Panotopoulos,
Tianqi Zhao
Abstract:
Within the relativistic mean-field model, we investigate the properties of dark matter (DM) admixed neutron stars, considering non-rotating objects made of isotropic matter. We adopt the IOPB-I hadronic equation of state (EOS) by assuming that the fermionic DM within super-symmetric models has already been accreted inside the neutron star (NS). The impact of DM on the mass-radius relationships and the radial oscillations of pulsating DM admixed neutron stars (with and without the crust) is explored. It is observed that the presence of DM softens the EOS, which in turn lowers the maximum mass and its corresponding radius. Moreover, adding DM results in higher frequencies of pulsating objects, and hence we show the linearity of the fundamental mode frequency of the canonical NS with DM Fermi momentum. We also investigate the profile of eigenfunctions solving the Sturm-Liouville boundary value problem, and verify its validity. Further, we study the stability of NSs considering the fundamental mode frequency variation with the mass of the star, and verify the stability criterion $\partial M/\partial \rho_c > 0$. Finally, the effect of the crust on the large frequency separation for different DM Fermi momenta is shown as well.
Submitted 24 May, 2023; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Composite fermion pairing induced by Landau level mixing
Authors:
Tongzhou Zhao,
Ajit C. Balram,
J. K. Jain
Abstract:
Pairing of composite fermions provides a possible mechanism for fractional quantum Hall effect at even denominator fractions and is believed to serve as a platform for realizing quasiparticles with non-Abelian braiding statistics. We present results from fixed-phase diffusion Monte Carlo calculations which predict that substantial Landau level mixing can induce a pairing of composite fermions at filling factors $ν=1/2$ and $ν=1/4$ in the $l=-3$ relative angular momentum channel, thereby destabilizing the composite-fermion Fermi seas to produce non-Abelian fractional quantum Hall states.
Submitted 4 May, 2023; v1 submitted 14 November, 2022;
originally announced November 2022.
-
Anomalous anisotropy of spin current in a cubic spin source with noncollinear antiferromagnetism
Authors:
Cuimei Cao,
Shiwei Chen,
Rui-Chun Xiao,
Zengtai Zhu,
Guoqiang Yu,
Yangping Wang,
Xuepeng Qiu,
Liang Liu,
Tieyang Zhao,
Ding-Fu Shao,
Yang Xu,
Jingsheng Chen,
Qingfeng Zhan
Abstract:
Cubic materials host high crystal symmetry and hence are not expected to support anisotropy in transport phenomena. In contrast to this common expectation, here we report that an anomalous anisotropy of spin current can emerge in the (001) film of Mn${_3}$Pt, a noncollinear antiferromagnetic spin source with face-centered cubic structure. Such spin current anisotropy originates from the intertwined time reversal-odd ($T$-odd) and time reversal-even ($T$-even) spin Hall effects. Based on symmetry analyses and experimental characterizations of the current-induced spin torques in Mn${_3}$Pt-based heterostructures, we find that the spin current generated by Mn${_3}$Pt (001) exhibits exotic dependences on the current direction for all the spin components, deviating from that in conventional cubic systems. We also demonstrate that such an anisotropic spin current can be used to realize low-power spintronic applications such as the efficient field-free switching of perpendicular magnetizations.
Submitted 9 November, 2022;
originally announced November 2022.
-
Stardust: Compiling Sparse Tensor Algebra to a Reconfigurable Dataflow Architecture
Authors:
Olivia Hsu,
Alexander Rucker,
Tian Zhao,
Kunle Olukotun,
Fredrik Kjolstad
Abstract:
We introduce Stardust, a compiler that compiles sparse tensor algebra to reconfigurable dataflow architectures (RDAs). Stardust introduces new user-provided data representation and scheduling language constructs for mapping to resource-constrained accelerated architectures. Stardust uses the information provided by these constructs to determine on-chip memory placement and to lower to the Capstan RDA through a parallel-patterns rewrite system that targets the Spatial programming model. The Stardust compiler is implemented as a new compilation path inside the TACO open-source system. Using cycle-accurate simulation, we demonstrate that Stardust can generate more Capstan tensor operations than its authors had implemented and that it results in 138$\times$ better performance than generated CPU kernels and 41$\times$ better performance than generated GPU kernels.
Submitted 6 November, 2022;
originally announced November 2022.
-
Cost-aware Generalized $α$-investing for Multiple Hypothesis Testing
Authors:
Thomas Cook,
Harsh Vardhan Dubey,
Ji Ah Lee,
Guangyu Zhu,
Tingting Zhao,
Patrick Flaherty
Abstract:
We consider the problem of sequential multiple hypothesis testing with nontrivial data collection costs. This problem appears, for example, when conducting biological experiments to identify differentially expressed genes of a disease process. This work builds on the generalized $α$-investing framework, which enables control of the false discovery rate in a sequential testing setting. We make a theoretical analysis of the long-term asymptotic behavior of $α$-wealth, which motivates a consideration of sample size in the $α$-investing decision rule. Posing the testing process as a game with nature, we construct a decision rule that optimizes the expected $α$-wealth reward (ERO) and provides an optimal sample size for each test. Empirical results show that a cost-aware ERO decision rule correctly rejects more false null hypotheses than other methods for $n=1$, where $n$ is the sample size. When the sample size is not fixed, cost-aware ERO uses a prior on the null hypothesis to adaptively allocate the sample budget to each test. We extend cost-aware ERO investing to finite-horizon testing, which enables the decision rule to allocate samples in a non-myopic manner. Finally, empirical tests on real data sets from biological experiments show that cost-aware ERO balances the allocation of samples to an individual test against the allocation of samples across multiple tests.
Submitted 3 November, 2023; v1 submitted 31 October, 2022;
originally announced October 2022.
-
Disentangling Reasoning Capabilities from Language Models with Compositional Reasoning Transformers
Authors:
Wanjun Zhong,
Tingting Ma,
Jiahai Wang,
Jian Yin,
Tiejun Zhao,
Chin-Yew Lin,
Nan Duan
Abstract:
This paper presents ReasonFormer, a unified reasoning framework for mirroring the modular and compositional reasoning process of humans in complex decision-making. Inspired by dual-process theory in cognitive science, the representation module (automatic thinking) and reasoning modules (controlled thinking) are decoupled to capture different levels of cognition. On top of the representation module, the pre-trained reasoning modules are modular and specialized in specific and fundamental reasoning skills (e.g., logic, simple QA, etc.). To mimic the controlled compositional thinking process, different reasoning modules are dynamically activated and composed in both parallel and cascaded manners to control which reasoning skills are activated and how deep the reasoning process goes to solve the current problems. The unified reasoning framework solves multiple tasks with a single model, and is trained and run for inference in an end-to-end manner. Evaluated on 11 datasets requiring different reasoning skills and complexity, ReasonFormer demonstrates substantial performance boosts, revealing its compositional reasoning ability. Few-shot experiments exhibit better generalization ability by learning to compose pre-trained skills for new tasks with limited data, and by decoupling the representation module and the reasoning modules. Further analysis shows the modularity of reasoning modules, as different tasks activate distinct reasoning skills at different reasoning depths.
Submitted 7 December, 2022; v1 submitted 20 October, 2022;
originally announced October 2022.
-
Spectral theory of $p$-adic Hermite operator
Authors:
Tianhong Zhao
Abstract:
We give the definition of the $p$-adic Hermite operator and set up the $p$-adic spectral measure. We compare the Archimedean case with the non-Archimedean case. The structure of the Hermite conjugate in a $C^{*}$-algebra corresponds to three canonical structures of a $p$-adic ultrametric Banach algebra: 1. mod $p$ reduction 2. Frobenius map 3. Teichmüller lift. There is a natural connection between Galois theory and Hermite operator spectral decomposition. The Galois group $\mathrm{Gal}(\bar{\mathbb{F}}_p|\mathbb{F}_p)$ generates the $p$-adic spectral measure. We point out some relationships with $p$-adic quantum mechanics: 1. creation operator and annihilation operator 2. $p$-adic uncertainty principle.
Submitted 19 October, 2022;
originally announced October 2022.