Search | arXiv e-print repository

A Minimal Model for Carnot Efficiency at Maximum Power

Authors: Shiling Liang, Yu-Han Ma, Daniel Maria Busiello, Paolo De Los Rios

Abstract: Carnot efficiency sets a fundamental upper bound on the heat engine efficiency, attainable in the quasi-static limit, albeit at the cost of completely sacrificing power output. In this Letter, we present a minimal heat engine model that can attain Carnot efficiency while achieving maximum power output. We unveil the potential of intrinsic divergent physical quantities within the working substance, such as degeneracy, as promising thermodynamic resources to break through the universal power-efficiency trade-off imposed by nonequilibrium thermodynamics for conventional heat engines. Our findings provide novel insights into the collective advantage in harnessing energy of many-body interacting systems.

Submitted 4 December, 2023; originally announced December 2023.

Comments: 6+5 pages, 3 figures

arXiv:2311.13693 [pdf, other]

Scalable CP Decomposition for Tensor Learning using GPU Tensor Cores

Authors: Zeliang Zhang, Zhuo Liu, Susan Liang, Zhiyuan Wang, Yifan Zhu, Chen Ding, Chenliang Xu

Abstract: CP decomposition is a powerful tool for data science, especially gene analysis, deep learning, and quantum computation. However, the application of tensor decomposition is largely hindered by the exponential growth of computational complexity and storage consumption with the size of tensors. While real-world data is usually presented as trillion-scale or even exascale tensors, existing work can only support billion-scale tensors. In our work, we propose the Exascale-Tensor to close this significant gap. Specifically, we propose a compression-based tensor decomposition framework, namely the exascale-tensor, to support exascale tensor decomposition. Then, we carefully analyze the inherent parallelism and propose a bag of strategies to improve computational efficiency. Last, we conduct experiments to decompose tensors ranging from million-scale to trillion-scale for evaluation. Compared to the baselines, the exascale-tensor supports 8,000x larger tensors and achieves a speedup of up to 6.95x. We also apply our method to two real-world applications, gene analysis and tensor layer neural networks, for which the numerical results demonstrate the scalability and effectiveness of our method.

Submitted 22 November, 2023; originally announced November 2023.

arXiv:2311.12075 [pdf, other]

BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning

Authors: Siyuan Liang, Mingli Zhu, Aishan Liu, Baoyuan Wu, Xiaochun Cao, Ee-Chien Chang

Abstract: Studying backdoor attacks is valuable for model copyright protection and enhancing defenses. While existing backdoor attacks have successfully infected multimodal contrastive learning models such as CLIP, they can be easily countered by specialized backdoor defenses for MCL models. This paper reveals the threats in this practical scenario that backdoor attacks can remain effective even after defenses and introduces the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses. To achieve this, we draw motivations from the perspective of the Bayesian rule and propose a dual-embedding guided framework for backdoor attacks. Specifically, we ensure that visual trigger patterns approximate the textual target semantics in the embedding space, making it challenging to detect the subtle parameter variations induced by backdoor learning on such natural trigger patterns. Additionally, we optimize the visual trigger patterns to align the poisoned samples with target vision features in order to hinder the backdoor unlearning through clean fine-tuning. Extensive experiments demonstrate that our attack significantly outperforms state-of-the-art baselines (+45.3% ASR) in the presence of SoTA backdoor defenses, rendering these mitigation and detection strategies virtually ineffective. Furthermore, our approach effectively attacks some more rigorous scenarios like downstream tasks. We believe that this paper raises awareness regarding the potential threats associated with the practical application of multimodal contrastive learning and encourages the development of more robust defense mechanisms.

Submitted 4 March, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

Comments: The paper lacks some work that needs to be cited

Journal ref: CVPR 2024

arXiv:2311.11044 [pdf, ps, other]

Conditional central limit theorem for critical branching random walk

Authors: Wenming Hong, Shengli Liang

Abstract: Consider a critical branching random walk on $\mathbb{R}$. Let $Z^{(n)}(A)$ be the number of individuals in the $n$-th generation located in $A\in \mathcal{B}(\mathbb{R})$ and $Z_{n}:=Z^{(n)}(\mathbb{R})$ denote the population of the $n$-th generation. We prove that, under some conditions, for all $x\in \mathbb{R}$, as $n\to \infty$, $$\mathcal{L}\left(\frac{Z^{(n)}(-\infty, \sqrt{n} x]}{n} ~\bigg |~ Z_{n}>0\right) \Longrightarrow\mathcal{L}\left(Y(x)\right),$$ where $\Rightarrow$ means weak convergence and $Y(x)$ is a random variable whose distribution is specified by its moments.

Submitted 18 November, 2023; originally announced November 2023.

arXiv:2311.11017 [pdf, other]

Improving Adversarial Transferability by Stable Diffusion

Authors: Jiayang Liu, Siyu Zhu, Siyuan Liang, Jie Zhang, Han Fang, Weiming Zhang, Ee-Chien Chang

Abstract: Deep neural networks (DNNs) are susceptible to adversarial examples, which introduce imperceptible perturbations to benign samples, deceiving DNN predictions. While some attack methods excel in the white-box setting, they often struggle in the black-box scenario, particularly against models fortified with defense mechanisms. Various techniques have emerged to enhance the transferability of adversarial attacks for the black-box scenario. Among these, input transformation-based attacks have demonstrated their effectiveness. In this paper, we explore the potential of leveraging data generated by Stable Diffusion to boost adversarial transferability. This approach draws inspiration from recent research that harnessed synthetic data generated by Stable Diffusion to enhance model generalization. In particular, previous work has highlighted the correlation between the presence of both real and synthetic data and improved model generalization. Building upon this insight, we introduce a novel attack method called Stable Diffusion Attack Method (SDAM), which incorporates samples generated by Stable Diffusion to augment input images. Furthermore, we propose a fast variant of SDAM to reduce computational overhead while preserving high adversarial transferability. Our extensive experimental results demonstrate that our method outperforms state-of-the-art baselines by a substantial margin. Moreover, our approach is compatible with existing transfer-based attacks to further enhance adversarial transferability.

Submitted 18 November, 2023; originally announced November 2023.

arXiv:2311.06595 [pdf, other]

From Classification to Generation: Insights into Crosslingual Retrieval Augmented ICL

Authors: Xiaoqian Li, Ercong Nie, Sheng Liang

Abstract: The remarkable ability of Large Language Models (LLMs) to understand and follow instructions has sometimes been limited by their in-context learning (ICL) performance in low-resource languages. To address this, we introduce a novel approach that leverages cross-lingual retrieval-augmented in-context learning (CREA-ICL). By extracting semantically similar prompts from high-resource languages, we aim to improve the zero-shot performance of multilingual pre-trained language models (MPLMs) across diverse tasks. Though our approach yields steady improvements in classification tasks, it faces challenges in generation tasks. Our evaluation offers insights into the performance dynamics of retrieval-augmented in-context learning across both classification and generation domains.

Submitted 2 December, 2023; v1 submitted 11 November, 2023; originally announced November 2023.

Comments: In The Workshop on Instruction Tuning and Instruction Following, held in conjunction with The Conference on NeurIPS 2023, December 2023. arXiv admin note: text overlap with arXiv:2311.00587

arXiv:2311.05797 [pdf, other]

Stochastic quantization of the three-dimensional polymer measure via the Dirichlet form method

Authors: Sergio Albeverio, Seiichiro Kusuoka, Song Liang, Makoto Nakashima

Abstract: We prove that there exists a diffusion process whose invariant measure is the three dimensional polymer measure $ν_λ$ for small $λ>0$. We follow in part a previous incomplete unpublished work of the first named author with M. Röckner and X.Y. Zhou. For the construction of $ν_λ$ we rely on previous work by J. Westwater, E. Bolthausen and X.Y. Zhou. Using $ν_λ$, the diffusion is constructed by means of the theory of Dirichlet forms on infinite-dimensional state spaces. The closability of the appropriate pre-Dirichlet form which is of gradient type is proven, by using a general closability result in [AR89a]. This result does not require an integration by parts formula (which does not even hold for the two-dimensional polymer measure $ν_λ$) but requires the quasi-invariance of $ν_λ$ along a basis of vectors in the classical Cameron-Martin space such that the Radon-Nikodym derivatives have versions which form a continuous process.

Submitted 13 November, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

Comments: 87 pages, 8 figures

MSC Class: 81S20; 60J65; 60J46; 60H30

arXiv:2311.04732 [pdf, other]

General-purpose machine-learned potential for 16 elemental metals and their alloys

Authors: Keke Song, Rui Zhao, Jiahui Liu, Yanzhou Wang, Eric Lindgren, Yong Wang, Shunda Chen, Ke Xu, Ting Liang, Penghua Ying, Nan Xu, Zhiqiang Zhao, Jiuyang Shi, Junjie Wang, Shuang Lyu, Zezhu Zeng, Shirong Liang, Haikuan Dong, Ligang Sun, Yue Chen, Zhuhua Zhang, Wanlin Guo, Ping Qian, Jian Sun, Paul Erhart, et al. (3 additional authors not shown)

Abstract: Machine-learned potentials (MLPs) have exhibited remarkable accuracy, yet the lack of general-purpose MLPs for a broad spectrum of elements and their alloys limits their applicability. Here, we present a feasible approach for constructing a unified general-purpose MLP for numerous elements, demonstrated through a model (UNEP-v1) for 16 elemental metals and their alloys. To achieve a complete representation of the chemical space, we show, via principal component analysis and diverse test datasets, that employing one-component and two-component systems suffices. Our unified UNEP-v1 model exhibits superior performance across various physical properties compared to a widely used embedded-atom method potential, while maintaining remarkable efficiency. We demonstrate our approach's effectiveness through reproducing experimentally observed chemical order and stable phases, and large-scale simulations of plasticity and primary radiation damage in MoTaVW alloys. This work represents a significant leap towards a unified general-purpose MLP encompassing the periodic table, with profound implications for materials science.

Submitted 12 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

Comments: Main text with 17 pages and 8 figures; supplementary with 26 figures and 4 tables; source code and training/test data available

arXiv:2311.00587 [pdf, other]

Crosslingual Retrieval Augmented In-context Learning for Bangla

Authors: Xiaoqian Li, Ercong Nie, Sheng Liang

Abstract: The promise of Large Language Models (LLMs) in Natural Language Processing has often been overshadowed by their limited performance in low-resource languages such as Bangla. To address this, our paper presents a pioneering approach that utilizes cross-lingual retrieval augmented in-context learning. By strategically sourcing semantically similar prompts from a high-resource language, we enable multilingual pretrained language models (MPLMs), especially the generative model BLOOMZ, to successfully boost performance on Bangla tasks. Our extensive evaluation highlights that the cross-lingual retrieval augmented prompts bring steady improvements to MPLMs over the zero-shot performance.

Submitted 2 December, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

Comments: In The 1st Bangla Language Processing (BLP) Workshop, held in conjunction with The Conference on Empirical Methods in Natural Language Processing (EMNLP), December 2023

arXiv:2310.16450 [pdf, other]

CLEX: Continuous Length Extrapolation for Large Language Models

Authors: Guanzheng Chen, Xin Li, Zaiqiao Meng, Shangsong Liang, Lidong Bing

Abstract: Transformer-based Large Language Models (LLMs) are pioneering advances in many natural language processing tasks; however, their exceptional capabilities are restricted to the preset context window of the Transformer. Position Embedding (PE) scaling methods, while effective in extending the context window to a specific length, either show notable limitations in their extrapolation abilities or sacrifice some performance within the context window. Length extrapolation methods, although theoretically capable of extending the context window beyond the training sequence length, often underperform in practical long-context applications. To address these challenges, we propose Continuous Length EXtrapolation (CLEX) for LLMs. We generalise the PE scaling approaches to model the continuous dynamics by ordinary differential equations over the length scaling factor, thereby overcoming the constraints of current PE scaling methods designed for specific lengths. Moreover, by extending the dynamics to desired context lengths beyond the training sequence length, CLEX facilitates the length extrapolation with impressive performance in practical tasks. We demonstrate that CLEX can be seamlessly incorporated into LLMs equipped with Rotary Position Embedding, such as LLaMA and GPT-NeoX, with negligible impact on training and inference latency. Experimental results reveal that CLEX can effectively extend the context window to over 4x or almost 8x the training length, with no deterioration in performance. Furthermore, when evaluated on the practical LongBench benchmark, our model trained on a 4k length exhibits competitive performance against state-of-the-art open-source models trained on context lengths up to 32k. Our code is available at https://github.com/DAMO-NLP-SG/CLEX.

Submitted 24 March, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

Comments: ICLR 2024

arXiv:2310.11743 [pdf]

doi 10.1088/0256-307X/40/11/117201

Moire synaptic transistor for homogeneous-architecture reservoir computing

Authors: Pengfei Wang, Moyu Chen, Yongqin Xie, Chen Pan, Kenji Watanabe, Takashi Taniguchi, Bin Cheng, Shi-Jun Liang, Feng Miao

Abstract: Reservoir computing has been considered a promising intelligent computing paradigm for effectively processing complex temporal information. Exploiting tunable and reproducible dynamics in a single electronic device has been desired to implement the reservoir and the readout layer of a reservoir computing system. Two-dimensional moire materials, with an artificial lattice constant many times larger than the atomic length scale, are among the most studied artificial quantum materials in the materials science and condensed-matter physics communities in recent years. These materials feature gate-tunable periodic potential and electronic correlation; thus, varying the electric field allows the electrons in the moire potential per unit cell to exhibit distinct and reproducible dynamics, showing great promise in robust reservoir computing. Here, we report that a moire synaptic transistor can be used to implement the reservoir computing system with a homogeneous reservoir-readout architecture. The synaptic transistor is fabricated based on an h-BN/bilayer graphene/h-BN moire heterostructure, exhibiting ferroelectricity-like hysteretic gate voltage dependence of resistance. Varying the magnitude of the gate voltage enables the moire transistor to be switched between long-term memory and short-term memory with nonlinear dynamics. By employing the short- and long-term memory as the reservoir nodes and weights of the readout layer, respectively, we construct a full-moire physical neural network and demonstrate that a classification accuracy of 90.8% can be achieved for the MNIST handwritten digit database. Our work paves the way towards the development of neuromorphic computing based on moire materials.

Submitted 18 October, 2023; originally announced October 2023.

Journal ref: Chin. Phys. Lett. 2023 40 (11): 117201

arXiv:2310.10317 [pdf]

Stochastic spin-orbit-torque synapse and its application in uncertainty quantification

Authors: Cen Wang, Guang Zeng, Xinyu Wen, Yuhui He, Wei Luo, Shiwei Chen, Shiheng Liang, Yue Zhang

Abstract: Stochasticity plays a significant role in the low-power operation of a biological neural network. In an artificial neural network (ANN), stochasticity also contributes to critical functions such as the uncertainty quantification (UQ) for estimating the probability for the correctness of prediction. This UQ is vital for cutting-edge applications, including medical diagnostics, autopilots, and large language models. Thanks to high computing velocity and low dissipation, a spin-orbit-torque (SOT) device exhibits significant potential for implementing the UQ. However, up until now, the application of UQ for stochastic SOT devices remains unexplored. In this study, based on SOT-induced stochastic magnetic domain wall (DW) motion with varying velocity, we fabricated an SOT synapse that could emulate stochastic weight update following the Spike-Timing-Dependent-Plasticity (STDP) rule. Furthermore, we set up a stochastic Spiking-Neural-Network (SNN), which, when compared to its deterministic counterpart, demonstrates a clear advantage in quantifying uncertainty for diagnosing the type of breast tumor (benign or malignant).

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.09917 [pdf, other]

Empirical study of pretrained multilingual language models for zero-shot cross-lingual knowledge transfer in generation

Authors: Nadezhda Chirkova, Sheng Liang, Vassilina Nikoulina

Abstract: Zero-shot cross-lingual knowledge transfer enables a multilingual pretrained language model (mPLM), finetuned on a task in one language, to make predictions for this task in other languages. While broadly studied for natural language understanding tasks, the described setting is understudied for generation. Previous works notice a frequent problem of generation in a wrong language and propose approaches to address it, usually using mT5 as a backbone model. In this work, we test alternative mPLMs, such as mBART and NLLB-200, considering full finetuning and parameter-efficient finetuning with adapters. We find that mBART with adapters performs similarly to mT5 of the same size, and NLLB-200 can be competitive in some cases. We also underline the importance of tuning the learning rate used for finetuning, which helps to alleviate the problem of generation in the wrong language.

Submitted 22 April, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

Comments: This preprint describes a preliminary study for our follow-up work arXiv:2402.12279 (NAACL 2024), in which we investigate important factors for enabling zero-shot cross-lingual transfer in generative tasks

arXiv:2310.09434 [pdf, other]

Learning nonlinear integral operators via Recurrent Neural Networks and its application in solving Integro-Differential Equations

Authors: Hardeep Bassi, Yuanran Zhu, Senwei Liang, Jia Yin, Cian C. Reeves, Vojtech Vlcek, Chao Yang

Abstract: In this paper, we propose using LSTM-RNNs (Long Short-Term Memory-Recurrent Neural Networks) to learn and represent nonlinear integral operators that appear in nonlinear integro-differential equations (IDEs). The LSTM-RNN representation of the nonlinear integral operator allows us to turn a system of nonlinear integro-differential equations into a system of ordinary differential equations for which many efficient solvers are available. Furthermore, because the use of LSTM-RNN representation of the nonlinear integral operator in an IDE eliminates the need to perform a numerical integration in each numerical time evolution step, the overall temporal cost of the LSTM-RNN-based IDE solver can be reduced to $O(n_T)$ from $O(n_T^2)$ if an $n_T$-step trajectory is to be computed. We illustrate the efficiency and robustness of this LSTM-RNN-based numerical IDE solver with a model problem. Additionally, we highlight the generalizability of the learned integral operator by applying it to IDEs driven by different external forces. As a practical application, we show how this methodology can effectively solve Dyson's equation for quantum many-body systems.

Submitted 13 October, 2023; originally announced October 2023.
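
Editor's illustration of the cost argument in the abstract above (not the authors' solver): a direct time-stepping scheme for an integro-differential equation re-evaluates the memory integral over the whole history at every step, which is where the $O(n_T^2)$ cost comes from. The sketch below assumes a hypothetical scalar model problem $y'(t) = -\int_0^t K(t-s)\,y(s)\,ds$ with an exponential kernel, forward Euler time stepping, and trapezoidal quadrature. The function name, kernel, and parameters are all illustrative assumptions.

```python
import math


def solve_ide_direct(n_steps, dt=0.01, y0=1.0):
    """Forward-Euler solve of y'(t) = -int_0^t K(t - s) y(s) ds.

    The model problem and the kernel K(tau) = exp(-tau) are assumptions
    chosen for illustration; they are not taken from the paper.
    Returns the trajectory and the number of kernel evaluations.
    """
    def kernel(tau):
        return math.exp(-tau)

    ys = [y0]
    kernel_evals = 0
    for n in range(n_steps):
        t_n = n * dt
        # Integrand sampled on the full history 0, dt, ..., t_n.
        vals = [kernel(t_n - j * dt) * ys[j] for j in range(n + 1)]
        kernel_evals += n + 1
        # Trapezoidal rule: full weight inside, half weight at both ends.
        integral = dt * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
        ys.append(ys[-1] - dt * integral)
    return ys, kernel_evals
```

Step n touches n + 1 history points, so an n_T-step trajectory costs n_T(n_T + 1)/2 kernel evaluations in this direct scheme. Replacing the history integral with a constant-cost update per step is what would bring the total down to $O(n_T)$.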

arXiv:2310.07073 [pdf, other]

Pull-back Geometry of Persistent Homology Encodings

Authors: Shuang Liang, Renata Turkeš, Jiayi Li, Nina Otter, Guido Montúfar

Abstract: Persistent homology (PH) is a method for generating topology-inspired representations of data. Empirical studies that investigate the properties of PH, such as its sensitivity to perturbations or ability to detect a feature of interest, commonly rely on training and testing an additional model on the basis of the PH representation. To gain more intrinsic insights about PH, independently of the choice of such a model, we propose a novel methodology based on the pull-back geometry that a PH encoding induces on the data manifold. The spectrum and eigenvectors of the induced metric help to identify the most and least significant information captured by PH. Furthermore, the pull-back norm of tangent vectors provides insights about the sensitivity of PH to a given perturbation, or its potential to detect a given feature of interest, and in turn its ability to solve a given classification or regression problem. Experimentally, the insights gained through our methodology align well with the existing knowledge about PH. Moreover, we show that the pull-back norm correlates with the performance on downstream tasks, and can therefore guide the choice of a suitable PH encoding.

Submitted 3 March, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.05591 [pdf, other]

From the Weyl-Schrödinger connection to the accelerating Universe -- extending Einstein's gravity via a length preserving nonmetricity

Authors: Lei Ming, Shi-Dong Liang, Hong-Hao Zhang, Tiberiu Harko

Abstract: One of the important extensions of Riemann geometry is Weyl geometry, which is essentially based on the ideas of conformal invariance and nonmetricity. A similar non-Riemannian geometry was proposed by Erwin Schrödinger in the late 1940s, in a geometry which is simpler, and (probably) more elegant than the Weyl geometry. Even though it contains nonmetricity, the Schrödinger connection preserves the length of vectors under parallel transport, and thus seems to be more physical than the Weyl connection. Interestingly enough, Schrödinger's approach did not attract much interest in the field of gravitational physics. It is the goal of the present paper to reconsider the Schrödinger geometry as a potential candidate for a gravitational theory extending standard general relativity. We consider a gravitational action constructed from a length preserving non-metricity, in the absence of torsion, and investigate its variation in both Palatini and metric formalisms. While the Palatini variation leads to standard general relativity, the metric version of the theory adds some non-metricity dependent extra terms in the gravitational Einstein equations, which can be interpreted as representing a geometric type dark energy. After obtaining the generalized Friedmann equations, we analyze in detail the cosmological implications of the theory, by considering two distinct models, corresponding to a dark energy satisfying a linear equation of state, and to conserved matter energy, respectively. In both cases we compare the predictions of the Weyl-Schrödinger cosmology with a set of observational data for the Hubble function, and with the results of the $Λ$CDM standard paradigm.

Submitted 9 October, 2023; originally announced October 2023.

Comments: 21 pages, 11 figures

arXiv:2310.01980 [pdf, other]

UAV Swarm-enabled Collaborative Secure Relay Communications with Time-domain Colluding Eavesdropper

Authors: Chuang Zhang, Geng Sun, Qingqing Wu, Jiahui Li, Shuang Liang, Dusit Niyato, Victor C. M. Leung

Abstract: Unmanned aerial vehicles (UAVs) as aerial relays are practically appealing for assisting Internet of Things (IoT) networks. In this work, we aim to utilize the UAV swarm to assist the secure communication between the micro base station (MBS) equipped with the planar array antenna (PAA) and the IoT terminal devices by collaborative beamforming (CB), so as to counteract the effects of collusive eavesdropping attacks in the time domain. Specifically, we formulate a UAV swarm-enabled secure relay multi-objective optimization problem (US2RMOP) for simultaneously maximizing the achievable sum rate of associated IoT terminal devices, minimizing the achievable sum rate of the eavesdropper and minimizing the energy consumption of UAV swarm, by jointly optimizing the excitation current weights of both MBS and UAV swarm, the selection of the UAV receiver, the position of UAVs and user association order of IoT terminal devices. Furthermore, the formulated US2RMOP is proved to be a non-convex, NP-hard and large-scale optimization problem. Therefore, we propose an improved multi-objective grasshopper algorithm (IMOGOA) with some specific designs to address the problem. Simulation results exhibit the effectiveness of the proposed UAV swarm-enabled collaborative secure relay strategy and demonstrate the superiority of IMOGOA.

Submitted 3 October, 2023; originally announced October 2023.

Comments: Submitted to IEEE Transactions on Mobile Computing

arXiv:2310.00396 [pdf, other]

Joint Scheduling and Trajectory Optimization of Charging UAV in Wireless Rechargeable Sensor Networks

Authors: Yanheng Liu, Hongyang Pan, Geng Sun, Aimin Wang, Jiahui Li, Shuang Liang

Abstract: Wireless rechargeable sensor networks with a charging unmanned aerial vehicle (CUAV) have broad application prospects in the power supply of the rechargeable sensor nodes (SNs). However, how to schedule a CUAV and design the trajectory to improve the charging efficiency of the entire system is still a vital problem. In this paper, we formulate a joint-CUAV scheduling and trajectory optimization problem (JSTOP) to simultaneously minimize the hovering points of CUAV, the number of the repeatedly covered SNs and the flying distance of CUAV for charging all SNs. Due to the complexity of JSTOP, it is decomposed into two optimization subproblems, namely the CUAV scheduling optimization problem (CSOP) and the CUAV trajectory optimization problem (CTOP). CSOP is a hybrid optimization problem that consists of the continuous and discrete solution space, and the solution dimension in CSOP is not fixed since it should be changed with the number of hovering points of CUAV. Moreover, CTOP is a completely discrete optimization problem. Thus, we propose a particle swarm optimization (PSO) with a flexible dimension mechanism, a K-means operator and a punishment-compensation mechanism (PSOFKP) and a PSO with a discretization factor, a 2-opt operator and a path crossover reduction mechanism (PSOD2P) to solve the converted CSOP and CTOP, respectively. Simulation results evaluate the benefits of PSOFKP and PSOD2P under different scales and settings of the network, and the stability of the proposed algorithms is verified.

Submitted 30 September, 2023; originally announced October 2023.

arXiv:2310.00384 [pdf, ps, other]

Joint Power and 3D Trajectory Optimization for UAV-enabled Wireless Powered Communication Networks with Obstacles

Authors: Hongyang Pan, Yanheng Liu, Geng Sun, Junsong Fan, Shuang Liang, Chau Yuen

Abstract: Unmanned aerial vehicle (UAV)-enabled wireless powered communication networks (WPCNs) are promising technologies in 5G/6G wireless communications, while there are several challenges about UAV power allocation and scheduling to enhance the energy utilization efficiency, considering the existence of obstacles. In this work, we consider a UAV-enabled WPCN scenario in which a UAV needs to cover the ground wireless devices (WDs). During the coverage process, the UAV needs to collect data from the WDs and charge them simultaneously. To this end, we formulate a joint-UAV power and three-dimensional (3D) trajectory optimization problem (JUPTTOP) to simultaneously increase the total number of the covered WDs, increase the time efficiency, and reduce the total flying distance of the UAV so as to improve the energy utilization efficiency in the network. Due to the difficulties and complexities, we decompose it into two sub-optimization problems, which are the UAV power allocation optimization problem (UPAOP) and the UAV 3D trajectory optimization problem (UTTOP), respectively. Then, we propose an improved non-dominated sorting genetic algorithm-II with K-means initialization operator and Variable dimension mechanism (NSGA-II-KV) for solving the UPAOP. For UTTOP, we first introduce a pretreatment method, and then use an improved particle swarm optimization with Normal distribution initialization, Genetic mechanism, Differential mechanism and Pursuit operator (PSO-NGDP) to deal with this sub-optimization problem. Simulation results verify the effectiveness of the proposed strategies under different scales and settings of the networks.

Submitted 30 September, 2023; originally announced October 2023.

arXiv:2310.00288 [pdf]

doi 10.1038/s41928-023-00965-5

Parallel in-memory wireless computing

Authors: Cong Wang, Gong-Jie Ruan, Zai-Zheng Yang, Xing-Jian Yangdong, Yixiang Li, Liang Wu, Yingmeng Ge, Yichen Zhao, Chen Pan, Wei Wei, Li-Bo Wang, Bin Cheng, Zaichen Zhang, Chuan Zhang, Shi-Jun Liang, Feng Miao

Abstract: Parallel wireless digital communication with ultralow power consumption is critical for emerging edge technologies such as 5G and Internet of Things. However, the physical separation between digital computing units and analogue transmission units in traditional wireless technology leads to high power consumption. Here we report a parallel in-memory wireless computing scheme. The approach combines in-memory computing with wireless communication using memristive crossbar arrays. We show that the system can be used for the radio transmission of a binary stream of 480 bits with a bit error rate of 0. The in-memory wireless computing uses two orders of magnitude less power than conventional technology (based on digital-to-analogue and analogue-to-digital converters). We also show that the approach can be applied to acoustic and optical wireless communications.

Submitted 30 September, 2023; originally announced October 2023.

Journal ref: Nat Electron 6, 381-389 (2023)

arXiv:2309.16709 [pdf, other]

Joint Task Offloading and Resource Allocation in Aerial-Terrestrial UAV Networks with Edge and Fog Computing for Post-Disaster Rescue

Authors: Geng Sun, Long He, Zemin Sun, Qingqing Wu, Shuang Liang, Jiahui Li, Dusit Niyato, Victor C. M. Leung

Abstract: Unmanned aerial vehicles (UAVs) play an increasingly important role in assisting fast-response post-disaster rescue due to their fast deployment, flexible mobility, and low cost. However, UAVs face the challenges of limited battery capacity and computing resources, which could shorten the expected flight endurance of UAVs and increase the rescue response delay during performing mission-critical tasks. To address this challenge, we first present a three-layer post-disaster rescue computing architecture by leveraging the aerial-terrestrial edge capabilities of mobile edge computing (MEC) and vehicle fog computing (VFC), which consists of a vehicle fog layer, a UAV client layer, and a UAV edge layer. Moreover, we formulate a joint task offloading and resource allocation optimization problem (JTRAOP) with the aim of maximizing the time-average system utility. Since the formulated JTRAOP is proved to be NP-hard, we propose an MEC-VFC-aided task offloading and resource allocation (MVTORA) approach, which consists of a game theoretic algorithm for task offloading decision, a convex optimization-based algorithm for MEC resource allocation, and an evolutionary computation-based hybrid algorithm for VFC resource allocation. Simulation results validate that the proposed approach can achieve superior system performance compared to the other benchmark schemes, especially under heavy system workloads.

Submitted 6 October, 2023; v1 submitted 17 August, 2023; originally announced September 2023.

Comments: 18 pages, 6 figures

arXiv:2309.15977 [pdf, other]

Neural Acoustic Context Field: Rendering Realistic Room Impulse Response With Neural Fields

Authors: Susan Liang, Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu

Abstract: Room impulse response (RIR), which measures the sound propagation within an environment, is critical for synthesizing high-fidelity audio for a given environment. Some prior work has proposed representing RIR as a neural field function of the sound emitter and receiver positions. However, these methods do not sufficiently consider the acoustic properties of an audio scene, leading to unsatisfactory performance. This letter proposes a novel Neural Acoustic Context Field approach, called NACF, to parameterize an audio scene by leveraging multiple acoustic contexts, such as geometry, material property, and spatial information. Driven by the unique properties of RIR, i.e., temporal un-smoothness and monotonic energy attenuation, we design a temporal correlation module and multi-scale energy decay criterion. Experimental results show that NACF outperforms existing field-based methods by a notable margin. Please visit our project page for more qualitative results.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.11996 [pdf, other]

doi 10.1140/epjc/s10052-023-12296-y

Design and performance of the field cage for the XENONnT experiment

Authors: E. Aprile, K. Abe, S. Ahmed Maouloud, L. Althueser, B. Andrieu, E. Angelino, J. R. Angevaare, V. C. Antochi, D. Antón Martin, F. Arneodo, L. Baudis, A. L. Baxter, M. Bazyk, L. Bellagamba, R. Biondi, A. Bismark, E. J. Brookes, A. Brown, S. Bruenner, G. Bruno, R. Budnik, T. K. Bui, C. Cai, J. M. R. Cardoso, D. Cichon , et al. (139 additional authors not shown)

Abstract: The precision in reconstructing events detected in a dual-phase time projection chamber depends on a homogeneous and well understood electric field within the liquid target. In the XENONnT TPC the field homogeneity is achieved through a double-array field cage, consisting of two nested arrays of field shaping rings connected by an easily accessible resistor chain. Rather than being connected to the gate electrode, the topmost field shaping ring is independently biased, adding a degree of freedom to tune the electric field during operation. Two-dimensional finite element simulations were used to optimize the field cage, as well as its operation. Simulation results were compared to ${}^{83m}\mathrm{Kr}$ calibration data. This comparison indicates an accumulation of charge on the panels of the TPC which is constant over time, as no evolution of the reconstructed position distribution of events is observed. The simulated electric field was then used to correct the charge signal for the field dependence of the charge yield. This correction resolves the inconsistent measurement of the drift electron lifetime when using different calibration sources and different field cage tuning voltages.

Submitted 21 September, 2023; originally announced September 2023.

Journal ref: Eur. Phys. J. C 84, 138 (2024)

arXiv:2309.11114 [pdf, other]

doi 10.1103/PhysRevD.109.054508

Reconstructing lattice QCD spectral functions with stochastic pole expansion and Nevanlinna analytic continuation

Authors: Li Huang, Shuang Liang

Abstract: The reconstruction of spectral functions from Euclidean correlation functions is a well-known, yet ill-posed inverse problem in the fields of many-body and high-energy physics. In this paper, we present a comprehensive investigation of two recently developed analytic continuation methods, namely stochastic pole expansion and Nevanlinna analytic continuation, for extracting spectral functions from mock lattice QCD data. We examine a range of Euclidean correlation functions generated by representative models, including the Breit-Wigner model, the Gaussian mixture model, the resonance-continuum model, and the bottomonium model. Our findings demonstrate that the stochastic pole expansion method, when combined with the constrained sampling algorithm and the self-adaptive sampling algorithm, successfully recovers the essential features of the spectral functions and exhibits excellent resilience to noise of input data. In contrast, the Nevanlinna analytic continuation method suffers from numerical instability, often resulting in the emergence of spurious peaks and significant oscillations in the high-energy regions of the spectral functions, even with the application of the Hardy basis function optimization algorithm.

Submitted 20 September, 2023; originally announced September 2023.

Comments: 14 pages, 8 figures

Journal ref: Phys. Rev. D 109, 054508 (2024)

arXiv:2309.10326 [pdf, other]

QASnowball: An Iterative Bootstrapping Framework for High-Quality Question-Answering Data Generation

Authors: Kunlun Zhu, Shihao Liang, Xu Han, Zhi Zheng, Guoyang Zeng, Zhiyuan Liu, Maosong Sun

Abstract: Recent years have witnessed the success of question answering (QA), especially its potential to be a foundation paradigm for tackling diverse NLP tasks. However, obtaining sufficient data to build an effective and stable QA system still remains an open problem. For this problem, we introduce an iterative bootstrapping framework for QA data augmentation (named QASnowball), which can iteratively generate large-scale high-quality QA data based on a seed set of supervised examples. Specifically, QASnowball consists of three modules, an answer extractor to extract core phrases in unlabeled documents as candidate answers, a question generator to generate questions based on documents and candidate answers, and a QA data filter to filter out high-quality QA data. Moreover, QASnowball can be self-enhanced by reseeding the seed set to fine-tune itself in different iterations, leading to continual improvements in the generation quality. We conduct experiments in the high-resource English scenario and the medium-resource Chinese scenario, and the experimental results show that the data generated by QASnowball can facilitate QA models: (1) training models on the generated data achieves comparable results to using supervised data, and (2) pre-training on the generated data and fine-tuning on supervised data can achieve better performance. Our code and generated data will be released to advance further work.

Submitted 19 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

arXiv:2309.09030 [pdf, other]

Improve Deep Forest with Learnable Layerwise Augmentation Policy Schedule

Authors: Hongyu Zhu, Sichu Liang, Wentao Hu, Fang-Qi Li, Yali yuan, Shi-Lin Wang, Guang Cheng

Abstract: As a modern ensemble technique, Deep Forest (DF) employs a cascading structure to construct deep models, providing stronger representational power compared to traditional decision forests. However, its greedy multi-layer learning procedure is prone to overfitting, limiting model effectiveness and generalizability. This paper presents an optimized Deep Forest, featuring learnable, layerwise data augmentation policy schedules. Specifically, we introduce the Cut Mix for Tabular data (CMT) augmentation technique to mitigate overfitting and develop a population-based search algorithm to tailor augmentation intensity for each layer. Additionally, we propose to incorporate outputs from intermediate layers into a checkpoint ensemble for more stable performance. Experimental results show that our method sets new state-of-the-art (SOTA) benchmarks in various tabular classification tasks, outperforming shallow tree ensembles, deep forests, deep neural networks, and AutoML competitors. The learned policies also transfer effectively to Deep Forest variants, underscoring its potential for enhancing non-differentiable deep learning modules in tabular signal processing.

Submitted 16 September, 2023; originally announced September 2023.

arXiv:2309.07979 [pdf, other]

Fast Safe Rectangular Corridor-based Online AGV Trajectory Optimization with Obstacle Avoidance

Authors: Shaoqiang Liang, Songyuan Fa, Yiqun Li

Abstract: Automated Guided Vehicles (AGVs) are essential in various industries for their efficiency and adaptability. However, planning trajectories for AGVs in obstacle-dense, unstructured environments presents significant challenges due to the nonholonomic kinematics, abundant obstacles, and the scenario's nonconvex and constrained nature. To address this, we propose an efficient trajectory planning framework for AGVs by formulating the problem as an optimal control problem. Our framework utilizes the fast safe rectangular corridor (FSRC) algorithm to construct rectangular convex corridors, representing avoidance constraints as box constraints. This eliminates redundant obstacle influences and accelerates the solution speed. Additionally, we employ the Modified Visibility Graph algorithm to speed up path planning and a boundary discretization strategy to expedite FSRC construction. Experimental results demonstrate the effectiveness and superiority of our framework, particularly in computational efficiency. Compared to advanced frameworks, our framework achieves computational efficiency gains of 1 to 2 orders of magnitude. Notably, FSRC significantly outperforms other safe convex corridor-based methods regarding computational efficiency.

Submitted 12 March, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

arXiv:2309.05794 [pdf, other]

Robust Physics-based Deep MRI Reconstruction Via Diffusion Purification

Authors: Ismail Alkhouri, Shijun Liang, Rongrong Wang, Qing Qu, Saiprasad Ravishankar

Abstract: Deep learning (DL) techniques have been extensively employed in magnetic resonance imaging (MRI) reconstruction, delivering notable performance enhancements over traditional non-DL methods. Nonetheless, recent studies have identified vulnerabilities in these models during testing, namely, their susceptibility to (i) worst-case measurement perturbations and to (ii) variations in training/testing settings like acceleration factors and k-space sampling locations. This paper addresses the robustness challenges by leveraging diffusion models. In particular, we present a robustification strategy that improves the resilience of DL-based MRI reconstruction methods by utilizing pretrained diffusion models as noise purifiers. In contrast to conventional robustification methods for DL-based MRI reconstruction, such as adversarial training (AT), our proposed approach eliminates the need to tackle a minimax optimization problem. It only necessitates fine-tuning on purified examples. Our experimental results highlight the efficacy of our approach in mitigating the aforementioned instabilities when compared to leading robustification approaches for deep MRI reconstruction, including AT and randomized smoothing.

Submitted 24 October, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

arXiv:2308.16781 [pdf, other]

StratMed: Relevance Stratification between Biomedical Entities for Sparsity on Medication Recommendation

Authors: Xiang Li, Shunpan Liang, Yulei Hou, Tengfei Ma

Abstract: With the growing imbalance between limited medical resources and escalating demands, AI-based clinical tasks have become paramount. As a sub-domain, medication recommendation aims to amalgamate longitudinal patient history with medical knowledge, assisting physicians in prescribing safer and more accurate medication combinations. Existing works ignore the inherent long-tailed distribution of medical data, have uneven learning strengths for hot and sparse data, and fail to balance safety and accuracy. To address the above limitations, we propose StratMed, which introduces a stratification strategy that overcomes the long-tailed problem and achieves fuller learning of sparse data. It also utilizes a dual-property network to address the issue of mutual constraints on the safety and accuracy of medication combinations, synergistically enhancing these two properties. Specifically, we construct a pre-training method using deep learning networks to obtain medication and disease representations. After that, we design a pyramid-like stratification method based on relevance to strengthen the expressiveness of sparse data. Based on this relevance, we design two graph structures to express medication safety and precision at the same level to obtain patient representations. Finally, the patient's historical clinical information is fitted to generate medication combinations for the current health condition. We employed the MIMIC-III dataset to evaluate our model against state-of-the-art methods in three aspects comprehensively. Compared to the sub-optimal baseline model, our model reduces safety risk by 15.08%, improves accuracy by 0.36%, and reduces training time consumption by 81.66%.

Submitted 27 November, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

arXiv:2308.14497 [pdf, other]

doi 10.1103/PhysRevE.108.L062101

Thermodynamic bounds on time-reversal asymmetry

Authors: Shiling Liang, Simone Pigolotti

Abstract: Quantifying irreversibility of a system using finite information constitutes a major challenge in stochastic thermodynamics. We introduce an observable that measures the time-reversal asymmetry between two states after a given time lag. Our central result is a bound on the time-reversal asymmetry in terms of the total cycle affinity driving the system out of equilibrium. This result leads to further thermodynamic bounds on the asymmetry of directed fluxes; on the asymmetry of finite-time cross-correlations; and on the cycle affinity of coarse-grained dynamics.

Submitted 16 November, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

Comments: 5 pages, 4 figures

Journal ref: Phys. Rev. E 108, L062101 (2023)

arXiv:2308.12740 [pdf, other]

Human Comprehensible Active Learning of Genome-Scale Metabolic Networks

Authors: Lun Ai, Shi-Shun Liang, Wang-Zhou Dai, Liam Hallett, Stephen H. Muggleton, Geoff S. Baldwin

Abstract: An important application of Synthetic Biology is the engineering of the host cell system to yield useful products. However, an increase in the scale of the host system leads to huge design space and requires a large number of validation trials with high experimental costs. A comprehensible machine learning approach that efficiently explores the hypothesis space and guides experimental design is urgently needed for the Design-Build-Test-Learn (DBTL) cycle of the host cell system. We introduce a novel machine learning framework ILP-iML1515 based on Inductive Logic Programming (ILP) that performs abductive logical reasoning and actively learns from training examples. In contrast to numerical models, ILP-iML1515 is built on comprehensible logical representations of a genome-scale metabolic model and can update the model by learning new logical structures from auxotrophic mutant trials. The ILP-iML1515 framework 1) allows high-throughput simulations and 2) actively selects experiments that reduce the experimental cost of learning gene functions in comparison to randomly selected experiments.

Submitted 31 August, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

Comments: Invited presentation for AAAI Spring Symposium Series 2023 on Computational Scientific Discovery

arXiv:2308.12427 [pdf, other]

Ultrastrong photon-photon coupling

Authors: Fuyang Tay, Ali Mojibpour, Stephen Sanders, Shuang Liang, Hongjing Xu, Geoff C. Gardner, Andrey Baydin, Michael J. Manfra, Alessandro Alabastri, David Hagenmüller, Junichiro Kono

Abstract: Recent studies have shown that matter can ultrastrongly couple with the quantum vacuum field inside a photonic cavity, producing a nonclassical ground state that contains a finite number of photons. Here, we present a novel matter-vacuum hybrid in a multimode photonic cavity whose ground state contains ultrastrongly coupled photons. This unique photon-photon coupling was realized in a three-dimensional terahertz photonic-crystal cavity, where two adjacent cavity modes mixed together through simultaneous coupling with the cyclotron resonance of a two-dimensional electron gas with a coupling strength exceeding the intermode frequency. Our microscopic theory successfully explains the salient features of our experimental observations, highlighting the spatial overlap of mode profiles as a key enabler of photon-photon ultrastrong coupling. Our findings provide guidelines for harnessing photon-photon correlations for furthering the physics of vacuum-dressed matter as well as for developing vacuum-enabled quantum technology.

Submitted 9 September, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

Comments: 50 pages, 12 figures

arXiv:2308.09555 [pdf, ps, other]

doi 10.1142/S0217732321502163

Connections between Weyl geometry, quantum potential and quantum entanglement

Authors: Shi-Dong Liang, Wenjing Huang

Abstract: The Weyl geometry promises potential applications in gravity and quantum mechanics. We study the relationships between the Weyl geometry, quantum entropy and quantum entanglement based on the Weyl geometry endowing the Euclidean metric. We give the formulation of the Weyl Ricci curvature and Weyl scalar curvature in the $n$-dimensional system. The Weyl scalar field plays a bridge role to connect the Weyl scalar curvature, quantum potential and quantum entanglement. We also give the Einstein-Weyl tensor and the generalized field equation in 3D vacuum case, which reveals the relationship between Weyl geometry and quantum potential. Particularly, we find that the correspondence between the Weyl scalar curvature and quantum potential is dimension-dependent and works only for the 3D space, which reveals a clue to quantize gravity and an understanding why our space must be 3D if quantum gravity is compatible with quantum mechanics. We analyze numerically a typical example of two orthogonal oscillators to reveal the relationships between the Weyl scalar curvature, quantum potential and quantum entanglement based on this formulation. We find that the Weyl scalar curvature shows a negative dip peak for separate state but becomes a positive peak for the entangled state near original point region, which can be regarded as a geometric signal to detect quantum entanglement.

Submitted 18 August, 2023; originally announced August 2023.

Comments: 16 pages, 4 figures

Journal ref: Modern Physics Letter A, 36, 30 (2021) 2150216

arXiv:2308.08561 [pdf]

doi 10.5281/zenodo.7983561

Implementation of The Future of Drug Discovery: QuantumBased Machine Learning Simulation (QMLS)

Authors: Yifan Zhou, Yew Kee Wong, Yan Shing Liang, Haichuan Qiu, Yu Xi Wu, Bin He

Abstract: The Research & Development (R&D) phase of drug development is a lengthy and costly process. To revolutionize this process, we introduce our new concept QMLS to shorten the whole R&D phase to three to six months and decrease the cost to merely fifty to eighty thousand USD. For Hit Generation, Machine Learning Molecule Generation (MLMG) generates possible hits according to the molecular structure of the target protein while the Quantum Simulation (QS) filters molecules from the primary essay based on the reaction and binding effectiveness with the target protein. Then, for Lead Optimization, the resultant molecules generated and filtered from MLMG and QS are compared, and molecules that appear as a result of both processes will be made into dozens of molecular variations through Machine Learning Molecule Variation (MLMV), while others will only be made into a few variations. Lastly, all optimized molecules would undergo multiple rounds of QS filtering with a high standard for reaction effectiveness and safety, creating a few dozen pre-clinical-trial-ready drugs. This paper is based on our first paper, where we pitched the concept of machine learning combined with quantum simulations. In this paper we will go over the detailed design and framework of QMLS, including MLMG, MLMV, and QS.

Submitted 25 October, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

Comments: 13 pages, 6 figures

Journal ref: International Journal of Computer Science and Mobile Applications, Vol 11 Issue 5,May- 2023

arXiv:2308.08344 [pdf, other]

Graph Out-of-Distribution Generalization with Controllable Data Augmentation

Authors: Bin Lu, Xiaoying Gan, Ze Zhao, Shiyu Liang, Luoyi Fu, Xinbing Wang, Chenghu Zhou

Abstract: Graph Neural Network (GNN) has demonstrated extraordinary performance in classifying graph properties. However, due to the selection bias of training and testing data (e.g., training on small graphs and testing on large graphs, or training on dense graphs and testing on sparse graphs), distribution deviation is widespread. More importantly, we often observe \emph{hybrid structure distribution shift} of both scale and density, despite one-sided biased data partition. The spurious correlations over hybrid distribution deviation degrade the performance of previous GNN methods and show large instability among different datasets. To alleviate this problem, we propose \texttt{OOD-GMixup} to jointly manipulate the training distribution with \emph{controllable data augmentation} in metric space. Specifically, we first extract the graph rationales to eliminate the spurious correlations due to irrelevant information. Secondly, we generate virtual samples with perturbation on graph rationale representation domain to obtain potential OOD training samples. Finally, we propose OOD calibration to measure the distribution deviation of virtual samples by leveraging Extreme Value Theory, and further actively control the training distribution by emphasizing the impact of virtual OOD samples. Extensive studies on several real-world datasets on graph classification demonstrate the superiority of our proposed method over state-of-the-art baselines.

Submitted 16 August, 2023; originally announced August 2023.

Comments: Under review

arXiv:2308.06869 [pdf, other]

Shape-Graph Matching Network (SGM-net): Registration for Statistical Shape Analysis

Authors: Shenyuan Liang, Mauricio Pamplona Segundo, Sathyanarayanan N. Aakur, Sudeep Sarkar, Anuj Srivastava

Abstract: This paper focuses on the statistical analysis of shapes of data objects called shape graphs, a set of nodes connected by articulated curves with arbitrary shapes. A critical need here is a constrained registration of points (nodes to nodes, edges to edges) across objects. This, in turn, requires optimization over the permutation group, made challenging by differences in nodes (in terms of numbers, locations) and edges (in terms of shapes, placements, and sizes) across objects. This paper tackles this registration problem using a novel neural-network architecture and involves an unsupervised loss function developed using the elastic shape metric for curves. This architecture results in (1) state-of-the-art matching performance and (2) an order of magnitude reduction in the computational cost relative to baseline approaches. We demonstrate the effectiveness of the proposed approach using both simulated data and real-world 2D and 3D shape graphs. Code and data will be made publicly available after review to foster research.

Submitted 13 August, 2023; originally announced August 2023.

arXiv:2308.05983

Face Encryption via Frequency-Restricted Identity-Agnostic Attacks

Authors: Xin Dong, Rui Wang, Siyuan Liang, Aishan Liu, Lihua Jing

Abstract: Billions of people are sharing their daily live images on social media every day. However, malicious collectors use deep face recognition systems to easily steal their biometric information (e.g., faces) from these images. Some studies are being conducted to generate encrypted face photos using adversarial attacks by introducing imperceptible perturbations to reduce face information leakage. However, existing studies need stronger black-box scenario feasibility and more natural visual appearances, which challenge the feasibility of privacy protection. To address these problems, we propose a frequency-restricted identity-agnostic (FRIA) framework to encrypt face images from unauthorized face recognition without access to personal information. As for the weak black-box scenario feasibility, we observe that representations of the average feature in multiple face recognition models are similar, thus we propose to utilize the average feature via the crawled dataset from the Internet as the target to guide the generation, which is also agnostic to identities of unknown face recognition systems; in nature, the low-frequency perturbations are more visually perceptible by the human vision system. Inspired by this, we restrict the perturbation in the low-frequency facial regions by discrete cosine transform to achieve the visual naturalness guarantee.
Extensive experiments on several face recognition models demonstrate that our FRIA outperforms other state-of-the-art methods in generating more natural encrypted faces while attaining high black-box attack success rates of 96%. In addition, we validate the efficacy of FRIA using real-world black-box commercial API, which reveals the potential of FRIA in practice. Our codes can be found in https://github.com/XinDong10/FRIA.

Submitted 24 August, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

Comments: I noticed something missing in the article's description in subsection 3.2, so I'd like to undo it and re-finalize and describe it

arXiv:2308.05961 [pdf, other]

Compositional Learning in Transformer-Based Human-Object Interaction Detection

Authors: Zikun Zhuang, Ruihao Qian, Chi Xie, Shuang Liang

Abstract: Human-object interaction (HOI) detection is an important part of understanding human activities and visual scenes. The long-tailed distribution of labeled instances is a primary challenge in HOI detection, promoting research in few-shot and zero-shot learning. Inspired by the combinatorial nature of HOI triplets, some existing approaches adopt the idea of compositional learning, in which object and action features are learned individually and re-composed as new training samples. However, these methods follow the CNN-based two-stage paradigm with limited feature extraction ability, and often rely on auxiliary information for better performance. Without introducing any additional information, we creatively propose a transformer-based framework for compositional HOI learning. Human-object pair representations and interaction representations are re-composed across different HOI instances, which involves richer contextual information and promotes the generalization of knowledge. Experiments show our simple but effective method achieves state-of-the-art performance, especially on rare HOI classes.

Submitted 11 August, 2023; originally announced August 2023.

arXiv:2308.05948 [pdf, other]

Uncertainty-Aware Cross-Modal Transfer Network for Sketch-Based 3D Shape Retrieval

Authors: Yiyang Cai, Jiaming Lu, Jiewen Wang, Shuang Liang

Abstract: In recent years, sketch-based 3D shape retrieval has attracted growing attention. While many previous studies have focused on cross-modal matching between hand-drawn sketches and 3D shapes, the critical issue of how to handle low-quality and noisy samples in sketch data has been largely neglected. This paper presents an uncertainty-aware cross-modal transfer network (UACTN) that addresses this issue. UACTN decouples the representation learning of sketches and 3D shapes into two separate tasks: classification-based sketch uncertainty learning and 3D shape feature transfer. We first introduce an end-to-end classification-based approach that simultaneously learns sketch features and uncertainty, allowing uncertainty to prevent overfitting noisy sketches by assigning different levels of importance to clean and noisy sketches. Then, 3D shape features are mapped into the pre-learned sketch embedding space for feature alignment. Extensive experiments and ablation studies on two benchmarks demonstrate the superiority of our proposed method compared to state-of-the-art methods.

Submitted 11 August, 2023; originally announced August 2023.

Comments: 6 pages, 7 figures; To be published in IEEE International Conference on Multimedia and Expo 2023

arXiv:2308.05771 [pdf, ps, other]

doi 10.1002/andp.202100520

Geometric criterion of topological phase transition for non-Hermitian systems

Authors: Annan Fan, Shi-Dong Liang

Abstract: We propose a geometric criterion of the topological phase transition for non-Hermitian systems. We define the length of the boundary of the bulk band in the complex energy plane for non-Hermitian systems. For one-dimensional systems, we find that the topological phase transition occurs when the derivatives of the length with respect to parameters are discontinuous. For two-dimensional systems, when the length is discontinuous, the topological phase transition between the gapped and gapless phases occurs. When the derivatives of the length with respect to parameters are discontinuous, the topological phase transition between the gapless and gapless phases occurs. These nonanalytic behaviors of the length in the complex energy plane provide a signal to detect the topological phase transitions. We demonstrate this geometric criterion by the one-dimensional non-Hermitian Su-Schrieffer-Heeger model and the two-dimensional non-Hermitian Chern insulator model. This geometric criterion provides an efficient insight to the global topological invariant from a geometric local object in the complex energy plane for non-Hermitian systems.

Submitted 10 August, 2023; originally announced August 2023.

Comments: 24 pages, 4 figures. arXiv admin note: text overlap with arXiv:2308.05329

Journal ref: Annalen der physik, 2100520 (2021)

arXiv:2308.05329 [pdf, ps, other]

doi 10.1007/s11467-021-1122-5

Topological invariants of complex energy plane in non-Hermitian systems

Authors: Annan Fan, Shi-Dong Liang

Abstract: Non-Hermitian systems as theoretical models of open or dissipative systems exhibit rich novel physical properties and fundamental issues in condensed matter physics. We propose a generalized local-global correspondence between the pseudo-boundary states in the complex energy plane and topological invariants of quantum states. We find that the patterns of the pseudo-boundary states in the complex energy plane mapped to the Brillouin zone are topological invariants against the parameter deformation. We demonstrate this approach by the non-Hermitian Chern insulator model. We give the consistent topological phases obtained from the Chern number and vorticity. We also find some novel topological invariants embedded in the topological phases of the Chern insulator model, which enrich the phase diagram of the non-Hermitian Chern insulators model beyond that predicted by the Chern number and vorticity. We also propose a generalized vorticity and its flipping index to understand physics behind this novel local-global correspondence and discuss the relationships between the local-global correspondence and the Chern number as well as the transformation between the Brillouin zone and the complex energy plane. These novel approaches provide insights to how topological invariants may be obtained from local information as well as the global property of quantum states, which is expected to be applicable in more generic non-Hermitian systems.

Submitted 10 August, 2023; originally announced August 2023.

Comments: 27 pages, 6 figures, 46 conferences

Journal ref: Front. Phys. 17(3), 33501 (2022)

arXiv:2308.03700 [pdf, other]

Half-Valley Ohmic Contact and Contact-Limited Valley-Contrasting Current Injection

Authors: Xukun Feng, Chit Siong Lau, Shi-Jun Liang, Ching Hua Lee, Shengyuan A. Yang, Yee Sin Ang

Abstract: Two-dimensional (2D) ferrovalley semiconductor (FVSC) with spontaneous valley polarization offers an exciting material platform for probing Berry phase physics. How FVSC can be incorporated in valleytronic device applications, however, remains an open question. Here we generalize the concept of metal/semiconductor (MS) contact into the realm of valleytronics. We propose a half-valley Ohmic contact based on FVSC/graphene heterostructure where the two valleys of FVSC separately form Ohmic and Schottky contacts with those of graphene, thus allowing current to be valley-selectively injected through the `Ohmic' valley while being blocked in the `Schottky' valley. We develop a theory of contact-limited valley-contrasting current injection and demonstrate that such transport mechanism can produce gate-tunable valley-polarized injection current. Using RuCl$_2$/graphene heterostructure as an example, we illustrate a device concept of valleytronic barristor where high valley polarization efficiency and sizable current on/off ratio can be achieved under experimentally feasible electrostatic gating conditions. These findings uncover contact-limited valley-contrasting current injection as an efficient mechanism for valley polarization manipulation, and reveal the potential of valleytronic MS contact as a functional building block of valleytronic device technology.

Submitted 9 August, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: 9 pages, 5 figures

arXiv:2308.00958 [pdf, other]

Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks

Authors: Jun Guo, Aishan Liu, Xingyu Zheng, Siyuan Liang, Yisong Xiao, Yichao Wu, Xianglong Liu

Abstract: Despite the broad application of Machine Learning models as a Service (MLaaS), they are vulnerable to model stealing attacks. These attacks can replicate the model functionality by using the black-box query process without any prior knowledge of the target victim model. Existing stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers. However, these defenses are now suffering problems of high inference computational overheads and unfavorable trade-offs between benign accuracy and stealing robustness, which challenges the feasibility of deployed models in practice. To address the problems, this paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses. Instead of deploying auxiliary defense modules that introduce redundant inference time, InI directly trains a defensive model by isolating the adversary's training gradient from the expected gradient, which can effectively reduce the inference computational cost. In contrast to adding perturbations over model predictions that harm the benign accuracy, we train models to produce uninformative outputs against stealing queries, which can induce the adversary to extract little useful knowledge from victim models with minimal impact on the benign performance. Extensive experiments on several visual classification datasets (e.g., MNIST and CIFAR10) demonstrate the superior robustness (up to 48% reduction on stealing accuracy) and speed (up to 25.4x faster) of our InI over other state-of-the-art methods.
Our codes can be found in https://github.com/DIG-Beihang/InI-Model-Stealing-Defense.

Submitted 3 August, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

Comments: Accepted by ACM Multimedia 2023

arXiv:2308.00122 [pdf, other]

DAVIS: High-Quality Audio-Visual Separation with Generative Diffusion Models

Authors: Chao Huang, Susan Liang, Yapeng Tian, Anurag Kumar, Chenliang Xu

Abstract: We propose DAVIS, a Diffusion model-based Audio-VIsual Separation framework that solves the audio-visual sound source separation task through a generative manner. While existing discriminative methods that perform mask regression have made remarkable progress in this field, they face limitations in capturing the complex data distribution required for high-quality separation of sounds from diverse categories. In contrast, DAVIS leverages a generative diffusion model and a Separation U-Net to synthesize separated magnitudes starting from Gaussian noises, conditioned on both the audio mixture and the visual footage. With its generative objective, DAVIS is better suited to achieving the goal of high-quality sound separation across diverse categories. We compare DAVIS to existing state-of-the-art discriminative audio-visual separation methods on the domain-specific MUSIC dataset and the open-domain AVE dataset, and results show that DAVIS outperforms other methods in separation quality, demonstrating the advantages of our framework for tackling the audio-visual source separation task.

Submitted 31 July, 2023; originally announced August 2023.

arXiv:2307.16789 [pdf, other]

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs

Authors: Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, Sihan Zhao, Lauren Hong, Runchu Tian, Ruobing Xie, Jie Zhou, Mark Gerstein, Dahai Li, Zhiyuan Liu, Maosong Sun

Abstract: Despite the advancements of open-source large language models (LLMs), e.g., LLaMA, they remain significantly limited in tool-use capabilities, i.e., using external tools (APIs) to fulfill human instructions. The reason is that current instruction tuning largely focuses on basic language tasks but ignores the tool-use domain. This is in contrast to the excellent tool-use capabilities of state-of-the-art (SOTA) closed-source LLMs, e.g., ChatGPT. To bridge this gap, we introduce ToolLLM, a general tool-use framework encompassing data construction, model training, and evaluation. We first present ToolBench, an instruction-tuning dataset for tool use, which is constructed automatically using ChatGPT. Specifically, the construction can be divided into three stages: (i) API collection: we collect 16,464 real-world RESTful APIs spanning 49 categories from RapidAPI Hub; (ii) instruction generation: we prompt ChatGPT to generate diverse instructions involving these APIs, covering both single-tool and multi-tool scenarios; (iii) solution path annotation: we use ChatGPT to search for a valid solution path (chain of API calls) for each instruction. To enhance the reasoning capabilities of LLMs, we develop a novel depth-first search-based decision tree algorithm. It enables LLMs to evaluate multiple reasoning traces and expand the search space. Moreover, to evaluate the tool-use capabilities of LLMs, we develop an automatic evaluator: ToolEval.
Based on ToolBench, we fine-tune LLaMA to obtain an LLM ToolLLaMA, and equip it with a neural API retriever to recommend appropriate APIs for each instruction. Experiments show that ToolLLaMA demonstrates a remarkable ability to execute complex instructions and generalize to unseen APIs, and exhibits comparable performance to ChatGPT. Our ToolLLaMA also demonstrates strong zero-shot generalization ability in an out-of-distribution tool-use dataset: APIBench.

Submitted 3 October, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

arXiv:2307.15504 [pdf, other]

Exploring Format Consistency for Instruction Tuning

Authors: Shihao Liang, Runchu Tian, Kunlun Zhu, Yujia Qin, Huadong Wang, Xin Cong, Zhiyuan Liu, Xiaojiang Liu, Maosong Sun

Abstract: Instruction tuning has emerged as a promising approach to enhancing large language models in following human instructions. It is shown that increasing the diversity and number of instructions in the training data can consistently enhance generalization performance, which facilitates a recent endeavor to collect various instructions and integrate existing instruction tuning datasets into larger collections. However, different users have their unique ways of expressing instructions, and there often exist variations across different datasets in the instruction styles and formats, i.e., format inconsistency. In this work, we propose a framework named Unified Instruction Tuning (UIT), which calls OpenAI APIs for automatic format transfer among different instruction tuning datasets such as PromptSource, FLAN and CrossFit. With the framework, we (1) demonstrate the necessity of maintaining format consistency in instruction tuning; (2) improve the generalization performance on unseen instructions on T5-LM-xl; (3) provide a novel perplexity-based denoising method to reduce the noise of automatic format transfer to make the UIT framework more practical and a smaller offline model based on GPT-J that achieves comparable format transfer capability to OpenAI APIs to reduce costs in practice. Further analysis regarding variations of targeted formats and other effects is intended.

Submitted 8 January, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

arXiv:2307.13634 [pdf, other]

Exact Methods of Homogeneity Test of Proportions for Bilateral and Unilateral Correlated Data

Authors: Shuyi Liang, Chang-Xing Ma

Abstract: Subjects in clinical studies that investigate paired body parts can carry a disease on either both sides (bilateral) or a single side (unilateral) of the organs. Data in such studies may consist of both bilateral and unilateral records. However, the correlation between the paired organs is often ignored, which may lead to biased interpretations. Recent literature has taken the correlation into account. For example, Ma and Wang (2021) proposed three asymptotic procedures for testing the homogeneity of proportions of multiple groups using combined bilateral and unilateral data and recommended the score test. It is of importance to notice that the asymptotic behavior is not guaranteed if the sample size is small, resulting in uncontrolled type I error rates. In this paper, we extend their work by considering exact approaches and compare these methods with the score test proposed by Ma and Wang (2021) in terms of type I errors and statistical powers. Additionally, two real-world examples are used to illustrate the application of the proposed approaches.

Submitted 25 July, 2023; originally announced July 2023.

arXiv:2307.12813 [pdf, other]

Described Object Detection: Liberating Object Detection with Flexible Expressions

Authors: Chi Xie, Zhao Zhang, Yixuan Wu, Feng Zhu, Rui Zhao, Shuang Liang

Abstract: Detecting objects based on language information is a popular task that includes Open-Vocabulary object Detection (OVD) and Referring Expression Comprehension (REC). In this paper, we advance them to a more practical setting called Described Object Detection (DOD) by expanding category names to flexible language expressions for OVD and overcoming the limitation of REC only grounding the pre-existing object. We establish the research foundation for DOD by constructing a Description Detection Dataset ($D^3$). This dataset features flexible language expressions, whether short category names or long descriptions, and annotating all described objects on all images without omission. By evaluating previous SOTA methods on $D^3$, we find some troublemakers that fail current REC, OVD, and bi-functional methods. REC methods struggle with confidence scores, rejecting negative instances, and multi-target scenarios, while OVD methods face constraints with long and complex descriptions. Recent bi-functional methods also do not work well on DOD due to their separated training procedures and inference strategies for REC and OVD tasks. Building upon the aforementioned findings, we propose a baseline that largely improves REC methods by reconstructing the training data and introducing a binary classification sub-task, outperforming existing methods. Data and code are available at https://github.com/shikras/d-cube and related works are tracked in https://github.com/Charles-Xie/awesome-described-object-detection.

Submitted 11 October, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

Comments: Accepted by NeurIPS 2023

arXiv:2307.11324 [pdf, other]

doi 10.1103/PhysRevB.108.235143

Stochastic pole expansion method

Authors: Li Huang, Shuang Liang

Abstract: In this paper, we propose a new analytic continuation method to extract real frequency spectral functions from imaginary frequency Green's functions of quantum many-body systems. This method is based on the pole representation of Matsubara Green's function and a stochastic sampling procedure is utilized to optimize the amplitudes and locations of poles. In order to capture narrow peaks and sharp band edges in the spectral functions, a constrained sampling algorithm and a self-adaptive sampling algorithm are developed. To demonstrate the usefulness and performance of the new method, we at first apply it to study the spectral functions of representative fermionic and bosonic correlators. Then we employ this method to tackle the analytic continuation problems of matrix-valued Green's functions. The synthetic Green's functions, as well as realistic correlation functions from finite temperature quantum many-body calculations, are used as input. The benchmark results demonstrate that this method is capable of reproducing most of the key characteristics in the spectral functions. The sharp, smooth, and multi-peak features in both low-frequency and high-frequency regions of spectral functions could be accurately resolved, which overcomes one of the main limitations of the traditional maximum entropy method. More importantly, it exhibits excellent robustness with respect to noisy and incomplete input data. The causality of spectral function is always satisfied even in the presence of sizable noises.
As a byproduct, this method could derive a fitting formula for the Matsubara data, which provides a compact approximation to the many-body Green's functions. Hence, we expect that this new method could become a pivotal workhorse for numerically analytic continuation and be broadly useful in many applications.

Submitted 20 July, 2023; originally announced July 2023.

Comments: 26 pages, 20 figures

Journal ref: Phys. Rev. B 108, 235143 (2023)

arXiv:2307.08991 [pdf, other]

EgoVM: Achieving Precise Ego-Localization using Lightweight Vectorized Maps

Authors: Yuzhe He, Shuang Liang, Xiaofei Rui, Chengying Cai, Guowei Wan

Abstract: Accurate and reliable ego-localization is critical for autonomous driving. In this paper, we present EgoVM, an end-to-end localization network that achieves comparable localization accuracy to prior state-of-the-art methods, but uses lightweight vectorized maps instead of heavy point-based maps. To begin with, we extract BEV features from online multi-view images and LiDAR point cloud. Then, we employ a set of learnable semantic embeddings to encode the semantic types of map elements and supervise them with semantic segmentation, to make their feature representation consistent with BEV features. After that, we feed map queries, composed of learnable semantic embeddings and coordinates of map elements, into a transformer decoder to perform cross-modality matching with BEV features. Finally, we adopt a robust histogram-based pose solver to estimate the optimal pose by searching exhaustively over candidate poses. We comprehensively validate the effectiveness of our method using both the nuScenes dataset and a newly collected dataset. The experimental results show that our method achieves centimeter-level localization accuracy, and outperforms existing methods using vectorized maps by a large margin. Furthermore, our model has been extensively tested in a large fleet of autonomous vehicles under various challenging urban scenes.

Submitted 18 July, 2023; originally announced July 2023.

Comments: 8 pages

Showing 101–150 of 565 results for author: Liang, S