Search | arXiv e-print repository

UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks

Authors: Jingjing Ren, Wenbo Li, Haoyu Chen, Renjing Pei, Bin Shao, Yong Guo, Long Peng, Fenglong Song, Lei Zhu

Abstract: Ultra-high-resolution image generation poses great challenges, such as increased semantic planning complexity and detail synthesis difficulties, alongside substantial training resource demands. We present UltraPixel, a novel architecture utilizing cascade diffusion models to generate high-quality images at multiple resolutions (e.g., 1K to 6K) within a single model, while maintaining computational efficiency. UltraPixel leverages semantics-rich representations of lower-resolution images in the later denoising stage to guide the whole generation of highly detailed high-resolution images, significantly reducing complexity. Furthermore, we introduce implicit neural representations for continuous upsampling and scale-aware normalization layers adaptable to various resolutions. Notably, both low- and high-resolution processes are performed in the most compact space, sharing the majority of parameters with less than 3% additional parameters for high-resolution outputs, largely enhancing training and inference efficiency. Our model achieves fast training with reduced data requirements, producing photo-realistic high-resolution images and demonstrating state-of-the-art performance in extensive experiments.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.17764 [pdf, other]

BMIKE-53: Investigating Cross-Lingual Knowledge Editing with In-Context Learning

Authors: Ercong Nie, Bo Shao, Zifeng Ding, Mingyang Wang, Helmut Schmid, Hinrich Schütze

Abstract: Large language models (LLMs) possess extensive parametric knowledge, but this knowledge is difficult to update with new information because retraining is very expensive and infeasible for closed-source models. Knowledge editing (KE) has emerged as a viable solution for updating the knowledge of LLMs without compromising their overall performance. On-the-fly KE methods, inspired by in-context learning (ICL), have shown great promise and allow LLMs to be treated as black boxes. In the past, KE was primarily employed in English contexts, whereas the potential for cross-lingual KE in current English-centric LLMs has not been fully explored. To foster more research in this direction, we introduce the BMIKE-53 benchmark for evaluating cross-lingual KE on 53 diverse languages across three KE task types. We also propose a gradient-free KE method called Multilingual In-context Knowledge Editing (MIKE) and evaluate it on BMIKE-53. Our evaluation focuses on cross-lingual knowledge transfer in terms of reliability, generality, locality, and portability, offering valuable insights and a framework for future research in cross-lingual KE. Our code and data are publicly accessible via the anonymous repository at https://anonymous.4open.science/r/MIKE.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 12 pages, 4 figures

arXiv:2406.13007 [pdf, other]

NTIRE 2024 Challenge on Night Photography Rendering

Authors: Egor Ershov, Artyom Panshin, Oleg Karasev, Sergey Korchagin, Shepelev Lev, Alexandr Startsev, Daniil Vladimirov, Ekaterina Zaychenkova, Nikola Banić, Dmitrii Iarchuk, Maria Efimova, Radu Timofte, Arseniy Terekhin, Shuwei Yue, Yuyang Liu, Minchen Wei, Lu Xu, Chao Zhang, Yasi Wang, Furkan Kınlı, Doğa Yılmaz, Barış Özcan, Furkan Kıraç, Shuai Liu, Jingyuan Xiao, et al. (25 additional authors not shown)

Abstract: This paper presents a review of the NTIRE 2024 challenge on night photography rendering. The goal of the challenge was to find solutions that process raw camera images taken in nighttime conditions and thereby produce photo-quality output images in the standard RGB (sRGB) space. Unlike the previous year's competition, the challenge images were collected with a mobile phone, and the speed of algorithms was also measured alongside the quality of their output. To evaluate the results, a sufficient number of viewers were asked to assess the visual quality of the proposed solutions, considering the subjective nature of the task. There were 2 nominations: quality and efficiency. The top 5 solutions in terms of output quality were sorted by evaluation time (see Fig. 1). The top-ranking participants' solutions effectively represent the state-of-the-art in nighttime photography rendering. More results can be found at https://nightimaging.org.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 10 pages, 10 figures

arXiv:2406.11185 [pdf, other]

Acceleration without Disruption: DFT Software as a Service

Authors: Fusong Ju, Xinran Wei, Lin Huang, Andrew J. Jenkins, Leo Xia, Jia Zhang, Jianwei Zhu, Han Yang, Bin Shao, Peggy Dai, Ashwin Mayya, Zahra Hooshmand, Alexandra Efimovskaya, Nathan A. Baker, Matthias Troyer, Hongbin Liu

Abstract: Density functional theory (DFT) has been a cornerstone in computational chemistry, physics, and materials science for decades, benefiting from advancements in computational power and theoretical methods. This paper introduces a novel, cloud-native application, Accelerated DFT, which offers an order of magnitude acceleration in DFT simulations. By integrating state-of-the-art cloud infrastructure and redesigning algorithms for graphics processing units (GPUs), Accelerated DFT achieves high-speed calculations without sacrificing accuracy. It provides an accessible and scalable solution for the increasing demands of DFT calculations in scientific communities. The implementation details, examples, and benchmark results illustrate how Accelerated DFT can significantly expedite scientific discovery across various domains.

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2406.03794 [pdf, other]

Infusing Self-Consistency into Density Functional Theory Hamiltonian Prediction via Deep Equilibrium Models

Authors: Zun Wang, Chang Liu, Nianlong Zou, He Zhang, Xinran Wei, Lin Huang, Lijun Wu, Bin Shao

Abstract: In this study, we introduce a unified neural network architecture, the Deep Equilibrium Density Functional Theory Hamiltonian (DEQH) model, which incorporates Deep Equilibrium Models (DEQs) for predicting Density Functional Theory (DFT) Hamiltonians. The DEQH model inherently captures the self-consistent nature of the Hamiltonian, a critical aspect often overlooked by traditional machine learning approaches for Hamiltonian prediction. By employing DEQ within our model architecture, we circumvent the need for DFT calculations during the training phase to introduce the Hamiltonian's self-consistency, thus addressing computational bottlenecks associated with large or complex systems. We propose a versatile framework that combines DEQ with off-the-shelf machine learning models for predicting Hamiltonians. When benchmarked on the MD17 and QH9 datasets, DEQHNet, an instantiation of the DEQH framework, has demonstrated a significant improvement in prediction accuracy. Beyond a predictor, the DEQH model is a Hamiltonian solver, in the sense that it uses the fixed-point solving capability of the deep equilibrium model to iteratively solve for the Hamiltonian. Ablation studies of DEQHNet further elucidate the network's effectiveness, offering insights into the potential of DEQ-integrated networks for Hamiltonian learning.
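
Illustrative note (not from the paper): the "fixed-point solving" idea mentioned in the abstract above can be sketched with a toy example. The scalar map `toy_layer` below is a hypothetical stand-in for a learned network layer, and plain repeated application is an assumed solver chosen for simplicity; the abstract does not say which solver DEQHNet uses.

```python
def solve_fixed_point(f, x0, tol=1e-10, max_iter=1000):
    """Iterate x <- f(x) until successive iterates differ by less than tol."""
    x = x0
    for step in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, step
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


def toy_layer(x):
    # Hypothetical contraction standing in for a network layer;
    # its unique fixed point is x* = 2, since 0.5 * 2 + 1 = 2.
    return 0.5 * x + 1.0


x_star, steps = solve_fixed_point(toy_layer, x0=0.0)
```

In this sketch the returned `x_star` satisfies `toy_layer(x_star) ≈ x_star`, which is the self-consistency condition in miniature: the output of the map, fed back as input, reproduces itself.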

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.16511 [pdf, other]

SE3Set: Harnessing equivariant hypergraph neural networks for molecular representation learning

Authors: Hongfei Wu, Lijun Wu, Guoqing Liu, Zhirong Liu, Bin Shao, Zun Wang

Abstract: In this paper, we develop SE3Set, an SE(3) equivariant hypergraph neural network architecture tailored for advanced molecular representation learning. Hypergraphs are not merely an extension of traditional graphs; they are pivotal for modeling high-order relationships, a capability that conventional equivariant graph-based methods lack due to their inherent limitations in representing intricate many-body interactions. To achieve this, we first construct hypergraphs by proposing a new fragmentation method that considers both chemical and three-dimensional spatial information of the molecular system. We then design SE3Set, which incorporates equivariance into the hypergraph neural network. This ensures that the learned molecular representations are invariant to spatial transformations, thereby providing robustness essential for accurate prediction of molecular properties. SE3Set has shown performance on par with state-of-the-art (SOTA) models for small molecule datasets like QM9 and MD17. It excels on the MD22 dataset, achieving a notable improvement of approximately 20% in accuracy across all molecules, which highlights the prevalence of complex many-body interactions in larger molecules. This exceptional performance of SE3Set across diverse molecular structures underscores its transformative potential in computational chemistry, offering a route to more accurate and physically nuanced modeling.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.00751 [pdf, other]

F$^3$low: Frame-to-Frame Coarse-grained Molecular Dynamics with SE(3) Guided Flow Matching

Authors: Shaoning Li, Yusong Wang, Mingyu Li, Jian Zhang, Bin Shao, Nanning Zheng, Jian Tang

Abstract: Molecular dynamics (MD) is a crucial technique for simulating biological systems, enabling the exploration of their dynamic nature and fostering an understanding of their functions and properties. To address exploration inefficiency, emerging enhanced sampling approaches like coarse-graining (CG) and generative models have been employed. In this work, we propose a Frame-to-Frame generative model with guided Flow-matching (F$^3$low) for enhanced sampling, which (a) extends the domain of CG modeling to the SE(3) Riemannian manifold; (b) recasts CGMD simulation as autoregressive sampling guided by the previous frame via flow-matching models; (c) targets the protein backbone, offering improved insights into secondary structure formation and intricate folding pathways. Compared to previous methods, F$^3$low allows for broader exploration of conformational space. The ability to rapidly generate diverse conformations via a force-free generative paradigm on SE(3) paves the way toward efficient enhanced sampling methods.

Submitted 1 May, 2024; originally announced May 2024.

Comments: Accepted by ICLR 2024 GEM workshop

arXiv:2404.14248 [pdf, other]

NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results

Authors: Xiaoning Liu, Zongwei Wu, Ao Li, Florin-Alexandru Vasluianu, Yulun Zhang, Shuhang Gu, Le Zhang, Ce Zhu, Radu Timofte, Zhi Jin, Hongjun Wu, Chenxi Wang, Haitao Ling, Yuanhao Cai, Hao Bian, Yuxin Zheng, Jing Lin, Alan Yuille, Ben Shao, Jin Guo, Tianli Liu, Mohao Wu, Yixu Feng, Shuo Hou, Haotian Lin, et al. (87 additional authors not shown)

Abstract: This paper reviews the NTIRE 2024 low light image enhancement challenge, highlighting the proposed solutions and results. The aim of this challenge is to discover an effective network design or solution capable of generating brighter, clearer, and visually appealing results when dealing with a variety of conditions, including ultra-high resolution (4K and beyond), non-uniform illumination, backlighting, extreme darkness, and night scenes. A notable total of 428 participants registered for the challenge, with 22 teams ultimately making valid submissions. This paper meticulously evaluates the state-of-the-art advancements in enhancing low-light images, reflecting the significant progress and creativity in this field.

Submitted 22 April, 2024; originally announced April 2024.

Comments: NTIRE 2024 Challenge Report

arXiv:2404.11722 [pdf, other]

Beyond the Bid-Ask: Strategic Insights into Spread Prediction and the Global Mid-Price Phenomenon

Authors: Yifan He, Abootaleb Shirvani, Barret Shao, Svetlozar Rachev, Frank Fabozzi

Abstract: This study introduces novel concepts in the analysis of limit order books (LOBs) with a focus on unveiling strategic insights into spread prediction and understanding the global mid-price (GMP) phenomenon. We define and analyze the total market order book bid-ask spread (TMOBBAS) and GMP, showcasing their significance in providing a deeper understanding of market dynamics beyond traditional LOB models. Employing high-frequency data, we comprehensively examine these concepts through various methodological lenses, including tail behavior analysis, dynamics of log-returns, and risk-return performance evaluation. Our findings reveal the intricate behavior of TMOBBAS and GMP under different market conditions, offering new perspectives on the liquidity, volatility, and efficiency of markets. This paper not only contributes to the academic discourse on financial markets but also presents practical implications for traders, risk managers, and policymakers seeking to navigate the complexities of modern financial systems.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 54 pages, 45 figures

arXiv:2403.19940 [pdf, other]

MoMa-Pos: Where Should Mobile Manipulators Stand in Cluttered Environment Before Task Execution?

Authors: Beichen Shao, Yan Ding, Xingchen Wang, Xuefeng Xie, Fuqiang Gu, Jun Luo, Chao Chen

Abstract: Mobile manipulators always need to determine feasible base positions prior to carrying out navigation-manipulation tasks. Real-world environments are often cluttered with various furniture, obstacles, and dozens of other objects. Efficiently computing base positions poses a challenge. In this work, we introduce a framework named MoMa-Pos to address this issue. MoMa-Pos first uses a graph embedding architecture to learn to predict a small set of objects that, taken together, would be sufficient for finding base positions. MoMa-Pos then calculates standing positions by considering furniture structures, robot models, and obstacles comprehensively. We have extensively evaluated the proposed MoMa-Pos across different settings (e.g., environment and algorithm parameters) and with various mobile manipulators. Our empirical results show that MoMa-Pos demonstrates remarkable effectiveness and efficiency in its performance, surpassing the methods in the literature. Supplementary material can be found at https://yding25.com/MoMa-Pos.

Submitted 28 March, 2024; originally announced March 2024.

Comments: Submitted to IROS 2024

arXiv:2403.09560 [pdf, other]

Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction

Authors: He Zhang, Chang Liu, Zun Wang, Xinran Wei, Siyuan Liu, Nanning Zheng, Bin Shao, Tie-Yan Liu

Abstract: Predicting the mean-field Hamiltonian matrix in density functional theory is a fundamental formulation to leverage machine learning for solving molecular science problems. Yet, its applicability is limited by insufficient labeled data for training. In this work, we highlight that Hamiltonian prediction possesses a self-consistency principle, based on which we propose self-consistency training, an exact training method that does not require labeled data. It distinguishes the task from predicting other molecular properties by the following benefits: (1) it enables the model to be trained on a large amount of unlabeled data, hence addresses the data scarcity challenge and enhances generalization; (2) it is more efficient than running DFT to generate labels for supervised training, since it amortizes DFT calculation over a set of queries. We empirically demonstrate the better generalization in data-scarce and out-of-distribution scenarios, and the better efficiency over DFT labeling. These benefits push forward the applicability of Hamiltonian prediction to an ever-larger scale.

Submitted 5 June, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

Comments: Accepted by ICML 2024

arXiv:2403.01698 [pdf, other]

Hypertext Entity Extraction in Webpage

Authors: Yifei Yang, Tianqiao Liu, Bo Shao, Hai Zhao, Linjun Shou, Ming Gong, Daxin Jiang

Abstract: Webpage entity extraction is a fundamental natural language processing task in both research and applications. Nowadays, the majority of webpage entity extraction models are trained on structured datasets which strive to retain textual content and its structure information. However, existing datasets all overlook the rich hypertext features (e.g., font color, font size) which have shown their effectiveness in previous works. To this end, we first collect a Hypertext Entity Extraction Dataset (HEED) from the e-commerce domains, scraping both the text and the corresponding explicit hypertext features with high-quality manual entity annotations. Furthermore, we present the MoE-based Entity Extraction Framework (MoEEF), which efficiently integrates multiple features to enhance model performance by Mixture of Experts and outperforms strong baselines, including the state-of-the-art small-scale models and GPT-3.5-turbo. Moreover, the effectiveness of hypertext features in HEED and several model components in MoEEF are analyzed.

Submitted 3 March, 2024; originally announced March 2024.

arXiv:2403.00298 [pdf, other]

Multiple Classical Noise Mitigation by Multiobjective Robust Quantum Optimal Control

Authors: Bowen Shao, Xiaodong Yang, Ran Liu, Yue Zhai, Dawei Lu, Tao Xin, Jun Li

Abstract: High-quality control is a fundamental requirement for quantum computation, but in practice it is often hampered by the presence of various types of noise, which can be static or time-dependent. In many realistic scenarios, multiple noise sources coexist, and their resulting noise effects need to be corrected to a sufficient order, posing significant challenges for the design of effective robust control methods. Here, we explore the method of robust quantum optimal control to generally tackle the problem of resisting multiple noises from a complicated noise environment. Specifically, we confine our analysis to unitary noises that can be described by classical noise models. This method employs a gradient-based multiobjective optimization algorithm to maximize the control figure of merit while minimizing the perturbative effects of the noises that are allowed for. To verify its effectiveness, we apply this method to a number of examples, including a robust entangling gate in a trapped-ion system and a robust controlled-Z gate in superconducting qubits, under commonly encountered static and time-dependent noises. Our simulation results reveal that robust optimal control can find smooth, robust pulses that can simultaneously resist several noises and thus achieve high-fidelity gates. Therefore, we expect that this method will find wide applications on current noisy quantum computing devices.

Submitted 1 March, 2024; originally announced March 2024.

Comments: 13 pages, 5 figures, accepted by Physical Review Applied

arXiv:2401.07532 [pdf, other]

Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation

Authors: Zhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, Jing Yang, Yaolong Ju, Fan Fan, Shiyin Kang, Zhiyong Wu, Helen Meng

Abstract: Variational Autoencoders (VAEs) constitute a crucial component of neural symbolic music generation, among which some works have yielded outstanding results and attracted considerable attention. Nevertheless, previous VAEs still encounter issues with overly long feature sequences and generated results that lack contextual coherence; thus, the challenge of modeling long multi-track symbolic music remains unaddressed. To this end, we propose Multi-view MidiVAE, as one of the pioneers in VAE methods that effectively model and generate long multi-track symbolic music. The Multi-view MidiVAE utilizes the two-dimensional (2-D) representation, OctupleMIDI, to capture relationships among notes while reducing the feature sequence length. Moreover, we focus on instrumental characteristics and harmony as well as global and local information about the musical composition by employing a hybrid variational encoding-decoding strategy to integrate both Track- and Bar-view MidiVAE features. Objective and subjective experimental results on the CocoChorales dataset demonstrate that, compared to the baseline, Multi-view MidiVAE exhibits significant improvements in terms of modeling long multi-track symbolic music.

Submitted 15 January, 2024; originally announced January 2024.

Comments: Accepted by ICASSP 2024

arXiv:2312.11218 [pdf, other]

Decoupled Knowledge with Ensemble Learning for Online Distillation

Authors: Baitan Shao, Ying Chen

Abstract: Offline distillation is a two-stage pipeline that requires expensive resources to train a teacher network and then distill the knowledge to a student for deployment. Online knowledge distillation, on the other hand, is a one-stage strategy that alleviates the requirement with mutual learning and collaborative learning. Recent peer collaborative learning (PCL) integrates online ensemble, collaboration of base networks and temporal mean teacher to construct effective knowledge. However, the model collapses occasionally in PCL due to high homogenization between the student and the teacher. In this paper, the cause of the high homogenization is analyzed and the solution is presented. A decoupled knowledge for online knowledge distillation is generated by an independent teacher, separate from the student. Such a design can increase the diversity between the networks and reduce the possibility of model collapse. To obtain early decoupled knowledge, an initialization scheme for the teacher is devised, and a 2D geometry-based analysis experiment is conducted under ideal conditions to showcase the effectiveness of this scheme. Moreover, to improve the teacher's supervisory resilience, a decaying ensemble scheme is devised. It assembles the teacher's knowledge with a dynamic weight that is large at the start of training and gradually decreases as training proceeds. The assembled knowledge serves as a strong teacher during early training, while the knowledge assembled with decreased weight can eliminate the distribution deviation that arises under a potentially overfitted teacher's supervision. A Monte Carlo-based simulation is conducted to evaluate the convergence. Extensive experiments on CIFAR-10, CIFAR-100 and TinyImageNet show the superiority of our method. Ablation studies and further analysis demonstrate the effectiveness.

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.04919 [pdf, other]

Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion

Authors: Binzhu Sha, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng

Abstract: Any-to-any singing voice conversion (SVC) is confronted with the "timbre leakage" issue caused by inadequate disentanglement between the content and the speaker timbre. To address this issue, this study introduces NeuCoSVC, a novel neural concatenative SVC framework. It consists of a self-supervised learning (SSL) representation extractor, a neural harmonic signal generator, and a waveform synthesizer. The SSL extractor condenses audio into fixed-dimensional SSL features, while the harmonic signal generator leverages linear time-varying filters to produce both raw and filtered harmonic signals for pitch information. The synthesizer reconstructs waveforms using SSL features, harmonic signals, and loudness information. During inference, voice conversion is performed by substituting source SSL features with their nearest counterparts from a matching pool which comprises SSL features extracted from the reference audio, while preserving raw harmonic signals and loudness from the source audio. By directly utilizing SSL features from the reference audio, the proposed framework effectively resolves the "timbre leakage" issue caused by previous disentanglement-based approaches. Experimental results demonstrate that the proposed NeuCoSVC system outperforms the disentanglement-based speaker embedding approach in one-shot SVC across intra-language, cross-language, and cross-domain evaluations.
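
Illustrative note (not from the paper): the inference step described in the abstract above, replacing each source feature with its nearest counterpart from a matching pool, can be sketched as a nearest-neighbour lookup. The function names, the toy 2-D vectors, the Euclidean distance, and the use of a single nearest neighbour are all assumptions made for illustration; the abstract does not specify the distance metric or how many neighbours are used.

```python
import math


def nearest_feature(query, pool):
    """Return the pool vector closest to `query` (Euclidean distance, an assumed metric)."""
    return min(pool, key=lambda ref: math.dist(query, ref))


def substitute_features(source_feats, matching_pool):
    """Replace every source frame feature with its nearest match from the reference pool."""
    return [nearest_feature(frame, matching_pool) for frame in source_feats]


# Toy data: two source frames and a three-entry pool built from "reference audio".
source_feats = [[0.0, 0.1], [0.9, 1.0]]
matching_pool = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]

converted = substitute_features(source_feats, matching_pool)
```

Because every output vector is drawn from the reference pool, nothing of the source speaker's feature vectors survives the substitution, which is the intuition behind the abstract's claim about avoiding timbre leakage.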

Submitted 8 January, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

arXiv:2309.16578 [pdf, other]

doi 10.1038/s43588-024-00605-8

Overcoming the Barrier of Orbital-Free Density Functional Theory for Molecular Systems Using Deep Learning

Authors: He Zhang, Siyuan Liu, Jiacheng You, Chang Liu, Shuxin Zheng, Ziheng Lu, Tong Wang, Nanning Zheng, Bin Shao

Abstract: Orbital-free density functional theory (OFDFT) is a quantum chemistry formulation that has a lower cost scaling than the prevailing Kohn-Sham DFT, which is increasingly desired for contemporary molecular research. However, its accuracy is limited by the kinetic energy density functional, which is notoriously hard to approximate for non-periodic molecular systems. Here we propose M-OFDFT, an OFDFT approach capable of solving molecular systems using a deep learning functional model. We build the essential non-locality into the model, which is made affordable by the concise density representation as expansion coefficients under an atomic basis. With techniques to address unconventional learning challenges therein, M-OFDFT achieves a comparable accuracy with Kohn-Sham DFT on a wide range of molecules untouched by OFDFT before. More attractively, M-OFDFT extrapolates well to molecules much larger than those seen in training, which unleashes the appealing scaling of OFDFT for studying large molecules including proteins, representing an advancement of the accuracy-efficiency trade-off frontier in quantum chemistry.

Submitted 9 March, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: Published in Nature Computational Science, March 2024. Full paper with supplementary information

arXiv:2308.04686 [pdf, other]

Orthogonality catastrophe and quantum speed limit for dynamical quantum phase transition

Authors: Zheng-Rong Zhu, Bin Shao, Jian Zou, Lian-Ao Wu

Abstract: We investigate the orthogonality catastrophe and quantum speed limit in the Creutz model for dynamical quantum phase transitions. We demonstrate that exact zeros of the Loschmidt echo can exist in finite-size systems for specific discrete values. We highlight the role of the zero-energy mode when analyzing quench dynamics near the critical point. We also examine the behavior of the time for the first exact zeros of the Loschmidt echo and the corresponding quantum speed limit time as the system size increases. While the bound is not tight, it can be attributed to the scaling properties of the band gap and energy variance with respect to system size. As such, we establish a relation between the orthogonality catastrophe and quantum speed limit by referencing the full form of the Loschmidt echo. Significantly, we find the possibility of using the quantum speed limit to detect the critical point of a static quantum phase transition, along with a decrease in the amplitude of the noise-induced quantum speed limit.

Submitted 22 September, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

Comments: 10 pages, 8 figures

arXiv:2304.13542 [pdf, other]

Long-Short-Range Message-Passing: A Physics-Informed Framework to Capture Non-Local Interaction for Scalable Molecular Dynamics Simulation

Authors: Yunyang Li, Yusong Wang, Lin Huang, Han Yang, Xinran Wei, Jia Zhang, Tong Wang, Zun Wang, Bin Shao, Tie-Yan Liu

Abstract: Computational simulation of chemical and biological systems using ab initio molecular dynamics has been a challenge for decades. Researchers have attempted to address the problem with machine learning and fragmentation-based methods. However, the two approaches fail to give a satisfactory description of long-range and many-body interactions, respectively. Inspired by fragmentation-based methods, we propose the Long-Short-Range Message-Passing (LSR-MP) framework as a generalization of the existing equivariant graph neural networks (EGNNs) with the intent to incorporate long-range interactions efficiently and effectively. We apply the LSR-MP framework to the recently proposed ViSNet and demonstrate state-of-the-art results with up to $40\%$ error reduction for molecules in the MD22 and Chignolin datasets. Consistent improvements to various EGNNs will also be discussed to illustrate the general applicability and robustness of our LSR-MP framework.

Submitted 18 May, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

arXiv:2211.12791 [pdf, other]

An ensemble of VisNet, Transformer-M, and pretraining models for molecular property prediction in OGB Large-Scale Challenge @ NeurIPS 2022

Authors: Yusong Wang, Shaoning Li, Zun Wang, Xinheng He, Bin Shao, Tie-Yan Liu, Tong Wang

Abstract: In this technical report, we provide our solution for the OGB-LSC 2022 Graph Regression Task. The target of this task is to predict a quantum chemical property, the HOMO-LUMO gap, for a given molecule on the PCQM4Mv2 dataset. In the competition, we designed two kinds of models: Transformer-M-ViSNet, which is a geometry-enhanced graph neural network for fully connected molecular graphs, and Pretrained-3D-ViSNet, which is a ViSNet pretrained by distilling geometric information from optimized structures. With an ensemble of 22 models, ViSNet Team achieved an MAE of 0.0723 eV on the test-challenge set, dramatically reducing the error by 39.75% compared with the best method in last year's competition.

Submitted 16 August, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

arXiv:2210.16518 [pdf, other]

ViSNet: an equivariant geometry-enhanced graph neural network with vector-scalar interactive message passing for molecules

Authors: Yusong Wang, Shaoning Li, Xinheng He, Mingyu Li, Zun Wang, Nanning Zheng, Bin Shao, Tie-Yan Liu, Tong Wang

Abstract: Geometric deep learning has been revolutionizing the molecular modeling field. Although state-of-the-art neural network models are approaching ab initio accuracy for molecular property prediction, their applications, such as drug discovery and molecular dynamics (MD) simulation, have been hindered by insufficient utilization of geometric information and high computational costs. Here we propose an equivariant geometry-enhanced graph neural network called ViSNet, which elegantly extracts geometric features and efficiently models molecular structures with low computational costs. Our proposed ViSNet outperforms state-of-the-art approaches on multiple MD benchmarks, including MD17, revised MD17 and MD22, and achieves excellent chemical property prediction on the QM9 and Molecule3D datasets. Additionally, ViSNet was among the top winners of the PCQM4Mv2 track in the OGB-LSC@NeurIPS2022 competition. Furthermore, through a series of simulations and case studies, ViSNet can efficiently explore the conformational space and provide reasonable interpretability to map geometric representations to molecular structures.

Submitted 16 August, 2023; v1 submitted 29 October, 2022; originally announced October 2022.

arXiv:2207.05654 [pdf, other]

Spinless Mirror Chern Insulator from Projective Symmetry Algebra

Authors: L. B. Shao, Z. Y. Chen, K. Wang, S. A. Yang, Y. X. Zhao

Abstract: It was commonly believed that a mirror Chern insulator (MCI) must require spin-orbital coupling, since time-reversal symmetry for spinless systems contradicts the mirror Chern number. So MCI cannot be realized in spinless systems, which include the large field of topological artificial crystals. Here, we disprove this common belief. The first point to clarify is that the fundamental constraint is not from spin-orbital coupling but the symmetry algebra of time reversal and mirror operations. Then, our theory is based on the conceptual transformation that the symmetry algebras will be projectively modified under gauge fields. Particularly, we show that the symmetry algebra of mirror reflection and time-reversal required for MCI can be achieved projectively in spinless systems with lattice $\mathbb{Z}_2$ gauge fields, i.e., by allowing real hopping amplitudes to take $\pm$ signs. Moreover, we propose the basic structure, the twisted $π$-flux blocks, to fulfill the projective symmetry algebra, and develop a general approach to construct spinless MCIs based on these building blocks. Two concrete spinless MCI models are presented, which can be readily realized in artificial systems such as acoustic crystals.

Submitted 12 July, 2022; originally announced July 2022.

Comments: 12 pages and 10 figures

arXiv:2205.06939 [pdf, ps, other]

Relevance between Information scrambling and quantum Darwinism

Authors: Feng Tian, Jian Zou, Hai Li, Bin Shao

Abstract: A quantum system interacting with its environment can induce redundant encoding of the information of the system into a multipartite environment, which is the essence of quantum Darwinism (QD). At the same time, the environment may scramble the initially localized information about the system. We mainly investigate the relevance between information scrambling in the environment and the emergence of quantum Darwinism. First, we generally identify that when the system shows Darwinistic behavior, system information that is initially localized in the environment is not scrambled, while when Darwinism disappears, scrambling occurs. We then verify our result through a collision model where the system, consisting of one or two qubits, interacts with an ensemble of environmental ancillas. Moreover, depending on the nature of the system-environment interactions, our results also show that the single-qubit and two-qubit systems behave differently for the emergence of QD and the scrambling, but the above relevance between them remains valid.

Submitted 13 May, 2022; originally announced May 2022.

Comments: 9 pages, 11 figures

arXiv:2205.03979 [pdf, ps, other]

doi 10.3390/e24111532

Quantum information scrambling in non-Markovian open quantum systems

Authors: Li-** Han, Jian Zou, Hai Li, Bin Shao

Abstract: In this paper we investigate the dynamics of a spin chain whose two end spins interact with two independent non-Markovian baths by using the non-Markovian quantum state diffusion (QSD) equation approach. Specifically, two issues about quantum information scrambling in open quantum systems are addressed. The first issue is whether tripartite mutual information (TMI), which properly quantifies information scrambling via its negative value in closed systems, is still suitable to indicate quantum scrambling in open quantum systems. We find that negative TMI is not a suitable quantifier of information scrambling in open quantum systems in some cases, while negative tripartite logarithmic negativity (TLN) is more appropriate. The second issue is that almost all reported open quantum system effects on information scrambling have so far focused on Markovian environments, while the effect of a non-Markovian environment on information scrambling is still elusive. Significantly, our results show that the memory effect of the environment will be beneficial to the emergence of quantum information scrambling. Moreover, it is found that the environment is generally detrimental to information scrambling over a long time, while in some cases it will be helpful for information scrambling over a short time.

Submitted 8 May, 2022; originally announced May 2022.

Comments: 13 pages, 17 figures

arXiv:2204.11179 [pdf, other]

doi 10.1103/PhysRevB.106.165112

Generating Two-dimensional Ferromagnetic Charge Density Waves via External Fields

Authors: Heng **, Jiabin Chen, Yang Li, Bin Shao, Bing Huang

Abstract: The two-dimensional (2D) ferromagnetic charge density wave (CDW), an exotic quantum state for exploring the intertwining effect between correlated charge and spin orders in the 2D limit, has not yet been discovered experimentally. Here, we propose a feasible strategy to realize 2D ferromagnetic CDWs under external fields, which is demonstrated in monolayer VSe$_2$ using first-principles calculations. Under external tensile strain, two novel ferromagnetic CDWs ($\sqrt{3}\times\sqrt{3}$ and $2\times2\sqrt{3}$ CDWs) can be generated, accompanied by distinguishable lattice reconstructions of magnetic V atoms. Remarkably, because the driving forces for generating these two ferromagnetic CDWs are strongly spin-dependent, fundamentally different from those in conventional CDWs, the $\sqrt{3}\times\sqrt{3}$ and $2\times2\sqrt{3}$ CDWs can exhibit two dramatically different half-metallic phases under a large strain range, along with either a flat band or a Dirac cone around the Fermi level. Our proposed strategy and material demonstration may open a door to generate and manipulate the correlation effect between collective charge and spin orders via external fields.

Submitted 23 April, 2022; originally announced April 2022.

Comments: 6 pages, 4 figures

arXiv:2203.14513 [pdf]

Multi-View Substructure Learning for Drug-Drug Interaction Prediction

Authors: Zimeng Li, Shichao Zhu, Bin Shao, Tie-Yan Liu, Xiangxiang Zeng, Tong Wang

Abstract: Drug-drug interaction (DDI) prediction provides a drug combination strategy for systemically effective treatment. Previous studies usually model drug information constrained on a single view such as the drug itself, leading to incomplete and noisy information, which limits the accuracy of DDI prediction. In this work, we propose a novel multi-view drug substructure network for DDI prediction (MSN-DDI), which learns chemical substructures from both the representations of the single drug (intra-view) and the drug pair (inter-view) simultaneously and utilizes the substructures to update the drug representation iteratively. Comprehensive evaluations demonstrate that MSN-DDI has almost solved DDI prediction for existing drugs by achieving a relatively improved accuracy of 19.32% and an over 99% accuracy under the transductive setting. More importantly, MSN-DDI exhibits better generalization ability to unseen drugs with a relatively improved accuracy of 7.07% under more challenging inductive scenarios. Finally, MSN-DDI improves prediction performance for real-world DDI applications to new drugs.

Submitted 28 March, 2022; originally announced March 2022.

arXiv:2202.03882 [pdf]

doi 10.1088/0256-307X/40/8/087303

Engineering Interlayer Hybridization in Energy Space via Dipolar Overlayers

Authors: Bin Shao, Xiao Jiang, Jan Berges, Sheng Meng, Bing Huang

Abstract: The interlayer hybridization (IH) of van der Waals (vdW) materials is thought to be mostly associated with the unignorable interlayer overlaps of wavefunctions ($t$) in real space. Here, we develop a more fundamental understanding of IH by introducing a new physical quantity, the IH admixture ratio $α$. Consequently, an exotic strategy of IH engineering in energy space can be proposed, i.e., instead of changing $t$ as commonly used, $α$ can be effectively tuned in energy space by changing the onsite energy difference ($2Δ$) between neighboring-layer states. In practice, this is feasible via reshaping the electrostatic potential of the surface by depositing a dipolar overlayer, e.g., crystalline ice. Our first-principles calculations unveil that IH engineering via adjusting $2Δ$ can greatly tune interlayer optical transitions in transition-metal dichalcogenide bilayers, switch different types of Dirac surface states in Bi$_2$Se$_3$ thin films, and control the magnetic phase transition of charge density waves in 1H/1T-TaS$_2$ bilayers, opening new opportunities to govern the fundamental optoelectronic, topological, and magnetic properties of vdW systems beyond the traditional interlayer-distance or twisting engineering.

Submitted 8 February, 2022; originally announced February 2022.

Comments: 12 pages, 4 figures

Journal ref: Chin. Phys. Lett. 40, 087303 (2023)

arXiv:2201.08572 [pdf]

Research and experimental design of Astrojax double balls trajectory based on double pendulum system

Authors: Bin Duan, Zihao Bai, Yulong Zhang, Qingyuan Zhang, Sixing Fang, Bohui Shao

Abstract: Based on the double pendulum and the Lagrange equation, the moving particles are captured by a binocular three-dimensional capture camera. Two trajectory models of Astrojax and the relationship between the empirical trajectory formula and its parameters are established. The trajectory calculated from this formula and the related parameters fits well with the actual measured trajectory, and can accurately predict and change the trajectory of the model. The equipment and materials required in the experiment are simple and easy to obtain, and the experimental theme is relatively interesting and novel, so it can be applied as an extended experiment in a college physics experiment course, allowing students to understand the motion characteristics of the double pendulum and learn physics from life. The designed experiment can not only improve students' interest in learning, but also broaden their knowledge and cultivate their practical ability.

Submitted 23 January, 2022; v1 submitted 21 January, 2022; originally announced January 2022.

Comments: Comments updated: The last author is the corresponding author; The paper is in Chinese language

arXiv:2112.07181 [pdf, other]

doi 10.1103/PhysRevB.105.085113

Tensor Theory for Higher Dimensional Chern Insulators with Large Chern Numbers

Authors: Kai Wang, Jia-Xiao Dai, L. B. Shao, Shengyuan A. Yang, Y. X. Zhao

Abstract: Recent advances in topological artificial systems open the door to realizing topological states in dimensions higher than the usual three-dimensional space. Here, we present a "tensor product" theory, which offers a method to construct Chern insulators with arbitrarily high dimensions and Chern numbers. Particularly, we show that the tensor product of a $d_A$D Chern insulator $\langle \mathcal{H}_A^{(κ_{A})}, C_A\rangle$ with a $d_B$D Chern insulator $\langle \mathcal{H}_B^{(κ_B)}, C_B\rangle$ leads to a $(d_A+d_B)$D Chern insulator $\langle \mathcal{H}_{A B}^{(κ_A\star κ_B)},-2C_AC_B\rangle $, where in the brackets, $\mathcal{H}^{(κ)}$ is the $d$D Hamiltonian with $d$ even, $C$ is the corresponding $(d/2)$th Chern number, and $κ$ labels the five non-chiral Altland-Zirnbauer symmetry classes A, AI, D, AII and C. The four real classes AI, D, AII and C form a Klein four-group under the multiplication `$\star$' with class AI the identity, and class A is the zero element. Our theory leads to novel higher-dimensional topological physics. (i) The construction can generate large higher-order Chern numbers, e.g., for some cases the resultant classification is $8\mathbb{Z}$. (ii) Fascinatingly, the boundary states feature flat nodal hypersurfaces with nontrivial Chern charges. For the constructed $(d_A+d_B)$D Chern insulator, a boundary perpendicular to a direction of $\mathcal{H}_A$ generically hosts $|C_A|$ $d_B$D nodal hypersurfaces, each of which has topological charge $\pm 2C_B$. Under perturbations, each nodal hypersurface bursts into stable unit nodal points, with the total Chern charge conserved. Examples are given to demonstrate our theory, which can be experimentally realized in artificial systems such as acoustic crystals, electric circuit arrays, ultracold atoms, or mechanical networks.

Submitted 7 February, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

Comments: 9 pages, 3 figures, 2 tables

Journal ref: Phys. Rev. B 105, 085113 (2022)

arXiv:2110.15527 [pdf, other]

Pre-training Co-evolutionary Protein Representation via A Pairwise Masked Language Model

Authors: Liang He, Shizhuo Zhang, Lijun Wu, Huanhuan Xia, Fusong Ju, He Zhang, Siyuan Liu, Yingce Xia, Jianwei Zhu, Pan Deng, Bin Shao, Tao Qin, Tie-Yan Liu

Abstract: Understanding protein sequences is vital and urgent for biology, healthcare, and medicine. Labeling approaches are expensive and time-consuming, while the amount of unlabeled data is increasing much faster than that of the labeled data due to low-cost, high-throughput sequencing methods. In order to extract knowledge from these unlabeled data, representation learning is of significant value for protein-related tasks and has great potential for helping us learn more about protein functions and structures. The key problem in protein sequence representation learning is to capture the co-evolutionary information reflected by the inter-residue co-variation in the sequences. Instead of leveraging multiple sequence alignment (MSA) as is usually done, we propose a novel method to capture this information directly by pre-training via a dedicated language model, i.e., the Pairwise Masked Language Model (PMLM). In a conventional masked language model (MLM), the masked tokens are modeled by conditioning on the unmasked tokens only, but processed independently of each other. However, our proposed PMLM takes the dependency among masked tokens into consideration, i.e., the probability of a token pair is not equal to the product of the probabilities of the two tokens. By applying this model, the pre-trained encoder is able to generate a better representation for protein sequences. Our results show that the proposed method can effectively capture the inter-residue correlations and improves the performance of contact prediction by up to 9% compared to the MLM baseline under the same setting. The proposed model also significantly outperforms the MSA baseline by more than 7% on the TAPE contact prediction benchmark when pre-trained on a subset of the sequence database from which the MSA is generated, revealing the potential of the sequence pre-training method to surpass MSA-based methods in general.

Submitted 29 October, 2021; originally announced October 2021.

arXiv:2110.14811 [pdf, other]

SE(3) Equivariant Graph Neural Networks with Complete Local Frames

Authors: Weitao Du, He Zhang, Yuanqi Du, Qi Meng, Wei Chen, Bin Shao, Tie-Yan Liu

Abstract: Group equivariance (e.g. SE(3) equivariance) is a critical physical symmetry in science, from classical and quantum physics to computational biology. It enables robust and accurate prediction under arbitrary reference transformations. In light of this, great efforts have been put into encoding this symmetry into deep neural networks, which has been shown to improve the generalization performance and data efficiency for downstream tasks. Constructing an equivariant neural network generally brings high computational costs to ensure expressiveness. Therefore, how to better trade off expressiveness and computational efficiency plays a core role in the design of equivariant deep learning models. In this paper, we propose a framework to construct SE(3) equivariant graph neural networks that can approximate the geometric quantities efficiently. Inspired by differential geometry and physics, we introduce equivariant local complete frames to graph neural networks, such that tensor information at given orders can be projected onto the frames. The local frame is constructed to form an orthonormal basis that avoids direction degeneration and ensures completeness. Since the frames are built only by cross product operations, our method is computationally efficient. We evaluate our method on two tasks: Newton mechanics modeling and equilibrium molecule conformation generation. Extensive experimental results demonstrate that our model achieves the best or competitive performance in two types of datasets.

Submitted 5 July, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

Comments: ICML 2022 accepted

arXiv:2110.07347 [pdf, other]

Improved Drug-target Interaction Prediction with Intermolecular Graph Transformer

Authors: Siyuan Liu, Yusong Wang, Tong Wang, Yifan Deng, Liang He, Bin Shao, Jian Yin, Nanning Zheng, Tie-Yan Liu

Abstract: The identification of active binding drugs for target proteins (termed drug-target interaction prediction) is the key challenge in virtual screening, which plays an essential role in drug discovery. Although recent deep learning-based approaches have achieved better performance than molecular docking, existing models often neglect certain aspects of the intermolecular information, hindering prediction performance. We recognize this problem and propose a novel approach named Intermolecular Graph Transformer (IGT) that employs a dedicated attention mechanism to model intermolecular information with a three-way Transformer-based architecture. IGT outperforms state-of-the-art approaches by 9.1% and 20.5% over the second best for binding activity and binding pose prediction respectively, and shows superior generalization ability to unseen receptor proteins. Furthermore, IGT exhibits promising drug screening ability against SARS-CoV-2 by identifying 83.1% of the active drugs that have been validated by wet-lab experiments, with near-native predicted binding poses.

Submitted 15 October, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

arXiv:2108.06681 [pdf, other]

doi 10.1016/j.imavis.2021.104286

Multi-granularity for knowledge distillation

Authors: Baitan Shao, Ying Chen

Abstract: Considering the fact that students have different abilities to understand the knowledge imparted by teachers, a multi-granularity distillation mechanism is proposed for transferring more understandable knowledge to student networks. A multi-granularity self-analyzing module of the teacher network is designed, which enables the student network to learn knowledge from different teaching patterns. Furthermore, a stable excitation scheme is proposed for robust supervision of the student training. The proposed distillation mechanism can be embedded into different distillation frameworks, which are taken as baselines. Experiments show the mechanism improves the accuracy by 0.58% on average and by 1.08% in the best case over the baselines, which makes its performance superior to the state of the art. It is also shown that the student's fine-tuning ability and robustness to noisy inputs can be improved via the proposed mechanism. The code is available at https://github.com/shaoeric/multi-granularity-distillation.

Submitted 15 August, 2021; originally announced August 2021.

Comments: 14 pages, 12 figures

arXiv:2108.05609 [pdf, ps, other]

doi 10.1088/1361-6471/ac430e

Self-consistent description of the halo nature of 31Ne with continuum and pairing correlations

Authors: Shisheng Zhang, Shiyi Zhong, Bo Shao, Michael Scott Smith

Abstract: Using a Glauber model with our relativistic fully microscopic structure model input, we give a full description of the halo nature of 31Ne that includes a self-consistent use of pairing and continuum contributions that makes predictions consistent with reaction cross section measurements. Our predictions of total reaction and one-neutron removal cross sections of 31Ne on a carbon target were significantly enhanced compared with those of neighboring neon isotopes, agreeing well with measurements at 240 MeV/nucleon and consistent with a single neutron halo. Furthermore, our calculations of the inclusive longitudinal momentum distribution of the 30Ne and valence neutron residues from the 31Ne breakup reaction indicate a dilute density distribution in coordinate space, another halo signature.

Submitted 10 December, 2021; v1 submitted 12 August, 2021; originally announced August 2021.

arXiv:2107.03258 [pdf]

Lorentz-breaking Theory and Tunneling Radiation Correction to Vaidya-Bonner de Sitter Black Hole

Authors: Bei Sha, Zhi-E Liu

Abstract: In Vaidya-Bonner de Sitter black hole space-time, the tunneling radiation characteristics of fermions and bosons are corrected by taking Lorentz symmetry breaking theory into account. The corresponding gamma matrices and ether-like field vectors of the black hole are constructed, then the new modified form of the Dirac equation for the fermion with spin 1/2 and the new modified form of the Klein-Gordon equation for the boson in the curved space-time of the black hole are obtained. Through solving the two equations, new and corrected expressions of surface gravity, Hawking temperature and tunneling rate of the black hole are obtained, and the results obtained are also discussed.

Submitted 7 July, 2021; originally announced July 2021.

Comments: 13 pages, 0 figures

arXiv:2104.00310 [pdf, other]

doi 10.1103/PhysRevLett.127.076401

The gauge-field extended $k\cdot p$ method and novel topological phases

Authors: L. B. Shao, Q. Liu, R. Xiao, Shengyuan A. Yang, Y. X. Zhao

Abstract: Although topological artificial systems, like acoustic/photonic crystals and cold atoms in optical lattices, were initially motivated by simulating topological phases of electronic systems, they have their own unique features such as the spinless time-reversal symmetry and tunable $\mathbb{Z}_2$ gauge fields. Hence, it is fundamentally important to explore new topological phases based on their unique features. Here, we point out that the $\mathbb{Z}_2$ gauge field leads to two fundamental modifications of the conventional $k\cdot p$ method: (i) The little co-group must include the translations with nontrivial algebraic relations; (ii) The algebraic relations of the little co-group are projectively represented. These give rise to higher-dimensional irreducible representations and therefore highly degenerate Fermi points. Breaking the primitive translations can transform the Fermi points to interesting topological phases. We demonstrate our theory by two models: a rectangular $\pi$-flux model exhibiting graphene-like semimetal phases, and a graphite model with interlayer $\pi$ flux that realizes the real second-order nodal-line semimetal phase with hinge helical modes. Their physical realizations with a general bright-dark mechanism are discussed. Our finding opens a new direction to explore novel topological phases unique to artificial systems and establishes the approach to analyze these phases.

Submitted 27 July, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

Comments: 18 pages, 10 figures, to be published in Physical Review Letters

Journal ref: Phys. Rev. Lett. 127, 076401 (2021)

arXiv:2101.01884 [pdf]

Exploring the Regulatory Function of the N-terminal Domain of SARS-CoV-2 Spike Protein Through Molecular Dynamics Simulation

Authors: Yao Li, Tong Wang, Juanrong Zhang, Bin Shao, Haipeng Gong, Yusong Wang, Siyuan Liu, Tie-Yan Liu

Abstract: SARS-CoV-2 is the virus that caused the COVID-19 pandemic. Early viral infection is mediated by the SARS-CoV-2 homo-trimeric Spike (S) protein with its receptor binding domains (RBDs) in the receptor-accessible state. We performed molecular dynamics simulation on the S protein with a focus on the function of its N-terminal domains (NTDs). Our study reveals that the NTD acts as a "wedge" and plays a crucial regulatory role in the conformational changes of the S protein. The complete RBD structural transition is allowed only when the neighboring NTD, which typically prohibits the RBD's movements as a wedge, detaches and swings away. Based on this NTD "wedge" model, we propose that the NTD-RBD interface should be a potential drug target.

Submitted 6 January, 2021; originally announced January 2021.

arXiv:2012.12364 [pdf, ps, other]

doi 10.3390/e23040471

Effect of inter-system coupling on heat transport in a microscopic collision model

Authors: Feng Tian, Jian Zou, Lei Li, Hai Li, Bin Shao

Abstract: In this paper we consider a bipartite system composed of two subsystems each coupled to its own thermal environment. Based on a collision model, we mainly study whether the approximation (i.e., the inter-system interaction is ignored when modeling the system-environment coupling) is valid or not. We also address the problem of heat transport in a unified way for both conventional energy-preserving system-environment interactions and non-energy-preserving system-environment interactions. For the former interaction, as the inter-system interaction strength increases, at first this approximation gets worse as expected, but then counterintuitively gets better even for a stronger inter-system coupling. For the latter interaction with asymmetry, this approximation gets progressively worse. In this case we realize a perfect thermal rectification, whereas we cannot find an apparent rectification effect for the former interaction. Finally and more importantly, our results show that whether this approximation is valid or not is closely related to the quantum correlations between the subsystems, i.e., the weaker the quantum correlations, the more justified the approximation and vice versa.

Submitted 23 December, 2020; originally announced December 2020.

Comments: 9 pages, 10 figures

arXiv:2010.10789 [pdf, other]

ProphetNet-Ads: A Looking Ahead Strategy for Generative Retrieval Models in Sponsored Search Engine

Authors: Weizhen Qi, Yeyun Gong, Yu Yan, Jian Jiao, Bo Shao, Ruofei Zhang, Houqiang Li, Nan Duan, Ming Zhou

Abstract: In a sponsored search engine, generative retrieval models are recently proposed to mine relevant advertisement keywords for users' input queries. Generative retrieval models generate outputs token by token on a path of the target library prefix tree (Trie), which guarantees all of the generated outputs are legal and covered by the target library. In actual use, we found several typical problems caused by Trie-constrained searching length. In this paper, we analyze these problems and propose a looking ahead strategy for generative retrieval models named ProphetNet-Ads. ProphetNet-Ads improves the retrieval ability by directly optimizing the Trie-constrained searching space. We build a dataset from a real-world sponsored search engine and carry out experiments to analyze different generative retrieval models. Compared with the recently proposed Trie-based LSTM generative retrieval model, our single model result and integrated result improve the recall by 15.58% and 18.8% respectively with beam size 5. Case studies further demonstrate clearly how these problems are alleviated by ProphetNet-Ads.

Submitted 21 October, 2020; originally announced October 2020.

Comments: Accepted to NLPCC 2020

arXiv:2009.09638 [pdf]

doi 10.1103/PhysRevB.103.L201405

Realization of Semiconducting Layered Multiferroic Heterojunctions via Asymmetrical Magnetoelectric Coupling

Authors: Baishun Yang, Bin Shao, Jianfeng Wang, ChiYung Yam, Shengbai Zhang, Bing Huang

Abstract: Two-dimensional (2D) semiconducting multiferroics that can effectively couple magnetic and polarization (P) orders are of great interest for both fundamental research and technological applications at the nanoscale, but they are rare in nature. In this study, we propose a general mechanism to realize semiconducting 2D multiferroics via vdW heterojunction engineering, as demonstrated in a typical heterostructure consisting of magnetic bilayer CrI3 (bi-CrI3) and ferroelectric monolayer In2Se3. Interestingly, the novel indirect orbital coupling between Se 4p and Cr 3d orbitals, intermediated by the interfacial I 5p orbitals, is switchable in the opposite P configurations, resulting in an unexpected mechanism of strong asymmetrical magnetoelectric coupling. Therefore, along with the noticeable ferroelectric energy barrier induced by In2Se3, the realization of opposite magnetic orders in opposite P configurations can eventually result in the novel multiferroicity in bi-CrI3/In2Se3. Finally, we demonstrate that our mechanism can generally be applied to design other vdW multiferroics even with tunable layer thickness.

Submitted 21 September, 2020; originally announced September 2020.

Comments: 11 pages, 4 figures

Journal ref: Phys. Rev. B 103, L201405 (2021)

arXiv:2009.07483 [pdf, other]

Unified Theory of Quantum Crystalline Symmetries

Authors: Y. X. Zhao, L. B. Shao

Abstract: Symmetry groups are projectively represented in quantum mechanics, and crystalline symmetries are fundamental in condensed matter physics. Here, we systematically present a unified theory of quantum mechanical space groups from two complementary aspects. First, we provide a decomposition form for the space-group factor systems to characterize all quantum space groups. It consists of three factors: the factor system for the translation subgroup $L$, an inhomogeneous factor system for the point group $P$, and a factor connecting $L$ and $P$. The three factors satisfy three consistency equations, which are exactly solvable and can completely exhaust all factor systems for space groups. Second, since factor systems are classified by the second cohomology group, we show the (co)homology groups for space groups can be derived from Borel's equivariant (co)homology theory, which leads to an algorithm that can compute all (co)homology groups for space groups. To demonstrate the general theory, we explicitly present quantum wallpaper groups with the $\mathbb{Z}_2$ gauge group. Furthermore, as a primitive application, we find the time-reversal invariant quantum space groups with inversion symmetry can lead to a novel Clifford band theory, where each band is fourfold degenerate to represent certain real Clifford algebras with topologically nontrivial pinor structures over the Brillouin zone.
Our work serves as a foundation for exploring quantum mechanical space groups, and can find applications in spin liquids, unconventional superconductors, and artificial lattice systems, including cold atoms, photonic and phononic crystals, and even LC electric circuit networks.

Submitted 16 September, 2020; originally announced September 2020.

Comments: 14 pages, 3 tables and 1 figure

arXiv:2005.13297 [pdf, other]

Accelerating Neural Network Inference by Overflow Aware Quantization

Authors: Hongwei Xie, Shuo Zhang, Huanghao Ding, Yafei Song, Baitao Shao, Conggang Hu, Ling Cai, Mingyang Li

Abstract: The inherent heavy computation of deep neural networks prevents their widespread applications. A widely used method for accelerating model inference is quantization, which replaces the input operands of a network with fixed-point values. The majority of computation costs then lie in the integer matrix multiplication accumulation. In fact, a high-bit accumulator leads to partially wasted computation and a low-bit one typically suffers from numerical overflow. To address this problem, we propose an overflow aware quantization method by designing a trainable adaptive fixed-point representation, to optimize the number of bits for each input tensor while prohibiting numeric overflow during the computation. With the proposed method, we are able to fully utilize the computing power to minimize the quantization loss and obtain optimized inference performance. To verify the effectiveness of our method, we conduct image classification, object detection, and semantic segmentation tasks on ImageNet, Pascal VOC, and COCO datasets, respectively. Experimental results demonstrate that the proposed method can achieve comparable performance with state-of-the-art quantization methods while accelerating the inference process by about 2 times.

Submitted 27 May, 2020; originally announced May 2020.

arXiv:2005.05565 [pdf, other]

doi 10.1103/PhysRevLett.125.126403

Boundary criticality of $PT$-invariant topology and second-order nodal-line semimetals

Authors: Kai Wang, Jia-Xiao Dai, L. B. Shao, Shengyuan A. Yang, Y. X. Zhao

Abstract: For conventional topological phases, the boundary gapless modes are determined by bulk topological invariants. Based on developing an analytic method to solve higher-order boundary modes, we present $PT$-invariant $2$D topological insulators and $3$D topological semimetals that go beyond this bulk-boundary correspondence framework. With unchanged bulk topological invariant, their first-order boundaries undergo transitions separating different phases with second-order-boundary zero-modes. For the $2$D topological insulator, the helical edge modes appear at the transition point for two second-order topological insulator phases with diagonal and off-diagonal corner zero-modes, respectively. Accordingly, for the $3$D topological semimetal, the criticality corresponds to surface helical Fermi arcs of a Dirac semimetal phase. Interestingly, we find that the $3$D system generically belongs to a novel second-order nodal-line semimetal phase, possessing gapped surfaces but a pair of diagonal or off-diagonal hinge Fermi arcs.

Submitted 7 August, 2020; v1 submitted 12 May, 2020; originally announced May 2020.

Comments: 12 pages, 4 figures, to appear in Phys. Rev. Lett

Journal ref: Phys. Rev. Lett. 125, 126403 (2020)

arXiv:2005.03915 [pdf, other]

Defending Model Inversion and Membership Inference Attacks via Prediction Purification

Authors: Ziqi Yang, Bin Shao, Bohan Xuan, Ee-Chien Chang, Fan Zhang

Abstract: Neural networks are susceptible to data inference attacks such as the model inversion attack and the membership inference attack, where the attacker could infer the reconstruction and the membership of a data sample from the confidence scores predicted by the target classifier. In this paper, we propose a unified approach, namely a purification framework, to defend against data inference attacks. It purifies the confidence score vectors predicted by the target classifier by reducing their dispersion. The purifier can be further specialized in defending against a particular attack via adversarial learning. We evaluate our approach on benchmark datasets and classifiers. We show that when the purifier is dedicated to one attack, it naturally defends against the other one, which empirically demonstrates the connection between the two attacks. The purifier can effectively defend against both attacks. For example, it can reduce the membership inference accuracy by up to 15% and increase the model inversion error by a factor of up to 4. Besides, it incurs less than 0.4% classification accuracy drop and less than 5.5% distortion to the confidence scores.

Submitted 20 August, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

Comments: updated experiments and results

arXiv:2005.00856 [pdf, other]

SEEK: Segmented Embedding of Knowledge Graphs

Authors: Wentao Xu, Shun Zheng, Liang He, Bin Shao, Jian Yin, Tie-Yan Liu

Abstract: In recent years, knowledge graph embedding has become a hot research topic in artificial intelligence and plays increasingly vital roles in various downstream applications, such as recommendation and question answering. However, existing methods for knowledge graph embedding cannot make a proper trade-off between the model complexity and the model expressiveness, which makes them still far from satisfactory. To mitigate this problem, we propose a lightweight modeling framework that can achieve highly competitive relational expressiveness without increasing the model complexity. Our framework focuses on the design of scoring functions and highlights two critical characteristics: 1) facilitating sufficient feature interactions; 2) preserving both symmetry and antisymmetry properties of relations. It is noteworthy that owing to the general and elegant design of scoring functions, our framework can incorporate many famous existing methods as special cases. Moreover, extensive experiments on public benchmarks demonstrate the efficiency and effectiveness of our framework. Source codes and data can be found at https://github.com/Wentao-Xu/SEEK.

Submitted 22 June, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

arXiv:2004.13875 [pdf, other]

6G White Paper on Machine Learning in Wireless Communication Networks

Authors: Samad Ali, Walid Saad, Nandana Rajatheva, Kapseok Chang, Daniel Steinbach, Benjamin Sliwa, Christian Wietfeld, Kai Mei, Hamid Shiri, Hans-Jürgen Zepernick, Thi My Chinh Chu, Ijaz Ahmad, Jyrki Huusko, Jaakko Suutala, Shubhangi Bhadauria, Vimal Bhatia, Rangeet Mitra, Saidhiraj Amuru, Robert Abbas, Baohua Shao, Michele Capobianco, Guanghui Yu, Maelick Claes, Teemu Karvonen, Mingzhe Chen, et al. (2 additional authors not shown)

Abstract: The focus of this white paper is on machine learning (ML) in wireless communications. 6G wireless communication networks will be the backbone of the digital transformation of societies by providing ubiquitous, reliable, and near-instant wireless connectivity for humans and machines. Recent advances in ML research have enabled a wide range of novel technologies such as self-driving vehicles and voice assistants. Such innovation is possible as a result of the availability of advanced ML models, large datasets, and high computational power. On the other hand, the ever-increasing demand for connectivity will require a lot of innovation in 6G wireless networks, and ML tools will play a major role in solving problems in the wireless domain. In this paper, we provide an overview of the vision of how ML will impact wireless communication systems. We first give an overview of the ML methods that have the highest potential to be used in wireless networks. Then, we discuss the problems that can be solved by using ML in various layers of the network such as the physical layer, medium access layer, and application layer. Zero-touch optimization of wireless networks using ML is another interesting aspect that is discussed in this paper. Finally, at the end of each section, important research questions that the section aims to answer are presented.

Submitted 28 April, 2020; originally announced April 2020.

arXiv:2004.03875 [pdf, other]

Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation

Authors: Dayiheng Liu, Yeyun Gong, Jie Fu, Wei Liu, Yu Yan, Bo Shao, Daxin Jiang, Jiancheng Lv, Nan Duan

Abstract: News headline generation aims to produce a short sentence to attract readers to read the news. One news article often contains multiple keyphrases that are of interest to different users, and so can naturally have multiple reasonable headlines. However, most existing methods focus on single headline generation. In this paper, we propose generating multiple headlines with keyphrases of user interests, whose main idea is to generate multiple keyphrases of interest to users for the news first, and then generate multiple keyphrase-relevant headlines. We propose a multi-source Transformer decoder, which takes three sources as inputs: (a) keyphrase, (b) keyphrase-filtered article, and (c) original article to generate keyphrase-relevant, high-quality, and diverse headlines. Furthermore, we propose a simple and effective method to mine the keyphrases of interest in the news article and build the first large-scale keyphrase-aware news headline corpus, which contains over 180K aligned triples of <news article, headline, keyphrase>. Extensive experimental comparisons on the real-world dataset show that the proposed method achieves state-of-the-art results in terms of quality and diversity.

Submitted 3 October, 2020; v1 submitted 8 April, 2020; originally announced April 2020.

Comments: Accepted at EMNLP 2020

arXiv:2003.11436 [pdf]

Synergistically creating sulfur vacancies in semimetal-supported amorphous MoS2 for efficient hydrogen evolution

Authors: Guowei Li, Chenguang Fu, Jiquan Wu, Jiancun Rao, Sz-Chian Liou, Xi** Xu, Baiqi Shao, Kai Liu, Enke Liu, Nitesh Kumar, Xianjie Liu, Mats Fahlman, Johannes Gooth, Gudrun Auffermann, Yan Sun, Claudia Felser, Baomin Zhang

Abstract: The presence of elemental vacancies in materials is inevitable according to statistical thermodynamics, and these vacancies determine the chemical and physical properties of the investigated system. However, the controlled manipulation of vacancies for specific applications is a challenge. Here we report a facile method for creating large concentrations of S vacancies in the inert basal plane of MoS2 supported on semimetal CoMoP2. With a small applied potential, S atoms can be removed in the form of H2S due to the optimized free energy of formation. The existence of vacancies favors electron injection from the electrode to the active site by decreasing the contact resistance. As a consequence, the activity is increased by 221% with the vacancy-rich MoS2 as electrocatalyst for the hydrogen evolution reaction (HER). A small overpotential of 75 mV is needed to deliver a current density of 10 mA cm⁻², which is considered among the best values achieved for MoS2. It is envisaged that this work may provide a new strategy for utilizing the semimetal phase for structuring MoS2 into a multi-functional material.

Submitted 25 March, 2020; originally announced March 2020.

arXiv:2002.05946 [pdf, ps, other]

Influence of Lorentz Invariation Violation on Arbitrarily Spin Fermions Tunneling Radiation in the Vaidya-Bonner Spacetime

Authors: Jie Zhang, Zhie Liu, Bei Sha, Xia Tan, Yuzhen Liu, Shuzheng Yang

Abstract: In the spacetime of the non-stationary, spherically symmetric Vaidya-Bonner black hole, an accurate modification of Hawking tunneling radiation for fermions with arbitrary spin is studied. Considering a light dispersion relationship derived from string theory, quantum gravitational theory and the Rarita-Schwinger equation in the non-stationary, spherically symmetric spacetime, we derive an accurately modified dynamic equation for fermions with arbitrary spin. By solving the equation, the modified tunneling rate of fermions with arbitrary spin, the Hawking temperature and the entropy at the event horizon of the Vaidya-Bonner black hole are presented. We find the Hawking temperature will increase, but the entropy will decrease compared with the case without the Lorentz Invariation Violation modification.

Submitted 14 February, 2020; originally announced February 2020.

arXiv:2002.03368 [pdf, ps, other]

doi 10.1088/1674-1137/abb4d6

Accurate correction of arbitrary spin fermions quantum tunneling from non-stationary Kerr-de Sitter black hole based on corrected Lorentz dispersion relation

Authors: Bei Sha, Zhi-E Liu, Yu-Zhen Liu, Xia Tan, Jie Zhang, Shu-Zheng Yang

Abstract: According to a corrected dispersion relation proposed in the study of string theory and quantum gravity theory, the Rarita-Schwinger equation has been precisely modified, which results in a Rarita-Schwinger-Hamilton-Jacobi equation, through which the characteristics of arbitrary spin fermions' quantum tunneling radiation from the non-stationary Kerr-de Sitter black hole are studied. A series of accurately corrected physical quantities such as surface gravity, chemical potential, tunneling probability and Hawking temperature that describe the properties of the black hole are derived. This research has enriched the research methods and improved the precision of the research content of black hole physics.

Submitted 9 February, 2020; originally announced February 2020.

Comments: 8 pages

Showing 1–50 of 144 results for author: Shao, B