doi 10.1103/PhysRevD.106.096005

$S$-wave fully charmed tetraquark resonant states

Authors: Guang-Juan Wang, Qi Meng, Makoto Oka

Abstract: We calculate the mass spectrum of the $S$-wave fully-charmed tetraquark resonant states $cc\bar c\bar c$ in the nonrelativistic quark model, which successfully describes the charmonium spectrum. The four-body system is solved with the Gaussian expansion method. The complex scaling technique is used to identify the genuine resonances. With the nonrelativistic quark model, our results show the existence of two $cc\bar c\bar c$ resonances in each of the $J^{PC}=$ $0^{++}$, $1^{+-}$ and $2^{++}$ sectors, respectively. In the $S$-wave sector, no resonance is found at the energy region of the $X(6200)$ and $X(6600)$ states. The lower $0^{++}$ and $2^{++}$ resonances are located around $100$ MeV higher than the $X(6900)$ state observed in experiments but have the decay widths consistent with the experiment. The higher $0^{++}$ and $2^{++}$ resonances are found at around $7.2$ GeV with the widths of $60.6$ MeV and $91.2$ MeV, respectively, and they may be good candidates for the $X(7200)$ state.

Submitted 5 November, 2022; v1 submitted 15 August, 2022; originally announced August 2022.

Comments: 9 pages, 5 figures, version to be published in PRD

arXiv:2208.05201 [pdf, other]

Quadrotor Autonomous Landing on Moving Platform

Authors: Pengyu Wang, Chaoqun Wang, Jiankun Wang, Max Q. -H. Meng

Abstract: This paper introduces a quadrotor's autonomous take-off and landing system on a moving platform. The designed system addresses three challenging problems: fast pose estimation, restricted external localization, and effective obstacle avoidance. Specifically, first, we design a landing recognition and positioning system based on the ArUco marker to help the quadrotor quickly calculate the relative pose; second, we leverage a gradient-based local motion planner to generate collision-free reference trajectories rapidly for the quadrotor; third, we build an autonomous state machine that enables the quadrotor to complete its take-off, tracking and landing tasks in full autonomy; finally, we conduct experiments in simulated, real-world indoor and outdoor environments to verify the system's effectiveness and demonstrate its potential.

Submitted 10 August, 2022; originally announced August 2022.

arXiv:2208.00214 [pdf, other]

Towards Privacy-Preserving, Real-Time and Lossless Feature Matching

Authors: Qiang Meng, Feng Zhou

Abstract: Most visual retrieval applications store feature vectors for downstream matching tasks. These vectors, from which user information can be extracted, will cause privacy leakage if not carefully protected. To mitigate privacy risks, current works primarily utilize non-invertible transformations or fully cryptographic algorithms. However, transformation-based methods usually fail to achieve satisfying matching performance, while cryptosystems suffer from heavy computational overheads. In addition, the security levels of current methods should be improved to confront potential adversary attacks. To address these issues, this paper proposes a plug-in module called SecureVector that protects features by random permutations, 4L-DEC converting and existing homomorphic encryption techniques. For the first time, SecureVector achieves real-time and lossless feature matching among sanitized features, along with much higher security levels than the current state of the art. Extensive experiments on face recognition, person re-identification, image retrieval, and privacy analyses demonstrate the effectiveness of our method. Given limited public projects in this field, codes of our method and implemented baselines are made open-source at https://github.com/IrvingMeng/SecureVector.

Submitted 30 July, 2022; originally announced August 2022.

arXiv:2208.00034 [pdf, other]

MulViMotion: Shape-aware 3D Myocardial Motion Tracking from Multi-View Cardiac MRI

Authors: Qingjie Meng, Chen Qin, Wenjia Bai, Tianrui Liu, Antonio de Marvao, Declan P O'Regan, Daniel Rueckert

Abstract: Recovering the 3D motion of the heart from cine cardiac magnetic resonance (CMR) imaging enables the assessment of regional myocardial function and is important for understanding and analyzing cardiovascular disease. However, 3D cardiac motion estimation is challenging because the acquired cine CMR images are usually 2D slices which limit the accurate estimation of through-plane motion. To address this problem, we propose a novel multi-view motion estimation network (MulViMotion), which integrates 2D cine CMR images acquired in short-axis and long-axis planes to learn a consistent 3D motion field of the heart. In the proposed method, a hybrid 2D/3D network is built to generate dense 3D motion fields by learning fused representations from multi-view images. To ensure that the motion estimation is consistent in 3D, a shape regularization module is introduced during training, where shape information from multi-view images is exploited to provide weak supervision to 3D motion estimation. We extensively evaluate the proposed method on 2D cine CMR images from 580 subjects of the UK Biobank study for 3D motion tracking of the left ventricular myocardium. Experimental results show that the proposed method quantitatively and qualitatively outperforms competing methods.

Submitted 29 July, 2022; originally announced August 2022.

arXiv:2206.09571 [pdf, other]

Deep Random Vortex Method for Simulation and Inference of Navier-Stokes Equations

Authors: Rui Zhang, Peiyan Hu, Qi Meng, Yue Wang, Rongchan Zhu, Bingguang Chen, Zhi-Ming Ma, Tie-Yan Liu

Abstract: Navier-Stokes equations are significant partial differential equations that describe the motion of fluids such as liquids and air. Due to the importance of Navier-Stokes equations, the development of efficient numerical schemes is important for both science and engineering. Recently, with the development of AI techniques, several approaches have been designed to integrate deep neural networks in simulating and inferring the fluid dynamics governed by incompressible Navier-Stokes equations, which can accelerate the simulation or inferring process in a mesh-free and differentiable way. In this paper, we point out that the capability of existing deep Navier-Stokes informed methods is limited in handling non-smooth or fractional equations, which are two critical situations in reality. To this end, we propose the Deep Random Vortex Method (DRVM), which combines the neural network with a random vortex dynamics system equivalent to the Navier-Stokes equation. Specifically, the random vortex dynamics motivates a Monte Carlo based loss function for training the neural network, which avoids the calculation of derivatives through auto-differentiation. Therefore, DRVM not only can efficiently solve Navier-Stokes equations involving rough path, non-differentiable initial conditions and fractional operators, but also inherits the mesh-free and differentiable benefits of the deep-learning-based solver. We conduct experiments on the Cauchy problem, parametric solver learning, and the inverse problem of both 2-d and 3-d incompressible Navier-Stokes equations. The proposed method achieves accurate results for simulation and inference of Navier-Stokes equations. Especially for the cases that include singular initial conditions, DRVM significantly outperforms the existing PINN method.

Submitted 20 July, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

arXiv:2206.08406 [pdf, other]

Predicting Hate Intensity of Twitter Conversation Threads

Authors: Qing Meng, Tharun Suresh, Roy Ka-Wei Lee, Tanmoy Chakraborty

Abstract: Tweets are the most concise form of communication in online social media, wherein a single tweet has the potential to make or break the discourse of the conversation. Online hate speech is more accessible than ever, and stifling its propagation is of utmost importance for social media companies and users for congenial communication. Most of the research barring a recent few has focused on classifying an individual tweet regardless of the tweet thread/context leading up to that point. One of the classical approaches to curb hate speech is to adopt a reactive strategy after the hate speech is posted. The ex-post facto strategy results in neglecting subtle posts that do not show the potential to instigate hate speech on their own but may portend in the subsequent discussion ensuing in the post's replies. In this paper, we propose DRAGNET++, which aims to predict the intensity of hatred that a tweet can bring in through its reply chain in the future. It uses the semantic and propagating structure of the tweet threads to maximize the contextual information leading up to and the fall of hate intensity at each subsequent tweet. We explore three publicly available Twitter datasets -- Anti-Racism contains the reply tweets of a collection of social media discourse on racist remarks during US political and COVID-19 background; Anti-Social presents a dataset of 40 million tweets amidst the COVID-19 pandemic on anti-social behaviours; and Anti-Asian presents Twitter datasets collated based on anti-Asian behaviours during the COVID-19 pandemic. All the curated datasets consist of structural graph information of the Tweet threads. We show that DRAGNET++ outperforms all the state-of-the-art baselines significantly. It beats the best baseline by an 11% margin on the Pearson correlation coefficient and a decrease of 25% on RMSE for the Anti-Racism dataset with a similar performance on the other two datasets.

Submitted 14 May, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

Comments: Accepted in Knowledge-Based Systems, 30 pages (main content) + 9 pages (Refs.), 11 figures, 3 tables

arXiv:2206.00765 [pdf]

Positional uncertainty and quality assurance of digital elevation change detection (DECD)

Authors: Chang Li, Qi Meng, Dong Wei, Wenzhong Shi, Ming Hao

Abstract: Studies on rapid change detection of large area urgently need to be extended from 2D image to digital elevation model (DEM) due to the challenge of changes caused by disasters. This research investigates positional uncertainty of digital elevation change detection (DECD) caused by different degrees of DEM complexity and DEM misregistration. Unfortunately, using three-sigma rule (3σR) for DECD is disturbed by accuracy of parameter estimation, which is affected by the outliers (i.e., varied DEM) from DEM differencing samples. Hence, to reduce the aforementioned uncertainty of DECD, we propose a new strategy of quality assurance, adaptively censored three-sigma rule (AC3σR), in which with the samples censored, outliers of global DEM differencing samples outside the standard deviations of the mean calculated by moment estimation are iteratively removed. Compared with the 3σR and censored three-sigma rule (C3σR) that is similar to AC3σR but without iteration for both simulation and real-world data experiments, the proposed global AC3σR method always exhibits the highest accuracies of DECD in terms of both the overall accuracies 0.99967, 0.98740 and kappa coefficients 0.99598, 0.81803 respectively, and the strongest robustness with a large convergence interval [0, 0.30010] under the simulated maximum registration error and most complex terrain complexity conditions.

Submitted 15 May, 2022; originally announced June 2022.

arXiv:2205.15187 [pdf, other]

Do Deep Neural Networks Always Perform Better When Eating More Data?

Authors: Jiachen Yang, Zhuo Zhang, Yicheng Gong, Shukun Ma, Xiaolan Guo, Yue Yang, Shuai Xiao, Jiabao Wen, Yang Li, Xinbo Gao, Wen Lu, Qinggang Meng

Abstract: Data has now become a shortcoming of deep learning. Researchers in their own fields share the thinking that "deep neural networks might not always perform better when they eat more data," which still lacks experimental validation and a convincing guiding theory. Here to fill this lack, we design experiments from Identically Independent Distribution (IID) and Out of Distribution (OOD), which give powerful answers. For the purpose of guidance, based on the discussion of results, two theories are proposed: under IID condition, the amount of information determines the effectivity of each sample, the contribution of samples and difference between classes determine the amount of sample information and the amount of class information; under OOD condition, the cross-domain degree of samples determines the contributions, and the bias-fitting caused by irrelevant elements is a significant factor of cross-domain. The above theories provide guidance from the perspective of data, which can promote a wide range of practical applications of artificial intelligence.

Submitted 30 May, 2022; originally announced May 2022.

arXiv:2205.13103 [pdf]

Spontaneous Radiative Cooling to Enhance the Operational Stability of Perovskite Solar Cells via a Black-body-like Full Carbon Electrode

Authors: Bingcheng Yu, Jiangjian Shi, Yiming Li, Shan Tan, Yuqi Cui, Fanqi Meng, Huijue Wu, Yanhong Luo, Dongmei Li, Qingbo Meng

Abstract: Operational stability of perovskite solar cells is remarkably influenced by the device temperature; therefore, decreasing the interior temperature of the device is one of the most effective approaches to prolong the service life. Herein, we introduce the spontaneous radiative cooling effect into the perovskite solar cell and amplify this effect via functional structure design of a full-carbon electrode (F-CE). Firstly, with interface engineering, >19% and >23% power conversion efficiencies of F-CE based inorganic CsPbI3 and hybrid perovskite solar cells have been achieved, respectively, both of which are the highest reported efficiencies based on carbon electrode and are comparable to the results for metal electrodes. Highly efficient thermal radiation of this F-CE can reduce the temperature of the operating cell by about 10 °C. Compared with the conventional metal electrode-based control cells, the operational stability of the above two types of cells has been significantly improved due to this cooling effect. Especially, the CsPbI3 PSCs exhibited no efficiency degradation after 2000 hours of continuous operational tracking.

Submitted 25 May, 2022; originally announced May 2022.

arXiv:2205.06970 [pdf, other]

Learning to Reorient Objects with Stable Placements Afforded by Extrinsic Supports

Authors: Peng Xu, Hu Cheng, Jiankun Wang, Max Q. -H. Meng

Abstract: Reorienting objects by using supports is a practical yet challenging manipulation task. Owing to the intricate geometry of objects and the constrained feasible motions of the robot, multiple manipulation steps are required for object reorientation. In this work, we propose a pipeline for predicting various object placements from point clouds. This pipeline comprises three stages: a pose generation stage, followed by a pose refinement stage, and culminating in a placement classification stage. We also propose an algorithm to construct manipulation graphs based on point clouds. Feasible manipulation sequences are determined for the robot to transfer object placements. Both simulated and real-world experiments demonstrate that our approach is effective. The simulation results underscore our pipeline's capacity to generalize to novel objects in random start poses. Our predicted placements exhibit a 20% enhancement in accuracy compared to the state-of-the-art baseline. Furthermore, the robot finds feasible sequential steps in the manipulation graphs constructed by our algorithm to accomplish object reorientation manipulation.

Submitted 29 August, 2023; v1 submitted 14 May, 2022; originally announced May 2022.

arXiv:2205.06951 [pdf, other]

doi 10.1109/TASE.2022.3215562

NR-RRT: Neural Risk-Aware Near-Optimal Path Planning in Uncertain Nonconvex Environments

Authors: Fei Meng, Liangliang Chen, Han Ma, Jiankun Wang, Max Q. -H. Meng

Abstract: Balancing the trade-off between safety and efficiency is of significant importance for path planning under uncertainty. Many risk-aware path planners have been developed to explicitly limit the probability of collision to an acceptable bound in uncertain environments. However, convex obstacles or Gaussian uncertainties are usually assumed to make the problem tractable in the existing method. These assumptions limit the generalization and application of path planners in real-world implementations. In this article, we propose to apply deep learning methods to the sampling-based planner, developing a novel risk bounded near-optimal path planning algorithm named neural risk-aware RRT (NR-RRT). Specifically, a deterministic risk contours map is maintained by perceiving the probabilistic nonconvex obstacles, and a neural network sampler is proposed to predict the next most-promising safe state. Furthermore, the recursive divide-and-conquer planning and bidirectional search strategies are used to accelerate the convergence to a near-optimal solution with guaranteed bounded risk. Worst-case theoretical guarantees can also be proven owing to a standby safety guaranteed planner utilizing a uniform sampling distribution. Simulation experiments demonstrate that the proposed algorithm outperforms the state-of-the-art remarkably for finding risk bounded low-cost paths in seen and unseen environments with uncertainty and nonconvex constraints.

Submitted 13 May, 2022; originally announced May 2022.

Journal ref: IEEE Transactions on Automation Science and Engineering, 2022

arXiv:2205.06940 [pdf, other]

BiAIT*: Symmetrical Bidirectional Optimal Path Planning with Adaptive Heuristic

Authors: Chenming Li, Han Ma, Peng Xu, Jiankun Wang, Max Q. -H. Meng

Abstract: Adaptively Informed Trees (AIT*) is an algorithm that uses the problem-specific heuristic to avoid unnecessary searches, which significantly improves its performance, especially when collision checking is expensive. However, the heuristic estimation in AIT* consumes lots of computational resources, and its asymmetric bidirectional searching strategy cannot fully exploit the potential of the bidirectional method. In this article, we propose an extension of AIT* called BiAIT*. Unlike AIT*, BiAIT* uses symmetrical bidirectional search for both the heuristic and space searching. The proposed method allows BiAIT* to find the initial solution faster than AIT*, and update the heuristic with less computation when a collision occurs. We evaluated the performance of BiAIT* through simulations and experiments, and the results show that BiAIT* can find the solution faster than state-of-the-art methods. We also analyze the reasons for the different performances between BiAIT* and AIT*. Furthermore, we discuss two simple but effective modifications to fully exploit the potential of the adaptively heuristic method.

Submitted 25 May, 2023; v1 submitted 13 May, 2022; originally announced May 2022.

arXiv:2205.04847 [pdf, other]

Multi-Tree Guided Efficient Robot Motion Planning

Authors: Zhirui Sun, Jiankun Wang, Max Q. -H. Meng

Abstract: Motion Planning is necessary for robots to complete different tasks. Rapidly-exploring Random Tree (RRT) and its variants have been widely used in robot motion planning due to their fast search in state space. However, they do not perform well in many complex environments since the motion planning needs to simultaneously consider the geometry constraints and differential constraints. In this article, we propose a novel robot motion planning algorithm that utilizes multi-tree to guide the exploration and exploitation. The proposed algorithm maintains more than two trees to search the state space at first. Each tree will explore the local environment. The tree that starts from the root will gradually collect information from other trees and grow towards the goal state. This simultaneous exploration and exploitation method can quickly find a feasible trajectory. We compare the proposed algorithm with other popular motion planning algorithms. The experiment results demonstrate that our algorithm achieves the best performance on different evaluation metrics.

Submitted 17 May, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

arXiv:2205.04232 [pdf, other]

doi 10.1093/mnras/stac1305

Arecibo and FAST Timing Follow-up of twelve Millisecond Pulsars Discovered in Commensal Radio Astronomy FAST Survey

Authors: C. C. Miao, W. W. Zhu, D. Li, P. C. C. Freire, J. R. Niu, P. Wang, J. P. Yuan, M. Y. Xue, A. D. Cameron, D. J. Champion, M. Cruces, Y. T. Chen, M. M. Chi, X. F. Cheng, S. J. Dang, M. F. Ding, Y. Feng, Z. Y. Gan, G. Hobbs, M. Kramer, Z. J. Liu, Y. X. Li, Z. K. Luo, X. L. Miao, L. Q. Meng, et al. (24 additional authors not shown)

Abstract: We report the phase-connected timing ephemeris, polarization pulse profiles, Faraday rotation measurements, and Rotating-Vector-Model (RVM) fitting results of twelve millisecond pulsars (MSPs) discovered with the Five-hundred-meter Aperture Spherical radio Telescope (FAST) in the Commensal radio Astronomy FAST survey (CRAFTS). The timing campaigns were carried out with FAST and Arecibo over three years. Eleven of the twelve pulsars are in neutron star - white dwarf binary systems, with orbital periods between 2.4 and 100 d. Ten of them have spin periods, companion masses, and orbital eccentricities that are consistent with the theoretical expectations for MSP - Helium white dwarf (He WD) systems. The last binary pulsar (PSR J1912$-$0952) has a significantly smaller spin frequency and a smaller companion mass, the latter could be caused by a low orbital inclination for the system. Its orbital period of 29 days is well within the range of orbital periods where some MSP - He WD systems have shown anomalous eccentricities, however, the eccentricity of PSR J1912$-$0952 is typical of what one finds for the remaining MSP - He WD systems.

Submitted 9 May, 2022; originally announced May 2022.

Comments: 11 pages, 5 figures, MNRAS accepted

arXiv:2205.00459 [pdf, other]

Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation

Authors: Qingyan Meng, Mingqing Xiao, Shen Yan, Yisen Wang, Zhouchen Lin, Zhi-Quan Luo

Abstract: Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware. However, it is a challenge to efficiently train SNNs due to their non-differentiability. Most existing methods either suffer from high latency (i.e., long simulation time steps), or cannot achieve as high performance as Artificial Neural Networks (ANNs). In this paper, we propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance that is competitive to ANNs yet with low latency. First, we encode the spike trains into spike representation using (weighted) firing rate coding. Based on the spike representation, we systematically derive that the spiking dynamics with common neural models can be represented as some sub-differentiable mapping. With this viewpoint, our proposed DSR method trains SNNs through gradients of the mapping and avoids the common non-differentiability problem in SNN training. Then we analyze the error when representing the specific mapping with the forward computation of the SNN. To reduce such error, we propose to train the spike threshold in each layer, and to introduce a new hyperparameter for the neural models. With these components, the DSR method can achieve state-of-the-art SNN performance with low latency on both static and neuromorphic datasets, including CIFAR-10, CIFAR-100, ImageNet, and DVS-CIFAR10.

Submitted 30 March, 2023; v1 submitted 1 May, 2022; originally announced May 2022.

Comments: Accepted by CVPR 2022

arXiv:2204.06255 [pdf, other]

Neural Operator with Regularity Structure for Modeling Dynamics Driven by SPDEs

Authors: Peiyan Hu, Qi Meng, Bingguang Chen, Shiqi Gong, Yue Wang, Wei Chen, Rongchan Zhu, Zhi-Ming Ma, Tie-Yan Liu

Abstract: Stochastic partial differential equations (SPDEs) are significant tools for modeling dynamics in many areas including atmospheric sciences and physics. Neural Operators, generations of neural networks with capability of learning maps between infinite-dimensional spaces, are strong tools for solving parametric PDEs. However, they lack the ability to model SPDEs, which usually have poor regularity due to the driving noise. As the theory of regularity structure has achieved great successes in analyzing SPDEs and provides the concept model feature vectors that well-approximate SPDEs' solutions, we propose the Neural Operator with Regularity Structure (NORS) which incorporates the feature vectors for modeling dynamics driven by SPDEs. We conduct experiments on various SPDEs including the dynamic $\Phi^4_1$ model and the 2d stochastic Navier-Stokes equation, and the results demonstrate that the NORS is resolution-invariant, efficient, and achieves one order of magnitude lower error with a modest amount of data.

Submitted 17 July, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

arXiv:2204.04971 [pdf]

A new artificial photosynthetic system coupling photovoltaic electrocatalysis with photothermal catalysis

Authors: Yaguang Li, Fanqi Meng, Xianhua Bai, Dachao Yuan, Xingyuan San, Baolai Liang, Guangsheng Fu, Shufang Wang, Lin Gu, Qingbo Meng

Abstract: In this work, we present a novel artificial photosynthetic paradigm with square meter (m2) level scalable production by integrating a photovoltaic electrolytic water splitting device and a solar heating CO2 hydrogenation device, successfully achieving the synergy of 1 sun driven 19.4% solar to chemical energy efficiency (STC) for CO production (2.7 times higher than state of the art of large-sized artificial photosynthetic systems) with a low cost (equivalent to 1/7 of reported artificial photosynthetic systems). Furthermore, the outdoor artificial photosynthetic demonstration with 1.268 m2 of scale exhibits a CO generation amount of 258.4 L per day and an STC of ~15.5% for CO production in winter, which could recover the cost within 833 sunny days of operation by selling CO.

Submitted 11 April, 2022; originally announced April 2022.

Comments: 22 pages, 3 figures

arXiv:2203.14520 [pdf, other]

Optimistic Online Convex Optimization in Dynamic Environments

Authors: Qing-xin Meng, Jian-wei Liu

Abstract: In this paper, we study the optimistic online convex optimization problem in dynamic environments. Existing works have shown that Ader enjoys an $O\left(\sqrt{\left(1+P_T\right)T}\right)$ dynamic regret upper bound, where $T$ is the number of rounds, and $P_T$ is the path length of the reference strategy sequence. However, Ader is not environment-adaptive. Based on the fact that optimism provides a framework for implementing environment adaptivity, we replace Greedy Projection (GP) and Normalized Exponentiated Subgradient (NES) in Ader with Optimistic-GP and Optimistic-NES respectively, and name the corresponding algorithm ONES-OGP. We also extend the doubling trick to the adaptive trick, and introduce three characteristic terms that naturally arise from optimism, namely $M_T$, $\widetilde{M}_T$ and $V_T+1_{L^2\rho\left(\rho+2 P_T\right)\leqslant\varrho^2 V_T}D_T$, to replace the dependence of the dynamic regret upper bound on $T$. We elaborate ONES-OGP with the adaptive trick and its subgradient variation version, all of which are environment-adaptive.

Submitted 28 March, 2022; originally announced March 2022.

Comments: An early version of this manuscript can be found at https://openreview.net/forum?id=T3_cV3-zbg

arXiv:2202.12797 [pdf, other]

Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach

Authors: Shuang Qiu, Boxiang Lyu, Qinglin Meng, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan

Abstract: Dynamic mechanism design studies how mechanism designers should allocate resources among agents in a time-varying environment. We consider the problem where the agents interact with the mechanism designer according to an unknown Markov Decision Process (MDP), where agent rewards and the mechanism designer's state evolve according to an episodic MDP with unknown reward functions and transition kernels. We focus on the online setting with linear function approximation and propose novel learning algorithms to recover the dynamic Vickrey-Clarke-Groves (VCG) mechanism over multiple rounds of interaction. A key contribution of our approach is incorporating reward-free online Reinforcement Learning (RL) to aid exploration over a rich policy space to estimate prices in the dynamic VCG mechanism. We show that the regret of our proposed method is upper bounded by $\tilde{\mathcal{O}}(T^{2/3})$ and further devise a lower bound to show that our algorithm is efficient, incurring the same $\tilde{\mathcal{O}}(T^{2/3})$ regret as the lower bound, where $T$ is the total number of rounds. Our work establishes the regret guarantee for online RL in solving dynamic mechanism design problems without prior knowledge of the underlying model.

Submitted 25 February, 2024; v1 submitted 25 February, 2022; originally announced February 2022.

Comments: Minor Revision for JMLR. The first three authors contribute equally

arXiv:2202.08004 [pdf, other]

Deep Koopman Operator with Control for Nonlinear Systems

Authors: Haojie Shi, Max Q. -H. Meng

Abstract: Recently, the Koopman operator has become a promising data-driven tool to facilitate real-time control for unknown nonlinear systems. It maps nonlinear systems into equivalent linear systems in embedding space, ready for real-time linear control methods. However, designing an appropriate Koopman embedding function remains a challenging task. Furthermore, most Koopman-based algorithms only consider nonlinear systems with linear control input, resulting in poor prediction and control performance when the system is fully nonlinear with the control input. In this work, we propose an end-to-end deep learning framework to learn the Koopman embedding function and Koopman Operator together to alleviate such difficulties. We first parameterize the embedding function and Koopman Operator with the neural network and train them end-to-end with the K-steps loss function. Then, an auxiliary control network is augmented to encode the nonlinear state-dependent control term to model the nonlinearity in the control input. This encoded term is considered the new control variable instead to ensure linearity of the modeled system in the embedding system. We next deploy Linear Quadratic Regulator (LQR) on the linear embedding space to derive the optimal control policy and decode the actual control input from the control net. Experimental results demonstrate that our approach outperforms other existing methods, reducing the prediction error by an order of magnitude and achieving superior control performance in several nonlinear dynamic systems like damping pendulum, CartPole, and the seven DOF robotic manipulator.

Submitted 15 June, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

arXiv:2202.07316 [pdf, ps, other]

Optimization Conditions and Decomposable Algorithms for Convertible Nonconvex Optimization

Authors: M. Jiang, R. Shen, Z. Q. Meng, C. Y. Dang

Abstract: This paper defines a convertible nonconvex function (CN function for short) and a weak (strong) uniform (decomposable, exact) CN function, proves the optimization conditions for their global solutions and proposes algorithms for solving the unconstrained optimization problems with the decomposable CN function. First, to illustrate the fact that some nonconvex functions, nonsmooth or discontinuous, are actually weak uniform CN functions, examples are given. The operational properties of the CN functions are proved, including addition, subtraction, multiplication, division and compound operations. Second, optimization conditions of the global optimal solution to unconstrained optimization with a weak uniform CN function are proved. Based on the unconstrained optimization problem with the decomposable CN function, a decomposable algorithm is proposed by its augmented Lagrangian penalty function and its convergence is proved. Numerical results show that an approximate global optimal solution to unconstrained optimization with a CN function may be obtained by the decomposable algorithms. The decomposable algorithm can effectively reduce the scale in solving the unconstrained optimization problem with the decomposable CN function. This paper provides a new idea for solving unconstrained nonconvex optimization problems.

Submitted 15 February, 2022; originally announced February 2022.

MSC Class: 90C06; 90C25; 90C26; 90C59

arXiv:2202.05430 [pdf]

Wind power ramp prediction algorithm based on wavelet deep belief network

Authors: Zhenhao Tang, Qingyu Meng, Shengxian Cao, Yang Li, Zhongha Mu, Xiaoya Pang

Abstract: Wind power ramp events significantly threaten power grid safety. To improve the ramp prediction accuracy, a hybrid wavelet deep belief network algorithm with adaptive feature selection (WDBNAFS) is proposed. First, the wind power characteristic is analyzed. Then, wavelet decomposition is applied to the time series, and an adaptive feature selection algorithm is proposed to select the inputs of the prediction model. Finally, a deep belief network is employed to predict the wind power ramp event, and the proposed WDBNAFS is validated in experiments based on practical data. The simulation results demonstrate that the prediction accuracy of the proposed algorithm is more than 90%.

Submitted 10 February, 2022; originally announced February 2022.

Comments: in Chinese language

Journal ref: ACTA Energiae Solaris Sinica 40 (2019) 3213-3220

arXiv:2202.05118 [pdf, other]

Reinforcement Learning in the Wild: Scalable RL Dispatching Algorithm Deployed in Ridehailing Marketplace

Authors: Soheil Sadeghi Eshkevari, Xiaocheng Tang, Zhiwei Qin, Jinhan Mei, Cheng Zhang, Qianying Meng, Jia Xu

Abstract: In this study, a real-time dispatching algorithm based on reinforcement learning is proposed and, for the first time, deployed at large scale. Current dispatching methods in ridehailing platforms are dominantly based on myopic or rule-based non-myopic approaches. Reinforcement learning enables dispatching policies that are informed of historical data and able to employ the learned information to optimize returns of expected future trajectories. Previous studies in this field yielded promising results, yet have left room for further improvements in terms of performance gain, self-dependency, transferability, and scalable deployment mechanisms. The present study proposes a standalone RL-based dispatching solution that is equipped with multiple mechanisms to ensure robust and efficient on-policy learning and inference while being adaptable for full-scale deployment. A new form of value updating based on temporal difference is proposed that is more adapted to the inherent uncertainty of the problem. For the driver-order assignment, a customized utility function is proposed that, when tuned based on the statistics of the market, results in remarkable performance improvement and interpretability. In addition, for reducing the risk of cancellation after drivers' assignment, an adaptive graph pruning strategy based on the multi-arm bandit problem is introduced. The method is evaluated using offline simulation with real data and yields notable performance improvement. In addition, the algorithm is deployed online in multiple cities under DiDi's operation for A/B testing and is launched in one of the major international markets as the primary mode of dispatch. The deployed algorithm shows over 1.3% improvement in total driver income from A/B testing. In addition, by causal inference analysis, as much as 5.3% improvement in major performance metrics is detected after full-scale deployment.

Submitted 10 February, 2022; originally announced February 2022.

Comments: submitted to KDD'22

arXiv:2202.04897 [pdf, other]

InterHT: Knowledge Graph Embeddings by Interaction between Head and Tail Entities

Authors: Baoxin Wang, Qingye Meng, Ziyue Wang, Honghong Zhao, Dayong Wu, Wanxiang Che, Shijin Wang, Zhigang Chen, Cong Liu

Abstract: Knowledge graph embedding (KGE) models learn the representation of entities and relations in knowledge graphs. Distance-based methods show promising performance on the link prediction task, which predicts the result by the distance between two entity representations. However, most of these methods represent the head entity and tail entity separately, which limits the model capacity. We propose two novel distance-based methods named InterHT and InterHT+ that allow the head and tail entities to interact better and get better entity representation. Experimental results show that our proposed method achieves the best results on the ogbl-wikikg2 dataset.

Submitted 23 December, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

arXiv:2201.12467 [pdf, other]

Improving Federated Learning Face Recognition via Privacy-Agnostic Clusters

Authors: Qiang Meng, Feng Zhou, Hainan Ren, Tianshu Feng, Guochao Liu, Yuanqing Lin

Abstract: The growing public concerns on data privacy in face recognition can be greatly addressed by the federated learning (FL) paradigm. However, conventional FL methods perform poorly due to the uniqueness of the task: broadcasting class centers among clients is crucial for recognition performances but leads to privacy leakage. To resolve the privacy-utility paradox, this work proposes PrivacyFace, a framework that largely improves federated learning face recognition by communicating auxiliary and privacy-agnostic information among clients. PrivacyFace mainly consists of two components: First, a practical Differentially Private Local Clustering (DPLC) mechanism is proposed to distill sanitized clusters from local class centers. Second, a consensus-aware recognition loss subsequently encourages global consensuses among clients, which ergo results in more discriminative features. The proposed framework is mathematically proved to be differentially private, introducing a lightweight overhead as well as yielding prominent performance boosts (e.g., +9.63% and +10.26% for TAR@FAR=1e-4 on IJB-B and IJB-C respectively). Extensive experiments and ablation studies on a large-scale dataset have demonstrated the efficacy and practicability of our method.

Submitted 28 January, 2022; originally announced January 2022.

Comments: ICLR2022, Spotlight

arXiv:2201.10255 [pdf, other]

Enhanced Global Optimization with Parallel Global and Local Structures

Authors: Haowei Wang, Songhao Wang, Qun Meng, Szu Hui Ng

Abstract: In practice, objective functions of real-time control systems can have multiple local minimums or can dramatically change over the function space, making them hard to optimize. To efficiently optimize such systems, in this paper, we develop a parallel global optimization framework that combines direct search methods with Bayesian parallel optimization. It consists of an iterative global and local search that searches broadly through the entire global space for promising regions and then efficiently exploits each local promising region. We prove the asymptotic convergence properties of the proposed framework and conduct an extensive numerical comparison to illustrate its empirical performance.

Submitted 25 January, 2022; originally announced January 2022.

arXiv:2201.09308 [pdf, other]

Basket-based Softmax

Authors: Qiang Meng, Xinqian Gu, Xiaqing Xu, Feng Zhou

Abstract: Softmax-based losses have achieved state-of-the-art performances on various tasks such as face recognition and re-identification. However, these methods rely heavily on clean datasets with global labels, which limits their usage in many real-world applications. An important reason is that merging and organizing datasets from various temporal and spatial scenarios is usually not realistic, as noisy labels can be introduced and exponentially increasing resources are required. To address this issue, we propose a novel mining-during-training strategy called Basket-based Softmax (BBS) as well as its parallel version to effectively train models on multiple datasets in an end-to-end fashion. Specifically, for each training sample, we simultaneously adopt similarity scores as the clue to mine negative classes from other datasets, and dynamically add them to assist the learning of discriminative features. Experimentally, we demonstrate the efficiency and superiority of the BBS on the tasks of face recognition and re-identification, with both simulated and real-world datasets.

Submitted 23 January, 2022; originally announced January 2022.

arXiv:2201.08187 [pdf, other]

doi 10.1007/JHEP07(2022)030

An Efficient Lorentz Equivariant Graph Neural Network for Jet Tagging

Authors: Shiqi Gong, Qi Meng, Jue Zhang, Huilin Qu, Congqiao Li, Sitian Qian, Weitao Du, Zhi-Ming Ma, Tie-Yan Liu

Abstract: Deep learning methods have been increasingly adopted to study jets in particle physics. Since symmetry-preserving behavior has been shown to be an important factor for improving the performance of deep learning in many applications, Lorentz group equivariance - a fundamental spacetime symmetry for elementary particles - has recently been incorporated into a deep learning model for jet tagging. However, the design is computationally costly due to the analytic construction of high-order tensors. In this article, we introduce LorentzNet, a new symmetry-preserving deep learning model for jet tagging. The message passing of LorentzNet relies on an efficient Minkowski dot product attention. Experiments on two representative jet tagging benchmarks show that LorentzNet achieves the best tagging performance and improves significantly over existing state-of-the-art algorithms. The preservation of Lorentz symmetry also greatly improves the efficiency and generalization power of the model, allowing LorentzNet to reach highly competitive performance when trained on only a few thousand jets. Code and models are available at https://github.com/sdogsq/LorentzNet-release.

Submitted 8 November, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

Comments: 22 pages, 3 figures, and 7 tables

Journal ref: Journal of High Energy Physics 2022 (3), 1-22

arXiv:2112.12134 [pdf, other]

A Unified Analysis Method for Online Optimization in Normed Vector Space

Authors: Qing-xin Meng, Jian-wei Liu

Abstract: This paper studies online optimization from a high-level unified theoretical perspective. We not only generalize both Optimistic-DA and Optimistic-MD in normed vector space, but also unify their analysis methods for dynamic regret. Regret bounds are the tightest possible due to the introduction of $\phi$-convexity. As instantiations, regret bounds of normalized exponentiated subgradient and greedy/lazy projection are better than the currently known optimal results. By replacing the losses of the online game with monotone operators, and extending the definition of regret, namely regret$^n$, we extend online convex optimization to online monotone optimization.

Submitted 12 February, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

Comments: 29 pages. Streamlining and restructuring

arXiv:2112.08106 [pdf, other]

doi 10.1109/TASE.2022.3191519

Enhance Connectivity of Promising Regions for Sampling-based Path Planning

Authors: Han Ma, Chenming Li, Jianbang Liu, Jiankun Wang, Max Q. -H. Meng

Abstract: Sampling-based path planning algorithms usually implement uniform sampling methods to search the state space. However, uniform sampling may lead to unnecessary exploration in many scenarios, such as the environment with a few dead ends. Our previous work proposes to use the promising region to guide the sampling process to address the issue. However, the predicted promising regions are often disconnected, which means they cannot connect the start and goal state, resulting in a lack of probabilistic completeness. This work focuses on enhancing the connectivity of predicted promising regions. Our proposed method regresses the connectivity probability of the edges in the x and y directions. In addition, it calculates the weight of the promising edges in loss to guide the neural network to pay more attention to the connectivity of the promising regions. We conduct a series of simulation experiments, and the results show that the connectivity of promising regions improves significantly. Furthermore, we analyze the effect of connectivity on sampling-based path planning algorithms and conclude that connectivity plays an essential role in maintaining algorithm performance.

Submitted 22 July, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

Comments: Accepted in Transactions on Automation Science and Engineering, 2022

arXiv:2112.01034 [pdf, other]

Leveraging Human Selective Attention for Medical Image Analysis with Limited Training Data

Authors: Yifei Huang, Xiaoxiao Li, Lijin Yang, Lin Gu, Yingying Zhu, Hirofumi Seo, Qiuming Meng, Tatsuya Harada, Yoichi Sato

Abstract: The human gaze is a cost-efficient source of physiological data that reveals underlying human attentional patterns. The selective attention mechanism helps the cognition system focus on task-relevant visual clues by ignoring the presence of distractors. Thanks to this ability, human beings can efficiently learn from a very limited number of training samples. Inspired by this mechanism, we aim to leverage gaze for medical image analysis tasks with small training data. Our proposed framework includes a backbone encoder and a Selective Attention Network (SAN) that simulates the underlying attention. The SAN implicitly encodes information, such as suspicious regions, that is relevant to the medical diagnosis tasks by estimating the actual human gaze. Then we design a novel Auxiliary Attention Block (AAB) to allow information from SAN to be utilized by the backbone encoder to focus on selective areas. Specifically, this block uses a modified version of a multi-head attention layer to simulate the human visual search procedure. Note that the SAN and AAB can be plugged into different backbones, and the framework can be used for multiple medical image analysis tasks when equipped with task-specific heads. Our method is demonstrated to achieve superior performance on both 3D tumor segmentation and 2D chest X-ray classification tasks. We also show that the estimated gaze probability map of the SAN is consistent with an actual gaze fixation map obtained by board-certified doctors.

Submitted 2 December, 2021; originally announced December 2021.

Comments: BMVC 2021

arXiv:2111.12288 [pdf, ps, other]

Stable determination of an elastic medium scatterer by a single far-field measurement and beyond

Authors: Zhengjian Bai, Huaian Diao, Hongyu Liu, Qingle Meng

Abstract: We are concerned with the time-harmonic elastic scattering due to an inhomogeneous elastic material inclusion located inside a uniformly homogeneous isotropic medium. We establish a sharp stability estimate of logarithmic type in determining the support of the elastic scatterer, independent of its material content, by a single far-field measurement when the support is a convex polyhedral domain in $\mathbb{R}^n$, $n=2,3$. Our argument in establishing the stability result is localized around a corner of the medium scatterer. This enables us to further establish a byproduct result by proving that if a generic medium scatterer, not necessarily of polyhedral shape, possesses a corner, then there exists a positive lower bound of the scattered far-field patterns. The latter result indicates that if an elastic material object possesses a corner on its support, then it scatters every incident wave stably and the invisibility phenomenon does not occur.

Submitted 24 November, 2021; originally announced November 2021.

arXiv:2111.10262 [pdf, other]

Residual fourier neural operator for thermochemical curing of composites

Authors: Gengxiang Chen, Yingguang Li, Xu Liu, Qinglu Meng, Jing Zhou, Xiaozhong Hao

Abstract: During the curing process of composites, the temperature history heavily determines the evolution of the degree-of-cure field as well as the residual stress, which will further influence the mechanical properties of the composite. It is therefore important to simulate the real temperature history to optimize the curing process of composites. Since thermochemical analysis using Finite Element (FE) simulations requires heavy computational loads and data-driven approaches suffer from the complexity of high-dimensional mapping, this paper proposes a Residual Fourier Neural Operator (ResFNO) to establish the direct high-dimensional mapping from any given cure cycle to the corresponding temperature histories. By integrating domain knowledge into a time-resolution independent parameterized neural network, the mapping from cure cycles to temperature histories can be learned using a limited amount of labelled data. Besides, a novel Fourier residual mapping is designed based on mode decomposition to accelerate the training and boost the performance significantly. Several cases are carried out to evaluate the superior performance and generalizability of the proposed method comprehensively.

Submitted 15 November, 2021; originally announced November 2021.

Comments: 15 pages, 16 figures

arXiv:2111.03235 [pdf]

RASEC: Rescaling Acquisition Strategy with Energy Constraints under SE-OU Fusion Kernel for Active Trachea Palpation and Incision Recommendation in Laryngeal Region

Authors: Wenchao Yue, Fan Bai, Jianbang Liu, Feng Ju, Max Q-H Meng, Chwee Ming Lim, Hongliang Ren

Abstract: A novel palpation-based incision detection strategy in the laryngeal region, potentially for robotic tracheotomy, is proposed in this letter. A tactile sensor is introduced to measure tissue hardness in the specific laryngeal region by gentle contact. The kernel fusion method is proposed to combine the Squared Exponential (SE) kernel with the Ornstein-Uhlenbeck (OU) kernel to address the drawback that the existing kernel functions are not sufficiently optimal in this scenario. Moreover, we further regularize the exploration factor and greed factor, and the tactile sensor's moving distance and the robotic base link's rotation angle during the incision localization process are considered as new factors in the acquisition strategy. We conducted simulation and physical experiments to compare the newly proposed algorithm - Rescaling Acquisition Strategy with Energy Constraints (RASEC) in trachea detection with current palpation-based acquisition strategies. The result indicates that the proposed acquisition strategy with fusion kernel can successfully localize the incision with the highest algorithm performance (Average Precision 0.932, Average Recall 0.973, Average F1 score 0.952). During the robotic palpation process, the cumulative moving distance is reduced by 50%, and the cumulative rotation angle is reduced by 71.4% with no sacrifice in the comprehensive performance capabilities. Therefore, it proves that RASEC can efficiently suggest the incision zone in the laryngeal region and greatly reduce the energy loss.

Submitted 19 March, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

Comments: Submitted to RA-L

arXiv:2111.02167 [pdf, other]

doi 10.1109/TMRB.2021.3127015

Image-Guided Navigation of a Robotic Ultrasound Probe for Autonomous Spinal Sonography Using a Shadow-aware Dual-Agent Framework

Authors: Keyu Li, Yangxin Xu, Jian Wang, Dong Ni, Li Liu, Max Q. -H. Meng

Abstract: Ultrasound (US) imaging is commonly used to assist in the diagnosis and interventions of spine diseases, while the standardized US acquisitions performed by manually operating the probe require substantial experience and training of sonographers. In this work, we propose a novel dual-agent framework that integrates a reinforcement learning (RL) agent and a deep learning (DL) agent to jointly determine the movement of the US probe based on the real-time US images, in order to mimic the decision-making process of an expert sonographer to achieve autonomous standard view acquisitions in spinal sonography. Moreover, inspired by the nature of US propagation and the characteristics of the spinal anatomy, we introduce a view-specific acoustic shadow reward to utilize the shadow information to implicitly guide the navigation of the probe toward different standard views of the spine. Our method is validated in both quantitative and qualitative experiments in a simulation environment built with US data acquired from 17 volunteers. The average navigation accuracy toward different standard views achieves 5.18mm/5.25deg and 12.87mm/17.49deg in the intra- and inter-subject settings, respectively. The results demonstrate that our method can effectively interpret the US images and navigate the probe to acquire multiple standard views of the spine.

Submitted 10 November, 2021; v1 submitted 3 November, 2021; originally announced November 2021.

Comments: Accepted by IEEE Transactions on Medical Robotics and Bionics. Copyright may be transferred without notice, after which this version may no longer be accessible

Journal ref: IEEE Transactions on Medical Robotics and Bionics (2021)

arXiv:2111.01977 [pdf, other]

Autonomous Magnetic Navigation Framework for Active Wireless Capsule Endoscopy Inspired by Conventional Colonoscopy Procedures

Authors: Yangxin Xu, Keyu Li, Ziqi Zhao, Max Q. -H. Meng

Abstract: In recent years, simultaneous magnetic actuation and localization (SMAL) for active wireless capsule endoscopy (WCE) has been intensively studied to improve the efficiency and accuracy of the examination. In this paper, we propose an autonomous magnetic navigation framework for active WCE that mimics the "insertion" and "withdrawal" procedures performed by an expert physician in conventional colonoscopy, thereby enabling efficient and accurate navigation of a robotic capsule endoscope in the intestine with minimal user effort. First, the capsule is automatically propelled through the unknown intestinal environment and generate a viable path to represent the environment. Then, the capsule is autonomously navigated towards any point selected on the intestinal trajectory to allow accurate and repeated inspections of suspicious lesions. Moreover, we implement the navigation framework on a robotic system incorporated with advanced SMAL algorithms, and validate it in the navigation in various tubular environments using phantoms and an ex-vivo pig colon. Our results demonstrate that the proposed autonomous navigation framework can effectively navigate the capsule in unknown, complex tubular environments with a satisfactory accuracy, repeatability and efficiency compared with manual operation.

Submitted 2 November, 2021; originally announced November 2021.

arXiv:2111.00419 [pdf, other]

Interpreting Deep Knowledge Tracing Model on EdNet Dataset

Authors: Deliang Wang, Yu Lu, Qinggang Meng, Penghe Chen

Abstract: With more deep learning techniques being introduced into the knowledge tracing domain, the interpretability issue of the knowledge tracing models has aroused researchers' attention. Our previous study (Lu et al. 2020) on building and interpreting the KT model mainly adopts the ASSISTment dataset (Feng, Heffernan, and Koedinger 2009), whose size is relatively small. In this work, we perform the similar tasks but on a large and newly available dataset, called EdNet (Choi et al. 2020). The preliminary experiment results show the effectiveness of the interpreting techniques, while more questions and tasks are worthy to be further explored and accomplished.

Submitted 31 October, 2021; originally announced November 2021.

Comments: This paper has been accepted and presented in AAAI 2021 Workshop on AI Education

arXiv:2111.00383 [pdf, other]

Relevant Region Sampling Strategy with Adaptive Heuristic for Asymptotically Optimal Path Planning

Authors: Chenming Li, Fei Meng, Han Ma, Jiankun Wang, Max Q. -H. Meng

Abstract: Sampling-based planning algorithm is a powerful tool for solving planning problems in high-dimensional state spaces. In this article, we present a novel approach to sampling in the most promising regions, which significantly reduces planning time-consumption. The RRT# algorithm defines the Relevant Region based on the cost-to-come provided by the optimal forward-searching tree. However, it uses the cumulative cost of a direct connection between the current state and the goal state as the cost-to-go. To improve the path planning efficiency, we propose a batch sampling method that samples in a refined Relevant Region with a direct sampling strategy, which is defined according to the optimal cost-to-come and the adaptive cost-to-go, taking advantage of various sources of heuristic information. The proposed sampling approach allows the algorithm to build the search tree in the direction of the most promising area, resulting in a superior initial solution quality and reducing the overall computation time compared to related work. To validate the effectiveness of our method, we conducted several simulations in both $SE(2)$ and $SE(3)$ state spaces. And the simulation results demonstrate the superiorities of proposed algorithm.

Submitted 25 May, 2023; v1 submitted 30 October, 2021; originally announced November 2021.

arXiv:2110.14811 [pdf, other]

SE(3) Equivariant Graph Neural Networks with Complete Local Frames

Authors: Weitao Du, He Zhang, Yuanqi Du, Qi Meng, Wei Chen, Bin Shao, Tie-Yan Liu

Abstract: Group equivariance (e.g. SE(3) equivariance) is a critical physical symmetry in science, from classical and quantum physics to computational biology. It enables robust and accurate prediction under arbitrary reference transformations. In light of this, great efforts have been put on encoding this symmetry into deep neural networks, which has been shown to improve the generalization performance and data efficiency for downstream tasks. Constructing an equivariant neural network generally brings high computational costs to ensure expressiveness. Therefore, how to better trade-off the expressiveness and computational efficiency plays a core role in the design of the equivariant deep learning models. In this paper, we propose a framework to construct SE(3) equivariant graph neural networks that can approximate the geometric quantities efficiently. Inspired by differential geometry and physics, we introduce equivariant local complete frames to graph neural networks, such that tensor information at given orders can be projected onto the frames. The local frame is constructed to form an orthonormal basis that avoids direction degeneration and ensure completeness. Since the frames are built only by cross product operations, our method is computationally efficient. We evaluate our method on two tasks: Newton mechanics modeling and equilibrium molecule conformation generation. Extensive experimental results demonstrate that our model achieves the best or competitive performance in two types of datasets.

Submitted 5 July, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

Comments: ICML 2022 accepted

arXiv:2110.13750 [pdf, other]

Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD

Authors: Bohan Wang, Huishuai Zhang, Jieyu Zhang, Qi Meng, Wei Chen, Tie-Yan Liu

Abstract: Recently, the information-theoretical framework has been proven to be able to obtain non-vacuous generalization bounds for large models trained by Stochastic Gradient Langevin Dynamics (SGLD) with isotropic noise. In this paper, we optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD. We prove that with constraint to guarantee low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance if both the prior and the posterior are jointly optimized. This validates that the optimal noise is quite close to the empirical gradient covariance. Technically, we develop a new information-theoretical bound that enables such an optimization analysis. We then apply matrix analysis to derive the form of optimal noise covariance. Presented constraint and results are validated by the empirical observations.

Submitted 2 November, 2021; v1 submitted 26 October, 2021; originally announced October 2021.

Comments: Accepted by Neurips 2021

arXiv:2110.10910 [pdf, ps, other]

$L^p$ estimations of fully coupled FBSDEs

Authors: Qingxin Meng, Shuzhen Yang

Abstract: In this study, for any given terminal time $T$, we establish an $L^p$ ($p>2$) estimations of fully coupled FBSDEs based on the $L^2$ estimations. Yong [24] proposed that a natural question is whether an adapted $L^2$-solution is an adapted $L^p$ solution for some $p>2$. In this study, we give a positive answer to this question. For any given terminal time $T$, based on an observation of the relation between $L^2$ and $L^p$ estimations of FBSDEs, we prove that a unique $L^2$-solution of fully coupled FBSDEs is an $L^p$-solution under standard conditions on the coefficients. Furthermore, we show that the fully coupled FBSDEs developed in the linear quadratic optimal control problem or investigated by the "decoupling random field" method admit a unique $L^p$-solution.

Submitted 29 October, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

Comments: 20 pages

MSC Class: 60H10; 49N05; 93E20

arXiv:2110.10436 [pdf, other]

A Survey on Deep-Learning Approaches for Vehicle Trajectory Prediction in Autonomous Driving

Authors: Jianbang Liu, Xinyu Mao, Yuqi Fang, Delong Zhu, Max Q. -H. Meng

Abstract: With the rapid development of machine learning, autonomous driving has become a hot issue, making urgent demands for more intelligent perception and planning systems. Self-driving cars can avoid traffic crashes with precisely predicted future trajectories of surrounding vehicles. In this work, we review and categorize existing learning-based trajectory forecasting methods from perspectives of representation, modeling, and learning. Moreover, we make our implementation of Target-driveN Trajectory Prediction publicly available at https://github.com/Henry1iu/TNT-Trajectory-Predition, demonstrating its outstanding performance whereas its original codes are withheld. Enlightenment is expected for researchers seeking to improve trajectory prediction performance based on the achievement we have made.

Submitted 28 October, 2021; v1 submitted 20 October, 2021; originally announced October 2021.

Comments: Accepted by ROBIO2021

arXiv:2110.10041 [pdf, other]

Learning-based Fast Path Planning in Complex Environments

Authors: Jianbang Liu, Baopu Li, Tingguang Li, Wenzheng Chi, Jiankun Wang, Max Q. -H. Meng

Abstract: In this paper, we present a novel path planning algorithm to achieve fast path planning in complex environments. Most existing path planning algorithms are difficult to quickly find a feasible path in complex environments or even fail. However, our proposed framework can overcome this difficulty by using a learning-based prediction module and a sampling-based path planning module. The prediction module utilizes an auto-encoder-decoder-like convolutional neural network (CNN) to output a promising region where the feasible path probably lies in. In this process, the environment is treated as an RGB image to feed in our designed CNN module, and the output is also an RGB image. No extra computation is required so that we can maintain a high processing speed of 60 frames-per-second (FPS). Incorporated with a sampling-based path planner, we can extract a feasible path from the output image so that the robot can track it from start to goal. To demonstrate the advantage of the proposed algorithm, we compare it with conventional path planning algorithms in a series of simulation experiments. The results reveal that the proposed algorithm can achieve much better performance in terms of planning time, success rate, and path length.

Submitted 19 October, 2021; originally announced October 2021.

Comments: Accepted by ROBIO2021

arXiv:2110.06648 [pdf, other]

Robotic Autonomous Trolley Collection with Progressive Perception and Nonlinear Model Predictive Control

Authors: Anxing Xiao, Hao Luan, Ziqi Zhao, Yue Hong, Jieting Zhao, Weinan Chen, Jiankun Wang, Max Q. -H. Meng

Abstract: Autonomous mobile manipulation robots that can collect trolleys are widely used to liberate human resources and fight epidemics. Most prior robotic trolley collection solutions only detect trolleys with 2D poses or are merely based on specific marks and lack the formal design of planning algorithms. In this paper, we present a novel mobile manipulation system with applications in luggage trolley collection. The proposed system integrates a compact hardware design and a progressive perception and planning framework, enabling the system to efficiently and robustly collect trolleys in dynamic and complex environments. For the perception, we first develop a 3D trolley detection method that combines object detection and keypoint estimation. Then, a docking process in a short distance is achieved with an accurate point cloud plane detection method and a novel manipulator design. On the planning side, we formulate the robot's motion planning under a nonlinear model predictive control framework with control barrier functions to improve obstacle avoidance capabilities while maintaining the target in the sensors' field of view at close distances. We demonstrate our design and framework by deploying the system on actual trolley collection tasks, and their effectiveness and robustness are experimentally validated.

Submitted 1 March, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

Comments: Accepted to the 2022 International Conference on Robotics and Automation (ICRA 2022)

arXiv:2110.04564 [pdf, other]

Human-Aware Robot Navigation via Reinforcement Learning with Hindsight Experience Replay and Curriculum Learning

Authors: Keyu Li, Ye Lu, Max Q. -H. Meng

Abstract: In recent years, the growing demand for more intelligent service robots is pushing the development of mobile robot navigation algorithms to allow safe and efficient operation in a dense crowd. Reinforcement learning (RL) approaches have shown superior ability in solving sequential decision making problems, and recent work has explored its potential to learn navigation polices in a socially compliant manner. However, the expert demonstration data used in existing methods is usually expensive and difficult to obtain. In this work, we consider the task of training an RL agent without employing the demonstration data, to achieve efficient and collision-free navigation in a crowded environment. To address the sparse reward navigation problem, we propose to incorporate the hindsight experience replay (HER) and curriculum learning (CL) techniques with RL to efficiently learn the optimal navigation policy in the dense crowd. The effectiveness of our method is validated in a simulated crowd-robot coexisting environment. The results demonstrate that our method can effectively learn human-aware navigation without requiring additional demonstration data.

Submitted 9 October, 2021; originally announced October 2021.

Comments: Accepted at ROBIO 2021. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2110.04563 [pdf, other]

Automatic Recognition of Abdominal Organs in Ultrasound Images based on Deep Neural Networks and K-Nearest-Neighbor Classification

Authors: Keyu Li, Yangxin Xu, Max Q. -H. Meng

Abstract: Abdominal ultrasound imaging has been widely used to assist in the diagnosis and treatment of various abdominal organs. In order to shorten the examination time and reduce the cognitive burden on the sonographers, we present a classification method that combines the deep learning techniques and k-Nearest-Neighbor (k-NN) classification to automatically recognize various abdominal organs in the ultrasound images in real time. Fine-tuned deep neural networks are used in combination with PCA dimension reduction to extract high-level features from raw ultrasound images, and a k-NN classifier is employed to predict the abdominal organ in the image. We demonstrate the effectiveness of our method in the task of ultrasound image classification to automatically recognize six abdominal organs. A comprehensive comparison of different configurations is conducted to study the influence of different feature extractors and classifiers on the classification accuracy. Both quantitative and qualitative results show that with minimal training effort, our method can "lazily" recognize the abdominal organs in the ultrasound images in real time with an accuracy of 96.67%. Our implementation code is publicly available at: https://github.com/LeeKeyu/abdominal_ultrasound_classification.

Submitted 9 October, 2021; originally announced October 2021.

Comments: Accepted at ROBIO 2021. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2110.03891 [pdf, other]

Does Momentum Change the Implicit Regularization on Separable Data?

Authors: Bohan Wang, Qi Meng, Huishuai Zhang, Ruoyu Sun, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu

Abstract: The momentum acceleration technique is widely adopted in many optimization algorithms. However, there is no theoretical answer on how the momentum affects the generalization performance of the optimization algorithms. This paper studies this problem by analyzing the implicit regularization of momentum-based optimization. We prove that on the linear classification problem with separable data and exponential-tailed loss, gradient descent with momentum (GDM) converges to the L2 max-margin solution, which is the same as vanilla gradient descent. That means gradient descent with momentum acceleration still converges to a low-complexity model, which guarantees their generalization. We then analyze the stochastic and adaptive variants of GDM (i.e., SGDM and deterministic Adam) and show they also converge to the L2 max-margin solution. Technically, to overcome the difficulty of the error accumulation in analyzing the momentum, we construct new potential functions to analyze the gap between the model parameter and the max-margin solution. Numerical experiments are conducted and support our theoretical results.

Submitted 27 May, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

arXiv:2109.14247 [pdf, other]

Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State

Authors: Mingqing Xiao, Qingyan Meng, Zongpeng Zhang, Yisen Wang, Zhouchen Lin

Abstract: Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware. However, the supervised training of SNNs remains a hard problem due to the discontinuity of the spiking neuron model. Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks, and use surrogate derivatives or compute gradients with respect to the spiking time to deal with the problem. These approaches either accumulate approximation errors or only propagate information limitedly through existing spikes, and usually require information propagation along time steps with large memory costs and biological implausibility. In this work, we consider feedback spiking neural networks, which are more brain-like, and propose a novel training method that does not rely on the exact reverse of the forward computation. First, we show that the average firing rates of SNNs with feedback connections would gradually evolve to an equilibrium state along time, which follows a fixed-point equation. Then by viewing the forward computation of feedback SNNs as a black-box solver for this equation, and leveraging the implicit differentiation on the equation, we can compute the gradient for parameters without considering the exact forward procedure. In this way, the forward and backward procedures are decoupled and therefore the problem of non-differentiable spiking functions is avoided. We also briefly discuss the biological plausibility of implicit differentiation, which only requires computing another equilibrium. Extensive experiments on MNIST, Fashion-MNIST, N-MNIST, CIFAR-10, and CIFAR-100 demonstrate the superior performance of our method for feedback models with fewer neurons and parameters in a small number of time steps. Our code is available at https://github.com/pkuxmq/IDE-FSNN.

Submitted 17 December, 2021; v1 submitted 29 September, 2021; originally announced September 2021.

Comments: Accepted by NeurIPS 2021 (Spotlight)

arXiv:2109.12552 [pdf]

Theoretical Chemistry Course for Students in Chemistry

Authors: Qingyong Meng

Abstract: In this work, the teaching content of a theoretical-chemistry (TC) course is reformed, establishing a theoretical contents from micro- to macro-system, and comprehensively introducing the theory of chemical reaction to undergraduate students in chemistry. In order to develop such TC course based on the general physical-chemistry course, we focus on the last-mile problem between the physics and chemistry courses to train the critical thinking of undergraduate students in chemistry. To clearly show this, a reduction scheme of polymer molecular dynamics was discussed as an example, which shows a different theoretical content in polymer chemistry. Moreover, we propose a series of experiences and dependent measures that can provide information regarding students' levels of knowledge and understanding. This assessment quiz was designed to test students on the fundamental concepts and applications of TC, such as dynamics, statistical ensemble, kinetics, and so on. From the actual teaching for 36 students, it was found that these students performed significantly improvement from the present TC content. Further analysis of each individual question revealed that approximately two-third of the students learn new knowledge. Although the present TC course might be considered to be a certain degree of difficulty for chemists, these analyses show that students can effectively accept these complicated concepts.

Submitted 26 September, 2021; originally announced September 2021.

arXiv:2109.11122 [pdf, other]

doi 10.1103/PhysRevB.106.L020402

The free energy of twisting spins in Mn$_3$Sn

Authors: Xiaokang Li, Shan Jiang, Qingkai Meng, Huakun Zuo, Zengwei Zhu, Leon Balents, Kamran Behnia

Abstract: The magnetic free energy is usually quadratic in magnetic field and depends on the mutual orientation of the magnetic field and the crystalline axes. Tiny in magnitude, this magnetocrystalline anisotropy energy (MAE) is nevertheless indispensable for the existence of permanent magnets. Here, we show that in Mn$_3$Sn, a non-collinear antiferromagnet attracting much attention following the discovery of its large anomalous Hall effect, the free energy of spins has superquadratic components, which drive the MAE. We experimentally demonstrate that the thermodynamic free energy includes terms odd in magnetic field ($\mathcal{O}(H^3)+\mathcal{O}(H^5)$) and generating sixfold and twelve-fold angular oscillations in the torque response. We show that they are quantitatively explained by theory, which can be used to quantify relevant energy scales (Heisenberg, Dzyaloshinskii-Moriya, Zeeman and single-ion anisotropy) of the system. Based on the theory, we conclude that, in contrast with common magnets, what drives the MAE in Mn$_3$Sn is the field-induced deformation of the spin texture.

Submitted 22 September, 2021; originally announced September 2021.

Comments: 6 pages, 5 figures, Supplemental Material is included

Journal ref: Physical Review B 106, L020402 (2022)

Showing 101–150 of 306 results for author: Meng, Q