Symmetry engineering in 2D bioelectronics facilitating augmented biosensing interfaces

Authors: Yizhang Wu, Yihan Liu, Yuan Li, Ziquan Wei, Sicheng Xing, Yunlang Wang, Dashuai Zhu, Ziheng Guo, Anran Zhang, Gongkai Yuan, Zhibo Zhang, Ke Huang, Yong Wang, Guorong Wu, Ke Cheng, Wubin Bai

Abstract: Symmetry lies at the heart of 2D bioelectronics, determining material properties at the fundamental level. Breaking the symmetry allows emergent functionalities and effects. However, symmetry modulation in 2D bioelectronics and the resultant applications have been largely overlooked. Here we devise an oxidized architectural MXene, referred to as OXene, that couples orbit symmetric breaking with inverse symmetric breaking to enable optimized interfacial impedance and Schottky-induced piezoelectric effects. The resulting OXene validates applications ranging from microelectrode arrays, gait analysis, and active transistor matrices to wireless signaling transmission, enabling high-fidelity signal transmission and reconfigurable logic gates. OXene interfaces are further investigated in both rodent and porcine myocardium, featuring high-quality and spatiotemporally resolved physiological recordings, with accurate differentiated predictions enabled by various machine learning pipelines.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.13956 [pdf]

Orbit symmetry breaking in MXene implements enhanced soft bioelectronic implants

Authors: Yizhang Wu, Yuan Li, Yihan Liu, Dashuai Zhu, Sicheng Xing, Noah Lambert, Hannah Weisbecker, Siyuan Liu, Brayden Davis, Lin Zhang, Meixiang Wang, Gongkai Yuan, Chris Zhoufan You, Anran Zhang, Cate Duncan, Wanrong Xie, Yihang Wang, Yong Wang, Sreya Kanamurlapudi, Garcia-Guzman Evert, Arjun Putcha, Michael D. Dickey, Ke Huang, Wubin Bai

Abstract: Bioelectronic implants with soft mechanics, biocompatibility, and excellent electrical performance enable biomedical implants to record electrophysiological signals and execute interventions within internal organs, promising to revolutionize the diagnosis, monitoring, and treatment of various pathological conditions. However, challenges remain in reducing the excessive impedance at the bioelectronic-tissue interface and thus improving the efficacy of electrophysiological signaling and intervention. Here, we devise orbit symmetry breaking in MXene (a low-cost, scalable, biocompatible, and conductive 2D layered material), yielding a material we refer to as OBXene that exhibits low bioelectronic-tissue impedance originating from out-of-plane charge transfer. Furthermore, the Schottky-induced piezoelectricity stemming from the asymmetric orbital configuration of OBXene facilitates interlayered charge transport in the device. In this study, we report an OBXene-based cardiac patch applied on the left ventricular epicardium of both rodent and porcine models to enable spatiotemporal epicardium mapping and pacing, while coupling wireless and battery-free operation for long-term real-time recording and closed-loop stimulation.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.09771 [pdf, other]

Block Coordinate Descent Methods for Optimization under J-Orthogonality Constraints with Applications

Authors: Di He, Ganzhao Yuan, Xiao Wang, Pengxiang Xu

Abstract: The J-orthogonal matrix, also referred to as the hyperbolic orthogonal matrix, is a class of special orthogonal matrix in hyperbolic space, notable for its advantageous properties. These matrices are integral to optimization under J-orthogonal constraints, which have widespread applications in statistical learning and data science. However, addressing these problems is generally challenging due to their non-convex nature and the computational intensity of the constraints. Currently, algorithms for tackling these challenges are limited. This paper introduces JOBCD, a novel Block Coordinate Descent method designed to address optimizations with J-orthogonality constraints. We explore two specific variants of JOBCD: one based on a Gauss-Seidel strategy (GS-JOBCD), the other on a variance-reduced and Jacobi strategy (VR-J-JOBCD). Notably, leveraging the parallel framework of a Jacobi strategy, VR-J-JOBCD integrates variance reduction techniques to decrease oracle complexity in the minimization of finite-sum functions. For both GS-JOBCD and VR-J-JOBCD, we establish the oracle complexity under mild conditions and strong limit-point convergence results under the Kurdyka-Lojasiewicz inequality. To demonstrate the effectiveness of our method, we conduct experiments on hyperbolic eigenvalue problems, hyperbolic structural probe problems, and the ultrahyperbolic knowledge graph embedding problem. Extensive experiments using both real-world and synthetic data demonstrate that JOBCD consistently outperforms state-of-the-art solutions, by large margins.

Submitted 14 June, 2024; originally announced June 2024.
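
As a small illustration of the constraint named in the abstract above: a real n-by-n matrix X is commonly called J-orthogonal when X^T J X = J for a signature matrix J = diag(I_p, -I_(n-p)). The sketch below only checks that defining identity on a 2-by-2 hyperbolic rotation. It is not JOBCD, and the helper names (`signature_matrix`, `is_j_orthogonal`, `hyperbolic_rotation`) are made up for this illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def signature_matrix(p, n):
    """Build J = diag(+1 repeated p times, -1 repeated n - p times)."""
    return [
        [(1.0 if i < p else -1.0) if i == j else 0.0 for j in range(n)]
        for i in range(n)
    ]


def is_j_orthogonal(x, p, tol=1e-9):
    """Check the defining identity X^T J X = J entry by entry."""
    n = len(x)
    j = signature_matrix(p, n)
    lhs = matmul(matmul(transpose(x), j), x)
    return all(abs(lhs[r][c] - j[r][c]) <= tol for r in range(n) for c in range(n))


def hyperbolic_rotation(t):
    """2x2 hyperbolic rotation; J-orthogonal for J = diag(1, -1)."""
    return [[math.cosh(t), math.sinh(t)], [math.sinh(t), math.cosh(t)]]


if __name__ == "__main__":
    print(is_j_orthogonal(hyperbolic_rotation(0.7), p=1))
    print(is_j_orthogonal([[2.0, 0.0], [0.0, 1.0]], p=1))
```

The hyperbolic rotation passes because cosh(t)^2 - sinh(t)^2 = 1, while the diagonal scaling diag(2, 1) fails because it changes the first diagonal entry of J to 4.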

arXiv:2406.01839 [pdf, other]

doi 10.1016/j.nima.2023.168685

Simulation of DAMPE silicon microstrip detectors in the $\rm Allpix^{2}$ framework

Authors: Yu-Xin Cui, Xiang Li, Shen Wang, Chuan Yue, Qiang Wan, Shi-Jun Lei, Guan-Wen Yuan, Yi-Ming Hu, Jia-Ju Wei, Jian-Hua Guo

Abstract: Silicon strip detectors have been widely utilized in space experiments for gamma-ray and cosmic-ray detections thanks to their high spatial resolution and stable performance. For a silicon micro-strip detector, the Monte Carlo simulation is recognized as a practical and cost-effective approach to verify the detector performance. In this study, a technique for the simulation of the silicon micro-strip detector with the $\rm Allpix^{2}$ framework is developed. By incorporating the electric field into the particle transport simulation based on Geant4, this framework could precisely emulate the carrier drift in the silicon micro-strip detector. The simulation results are validated using the beam test data as well as the flight data of the DAMPE experiment, which suggests that the $\rm Allpix^{2}$ framework is a powerful tool to obtain the performance of the silicon micro-strip detector.

Submitted 3 June, 2024; originally announced June 2024.

Journal ref: Nuclear Instruments and Methods in Physics Research A 1057 (2023) 168685

arXiv:2405.15129 [pdf, ps, other]

ADMM for Nonsmooth Composite Optimization under Orthogonality Constraints

Authors: Ganzhao Yuan

Abstract: We consider a class of structured, nonconvex, nonsmooth optimization problems under orthogonality constraints, where the objectives combine a smooth function, a nonsmooth concave function, and a nonsmooth weakly convex function. This class of problems finds diverse applications in statistical learning and data science. Existing ADMMs for addressing these problems often fail to exploit the specific structure of orthogonality constraints, struggle with nonsmooth functions and nonconvex constraint sets, or result in suboptimal oracle complexity. We propose {\sf OADMM}, an Alternating Direction Method of Multipliers (ADMM) designed to solve this class of problems using efficient proximal linearized strategies. Two specific variants of {\sf OADMM} are explored: one based on Euclidean Projection ({\sf OADMM-EP}) and the other on Riemannian retraction ({\sf OADMM-RR}). We integrate a Nesterov extrapolation strategy into {\sf OADMM-EP} and a monotone Barzilai-Borwein strategy into {\sf OADMM-RR} to potentially accelerate primal convergence. Additionally, we adopt an over-relaxation strategy in both {\sf OADMM-EP} and {\sf OADMM-RR} for rapid dual convergence. Under mild assumptions, we prove that {\sf OADMM} converges to the critical point of the problem with a provable convergence rate of $\mathcal{O}(1/ε^{3})$. We also establish the convergence rate of {\sf OADMM} under the Kurdyka-Lojasiewicz (KL) inequality. Numerical experiments are conducted to demonstrate the advantages of the proposed method.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.12511 [pdf, other]

Quantum Computing for Databases: Overview and Challenges

Authors: Gongsheng Yuan, Yuxing Chen, Jiaheng Lu, Sai Wu, Zhiwei Ye, Ling Qian, Gang Chen

Abstract: Over the decades since its inception, the general field of quantum computing has experienced remarkable progress. A plethora of researchers have not only proposed quantum algorithms showing the power of quantum computing but also constructed prototype quantum computers, bringing the field into tangible reality. These remarkable advancements have opened doors for novel applications, one of which is quantum databases. Researchers are trying to use the paradigm brought by quantum computing to revolutionize various aspects of database management systems. In this paper, we envision the synergy between quantum computing and databases from two perspectives: quantum computing-enabled technology and quantum computing-inspired technology. Based on this classification, we present a detailed overview of the research attained in this area, aiming to show the landscape of the field and draw a road map of future directions.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.09350 [pdf, other]

Digging into the ultraviolet luminosity functions of galaxies at high redshifts: galaxies evolution, reionization, and cosmological parameters

Authors: Yi-Ying Wang, Lei Lei, Shao-Peng Tang, Guan-Wen Yuan, Yi-Zhong Fan

Abstract: Thanks to the successful performance of the James Webb Space Telescope, our understanding of the epoch of reionization of the Universe has been advanced. The ultraviolet luminosity functions (UV LFs) of galaxies span a wide range of redshift, not only revealing the connection between galaxies and dark matter (DM) halos but also providing information about reionization. In this work, we develop a model connecting galaxy counts and apparent magnitude based on UV LFs, which incorporates redshift-dependent star formation efficiency (SFE) and corrections for dust attenuation. By synthesizing observations across the redshift range $4\le z \le 10$ from various galaxy surveys, we discern the evolving SFE with increasing redshift and DM halo mass through model fitting. Subsequent analyses indicate that the Thomson scattering optical depth was $τ_{\rm e} = 0.052^{+0.003}_{-0.002}$ and the epoch of reionization started (ended) at $z=20.58^{+6.25}_{-6.75}$ ($z=5.38^{+0.65}_{-0.70}$), which is insensitive to the choice of the truncated magnitude of the UV LFs. Incorporating an additional dataset and some reasonable constraints, the amplitude of matter perturbation is found to be $σ_8=0.79\pm0.05$, which is consistent with the standard $Λ$CDM model. Future galaxy surveys and dynamical simulations of galaxy evolution will break the degeneracy between SFE and cosmological parameters, further improving the accuracy and precision of the UV LF model.

Submitted 15 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures and 3 tables

arXiv:2405.07608 [pdf, other]

FNCC: Fast Notification Congestion Control in Data Center Networks

Authors: **g Xu, Zhan Wang, Fan Yang, Ning Kang, Zhenlong Ma, Guojun Yuan, Guangming Tan, Ninghui Sun

Abstract: Congestion control plays a pivotal role in large-scale data centers, facilitating ultra-low latency, high bandwidth, and optimal utilization. Even with the deployment of data center congestion control mechanisms such as DCQCN and HPCC, these algorithms often respond to congestion sluggishly. This sluggishness is primarily due to the slow notification of congestion. It takes almost one round-trip time (RTT) for the congestion information to reach the sender. In this paper, we introduce the Fast Notification Congestion Control (FNCC) mechanism, which achieves sub-RTT notification. FNCC leverages the acknowledgment packet (ACK) from the return path to carry in-network telemetry (INT) information of the request path, offering the sender more timely and accurate INT. To further accelerate the responsiveness of last-hop congestion control, we propose that the receiver notifies the sender of the number of concurrent congested flows, which can be used to adjust the congested flows to a fair rate quickly. Our experimental results demonstrate that FNCC reduces flow completion time by 27.4% and 88.9% compared to HPCC and DCQCN, respectively. Moreover, FNCC triggers minimal pause frames and maintains high utilization even at 400Gbps.

Submitted 26 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.06920 [pdf, ps, other]

Stability estimate for the discrete Calderon problem from partial data

Authors: Xiaomeng Zhao, Ganghua Yuan

Abstract: In this paper, we focus on the analysis of discrete versions of the Calderon problem with partial boundary data in dimension d >= 3. In particular, we establish logarithmic stability estimates for the discrete Calderon problem on an arbitrarily small portion of the boundary under suitable a priori bounds. To this end, we use CGO solutions and derive a new discrete Carleman estimate and a key unique continuation estimate. Unlike the continuous case, we use a new strategy inspired by [32] to prove the key discrete unique continuation estimate by utilizing the new Carleman estimate with boundary observations for a discrete Laplace operator.

Submitted 11 May, 2024; originally announced May 2024.

Comments: 41 pages

MSC Class: 35R30; 35J25; 65N06

arXiv:2405.04371 [pdf, other]

Community Detection for Heterogeneous Multiple Social Networks

Authors: Ziqing Zhu, Guan Yuan, Tao Zhou, Jiuxin Cao

Abstract: The community plays a crucial role in understanding user behavior and network characteristics in social networks. Some users can use multiple social networks at once for a variety of objectives. These users are called overlapping users who bridge different social networks. Detecting communities across multiple social networks is vital for interaction mining, information diffusion, and behavior migration analysis among networks. This paper presents a community detection method based on nonnegative matrix tri-factorization for multiple heterogeneous social networks, which formulates a common consensus matrix to represent the global fused community. Specifically, the proposed method involves creating adjacency matrices based on network structure and content similarity, followed by alignment matrices which distinguish overlapping users in different social networks. With the generated alignment matrices, the method could enhance the fusion degree of the global community by detecting overlapping user communities across networks. The effectiveness of the proposed method is evaluated with new metrics on Twitter, Instagram, and Tumblr datasets. The results of the experiments demonstrate its superior performance in terms of community quality and community fusion.

Submitted 7 May, 2024; originally announced May 2024.

Comments: This paper was accepted by IEEE Transactions on Computational Social Systems (TCSS)

arXiv:2405.03233 [pdf, ps, other]

ADMM for Nonconvex Optimization under Minimal Continuity Assumption

Authors: Ganzhao Yuan

Abstract: This paper introduces a novel approach to solving multi-block nonconvex composite optimization problems through a proximal linearized Alternating Direction Method of Multipliers (ADMM). This method incorporates an Increasing Penalization and Decreasing Smoothing (IPDS) strategy. Distinguishing itself from existing ADMM-style algorithms, our approach (denoted IPDS-ADMM) imposes a less stringent condition, specifically requiring continuity in just one block of the objective function. IPDS-ADMM requires that the penalty increases and the smoothing parameter decreases, both at a controlled pace. When the associated linear operator is bijective, IPDS-ADMM uses an over-relaxation stepsize for faster convergence; however, when the linear operator is surjective, IPDS-ADMM uses an under-relaxation stepsize for global convergence. We devise a novel potential function to facilitate our convergence analysis and prove an oracle complexity $\mathcal{O}(ε^{-3})$ to achieve an $ε$-approximate critical point. To the best of our knowledge, this is the first complexity result for using ADMM to solve this class of nonsmooth nonconvex problems. Finally, some experiments on the sparse PCA problem are conducted to demonstrate the effectiveness of our approach.

Submitted 23 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.01992 [pdf, other]

SFFNet: A Wavelet-Based Spatial and Frequency Domain Fusion Network for Remote Sensing Segmentation

Authors: Yunsong Yang, Genji Yuan, **jiang Li

Abstract: In order to fully utilize spatial information for segmentation and address the challenge of handling areas with significant grayscale variations in remote sensing segmentation, we propose the SFFNet (Spatial and Frequency Domain Fusion Network) framework. This framework employs a two-stage network design: the first stage extracts features using spatial methods to obtain features with sufficient spatial details and semantic information; the second stage maps these features in both spatial and frequency domains. In the frequency domain mapping, we introduce the Wavelet Transform Feature Decomposer (WTFD) structure, which decomposes features into low-frequency and high-frequency components using the Haar wavelet transform and integrates them with spatial features. To bridge the semantic gap between frequency and spatial features, and facilitate significant feature selection to promote the combination of features from different representation domains, we design the Multiscale Dual-Representation Alignment Filter (MDAF). This structure utilizes multiscale convolutions and dual-cross attentions. Comprehensive experimental results demonstrate that, compared to existing methods, SFFNet achieves superior performance in terms of mIoU, reaching 84.80% and 87.73% respectively. The code is located at https://github.com/yysdck/SFFNet.

Submitted 3 May, 2024; originally announced May 2024.
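
To make the low-frequency/high-frequency split mentioned in the abstract above concrete, here is a minimal sketch of one level of the orthonormal Haar transform on a 1D list. It is a textbook illustration only, not the WTFD module: the function names `haar_step` and `haar_inverse` are hypothetical, and the paper operates on 2D feature maps rather than on plain lists.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length list.

    Returns (low, high): scaled pairwise sums (low-frequency part) and
    scaled pairwise differences (high-frequency part).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    low = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return low, high


def haar_inverse(low, high):
    """Invert haar_step, recovering the original samples."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) / SQRT2)
        out.append((l - h) / SQRT2)
    return out


if __name__ == "__main__":
    low, high = haar_step([4.0, 2.0, 6.0, 6.0])
    print(low, high)
    print(haar_inverse(low, high))
```

A flat pair such as (6, 6) produces a zero high-frequency coefficient, while a pair with a jump such as (4, 2) does not, which is why the high-frequency band responds to local intensity changes.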

arXiv:2405.01065 [pdf, other]

MFDS-Net: Multi-Scale Feature Depth-Supervised Network for Remote Sensing Change Detection with Global Semantic and Detail Information

Authors: Zhenyang Huang, Zhao** Fu, Song **tao, Genji Yuan, **jiang Li

Abstract: Change detection, an interdisciplinary topic spanning computer vision and remote sensing, is currently receiving extensive attention and research. Due to the rapid development of society, the geographic information captured by remote sensing satellites is changing faster and becoming more complex, which poses a greater challenge and highlights the value of change detection tasks. We propose MFDS-Net: Multi-Scale Feature Depth-Supervised Network for Remote Sensing Change Detection with Global Semantic and Detail Information (MFDS-Net), with the aim of achieving a more refined description of changing buildings as well as geographic information, enhancing the localisation of changing targets and the acquisition of weak features. To achieve these objectives, we use a modified ResNet_34 as the backbone network to perform feature extraction and DO-Conv as an alternative to traditional convolution to better focus on the association between feature information and to obtain better training results. We propose the Global Semantic Enhancement Module (GSEM) to enhance the processing of high-level semantic information from a global perspective. The Differential Feature Integration Module (DFIM) is proposed to strengthen the fusion of different depth feature information, achieving learning and extraction of differential features. The entire network is trained and optimized using a deep supervision mechanism. The experimental outcomes of MFDS-Net surpass those of current mainstream change detection networks. On the LEVIR dataset, it achieved an F1 score of 91.589 and IoU of 84.483; on the WHU dataset, the scores were F1: 92.384 and IoU: 86.807; and on the GZ-CD dataset, the scores were F1: 86.377 and IoU: 76.021. The code is available at https://github.com/AOZAKIiii/MFDS-Net

Submitted 2 May, 2024; originally announced May 2024.
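
The abstract above reports F1 and IoU side by side. When both are computed for the same single class from the same true-positive, false-positive, and false-negative counts, they are tied by the identity IoU = F1 / (2 - F1), which follows from F1 = 2TP / (2TP + FP + FN) and IoU = TP / (TP + FP + FN). The sketch below applies that identity; whether the paper computes its metrics this way is an assumption, and the function names are hypothetical.

```python
def iou_from_f1(f1):
    """Convert a single-class F1 score (as a fraction in [0, 1]) to IoU."""
    if not 0.0 <= f1 <= 1.0:
        raise ValueError("f1 must be a fraction between 0 and 1")
    return f1 / (2.0 - f1)


def f1_from_iou(iou):
    """Convert a single-class IoU (as a fraction in [0, 1]) to F1."""
    if not 0.0 <= iou <= 1.0:
        raise ValueError("iou must be a fraction between 0 and 1")
    return 2.0 * iou / (1.0 + iou)


if __name__ == "__main__":
    # LEVIR F1 quoted in the abstract, expressed as a fraction.
    print(round(iou_from_f1(0.91589), 4))
```

For example, the LEVIR F1 of 91.589 maps to an IoU of roughly 84.48 under this identity, which agrees with the IoU quoted for that dataset.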

arXiv:2403.10799 [pdf, other]

Efficient Pruning of Large Language Model with Adaptive Estimation Fusion

Authors: Jun Liu, Chao Wu, Changdi Yang, Hao Tang, Zhenglun Kong, Geng Yuan, Wei Niu, Dong Huang, Yanzhi Wang

Abstract: Large language models (LLMs) have become crucial for many generative downstream tasks, leading to an inevitable trend and significant challenge to deploy them efficiently on resource-constrained devices. Structured pruning is a widely used method to address this challenge. However, when dealing with the complex structure of the multiple decoder layers, general methods often employ common estimation approaches for pruning. These approaches lead to a decline in accuracy for specific downstream tasks. In this paper, we introduce a simple yet efficient method that adaptively models the importance of each substructure. Meanwhile, it can adaptively fuse coarse-grained and fine-grained estimations based on the results from complex and multilayer structures. All aspects of our design seamlessly integrate into the end-to-end pruning framework. Our experimental results, compared with state-of-the-art methods on mainstream datasets, demonstrate average accuracy improvements of 1.1%, 1.02%, 2.0%, and 1.2% for LLaMa-7B, Vicuna-7B, Baichuan-7B, and Bloom-7b1, respectively.

Submitted 14 May, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

arXiv:2401.16720 [pdf, other]

SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing

Authors: Sheng Li, Geng Yuan, Yue Dai, Youtao Zhang, Yanzhi Wang, Xulong Tang

Abstract: There has been a proliferation of artificial intelligence applications, where model training is key to promising high-quality services for these applications. However, the model training process is both time-intensive and energy-intensive, inevitably affecting the user's demand for application efficiency. Layer freezing, an efficient model training technique, has been proposed to improve training efficiency. Although existing layer freezing methods demonstrate great potential to reduce model training costs, they still have shortcomings such as a lack of generalizability and compromised accuracy. For instance, existing layer freezing methods either require the freeze configurations to be manually defined before training, which does not transfer across networks, or use heuristic freezing criteria that cannot guarantee decent accuracy in different scenarios. Therefore, a generic and smart layer freezing method that can automatically perform "in-situation" layer freezing for different networks during training is lacking. To this end, we propose a generic and efficient training framework (SmartFRZ). The core technique proposed in SmartFRZ is attention-guided layer freezing, which can automatically select the appropriate layers to freeze without compromising accuracy. Experimental results show that SmartFRZ effectively reduces the amount of computation in training, achieves significant training acceleration, and outperforms the state-of-the-art layer freezing approaches.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.16694 [pdf, other]

EdgeOL: Efficient in-situ Online Learning on Edge Devices

Authors: Sheng Li, Geng Yuan, Yawen Wu, Yue Dai, Chao Wu, Alex K. Jones, **gtong Hu, Yanzhi Wang, Xulong Tang

Abstract: Emerging applications, such as robot-assisted eldercare and object recognition, generally employ deep learning neural networks (DNNs) and naturally require: i) handling streaming-in inference requests and ii) adapting to possible deployment scenario changes. Online model fine-tuning is widely adopted to satisfy these needs. However, an inappropriate fine-tuning scheme could involve significant energy consumption, making it challenging to deploy on edge devices. In this paper, we propose EdgeOL, an edge online learning framework that optimizes inference accuracy, fine-tuning execution time, and energy efficiency through both inter-tuning and intra-tuning optimizations. Experimental results show that, on average, EdgeOL reduces overall fine-tuning execution time by 64%, energy consumption by 52%, and improves average inference accuracy by 1.75% over the immediate online learning strategy.

Submitted 15 March, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.11664 [pdf, other]

Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM

Authors: Bingbing Li, Geng Yuan, Zigeng Wang, Shaoyi Huang, Hongwu Peng, Payman Behnam, Wujie Wen, Hang Liu, Caiwen Ding

Abstract: Resistive Random Access Memory (ReRAM) has emerged as a promising platform for deep neural networks (DNNs) due to its support for parallel in-situ matrix-vector multiplication. However, hardware failures, such as stuck-at-fault defects, can result in significant prediction errors during model inference. While additional crossbars can be used to address these failures, they come with storage overhead and are not efficient in terms of space, energy, and cost. In this paper, we propose a fault protection mechanism that incurs zero space cost. Our approach includes: 1) differentiable structure pruning of rows and columns to reduce model redundancy, 2) weight duplication and voting for robust output, and 3) embedding duplicated most significant bits (MSBs) into the model weight. We evaluate our method on nine tasks of the GLUE benchmark with the BERT model, and experimental results prove its effectiveness.

Submitted 21 January, 2024; originally announced January 2024.
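
The "weight duplication and voting" step named in the abstract above can be illustrated with a generic bitwise majority vote over three stored copies of one quantized weight. This is only a minimal sketch of the general idea, not the authors' mechanism: the function names, the 8-bit word, and the single-bit stuck-at fault model are all assumptions made for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three copies of the same stored word."""
    return (a & b) | (a & c) | (b & c)


def stuck_at(word: int, bit: int, value: int) -> int:
    """Force one bit of a word to 0 or 1 (hypothetical stuck-at fault model)."""
    mask = 1 << bit
    return (word | mask) if value else (word & ~mask)


# One 8-bit weight stored three times; two copies each suffer a different fault.
weight = 0b10110010
copies = [
    weight,                  # fault-free copy
    stuck_at(weight, 7, 0),  # most significant bit stuck at 0
    stuck_at(weight, 0, 1),  # least significant bit stuck at 1
]

# Each bit position is wrong in at most one copy, so the vote recovers it.
recovered = majority_vote(*copies)
```

The vote recovers the original word as long as no bit position is corrupted in more than one of the three copies.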

arXiv:2401.11261 [pdf, other]

Diffusion Model Conditioning on Gaussian Mixture Model and Negative Gaussian Mixture Gradient

Authors: Weiguo Lu, Xuan Wu, Deng Ding, **qiao Duan, Jirong Zhuang, Gangnan Yuan

Abstract: Diffusion models (DMs) are a type of generative model that has a huge impact on image synthesis and beyond. They achieve state-of-the-art generation results in various generative tasks. A great diversity of conditioning inputs, such as text or bounding boxes, are accessible to control the generation. In this work, we propose a conditioning mechanism utilizing Gaussian mixture models (GMMs) as feature conditioning to guide the denoising process. Based on set theory, we provide a comprehensive theoretical analysis that shows that conditional latent distribution based on features and classes is significantly different, so that conditional latent distribution on features produces fewer defect generations than conditioning on classes. Two diffusion models conditioned on the Gaussian mixture model are trained separately for comparison. Experiments support our findings. A novel gradient function called the negative Gaussian mixture gradient (NGMG) is proposed and applied in diffusion model training with an additional classifier. Training stability has improved. We also theoretically prove that NGMG shares the same benefit as the Earth Mover distance (Wasserstein) as a more sensible cost function when learning distributions supported by low-dimensional manifolds.

Submitted 1 February, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

arXiv:2401.01183 [pdf, other]

Unifying Structured Data as Graph for Data-to-Text Pre-Training

Authors: Shujie Li, Liang Li, Ruiying Geng, Min Yang, Binhua Li, Guanghu Yuan, Wanwei He, Shao Yuan, Can Ma, Fei Huang, Yongbin Li

Abstract: Data-to-text (D2T) generation aims to transform structured data into natural language text. Data-to-text pre-training has proved to be powerful in enhancing D2T generation and yields impressive performances. However, previous pre-training methods either oversimplified structured data into a sequence without considering input structures or designed training objectives tailored for a specific data structure (e.g., table or knowledge graph). In this paper, we unify different types of structured data (i.e., table, key-value data, knowledge graph) into the graph format and cast different data-to-text generation tasks as graph-to-text generation. To effectively exploit the structural information of the input graph, we propose a structure-enhanced pre-training method for D2T generation by designing a structure-enhanced Transformer. Concretely, we devise a position matrix for the Transformer, encoding relative positional information of connected nodes in the input graph. In addition, we propose a new attention matrix to incorporate graph structures into the original Transformer by taking the available explicit connectivity structure into account. Extensive experiments on six benchmark datasets show the effectiveness of our model. Our source codes are available at https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/unid2t.

Submitted 2 January, 2024; originally announced January 2024.

Comments: Accepted for TACL. Pre-MIT Press publication version

arXiv:2312.15469 [pdf, other]

Efficient Estimation of the Central Mean Subspace via Smoothed Gradient Outer Products

Authors: Gan Yuan, Mingyue Xu, Samory Kpotufe, Daniel Hsu

Abstract: We consider the problem of sufficient dimension reduction (SDR) for multi-index models. The estimators of the central mean subspace in prior works either have slow (non-parametric) convergence rates, or rely on stringent distributional conditions (e.g., the covariate distribution $P_{\mathbf{X}}$ being elliptically symmetric). In this paper, we show that a fast parametric convergence rate of the form $C_d \cdot n^{-1/2}$ is achievable via estimating the \emph{expected smoothed gradient outer product}, for a general class of distributions $P_{\mathbf{X}}$ admitting Gaussian or heavier distributions. When the link function is a polynomial with a degree of at most $r$ and $P_{\mathbf{X}}$ is the standard Gaussian, we show that the prefactor depends on the ambient dimension $d$ as $C_d \propto d^r$.

Submitted 24 December, 2023; originally announced December 2023.

MSC Class: 62B05; 62G08

arXiv:2311.15129 [pdf]

Kinetic-Scale Topological Structures Associated with Energy Dissipation in the Turbulent Reconnection Outflow

Authors: S. Y. Huang, J. Zhang, Q. Y. Xiong, Z. G. Yuan, K. Jiang, S. B. Xu, Y. Y. Wei, R. T. Lin, L. Yu, Z. Wang

Abstract: Assisted by the Magnetospheric Multiscale (MMS) mission, which captures unprecedented high-resolution data in the terrestrial magnetotail, we apply a local streamline-topology classification methodology to investigate the categorization of the magnetic-field topological structures at kinetic scales in the turbulent reconnection outflow. Strong correlations are found between the straining and rotational parts of the velocity gradient tensor as well as of the magnetic-field gradient tensor. Strong energy dissipation tends to occur in regions with high magnetic stress or current density, and is contributed mainly by O-type topologies. These results indicate that the kinetic structures with O-type topology play a more important role in energy dissipation in the turbulent reconnection outflow.

Submitted 25 November, 2023; originally announced November 2023.

Comments: 19 pages, 4 figures, accepted by ApJ

arXiv:2311.07211 [pdf, other]

A Gaussian Process Based Method with Deep Kernel Learning for Pricing High-dimensional American Options

Authors: Jirong Zhuang, Deng Ding, Weiguo Lu, Xuan Wu, Gangnan Yuan

Abstract: In this work, we present a novel machine learning approach for pricing high-dimensional American options based on the modified Gaussian process regression (GPR). We incorporate deep kernel learning and sparse variational Gaussian processes to address the challenges traditionally associated with GPR. These challenges include its diminished reliability in high-dimensional scenarios and the excessive computational costs associated with processing extensive numbers of simulated paths. Our findings indicate that the proposed method surpasses the performance of the least squares Monte Carlo method in high-dimensional scenarios, particularly when the underlying assets are modeled by Merton's jump diffusion model. Moreover, our approach does not exhibit a significant increase in computational time as the number of dimensions grows. Consequently, this method emerges as a potential tool for alleviating the challenges posed by the curse of dimensionality.

Submitted 18 April, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: 21 pages, 8 figures

arXiv:2310.15081 [pdf, other]

E4S: Fine-grained Face Swapping via Editing With Regional GAN Inversion

Authors: Maomao Li, Ge Yuan, Cairong Wang, Zhian Liu, Yong Zhang, Yongwei Nie, Jue Wang, Dong Xu

Abstract: This paper proposes a novel approach to face swapping from the perspective of fine-grained facial editing, dubbed "editing for swapping" (E4S). The traditional face swapping methods rely on global feature extraction and fail to preserve the detailed source identity. In contrast, we propose a Regional GAN Inversion (RGI) method, which allows the explicit disentanglement of shape and texture. Specifically, our E4S performs face swapping in the latent space of a pretrained StyleGAN, where a multi-scale mask-guided encoder is applied to project the texture of each facial component into regional style codes and a mask-guided injection module manipulates feature maps with the style codes. Based on this disentanglement, face swapping can be simplified as style and mask swapping. Besides, due to the large lighting condition gap, transferring the source skin into the target image may lead to disharmonious lighting. We propose a re-coloring network to make the swapped face maintain the target lighting condition while preserving the source skin. Further, to deal with the potential mismatch areas during mask exchange, we design a face inpainting module to refine the face shape. The extensive comparisons with state-of-the-art methods demonstrate that our E4S outperforms existing methods in preserving texture, shape, and lighting. Our implementation is available at https://github.com/e4s2024/E4S2024.

Submitted 27 March, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

Comments: Project Page: https://e4s2024.github.io/ ; arXiv admin note: text overlap with arXiv:2211.14068

arXiv:2310.12451 [pdf, other]

doi 10.1109/JBHI.2024.3373439

MTS-LOF: Medical Time-Series Representation Learning via Occlusion-Invariant Features

Authors: Huayu Li, Ana S. Carreon-Rascon, Xiwen Chen, Geng Yuan, Ao Li

Abstract: Medical time series data are indispensable in healthcare, providing critical insights for disease diagnosis, treatment planning, and patient management. The exponential growth in data complexity, driven by advanced sensor technologies, has presented challenges related to data labeling. Self-supervised learning (SSL) has emerged as a transformative approach to address these challenges, eliminating the need for extensive human annotation. In this study, we introduce a novel framework for Medical Time Series Representation Learning, known as MTS-LOF. MTS-LOF leverages the strengths of contrastive learning and Masked Autoencoder (MAE) methods, offering a unique approach to representation learning for medical time series data. By combining these techniques, MTS-LOF enhances the potential of healthcare applications by providing more sophisticated, context-rich representations. Additionally, MTS-LOF employs a multi-masking strategy to facilitate occlusion-invariant feature learning. This approach allows the model to create multiple views of the data by masking portions of it. By minimizing the discrepancy between the representations of these masked patches and the fully visible patches, MTS-LOF learns to capture rich contextual information within medical time series datasets. The results of experiments conducted on diverse medical time series datasets demonstrate the superiority of MTS-LOF over other methods. These findings hold promise for significantly enhancing healthcare applications by improving representation learning.
Furthermore, our work delves into the integration of joint-embedding SSL and MAE techniques, shedding light on the intricate interplay between temporal and structural dependencies in healthcare data. This understanding is crucial, as it allows us to grasp the complexities of healthcare data analysis.

Submitted 19 October, 2023; originally announced October 2023.

arXiv:2309.14363 [pdf, ps, other]

doi 10.26421/qic22.15-16-3

Infeasibility of constructing a special orthogonal matrix for the deterministic remote preparation of arbitrary n-qubit state

Authors: Wenjie Liu, Zixian Li, Gonglin Yuan

Abstract: In this paper, we present a polynomial-complexity algorithm to construct a special orthogonal matrix for the deterministic remote state preparation (DRSP) of an arbitrary n-qubit state, and prove that if n>3, such matrices do not exist. Firstly, the construction problem is split into two sub-problems, i.e., finding a solution of a semi-orthogonal matrix and generating all semi-orthogonal matrices. Through giving the definitions and properties of the matching operators, it is proved that the orthogonality of a special matrix is equivalent to the cooperation of multiple matching operators, and then the construction problem is reduced to the problem of solving an XOR linear equation system, which reduces the construction complexity from exponential to polynomial level. Having proved that each semi-orthogonal matrix can be simplified into a unique form, we use the proposed algorithm to confirm that the unique form does not have any solution when n>3, which means it is infeasible to construct such a special orthogonal matrix for the DRSP of an arbitrary n-qubit state.

Submitted 23 September, 2023; originally announced September 2023.

Comments: 31 figures

Journal ref: Quantum Information & Computation, 2022. 22(15&16): p. 1289-1319
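
The abstract above reduces its construction problem to solving an XOR linear equation system. As a generic illustration of why such systems are solvable in polynomial time, the sketch below runs Gaussian elimination over GF(2). It is not the authors' algorithm: the function name `solve_xor_system` and the list-of-bits representation are assumptions made for this example.

```python
def solve_xor_system(rows, rhs):
    """Solve A x = b over GF(2) by Gaussian elimination.

    rows: list of equations, each a list of 0/1 coefficients.
    rhs:  list of 0/1 right-hand sides, one per equation.
    Returns one solution (free variables set to 0), or None if inconsistent.
    """
    n_vars = len(rows[0]) if rows else 0
    # Augmented matrix: coefficients followed by the right-hand side.
    aug = [row[:] + [b] for row, b in zip(rows, rhs)]
    pivots = []  # pivots[i] is the pivot column of row i
    r = 0
    for c in range(n_vars):
        # Find a row at or below r with a 1 in column c.
        pivot = next((i for i in range(r, len(aug)) if aug[i][c]), None)
        if pivot is None:
            continue
        aug[r], aug[pivot] = aug[pivot], aug[r]
        # XOR the pivot row into every other row that has a 1 in column c.
        for i in range(len(aug)):
            if i != r and aug[i][c]:
                aug[i] = [x ^ y for x, y in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    # A row reading 0 = 1 means the system has no solution.
    if any(not any(row[:-1]) and row[-1] for row in aug):
        return None
    x = [0] * n_vars
    for i, c in enumerate(pivots):
        x[c] = aug[i][-1]
    return x


# x0 ^ x1 = 1, x1 ^ x2 = 0, x0 ^ x2 = 1
solution = solve_xor_system([[1, 1, 0], [0, 1, 1], [1, 0, 1]], [1, 0, 1])
# x0 ^ x1 = 0 and x0 ^ x1 = 1 cannot both hold
no_solution = solve_xor_system([[1, 1], [1, 1]], [0, 1])
```

Elimination touches each row once per column, so the cost is polynomial in the number of equations and unknowns, in contrast to enumerating all 0/1 assignments.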

arXiv:2309.12212 [pdf, other]

SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices

Authors: Zhengang Li, Geng Yuan, Tomoharu Yamauchi, Zabihi Masoud, Yanyue Xie, Peiyan Dong, Xulong Tang, Nobuyuki Yoshikawa, Devesh Tiwari, Yanzhi Wang, Olivia Chen

Abstract: Adiabatic Quantum-Flux-Parametron (AQFP) is a superconducting logic with extremely high energy efficiency. By employing the distinct polarity of current to denote logic `0' and `1', AQFP devices serve as excellent carriers for binary neural network (BNN) computations. Although recent research has made initial strides toward developing an AQFP-based BNN accelerator, several critical challenges remain, preventing the design from being a comprehensive solution. In this paper, we propose SupeRBNN, an AQFP-based randomized BNN acceleration framework that leverages software-hardware co-optimization to eventually make the AQFP devices a feasible solution for BNN acceleration. Specifically, we investigate the randomized behavior of the AQFP devices and analyze the impact of crossbar size on current attenuation, subsequently formulating the current amplitude into the values suitable for use in BNN computation. To tackle the accumulation problem and improve overall hardware performance, we propose a stochastic computing-based accumulation module and a clocking scheme adjustment-based circuit optimization method. We validate our SupeRBNN framework across various datasets and network architectures, comparing it with implementations based on different technologies, including CMOS, ReRAM, and superconducting RSFQ/ERSFQ. Experimental results demonstrate that our design achieves an energy efficiency of approximately 7.8x10^4 times higher than that of the ReRAM-based BNN framework while maintaining a similar level of model accuracy.
Furthermore, when compared with superconductor-based counterparts, our framework demonstrates at least two orders of magnitude higher energy efficiency.

Submitted 21 September, 2023; originally announced September 2023.

Comments: Accepted by MICRO'23 (56th IEEE/ACM International Symposium on Microarchitecture)

arXiv:2309.07438 [pdf, other]

Towards Artificial General Intelligence (AGI) in the Internet of Things (IoT): Opportunities and Challenges

Authors: Fei Dou, ** Ye, Geng Yuan, Qin Lu, Wei Niu, Haijian Sun, Le Guan, Guoyu Lu, Gengchen Mai, Ninghao Liu, ** Lu, Zhengliang Liu, Zihao Wu, Chenjiao Tan, Shaochen Xu, Xianqiao Wang, Guoming Li, Lilong Chai, Sheng Li, ** Sun, Hongyue Sun, Yunli Shao, Changying Li, Tianming Liu, Wenzhan Song

Abstract: Artificial General Intelligence (AGI), possessing the capacity to comprehend, learn, and execute tasks with human cognitive abilities, engenders significant anticipation and intrigue across scientific, commercial, and societal arenas. This fascination extends particularly to the Internet of Things (IoT), a landscape characterized by the interconnection of countless devices, sensors, and systems, collectively gathering and sharing data to enable intelligent decision-making and automation. This research embarks on an exploration of the opportunities and challenges towards achieving AGI in the context of the IoT. Specifically, it starts by outlining the fundamental principles of IoT and the critical role of Artificial Intelligence (AI) in IoT systems. Subsequently, it delves into AGI fundamentals, culminating in the formulation of a conceptual framework for AGI's seamless integration within IoT. The application spectrum for AGI-infused IoT is broad, encompassing domains ranging from smart grids, residential environments, manufacturing, and transportation to environmental monitoring, agriculture, healthcare, and education. However, adapting AGI to resource-constrained IoT settings necessitates dedicated research efforts. Furthermore, the paper addresses constraints imposed by limited computing resources, intricacies associated with large-scale IoT communication, as well as the critical concerns pertaining to security and privacy.

Submitted 14 September, 2023; originally announced September 2023.

arXiv:2308.09444 [pdf, other]

An Efficient 1 Iteration Learning Algorithm for Gaussian Mixture Model And Gaussian Mixture Embedding For Neural Network

Authors: Weiguo Lu, Xuan Wu, Deng Ding, Gangnan Yuan

Abstract: We propose a Gaussian Mixture Model (GMM) learning algorithm, based on our previous work on the GMM expansion idea. The new algorithm brings more robustness and simplicity than the classic Expectation Maximization (EM) algorithm. It also improves the accuracy and takes only 1 iteration for learning. We theoretically prove that this new algorithm is guaranteed to converge regardless of the parameter initialisation. Comparing our GMM expansion method with classic probability layers in neural networks shows a demonstrably better capability to overcome data uncertainty and the inverse problem. Finally, we test a GMM-based generator, which shows potential for building further applications that can utilize distribution random sampling for stochastic variation as well as variation control.

Submitted 6 September, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

arXiv:2307.12487 [pdf, other]

doi 10.3847/2041-8213/acf46c

Modeling the JWST high-redshift galaxies with a general formation scenario and the consistency with the $Λ$CDM model

Authors: Yi-Ying Wang, Lei Lei, Guan-Wen Yuan, Yi-Zhong Fan

Abstract: Early results from the James Webb Space Telescope (JWST) observations have hinted at two traces beyond the standard cosmological framework. One is the extraordinarily high stellar masses and their density at $z=7.5\sim9.1$; the other is the unexpected abundance of ultraviolet (UV) bright galaxies at $z\ge10$. Nevertheless, neither piece of evidence is statistically robust yet. In this work, we construct rest-frame UV luminosity functions (LFs) based on a general formation model for these high-redshift galaxy candidates, since UV LFs always carry the information of stellar formation efficiency (SFE), initial mass function (IMF), dust attenuation, and other crucial elements for galaxy evolution. By updating the massive galaxy candidates with spectroscopic observations and exploring the parameter space of SFE, we are able to reasonably explain the cumulative stellar mass density within the redshift range of $7.5\sim9.1$, with only one galaxy exhibiting unusual characteristics. We also reveal a potential nonmonotonic trend of SFE with increasing redshift. At higher redshift ($z\sim13$), bright UV LFs can be well fitted with non-dust attenuation or top-heavy IMF for Population III stars. The Population III star scenario can also naturally account for the possible dip of the peak SFE evolution curve at $z\sim9$.

Submitted 12 September, 2023; v1 submitted 23 July, 2023; originally announced July 2023.

Comments: 10 pages, 7 figures, and 2 tables. Published in ApJL

Journal ref: The Astrophysical Journal Letters, 954:L48 (10pp), 2023 September 10

arXiv:2307.12216 [pdf, other]

A Life-Cycle Energy and Inventory Analysis of Adiabatic Quantum-Flux-Parametron Circuits

Authors: Masoud Zabihi, Yanyue Xie, Zhengang Li, Peiyan Dong, Geng Yuan, Olivia Chen, Massoud Pedram, Yanzhi Wang

Abstract: The production process of superconductive integrated circuits is complex and consumes significant amounts of resources and energy. Therefore, it is crucial to evaluate the environmental impact of this emerging technology. An attractive option for the next generation of superconductive technology is Adiabatic Quantum-Flux-Parametron (AQFP) devices. This study is the first to present a comprehensive process-based life-cycle assessment (LCA) and inventory analysis of AQFP integrated circuits. To generate relevant outcomes, we conduct a comparative LCA that includes the bulk CMOS technology. The inventory analysis considers the manufacturing, assembly, and use phases of the circuits. To ensure a fair assessment, we choose the 32-bit AQFP RISC-V single-core processor as the reference functional unit and compare its performance with that of a CMOS counterpart. Our findings reveal that the AQFP processor consumes several orders of magnitude less energy during the use phase than its CMOS counterpart. Consequently, the total life cycle energy (which encompasses manufacturing and assembly energies) of AQFP integrated circuits improves by at least two orders of magnitude.

Submitted 22 July, 2023; originally announced July 2023.

arXiv:2306.17822 [pdf, other]

doi 10.1016/j.scib.2023.10.027

Limits on scalar-induced gravitational waves from the stochastic background by pulsar timing array observations

Authors: Yi-Fu Cai, Xin-Chen He, Xiao-Han Ma, Sheng-Feng Yan, Guan-Wen Yuan

Abstract: Recently, the NANOGrav, PPTA, EPTA, and CPTA collaborations independently reported their evidence of the Stochastic Gravitational Waves Background (SGWB). While the inferred gravitational-wave background amplitude and spectrum are consistent with astrophysical expectations for a signal from the population of supermassive black-hole binaries (SMBHBs), the search for new physics remains plausible in this observational window. In this work, we explore the possibility of explaining such a signal by the scalar-induced gravitational waves (IGWs) in the very early universe. We use a parameterized broken power-law function as a general description of the energy spectrum of the SGWB and fit it to the new results of NANOGrav, PPTA and EPTA. We find that this approach can put constraints on the parameters of IGW energy spectrum and further yield restrictions on various inflation models that may produce primordial black holes (PBHs) in the early universe, which is also expected to be examined by the forthcoming space-based GW experiments.

Submitted 19 December, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

Comments: 7 pages, 2 figures, update some references

Journal ref: Science Bulletin, 68 (2023) 2929-2935

arXiv:2306.17143 [pdf, other]

Dark Matter Spike surrounding Supermassive Black Holes Binary and the nanohertz Stochastic Gravitational Wave Background

Authors: Zhao-Qiang Shen, Guan-Wen Yuan, Yi-Ying Wang, Yuan-Zhu Wang

Abstract: Recently, the NANOGrav, PPTA, EPTA and CPTA collaborations reported compelling evidence of the existence of the Stochastic Gravitational-Wave Background (SGWB). The amplitude and spectrum of this inferred gravitational-wave background align closely with the astrophysical predictions for a signal originating from the population of supermassive black-hole binaries. In light of these findings, we explore the possibility to detect dark matter spikes surrounding massive black holes, which could potentially impact the gravitational-wave waveform and modulate the SGWB. We demonstrate that the SMBH binary evolution induced by the combined effects of GW radiation and the dynamical friction of the dark matter spike exhibits detectable manifestations within the nHz frequency range of the SGWB.

Submitted 29 June, 2023; originally announced June 2023.

Comments: 5 pages, 1 figure. arXiv admin note: text overlap with arXiv:1408.3534 by other authors

arXiv:2306.05356 [pdf, other]

ReliableSwap: Boosting General Face Swapping Via Reliable Supervision

Authors: Ge Yuan, Maomao Li, Yong Zhang, Huicheng Zheng

Abstract: Almost all advanced face swapping approaches use reconstruction as the proxy task, i.e., supervision only exists when the target and source belong to the same person. Otherwise, lacking pixel-level supervision, these methods struggle for source identity preservation. This paper proposes to construct reliable supervision, dubbed cycle triplets, which serves as the image-level guidance when the source identity differs from the target one during training. Specifically, we use face reenactment and blending techniques to synthesize the swapped face from real images in advance, where the synthetic face preserves source identity and target attributes. However, there may be some artifacts in such a synthetic face. To avoid the potential artifacts and drive the distribution of the network output close to the natural one, we reversely take synthetic images as input while the real face as reliable supervision during the training stage of face swapping. Besides, we empirically find that the existing methods tend to lose lower-face details like face shape and mouth from the source. This paper additionally designs a FixerNet, providing discriminative embeddings of lower faces as an enhancement. Our face swapping framework, named ReliableSwap, can boost the performance of any existing face swapping network with negligible overhead. Extensive experiments demonstrate the efficacy of our ReliableSwap, especially in identity preservation. The project page is https://reliable-swap.github.io/.

Submitted 8 June, 2023; originally announced June 2023.

Comments: Project page: https://reliable-swap.github.io/ ; Github repository: https://github.com/ygtxr1997/ReliableSwap ; Demo (HuggingFace): https://huggingface.co/spaces/ygtxr1997/ReliableSwap_Demo ;

arXiv:2306.00926 [pdf, other]

Inserting Anybody in Diffusion Models via Celeb Basis

Authors: Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng

Abstract: Exquisite demand exists for customizing the pretrained large text-to-image model, e.g., Stable Diffusion, to generate innovative concepts, such as the users themselves. However, the newly-added concept from previous customization methods often shows weaker combination abilities than the original ones even given several images during training. We thus propose a new personalization method that allows for the seamless integration of a unique individual into the pre-trained diffusion model using just one facial photograph and only 1024 learnable parameters in under 3 minutes. As a result, we can effortlessly generate stunning images of this person in any pose or position, interacting with anyone and doing anything imaginable from text prompts. To achieve this, we first analyze and build a well-defined celeb basis from the embedding space of the pre-trained large text encoder. Then, given one facial photo as the target identity, we generate its own embedding by optimizing the weight of this basis and locking all other parameters. Empowered by the proposed celeb basis, the new identity in our customized model showcases a better concept combination ability than previous personalization methods. Besides, our model can also learn several new identities at once and have them interact with each other, where previous customization models fail. The code will be released.

Submitted 1 June, 2023; originally announced June 2023.

Comments: Project page: http://celeb-basis.github.io ; Github repository: https://github.com/ygtxr1997/CelebBasis

arXiv:2305.14751 [pdf, other]

DialogVCS: Robust Natural Language Understanding in Dialogue System Upgrade

Authors: Zefan Cai, Xin Zheng, Tianyu Liu, Xu Wang, Haoran Meng, Jiaqi Han, Gang Yuan, Binghuai Lin, Baobao Chang, Yunbo Cao

Abstract: In the constant updates of product dialogue systems, we need to retrain the natural language understanding (NLU) model as new data from real users is merged into the existing data accumulated in previous updates. Within the newly added data, new intents emerge and might have semantic entanglement with the existing intents, e.g. new intents that are semantically too specific or generic are actually a subset or superset of some existing intents in the semantic space, thus impairing the robustness of the NLU model. As the first attempt to solve this problem, we set up a new benchmark consisting of 4 Dialogue Version Control dataSets (DialogVCS). We formulate intent detection with imperfect data in the system update as a multi-label classification task with positive but unlabeled intents, which asks the models to recognize all the proper intents, including the ones with semantic entanglement, at inference time. We also propose comprehensive baseline models and conduct in-depth analyses for the benchmark, showing that the semantically entangled intents can be effectively recognized with an automatic workflow.

Submitted 24 May, 2023; originally announced May 2023.

Comments: work in progress. The first three authors contribute equally

arXiv:2305.03408 [pdf, other]

doi 10.1007/s11433-023-2233-2

Black holes as the source of dark energy: a stringent test with high-redshift JWST AGNs

Authors: Lei Lei, Lei Zu, Guan-Wen Yuan, Zhao-Qiang Shen, Yi-Ying Wang, Yuan-Zhu Wang, Zhen-Bo Su, Wen-ke Ren, Shao-Peng Tang, Hao Zhou, Chi Zhang, Zhi-Ping Jin, Lei Feng, Yi-Zhong Fan, Da-Ming Wei

Abstract: Studies have proposed that there is evidence for cosmological coupling of black holes (BHs) with an index of $k\approx 3$; hence, BHs serve as the astrophysical source of dark energy. However, the data sample is limited for the redshifts of $\leq 2.5$. In recent years, the James Webb Space Telescope (JWST) has detected many high-redshift active galactic nuclei (AGNs) and quasars. Among the JWST NIRSpec-/NIRCam-resolved AGNs, three are determined to be in early-type host galaxies with a redshift of $z\sim 4.5$--$7$. However, their $M_{\star}$ and $M_{\rm BH}$ are in tension with the predicted cosmological coupling of black holes with $k = 3$ at a confidence level of $\sim 2\sigma$, which challenges the hypothesis that BHs serve as the origin of dark energy. Future work on high-redshift AGNs using the JWST will further assess such a hypothesis by identifying more early-type host galaxies in the higher mass range.
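To make the coupling index $k$ concrete, here is a minimal sketch. It assumes the power-law parameterization commonly used in the coupling literature, $M(a) = M(a_i)\,(a/a_i)^k$ with scale factor $a = 1/(1+z)$; the abstract itself does not spell out this formula, and the function name `coupled_mass_growth` is hypothetical.

```python
def coupled_mass_growth(z_initial: float, z_final: float = 0.0, k: float = 3.0) -> float:
    """Return M(z_final) / M(z_initial) for a cosmologically coupled black hole.

    Assumption (not stated in the abstract): M(a) = M(a_i) * (a / a_i)**k,
    with scale factor a = 1 / (1 + z), so a / a_i = (1 + z_i) / (1 + z).
    """
    return ((1.0 + z_initial) / (1.0 + z_final)) ** k


# Growth factor from the low end of the quoted redshift range to today, for k = 3.
growth = coupled_mass_growth(4.5)
print(f"mass growth factor from z = 4.5 to z = 0: {growth:.3f}")
```

Under this assumed law, a black hole at $z = 4.5$ would grow by a factor of $5.5^3$ by today from coupling alone, which is why high-redshift hosts give the test its leverage.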

Submitted 17 January, 2024; v1 submitted 5 May, 2023; originally announced May 2023.

Comments: 9 pages, 3 figures, 2 tables; Comments are welcome!

Journal ref: Sci. China-Phys. Mech. Astron. 67, 229811 (2024)

arXiv:2304.11512 [pdf, ps, other]

Stability estimates for an inverse problem for Schrödinger operators at high frequencies from arbitrary partial boundary measurements

Authors: Xiaomeng Zhao, Ganghua Yuan

Abstract: In this paper, we study the partial data inverse boundary value problem for the Schrödinger operator at a high frequency $k \geq 1$ in a bounded domain with smooth boundary in $\mathbb{R}^n$, $n \geq 3$. Assuming that the potential is known in a neighborhood of the boundary, we obtain logarithmic stability when both Dirichlet data and Neumann data are taken on arbitrary open subsets of the boundary, where the two sets can be disjoint. Our results also show that the logarithmic stability can be improved to Hölder-type stability in the high frequency regime. To achieve those goals, we use a method combining CGO solutions, Runge approximation and Carleman estimates.

Submitted 22 April, 2023; originally announced April 2023.

Comments: 17 pages

MSC Class: 35R30; 35J25; 35R25

arXiv:2304.03641 [pdf, ps, other]

A Block Coordinate Descent Method for Nonsmooth Composite Optimization under Orthogonality Constraints

Authors: Ganzhao Yuan

Abstract: Nonsmooth composite optimization with orthogonality constraints has a broad spectrum of applications in statistical learning and data science. However, this problem is generally challenging to solve due to its non-convex and non-smooth nature. Existing solutions are limited by one or more of the following restrictions: (i) they are full gradient methods that require high computational costs in each iteration; (ii) they are not capable of solving general nonsmooth composite problems; (iii) they are infeasible methods and can only achieve the feasibility of the solution at the limit point; (iv) they lack rigorous convergence guarantees; (v) they only obtain weak optimality of critical points. In this paper, we propose OBCD, a new Block Coordinate Descent method for solving general nonsmooth composite problems under Orthogonality constraints. OBCD is a feasible method with low computational complexity. In each iteration, our algorithm updates $k$ rows of the solution matrix ($k\geq2$ is a parameter) to preserve the constraints. Then, it solves a small-sized nonsmooth composite optimization problem under orthogonality constraints either exactly or approximately. We demonstrate that any exact block-$k$ stationary point is always an approximate block-$k$ stationary point, which is equivalent to the critical stationary point. We are particularly interested in the case where $k=2$ as the resulting subproblem reduces to a one-dimensional nonconvex problem. We propose a breakpoint searching method and a fifth-order iterative method to solve this problem efficiently and effectively. We also propose two novel greedy strategies to find a good working set to further accelerate the convergence of OBCD. Finally, we have conducted extensive experiments on several tasks to demonstrate the superiority of our approach.
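The feasibility claim for the $k=2$ case can be illustrated with a small sketch: mixing two rows of a matrix with orthonormal columns by a 2x2 rotation leaves $X^\top X = I$ intact, and the only free variable is one angle. This is an illustration under assumptions, not the authors' algorithm: the helper names `rotate_two_rows` and `gram` are hypothetical, only rotations (not reflections) are covered, and the angle is fixed by hand rather than chosen by the breakpoint search the abstract mentions.

```python
import math


def rotate_two_rows(X, i, j, theta):
    """Return a copy of X with rows i and j mixed by a 2x2 rotation of angle theta.

    All other rows are left untouched, so this is a block update with k = 2.
    """
    c, s = math.cos(theta), math.sin(theta)
    X_new = [row[:] for row in X]
    X_new[i] = [c * a - s * b for a, b in zip(X[i], X[j])]
    X_new[j] = [s * a + c * b for a, b in zip(X[i], X[j])]
    return X_new


def gram(X):
    """Return X^T X as a nested list."""
    cols = list(zip(*X))
    return [[sum(a * b for a, b in zip(u, v)) for v in cols] for u in cols]


# A 4x2 matrix with orthonormal columns, i.e. gram(X) is the 2x2 identity.
X = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
X_next = rotate_two_rows(X, 0, 2, 0.7)  # the angle 0.7 is arbitrary
print(gram(X_next))
```

Because every iterate stays feasible by construction, no projection or retraction step is needed after the update, which is the property the abstract contrasts with infeasible methods.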

Submitted 7 April, 2023; originally announced April 2023.

arXiv:2303.09391 [pdf, other]

Rapidly growing primordial black holes as seeds of the massive high-redshift JWST Galaxies

Authors: Guan-Wen Yuan, Lei Lei, Yuan-Zhu Wang, Bo Wang, Yi-Ying Wang, Chao Chen, Zhao-Qiang Shen, Yi-Fu Cai, Yi-Zhong Fan

Abstract: A group of massive galaxies at redshifts of $z\gtrsim 7$ have been recently detected by the James Webb Space Telescope (JWST), which were not expected to form so early within the framework of standard Big Bang cosmology. In this work, we propose that this puzzle can be explained by the presence of some primordial black holes (PBHs) with a mass of $\sim 1000 M_\odot$. These PBHs, clothed in dark matter halos and undergoing super-Eddington accretion, serve as seeds for the early formation of galaxies with masses of $\sim 10^{8}-10^{10}~M_\odot$ at high redshift, thus accounting for the JWST observations. Using a hierarchical Bayesian inference framework to constrain the PBH mass distribution models, we find that the Lognormal model with $M_{\rm c}\sim 750M_\odot$ is preferred over other hypotheses. These rapidly growing BHs are expected to emit strong radiation and may appear as high-redshift compact objects, similar to those recently discovered by JWST. Although we focus on PBHs in this work, the bound on the initial mass of the seed black holes remains robust even if they were formed through astrophysical channels.

Submitted 18 June, 2024; v1 submitted 16 March, 2023; originally announced March 2023.

Comments: Accepted by Science China Physics, Mechanics & Astronomy

arXiv:2303.09284 [pdf, other]

doi 10.1093/mnras/stad3282

Exploring dark matter spike distribution around the Galactic centre with stellar orbits

Authors: Zhao-Qiang Shen, Guan-Wen Yuan, Cheng-Zi Jiang, Yue-Lin Sming Tsai, Qiang Yuan, Yi-Zhong Fan

Abstract: Precise measurements of the stellar orbits around Sagittarius A* have established the existence of a supermassive black hole (SMBH) at the Galactic centre (GC). Due to the interplay between the SMBH and dark matter (DM), the DM density profile in the innermost region of the Galaxy, which is crucial for DM indirect detection, is still an open question. Among the most popular models in the literature, the theoretical spike profile proposed by Gondolo and Silk (1999; GS hereafter) is widely adopted. In this work, we investigate the DM spike profile using updated data from the Keck and VLT telescopes, considering that the presence of such an extended mass component may affect the orbits of the S-stars in the Galactic centre. We examine the radius and slope of the generalized NFW spike profile, analyze the Einasto spike, and discuss the influence of DM annihilation on the results. Our findings indicate that an initial slope of $\gamma\gtrsim 0.92$ for the generalized NFW spike profile is ruled out at a 95% confidence level. Additionally, a spike radius $R_{\rm sp}$ larger than 21.5 pc is rejected at 95% probability for the Einasto spike with $\alpha=0.17$, which also contradicts the GS spike model. The constraints with the VLT/GRAVITY upper limits are also projected. Although the GS NFW spike is well constrained by the Keck and VLT observations of S2, an NFW spike with a weak annihilation cusp may still be viable, as long as the DM annihilation cross section satisfies $\left< \sigma v \right> \gtrsim 7.7\times 10^{-27}~{\rm cm^3\,s^{-1}} (m_{\rm DM}/100~{\rm GeV})$ at 95% level.

Submitted 24 October, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

Comments: 12 pages, 10 figures and 1 table. Accepted for publication in MNRAS

Journal ref: MNRAS 527, 3196 (2024)

arXiv:2303.07177 [pdf, ps, other]

Fixed Point Theorems for Upper Semicontinuous Set-valued Mappings in $p$-Vector Spaces

Authors: George Xianzhi Yuan

Abstract: The goal of this paper is to establish a general fixed point theorem for compact single-valued continuous mappings in Hausdorff p-vector spaces, and a fixed point theorem for upper semicontinuous set-valued mappings in Hausdorff locally p-convex spaces for p in (0, 1]. These new results provide an answer to the Schauder conjecture in the affirmative under the setting of general p-vector spaces for compact single-valued continuous mappings, and also give fixed point theorems for upper semicontinuous set-valued mappings defined on s-convex subsets in Hausdorff locally p-convex spaces, which would be fundamental for nonlinear functional analysis in mathematics, where s, p in (0, 1].

Submitted 12 April, 2023; v1 submitted 13 December, 2022; originally announced March 2023.

Comments: 13 pages; no figures. arXiv admin note: substantial text overlap with arXiv:2210.10286

MSC Class: 47H04; 47H10; 46A16; 46A55; 49J27; 49J35; 52A07; 54C60; 54H25; 55M20

arXiv:2211.12005 [pdf, other]

Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors

Authors: Sizhe Chen, Geng Yuan, Xinwen Cheng, Yifan Gong, Minghai Qin, Yanzhi Wang, Xiaolin Huang

Abstract: As data becomes increasingly vital, a company would be very cautious about releasing data, because competitors could use it to train high-performance models, thereby posing a tremendous threat to the company's commercial competitiveness. To prevent training good models on the data, we could add imperceptible perturbations to it. Since such perturbations aim at hurting the entire training process, they should reflect the vulnerability of DNN training, rather than that of a single model. Based on this new idea, we seek perturbed examples that are always unrecognized (never correctly classified) in training. In this paper, we uncover them by model checkpoints' gradients, forming the proposed self-ensemble protection (SEP), which is very effective because (1) learning on examples ignored during normal training tends to yield DNNs ignoring normal examples; (2) checkpoints' cross-model gradients are close to orthogonal, meaning that they are as diverse as DNNs with different architectures. That is, the strong performance of our ensemble requires only the computation of training one model. By extensive experiments with 9 baselines on 3 datasets and 5 architectures, SEP is verified to be a new state-of-the-art, e.g., our small $\ell_\infty=2/255$ perturbations reduce the accuracy of a CIFAR-10 ResNet18 from 94.56% to 14.68%, compared to 41.35% by the best-known method. Code is available at https://github.com/Sizhe-Chen/SEP.

Submitted 12 April, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

Comments: ICLR 2023

arXiv:2211.10801 [pdf, other]

Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training

Authors: Zhenglun Kong, Haoyu Ma, Geng Yuan, Mengshu Sun, Yanyue Xie, Peiyan Dong, Xin Meng, Xuan Shen, Hao Tang, Minghai Qin, Tianlong Chen, Xiaolong Ma, Xiaohui Xie, Zhangyang Wang, Yanzhi Wang

Abstract: Vision transformers (ViTs) have recently obtained success in many applications, but their intensive computation and heavy memory usage at both training and inference time limit their generalization. Previous compression algorithms usually start from the pre-trained dense models and only focus on efficient inference, while time-consuming training is still unavoidable. In contrast, this paper points out that the million-scale training data is redundant, which is the fundamental reason for the tedious training. To address the issue, this paper aims to introduce sparsity into data and proposes an end-to-end efficient training framework from three sparse perspectives, dubbed Tri-Level E-ViT. Specifically, we leverage a hierarchical data redundancy reduction scheme, by exploring the sparsity under three levels: number of training examples in the dataset, number of patches (tokens) in each example, and number of connections between tokens that lie in attention weights. With extensive experiments, we demonstrate that our proposed technique can noticeably accelerate training for various ViT architectures while maintaining accuracy. Remarkably, under certain ratios, we are able to improve the ViT accuracy rather than compromising it. For example, we can achieve 15.2% speedup with 72.6% (+0.4) Top-1 accuracy on DeiT-T, and 15.7% speedup with 79.9% (+0.1) Top-1 accuracy on DeiT-S. This proves the existence of data redundancy in ViT.

Submitted 19 November, 2022; originally announced November 2022.

Comments: AAAI 2023

arXiv:2211.02851 [pdf, ps, other]

Linearized inverse problem for biharmonic operators at high frequencies

Authors: Xiaomeng Zhao, Ganghua Yuan

Abstract: In this paper, we study the phenomenon of increasing stability in the inverse boundary value problems for the biharmonic equation. By considering a linearized form, we obtain an increasing Lipschitz-like stability when k is large. Furthermore, we extend the discussion to the linearized inverse biharmonic potential problem with attenuation, where an exponential dependence on the attenuation constant is traced in the stability estimate.

Submitted 5 November, 2022; originally announced November 2022.

Comments: 18 pages. arXiv admin note: text overlap with arXiv:1812.05011 by other authors

MSC Class: 35R30; 31B30

arXiv:2211.01484 [pdf, other]

Data Level Lottery Ticket Hypothesis for Vision Transformers

Authors: Xuan Shen, Zhenglun Kong, Minghai Qin, Peiyan Dong, Geng Yuan, Xin Meng, Hao Tang, Xiaolong Ma, Yanzhi Wang

Abstract: The conventional lottery ticket hypothesis (LTH) claims that there exists a sparse subnetwork within a dense neural network and a proper random initialization method, called the winning ticket, such that it can be trained from scratch to almost as good as the dense counterpart. Meanwhile, LTH in vision transformers (ViTs) has scarcely been evaluated. In this paper, we first show that the conventional winning ticket is hard to find at the weight level of ViTs by existing methods. Then, we generalize the LTH for ViTs to input data consisting of image patches, inspired by the input dependence of ViTs. That is, there exists a subset of input image patches such that a ViT can be trained from scratch by using only this subset of patches and achieve similar accuracy to the ViTs trained by using all image patches. We call this subset of input patches the winning tickets, which represent a significant amount of information in the input data. We use a ticket selector to generate the winning tickets based on the informativeness of patches for various types of ViT, including DeiT, LV-ViT, and Swin Transformers. The experiments show that there is a clear difference between the performance of models trained with winning tickets and randomly selected subsets, which verifies our proposed theory. We elaborate on the analogical similarity between our proposed Data-LTH-ViTs and the conventional LTH to further verify the integrity of our theory. The source code is available at https://github.com/shawnricecake/vit-lottery-ticket-input.
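The idea of keeping only an informative subset of patches can be sketched as a top-k selection over per-patch scores. This is a minimal illustration, not the authors' ticket selector: the function name `select_winning_patches`, the `keep_ratio` parameter, and the example scores are all hypothetical, and the abstract does not say how informativeness is actually computed.

```python
import math


def select_winning_patches(scores, keep_ratio):
    """Return the indices of the highest-scoring patches, in original order.

    scores     -- one informativeness score per image patch (hypothetical values)
    keep_ratio -- fraction of patches to keep, in (0, 1]
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, math.ceil(keep_ratio * len(scores)))
    # Rank patch indices from most to least informative.
    ranked = sorted(range(len(scores)), key=lambda idx: scores[idx], reverse=True)
    # Restore spatial order so positional information stays consistent.
    return sorted(ranked[:n_keep])


patch_scores = [0.1, 0.9, 0.4, 0.8, 0.05, 0.6]  # made-up scores for six patches
kept = select_winning_patches(patch_scores, keep_ratio=0.5)
print(kept)  # → [1, 3, 5]
```

A random-subset baseline, as used for comparison in the abstract, would replace the ranking step with uniform sampling of the same number of indices.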

Submitted 29 May, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

Comments: Accepted by IJCAI 2023

arXiv:2210.16868 [pdf, other]

Catching the geometric phase effect around conical intersection in molecules by high order harmonic spectroscopy

Authors: Guanglu Yuan, Ruifeng Lu, Shicheng Jiang, Konstantin Dorfman

Abstract: Nonadiabatic dynamics around an avoided crossing or a conical intersection play a crucial role in the photoinduced processes of most polyatomic molecules. The present work shows that the topological phase in a conical intersection makes the behavior of pump-probe high-order harmonic spectroscopy different from the case of an avoided crossing. The coherence built up when the system crosses the avoided crossing leads to oscillatory behavior of the spectrum, while the geometric phase erodes these oscillations in the case of a conical intersection. Additionally, the dynamical blueshift and the splitting of the time-resolved spectrum allow capturing the snapshot dynamics with sub-femtosecond resolution.

Submitted 30 October, 2022; originally announced October 2022.

Comments: 10 pages, 5 figures

arXiv:2210.10629 [pdf, other]

Tenrec: A Large-scale Multipurpose Benchmark Dataset for Recommender Systems

Authors: Guanghu Yuan, Fajie Yuan, Yudong Li, Beibei Kong, Shujie Li, Lei Chen, Min Yang, Chenyun Yu, Bo Hu, Zang Li, Yu Xu, Xiaohu Qie

Abstract: Existing benchmark datasets for recommender systems (RS) either are created at a small scale or involve very limited forms of user feedback. RS models evaluated on such datasets often lack practical value for large-scale real-world applications. In this paper, we describe Tenrec, a novel and publicly available data collection for RS that records various user feedback from four different recommendation scenarios. To be specific, Tenrec has the following five characteristics: (1) it is large-scale, containing around 5 million users and 140 million interactions; (2) it has not only positive user feedback, but also true negative feedback (vs. one-class recommendation); (3) it contains overlapped users and items across four different scenarios; (4) it contains various types of positive user feedback, in the form of clicks, likes, shares, follows, etc.; (5) it contains additional features beyond the user IDs and item IDs. We verify Tenrec on ten diverse recommendation tasks by running several classical baseline models per task. Tenrec has the potential to become a useful benchmark dataset for a majority of popular recommendation tasks.

Submitted 4 June, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

arXiv:2210.10286 [pdf, ps, other]

Nonlinear Analysis in p-Vector Spaces for Single-Valued 1-Set Contractive Mappings

Authors: George Xianzhi Yuan

Abstract: The goal of this paper is to develop some fundamental and important nonlinear analysis for single-valued mappings under the framework of p-vector spaces, in particular, for locally p-convex spaces for p in (0, 1]. More precisely, based on the fixed point theorem of single-valued continuous condensing mappings in locally p-convex spaces as the starting point, we first establish best approximation results for (single-valued) continuous condensing mappings, which are then used to develop new results for three classes of nonlinear mappings consisting of 1) condensing; 2) 1-set contractive; and 3) semiclosed 1-set contractive mappings in locally p-convex spaces. Next, they are used to establish a general principle for the nonlinear alternative, the Leray-Schauder alternative, and fixed points for non-self mappings with different boundary conditions for nonlinear mappings from locally p-convex spaces, to nonexpansive mappings in uniformly convex Banach spaces, or locally convex spaces with the Opial condition. The results given by this paper not only include the corresponding ones in the existing literature as special cases, but are also expected to be useful tools for the development of new theory in nonlinear functional analysis and applications to the study of related nonlinear problems arising from practice under the general framework of p-vector spaces for p in (0, 1]. Finally, the work presented in this paper focuses on the development of nonlinear analysis for single-valued (instead of set-valued) mappings in locally p-convex spaces; it is essentially a continuation of the associated work given recently by Yuan [134], where attention is given to the study of nonlinear analysis for set-valued mappings in locally p-convex spaces for p in (0, 1].

Submitted 19 October, 2022; originally announced October 2022.

Comments: 65

MSC Class: 47H04; 47H10; 46A16; 46A55; 49J27; 49J35; 52A07; 54C60; 54H25; 55M20

arXiv:2210.04623 [pdf, other]

DeltaFS: Pursuing Zero Update Overhead via Metadata-Enabled Delta Compression for Log-structured File System on Mobile Devices

Authors: Chao Wu, Cheng Ji, Geng Yuan, Riwei Pan, Weichao Guo, Chao Yu, Zongwei Zhu, Yanzhi Wang

Abstract: Data compression has been widely adopted to relieve mobile devices from intensive write pressure. Delta compression is particularly promising for its high compression efficacy over conventional compression methods. However, this method suffers from non-trivial system overheads incurred by delta maintenance and read penalty, which prevents its applicability on mobile devices. To this end, this paper proposes DeltaFS, a metadata-enabled Delta compression on log-structured File System for mobile devices, to achieve utmost compression efficiency and zero hardware costs. DeltaFS exploits the out-of-place updating ability of the Log-structured File System (LFS) to alleviate the problem of write amplification, which is the key bottleneck for delta compression implementation. Specifically, DeltaFS utilizes the inline area in file inodes for delta maintenance with zero hardware cost, and integrates an inline area management strategy to improve the utilization of the constrained inline area. Moreover, a complementary delta maintenance strategy is incorporated, which selectively maintains delta chunks in the main data area to break through the limitation of the constrained inline area. Experimental results show that DeltaFS substantially reduces write traffic by up to 64.8%, and improves the I/O performance by up to 37.3%.
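The general principle of delta compression, storing only the difference between an old and an updated block, can be sketched in a few lines. This is a generic illustration, not DeltaFS's on-disk format: XOR plus `zlib` is a stand-in encoding chosen for brevity, the function names `make_delta` and `apply_delta` are hypothetical, and the sketch assumes equal-sized blocks.

```python
import zlib


def make_delta(old: bytes, new: bytes) -> bytes:
    """Encode `new` relative to `old`: XOR the blocks, then compress.

    Unchanged bytes XOR to zero, so a small update yields long zero runs
    that compress to a few bytes.
    """
    if len(old) != len(new):
        raise ValueError("this sketch assumes equal-sized blocks")
    xored = bytes(a ^ b for a, b in zip(old, new))
    return zlib.compress(xored)


def apply_delta(old: bytes, delta: bytes) -> bytes:
    """Reconstruct the updated block from the old block and its delta."""
    xored = zlib.decompress(delta)
    return bytes(a ^ b for a, b in zip(old, xored))


old_block = bytes(range(256)) * 16       # a 4 KiB block
updated = bytearray(old_block)
updated[100:108] = b"UPDATED!"           # an 8-byte in-place update
new_block = bytes(updated)

delta = make_delta(old_block, new_block)
print(len(new_block), len(delta))        # full block size vs. delta size
```

The read path shows the cost the abstract calls the read penalty: recovering `new_block` requires both the old block and the delta, plus a decode step.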

Submitted 6 October, 2022; originally announced October 2022.

arXiv:2209.11204 [pdf, other]

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training

Authors: Geng Yuan, Yanyu Li, Sheng Li, Zhenglun Kong, Sergey Tulyakov, Xulong Tang, Yanzhi Wang, Jian Ren

Abstract: Recently, sparse training has emerged as a promising paradigm for efficient deep learning on edge devices. The current research mainly devotes efforts to reducing training costs by further increasing model sparsity. However, increasing sparsity is not always ideal since it will inevitably introduce severe accuracy degradation at an extremely high sparsity level. This paper intends to explore other possible directions to effectively and efficiently reduce sparse training costs while preserving accuracy. To this end, we investigate two techniques, namely, layer freezing and data sieving. First, the layer freezing approach has shown its success in dense model training and fine-tuning, yet it has never been adopted in the sparse training domain. Nevertheless, the unique characteristics of sparse training may hinder the incorporation of layer freezing techniques. Therefore, we analyze the feasibility and potentiality of using the layer freezing technique in sparse training and find it has the potential to save considerable training costs. Second, we propose a data sieving method for dataset-efficient training, which further reduces training costs by ensuring only a partial dataset is used throughout the entire training process. We show that both techniques can be well incorporated into the sparse training algorithm to form a generic framework, which we dub SpFDE. Our extensive experiments demonstrate that SpFDE can significantly reduce training costs while preserving accuracy from three dimensions: weight sparsity, layer freezing, and dataset sieving.

Submitted 22 September, 2022; originally announced September 2022.

Comments: Published in 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

Showing 1–50 of 244 results for author: Yuan, G