Search | arXiv e-print repository

There is No Place Like Home -- Finding Birth Radii of Stars in the Milky Way

Authors: Yuxi Lu, Ivan Minchev, Tobias Buck, Sergey Khoperskov, Matthias Steinmetz, Noam Libeskind, Gabriele Cescutti, Ken C. Freeman, Bridget Ratcliffe

Abstract: Stars move away from their birthplaces over time via a process known as radial migration, which blurs chemo-kinematic relations used for reconstructing the Milky Way (MW) formation history. To understand the true time evolution of the MW, one needs to take into account the effects of this process. We show that stellar birth radii can be derived directly from the data with minimum prior assumptions on the Galactic enrichment history. This is done by first recovering the time evolution of the stellar birth metallicity gradient, $d\mathrm{[Fe/H]}(R, τ)/dR$, through its inverse relation to the metallicity range as a function of age today, allowing us to place any star with age and metallicity measurements back to its birthplace, $R_b$. Applying our method to a large, high-precision data set of MW disk subgiant stars, we find a steepening of the birth metallicity gradient from 11 to 8 Gyr ago, which coincides with the time of the last massive merger, Gaia-Sausage-Enceladus (GSE). This transition appears to play a major role in shaping both the age-metallicity relation and the bimodality in the [$α$/Fe]-[Fe/H] plane. By dissecting the disk into mono-$R_b$ populations, clumps in the low-[$α$/Fe] sequence appear, which are not seen in the total sample and coincide in time with known star-formation bursts, possibly associated with the Sagittarius Dwarf Galaxy. We estimated that the Sun was born at $4.5\pm 0.4$ kpc from the Galactic center. Our $R_b$ estimates provide the missing piece needed to recover the Milky Way formation history.

Submitted 22 May, 2024; v1 submitted 8 December, 2022; originally announced December 2022.

arXiv:2212.02399 [pdf, other]

Flat bands, non-trivial band topology and electronic nematicity in layered kagome-lattice RbTi$_3$Bi$_5$

Authors: Zhicheng Jiang, Zhengtai Liu, Haiyang Ma, Wei Xia, Zhonghao Liu, Jishan Liu, Soohyun Cho, Yichen Yang, Jianyang Ding, Jiayu Liu, Zhe Huang, Yuxi Qiao, Jiajia Shen, Wenchuan Jing, Xiangqi Liu, Jianpeng Liu, Yanfeng Guo, Dawei Shen

Abstract: Layered kagome-lattice materials with 3$d$ transition metals provide a fertile playground for studies on geometry frustration, band topology and other novel ordered states. A representative class of materials AV$_3$Sb$_5$ (A=K, Rb, Cs) have been proved to possess various unconventional phases such as superconductivity, non-trivial $\mathbb{Z}_2$ band topology, and electronic nematicity, which are intertwined with multiple interlaced charge density waves (CDW). However, the interplay among these novel states and their mechanisms are still elusive. Recently, the discovery of isostructural titanium-based single-crystals ATi$_3$Bi$_5$ (A=K, Rb, Cs), which demonstrate similar multiple exotic states but in the absence of the concomitant intertwined CDW, has been offering an ideal opportunity to disentangle these complex novel states in kagome-lattice. Here, we combine the high-resolution angle-resolved photoemission spectroscopy and first-principles calculations to systematically investigate the low-lying electronic structure of RbTi$_3$Bi$_5$. For the first time, we experimentally demonstrate the coexistence of flat bands and multiple non-trivial topological states, including type-II Dirac nodal lines and non-trivial $\mathbb{Z}_2$ topological surface states therein. Furthermore, our findings as well provide the hint of rotation symmetry breaking in RbTi$_3$Bi$_5$, suggesting the directionality of the electronic structure and possibility of emerging pure electronic nematicity in this new family of kagome compounds, which may provide important insights into the electronic nematic phase in correlated kagome metals.

Submitted 5 December, 2022; originally announced December 2022.

Comments: 6 pages, 5 figures

arXiv:2211.12018 [pdf, other]

Level-S$^2$fM: Structure from Motion on Neural Level Set of Implicit Surfaces

Authors: Yuxi Xiao, Nan Xue, Tianfu Wu, Gui-Song Xia

Abstract: This paper presents a neural incremental Structure-from-Motion (SfM) approach, Level-S$^2$fM, which estimates the camera poses and scene geometry from a set of uncalibrated images by learning coordinate MLPs for the implicit surfaces and the radiance fields from the established keypoint correspondences. Our novel formulation poses some new challenges due to inevitable two-view and few-view configurations in the incremental SfM pipeline, which complicates the optimization of coordinate MLPs for volumetric neural rendering with unknown camera poses. Nevertheless, we demonstrate that the strong inductive basis conveying in the 2D correspondences is promising to tackle those challenges by exploiting the relationship between the ray sampling schemes. Based on this, we revisit the pipeline of incremental SfM and renew the key components, including two-view geometry initialization, the camera poses registration, the 3D points triangulation, and Bundle Adjustment, with a fresh perspective based on neural implicit surfaces. By unifying the scene geometry in small MLP networks through coordinate MLPs, our Level-S$^2$fM treats the zero-level set of the implicit surface as an informative top-down regularization to manage the reconstructed 3D points, reject the outliers in correspondences via querying SDF, and refine the estimated geometries by NBA (Neural BA). Not only does our Level-S$^2$fM lead to promising results on camera pose estimation and scene geometry reconstruction, but it also shows a promising way for neural implicit rendering without knowing camera extrinsic beforehand.

Submitted 27 March, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

Comments: camera-ready version (CVPR 2023). Project page: https://henry123-boy.github.io/level-s2fm/

arXiv:2211.06045 [pdf, other]

Integrated Convolutional and Recurrent Neural Networks for Health Risk Prediction using Patient Journey Data with Many Missing Values

Authors: Yuxi Liu, Shaowen Qin, Antonio Jimeno Yepes, Wei Shao, Zhenhao Zhang, Flora D. Salim

Abstract: Predicting the health risks of patients using Electronic Health Records (EHR) has attracted considerable attention in recent years, especially with the development of deep learning techniques. Health risk refers to the probability of the occurrence of a specific health outcome for a specific patient. The predicted risks can be used to support decision-making by healthcare professionals. EHRs are structured patient journey data. Each patient journey contains a chronological set of clinical events, and within each clinical event, there is a set of clinical/medical activities. Due to variations of patient conditions and treatment needs, EHR patient journey data has an inherently high degree of missingness that contains important information affecting relationships among variables, including time. Existing deep learning-based models generate imputed values for missing values when learning the relationships. However, imputed data in EHR patient journey data may distort the clinical meaning of the original EHR patient journey data, resulting in classification bias. This paper proposes a novel end-to-end approach to modeling EHR patient journey data with Integrated Convolutional and Recurrent Neural Networks. Our model can capture both long- and short-term temporal patterns within each patient journey and effectively handle the high degree of missingness in EHR data without any imputation data generation. Extensive experimental results using the proposed model on two real-world datasets demonstrate robust performance as well as superior prediction accuracy compared to existing state-of-the-art imputation-based prediction methods.

Submitted 13 November, 2022; v1 submitted 11 November, 2022; originally announced November 2022.

Comments: 6 pages, 2 figures, accepted at IEEE BIBM 2022

arXiv:2211.00890 [pdf, other]

Rethinking the Metric in Few-shot Learning: From an Adaptive Multi-Distance Perspective

Authors: Jinxiang Lai, Siqian Yang, Guannan Jiang, Xi Wang, Yuxi Li, Zihui Jia, Xiaochen Chen, Jun Liu, Bin-Bin Gao, Wei Zhang, Yuan Xie, Chengjie Wang

Abstract: Few-shot learning problem focuses on recognizing unseen classes given a few labeled images. In recent effort, more attention is paid to fine-grained feature embedding, ignoring the relationship among different distance metrics. In this paper, for the first time, we investigate the contributions of different distance metrics, and propose an adaptive fusion scheme, bringing significant improvements in few-shot classification. We start from a naive baseline of confidence summation and demonstrate the necessity of exploiting the complementary property of different distance metrics. By finding the competition problem among them, built upon the baseline, we propose an Adaptive Metrics Module (AMM) to decouple metrics fusion into metric-prediction fusion and metric-losses fusion. The former encourages mutual complementary, while the latter alleviates metric competition via multi-task collaborative learning. Based on AMM, we design a few-shot classification framework AMTNet, including the AMM and the Global Adaptive Loss (GAL), to jointly optimize the few-shot task and auxiliary self-supervised task, making the embedding features more robust. In the experiment, the proposed AMM achieves 2% higher performance than the naive metrics fusion module, and our AMTNet outperforms the state-of-the-arts on multiple benchmark datasets.

Submitted 2 November, 2022; originally announced November 2022.

Journal ref: Proceedings of the 30th ACM International Conference on Multimedia 2022

arXiv:2210.08728 [pdf, other]

Fault Injection based Failure Analysis of three CentOS-like Operating Systems

Authors: Hao Xu, Yuxi Hu, Bolong Tan, Xiaohai Shi, Zhangjun Lu, Wei Zhang, Jianhui Jiang

Abstract: The reliability of operating system (OS) has always been a major concern in the academia and industry. This paper studies how to perform OS failure analysis by fault injection based on the fault mode library. Firstly, we use the fault mode generation method based on Linux abstract hierarchy structure analysis to systematically define the Linux-like fault modes, construct a Linux fault mode library and develop a fault injection tool based on the fault mode library (FIFML). Then, fault injection experiments are carried out on three commercial Linux distributions, CentOS, Anolis OS and openEuler, to identify their reliability problems and give improvement suggestions. We also use the virtual file systems of these three OSs as experimental objects, to perform fault injection at levels of Light and Normal, measure the performance of 13 common file operations before and after fault injection.

Submitted 27 November, 2023; v1 submitted 16 October, 2022; originally announced October 2022.

Comments: 9 pages, 8 figures

arXiv:2210.07486 [pdf, other]

AFETM: Adaptive function execution trace monitoring for fault diagnosis

Authors: Wei Zhang, Yuxi Hu, Bolong Tan, Xiaohai Shi, Jianhui Jiang

Abstract: The high tracking overhead, the amount of up-front effort required to select the trace points, and the lack of effective data analysis model are the significant barriers to the adoption of intra-component tracking for fault diagnosis today. This paper introduces a novel method for fault diagnosis by combining adaptive function level dynamic tracking, target fault injection, and graph convolutional network. In order to implement this method, we introduce techniques for (i) selecting function level trace points, (ii) constructing approximate function call tree of program when using adaptive tracking, and (iii) constructing graph convolutional network with fault injection campaign. We evaluate our method using a web service benchmark composed of Redis, Nginx, Httpd, and SQLite. The experimental results show that this method outperforms log based method, full tracking method, and Gaussian influence method in the accuracy of fault diagnosis, overhead, and performance impact on the diagnosis target.

Submitted 13 October, 2022; originally announced October 2022.

arXiv:2210.06604 [pdf, other]

doi 10.3847/1538-3881/ac9bee

Bridging the gap -- the disappearance of the intermediate period gap for fully convective stars, uncovered by new ZTF rotation periods

Authors: Yuxi Lu, Jason L. Curtis, Ruth Angus, Trevor J. David, Soichiro Hattori

Abstract: The intermediate period gap, discovered by Kepler, is an observed dearth of stellar rotation periods in the temperature-period diagram at $\sim$ 20 days for G dwarfs and up to $\sim$ 30 days for early-M dwarfs. However, because Kepler mainly targeted solar-like stars, there is a lack of measured periods for M dwarfs, especially those at the fully convective limit. Therefore it is unclear if the intermediate period gap exists for mid- to late-M dwarfs. Here, we present a period catalog containing 40,553 rotation periods (9,535 periods $>$ 10 days), measured using the Zwicky Transient Facility (ZTF). To measure these periods, we developed a simple pipeline that improves directly on the ZTF archival light curves and reduces the photometric scatter by 26%, on average. This new catalog spans a range of stellar temperatures that connect samples from Kepler with MEarth, a ground-based time domain survey of bright M-dwarfs, and reveals that the intermediate period gap closes at the theoretically predicted location of the fully convective boundary ($G_{\rm BP} - G_{\rm RP} \sim 2.45$ mag). This result supports the hypothesis that the gap is caused by core-envelope interactions. Using gyro-kinematic ages, we also find a potential rapid spin-down of stars across this period gap.

Submitted 12 October, 2022; originally announced October 2022.

arXiv:2210.05517 [pdf, other]

DeepMLE: A Robust Deep Maximum Likelihood Estimator for Two-view Structure from Motion

Authors: Yuxi Xiao, Li Li, Xiaodi Li, Jian Yao

Abstract: Two-view structure from motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM (vSLAM). Many existing end-to-end learning-based methods usually formulate it as a brute regression problem. However, the inadequate utilization of traditional geometry model makes the model not robust in unseen environments. To improve the generalization capability and robustness of end-to-end two-view SfM network, we formulate the two-view SfM problem as a maximum likelihood estimation (MLE) and solve it with the proposed framework, denoted as DeepMLE. First, we propose to take the deep multi-scale correlation maps to depict the visual similarities of 2D image matches decided by ego-motion. In addition, in order to increase the robustness of our framework, we formulate the likelihood function of the correlations of 2D image matches as a Gaussian and Uniform mixture distribution which takes the uncertainty caused by illumination changes, image noise and moving objects into account. Meanwhile, an uncertainty prediction module is presented to predict the pixel-wise distribution parameters. Finally, we iteratively refine the depth and relative camera pose using the gradient-like information to maximize the likelihood function of the correlations. Extensive experimental results on several datasets prove that our method significantly outperforms the state-of-the-art end-to-end two-view SfM approaches in accuracy and generalization capability.

Submitted 11 October, 2022; originally announced October 2022.

Comments: 8 pages, Accepted by IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2022)

arXiv:2209.10773 [pdf, ps, other]

Asymptotic stability of rarefaction waves for compressible Navier-Stokes equations with relaxation

Authors: Yuxi Hu, Xuefang Wang

Abstract: The asymptotic stability of rarefaction wave for 1-d relaxed compressible isentropic Navier-Stokes equations is established. For initial data with different far-field values, we show that there exists a unique global in time solution. Moreover, as time goes to infinity, the obtained solutions are shown to converge uniformly to rarefaction wave solution of $p$-system with corresponding Riemann initial data. The proof is based on $L^2$ energy methods.

Submitted 22 September, 2022; originally announced September 2022.

arXiv:2209.08337 [pdf]

doi 10.1016/j.knosys.2022.109824

Lightweight Spatial-Channel Adaptive Coordination of Multilevel Refinement Enhancement Network for Image Reconstruction

Authors: Yuxi Cai, Huicheng Lai, Zhenghong Jia

Abstract: Benefiting from the vigorous development of deep learning, many CNN-based image super-resolution methods have emerged and achieved better results than traditional algorithms. However, it is difficult for most algorithms to adaptively adjust the spatial region and channel features at the same time, let alone the information exchange between them. In addition, the exchange of information between attention modules is even less visible to researchers. To solve these problems, we put forward a lightweight spatial-channel adaptive coordination of multilevel refinement enhancement networks (MREN). Specifically, we construct a space-channel adaptive coordination block, which enables the network to learn the spatial region and channel feature information of interest under different receptive fields. In addition, the information of the corresponding feature processing level between the spatial part and the channel part is exchanged with the help of jump connection to achieve the coordination between the two. We establish a communication bridge between attention modules through a simple linear combination operation, so as to more accurately and continuously guide the network to pay attention to the information of interest. Extensive experiments on several standard test sets have shown that our MREN achieves superior performance over other advanced algorithms with a very small number of parameters and very low computational complexity.

Submitted 17 September, 2022; originally announced September 2022.

arXiv:2208.10683 [pdf, other]

Learning from Noisy Labels with Coarse-to-Fine Sample Credibility Modeling

Authors: Boshen Zhang, Yuxi Li, Yuanpeng Tu, Jinlong Peng, Yabiao Wang, Cunlin Wu, Yang Xiao, Cairong Zhao

Abstract: Training deep neural network (DNN) with noisy labels is practically challenging since inaccurate labels severely degrade the generalization ability of DNN. Previous efforts tend to handle part or full data in a unified denoising flow via identifying noisy data with a coarse small-loss criterion to mitigate the interference from noisy labels, ignoring the fact that the difficulties of noisy samples are different, thus a rigid and unified data selection pipeline cannot tackle this problem well. In this paper, we first propose a coarse-to-fine robust learning method called CREMA, to handle noisy data in a divide-and-conquer manner. In coarse-level, clean and noisy sets are firstly separated in terms of credibility in a statistical sense. Since it is practically impossible to categorize all noisy samples correctly, we further process them in a fine-grained manner via modeling the credibility of each sample. Specifically, for the clean set, we deliberately design a memory-based modulation scheme to dynamically adjust the contribution of each sample in terms of its historical credibility sequence during training, thus alleviating the effect from noisy samples incorrectly grouped into the clean set. Meanwhile, for samples categorized into the noisy set, a selective label update strategy is proposed to correct noisy labels while mitigating the problem of correction error. Extensive experiments are conducted on benchmarks of different modalities, including image classification (CIFAR, Clothing1M, etc.) and text recognition (IMDB), with either synthetic or natural semantic noises, demonstrating the superiority and generality of CREMA.

Submitted 22 August, 2022; originally announced August 2022.

Comments: ECCV 2022: L2ID Workshop

arXiv:2208.10437 [pdf, other]

Multiple topological nodal structure in LaSb2 with large linear magnetoresistance

Authors: Y. X. Qiao, Z. C. Tao, F. Y. Wang, Huaiqiang Wang, Z. C. Jiang, Z. T. Liu, Soohyun Cho, F. Y. Zhang, Q. K. Meng, W. Xia, Y. C. Yang, Z. Huang, J. S. Liu, Z. H. Liu, Z. W. Zhu, S. Qiao, Y. F. Guo, Haijun Zhang, Dawei Shen

Abstract: Unconventional fermions in the immensely studied topological semimetals are the source for rich exotic topological properties. Here, using symmetry analysis and first-principles calculations, we propose the coexistence of multiple topological nodal structure in LaSb2, including topological nodal surfaces, nodal lines and in particular eightfold degenerate nodal points, which have been scarcely observed in a single material. Further, utilizing high resolution angle-resolved photoemission spectroscopy in combination with Shubnikov-de Haas quantum oscillations measurements, we confirm the existence of nodal surfaces and eightfold degenerate nodal points in LaSb2, and extract the π Berry phase proving the non-trivial electronic band structure topology therein. The intriguing multiple topological nodal structure might play a crucial role in giving rise to the large linear magnetoresistance. Our work renews the insights into the exotic topological phenomena in LaSb2 and its analogous.

Submitted 22 August, 2022; originally announced August 2022.

arXiv:2208.07846 [pdf, other]

TexPrax: A Messaging Application for Ethical, Real-time Data Collection and Annotation

Authors: Lorenz Stangier, Ji-Ung Lee, Yuxi Wang, Marvin Müller, Nicholas Frick, Joachim Metternich, Iryna Gurevych

Abstract: Collecting and annotating task-oriented dialog data is difficult, especially for highly specific domains that require expert knowledge. At the same time, informal communication channels such as instant messengers are increasingly being used at work. This has led to a lot of work-relevant information that is disseminated through those channels and needs to be post-processed manually by the employees. To alleviate this problem, we present TexPrax, a messaging system to collect and annotate problems, causes, and solutions that occur in work-related chats. TexPrax uses a chatbot to directly engage the employees to provide lightweight annotations on their conversation and ease their documentation work. To comply with data privacy and security regulations, we use an end-to-end message encryption and give our users full control over their data which has various advantages over conventional annotation tools. We evaluate TexPrax in a user-study with German factory employees who ask their colleagues for solutions on problems that arise during their daily work. Overall, we collect 202 task-oriented German dialogues containing 1,027 sentences with sentence-level expert annotations. Our data analysis also reveals that real-world conversations frequently contain instances with code-switching, varying abbreviations for the same entity, and dialects which NLP systems should be able to handle.

Submitted 20 October, 2022; v1 submitted 16 August, 2022; originally announced August 2022.

Comments: Accepted at AACL 2022 (System Demonstrations). Code and data: https://github.com/UKPLab/TexPrax

arXiv:2208.01499 [pdf]

doi 10.1021/acs.nanolett.3c01151

Observation of electronic nematicity driven by three-dimensional charge density wave in kagome lattice KV$_3$Sb$_5$

Authors: Zhicheng Jiang, Haiyang Ma, Wei Xia, Zhengtai Liu, Qian Xiao, Zhonghao Liu, Yichen Yang, Jianyang Ding, Zhe Huang, Jiayu Liu, Yuxi Qiao, Jishan Liu, Yingying Peng, Soohyun Cho, Yanfeng Guo, Jianpeng Liu, Dawei Shen

Abstract: Kagome superconductors AV$_3$Sb$_5$ (A = K, Rb, Cs) provide a fertile playground for studying intriguing phenomena, including non-trivial band topology, superconductivity, giant anomalous Hall effect and charge density wave (CDW). Recently, a $C_2$ symmetric nematic phase prior to the superconducting state in AV$_3$Sb$_5$ drew enormous attention due to its potential inheritance of the symmetry of the unusual superconductivity. However, direct evidence on the rotation symmetry breaking of the electronic structure in the CDW state from the reciprocal space is still rare, and the underlying mechanism remains ambiguous. The observation shows unconventional unidirectionality, indicative of rotation symmetry breaking from six-fold to two-fold. The interlayer coupling between adjacent planes with $π$-phase offset in the 2$\times$2$\times$2 CDW phase leads to the preferred two-fold symmetric electronic structure. These rarely observed unidirectional back-folded bands in KV$_3$Sb$_5$ may provide important insights into its peculiar charge order and superconductivity.

Submitted 15 June, 2023; v1 submitted 2 August, 2022; originally announced August 2022.

arXiv:2208.01320 [pdf, other]

Compound Density Networks for Risk Prediction using Electronic Health Records

Authors: Yuxi Liu, Shaowen Qin, Zhenhao Zhang, Wei Shao

Abstract: Electronic Health Records (EHRs) exhibit a high amount of missing data due to variations of patient conditions and treatment needs. Imputation of missing values has been considered an effective approach to deal with this challenge. Existing work separates imputation method and prediction model as two independent parts of an EHR-based machine learning system. We propose an integrated end-to-end approach by utilizing a Compound Density Network (CDNet) that allows the imputation method and prediction model to be tuned together within a single framework. CDNet consists of a Gated recurrent unit (GRU), a Mixture Density Network (MDN), and a Regularized Attention Network (RAN). The GRU is used as a latent variable model to model EHR data. The MDN is designed to sample latent variables generated by GRU. The RAN serves as a regularizer for less reliable imputed values. The architecture of CDNet enables GRU and MDN to iteratively leverage the output of each other to impute missing values, leading to a more accurate and robust prediction. We validate CDNet on the mortality prediction task on the MIMIC-III dataset. Our model outperforms state-of-the-art models by significant margins. We also empirically show that regularizing imputed values is a key factor for superior prediction performance. Analysis of prediction uncertainty shows that our model can capture both aleatoric and epistemic uncertainties, which offers model users a better understanding of the model results.

Submitted 24 October, 2022; v1 submitted 2 August, 2022; originally announced August 2022.

Comments: 8 pages, 6 figures, accepted at IEEE BIBM 2022

arXiv:2207.09693 [pdf, other]

doi 10.1109/TBME.2023.3246599

Correntropy-Based Logistic Regression with Automatic Relevance Determination for Robust Sparse Brain Activity Decoding

Authors: Yuanhao Li, Badong Chen, Yuxi Shi, Natsue Yoshimura, Yasuharu Koike

Abstract: Recent studies have utilized sparse classifications to predict categorical variables from high-dimensional brain activity signals to expose human's intentions and mental states, selecting the relevant features automatically in the model training process. However, existing sparse classification models will likely be prone to the performance degradation which is caused by noise inherent in the brain recordings. To address this issue, we aim to propose a new robust and sparse classification algorithm in this study. To this end, we introduce the correntropy learning framework into the automatic relevance determination based sparse classification model, proposing a new correntropy-based robust sparse logistic regression algorithm. To demonstrate the superior brain activity decoding performance of the proposed algorithm, we evaluate it on a synthetic dataset, an electroencephalogram (EEG) dataset, and a functional magnetic resonance imaging (fMRI) dataset. The extensive experimental results confirm that not only the proposed method can achieve higher classification accuracy in a noisy and high-dimensional classification task, but also it would select those more informative features for the decoding scenarios. Integrating the correntropy learning approach with the automatic relevance determination technique will significantly improve the robustness with respect to the noise, leading to more adequate robust sparse brain decoding algorithm. It provides a more powerful approach in the real-world brain activity decoding and the brain-computer interfaces.

Submitted 20 July, 2022; originally announced July 2022.

Journal ref: IEEE Transactions on Biomedical Engineering ( Volume: 70, Issue: 8, August 2023)

arXiv:2207.07340 [pdf, other]

doi 10.1145/3503161.3548303

DuetFace: Collaborative Privacy-Preserving Face Recognition via Channel Splitting in the Frequency Domain

Authors: Yuxi Mi, Yuge Huang, Jiazhen Ji, Hongquan Liu, Xingkun Xu, Shouhong Ding, Shuigeng Zhou

Abstract: With the wide application of face recognition systems, there is rising concern that original face images could be exposed to malicious intents and consequently cause personal privacy breaches. This paper presents DuetFace, a novel privacy-preserving face recognition method that employs collaborative inference in the frequency domain. Starting from a counterintuitive discovery that face recognition can achieve surprisingly good performance with only visually indistinguishable high-frequency channels, this method designs a credible split of frequency channels by their cruciality for visualization and operates the server-side model on non-crucial channels. However, the model degrades in its attention to facial features due to the missing visual information. To compensate, the method introduces a plug-in interactive block to allow attention transfer from the client-side by producing a feature mask. The mask is further refined by deriving and overlaying a facial region of interest (ROI). Extensive experiments on multiple datasets validate the effectiveness of the proposed method in protecting face images from undesired visual inspection, reconstruction, and identification while maintaining high task availability and performance. Results show that the proposed method achieves a comparable recognition accuracy and computation cost to the unprotected ArcFace and outperforms the state-of-the-art privacy-preserving methods. The source code is available at https://github.com/Tencent/TFace/tree/master/recognition/tasks/duetface.

Submitted 15 July, 2022; originally announced July 2022.

Comments: Accepted to ACM Multimedia 2022

arXiv:2207.06961 [pdf, other]

doi 10.3847/1538-4357/ac8168

Infrared Excess of a Large OB Star Sample

Authors: Dingshan Deng, Yang Sun, Tianding Wang, Yuxi Wang, Biwei Jiang

Abstract: The infrared excess from OB stars is commonly considered to be a contribution from ionized stellar wind or circumstellar dust. With the newly published LAMOST-OB catalog and GOSSS data, this work takes a further step toward understanding the infrared excess of OB stars. Based on a forward modeling approach comparing the spectral slope of observational Spectral Energy Distributions (SED) and photospheric models, 1147 stars are found to have infrared excess from 7818 stars with good-quality photometric data. After removing the objects in the sightline of dark clouds, 532 ($\sim7\%$) B-type stars and 118 ($\sim23\%$) O-type stars are identified to be true OB stars with circumstellar infrared excess emission. The ionized stellar wind model and the circumstellar dust model are adopted to explain the infrared excess, and Bayes Factors are computed to quantitatively compare the two. It is shown that the infrared excess can be accounted for by the stellar wind in about 65% of cases, of which 33% are explained by free-free emission and 32% by synchrotron radiation. Another 30% of sources could have, and 4% should have, a dust component or other mechanisms to explain the sharply increasing flux at $λ > 10\,μ$m. The parameters of the dust model indicate a large-scale circumstellar halo structure, which implies that the dust originates from the birthplace of the OB stars. A statistical study suggests that the proportion with infrared excess in OB stars increases with stellar effective temperature and luminosity, and that there is no systematic change of the mechanism for infrared emission with stellar parameters.

Submitted 28 October, 2022; v1 submitted 14 July, 2022; originally announced July 2022.

Comments: Accepted for publication in the ApJ (July 13, 2022). 17 pages, 9 figures. Typos corrected, authors info updated

arXiv:2207.06654 [pdf, other]

Prototypical Contrast Adaptation for Domain Adaptive Semantic Segmentation

Authors: Zhengkai Jiang, Yuxi Li, Ceyuan Yang, Peng Gao, Yabiao Wang, Ying Tai, Chengjie Wang

Abstract: Unsupervised Domain Adaptation (UDA) aims to adapt the model trained on the labeled source domain to an unlabeled target domain. In this paper, we present Prototypical Contrast Adaptation (ProCA), a simple and efficient contrastive learning method for unsupervised domain adaptive semantic segmentation. Previous domain adaptation methods merely consider the alignment of the intra-class representational distributions across various domains, while the inter-class structural relationship is insufficiently explored, resulting in the aligned representations on the target domain might not be as easily discriminated as done on the source domain anymore. Instead, ProCA incorporates inter-class information into class-wise prototypes, and adopts the class-centered distribution alignment for adaptation. By considering the same class prototypes as positives and other class prototypes as negatives to achieve class-centered distribution alignment, ProCA achieves state-of-the-art performance on classical domain adaptation tasks, i.e., GTA5 $\to$ Cityscapes and SYNTHIA $\to$ Cityscapes. Code is available at https://github.com/jiangzhengkai/ProCA.

Submitted 14 July, 2022; originally announced July 2022.

arXiv:2207.06414 [pdf, other]

doi 10.1145/3535508.3545535

Modeling Long-term Dependencies and Short-term Correlations in Patient Journey Data with Temporal Attention Networks for Health Prediction

Authors: Yuxi Liu, Zhenhao Zhang, Antonio Jimeno Yepes, Flora D. Salim

Abstract: Building models for health prediction based on Electronic Health Records (EHR) has become an active research area. EHR patient journey data consists of patient time-ordered clinical events/visits from patients. Most existing studies focus on modeling long-term dependencies between visits, without explicitly taking short-term correlations between consecutive visits into account, where irregular time intervals, incorporated as auxiliary information, are fed into health prediction models to capture latent progressive patterns of patient journeys. We present a novel deep neural network with four modules to take into account the contributions of various variables for health prediction: i) the Stacked Attention module strengthens the deep semantics in clinical events within each patient journey and generates visit embeddings, ii) the Short-Term Temporal Attention module models short-term correlations between consecutive visit embeddings while capturing the impact of time intervals within those visit embeddings, iii) the Long-Term Temporal Attention module models long-term dependencies between visit embeddings while capturing the impact of time intervals within those visit embeddings, iv) and finally, the Coupled Attention module adaptively aggregates the outputs of Short-Term Temporal Attention and Long-Term Temporal Attention modules to make health predictions. Experimental results on MIMIC-III demonstrate superior predictive accuracy of our model compared to existing state-of-the-art methods, as well as the interpretability and robustness of this approach. Furthermore, we found that modeling short-term correlations contributes to local priors generation, leading to improved predictive modeling of patient journeys.

Submitted 15 July, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

Comments: 10 pages, 4 figures, accepted at ACM BCB 2022

arXiv:2206.13114 [pdf, other]

Dynamic-Group-Aware Networks for Multi-Agent Trajectory Prediction with Relational Reasoning

Authors: Chenxin Xu, Yuxi Wei, Bohan Tang, Sheng Yin, Ya Zhang, Siheng Chen

Abstract: Demystifying the interactions among multiple agents from their past trajectories is fundamental to precise and interpretable trajectory prediction. However, previous works mainly consider static, pair-wise interactions with limited relational reasoning. To promote more comprehensive interaction modeling and relational reasoning, we propose DynGroupNet, a dynamic-group-aware network, which can i) model time-varying interactions in highly dynamic scenes; ii) capture both pair-wise and group-wise interactions; and iii) reason both interaction strength and category without direct supervision. Based on DynGroupNet, we further design a prediction system to forecast socially plausible trajectories with dynamic relational reasoning. The proposed prediction system leverages the Gaussian mixture model, multiple sampling and prediction refinement to promote prediction diversity, training stability and trajectory smoothness, respectively. Extensive experiments show that: 1) DynGroupNet can capture time-varying group behaviors, infer time-varying interaction category and interaction strength during trajectory prediction without any relation supervision on physical simulation datasets; 2) DynGroupNet outperforms the state-of-the-art trajectory prediction methods by a significant improvement of 22.6%/28.0%, 26.9%/34.9%, 5.1%/13.0% in ADE/FDE on the NBA, NFL Football and SDD datasets and achieve the state-of-the-art performance on the ETH-UCY dataset.

Submitted 27 June, 2022; originally announced June 2022.

Comments: arXiv admin note: text overlap with arXiv:2204.08770

arXiv:2206.11190 [pdf, other]

Learning Optimal Treatment Strategies for Sepsis Using Offline Reinforcement Learning in Continuous Space

Authors: Zeyu Wang, Huiying Zhao, Peng Ren, Yuxi Zhou, Ming Sheng

Abstract: Sepsis is a leading cause of death in the ICU. It is a disease requiring complex interventions in a short period of time, but its optimal treatment strategy remains uncertain. Evidence suggests that the practices of currently used treatment strategies are problematic and may cause harm to patients. To address this decision problem, we propose a new medical decision model based on historical data to help clinicians recommend the best reference option for real-time treatment. Our model combines offline reinforcement learning and deep reinforcement learning to solve the problem of traditional reinforcement learning in the medical field due to the inability to interact with the environment, while enabling our model to make decisions in a continuous state-action space. We demonstrate that, on average, the treatments recommended by the model are more valuable and reliable than those recommended by clinicians. In a large validation dataset, we find that the patients whose actual doses from clinicians matched the decisions made by the AI have the lowest mortality rates. Our model provides personalized and clinically interpretable treatment decisions for sepsis to improve patient care.

Submitted 14 July, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

arXiv:2206.10845 [pdf, other]

Parallel Pre-trained Transformers (PPT) for Synthetic Data-based Instance Segmentation

Authors: Ming Li, Jie Wu, Jinhang Cai, Jie Qin, Yuxi Ren, Xuefeng Xiao, Min Zheng, Rui Wang, Xin Pan

Abstract: Recently, Synthetic data-based Instance Segmentation has become an exceedingly favorable optimization paradigm since it leverages simulation rendering and physics to generate high-quality image-annotation pairs. In this paper, we propose a Parallel Pre-trained Transformers (PPT) framework to accomplish the synthetic data-based Instance Segmentation task. Specifically, we leverage the off-the-shelf pre-trained vision Transformers to alleviate the gap between natural and synthetic data, which helps to provide good generalization in the downstream synthetic data scene with few samples. Swin-B-based CBNet V2, Swin-L-based CBNet V2 and Swin-L-based Uniformer are employed for parallel feature learning, and the results of these three models are fused by pixel-level Non-maximum Suppression (NMS) algorithm to obtain more robust results. The experimental results reveal that PPT ranks first in the CVPR2022 AVA Accessibility Vision and Autonomy Challenge, with a 65.155% mAP.

Submitted 22 June, 2022; originally announced June 2022.

Comments: The solution of 1st Place in AVA Accessibility Vision and Autonomy Challenge on CVPR 2022 workshop. Website: https://accessibility-cv.github.io/

arXiv:2206.06693 [pdf, other]

ET White Paper: To Find the First Earth 2.0

Authors: Jian Ge, Hui Zhang, Weicheng Zang, Hongping Deng, Shude Mao, Ji-Wei Xie, Hui-Gen Liu, Ji-Lin Zhou, Kevin Willis, Chelsea Huang, Steve B. Howell, Fabo Feng, Jiapeng Zhu, Xinyu Yao, Beibei Liu, Masataka Aizawa, Wei Zhu, Ya-Ping Li, Bo Ma, Quanzhi Ye, Jie Yu, Maosheng Xiang, Cong Yu, Shangfei Liu, Ming Yang, et al. (142 additional authors not shown)

Abstract: We propose to develop a wide-field and ultra-high-precision photometric survey mission, temporarily named "Earth 2.0 (ET)". This mission is designed to measure, for the first time, the occurrence rate and the orbital distributions of Earth-sized planets. ET consists of seven 30cm telescopes, to be launched to the Earth-Sun's L2 point. Six of these are transit telescopes with a field of view of 500 square degrees. Staring in the direction that encompasses the original Kepler field for four continuous years, this monitoring will return tens of thousands of transiting planets, including the elusive Earth twins orbiting solar-type stars. The seventh telescope is a 30cm microlensing telescope that will monitor an area of 4 square degrees toward the galactic bulge. This, combined with simultaneous ground-based KMTNet observations, will measure masses for hundreds of long-period and free-floating planets. Together, the transit and the microlensing telescopes will revolutionize our understandings of terrestrial planets across a large swath of orbital distances and free space. In addition, the survey data will also facilitate studies in the fields of asteroseismology, Galactic archeology, time-domain sciences, and black holes in binaries.

Submitted 14 June, 2022; originally announced June 2022.

Comments: 116 pages,79 figures

arXiv:2206.03873 [pdf, ps, other]

Optimal Gevrey stability of hydrostatic approximation for the Navier-Stokes equations in a thin domain

Authors: Chao Wang, Yuxi Wang

Abstract: In this paper, we study the hydrostatic approximation for the Navier-Stokes system in a thin domain. For convex initial data with Gevrey regularity of optimal index 3/2 in the x variable and Sobolev regularity in the y variable, we justify the limit from the anisotropic Navier-Stokes system to the hydrostatic Navier-Stokes/Prandtl system. Because our method is independent of ε, the same argument also shows that the hydrostatic Navier-Stokes/Prandtl system is well-posed in the optimal Gevrey space. Our results improve on the Gevrey index 9/8 obtained in [14, 34].

Submitted 10 June, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

Comments: 39 pages

arXiv:2206.00806 [pdf, other]

XBound-Former: Toward Cross-scale Boundary Modeling in Transformers

Authors: Jiacheng Wang, Fei Chen, Yuxi Ma, Liansheng Wang, Zhaodong Fei, Jianwei Shuai, Xiangdong Tang, Qichao Zhou, Jing Qin

Abstract: Skin lesion segmentation from dermoscopy images is of great significance in the quantitative analysis of skin cancers, which is yet challenging even for dermatologists due to the inherent issues, i.e., considerable size, shape and color variation, and ambiguous boundaries. Recent vision transformers have shown promising performance in handling the variation through global context modeling. Still, they have not thoroughly solved the problem of ambiguous boundaries as they ignore the complementary usage of the boundary knowledge and global contexts. In this paper, we propose a novel cross-scale boundary-aware transformer, XBound-Former, to simultaneously address the variation and boundary problems of skin lesion segmentation. XBound-Former is a purely attention-based network and catches boundary knowledge via three specially designed learners. We evaluate the model on two skin lesion datasets, ISIC-2016 & PH$^2$ and ISIC-2018, where our model consistently outperforms other convolution- and transformer-based models, especially on the boundary-wise metrics. We extensively verify the generalization ability of polyp lesion segmentation that has similar characteristics, and our model can also yield significant improvement compared to the latest models.

Submitted 1 June, 2022; originally announced June 2022.

Comments: https://github.com/jcwang123/xboundformer

arXiv:2205.13738 [pdf]

Image Reconstruction of Multi Branch Feature Multiplexing Fusion Network with Mixed Multi-layer Attention

Authors: Yuxi Cai, Huicheng Lai

Abstract: Image super-resolution reconstruction achieves better results than traditional methods with the help of the powerful nonlinear representation ability of convolutional neural networks. However, some existing algorithms also have some problems, such as insufficient utilization of phased features, ignoring the importance of early phased feature fusion to improve network performance, and the inability of the network to pay more attention to high-frequency information in the reconstruction process. To solve these problems, we propose a multi-branch feature multiplexing fusion network with mixed multi-layer attention (MBMFN), which realizes the multiple utilization of features and the multistage fusion of different levels of features. To further improve the network's performance, we propose a lightweight enhanced residual channel attention (LERCA), which can not only effectively avoid the loss of channel information but also make the network pay more attention to the key channel information and benefit from it. Finally, the attention mechanism is introduced into the reconstruction process to strengthen the restoration of edge texture and other details. A large number of experiments on several benchmark sets show that, compared with other advanced reconstruction algorithms, our algorithm produces highly competitive objective indicators and restores more image detail texture information.

Submitted 26 May, 2022; originally announced May 2022.

arXiv:2205.13734 [pdf, other]

An efficient tensor regression for high-dimensional data

Authors: Yuefeng Si, Yingying Zhang, Yuxi Cai, Chunling Liu, Guodong Li

Abstract: Most currently used tensor regression models for high-dimensional data are based on Tucker decomposition, which has good properties but loses its efficiency in compressing tensors very quickly as the order of tensors increases, say greater than four or five. However, for the simplest tensor autoregression in handling time series data, its coefficient tensor already has the order of six. This paper revises a newly proposed tensor train (TT) decomposition and then applies it to tensor regression such that a nice statistical interpretation can be obtained. The new tensor regression can well match the data with hierarchical structures, and it even can lead to a better interpretation for the data with factorial structures, which are supposed to be better fitted by models with Tucker decomposition. More importantly, the new tensor regression can be easily applied to the case with higher order tensors since TT decomposition can compress the coefficient tensors much more efficiently. The methodology is also extended to tensor autoregression for time series data, and nonasymptotic properties are derived for the ordinary least squares estimations of both tensor regression and autoregression. A new algorithm is introduced to search for estimators, and its theoretical justification is also discussed. Theoretical and computational properties of the proposed methodology are verified by simulation studies, and the advantages over existing methods are illustrated by two real examples.

Submitted 18 March, 2024; v1 submitted 26 May, 2022; originally announced May 2022.

arXiv:2205.08901 [pdf, other]

doi 10.3847/1538-3881/ac6fea

The 3D Galactocentric velocities of Kepler stars: marginalizing over missing RVs

Authors: Ruth Angus, Adrian M. Price-Whelan, Joel C. Zinn, Megan Bedell, Yuxi Lu, Daniel Foreman-Mackey

Abstract: Precise Gaia measurements of positions, parallaxes, and proper motions provide an opportunity to calculate 3D positions and 2D velocities (i.e. 5D phase-space) of Milky Way stars. Where available, spectroscopic radial velocity (RV) measurements provide full 6D phase-space information, however there are now and will remain many stars without RV measurements. Without an RV it is not possible to directly calculate 3D stellar velocities, however one can infer 3D stellar velocities by marginalizing over the missing RV dimension. In this paper, we infer the 3D velocities of stars in the Kepler field in Cartesian Galactocentric coordinates (vx, vy, vz). We directly calculate velocities for around a quarter of all Kepler targets, using RV measurements available from the Gaia, LAMOST and APOGEE spectroscopic surveys. Using the velocity distributions of these stars as our prior, we infer velocities for the remaining three-quarters of the sample by marginalizing over the RV dimension. The median uncertainties on our inferred vx, vy, and vz velocities are around 4, 18, and 4 km/s, respectively. We provide 3D velocities for a total of 148,590 stars in the Kepler field. These 3D velocities could enable kinematic age-dating, Milky Way stellar population studies, and other scientific studies using the benchmark sample of well-studied Kepler stars. Although the methodology used here is broadly applicable to targets across the sky, our prior is specifically constructed from and for the Kepler field. Care should be taken to use a suitable prior when extending this method to other parts of the Galaxy.

Submitted 18 May, 2022; originally announced May 2022.

Comments: Accepted for publication in AAS Journals

arXiv:2205.01615 [pdf, ps, other]

Global semiconcavity of solutions to first-order Hamilton-Jacobi equations with state constraints

Authors: Yuxi Han

Abstract: We focus on the global semiconcavity of solutions to first-order Hamilton--Jacobi equations with state constraints, especially for the Hamiltonian $H(x, β):=|β|^p-f(x)$ with $p \in (1, 2]$. We first show that the solution is locally semiconcave, and the semiconcavity constant at each point depends on the first time a corresponding minimizing curve emanating from this point hits the boundary. Then, with appropriate conditions on $Df$, we prove that for any such minimizing curve, the time it takes to hit the boundary of the domain is $+\infty$, and as a consequence, the solution is globally semiconcave. Moreover, the condition on $Df$ is essentially optimal with examples in one-dimensional space. The proofs employ the Euler-Lagrange equations and techniques in weak KAM theory.

Submitted 3 May, 2022; originally announced May 2022.

MSC Class: 35B65; 35D40; 35F20; 49L25

arXiv:2205.00340 [pdf, other]

doi 10.1093/mnrasl/slac065

Reliability and limitations of inferring birth radii in the Milky Way disk

Authors: Yuxi Lu, Tobias Buck, Ivan Minchev, Melissa K. Ness

Abstract: Recovering the birth radii of observed stars in the Milky Way is one of the ultimate goals of Galactic Archaeology. One method to infer the birth radius and the evolution of the ISM metallicity assumes a linear relation between the ISM metallicity with radius at any given look-back time. Here we test the reliability of this assumption by using 4 zoom-in cosmological hydrodynamic simulations from the NIHAO-UHD project. We find that one can infer precise birth radii only when the stellar disk starts to form, which for our modeled galaxies happens ~ 10 Gyr ago, in agreement with recent estimates for the Milky Way. At later times the linear correlation between the ISM metallicity and radius increases, as stellar motions become more ordered and the azimuthal variations of the ISM metallicity start to drop. The formation of a central bar and perturbations from mergers can increase this uncertainty in the inner and outer disk, respectively.

Submitted 2 May, 2022; v1 submitted 30 April, 2022; originally announced May 2022.

arXiv:2204.13080 [pdf, ps, other]

Global existence versus blow-up for multi-d hyperbolized compressible Navier-Stokes equations

Authors: Yuxi Hu, Reinhard Racke

Abstract: We consider the non-isentropic compressible Navier-Stokes equations in two or three space dimensions for which the heat conduction of Fourier's law is replaced by Cattaneo's law and the classical Newtonian flow is replaced by a revised Maxwell flow. We show that a physical entropy exists for this new model. For two special cases, we show the global well-posedness of solutions with small initial data and the blow-up of solutions in finite time for a class of large initial data. Moreover, for vanishing relaxation parameters, the solutions (if they exist) are shown to converge to solutions of the classical system.

Submitted 27 April, 2022; originally announced April 2022.

Comments: 26 pages

arXiv:2204.05548 [pdf, other]

doi 10.3847/1538-4365/ac63c1

Dust Extinction Law in Nearby Star-Resolved Galaxies. II. M33 Traced by Supergiants

Authors: Yuxi Wang, Jian Gao, Yi Ren, Bingqiu Chen

Abstract: The dust extinction curves toward individual sight lines in M33 are derived for the first time with a sample of reddened O-type and B-type supergiants obtained from the LGGS. The observed photometric data are obtained from the LGGS, PS1 Survey, UKIRT, PHATTER Survey, GALEX, Swift/UVOT and XMM-SUSS. We combine the intrinsic spectral energy distributions (SEDs) obtained from the ATLAS9 and Tlusty stellar model atmosphere extinguished by the model extinction curves from the silicate-graphite dust model to construct model SEDs. The extinction tracers are distributed along the arms in M33, and the derived extinction curves cover a wide range of shapes ($R_V \approx 2-6$), indicating the complexity of the interstellar environment and the inhomogeneous distribution of interstellar dust in M33. The average extinction curve with $R_V \approx 3.39$ and dust size distribution $dn/da \sim a^{-3.45}{\rm exp}(-a/0.25)$ is similar to that of the MW but with a weaker 2175 Å bump and a slightly steeper rise in the far-UV band. The extinction in the $V$ band of M33 is up to 2 mag, with a median value of $A_V \approx 0.43$ mag. The multiband extinction values from the UV to IR bands are also predicted for M33, which will provide extinction corrections for future works. The method adopted in this work is also applied to other star-resolved galaxies (NGC 6822 and WLM), but only a few extinction curves can be derived because of the limited observations.

Submitted 12 April, 2022; originally announced April 2022.

arXiv:2204.00975 [pdf, other]

Question-Driven Graph Fusion Network For Visual Question Answering

Authors: Yuxi Qian, Yuncong Hu, Ruonan Wang, Fangxiang Feng, Xiaojie Wang

Abstract: Existing Visual Question Answering (VQA) models have explored various visual relationships between objects in the image to answer complex questions, which inevitably introduces irrelevant information brought by inaccurate object detection and text grounding. To address the problem, we propose a Question-Driven Graph Fusion Network (QD-GFN). It first models semantic, spatial, and implicit visual relations in images by three graph attention networks; question information is then utilized to guide the aggregation process of the three graphs. Further, our QD-GFN adopts an object filtering mechanism to remove question-irrelevant objects contained in the image. Experimental results demonstrate that our QD-GFN outperforms the prior state-of-the-art on both VQA 2.0 and VQA-CP v2 datasets. Further analysis shows that both the novel graph aggregation method and object filtering mechanism play a significant role in improving the performance of the model.

Submitted 2 April, 2022; originally announced April 2022.

Comments: Accepted by ICME 2022

arXiv:2204.00879 [pdf, other]

Co-VQA : Answering by Interactive Sub Question Sequence

Authors: Ruonan Wang, Yuxi Qian, Fangxiang Feng, Xiaojie Wang, Huixing Jiang

Abstract: Most existing approaches to Visual Question Answering (VQA) answer questions directly. However, people usually decompose a complex question into a sequence of simple sub questions and finally obtain the answer to the original question after answering the sub question sequence (SQS). By simulating the process, this paper proposes a conversation-based VQA (Co-VQA) framework, which consists of three components: Questioner, Oracle, and Answerer. Questioner raises the sub questions using an extending HRED model, and Oracle answers them one-by-one. An Adaptive Chain Visual Reasoning Model (ACVRM) for Answerer is also proposed, where the question-answer pair is used to update the visual representation sequentially. To perform supervised learning for each model, we introduce a well-designed method to build a SQS for each question on VQA 2.0 and VQA-CP v2 datasets. Experimental results show that our method achieves state-of-the-art on VQA-CP v2. Further analyses show that SQSs help build direct semantic connections between questions and images, provide question-adaptive variable-length reasoning chains, and offer explicit interpretability as well as error traceability.

Submitted 2 April, 2022; originally announced April 2022.

Comments: Accepted by Findings of ACL 2022

arXiv:2203.13412 [pdf, other]

Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

Authors: Zengjie Song, Yuxi Wang, Junsong Fan, Tieniu Tan, Zhaoxiang Zhang

Abstract: Sound source localization in visual scenes aims to localize objects emitting the sound in a given image. Recent works showing impressive localization performance typically rely on the contrastive learning framework. However, the random sampling of negatives, as commonly adopted in these methods, can result in misalignment between audio and visual features and thus induce ambiguity in localization. In this paper, instead of following previous literature, we propose Self-Supervised Predictive Learning (SSPL), a negative-free method for sound localization via explicit positive mining. Specifically, we first devise a three-stream network to elegantly associate sound source with two augmented views of one corresponding video frame, leading to semantically coherent similarities between audio and visual features. Second, we introduce a novel predictive coding module for audio-visual feature alignment. Such a module assists SSPL to focus on target objects in a progressive manner and effectively lowers the positive-pair learning difficulty. Experiments show surprising results that SSPL outperforms the state-of-the-art approach on two standard sound localization benchmarks. In particular, SSPL achieves significant improvements of 8.6% cIoU and 3.4% AUC on SoundNet-Flickr compared to the previous best. Code is available at: https://github.com/zjsong/SSPL.

Submitted 24 March, 2022; originally announced March 2022.

Comments: Camera-ready, CVPR 2022. Code: https://github.com/zjsong/SSPL

arXiv:2203.08920 [pdf, other]

doi 10.3847/1538-4357/ac6dd3

Further Evidence of Modified Spin-down in Sun-like Stars: Pileups in the Temperature-Period Distribution

Authors: Trevor J. David, Ruth Angus, Jason L. Curtis, Jennifer L. van Saders, Isabel L. Colman, Gabriella Contardo, Yuxi Lu, Joel C. Zinn

Abstract: We combine stellar surface rotation periods determined from NASA's Kepler mission with spectroscopic temperatures to demonstrate the existence of pileups at the long-period and short-period edges of the temperature-period distribution for main-sequence stars with temperatures exceeding $\sim 5500$K. The long-period pileup is well-described by a curve of constant Rossby number, with a critical value of $\mathrm{Ro_{crit}} \lesssim 2$. The long-period pileup was predicted by van Saders et al. (2019) as a consequence of weakened magnetic braking, in which wind-driven angular momentum losses cease once stars reach a critical Rossby number. Stars in the long-period pileup are found to have a wide range of ages ($\sim 2-6$Gyr), meaning that, along the pileup, rotation period is strongly predictive of a star's surface temperature but weakly predictive of its age. The short-period pileup, which is also well-described by a curve of constant Rossby number, is not a prediction of the weakened magnetic braking hypothesis but may instead be related to a phase of slowed surface spin-down due to core-envelope coupling. The same mechanism was proposed by Curtis et al. (2020) to explain the overlapping rotation sequences of low-mass members of differently aged open clusters. The relative dearth of stars with intermediate rotation periods between the short- and long-period pileups is also well-described by a curve of constant Rossby number, which aligns with the period gap initially discovered by McQuillan et al. (2013a) in M-type stars. These observations provide further support for the hypothesis that the period gap is due to stellar astrophysics, rather than a non-uniform star-formation history in the Kepler field.

Submitted 10 May, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

Comments: Accepted to ApJ. 29 pages, 21 figures. The data and code required to reproduce this work is available at http://github.com/trevordavid/rossby-ridge

arXiv:2203.05738 [pdf, other]

Learning Distinctive Margin toward Active Domain Adaptation

Authors: Ming Xie, Yuxi Li, Yabiao Wang, Zekun Luo, Zhenye Gan, Zhongyi Sun, Mingmin Chi, Chengjie Wang, Pei Wang

Abstract: Despite plenty of efforts focusing on improving the domain adaptation ability (DA) under unsupervised or few-shot semi-supervised settings, recently the solution of active learning started to attract more attention due to its suitability in transferring models in a more practical way with limited annotation resources on target data. Nevertheless, most active learning methods are not inherently designed to handle the domain gap between data distributions. On the other hand, some active domain adaptation methods (ADA) usually require complicated query functions, which are vulnerable to overfitting. In this work, we propose a concise but effective ADA method called Select-by-Distinctive-Margin (SDM), which consists of a maximum margin loss and a margin sampling algorithm for data selection. We provide theoretical analysis to show that SDM works like a Support Vector Machine, storing hard examples around decision boundaries and exploiting them to find informative and transferable data. In addition, we propose two variants of our method: one is designed to adaptively adjust the gradient from margin loss, and the other boosts the selectivity of margin sampling by taking the gradient direction into account. We benchmark SDM with standard active learning setting, demonstrating our algorithm achieves competitive results with good data scalability. Code is available at https://github.com/TencentYoutuResearch/ActiveLearning-SDM

Submitted 2 April, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

Comments: To appear in CVPR 2022

arXiv:2202.11296 [pdf, other]

Reinforcement Learning in Practice: Opportunities and Challenges

Authors: Yuxi Li

Abstract: This article is a gentle discussion about the field of reinforcement learning in practice, about opportunities and challenges, touching a broad range of topics, with perspectives and without technical details. The article is based on both historical and recent research papers, surveys, tutorials, talks, blogs, books, (panel) discussions, and workshops/conferences. Various groups of readers, like researchers, engineers, students, managers, investors, officers, and people wanting to know more about the field, may find the article interesting. In this article, we first give a brief introduction to reinforcement learning (RL), and its relationship with deep learning, machine learning and AI. Then we discuss opportunities of RL, in particular, products and services, games, bandits, recommender systems, robotics, transportation, finance and economics, healthcare, education, combinatorial optimization, computer systems, and science and engineering. Then we discuss challenges, in particular, 1) foundation, 2) representation, 3) reward, 4) exploration, 5) model, simulation, planning, and benchmarks, 6) off-policy/offline learning, 7) learning to learn a.k.a. meta-learning, 8) explainability and interpretability, 9) constraints, 10) software development and deployment, 11) business perspectives, and 12) more challenges. We conclude with a discussion, attempting to answer: "Why has RL not been widely adopted in practice yet?" and "When is RL helpful?".

Submitted 22 April, 2022; v1 submitted 22 February, 2022; originally announced February 2022.

arXiv:2202.04311 [pdf, other]

Identifying Backdoor Attacks in Federated Learning via Anomaly Detection

Authors: Yuxi Mi, Yiheng Sun, Jihong Guan, Shuigeng Zhou

Abstract: Federated learning has seen increased adoption in recent years in response to the growing regulatory demand for data privacy. However, the opaque local training process of federated learning also sparks rising concerns about model faithfulness. For instance, studies have revealed that federated learning is vulnerable to backdoor attacks, whereby a compromised participant can stealthily modify the model's behavior in the presence of backdoor triggers. This paper proposes an effective defense against the attack by examining shared model updates. We begin with the observation that the embedding of backdoors influences the participants' local model weights in terms of the magnitude and orientation of their model gradients, which can manifest as distinguishable disparities. We enable a robust identification of backdoors by studying the statistical distribution of the models' subsets of gradients. Concretely, we first segment the model gradients into fragment vectors that represent small portions of model parameters. We then employ anomaly detection to locate the distributionally skewed fragments and prune the participants with the most outliers. We embody the findings in a novel defense method, ARIBA. We demonstrate through extensive analyses that our proposed methods effectively mitigate state-of-the-art backdoor attacks with minimal impact on task utility.

Submitted 23 August, 2023; v1 submitted 9 February, 2022; originally announced February 2022.

Comments: APWeb-WAIM 2023

arXiv:2201.12701 [pdf, other]

DearFSAC: An Approach to Optimizing Unreliable Federated Learning via Deep Reinforcement Learning

Authors: Chenghao Huang, Weilong Chen, Yuxi Chen, Shunji Yang, Yanru Zhang

Abstract: In federated learning (FL), model aggregation has been widely adopted for data privacy. In recent years, assigning different weights to local models has been used to alleviate the FL performance degradation caused by differences between local datasets. However, when various defects make the FL process unreliable, most existing FL approaches expose weak robustness. In this paper, we propose the DEfect-AwaRe federated soft actor-critic (DearFSAC) to dynamically assign weights to local models to improve the robustness of FL. The deep reinforcement learning algorithm soft actor-critic is adopted for near-optimal performance and stable convergence. Besides, an auto-encoder is trained to output low-dimensional embedding vectors that are further utilized to evaluate model quality. In the experiments, DearFSAC outperforms three existing approaches on four datasets for both independent and identically distributed (IID) and non-IID settings under defective scenarios.

Submitted 29 January, 2022; originally announced January 2022.

arXiv:2201.12109 [pdf, other]

Protum: A New Method For Prompt Tuning Based on "[MASK]"

Authors: Pan He, Yuxi Chen, Yan Wang, Yanru Zhang

Abstract: Recently, prompt tuning \cite{lester2021power} has gradually become a new paradigm for NLP, which only depends on the representation of the words by freezing the parameters of pre-trained language models (PLMs) to obtain remarkable performance on downstream tasks. It maintains the consistency of Masked Language Model (MLM) \cite{devlin2018bert} task in the process of pre-training, and avoids some issues that may happen during fine-tuning. Naturally, we consider that the "[MASK]" tokens carry more useful information than other tokens because the model combines with context to predict the masked tokens. Among the current prompt tuning methods, there is a serious problem of random composition of the answer tokens in prediction when they predict multiple words, so that they have to map tokens to labels with the help of a verbalizer. In response to the above issue, we propose a new \textbf{Pro}mpt \textbf{Tu}ning based on "[\textbf{M}ASK]" (\textbf{Protum}) method in this paper, which constructs a classification task through the information carried by the hidden layer of "[MASK]" tokens and then predicts the labels directly rather than the answer tokens. At the same time, we explore how different hidden layers under "[MASK]" impact our classification model on many different data sets. Finally, we find that our \textbf{Protum} can achieve much better performance than fine-tuning after continuous pre-training with less time consumption. Our model facilitates the practical application of large models in NLP.

Submitted 28 January, 2022; originally announced January 2022.

Comments: under review in ICML

arXiv:2201.05265 [pdf, other]

Photoinduced enhancement of superconductivity in the plaquette Hubbard model

Authors: Yuxi Zhang, Rubem Mondaini, Richard T. Scalettar

Abstract: Real-time dynamics techniques have proven increasingly useful in understanding strongly correlated systems both theoretically and experimentally. By employing unbiased time-resolved exact diagonalization, we study pump dynamics in the two-dimensional plaquette Hubbard model, where distinct hopping integrals $t_h$ and $t_h^\prime$ are present within and between plaquettes. In the intermediate coupling regime, a significant enhancement of $d$-wave superconductivity is observed and compared with that obtained by simple examination of expectation values with the eigenstates of the Hamiltonian. Our work provides further understanding of superconductivity in the Hubbard model, extends the description of the pairing amplitude to the frequency-anisotropy plane, and offers a promising approach for experimentally engineering emergent out-of-equilibrium states.

Submitted 13 January, 2022; originally announced January 2022.

Comments: 8 pages, 7 figures

arXiv:2201.03749 [pdf, other]

doi 10.1145/3510003.3510086

Utilizing Parallelism in Smart Contracts on Decentralized Blockchains by Taming Application-Inherent Conflicts

Authors: Péter Garamvölgyi, Yuxi Liu, Dong Zhou, Fan Long, Ming Wu

Abstract: Traditional public blockchain systems typically had very limited transaction throughput because of the bottleneck of the consensus protocol itself. With recent advances in consensus technology, the performance limit has been greatly lifted, typically to thousands of transactions per second. With this, transaction execution has become a new performance bottleneck. Exploiting parallelism in transaction execution is a clear and direct way to address this and to further increase transaction throughput. Although some recent literature introduced concurrency control mechanisms to execute smart contract transactions in parallel, the reported speedup that they can achieve is far from ideal. The main reason is that the proposed parallel execution mechanisms cannot effectively deal with the conflicts inherent in many blockchain applications. In this work, we thoroughly study the historical transaction execution traces in Ethereum. We observe that application-inherent conflicts are the major factors that limit the exploitable parallelism during execution. We propose to use partitioned counters and special commutative instructions to break up the application conflict chains in order to maximize the potential speedup. When we evaluated the maximum parallel speedup achievable, these techniques doubled this limit to an 18x overall speedup compared to serial execution, thus approaching the optimum. We also propose OCC-DA, an optimistic concurrency control scheduler with deterministic aborts, which makes it possible to use OCC scheduling in public blockchain settings.

Submitted 10 February, 2022; v1 submitted 10 January, 2022; originally announced January 2022.

arXiv:2112.10585 [pdf]

Dynamic Stabilization of Water Bottles

Authors: Yanwen Gu, Yunzhou Bai, Yuxi Xin, Lintao Xiao, Sihui Wang, Hanchao Sun

Abstract: The motion of a water-filled bottle is studied when it is thrown into the air and falls back to the floor, including the possibilities of an upright landing or rolling down before it finally reaches a static state. When dealing with the process after throwing a water bottle, free falling (the bottle falls without initial angular velocity) and flipping (the bottle falls with initial angular velocity) are considered. In theory, the physical principles behind the motion are analyzed. In addition, the impacts of initial angle, linear velocity, angular velocity and the water amount on the uprightness of the bottle are discussed. In the bottle-throwing experiment, we changed the water amount, angular velocity, and releasing height, and examined the impacts of these factors. The results suggest that a certain amount of water and spinning result in a higher possibility of upright landing. When dealing with the rolling bottle, we theoretically build the bottle-and-bead model to describe the coupled motion of the water and the bottle. Analytical solutions are obtained for small amplitudes, and numerical solutions can be computed in the general situation. In the rolling-bottle experiment, we first verified the theoretical model and then addressed the impact of initial conditions and water amount on the motion patterns of the bottle.

Submitted 17 December, 2021; originally announced December 2021.

Comments: 48 pages

arXiv:2112.05740 [pdf, other]

doi 10.1103/PhysRevB.105.195429

Effect of Emitters on Quantum State Transfer in Coupled Cavity Arrays

Authors: Eli Baum, Amelia Broman, Trevor Clarke, Natanael C. Costa, Jack Mucciaccio, Alexander Yue, Yuxi Zhang, Victoria Norman, Jesse Patton, Marina Radulaski, Richard T. Scalettar

Abstract: Over the last decade, conditions for perfect state transfer in quantum spin chains have been discovered, and their experimental realizations addressed. In this paper, we consider an extension of such studies to quantum state transfer in a coupled cavity array including the effects of atoms in the cavities which can absorb and emit photons as they propagate down the array. Our model is equivalent to previously examined spin chains in the one-excitation sector and in the absence of emitters. We introduce a Monte Carlo approach to the inverse eigenvalue problem which allows the determination of the inter-cavity and cavity-emitter couplings resulting in near-perfect quantum state transfer fidelity, and examine the time dependent polariton wave function through exact diagonalization of the resulting Tavis-Cummings-Hubbard Hamiltonian. The effect of inhomogeneous emitter locations is also evaluated.

Submitted 8 January, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

Comments: 12 pages, 11 figures

arXiv:2112.05238 [pdf, other]

doi 10.1093/mnras/stac780

Turning Points in the Age-Metallicity Relations -- Created by Late Satellite Infall and Enhanced by Radial Migration

Authors: Yuxi Lu, Melissa K. Ness, Tobias Buck, Christopher Carr

Abstract: The present-day Age-Metallicity Relation (AMR) is a record of the star formation history of the Galaxy, as this traces the chemical enrichment of the gas over time. We use a zoomed-in cosmological simulation that reproduces key signatures of the Milky Way (MW), g2.79e12 from the NIHAO-UHD project, to examine how stellar migration and satellite infall shape the AMR across the disk. We find that in the simulation, as in the MW, the AMR in small spatial regions (R, z) shows turning points that connect changes in the direction of the relations. The turning points in the AMR in the simulation are a signature of late satellite infall. This satellite infall has a mass ratio similar to that of the Sagittarius dwarf to the MW (~ 0.001). Stars in the apex of the turning points are young and have barely migrated. The late satellite infall creates the turning points via depositing metal-poor gas in the disk, triggering star formation of stars in a narrow metallicity range compared to the overall AMR. The main effect of radial migration on the AMR turning points is to widen the metallicity range of the apex. This can happen when radial migration brings stars born from the infallen gas in other spatial bins, with slightly different metallicities, into the spatial bin of interest. These results indicate that it is possible that the passage of the Sagittarius dwarf galaxy played a role in creating the turning points that we see in the AMR in the Milky Way.

Submitted 9 December, 2021; originally announced December 2021.

arXiv:2111.09523 [pdf, other]

doi 10.3847/1538-4365/ac3bc6

Dust Extinction Law in Nearby Star-Resolved Galaxies. I. M31 Traced by Supergiants

Authors: Yuxi Wang, Jian Gao, Yi Ren

Abstract: The dust extinction laws and dust properties in M31 are explored with a sample of reddened O-type and B-type supergiants obtained from the LGGS. The observed spectral energy distributions (SEDs) for each tracer are constructed with multiband photometry from the LGGS, PS1 Survey, UKIRT, PHAT Survey, Swift/UVOT and XMM-SUSS. We model the SED for each tracer in combination with the intrinsic spectrum obtained from the stellar model atmosphere extinguished by the model extinction curves. Instead of mathematically parameterizing the extinction functions, the model extinction curves in this work are directly derived from the silicate-graphite dust model with a dust size distribution of $dn/da \sim a^{-\alpha}\exp(-a/0.25),~0.005 < a < 5~\mu{\rm m}$. The extinction tracers are distributed along the arms in M31, with the derived MW-type extinction curves covering a wide range of $R_V$ ($\approx 2 - 6$), indicating the complexity of the interstellar environment and the inhomogeneous distribution of interstellar dust in M31. The average extinction curve with $R_V \approx 3.51$ and dust size distribution $dn/da \sim a^{-3.35}\exp(-a/0.25)$ is similar to those of the MW but rises slightly less steeply in the far-UV bands, implying that the overall interstellar environment in M31 resembles the diffuse region in the MW. The extinction in the $V$ band of M31 is up to 3 mag, with a median value of $A_V \approx 1$ mag. The multiband extinction values from the UV to IR bands are also predicted for M31, which will provide a general extinction correction for future works.

Submitted 17 November, 2021; originally announced November 2021.

Comments: 46 pages, 11 figures and 6 tables

arXiv:2111.01323 [pdf, other]

Exploring the Semi-supervised Video Object Segmentation Problem from a Cyclic Perspective

Authors: Yuxi Li, Ning Xu, Wenjie Yang, John See, Weiyao Lin

Abstract: Modern video object segmentation (VOS) algorithms have achieved remarkably high performance in a sequential processing order, while most currently prevailing pipelines still show obvious inadequacies such as accumulative error, unknown robustness, or a lack of proper interpretation tools. In this paper, we place the semi-supervised video object segmentation problem into a cyclic workflow and find that the defects above can be collectively addressed via the inherent cyclic property of semi-supervised VOS systems. Firstly, a cyclic mechanism incorporated into the standard sequential flow can produce more consistent representations for pixel-wise correspondence. Relying on the accurate reference mask in the starting frame, we show that the error propagation problem can be mitigated. Next, a simple gradient correction module, which naturally extends the offline cyclic pipeline to an online manner, can highlight the high-frequency and detailed parts of the results to further improve segmentation quality while keeping the computation cost feasible. Meanwhile, such correction can protect the network from severe performance degradation resulting from interference signals. Finally, we develop the cycle effective receptive field (cycle-ERF) based on the gradient correction process to provide a new perspective for analyzing object-specific regions of interest. We conduct comprehensive comparisons and detailed analysis on the challenging benchmarks DAVIS16, DAVIS17 and Youtube-VOS, demonstrating that the cyclic mechanism helps to enhance segmentation quality, improve the robustness of VOS systems, and further provide qualitative comparison and interpretation of how different VOS algorithms work. The code of this project can be found at https://github.com/lyxok1/STM-Training

Submitted 25 July, 2022; v1 submitted 1 November, 2021; originally announced November 2021.

Comments: modified version to appear in IJCV. arXiv admin note: substantial text overlap with arXiv:2010.12176

Showing 151–200 of 304 results for author: Yuxi