Search | arXiv e-print repository

HGNET: A Hierarchical Feature Guided Network for Occupancy Flow Field Prediction

Abstract: Predicting the motion of multiple traffic participants has always been one of the most challenging tasks in autonomous driving. The recently proposed occupancy flow field prediction method has shown to be a more effective and scalable representation compared to general trajectory prediction methods. However, in complex multi-agent traffic scenarios, it remains difficult to model the interactions among various factors and the dependencies among prediction outputs at different time steps. In view of this, we propose a transformer-based hierarchical feature guided network (HGNET), which can efficiently extract features of agents and map information from visual and vectorized inputs, modeling multimodal interaction relationships. Second, we design the Feature-Guided Attention (FGAT) module to leverage the potential guiding effects between different prediction targets, thereby improving prediction accuracy. Additionally, to enhance the temporal consistency and causal relationships of the predictions, we propose a Time Series Memory framework to learn the conditional distribution models of the prediction outputs at future time steps from multivariate time series. The results demonstrate that our model exhibits competitive performance, which ranks 3rd in the 2024 Waymo Occupancy and Flow Prediction Challenge.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2406.14841 [pdf, other]

TabularMark: Watermarking Tabular Datasets for Machine Learning

Authors: Yihao Zheng, Haocheng Xia, Junyuan Pang, Jinfei Liu, Kui Ren, Lingyang Chu, Yang Cao, Li Xiong

Abstract: Watermarking is broadly utilized to protect ownership of shared data while preserving data utility. However, existing watermarking methods for tabular datasets fall short on the desired properties (detectability, non-intrusiveness, and robustness) and only preserve data utility from the perspective of data statistics, ignoring the performance of downstream ML models trained on the datasets. Can we watermark tabular datasets without significantly compromising their utility for training ML models while preventing attackers from training usable ML models on attacked datasets? In this paper, we propose a hypothesis testing-based watermarking scheme, TabularMark. Data noise partitioning is utilized for data perturbation during embedding, which is adaptable for numerical and categorical attributes while preserving the data utility. For detection, a custom-threshold one proportion z-test is employed, which can reliably determine the presence of the watermark. Experiments on real-world and synthetic datasets demonstrate the superiority of TabularMark in detectability, non-intrusiveness, and robustness.

Submitted 20 June, 2024; originally announced June 2024.
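
Note: the TabularMark abstract above mentions a one proportion z-test for detection. The sketch below shows only the textbook form of that test; it is not the paper's detection procedure. The function name, the sample counts, and the threshold of 1.645 are illustrative assumptions, and how the paper chooses its custom threshold is not stated in the abstract.

```python
import math


def one_proportion_z_test(hits, n, p0):
    """Textbook one-proportion z-test for H0: true proportion == p0.

    Returns the z statistic and the one-sided (upper-tail) p-value.
    """
    p_hat = hits / n                          # observed proportion
    se = math.sqrt(p0 * (1.0 - p0) / n)       # standard error under H0
    z = (p_hat - p0) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z > z) for standard normal Z
    return z, p_value


# Hypothetical numbers: 60 of 100 inspected cells fall in the "marked" region,
# whereas 50% would be expected by chance.
z, p_value = one_proportion_z_test(hits=60, n=100, p0=0.5)
detected = z > 1.645  # assumed threshold (one-sided 5% level), not the paper's
```

A larger z means the observed proportion is further above what chance would give, so raising the threshold trades missed detections for fewer false alarms.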

arXiv:2406.02744 [pdf, other]

DPDR: Gradient Decomposition and Reconstruction for Differentially Private Deep Learning

Authors: Yixuan Liu, Li Xiong, Yuhan Liu, Yujie Gu, Ruixuan Liu, Hong Chen

Abstract: Differentially Private Stochastic Gradients Descent (DP-SGD) is a prominent paradigm for preserving privacy in deep learning. It ensures privacy by perturbing gradients with random noise calibrated to their entire norm at each training step. However, this perturbation suffers from a sub-optimal performance: it repeatedly wastes privacy budget on the general converging direction shared among gradients from different batches, which we refer as common knowledge, yet yields little information gain. Motivated by this, we propose a differentially private training framework with early gradient decomposition and reconstruction (DPDR), which enables more efficient use of the privacy budget. In essence, it boosts model utility by focusing on incremental information protection and recycling the privatized common knowledge learned from previous gradients at early training steps. Concretely, DPDR incorporates three steps. First, it disentangles common knowledge and incremental information in current gradients by decomposing them based on previous noisy gradients. Second, most privacy budget is spent on protecting incremental information for higher information gain. Third, the model is updated with the gradient reconstructed from recycled common knowledge and noisy incremental information. Theoretical analysis and extensive experiments show that DPDR outperforms state-of-the-art baselines on both convergence rate and accuracy.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 14 pages
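
Note: the DPDR abstract above builds on DP-SGD, which perturbs gradients with noise calibrated to their norm. The sketch below shows the standard DP-SGD gradient step (per-sample clipping followed by Gaussian noise) in plain Python. It is the baseline only, not DPDR's decomposition and reconstruction; the function and parameter names are assumptions made for illustration.

```python
import math
import random


def dp_sgd_gradient(per_sample_grads, clip_norm, noise_multiplier, rng=None):
    """Standard DP-SGD aggregation: clip each sample's gradient to
    clip_norm in L2, sum, add Gaussian noise, then average."""
    rng = rng or random.Random()
    dim = len(per_sample_grads[0])
    total = [0.0] * dim
    for grad in per_sample_grads:
        norm = math.sqrt(sum(x * x for x in grad))
        # Scale down only gradients whose norm exceeds the clipping bound.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, x in enumerate(grad):
            total[i] += x * scale
    # Noise standard deviation is tied to the clipping bound (the sensitivity).
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_sample_grads) for v in noisy]


# Hypothetical batch of two 2-dimensional per-sample gradients.
grads = [[3.0, 4.0], [0.6, 0.8]]
clipped_mean = dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=0.0)
private_mean = dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=1.1,
                               rng=random.Random(0))
```

Setting `noise_multiplier=0.0` isolates the clipping step and gives no privacy; the second call shows the noisy update that would actually be used.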

arXiv:2406.02593 [pdf]

Construction and Observation of Flexibly Controllable High-Dimensional Non-Hermitian Skin Effects

Authors: Qicheng Zhang, Yufei Leng, Liwei Xiong, Yuzeng Li, Kun Zhang, Liangjun Qi, Chunyin Qiu

Abstract: Non-Hermitian skin effect (NHSE) is one of the most fundamental phenomena in non-Hermitian physics. Although it is established that one-dimensional NHSE originates from the nontrivial spectral winding topology, the topological origin behind the higher-dimensional NHSE remains unclear so far. This poses a substantial challenge in constructing and manipulating high-dimensional NHSEs. Here, an intuitive bottom-to-top scheme to construct high-dimensional NHSEs is proposed, through assembling multiple independent one-dimensional NHSEs. Not only the elusive high-dimensional NHSEs can be effectively predicted from the well-defined one-dimensional spectral winding topologies, but also the high-dimensional generalized Brillouin zones can be directly synthesized from the one-dimensional counterparts. As examples, two two-dimensional nonreciprocal acoustic metamaterials are experimentally implemented to demonstrate highly controllable multi-polar NHSEs and hybrid skin-topological effects, where the sound fields can be frequency-selectively localized at any desired corners and boundaries. These results offer a practicable strategy for engineering high-dimensional NHSEs, which could boost advanced applications such as selective filters and directional amplifiers.

Submitted 31 May, 2024; originally announced June 2024.

Comments: Published in Advanced Materials

Journal ref: Advanced Materials, 2403108 (2024)

arXiv:2406.01457 [pdf, other]

Differentially Private Tabular Data Synthesis using Large Language Models

Authors: Toan V. Tran, Li Xiong

Abstract: Synthetic tabular data generation with differential privacy is a crucial problem to enable data sharing with formal privacy. Despite a rich history of methodological research and development, developing differentially private tabular data generators that can provide realistic synthetic datasets remains challenging. This paper introduces DP-LLMTGen -- a novel framework for differentially private tabular data synthesis that leverages pretrained large language models (LLMs). DP-LLMTGen models sensitive datasets using a two-stage fine-tuning procedure with a novel loss function specifically designed for tabular data. Subsequently, it generates synthetic data through sampling the fine-tuned LLMs. Our empirical evaluation demonstrates that DP-LLMTGen outperforms a variety of existing mechanisms across multiple datasets and privacy settings. Additionally, we conduct an ablation study and several experimental analyses to deepen our understanding of LLMs in addressing this important problem. Finally, we highlight the controllable generation ability of DP-LLMTGen through a fairness-constrained generation setting.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2405.19669 [pdf, other]

Texture-guided Coding for Deep Features

Authors: Lei Xiong, Xin Luo, Zihao Wang, Chaofan He, Shuyuan Zhu, Bing Zeng

Abstract: With the rapid development of machine vision technology in recent years, many researchers have begun to focus on feature compression that is better suited for machine vision tasks. The target of feature compression is deep features, which arise from convolution in the middle layer of a pre-trained convolutional neural network. However, due to the large volume of data and high level of abstraction of deep features, their application is primarily limited to machine-centric scenarios, which poses significant constraints in situations requiring human-computer interaction. This paper investigates features and textures and proposes a texture-guided feature compression strategy based on their characteristics. Specifically, the strategy comprises feature layers and texture layers. The feature layers serve the machine, including a feature selection module and a feature reconstruction network. With the assistance of texture images, they selectively compress and transmit channels relevant to visual tasks, reducing feature data while providing high-quality features for the machine. The texture layers primarily serve humans and consist of an image reconstruction network. This image reconstruction network leverages features and texture images to reconstruct preview images for humans. Our method fully exploits the characteristics of texture and features. It eliminates feature redundancy, reconstructs high-quality preview images for humans, and supports decision-making. The experimental results demonstrate excellent performance when employing our proposed method to compress the deep features.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.08043 [pdf, other]

HRNet: Differentially Private Hierarchical and Multi-Resolution Network for Human Mobility Data Synthesization

Authors: Shun Takagi, Li Xiong, Fumiyuki Kato, Yang Cao, Masatoshi Yoshikawa

Abstract: Human mobility data offers valuable insights for many applications such as urban planning and pandemic response, but its use also raises privacy concerns. In this paper, we introduce the Hierarchical and Multi-Resolution Network (HRNet), a novel deep generative model specifically designed to synthesize realistic human mobility data while guaranteeing differential privacy. We first identify the key difficulties inherent in learning human mobility data under differential privacy. In response to these challenges, HRNet integrates three components: a hierarchical location encoding mechanism, multi-task learning across multiple resolutions, and private pre-training. These elements collectively enhance the model's ability under the constraints of differential privacy. Through extensive comparative experiments utilizing a real-world dataset, HRNet demonstrates a marked improvement over existing methods in balancing the utility-privacy trade-off.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.03323 [pdf, other]

Distance between two manifolds, topological phase transitions and scaling laws

Authors: ZhaoXiang Fang, Ming Gong, Guang-Can Guo, Yongxu Fu, Long Xiong

Abstract: Topological phases are generally characterized by topological invariants denoted by integer numbers. However, different topological systems often require different topological invariants to measure, such as geometric phases, topological orders, winding numbers, etc. Moreover, geometric phases and its associated definitions usually fail at critical points. Therefore, it's challenging to predict what would occur during the transformation between two different topological phases. To address these issues, in this work, we propose a general definition based on fidelity and trace distance from quantum information theory: manifold distance. This definition does not rely on the berry connection of the manifolds but rather on the information of the two manifolds - their ground state wave functions. Thus, it can measure different topological systems (including traditional band topology models, non-Hermitian systems, and topological order models, etc.) and exhibit some universal laws during the transformation between two topological phases. Our research demonstrates that when the properties of two manifolds are identical, the distance and associated higher-order derivatives between them can smoothly transition to each other. However, for two different topological manifolds, the higher-order derivatives exhibit various divergent behaviors near the critical points. For subsequent studies, we expect the method to be generalized to real-space or non-lattice models, in order to facilitate the study of a wider range of physical platforms such as open systems and many-body localization.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 7+36 pages, 4+31 figures

arXiv:2405.00438 [pdf, other]

MetaRM: Shifted Distributions Alignment via Meta-Learning

Authors: Shihan Dou, Yan Liu, Enyu Zhou, Tianlong Li, Haoxiang Jia, Limao Xiong, Xin Zhao, Junjie Ye, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

Abstract: The success of Reinforcement Learning from Human Feedback (RLHF) in language model alignment is critically dependent on the capability of the reward model (RM). However, as the training process progresses, the output distribution of the policy model shifts, leading to the RM's reduced ability to distinguish between responses. This issue is further compounded when the RM, trained on a specific data distribution, struggles to generalize to examples outside of that distribution. These two issues can be united as a challenge posed by the shifted distribution of the environment. To surmount this challenge, we introduce MetaRM, a method leveraging meta-learning to align the RM with the shifted environment distribution. MetaRM is designed to train the RM by minimizing data loss, particularly for data that can improve the differentiation ability to examples of the shifted target distribution. Extensive experiments demonstrate that MetaRM significantly improves the RM's distinguishing ability in iterative RLHF optimization, and also provides the capacity to identify subtle differences in out-of-distribution samples.

Submitted 1 May, 2024; originally announced May 2024.

Comments: 11 pages, 6 figures. arXiv admin note: text overlap with arXiv:2401.06080

arXiv:2404.08251 [pdf, other]

Interplay Between Single-Photon Ionization and the Auger Process in Argon Ion Formation

Authors: Linhao Xiong

Abstract: We explore the interactions between Argon and extreme ultraviolet (XUV) laser pulses across photon energies of 200 eV, 260 eV, and 315 eV, scrutinizing the influence of photon energy on Argon ion yields and unraveling the associated ionization pathways. Utilizing pulse durations of 10 fs and 30 fs, we spotlight a notable increase in Argon's ionization propensity with escalating laser intensities, especially within the $5 \times 10^{14}\, \text{W/cm}^2$ to $1.2 \times 10^{16}\, \text{W/cm}^2$ range. While direct ionization predominated at 200 eV, the 260 eV and 315 eV spectra revealed complex interactions, prominently featuring the Auger process. A consistent overshadowing of odd-charged ions by even-charged Argon ions, particularly at 315 eV, paves the way for future research. The findings provide crucial insights into atomic interactions with intense light, establishing a groundwork for further exploration into similar atomic systems.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 10 pages, 7 figures

arXiv:2403.15484 [pdf, other]

RakutenAI-7B: Extending Large Language Models for Japanese

Authors: Rakuten Group, Aaron Levine, Connie Huang, Chenguang Wang, Eduardo Batista, Ewa Szymanska, Hongyi Ding, Hou Wei Chou, Jean-François Pessiot, Johanes Effendi, Justin Chiu, Kai Torben Ohlhus, Karan Chopra, Keiji Shinzato, Koji Murakami, Lee Xiong, Lei Chen, Maki Kubota, Maksim Tkachenko, Miroku Lee, Naoki Takahashi, Prathyusha Jwalapuram, Ryutaro Tatsushima, Saurabh Jain, Sunil Kumar Yadav, et al. (5 additional authors not shown)

Abstract: We introduce RakutenAI-7B, a suite of Japanese-oriented large language models that achieve the best performance on the Japanese LM Harness benchmarks among the open 7B models. Along with the foundation model, we release instruction- and chat-tuned models, RakutenAI-7B-instruct and RakutenAI-7B-chat respectively, under the Apache 2.0 license.

Submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.12171 [pdf, other]

EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models

Authors: Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang

Abstract: Jailbreak attacks are crucial for identifying and mitigating the security vulnerabilities of Large Language Models (LLMs). They are designed to bypass safeguards and elicit prohibited outputs. However, due to significant differences among various jailbreak methods, there is no standard implementation framework available for the community, which limits comprehensive security evaluations. This paper introduces EasyJailbreak, a unified framework simplifying the construction and evaluation of jailbreak attacks against LLMs. It builds jailbreak attacks using four components: Selector, Mutator, Constraint, and Evaluator. This modular framework enables researchers to easily construct attacks from combinations of novel and existing components. So far, EasyJailbreak supports 11 distinct jailbreak methods and facilitates the security validation of a broad spectrum of LLMs. Our validation across 10 distinct LLMs reveals a significant vulnerability, with an average breach probability of 60% under various jailbreaking attacks. Notably, even advanced models like GPT-3.5-Turbo and GPT-4 exhibit average Attack Success Rates (ASR) of 57% and 33%, respectively. We have released a wealth of resources for researchers, including a web platform, PyPI published package, screencast video, and experimental outputs.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.12104 [pdf, other]

Topology reconstruction for asymmetric systems by isomorphic mapping or perturbation approximation

Authors: Yunlin Li, Jingguang Chen, Xingchao Qi, Langlang Xiong, Xianjun Wang, Yufu Liu, Fang Guan, Lei Shi, Xunya Jiang

Abstract: The systems without symmetries, e.g. the spatial and chiral symmetries, are generally thought to be improper for topological study and no conventional integral topological invariant can be well defined. In this work, with multi-band asymmetric Rice-Mele-like systems as examples, for the first time we show that the topology of all gaps can be reconstructed by two general methods and topological origin of many phenomena are revealed. A new integral topological invariant, i.e. the renormalized real-space winding number, can properly characterize the topology and bulk-edge correspondence of such systems. For the first method, an isomorphic mapping relationship between a Rice-Mele-like system and its chiral counterpart is set up, which accounts for the topology reconstruction in the half-filling gaps. For the second method, the Hilbert space of asymmetric systems could be reduced into degenerate subspaces by perturbation approximation, so that the topology in subspaces accounts for the topology reconstruction in the fractional-filling gaps. Surprisingly, the topology reconstructed by perturbation approximation exhibits extraordinary robustness since the topological edge states even exist far beyond the weak perturbation limit. We also show that both methods can be widely used for other asymmetric systems, e.g. the two-dimensional (2D) Rice-Mele systems and the superconductor systems. At last, for the asymmetric photonic systems, we predict different topological edge states by our topology-reconstruction theory and experimentally observe them in the laboratory, which agrees with each other very well. Our findings open a door for investigating new topological phenomena in asymmetric systems by various topological reconstruction methods which should greatly expand the category of topology study.

Submitted 24 March, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.11613 [pdf, other]

Scattering Singularity in Topological Dielectric Photonic Crystals

Authors: Langlang Xiong, Xunya Jiang, Guangwei Hu

Abstract: The exploration of topology in natural materials and metamaterials has garnered significant attention. Notably, the one-dimensional (1D) and two-dimensional (2D) Su-Schrieffer-Heeger (SSH) model, assessed through tight-binding approximations, has been extensively investigated in both quantum and classical systems, encompassing general and higher-order topology. Despite these advancements, a comprehensive examination of these models from the perspective of wave physics, particularly the scattering view, remains underexplored. In this study, we systematically unveil the origin of the 1D and 2D Zak phases stemming from the zero-scattering point, termed the scattering singularity in k-space. Employing an expanded plane wave expansion, we accurately compute the reflective spectrum of an infinite 2D photonic crystal (2D-PhC). Analyzing the reflective spectrum reveals the presence of a zero-scattering line in the 2D-PhC, considered the topological origin of the non-trivial Zak phase. Two distinct models, representing omnidirectional non-trivial cases and directional non-trivial cases, are employed to substantiate these findings. Our work introduces a novel perspective for characterizing the nature of non-trivial topological phases. The identification of the zero-scattering line not only enhances our understanding of the underlying physics but also provides valuable insights for the design of innovative devices.

Submitted 18 March, 2024; originally announced March 2024.

Comments: 8 pages, 4 figures

arXiv:2403.09562 [pdf, other]

PreCurious: How Innocent Pre-Trained Language Models Turn into Privacy Traps

Authors: Ruixuan Liu, Tianhao Wang, Yang Cao, Li Xiong

Abstract: The pre-training and fine-tuning paradigm has demonstrated its effectiveness and has become the standard approach for tailoring language models to various tasks. Currently, community-based platforms offer easy access to various pre-trained models, as anyone can publish without strict validation processes. However, a released pre-trained model can be a privacy trap for fine-tuning datasets if it is carefully designed. In this work, we propose PreCurious framework to reveal the new attack surface where the attacker releases the pre-trained model and gets a black-box access to the final fine-tuned model. PreCurious aims to escalate the general privacy risk of both membership inference and data extraction. The key intuition behind PreCurious is to manipulate the memorization stage of the pre-trained model and guide fine-tuning with a seemingly legitimate configuration. The effectiveness of defending against privacy attacks on a fine-tuned model seems promising, as empirical and theoretical evidence suggests that parameter-efficient and differentially private fine-tuning techniques are invulnerable to privacy attacks. But PreCurious demonstrates the possibility of breaking up invulnerability in a stealthy manner compared to fine-tuning on a benign model. By further leveraging a sanitized dataset, PreCurious can extract originally unexposed secrets under differentially private fine-tuning. Thus, PreCurious raises warnings for users who download pre-trained models from unknown sources, rely solely on tutorials or common-sense defenses, and previously release sanitized datasets even after perfect scrubbing.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 19 pages

arXiv:2402.17282 [pdf, other]

doi 10.1051/0004-6361/202449200

Distribution of number of peaks within a long gamma-ray burst

Authors: C. Guidorzi, M. Sartori, R. Maccary, A. Tsvetkova, L. Amati, L. Bazzanini, M. Bulla, A. E. Camisasca, L. Ferro, F. Frontera, C. K. Li, S. L. Xiong, S. N. Zhang

Abstract: The variety of long duration gamma-ray burst (LGRB) light curves (LCs) encode a wealth of information on how LGRB engines release energy following the collapse of the progenitor star. Attempts to characterise GRB LCs focused on a number of properties, such as the minimum variability timescale, power density spectra (both ensemble average and individual), or with different definitions of variability. In parallel, a characterisation as a stochastic process was pursued by studying the distributions of waiting times, peak flux, fluence of individual peaks within GRB time profiles. Yet, the question remains as to whether the diversity of profiles can be described in terms of a common stochastic process. Here we address this issue by studying for the first time the distribution of the number of peaks in a GRB profile. We used four different GRB catalogues: CGRO/BATSE, Swift/BAT, BeppoSAX/GRBM, and Insight-HXMT. The statistically significant peaks were identified by means of well tested algorithm MEPSA and further selected by applying a set of thresholds on signal-to-noise ratio. We then extracted the corresponding distributions of number of peaks per GRB. Among the different models considered (power-law, simple or stretched exponential) only a mixture of two exponentials models all the observed distributions, suggesting the existence of two distinct behaviours: (i) an average number of 2.1+-0.1 peaks per GRB ("peak poor") and accounting for about 80% of the observed population of GRBs; (ii) an average number of 8.3+-1.0 peaks per GRB ("peak rich") and accounting for the remaining 20% of the observed population. We associate the class of peak-rich GRBs with the presence of sub-second variability, which seems to be absent among peak-poor GRBs. The two classes could result from two different regimes through which GRB engines release energy or through which energy is dissipated into gamma-rays.

Submitted 27 February, 2024; originally announced February 2024.

Comments: 7 pages, 5 figures, accepted by A&A

Journal ref: A&A 685, A34 (2024)
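
The two-exponential mixture quoted in the abstract above can be made concrete with a short numerical sketch. This is only an illustration, not the authors' fitting code: it treats the number of peaks as a continuous variable, uses the rounded weights (80% and 20%) and means (2.1 and 8.3) from the abstract, and the function names are hypothetical.

```python
import math
import random

# Rounded values quoted in the abstract; the continuous treatment is an assumption.
W_POOR, MEAN_POOR = 0.8, 2.1   # "peak poor" component
W_RICH, MEAN_RICH = 0.2, 8.3   # "peak rich" component


def mixture_pdf(x: float) -> float:
    """Density of a two-component exponential mixture at x >= 0."""
    poor = W_POOR / MEAN_POOR * math.exp(-x / MEAN_POOR)
    rich = W_RICH / MEAN_RICH * math.exp(-x / MEAN_RICH)
    return poor + rich


def mixture_mean() -> float:
    """Mean of the mixture: the weighted average of the component means."""
    return W_POOR * MEAN_POOR + W_RICH * MEAN_RICH


def prob_peak_rich(x: float) -> float:
    """Posterior probability that a value x was drawn from the peak-rich component."""
    rich = W_RICH / MEAN_RICH * math.exp(-x / MEAN_RICH)
    return rich / mixture_pdf(x)


def sample(rng: random.Random) -> float:
    """Draw one value: pick a component by weight, then draw from its exponential."""
    mean = MEAN_POOR if rng.random() < W_POOR else MEAN_RICH
    return rng.expovariate(1.0 / mean)
```

Under these assumptions the mixture mean is 0.8 × 2.1 + 0.2 × 8.3 = 3.34 peaks per GRB, and `prob_peak_rich` rises with the number of peaks, since the peak-rich component has the heavier tail.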

arXiv:2402.15722 [pdf, other]

Enhancement of spin-orbit interaction and nearly perfect spin-conversion by 1D photonic crystal with the anisotropic defect layer

Authors: Xianjun Wang, Yufu Liu, Yunlin Li, Langlang Xiong, Xunya Jiang

Abstract: Although photon spin-orbit interaction (SOI) has been extensively studied, the vortex-conversion efficiency and the enhancement of the spin Hall effect in abnormal modes in SOI remain to be investigated. Using a one-dimensional (1D) photonic crystal (PhC) system with an anisotropic defect layer (ADL), we first find that the generation efficiency of the vortex beam is close to 50% when the number of periodic layers of the PhC reaches 5. Second, we discuss the case where linearly polarized light is obliquely incident on a defect-state system and find that the destructive interference between the normal mode and the abnormal mode reaches its maximum, resulting in the enhancement of the spin Hall displacement; the effect can be enhanced at any angle of incidence in this system. Finally, we find that in the defect mode, the mutual conversion of normal and abnormal mode spins can be regulated, and the conversion efficiency can be close to 100%.

Submitted 24 February, 2024; originally announced February 2024.

arXiv:2402.12791 [pdf, other]

Dual-polarization huge photonic spin Hall shift and deep-subwavelength sensing based on topological singularities in one-dimensional photonic crystals

Authors: Yufu Liu, Xianjun Wang, Yunlin Li, Haoran Zhang, Langlang Xiong, Xingchao Qi, Zhen Lai, Xuezhi Wang, Xunya Jiang

Abstract: Although several efforts have been made to enhance the photonic spin Hall shift in the deep-subwavelength region, according to effective medium theory, the fundamental conflict between near-zero reflection coefficient and near-zero incident angle still hinders further application. Here, we reveal a fundamental breakdown of effective medium theory due to the existence of a topological singularity in the deep-subwavelength region in one-dimensional photonic crystals. We find that near the topological singularity, a huge photonic spin Hall shift can be achieved for s-polarization and p-polarization. At the topological singularity, the reflected field is split into a dipole-like distribution with zero photonic spin Hall shift for both polarizations, which results from the interference of the spin-maintained normal light and the spin-flipped abnormal light. Based on the theoretical research, dual-polarization thickness and dielectric constant sensing devices can be designed in the deep-subwavelength region. Furthermore, by applying a more complicated layered structure, multi-channel dual-polarization detection and a broadband dual-polarization huge spin Hall shift platform can be designed. This work paves the way to exploring the topological properties and polarization control of photonic crystals and provides a prospective method for the design of multi-channel sensitive detection spin optical devices.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.01391 [pdf, other]

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Authors: Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui

Abstract: The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback for exploring the output space of LLMs to enhance code generation quality. However, the lengthy code generated by LLMs in response to complex human requirements makes RL exploration a challenge. Also, since the unit tests may not cover the complicated code, optimizing LLMs by using these unexecuted code snippets is ineffective. To tackle these challenges, we introduce StepCoder, a novel RL framework for code generation, consisting of two main components: CCCS addresses the exploration challenge by breaking the long-sequence code generation task into a Curriculum of Code Completion Subtasks, while FGO only optimizes the model by masking the unexecuted code segments to provide Fine-Grained Optimization. In addition, we construct the APPS+ dataset for RL training, which is manually verified to ensure the correctness of unit tests. Experimental results show that our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks. Our dataset APPS+ and StepCoder are available online.

Submitted 5 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: 13 pages, 5 figures

arXiv:2401.16251 [pdf, other]

Cross-silo Federated Learning with Record-level Personalized Differential Privacy

Authors: Junxu Liu, Jian Lou, Li Xiong, Jinfei Liu, Xiaofeng Meng

Abstract: Federated learning (FL) enhanced by differential privacy has emerged as a popular approach to better safeguard the privacy of client-side data by protecting clients' contributions during the training process. Existing solutions typically assume a uniform privacy budget for all records and provide one-size-fits-all solutions that may not be adequate to meet each record's privacy requirement. In this paper, we explore the uncharted territory of cross-silo FL with record-level personalized differential privacy. We devise a novel framework named rPDP-FL, employing a two-stage hybrid sampling scheme with both uniform client-level sampling and non-uniform record-level sampling to accommodate varying privacy requirements. A critical and non-trivial problem is how to determine the ideal per-record sampling probability $q$ given the personalized privacy budget $\varepsilon$. We introduce a versatile solution named Simulation-CurveFitting, allowing us to uncover a significant insight into the nonlinear correlation between $q$ and $\varepsilon$ and derive an elegant mathematical model to tackle the problem. Our evaluation demonstrates that our solution can provide significant performance gains over the baselines that do not consider personalized privacy preservation.

Submitted 29 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: 15 pages, 8 figures, accepted by CCS'2024
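
The two-stage hybrid sampling described in the abstract above (uniform client-level sampling followed by non-uniform record-level sampling) might look roughly like the sketch below. It is an illustrative assumption, not the rPDP-FL implementation: the function name `two_stage_sample`, the data layout, and the `client_rate` parameter are hypothetical, and the per-record probabilities q are taken as given rather than derived from a privacy budget via Simulation-CurveFitting.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical layout: each client holds (record_id, q) pairs, where q is that
# record's own sampling probability.
ClientData = Dict[str, List[Tuple[str, float]]]


def two_stage_sample(
    clients: ClientData, client_rate: float, rng: random.Random
) -> Dict[str, List[str]]:
    """Return the records taking part in one training round.

    Stage 1: every client is kept with the same probability `client_rate`.
    Stage 2: within a kept client, each record is kept with its own probability q.
    """
    selected: Dict[str, List[str]] = {}
    for client_id, records in clients.items():
        if rng.random() >= client_rate:   # uniform client-level sampling
            continue
        selected[client_id] = [
            record_id
            for record_id, q in records
            if rng.random() < q           # non-uniform record-level sampling
        ]
    return selected
```

In a sketch like this, a record with a stricter privacy requirement would simply be assigned a smaller q, so it takes part in fewer rounds; how q should follow from the budget is the question the abstract says the paper addresses.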

arXiv:2401.10458 [pdf, other]

Contrastive Unlearning: A Contrastive Approach to Machine Unlearning

Authors: Hong kyu Lee, Qiuchen Zhang, Carl Yang, Jian Lou, Li Xiong

Abstract: Machine unlearning aims to eliminate the influence of a subset of training samples (i.e., unlearning samples) from a trained model. Effectively and efficiently removing the unlearning samples without negatively impacting the overall model performance is still challenging. In this paper, we propose a contrastive unlearning framework, leveraging the concept of representation learning for more effective unlearning. It removes the influence of unlearning samples by contrasting their embeddings against the remaining samples so that they are pushed away from their original classes and pulled toward other classes. By directly optimizing the representation space, it effectively removes the influence of unlearning samples while maintaining the representations learned from the remaining samples. Experiments on a variety of datasets and models on both class unlearning and sample unlearning showed that contrastive unlearning achieves the best unlearning effects and efficiency with the lowest performance loss compared with the state-of-the-art algorithms.

Submitted 18 January, 2024; originally announced January 2024.

arXiv:2401.03472 [pdf, other]

PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction

Authors: Zening Lin, Jiapeng Wang, Teng Li, Wenhui Liao, Dayi Huang, Longfei Xiong, Lianwen Jin

Abstract: Document pair extraction aims to identify key and value entities as well as their relationships from visually-rich documents. Most existing methods divide it into two separate tasks: semantic entity recognition (SER) and relation extraction (RE). However, simply concatenating SER and RE serially can lead to severe error propagation, and it fails to handle cases like multi-line entities in real scenarios. To address these issues, this paper introduces a novel framework, PEneo (Pair Extraction new decoder option), which performs document pair extraction in a unified pipeline, incorporating three concurrent sub-tasks: line extraction, line grouping, and entity linking. This approach alleviates the error accumulation problem and can handle the case of multi-line entities. Furthermore, to better evaluate the model's performance and to facilitate future research on pair extraction, we introduce RFUND, a re-annotated version of the commonly used FUNSD and XFUND datasets, to make them more accurate and cover realistic situations. Experiments on various benchmarks demonstrate PEneo's superiority over previous pipelines, boosting the performance by a large margin (e.g., 19.89%-22.91% F1 score on RFUND-EN) when combined with various backbones like LiLT and LayoutLMv3, showing its effectiveness and generality. Codes and the new annotations will be open to the public.

Submitted 7 January, 2024; originally announced January 2024.

arXiv:2312.11490 [pdf]

Tracking Intrinsic Non-Hermitian Skin Effect in Lossy Lattices

Authors: Liwei Xiong, Qicheng Zhang, Xiling Feng, Yufei Leng, Min Pi, Shuaishuai Tong, Chunyin Qiu

Abstract: Non-Hermitian skin effect (NHSE), characterized by a majority of eigenstates localized at open boundaries, is one of the most iconic phenomena in non-Hermitian lattices. Despite notable experimental studies implemented, most of them witness only certain signs of the NHSE rather than the intrinsic exponential localization inherent in eigenstates, owing to the ubiquitous and inevitable background loss. Even worse, the experimental observation of the NHSE would be completely obscured in highly lossy cases. Here, we theoretically propose a dual test approach to eliminate the destructive loss effect and track the intrinsic NHSE that is essentially irrelevant to background loss. Experimentally, the effectiveness of this approach is precisely validated by one- and two-dimensional non-Hermitian acoustic lattices. Our study sheds new light on the previously untapped intrinsic aspect of the NHSE, which is of particular significance in non-Hermitian topological physics.

Submitted 1 December, 2023; originally announced December 2023.

arXiv:2312.03408 [pdf, other]

Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future

Authors: Hongyang Li, Yang Li, Huijie Wang, Jia Zeng, Huilin Xu, Pinlong Cai, Li Chen, Junchi Yan, Feng Xu, Lu Xiong, Jingdong Wang, Futang Zhu, Chunjing Xu, Tiancai Wang, Fei Xia, Beipeng Mu, Zhihui Peng, Dahua Lin, Yu Qiao

Abstract: With the continuous maturation and application of autonomous driving technology, a systematic examination of open-source autonomous driving datasets becomes instrumental in fostering the robust evolution of the industry ecosystem. Current autonomous driving datasets can broadly be categorized into two generations. The first-generation autonomous driving datasets are characterized by relatively simpler sensor modalities, smaller data scale, and a restriction to perception-level tasks. KITTI, introduced in 2012, serves as a prominent representative of this initial wave. In contrast, the second-generation datasets exhibit heightened complexity in sensor modalities, greater data scale and diversity, and an expansion of tasks from perception to encompass prediction and control. Leading examples of the second generation include nuScenes and Waymo, introduced around 2019. This comprehensive review, conducted in collaboration with esteemed colleagues from both academia and industry, systematically assesses over seventy open-source autonomous driving datasets from domestic and international sources. It offers insights into various aspects, such as the principles underlying the creation of high-quality datasets, the pivotal role of data engine systems, and the utilization of generative foundation models to facilitate scalable data generation. Furthermore, this review undertakes an exhaustive analysis and discourse regarding the characteristics and data scales that future third-generation autonomous driving datasets should possess. It also delves into the scientific and technical challenges that warrant resolution. These endeavors are pivotal in advancing autonomous innovation and fostering technological enhancement in critical domains. For further details, please refer to https://github.com/OpenDriveLab/DriveAGI.

Submitted 22 March, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

Comments: This article is a simplified English translation of the corresponding Chinese article. Please refer to the Chinese version for the complete content

arXiv:2312.02646 [pdf, other]

SAMSGL: Series-Aligned Multi-Scale Graph Learning for Spatio-Temporal Forecasting

Authors: Xiaobei Zou, Luolin Xiong, Yang Tang, Jürgen Kurths

Abstract: Spatio-temporal forecasting in various domains, like traffic prediction and weather forecasting, is a challenging endeavor, primarily due to the difficulties in modeling propagation dynamics and capturing high-dimensional interactions among nodes. Despite the significant strides made by graph-based networks in spatio-temporal forecasting, there remain two pivotal factors closely related to forecasting performance that need further consideration: time delays in propagation dynamics and multi-scale high-dimensional interactions. In this work, we present a Series-Aligned Multi-Scale Graph Learning (SAMSGL) framework, aiming to enhance forecasting performance. In order to handle time delays in spatial interactions, we propose a series-aligned graph convolution layer to facilitate the aggregation of non-delayed graph signals, thereby mitigating the influence of time delays for the improvement in accuracy. To understand global and local spatio-temporal interactions, we develop a spatio-temporal architecture via multi-scale graph learning, which encompasses two essential components: multi-scale graph structure learning and graph-fully connected (Graph-FC) blocks. The multi-scale graph structure learning includes a global graph structure to learn both delayed and non-delayed node embeddings, as well as a local one to learn node variations influenced by neighboring factors. The Graph-FC blocks synergistically fuse spatial and temporal information to boost prediction accuracy. To evaluate the performance of SAMSGL, we conduct experiments on meteorological and traffic forecasting datasets, which demonstrate its effectiveness and superiority.

Submitted 27 May, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

Comments: Accepted by Chaos

arXiv:2311.08430 [pdf, other]

Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale

Authors: Wei Wen, Kuang-Hung Liu, Igor Fedorov, Xin Zhang, Hang Yin, Weiwei Chu, Kaveh Hassani, Mengying Sun, Jiang Liu, Xu Wang, Lin Jiang, Yuxin Chen, Buyun Zhang, Xi Liu, Dehua Cheng, Zhengxing Chen, Guang Zhao, Fangqiu Han, Jiyan Yang, Yuchen Hao, Liang Xiong, Wen-Yen Chen

Abstract: Neural Architecture Search (NAS) has demonstrated its efficacy in computer vision and potential for ranking systems. However, prior work focused on academic problems, which are evaluated at small scale under well-controlled fixed baselines. In industry systems, such as the ranking systems at Meta, it is unclear whether NAS algorithms from the literature can outperform production baselines because of: (1) scale - Meta ranking systems serve billions of users, (2) strong baselines - the baselines are production models optimized by hundreds to thousands of world-class engineers for years since the rise of deep learning, (3) dynamic baselines - engineers may have established new and stronger baselines during NAS search, and (4) efficiency - the search pipeline must yield results quickly in alignment with the productionization life cycle. In this paper, we present Rankitect, a NAS software framework for ranking systems at Meta. Rankitect seeks to build brand new architectures by composing low level building blocks from scratch. Rankitect implements and improves state-of-the-art (SOTA) NAS methods for comprehensive and fair comparison under the same search space, including sampling-based NAS, one-shot NAS, and Differentiable NAS (DNAS). We evaluate Rankitect by comparing to multiple production ranking models at Meta. We find that Rankitect can discover new models from scratch achieving a competitive tradeoff between Normalized Entropy loss and FLOPs. When utilizing a search space designed by engineers, Rankitect can generate better models than engineers, achieving positive offline evaluation and online A/B test at Meta scale.

Submitted 13 November, 2023; originally announced November 2023.

Comments: Wei Wen and Kuang-Hung Liu contribute equally

arXiv:2311.06227 [pdf, other]

Does Differential Privacy Prevent Backdoor Attacks in Practice?

Authors: Fereshteh Razmi, Jian Lou, Li Xiong

Abstract: Differential Privacy (DP) was originally developed to protect privacy. However, it has recently been utilized to secure machine learning (ML) models from poisoning attacks, with DP-SGD receiving substantial attention. Nevertheless, a thorough investigation is required to assess the effectiveness of different DP techniques in preventing backdoor attacks in practice. In this paper, we investigate the effectiveness of DP-SGD and, for the first time in the literature, examine PATE in the context of backdoor attacks. We also explore the role of different components of DP algorithms in defending against backdoor attacks and show that PATE is effective against these attacks due to the bagging structure of the teacher models it employs. Our experiments reveal that hyperparameters and the number of backdoors in the training dataset impact the success of DP algorithms. Additionally, we propose Label-DP as a faster and more accurate alternative to DP-SGD and PATE. We conclude that while Label-DP algorithms generally offer weaker privacy protection, accurate hyper-parameter tuning can make them more effective than DP methods in defending against backdoor attacks while maintaining model accuracy.

Submitted 10 November, 2023; originally announced November 2023.

arXiv:2311.03896 [pdf, other]

iACOS: Advancing Implicit Sentiment Extraction with Informative and Adaptive Negative Examples

Authors: Xiancai Xu, Jia-Dong Zhang, Lei Xiong, Zhishang Liu

Abstract: Aspect-based sentiment analysis (ABSA) has been extensively studied, but little light has been shed on the quadruple extraction consisting of four fundamental elements: aspects, categories, opinions and sentiments, especially with implicit aspects and opinions. In this paper, we propose a new method iACOS for extracting Implicit Aspects with Categories and Opinions with Sentiments. First, iACOS appends two implicit tokens at the end of a text to capture the context-aware representation of all tokens including implicit aspects and opinions. Second, iACOS develops a sequence labeling model over the context-aware token representation to co-extract explicit and implicit aspects and opinions. Third, iACOS devises a multi-label classifier with a specialized multi-head attention for discovering aspect-opinion pairs and predicting their categories and sentiments simultaneously. Fourth, iACOS leverages informative and adaptive negative examples to jointly train the multi-label classifier and the other two classifiers on categories and sentiments by multi-task learning. Finally, the experimental results show that iACOS significantly outperforms other quadruple extraction baselines according to the F1 score on two public benchmark datasets.

Submitted 22 June, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

Journal ref: NAACL 2024 (Volume 1: Long Papers)

arXiv:2310.20251 [pdf, other]

An Implementation of Multimodal Fusion System for Intelligent Digital Human Generation

Authors: Yingjie Zhou, Yaodong Chen, Kaiyue Bi, Lian Xiong, Hui Liu

Abstract: With the rapid development of artificial intelligence (AI), digital humans have attracted more and more attention and are expected to achieve a wide range of applications in several industries. However, most of the existing digital humans still rely on manual modeling by designers, which is a cumbersome process and has a long development cycle. Therefore, facing the rise of digital humans, there is an urgent need for a digital human generation system combined with AI to improve development efficiency. In this paper, an implementation scheme of an intelligent digital human generation system with multimodal fusion is proposed. Specifically, text, speech and image are taken as inputs, and interactive speech is synthesized using large language model (LLM), voiceprint extraction, and text-to-speech conversion techniques. Then the input image is age-transformed and a suitable image is selected as the driving image. Next, the modification and generation of digital human video content is realized by digital human driving, novel view synthesis, and intelligent dressing techniques. Finally, we enhance the user experience through style transfer, super-resolution, and quality evaluation. Experimental results show that the system can effectively realize digital human generation. The related code is released at https://github.com/zyj-2000/CUMT_2D_PhotoSpeaker.

Submitted 31 October, 2023; originally announced October 2023.

arXiv:2310.14783 [pdf, other]

Interpretable Deep Reinforcement Learning for Optimizing Heterogeneous Energy Storage Systems

Authors: Luolin Xiong, Yang Tang, Chensheng Liu, Shuai Mao, Ke Meng, Zhaoyang Dong, Feng Qian

Abstract: Energy storage systems (ESS) are a pivotal component in the energy market, serving as both energy suppliers and consumers. ESS operators can reap benefits from energy arbitrage by optimizing operations of storage equipment. To further enhance ESS flexibility within the energy market and improve renewable energy utilization, a heterogeneous photovoltaic-ESS (PV-ESS) is proposed, which leverages the unique characteristics of battery energy storage (BES) and hydrogen energy storage (HES). For scheduling tasks of the heterogeneous PV-ESS, cost description plays a crucial role in guiding operators' strategies to maximize benefits. We develop a comprehensive cost function that takes into account degradation, capital, and operation/maintenance costs to reflect real-world scenarios. Moreover, while numerous methods excel in optimizing ESS energy arbitrage, they often rely on black-box models with opaque decision-making processes, limiting practical applicability. To overcome this limitation and enable transparent scheduling strategies, a prototype-based policy network with inherent interpretability is introduced. This network employs human-designed prototypes to guide decision-making by comparing similarities between prototypical situations and encountered situations, which allows for naturally explained scheduling strategies. Comparative results across four distinct cases underscore the effectiveness and practicality of our proposed pre-hoc interpretable optimization method when contrasted with black-box models.

Submitted 19 October, 2023; originally announced October 2023.

arXiv:2310.06502 [pdf, other]

The Limits of ChatGPT in Extracting Aspect-Category-Opinion-Sentiment Quadruples: A Comparative Analysis

Authors: Xiancai Xu, Jia-Dong Zhang, Rongchang Xiao, Lei Xiong

Abstract: Recently, ChatGPT has attracted great attention from both industry and academia due to its surprising abilities in natural language understanding and generation. We are particularly curious about whether it can achieve promising performance on one of the most complex tasks in aspect-based sentiment analysis, i.e., extracting aspect-category-opinion-sentiment quadruples from texts. To this end, in this paper we develop a specialized prompt template that enables ChatGPT to effectively tackle this complex quadruple extraction task. Further, we propose a selection method on few-shot examples to fully exploit the in-context learning ability of ChatGPT and uplift its effectiveness on this complex task. Finally, we provide a comparative evaluation of ChatGPT against existing state-of-the-art quadruple extraction models based on four public datasets and highlight some important findings regarding the capability boundaries of ChatGPT in quadruple extraction.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2309.07864 [pdf, other]

The Rise and Potential of Large Language Model Based Agents: A Survey

Authors: Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, et al. (4 additional authors not shown)

Abstract: For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks. Actually, what the community lacks is a general and powerful model to serve as a starting point for designing AI agents that can adapt to diverse scenarios. Due to the versatile capabilities they demonstrate, large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI), offering hope for building general AI agents. Many researchers have leveraged LLMs as the foundation to build AI agents and have achieved significant progress. In this paper, we perform a comprehensive survey on LLM-based agents. We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents. Building upon this, we present a general framework for LLM-based agents, comprising three main components: brain, perception, and action, and the framework can be tailored for different applications. Subsequently, we explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation. Following this, we delve into agent societies, exploring the behavior and personality of LLM-based agents, the social phenomena that emerge from an agent society, and the insights they offer for human society. Finally, we discuss several key topics and open problems within the field. A repository of the related papers is available at https://github.com/WooooDyy/LLM-Agent-Paper-List.

Submitted 19 September, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

Comments: 86 pages, 12 figures

arXiv:2309.06667 [pdf]

doi 10.1038/s41467-023-41773-x

Visualizing moiré ferroelectricity via plasmons and nano-photocurrent in graphene/twisted-WSe2 structures

Authors: Shuai Zhang, Yang Liu, Zhiyuan Sun, Xinzhong Chen, Baichang Li, S. L. Moore, Song Liu, Zhiying Wang, S. E. Rossi, Ran Jing, Jordan Fonseca, Birui Yang, Yinming Shao, Chun-Ying Huang, Taketo Handa, Lin Xiong, Matthew Fu, Tsai-Chun Pan, Dorri Halbertal, Xinyi Xu, Wenjun Zheng, P. J. Schuck, A. N. Pasupathy, C. R. Dean, Xiaoyang Zhu, et al. (6 additional authors not shown)

Abstract: Ferroelectricity, a spontaneous and reversible electric polarization, is found in certain classes of van der Waals (vdW) material heterostructures. The discovery of ferroelectricity in twisted vdW layers provides new opportunities to engineer spatially dependent electric and optical properties associated with the configuration of moiré superlattice domains and the network of domain walls. Here, we employ near-field infrared nano-imaging and nano-photocurrent measurements to study ferroelectricity in minimally twisted WSe2. The ferroelectric domains are visualized through the imaging of the plasmonic response in a graphene monolayer adjacent to the moiré WSe2 bilayers. Specifically, we find that the ferroelectric polarization in moiré domains is imprinted on the plasmonic response of the graphene. Complementary nano-photocurrent measurements demonstrate that the optoelectronic properties of graphene are also modulated by the proximal ferroelectric domains. Our approach represents an alternative strategy for studying moiré ferroelectricity at native length scales and opens promising prospects for (opto)electronic devices.

Submitted 12 September, 2023; originally announced September 2023.

Comments: 19 pages, 3 figures

Journal ref: Nature Communications 14, 6200 (2023)

arXiv:2308.12210 [pdf, other]

ULDP-FL: Federated Learning with Across Silo User-Level Differential Privacy

Authors: Fumiyuki Kato, Li Xiong, Shun Takagi, Yang Cao, Masatoshi Yoshikawa

Abstract: Differentially Private Federated Learning (DP-FL) has garnered attention as a collaborative machine learning approach that ensures formal privacy. Most DP-FL approaches ensure DP at the record-level within each silo for cross-silo FL. However, a single user's data may extend across multiple silos, and the desired user-level DP guarantee for such a setting remains unknown. In this study, we present Uldp-FL, a novel FL framework designed to guarantee user-level DP in cross-silo FL where a single user's data may belong to multiple silos. Our proposed algorithm directly ensures user-level DP through per-user weighted clipping, departing from group-privacy approaches. We provide a theoretical analysis of the algorithm's privacy and utility. Additionally, we improve the utility of the proposed algorithm with an enhanced weighting strategy based on user record distribution and design a novel private protocol that ensures no additional information is revealed to the silos and the server. Experiments on real-world datasets show substantial improvements in our methods in privacy-utility trade-offs under user-level DP compared to baseline methods. To the best of our knowledge, our work is the first FL framework that effectively provides user-level DP in the general cross-silo FL setting.

Submitted 16 June, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

Comments: This is the full version of the paper accepted to VLDB 2024
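
The per-user weighted clipping named in the ULDP-FL abstract above can be illustrated with a minimal sketch. It assumes, hypothetically, that each silo computes one model delta per user, clips it to an L2 bound `clip_norm`, and scales it by a per-silo weight, with each user's weights summing to at most 1 across silos. The function names and data layout are illustrative and are not taken from the paper; noise addition and the private weighting protocol are omitted.

```python
import math

def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))

def clip(vec, clip_norm):
    """Scale vec down so that its L2 norm is at most clip_norm."""
    norm = l2_norm(vec)
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]

def silo_contribution(user_deltas, user_weights, clip_norm):
    """Sum of clipped, weighted per-user deltas inside one silo.

    user_deltas:  {user_id: list of floats}  (hypothetical layout)
    user_weights: {user_id: float}, this silo's share of each user's weight
    """
    dim = len(next(iter(user_deltas.values())))
    total = [0.0] * dim
    for user, delta in user_deltas.items():
        clipped = clip(delta, clip_norm)
        weight = user_weights[user]
        for i in range(dim):
            total[i] += weight * clipped[i]
    return total

def user_total_norm(per_silo_deltas, per_silo_weights, clip_norm):
    """L2 norm of one user's combined contribution across all silos."""
    dim = len(per_silo_deltas[0])
    combined = [0.0] * dim
    for delta, weight in zip(per_silo_deltas, per_silo_weights):
        clipped = clip(delta, clip_norm)
        for i in range(dim):
            combined[i] += weight * clipped[i]
    return l2_norm(combined)
```

Because every clipped delta has norm at most `clip_norm` and the weights for one user sum to at most 1, the triangle inequality bounds that user's combined contribution by `clip_norm` regardless of how many silos hold the user's data. That bound is the sensitivity a user-level mechanism would calibrate its noise to under these assumptions.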

arXiv:2308.09251 [pdf, other]

Catalogue of topological electrons and phonons in all allotropes of carbon

Authors: Qing-Bo Liu, Xiang-Feng Yang, Zhe-Qi Wang, Ziyang Yu, Lun Xiong, Hua-Hua Fu

Abstract: Carbon, as one of the most common elements on Earth, forms hundreds of allotropic phases with rich physical properties. In this work, by combining ab initio calculations and symmetry analysis, we systematically study a large number of carbon allotropes (703) and discover 315 ideal topological phononic materials and 32 topological electronic materials. The ideal topological phononic features include single, charge-two, charge-three, and charge-four Weyl phonons, Dirac or Weyl nodal-line phonons, and nodal-surface phonons. The topological electronic features include topological insulators, (Type-II) Dirac points, triple nodal points, Dirac (Weyl) nodal lines, quadratic nodal lines, and so on. For convenience, we take uni in SG 178 and pbg in SG 230 as examples to describe the topological features in the main text. We find that single-pair Weyl phonons and one-nodal-surface phonons coexist in uni in SG 178, which can form a single surface arc in the (100) surface BZ and isolated double-helix surface states (IDHSSs) in the (110) surface BZ. In the topological semimetal pbg in SG 230, we find a perfect triply degenerate nodal point near the Fermi level, which forms clear surface states in the (001) and (110) surface BZ. Our work not only greatly expands the known topological features in the allotropes of carbon, but also provides many ideal platforms to study topological electrons and phonons.

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2308.06573 [pdf, other]

4DRVO-Net: Deep 4D Radar-Visual Odometry Using Multi-Modal and Multi-Scale Adaptive Fusion

Authors: Guirong Zhuo, Shouyi Lu, Huanyu Zhou, Lianqing Zheng, Lu Xiong

Abstract: Four-dimensional (4D) radar-visual odometry (4DRVO) integrates complementary information from 4D radar and cameras, making it an attractive solution for achieving accurate and robust pose estimation. However, 4DRVO may exhibit significant tracking errors owing to three main factors: 1) sparsity of 4D radar point clouds; 2) inaccurate data association and insufficient feature interaction between the 4D radar and camera; and 3) disturbances caused by dynamic objects in the environment, affecting odometry estimation. In this paper, we present 4DRVO-Net, which is a method for 4D radar-visual odometry. This method leverages the feature pyramid, pose warping, and cost volume (PWC) network architecture to progressively estimate and refine poses. Specifically, we propose a multi-scale feature extraction network called Radar-PointNet++ that fully considers rich 4D radar point information, enabling fine-grained learning for sparse 4D radar point clouds. To effectively integrate the two modalities, we design an adaptive 4D radar-camera fusion module (A-RCFM) that automatically selects image features based on 4D radar point features, facilitating multi-scale cross-modal feature interaction and adaptive multi-modal feature fusion. In addition, we introduce a velocity-guided point-confidence estimation module to measure local motion patterns, reduce the influence of dynamic objects and outliers, and provide continuous updates during pose refinement. We demonstrate the excellent performance of our method and the effectiveness of each module design on both the VoD and in-house datasets. Our method outperforms all learning-based and geometry-based methods for most sequences in the VoD dataset. Furthermore, it has exhibited promising performance that closely approaches that of the 64-line LiDAR odometry results of A-LOAM without mapping optimization.

Submitted 12 August, 2023; originally announced August 2023.

Comments: 14 pages, 12 figures

arXiv:2307.06501 [pdf, other]

Hybrid Control Policy for Artificial Pancreas via Ensemble Deep Reinforcement Learning

Authors: Wenzhou Lv, Tianyu Wu, Luolin Xiong, Liang Wu, Jian Zhou, Yang Tang, Feng Qian

Abstract: Objective: The artificial pancreas (AP) has shown promising potential in achieving closed-loop glucose control for individuals with type 1 diabetes mellitus (T1DM). However, designing an effective control policy for the AP remains challenging due to the complex physiological processes, delayed insulin response, and inaccurate glucose measurements. While model predictive control (MPC) offers safety and stability through the dynamic model and safety constraints, it lacks individualization and is adversely affected by unannounced meals. Conversely, deep reinforcement learning (DRL) provides personalized and adaptive strategies but faces challenges with distribution shifts and substantial data requirements. Methods: We propose a hybrid control policy for the artificial pancreas (HyCPAP) to address the above challenges. HyCPAP combines an MPC policy with an ensemble DRL policy, leveraging the strengths of both policies while compensating for their respective limitations. To facilitate faster deployment of AP systems in real-world settings, we further incorporate meta-learning techniques into HyCPAP, leveraging previous experience and patient-shared knowledge to enable fast adaptation to new patients with limited available data. Results: We conduct extensive experiments using the FDA-accepted UVA/Padova T1DM simulator across three scenarios. Our approaches achieve the highest percentage of time spent in the desired euglycemic range and the lowest occurrences of hypoglycemia. Conclusion: The results clearly demonstrate the superiority of our methods for closed-loop glucose management in individuals with T1DM. Significance: The study presents novel control policies for AP systems, affirming the great potential of proposed methods for efficient closed-loop glucose control.

Submitted 13 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

Comments: 12 pages

arXiv:2307.05717 [pdf, other]

Towards Mobility Data Science (Vision Paper)

Authors: Mohamed Mokbel, Mahmoud Sakr, Li Xiong, Andreas Züfle, Jussara Almeida, Taylor Anderson, Walid Aref, Gennady Andrienko, Natalia Andrienko, Yang Cao, Sanjay Chawla, Reynold Cheng, Panos Chrysanthis, Xiqi Fei, Gabriel Ghinita, Anita Graser, Dimitrios Gunopulos, Christian Jensen, Joon-Seok Kim, Kyoung-Sook Kim, Peer Kröger, John Krumm, Johannes Lauer, Amr Magdy, Mario Nascimento, et al. (23 additional authors not shown)

Abstract: Mobility data captures the locations of moving objects such as humans, animals, and cars. With the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies, mobility data is collected ubiquitously. In recent years, the use of mobility data has demonstrated significant impact in various domains including traffic management, urban planning, and health sciences. In this paper, we present the emerging domain of mobility data science. Towards a unified approach to mobility data science, we envision a pipeline having the following components: mobility data collection, cleaning, analysis, management, and privacy. For each of these components, we explain how mobility data science differs from general data science, we survey the current state of the art and describe open challenges for the research community in the coming years.

Submitted 7 March, 2024; v1 submitted 21 June, 2023; originally announced July 2023.

Comments: Updated to reflect the major revision for ACM Transactions on Spatial Algorithms and Systems (TSAS). This version reflects the final version accepted by ACM TSAS

arXiv:2307.04964 [pdf, other]

Secrets of RLHF in Large Language Models Part I: PPO

Authors: Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, et al. (2 additional authors not shown)

Abstract: Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence. Its primary objective is to function as a human-centric (helpful, honest, and harmless) assistant. Alignment with humans assumes paramount significance, and reinforcement learning with human feedback (RLHF) emerges as the pivotal technological paradigm underpinning this pursuit. Current technical routes usually include reward models to measure human preferences, Proximal Policy Optimization (PPO) to optimize policy model outputs, and process supervision to improve step-by-step reasoning capabilities. However, due to the challenges of reward design, environment interaction, and agent training, coupled with huge trial and error cost of large language models, there is a significant barrier for AI researchers to motivate the development of technical alignment and safe landing of LLMs. The stable training of RLHF has still been a puzzle. In the first report, we dissect the framework of RLHF, re-evaluate the inner workings of PPO, and explore how the parts comprising PPO algorithms impact policy agent training. We identify policy constraints being the key factor for the effective implementation of the PPO algorithm. Therefore, we explore the PPO-max, an advanced version of PPO algorithm, to efficiently improve the training stability of the policy model. Based on our main results, we perform a comprehensive analysis of RLHF abilities compared with SFT models and ChatGPT. The absence of open-source implementations has posed significant challenges to the investigation of LLMs alignment. Therefore, we are eager to release technical reports, reward models and PPO codes, aiming to make modest contributions to the advancement of LLMs.

Submitted 18 July, 2023; v1 submitted 10 July, 2023; originally announced July 2023.
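
The abstract above singles out policy constraints as the key factor in PPO. As a point of reference, here is a minimal sketch of the standard clipped surrogate objective from the original PPO formulation. It is not the PPO-max variant the report proposes; the function name and the default `clip_eps=0.2` are illustrative assumptions.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    logp_new / logp_old: log-probabilities of the sampled actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum removes any incentive to push the ratio
        # outside [1 - clip_eps, 1 + clip_eps] in the favorable direction.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping acts as a soft constraint on how far one update can move the policy from the one that collected the data, which is one common form of the policy constraint the abstract refers to.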

arXiv:2306.12608 [pdf, other]

DP-BREM: Differentially-Private and Byzantine-Robust Federated Learning with Client Momentum

Authors: Xiaolan Gu, Ming Li, Li Xiong

Abstract: Federated Learning (FL) allows multiple participating clients to train machine learning models collaboratively while keeping their datasets local and only exchanging the gradient or model updates with a coordinating server. Existing FL protocols are vulnerable to attacks that aim to compromise data privacy and/or model robustness. Recently proposed defenses focused on ensuring either privacy or robustness, but not both. In this paper, we focus on simultaneously achieving differential privacy (DP) and Byzantine robustness for cross-silo FL, based on the idea of learning from history. The robustness is achieved via client momentum, which averages the updates of each client over time, thus reducing the variance of the honest clients and exposing the small malicious perturbations of Byzantine clients that are undetectable in a single round but accumulate over time. In our initial solution DP-BREM, DP is achieved by adding noise to the aggregated momentum, and we account for the privacy cost from the momentum, which is different from the conventional DP-SGD that accounts for the privacy cost from the gradient. Since DP-BREM assumes a trusted server (who can obtain clients' local models or updates), we further develop the final solution called DP-BREM+, which achieves the same DP and robustness properties as DP-BREM without a trusted server by utilizing secure aggregation techniques, where DP noise is securely and jointly generated by the clients. Both theoretical analysis and experimental results demonstrate that our proposed protocols achieve better privacy-utility tradeoff and stronger Byzantine robustness than several baseline methods, under different DP budgets and attack settings.

Submitted 10 May, 2024; v1 submitted 21 June, 2023; originally announced June 2023.
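
The two ingredients named in the DP-BREM abstract above, client momentum and noise added to the aggregated momentum, can be sketched as follows. This is a simplified illustration under stated assumptions: momentum is modeled as an exponential moving average with a hypothetical coefficient `beta`, aggregation is a plain mean, and `noise_std` is a free parameter. The paper's robust aggregation rule, clipping, and privacy accounting are not reproduced here.

```python
import random

def update_momentum(momentum, update, beta=0.9):
    """Exponential moving average of one client's updates over rounds."""
    return [beta * m + (1.0 - beta) * u for m, u in zip(momentum, update)]

def noisy_aggregate(momenta, noise_std, rng):
    """Average the clients' momenta, then add Gaussian noise per coordinate.

    momenta: list of equal-length float lists, one per client.
    rng: a random.Random instance, passed in for reproducibility.
    """
    n = len(momenta)
    dim = len(momenta[0])
    mean = [sum(m[i] for m in momenta) / n for i in range(dim)]
    return [x + rng.gauss(0.0, noise_std) for x in mean]
```

Averaging over rounds damps the round-to-round variance of honest clients, so a small but persistent malicious bias becomes easier to separate from honest behavior, which is the intuition the abstract gives for using momentum instead of single-round updates.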

arXiv:2306.10255 [pdf, other]

doi 10.1029/2022GL102325

The First GECAM Observation Results on Terrestrial Gamma-ray Flashes and Terrestrial Electron Beams

Authors: Y. Zhao, J. C. Liu, S. L. Xiong, W. C. Xue, Q. B. Yi, G. P. Lu, W. Xu, F. C. Lyu, J. C. Sun, W. X. Peng, C. Zheng, Y. Q. Zhang, C. Cai, S. Xiao, S. L. Xie, C. W. Wang, W. J. Tan, Z. H. An, G. Chen, Y. Q. Du, Y. Huang, M. Gao, K. Gong, D. Y. Guo, J. J. He, et al. (37 additional authors not shown)

Abstract: Gravitational-wave high-energy Electromagnetic Counterpart All-sky Monitor (GECAM) is a space-borne instrument dedicated to monitoring high-energy transients, including Terrestrial Gamma-ray Flashes (TGFs) and Terrestrial Electron Beams (TEBs). We implemented a TGF/TEB search algorithm for GECAM, with which 147 bright TGFs, 2 typical TEBs and 2 special TEB-like events are identified during an effective observation time of ~9 months. We show that, with gamma-ray and charged particle detectors, GECAM can effectively identify and distinguish TGFs and TEBs, and measure their temporal and spectral properties in detail. A very high TGF-lightning association rate of ~80% is obtained between GECAM and GLD360 in the East Asia region.

Submitted 17 June, 2023; originally announced June 2023.

Comments: The paper was accepted by Geophysical Research Letters on June 16th, 2023

arXiv:2306.07142 [pdf]

Evolving Testing Scenario Generation Method and Intelligence Evaluation Framework for Automated Vehicles

Authors: Yining Ma, Wei Jiang, Lingtong Zhang, Junyi Chen, Hong Wang, Chen Lv, Xuesong Wang, Lu Xiong

Abstract: Interaction between the background vehicles (BVs) and automated vehicles (AVs) in scenario-based testing plays a critical role in evaluating the intelligence of the AVs. Current testing scenarios typically employ predefined or scripted BVs, which inadequately reflect the complexity of human-like social behaviors in real-world driving scenarios, and also lack a systematic metric for evaluating the comprehensive intelligence of AVs. Therefore, this paper proposes an evolving scenario generation method that utilizes deep reinforcement learning (DRL) to create human-like BVs for testing and intelligence evaluation of AVs. Firstly, a class of driver models with human-like competitive, cooperative, and mutual driving motivations is designed. Then, utilizing an improved "level-k" training procedure, the three distinct driver models acquire game-based interactive driving policies. These models are assigned to BVs to generate evolving scenarios in which all BVs can interact continuously and evolve diverse contents. Next, a framework including safety, driving efficiency, and interaction utility is presented to evaluate and quantify the intelligence performance of 3 systems under test (SUTs), indicating the effectiveness of the evolving scenario for intelligence testing. Finally, the complexity and fidelity of the proposed evolving testing scenario are validated. The results demonstrate that the proposed evolving scenario exhibits the highest level of complexity compared to other baseline scenarios and has more than 85% similarity to naturalistic driving data. This highlights the potential of the proposed method to facilitate the development and evaluation of high-level AVs in a realistic and challenging environment.

Submitted 12 June, 2023; originally announced June 2023.

Comments: 18 pages, 17 figures

arXiv:2305.18628 [pdf, other]

doi 10.1051/0004-6361/202245303

Simultaneous and panchromatic observations of the Fast Radio Burst FRB 20180916B

Authors: M. Trudu, M. Pilia, L. Nicastro, C. Guidorzi, M. Orlandini, L. Zampieri, V. R. Marthi, F. Ambrosino, A. Possenti, M. Burgay, C. Casentini, I. Mereminskiy, V. Savchenko, E. Palazzi, F. Panessa, A. Ridolfi, F. Verrecchia, M. Anedda, G. Bernardi, M. Bachetti, R. Burenin, A. Burtovoi, P. Casella, M. Fiori, F. Frontera, et al. (25 additional authors not shown)

Abstract: Aims. Fast Radio Bursts are bright radio transients whose origin has not yet been explained. The search for a multi-wavelength counterpart of those events can put a tight constraint on the emission mechanism and the progenitor source. Methods. We conducted a multi-wavelength observational campaign on FRB 20180916B between October 2020 and August 2021 during eight activity cycles of the source. Observations were led in the radio band by the SRT at both 336 MHz and 1547 MHz and the uGMRT at 400 MHz. Simultaneous observations were conducted by the optical telescopes Asiago (Galileo and Copernico), CMO SAI MSU, CAHA 2.2m, RTT-150 and TNG, and X/Gamma-ray detectors on board the AGILE, Insight-HXMT, INTEGRAL and Swift satellites. Results. We present the detection of 14 new bursts with the SRT at 336 MHz and seven new bursts with the uGMRT from this source. We provide the deepest prompt upper limits in the optical band for FRB 20180916B to date. In fact, the TNG/SiFAP2 observation simultaneous to a burst detection by uGMRT gives an upper limit E_optical / E_radio < 1.3 x 10^2. Another burst detected by the SRT at 336 MHz was also co-observed by Insight-HXMT. The non-detection in the X-rays yields an upper limit (1-30 keV band) of E_X-ray / E_radio in the range of (0.9-1.3) x 10^7, depending on which model is considered for the X-ray emission.

Submitted 29 May, 2023; originally announced May 2023.

Comments: A&A accepted

Journal ref: A&A 676, A17 (2023)

arXiv:2305.12485 [pdf, other]

A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition

Authors: Limao Xiong, Jie Zhou, Qunxi Zhu, Xiao Wang, Yuanbin Wu, Qi Zhang, Tao Gui, Xuanjing Huang, Jin Ma, Ying Shan

Abstract: Existing models for named entity recognition (NER) are mainly based on large-scale labeled datasets, which are usually obtained through crowdsourcing. However, it is hard to obtain a unified and correct label via majority voting from multiple annotators for NER due to the large labeling space and complexity of this task. To address this problem, we aim to utilize the original multi-annotator labels directly. Particularly, we propose a Confidence-based Partial Label Learning (CPLL) method to integrate the prior confidence (given by annotators) and posterior confidences (learned by models) for crowd-annotated NER. This model learns a token- and content-dependent confidence via an Expectation-Maximization (EM) algorithm by minimizing empirical risk. The true posterior estimator and confidence estimator are applied iteratively to update the true posterior and confidence respectively. We conduct extensive experiments on both real-world and synthetic datasets, which show that our model can improve performance effectively compared with strong baselines.

Submitted 27 July, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

arXiv:2304.06929 [pdf]

Advancing Differential Privacy: Where We Are Now and Future Directions for Real-World Deployment

Authors: Rachel Cummings, Damien Desfontaines, David Evans, Roxana Geambasu, Yangsibo Huang, Matthew Jagielski, Peter Kairouz, Gautam Kamath, Sewoong Oh, Olga Ohrimenko, Nicolas Papernot, Ryan Rogers, Milan Shen, Shuang Song, Weijie Su, Andreas Terzis, Abhradeep Thakurta, Sergei Vassilvitskii, Yu-Xiang Wang, Li Xiong, Sergey Yekhanin, Da Yu, Huanyu Zhang, Wanrong Zhang

Abstract: In this article, we present a detailed review of current practices and state-of-the-art methodologies in the field of differential privacy (DP), with a focus on advancing DP's deployment in real-world applications. Key points and high-level contents of the article originated from discussions at "Differential Privacy (DP): Challenges Towards the Next Frontier," a workshop held in July 2022 with experts from industry, academia, and the public sector seeking answers to broad questions pertaining to privacy and its implications in the design of industry-grade systems. This article aims to provide a reference point for the algorithmic and design decisions within the realm of privacy, highlighting important challenges and potential research directions. Covering a wide spectrum of topics, this article delves into the infrastructure needs for designing private systems, methods for achieving better privacy/utility trade-offs, performing privacy attacks and auditing, as well as communicating privacy with broader audiences and stakeholders.

Submitted 12 March, 2024; v1 submitted 14 April, 2023; originally announced April 2023.

arXiv:2304.05516 [pdf, other]

Echo of Neighbors: Privacy Amplification for Personalized Private Federated Learning with Shuffle Model

Authors: Yixuan Liu, Suyun Zhao, Li Xiong, Yuhan Liu, Hong Chen

Abstract: Federated Learning, as a popular paradigm for collaborative training, is vulnerable to privacy attacks. Different privacy levels regarding users' attitudes need to be satisfied locally, while a strict privacy guarantee for the global model is also required centrally. Personalized Local Differential Privacy (PLDP) is suitable for preserving users' varying local privacy, yet only provides a central privacy guarantee equivalent to the worst-case local privacy level. Thus, achieving strong central privacy as well as personalized local privacy with a utility-promising model is a challenging problem. In this work, a general framework (APES) is built up to strengthen model privacy under personalized local privacy by leveraging the privacy amplification effect of the shuffle model. To tighten the privacy bound, we quantify the heterogeneous contributions to the central privacy user by user. The contributions are characterized by the ability to generate "echos" from the perturbation of each user, which is carefully measured by the proposed methods Neighbor Divergence and Clip-Laplace Mechanism. Furthermore, we propose a refined framework (S-APES) with the post-sparsification technique to reduce privacy loss in high-dimension scenarios. To the best of our knowledge, the impact of shuffling on personalized local privacy is considered for the first time. We provide a strong privacy amplification effect, and the bound is tighter than the baseline result based on existing methods for uniform local privacy. Experiments demonstrate that our frameworks ensure comparable or higher accuracy for the global model.

Submitted 26 May, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

arXiv:2303.12787 [pdf, other]

EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

Authors: Hansheng Chen, Wei Tian, Pichao Wang, Fan Wang, Lu Xiong, Hao Li

Abstract: Locating 3D objects from a single RGB image via Perspective-n-Point (PnP) is a long-standing problem in computer vision. Driven by end-to-end deep learning, recent studies suggest interpreting PnP as a differentiable layer, allowing for partial learning of 2D-3D point correspondences by backpropagating the gradients of pose loss. Yet, learning the entire correspondences from scratch is highly challenging, particularly for ambiguous pose solutions, where the globally optimal pose is theoretically non-differentiable w.r.t. the points. In this paper, we propose the EPro-PnP, a probabilistic PnP layer for general end-to-end pose estimation, which outputs a distribution of pose with differentiable probability density on the SE(3) manifold. The 2D-3D coordinates and corresponding weights are treated as intermediate variables learned by minimizing the KL divergence between the predicted and target pose distribution. The underlying principle generalizes previous approaches, and resembles the attention mechanism. EPro-PnP can enhance existing correspondence networks, closing the gap between PnP-based method and the task-specific leaders on the LineMOD 6DoF pose estimation benchmark. Furthermore, EPro-PnP helps to explore new possibilities of network design, as we demonstrate a novel deformable correspondence network with the state-of-the-art pose accuracy on the nuScenes 3D object detection benchmark. Our code is available at https://github.com/tjiiv-cprg/EPro-PnP-v2.

Submitted 17 December, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: Code available at https://github.com/tjiiv-cprg/EPro-PnP-v2. Revised and fixed typos. arXiv admin note: substantial text overlap with arXiv:2203.13254

arXiv:2303.12357 [pdf, other]

Wasserstein Adversarial Examples on Univariant Time Series Data

Authors: Wenjie Wang, Li Xiong, Jian Lou

Abstract: Adversarial examples are crafted by adding indistinguishable perturbations to normal examples in order to fool a well-trained deep learning model into misclassifying them. In the context of computer vision, this notion of indistinguishability is typically bounded by $L_{\infty}$ or other norms. However, these norms are not appropriate for measuring indistinguishability for time series data. In this work, we propose adversarial examples in the Wasserstein space for time series data for the first time and utilize Wasserstein distance to bound the perturbation between normal examples and adversarial examples. We introduce Wasserstein projected gradient descent (WPGD), an adversarial attack method for perturbing univariant time series data. We leverage the closed-form solution of Wasserstein distance in the 1D space to calculate the projection step of WPGD efficiently with the gradient descent method. We further propose a two-step projection so that the search of adversarial examples in the Wasserstein space is guided and constrained by Euclidean norms to yield more effective and imperceptible perturbations. We empirically evaluate the proposed attack on several time series datasets in the healthcare domain. Extensive results demonstrate that the Wasserstein attack is powerful and can successfully attack most of the target classifiers with a high attack success rate. To better study the nature of Wasserstein adversarial examples, we evaluate a strong defense mechanism named Wasserstein smoothing for potential certified robustness defense. Although the defense can achieve some accuracy gain, it still has limitations in many cases and leaves space for developing a stronger certified robustness method against Wasserstein adversarial examples on univariant time series data.

Submitted 22 March, 2023; originally announced March 2023.

arXiv:2303.05665 [pdf, other]

A Systematic Survey of Control Techniques and Applications in Connected and Automated Vehicles

Authors: Wei Liu, Min Hua, Zhiyun Deng, Zonglin Meng, Yanjun Huang, Chuan Hu, Shunhui Song, Letian Gao, Changsheng Liu, Bin Shuai, Amir Khajepour, Lu Xiong, Xin Xia

Abstract: Vehicle control is one of the most critical challenges in autonomous vehicles (AVs) and connected and automated vehicles (CAVs), and it is paramount in vehicle safety, passenger comfort, transportation efficiency, and energy saving. This survey attempts to provide a comprehensive and thorough overview of the current state of vehicle control technology, focusing on the evolution from vehicle state estimation and trajectory tracking control in AVs at the microscopic level to collaborative control in CAVs at the macroscopic level. First, this review starts with vehicle key state estimation, specifically vehicle sideslip angle, which is the most pivotal state for vehicle trajectory control, to discuss representative approaches. Then, we present symbolic vehicle trajectory tracking control approaches for AVs. On top of that, we further review the collaborative control frameworks for CAVs and corresponding applications. Finally, this survey concludes with a discussion of future research directions and the challenges. This survey aims to provide a contextualized and in-depth look at the state of the art in vehicle control for AVs and CAVs, identifying critical areas of focus and pointing out the potential areas for further exploration.

Submitted 11 April, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

arXiv:2302.06225 [pdf, other]

doi 10.3847/2041-8213/acc8d0

GRANDMA and HXMT Observations of GRB 221009A -- the Standard-Luminosity Afterglow of a Hyper-Luminous Gamma-Ray Burst

Authors: D. A. Kann, S. Agayeva, V. Aivazyan, S. Alishov, C. M. Andrade, S. Antier, A. Baransky, P. Bendjoya, Z. Benkhaldoun, S. Beradze, D. Berezin, M. Boër, E. Broens, S. Brunier, M. Bulla, O. Burkhonov, E. Burns, Y. Chen, Y. P. Chen, M. Conti, M. W. Coughlin, W. W. Cui, F. Daigne, B. Delaveau, H. A. R. Devillepoix , et al. (91 additional authors not shown)

Abstract: GRB 221009A is the brightest Gamma-Ray Burst (GRB) detected in more than 50 years of study. In this paper, we present observations in the X-ray and optical domains after the GRB obtained by the GRANDMA Collaboration (which includes observations from more than 30 professional and amateur telescopes) and the Insight-HXMT Collaboration. We study the optical afterglow with empirical fitting from GRANDMA+HXMT data, augmented with data from the literature up to 60 days. We then model numerically, using a Bayesian approach, the GRANDMA and HXMT-LE afterglow observations, that we augment with Swift-XRT and additional optical/NIR observations reported in the literature. We find that the GRB afterglow, extinguished by a large dust column, is most likely behind a combination of a large Milky-Way dust column combined with moderate low-metallicity dust in the host galaxy. Using the GRANDMA+HXMT-LE+XRT dataset, we find that the simplest model, where the observed afterglow is produced by synchrotron radiation at the forward external shock during the deceleration of a top-hat relativistic jet by a uniform medium, fits the multi-wavelength observations only moderately well, with a tension between the observed temporal and spectral evolution. This tension is confirmed when using the extended dataset. We find that the consideration of a jet structure (Gaussian or power-law), the inclusion of synchrotron self-Compton emission, or the presence of an underlying supernova do not improve the predictions, showing that the modelling of GRB 221009A will require going beyond the most standard GRB afterglow model. Placed in the global context of GRB optical afterglows, we find the afterglow of GRB 221009A is luminous but not extraordinarily so, highlighting that some aspects of this GRB do not deviate from the global known sample despite its extreme energetics and the peculiar afterglow evolution.

Submitted 27 March, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

Comments: Accepted to ApJL for the special issue, 37 pages, 23 pages main text, 6 tables, 13 figures

Showing 1–50 of 238 results for author: Xiong, L