-
Absence of Weyl nodes in EuCd$_2$As$_2$ revealed by the carrier density dependence of the anomalous Hall effect
Authors:
Yue Shi,
Zhaoyu Liu,
Logan A. Burnett,
Seokhyeong Lee,
Chaowei Hu,
Qianni Jiang,
Jiaqi Cai,
Xiaodong Xu,
Mo Li,
Cheng-Chien Chen,
Jiun-Haw Chu
Abstract:
The antiferromagnetic layered compound EuCd$_2$As$_2$ is widely considered a leading candidate for an ideal Weyl semimetal, featuring a single pair of Weyl nodes in its field-induced ferromagnetic (FM) state. Nevertheless, this view has recently been challenged by an optical spectroscopy study, which suggests that it is a magnetic semiconductor. In this study, we have successfully synthesized highly insulating EuCd$_2$As$_2$ crystals with carrier density reaching as low as $2\times 10^{15}$ $\text{cm}^{-3}$. The magneto-transport measurements revealed a progressive decrease of the anomalous Hall conductivity (AHC) by several orders of magnitude as the carrier density decreases. This behavior contradicts what is expected from the intrinsic AHC generated by the Weyl points, which is independent of carrier density as the Fermi level approaches the charge neutrality point. In contrast, the scaling relationship between AHC and longitudinal conductivity aligns with the characteristics of variable range hopping insulators. Our results suggest that EuCd$_2$As$_2$ is a magnetic semiconductor rather than a topological Weyl semimetal.
Submitted 27 February, 2024; v1 submitted 29 December, 2023;
originally announced January 2024.
-
Bayesian Recursive Information Optical Imaging: A Ghost Imaging Scheme Based on Bayesian Filtering
Authors:
Long-Kun Du,
Chenyu Hu,
Shuang Liu,
Chen** Deng,
Chaoran Wang,
Zunwang Bo,
Mingliang Chen,
Wei-Tao Liu,
Shensheng Han
Abstract:
Computational imaging (CI) has been attracting a lot of interest in recent years for its superiority over traditional imaging in various applications. In CI systems, information is generally acquired in an encoded form and subsequently decoded via processing algorithms, which is quite in line with the information transmission mode of modern communication, and leads to emerging studies from the viewpoint of information optical imaging. Currently, one of the most important issues to be theoretically studied for CI is to quantitatively evaluate the fundamental ability of information acquisition, which is essential for both objective performance assessment and efficient design of imaging systems. In this paper, by incorporating the Bayesian filtering paradigm, we propose a framework for CI that enables quantitative evaluation and design of the imaging system, and demonstrate it based on ghost imaging. Specifically, this framework can provide a quantitative evaluation of the acquired information through Fisher information and the Cramér-Rao Lower Bound (CRLB), and the intrinsic performance of the imaging system can be accessed in real time. With simulation and experiments, the framework is validated and compared with existing linear unbiased algorithms. In particular, the image retrieval can reach the CRLB. Furthermore, information-driven adaptive design for optimizing the information acquisition procedure is also achieved. By enabling quantitative description and efficient design, the proposed framework is expected to promote the practical applications of CI techniques.
Submitted 29 December, 2023;
originally announced January 2024.
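As a rough illustration of the recursive Bayesian idea in the abstract above, the sketch below tracks a single unknown intensity x probed by known pattern values a_k, with bucket readings y_k = a_k * x + Gaussian noise. This scalar toy model, the function names, and the numbers are assumptions made for illustration; the paper's actual framework is not reproduced here. With a nearly flat prior, the posterior variance approaches the Cramér-Rao lower bound noise_var / sum(a_k^2).

```python
def recursive_update(mean, var, a, y, noise_var):
    """One Bayesian (Kalman-style) update for y = a * x + noise.

    mean, var : current Gaussian belief about the unknown x
    a         : known pattern value for this measurement (hypothetical)
    y         : measured bucket signal
    noise_var : variance of the additive Gaussian noise
    """
    innovation_var = a * a * var + noise_var
    gain = var * a / innovation_var
    new_mean = mean + gain * (y - a * mean)
    # Algebraically equal to (1 - gain * a) * var, but numerically safer.
    new_var = var * noise_var / innovation_var
    return new_mean, new_var


def crlb(patterns, noise_var):
    """Cramér-Rao lower bound on the variance of an unbiased estimate of x."""
    fisher_information = sum(a * a for a in patterns) / noise_var
    return 1.0 / fisher_information


if __name__ == "__main__":
    patterns = [0.5, 1.0, 2.0]      # assumed pattern values
    x_true, noise_var = 3.0, 0.1    # assumed ground truth and noise level
    mean, var = 0.0, 1e9            # nearly flat prior
    for a in patterns:
        y = a * x_true              # noise-free reading, for clarity
        mean, var = recursive_update(mean, var, a, y, noise_var)
        print(f"a={a}: mean={mean:.6f}, var={var:.6f}")
    print("CRLB:", crlb(patterns, noise_var))
```

Because the posterior variance is available after every measurement, a scheme of this kind can report how much information has been acquired so far, which is the property the abstract attributes to the framework.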
-
Drinfeld Module and Weil pairing over Dedekind domain of class number two
Authors:
Chuangqiang Hu,
Xiao-Min Huang
Abstract:
The primary objective of this paper is to derive explicit formulas for rank one and rank two Drinfeld modules over a specific domain denoted by A. This domain corresponds to the projective line associated with an infinite place of degree two. To achieve these goals, we construct a pair of standard Drinfeld modules whose coefficients are in the Hilbert class field of A. We demonstrate that the period lattice of the exponential functions corresponding to both modules behaves similarly to the period lattice of the Carlitz module, the standard rank one Drinfeld module defined over the rational function field. Moreover, we employ Anderson's t-motive to obtain the complete family of rank two Drinfeld modules. This family is parameterized by the invariant J = λ^{q^2+1}, which effectively serves as the counterpart of the j-invariant for elliptic curves. Building upon the concepts introduced by van der Heiden, particularly with regard to rank two Drinfeld modules, we are able to reformulate the Weil pairing of Drinfeld modules of any rank using a specialized polynomial in multiple variables known as the Weil operator. As an illustrative example, we provide a detailed examination of a more explicit formula for the Weil pairing and the Weil operator of rank two Drinfeld modules over the domain A.
Submitted 26 June, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
Heterogeneous Encoders Scaling In The Transformer For Neural Machine Translation
Authors:
Jia Cheng Hu,
Roberto Cavicchioli,
Giulia Berardinelli,
Alessandro Capotondi
Abstract:
Although the Transformer is currently the best-performing architecture in the homogeneous configuration (self-attention only) in Neural Machine Translation, many State-of-the-Art models in Natural Language Processing are made of a combination of different Deep Learning approaches. However, these models often focus on combining a couple of techniques only and it is unclear why some methods are chosen over others. In this work, we investigate the effectiveness of integrating an increasing number of heterogeneous methods. Based on a simple combination strategy and performance-driven synergy criteria, we designed the Multi-Encoder Transformer, which consists of up to five diverse encoders. Results showcased that our approach can improve the quality of the translation across a variety of languages and dataset sizes and it is particularly effective in low-resource languages where we observed a maximum increase of 7.16 BLEU compared to the single-encoder model.
Submitted 25 December, 2023;
originally announced December 2023.
-
Conditional Variational Autoencoder for Sign Language Translation with Cross-Modal Alignment
Authors:
Rui Zhao,
Liang Zhang,
Biao Fu,
Cong Hu,
**song Su,
Yidong Chen
Abstract:
Sign language translation (SLT) aims to convert continuous sign language videos into textual sentences. As a typical multi-modal task, there exists an inherent modality gap between sign language videos and spoken language text, which makes the cross-modal alignment between visual and textual modalities crucial. However, previous studies tend to rely on an intermediate sign gloss representation to help alleviate the cross-modal problem thereby neglecting the alignment across modalities that may lead to compromised results. To address this issue, we propose a novel framework based on Conditional Variational autoencoder for SLT (CV-SLT) that facilitates direct and sufficient cross-modal alignment between sign language videos and spoken language text. Specifically, our CV-SLT consists of two paths with two Kullback-Leibler (KL) divergences to regularize the outputs of the encoder and decoder, respectively. In the prior path, the model solely relies on visual information to predict the target text; whereas in the posterior path, it simultaneously encodes visual information and textual knowledge to reconstruct the target text. The first KL divergence optimizes the conditional variational autoencoder and regularizes the encoder outputs, while the second KL divergence performs a self-distillation from the posterior path to the prior path, ensuring the consistency of decoder outputs. We further enhance the integration of textual information to the posterior path by employing a shared Attention Residual Gaussian Distribution (ARGD), which considers the textual information in the posterior path as a residual component relative to the prior path. Extensive experiments conducted on public datasets (PHOENIX14T and CSL-daily) demonstrate the effectiveness of our framework, achieving new state-of-the-art results while significantly alleviating the cross-modal representation discrepancy.
Submitted 25 December, 2023;
originally announced December 2023.
-
Less or More From Teacher: Exploiting Trilateral Geometry For Knowledge Distillation
Authors:
Chengming Hu,
Haolun Wu,
Xuan Li,
Chen Ma,
Xi Chen,
Jun Yan,
Boyu Wang,
Xue Liu
Abstract:
Knowledge distillation aims to train a compact student network using soft supervision from a larger teacher network and hard supervision from ground truths. However, determining an optimal knowledge fusion ratio that balances these supervisory signals remains challenging. Prior methods generally resort to a constant or heuristic-based fusion ratio, which often falls short of a proper balance. In this study, we introduce a novel adaptive method for learning a sample-wise knowledge fusion ratio, exploiting both the correctness of teacher and student, as well as how well the student mimics the teacher on each sample. Our method naturally leads to the intra-sample trilateral geometric relations among the student prediction ($S$), teacher prediction ($T$), and ground truth ($G$). To counterbalance the impact of outliers, we further extend to the inter-sample relations, incorporating the teacher's global average prediction $\bar{T}$ for samples within the same class. A simple neural network then learns the implicit mapping from the intra- and inter-sample relations to an adaptive, sample-wise knowledge fusion ratio in a bilevel-optimization manner. Our approach provides a simple, practical, and adaptable solution for knowledge distillation that can be employed across various architectures and model sizes. Extensive experiments demonstrate consistent improvements over other loss re-weighting methods on image classification, attack detection, and click-through rate prediction.
Submitted 18 February, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
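The balance between soft and hard supervision described in the abstract above can be sketched as a weighted sum of two losses. In this minimal example the per-sample ratio alpha is simply passed in; the paper learns it with a small network from the relations among S, T, and G, and that part is not shown. The function names, and the convention that alpha weights the teacher term, are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Hard-label loss: negative log-probability of the ground-truth class."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def fused_kd_loss(student_logits, teacher_logits, label, alpha, temperature=1.0):
    """Blend teacher (soft) and ground-truth (hard) supervision for one sample.

    alpha is a hypothetical per-sample fusion ratio in [0, 1]:
    alpha = 1 uses only the teacher term, alpha = 0 only the ground truth.
    """
    student_soft = softmax(student_logits, temperature)
    teacher_soft = softmax(teacher_logits, temperature)
    soft_loss = kl_divergence(teacher_soft, student_soft)
    hard_loss = cross_entropy(softmax(student_logits), label)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss


if __name__ == "__main__":
    student = [1.0, 2.0, 0.5]   # assumed student logits (S)
    teacher = [0.5, 3.0, 0.2]   # assumed teacher logits (T)
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, fused_kd_loss(student, teacher, label=1, alpha=alpha))
```

A constant alpha shared by all samples corresponds to the fixed fusion ratio that the abstract describes as the common prior practice.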
-
Tailoring Interlayer Chiral Exchange by Azimuthal Symmetry Engineering
Authors:
Yu-Hao Huang,
Jui-Hsu Han,
Wei-Bang Liao,
Chen-Yu Hu,
Yan-Ting Liu,
Chi-Feng Pai
Abstract:
Recent theoretical and experimental studies of the interlayer Dzyaloshinskii-Moriya interaction (DMI) have sparked great interest in its implementation into practical magnetic random-access memory (MRAM) devices, due to its capability to mediate long-range chiral spin textures. So far, experimental reports have focused on the observation of interlayer DMI, leaving the development of strategies to control interlayer DMI's magnitude unaddressed. Here, we introduce an azimuthal symmetry engineering protocol capable of additive/subtractive tuning of interlayer DMI through the control of wedge deposition of separate layers, and demonstrate its capability to mediate field-free spin-orbit torque (SOT) magnetization switching in both orthogonally magnetized and synthetic antiferromagnetically coupled systems. Furthermore, we show that the spatial inhomogeneity brought about by wedge deposition can be suppressed by specific azimuthal engineering design, ideal for practical implementation. Our findings provide guidelines for effective manipulations of interlayer DMI strength, beneficial for future design of SOT-MRAM or other spintronic devices utilizing interlayer DMI.
Submitted 25 December, 2023; v1 submitted 22 December, 2023;
originally announced December 2023.
-
Adversarial Infrared Curves: An Attack on Infrared Pedestrian Detectors in the Physical World
Authors:
Chengyin Hu,
Weiwen Shi
Abstract:
Deep neural network security is a persistent concern, with considerable research on visible light physical attacks but limited exploration in the infrared domain. Existing approaches, like white-box infrared attacks using bulb boards and QR suits, lack realism and stealthiness. Meanwhile, black-box methods with cold and hot patches often struggle to ensure robustness. To bridge these gaps, we propose Adversarial Infrared Curves (AdvIC). Using Particle Swarm Optimization, we optimize two Bezier curves and employ cold patches in the physical realm to introduce perturbations, creating infrared curve patterns for physical sample generation. Our extensive experiments confirm AdvIC's effectiveness, achieving 94.8\% and 67.2\% attack success rates for digital and physical attacks, respectively. Stealthiness is demonstrated through a comparative analysis, and robustness assessments reveal AdvIC's superiority over baseline methods. When deployed against diverse advanced detectors, AdvIC achieves an average attack success rate of 76.8\%, emphasizing its robust nature. We explore adversarial defense strategies against AdvIC and examine its impact under various defense mechanisms. Given AdvIC's substantial security implications for real-world vision-based applications, urgent attention and mitigation efforts are warranted.
Submitted 21 December, 2023;
originally announced December 2023.
-
Poincaré Differential Privacy for Hierarchy-Aware Graph Embedding
Authors:
Yuecen Wei,
Haonan Yuan,
Xingcheng Fu,
Qingyun Sun,
Hao Peng,
Xianxian Li,
Chunming Hu
Abstract:
Hierarchy is an important and commonly observed topological property in real-world graphs that indicates the relationships between supervisors and subordinates or the organizational behavior of human groups. As hierarchy is introduced as a new inductive bias into Graph Neural Networks (GNNs) in various tasks, it implies latent topological relations that attackers can use to improve their inference attack performance, leading to serious privacy leakage issues. In addition, existing privacy-preserving frameworks suffer from reduced protection ability in hierarchical propagation due to the deficiency of adaptive upper-bound estimation of the hierarchical perturbation boundary. It is of great urgency to effectively leverage the hierarchical property of data while satisfying privacy guarantees. To solve the problem, we propose the Poincaré Differential Privacy framework, named PoinDP, to protect the hierarchy-aware graph embedding based on hyperbolic geometry. Specifically, PoinDP first learns the hierarchy weights for each entity based on the Poincaré model in hyperbolic space. Then, the Personalized Hierarchy-aware Sensitivity is designed to measure the sensitivity of the hierarchical structure and adaptively allocate the privacy protection strength. Besides, the Hyperbolic Gaussian Mechanism (HGM) is proposed to extend the Gaussian mechanism in Euclidean space to hyperbolic space to realize random perturbations that satisfy differential privacy under the hyperbolic space metric. Extensive experimental results on five real-world datasets demonstrate the proposed PoinDP's advantages of effective privacy protection while maintaining good performance on the node classification task.
Submitted 29 February, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
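For reference, the Euclidean Gaussian mechanism that the abstract above says HGM extends can be sketched as follows. This uses the classical calibration sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon, which holds for 0 < epsilon < 1. It is not the hyperbolic mechanism or the personalized sensitivity proposed in the paper, and the function names and example values are hypothetical.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical (epsilon, delta) Gaussian mechanism.

    sensitivity is the L2 sensitivity of the released quantity.
    The calibration used here is only valid for 0 < epsilon < 1.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("classical calibration requires 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def perturb(embedding, sensitivity, epsilon, delta, rng=None):
    """Add independent Gaussian noise to every coordinate of an embedding."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return [x + rng.gauss(0.0, sigma) for x in embedding]


if __name__ == "__main__":
    embedding = [0.12, -0.40, 0.33]   # assumed Euclidean node embedding
    noisy = perturb(embedding, sensitivity=1.0, epsilon=0.5, delta=1e-5,
                    rng=random.Random(0))
    print(noisy)
```

The noise scale grows linearly with the sensitivity, which is why a per-entity sensitivity such as the one described in the abstract would change how much perturbation each node receives.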
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Experimental 3D super-localization with Laguerre-Gaussian modes
Authors:
Chenyu Hu,
Liang Xu,
Ben Wang,
Zhiwen Li,
Yipeng Zhang,
Yong Zhang,
Lijian Zhang
Abstract:
Improving three-dimensional (3D) localization precision is of paramount importance for super-resolution imaging. By properly engineering the point spread function (PSF), such as utilizing Laguerre-Gaussian (LG) modes and their superposition, the ultimate limits of 3D localization precision can be enhanced. However, achieving these limits is challenging, as it often involves complicated detection strategies and practical limitations. In this work, we rigorously derive the ultimate 3D localization limits of LG modes and their superposition, specifically rotation modes, in the multi-parameter estimation framework. Our findings reveal that a significant portion of the information required for achieving 3D super-localization of LG modes can be obtained through feasible intensity detection. Moreover, the 3D ultimate precision can be achieved when the azimuthal index $l$ is zero. To provide a proof-of-principle demonstration, we develop an iterative maximum likelihood estimation (MLE) algorithm that converges to the 3D position of a point source, considering the pixelation and detector noise. The experimental implementation exhibits an improvement of up to two-fold in lateral localization precision and up to twenty-fold in axial localization precision when using LG modes compared to Gaussian mode. We also showcase the superior axial localization capability of the rotation mode within the near-focus region, effectively overcoming the limitations encountered by single LG modes. Notably, in the presence of realistic aberration, the algorithm robustly achieves the Cramér-Rao lower bound. Our findings provide valuable insights for evaluating and optimizing the achievable 3D localization precision, which will facilitate the advancements in super-resolution microscopy.
Submitted 18 December, 2023;
originally announced December 2023.
-
Optical Ranging Using Coherent Kerr Soliton Dual-microcombs with Extended Ambiguity Distance
Authors:
Yuechen Yang,
Yang Shen,
Kailu Zhou,
Chenhua Hu,
Yuanzhuo Ding,
Tinghao Jiang,
Wei Li,
Yudong Li,
Liangsen Feng,
Tengfei Wu,
Guangqiang He
Abstract:
Optical ranging is a key technology in metrology. Optical frequency combs have been shown to provide several advantages in light ranging, offering high precision with a high acquisition rate. However, the performance of traditional ranging systems based on microcombs is limited by the short ambiguity distance and non-real-time processing. Here, we show that a dual-comb ranging system using coherent Kerr soliton microcombs and an optical switch realizes an extended ambiguity distance and provides a route to real-time processing. The ambiguity distance is extended to 3.28 m from about 1.5 mm and the uncertainty reaches about $1.05\times 10^{-7}$, while the system is compatible with low-bandwidth detectors. Combining coherent microcomb ranging systems with a special FPGA could enable comb-based real-time ranging systems for several applications such as industrial process monitoring.
Submitted 15 December, 2023;
originally announced December 2023.
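The two ambiguity distances quoted in the abstract above can be related to pulse rates with the standard time-of-flight relation d = c / (2 n f), where f is the rate at which pulses repeat. The sketch below is only a back-of-the-envelope check: the 100 GHz repetition rate is an assumption chosen because it reproduces roughly 1.5 mm, and reading the 3.28 m figure as a Vernier range set by the repetition-rate difference of the two combs is likewise an assumption, not something stated in the abstract.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def ambiguity_distance(rate_hz, group_index=1.0):
    """Non-ambiguity range c / (2 * n * f) for pulses repeating at rate_hz."""
    return C / (2.0 * group_index * rate_hz)


def rate_for_ambiguity(distance_m, group_index=1.0):
    """Inverse relation: the rate that gives a chosen non-ambiguity range."""
    return C / (2.0 * group_index * distance_m)


if __name__ == "__main__":
    assumed_rep_rate = 100e9  # Hz, hypothetical microcomb repetition rate
    print("single-comb range (m):", ambiguity_distance(assumed_rep_rate))
    # Rate that would correspond to the extended 3.28 m range.
    print("rate for 3.28 m (Hz):", rate_for_ambiguity(3.28))
```

Under these assumptions a repetition rate near 100 GHz gives a range of about 1.5 mm, and a range of 3.28 m corresponds to a rate of roughly 45.7 MHz.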
-
DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving
Authors:
Wenhai Wang,
Jiangwei Xie,
ChuanYang Hu,
Haoming Zou,
Jianan Fan,
Wenwen Tong,
Yang Wen,
Silei Wu,
Hanming Deng,
Zhiqi Li,
Hao Tian,
Lewei Lu,
Xizhou Zhu,
Xiaogang Wang,
Yu Qiao,
Jifeng Dai
Abstract:
Large language models (LLMs) have opened up new possibilities for intelligent agents, endowing them with human-like thinking and cognitive abilities. In this work, we delve into the potential of LLMs in autonomous driving (AD). We introduce DriveMLM, an LLM-based AD framework that can perform closed-loop autonomous driving in realistic simulators. To this end, (1) we bridge the gap between the language decisions and the vehicle control commands by standardizing the decision states according to the off-the-shelf motion planning module. (2) We employ a multi-modal LLM (MLLM) to model the behavior planning module of a modular AD system, which uses driving rules, user commands, and inputs from various sensors (e.g., camera, lidar) as input, makes driving decisions, and provides explanations. This model can plug-and-play in existing AD systems such as Apollo for closed-loop driving. (3) We design an effective data engine to collect a dataset that includes decision state and corresponding explanation annotation for model training and evaluation. We conduct extensive experiments and show that our model achieves a driving score of 76.1 on the CARLA Town05 Long benchmark, and surpasses the Apollo baseline by 4.7 points under the same settings, demonstrating the effectiveness of our model. We hope this work can serve as a baseline for autonomous driving with LLMs. Code and models shall be released at https://github.com/OpenGVLab/DriveMLM.
Submitted 25 December, 2023; v1 submitted 14 December, 2023;
originally announced December 2023.
-
Integral Representations of Three Novel Multiple Zeta Functions for Barnes Type: A Probabilistic Approach
Authors:
Gwo Dong Lin,
Chin-Yuan Hu
Abstract:
Integral representation is one of the powerful tools for studying analytic continuation of zeta functions. It is known that the Hurwitz zeta function generalizes the famous Riemann zeta function, which plays an important role in analytic number theory. They both have several multiple versions in the literature. In this paper, we introduce three novel multiple zeta functions for Barnes type and study their integral representations through hyperbolic probability distributions given by Pitman and Yor (2003, Canad. J. Math., 55, 292-330). The analytic continuation properties of the three multiple zeta functions are also investigated. Surprisingly, two of them, unlike the previous results, can extend analytically to entire functions in the whole complex plane.
Submitted 2 January, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
Building Universal Foundation Models for Medical Image Analysis with Spatially Adaptive Networks
Authors:
Lingxiao Luo,
Xuanzhong Chen,
Bingda Tang,
Xinsheng Chen,
Rong Han,
Chengpeng Hu,
Yujiang Li,
Ting Chen
Abstract:
Recent advancements in foundation models, typically trained with self-supervised learning on large-scale and diverse datasets, have shown great potential in medical image analysis. However, due to the significant spatial heterogeneity of medical imaging data, current models must tailor specific structures for different datasets, making it challenging to leverage the abundant unlabeled data. In this work, we propose a universal foundation model for medical image analysis that processes images with heterogeneous spatial properties using a unified structure. To accomplish this, we propose spatially adaptive networks (SPAD-Nets), a family of networks that dynamically adjust the structures to adapt to the spatial properties of input images, to build such a universal foundation model. We pre-train a spatial adaptive visual tokenizer (SPAD-VT) and then a spatial adaptive Vision Transformer (SPAD-ViT) via masked image modeling (MIM) on 55 public medical image datasets. The pre-training data comprises over 9 million image slices, representing the largest, most comprehensive, and most diverse dataset to our knowledge for pre-training universal foundation models for medical image analysis. The experimental results on downstream medical image classification and segmentation tasks demonstrate the superior performance and label efficiency of our model. Our code is available at https://github.com/function2-llx/PUMIT.
Submitted 23 January, 2024; v1 submitted 12 December, 2023;
originally announced December 2023.
-
Two-stage optimized unified adversarial patch for attacking visible-infrared cross-modal detectors in the physical world
Authors:
Chengyin Hu,
Weiwen Shi
Abstract:
Currently, many studies have addressed security concerns related to visible and infrared detectors independently. In practical scenarios, utilizing cross-modal detectors for tasks proves more reliable than relying on single-modal detectors. Despite this, there is a lack of comprehensive security evaluations for cross-modal detectors. While existing research has explored the feasibility of attacks against cross-modal detectors, the implementation of a robust attack remains unaddressed. This work introduces the Two-stage Optimized Unified Adversarial Patch (TOUAP) designed for performing attacks against visible-infrared cross-modal detectors in real-world, black-box settings. The TOUAP employs a two-stage optimization process: firstly, PSO optimizes an irregular polygonal infrared patch to attack the infrared detector; secondly, the color QR code is optimized, and the shape information of the infrared patch from the first stage is used as a mask. The resulting irregular polygon visible modal patch executes an attack on the visible detector. Through extensive experiments conducted in both digital and physical environments, we validate the effectiveness and robustness of the proposed method. As the TOUAP surpasses baseline performance, we advocate for its widespread attention.
Submitted 4 December, 2023;
originally announced December 2023.
-
Wavelet-based Fourier Information Interaction with Frequency Diffusion Adjustment for Underwater Image Restoration
Authors:
Chen Zhao,
Weiling Cai,
Chenyu Dong,
Chengwei Hu
Abstract:
Underwater images are subject to intricate and diverse degradation, inevitably affecting the effectiveness of underwater visual tasks. However, most approaches primarily operate in the raw pixel space of images, which limits the exploration of the frequency characteristics of underwater images, leading to an inadequate utilization of deep models' representational capabilities in producing high-quality images. In this paper, we introduce a novel Underwater Image Enhancement (UIE) framework, named WF-Diff, designed to fully leverage the characteristics of frequency domain information and diffusion models. WF-Diff consists of two detachable networks: Wavelet-based Fourier information interaction network (WFI2-net) and Frequency Residual Diffusion Adjustment Module (FRDAM). With our full exploration of the frequency domain information, WFI2-net aims to achieve preliminary enhancement of frequency information in the wavelet space. Our proposed FRDAM can further refine the high- and low-frequency information of the initial enhanced images, which can be viewed as a plug-and-play universal module to adjust the detail of the underwater images. With the above techniques, our algorithm can show SOTA performance on real-world underwater image datasets, and achieves competitive performance in visual quality.
Submitted 28 November, 2023;
originally announced November 2023.
-
Many vertex-disjoint even cycles of fixed length in a graph
Authors:
Jianfeng Hou,
Caiyun Hu,
Heng Li,
Xizhi Liu,
Caihong Yang,
Yixiao Zhang
Abstract:
For every integer $k \ge 3$, we determine the extremal structure of an $n$-vertex graph with at most $t$ vertex-disjoint copies of $C_{2k}$ when $n$ is sufficiently large and $t$ lies in the interval $\left[\frac{\mathrm{ex}(n,C_{2k})}{\varepsilon n}, \varepsilon n\right]$, where $\varepsilon>0$ is a constant depending only on $k$. The question for $k = 2$ and $t = o\left(\frac{\mathrm{ex}(n,C_{2k})}{n}\right)$ was explored in prior work~\cite{HHLLYZ23a}, revealing different extremal structures in these cases. Our result can be viewed as an extension of the theorems by Egawa~\cite{Ega96} and Verstraëte~\cite{Ver03}, where the focus was on the existence of many vertex-disjoint cycles of the same length without any length constraints.
Submitted 25 November, 2023;
originally announced November 2023.
-
A Survey on Vulnerability of Federated Learning: A Learning Algorithm Perspective
Authors:
Xianghua Xie,
Chen Hu,
Hanchi Ren,
Jingjing Deng
Abstract:
This review paper takes a comprehensive look at malicious attacks against federated learning (FL), categorizing them from new perspectives on attack origins and targets, and providing insights into their methodology and impact. In this survey, we focus on threat models targeting the learning process of FL systems. Based on the source and target of the attack, we categorize existing threat models into four types: Data to Model (D2M), Model to Data (M2D), Model to Model (M2M), and composite attacks. For each attack type, we discuss the defense strategies proposed, highlighting their effectiveness, assumptions, and potential areas for improvement. Defense strategies have evolved from using a single metric to exclude malicious clients to employing a multifaceted approach that examines client models at various phases. Our research indicates that the to-learn data, the learning gradients, and the learned model at different stages can all be manipulated to initiate malicious attacks that range from undermining model performance to reconstructing private local data and inserting backdoors. We have also seen that these threats are becoming more insidious. While earlier studies typically amplified malicious gradients, recent endeavors subtly alter the least significant weights in local models to bypass defense measures. This literature review provides a holistic understanding of the current FL threat landscape and highlights the importance of developing robust, efficient, and privacy-preserving defenses to ensure the safe and trusted adoption of FL in real-world applications.
Submitted 27 November, 2023;
originally announced November 2023.
-
A new fuzzy multi-attribute group decision-making method based on TOPSIS and optimization models
Authors:
Qixiao Hu,
Shiquan Zhang,
Chaolang Hu,
Yuetong Liu
Abstract:
In this paper, a new method based on TOPSIS and optimization models is proposed for multi-attribute group decision-making in the environment of interval-valued intuitionistic fuzzy sets. Firstly, by minimizing the sum of differences between individual evaluations and the overall consistent evaluations of all experts, a new optimization model is established for determining expert weights. Secondly, based on the TOPSIS method, an improved closeness index for evaluating each alternative is obtained. Finally, the attribute weights are determined by establishing an optimization model with the goal of maximizing the closeness of each alternative, and they are substituted into the closeness index so that the alternatives can be ranked. Combining these steps, the complete fuzzy multi-attribute group decision-making algorithm is formulated, which can give full play to the advantages of subjective and objective weighting methods. Finally, the feasibility and effectiveness of the proposed method are verified by a real case study.
Submitted 27 November, 2023;
originally announced November 2023.
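As a rough illustration of the closeness index mentioned in the abstract above, the following is a minimal sketch of classical (crisp) TOPSIS only. It is not the authors' interval-valued intuitionistic fuzzy method, and the function name `topsis_closeness`, its parameters, and the sample numbers are hypothetical. It assumes every attribute column has at least one nonzero entry and that the alternatives are not all identical.

```python
import math


def topsis_closeness(matrix, weights, benefit):
    """Classical TOPSIS closeness index for each alternative (row).

    matrix  : list of rows, one per alternative, one column per attribute
    weights : attribute weights (assumed to be given, e.g. summing to 1)
    benefit : True where larger is better, False where smaller is better
    """
    n_alt, n_attr = len(matrix), len(weights)
    # Vector-normalize each attribute column, then apply its weight.
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(n_alt)))
             for j in range(n_attr)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(n_attr)]
         for i in range(n_alt)]
    # Positive-ideal and negative-ideal points, attribute by attribute.
    cols = list(zip(*v))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    # Closeness = distance to the worst point / (sum of both distances).
    closeness = []
    for row in v:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        closeness.append(d_worst / (d_best + d_worst))
    return closeness


# Hypothetical data: three alternatives, two benefit attributes, one cost attribute.
scores = topsis_closeness(
    [[7, 9, 300], [8, 7, 250], [9, 6, 400]],
    [0.4, 0.4, 0.2],
    [True, True, False],
)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

An alternative that is best on every attribute gets closeness 1 and one that is worst on every attribute gets 0; ranking by descending closeness gives the usual TOPSIS ordering.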
-
Toward a density Corrádi--Hajnal theorem for degenerate hypergraphs
Authors:
Jianfeng Hou,
Caiyun Hu,
Heng Li,
Xizhi Liu,
Caihong Yang,
Yixiao Zhang
Abstract:
Given an $r$-graph $F$ with $r \ge 2$, let $\mathrm{ex}(n, (t+1) F)$ denote the maximum number of edges in an $n$-vertex $r$-graph with at most $t$ pairwise vertex-disjoint copies of $F$. Extending several old results and complementing prior work [J. Hou, H. Li, X. Liu, L.-T. Yuan, and Y. Zhang. A step towards a general density Corrádi--Hajnal theorem. arXiv:2302.09849, 2023.] on nondegenerate hypergraphs, we initiate a systematic study on $\mathrm{ex}(n, (t+1) F)$ for degenerate hypergraphs $F$. For a broad class of degenerate hypergraphs $F$, we present near-optimal upper bounds for $\mathrm{ex}(n, (t+1) F)$ when $n$ is sufficiently large and $t$ lies in intervals $\left[0, \frac{\varepsilon \cdot \mathrm{ex}(n,F)}{n^{r-1}}\right]$, $\left[\frac{\mathrm{ex}(n,F)}{\varepsilon n^{r-1}}, \varepsilon n \right]$, and $\left[ (1-\varepsilon)\frac{n}{v(F)}, \frac{n}{v(F)} \right]$, where $\varepsilon > 0$ is a constant depending only on $F$. Our results reveal very different structures for extremal constructions across the three intervals, and we provide characterizations of extremal constructions within the first interval. Additionally, for graphs, we offer a characterization of extremal constructions within the second interval. Our proof for the first interval also applies to a special class of nondegenerate hypergraphs, including those with undetermined Turán densities, partially improving a result in [J. Hou, H. Li, X. Liu, L.-T. Yuan, and Y. Zhang. A step towards a general density Corrádi--Hajnal theorem. arXiv:2302.09849, 2023.]
Submitted 25 November, 2023;
originally announced November 2023.
-
A New Approach to the Determination of Expert Weights in Multi-attribute Group Decision Making
Authors:
Yuetong Liu,
Chaolang Hu,
Shiquan Zhang,
Qixiao Hu
Abstract:
This paper presents a new approach based on an optimization model to determine the weights of experts in multi-attribute group decision making. Firstly, by minimizing the sum of differences between individual evaluations and the overall consistent evaluations of all experts, a new optimization model is established for determining expert weights. Then, the existence and uniqueness of the solution are proved rigorously, and the sequential least squares quadratic programming algorithm is adopted to solve the optimization model. Finally, the reasonableness of the new approach is verified by numerical experiments: the smaller the difference between the individual evaluations and the overall consistent evaluations, the larger the weight assigned to the corresponding individual.
Submitted 21 November, 2023;
originally announced November 2023.
-
Enhancement of photon blockade via topological edge states
Authors:
Jun Li,
Can-ming Hu,
Yaping Yang
Abstract:
Quantum technologies, which promise exponentially superior performance over their classical counterparts for certain tasks, have consistently encountered challenges, including instability in quantum light sources, quantum decoherence, and vulnerability to losses, that topological photonics is well suited to address. Here, we theoretically put forth a quantum Su-Schrieffer-Heeger-type chain designed to greatly enhance the single-photon blockade (single-PB) effect with topological protection. With deliberately designed coupling strengths, the quantum-level lattices take the form of a one-dimensional array with a topological edge state in single-excitation space and a two-dimensional square breathing lattice with topological corner states in two-excitation space, resulting in enhanced single-photon excitation and the suppression of two-photon transitions. Therefore, the second-order correlation function is diminished by up to two orders of magnitude at the cavity resonance frequency, accompanied by stronger brightness. Furthermore, the PB effect is robust to local perturbations in cavity-qubit coupling and qubit frequency, benefiting from topological protection.
Submitted 19 November, 2023;
originally announced November 2023.
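For reference, the second-order correlation function that this abstract uses to quantify photon blockade is conventionally defined at zero time delay as follows. This is the standard quantum-optics definition, not an expression taken from the paper, and $\hat{a}$ is assumed here to denote the annihilation operator of the probed cavity mode:

$$
g^{(2)}(0) = \frac{\langle \hat{a}^{\dagger} \hat{a}^{\dagger} \hat{a} \hat{a} \rangle}{\langle \hat{a}^{\dagger} \hat{a} \rangle^{2}}
$$

Values $g^{(2)}(0) < 1$ indicate sub-Poissonian photon statistics, and $g^{(2)}(0) \to 0$ corresponds to strong single-photon blockade, so a reduction by two orders of magnitude means that two-photon events are much more strongly suppressed.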
-
Rethink Cross-Modal Fusion in Weakly-Supervised Audio-Visual Video Parsing
Authors:
Yating Xu,
Conghui Hu,
Gim Hee Lee
Abstract:
Existing works on weakly-supervised audio-visual video parsing adopt hybrid attention network (HAN) as the multi-modal embedding to capture the cross-modal context. It embeds the audio and visual modalities with a shared network, where the cross-attention is performed at the input. However, such an early fusion method highly entangles the two non-fully correlated modalities and leads to sub-optimal performance in detecting single-modality events. To deal with this problem, we propose the messenger-guided mid-fusion transformer to reduce the uncorrelated cross-modal context in the fusion. The messengers condense the full cross-modal context into a compact representation to only preserve useful cross-modal information. Furthermore, due to the fact that microphones capture audio events from all directions, while cameras only record visual events within a restricted field of view, there is a more frequent occurrence of unaligned cross-modal context from audio for visual event predictions. We thus propose cross-audio prediction consistency to suppress the impact of irrelevant audio information on visual event prediction. Experiments consistently illustrate the superior performance of our framework compared to existing state-of-the-art methods.
Submitted 14 November, 2023;
originally announced November 2023.
-
Deformable Groupwise Registration Using a Locally Low-Rank Dissimilarity Metric for Myocardial Strain Estimation from Cardiac Cine MRI Images
Authors:
Haiyang Chen,
Juan Gao,
Chenxi Hu
Abstract:
Objective: Cardiovascular magnetic resonance-feature tracking (CMR-FT) represents a group of methods for myocardial strain estimation from cardiac cine MRI images. Established CMR-FT methods are mainly based on optical flow or pairwise registration. However, these methods suffer from either inaccurate estimation of large motion or drift effect caused by accumulative tracking errors. In this work, we propose a deformable groupwise registration method using a locally low-rank (LLR) dissimilarity metric for CMR-FT. Methods: The proposed method (Groupwise-LLR) tracks the feature points by a groupwise registration-based two-step strategy. Unlike the globally low-rank (GLR) dissimilarity metric, the proposed LLR metric imposes low-rankness on local image patches rather than the whole image. We quantitatively compared Groupwise-LLR with the Farneback optical flow, a pairwise registration method, and a GLR-based groupwise registration method on simulated and in vivo datasets. Results: Results from the simulated dataset showed that Groupwise-LLR achieved more accurate tracking and strain estimation compared with the other methods. Results from the in vivo dataset showed that Groupwise-LLR achieved more accurate tracking and elimination of the drift effect in late-diastole. Inter-observer reproducibility of strain estimates was similar between all studied methods. Conclusion: The proposed method estimates myocardial strains more accurately due to the application of a groupwise registration-based tracking strategy and an LLR-based dissimilarity metric. Significance: The proposed CMR-FT method may facilitate more accurate estimation of myocardial strains, especially in diastole, for clinical assessments of cardiac dysfunction.
Submitted 13 November, 2023;
originally announced November 2023.
-
Sketch-based Video Object Segmentation: Benchmark and Analysis
Authors:
Ruolin Yang,
Da Li,
Conghui Hu,
Timothy Hospedales,
Honggang Zhang,
Yi-Zhe Song
Abstract:
Reference-based video object segmentation is an emerging topic which aims to segment the corresponding target object in each video frame referred by a given reference, such as a language expression or a photo mask. However, language expressions can sometimes be vague in conveying an intended concept and ambiguous when similar objects in one frame are hard to distinguish by language. Meanwhile, photo masks are costly to annotate and less practical to provide in a real application. This paper introduces a new task of sketch-based video object segmentation, an associated benchmark, and a strong baseline. Our benchmark includes three datasets, Sketch-DAVIS16, Sketch-DAVIS17 and Sketch-YouTube-VOS, which exploit human-drawn sketches as an informative yet low-cost reference for video object segmentation. We take advantage of STCN, a popular baseline of semi-supervised VOS task, and evaluate what the most effective design for incorporating a sketch reference is. Experimental results show sketch is more effective yet annotation-efficient than other references, such as photo masks, language and scribble.
Submitted 13 November, 2023;
originally announced November 2023.
-
Time-Optimal Control for High-Order Chain-of-Integrators Systems with Full State Constraints and Arbitrary Terminal States (Extended Version)
Authors:
Yunan Wang,
Chuxiong Hu,
Zeyang Li,
Shize Lin,
Suqin He,
Yu Zhu
Abstract:
Time-optimal control for high-order chain-of-integrators systems with full state constraints and arbitrarily given terminal states remains a challenging problem in the optimal control theory domain, yet to be resolved. To enhance further comprehension of the problem, this paper establishes a novel notation system and theoretical framework, providing the switching manifold for high-order problems in the form of switching laws. Through deriving properties of switching laws regarding signs and dimension, this paper proposes a definite condition for time-optimal control. Guided by the developed theory, a trajectory planning method named the manifold-intercept method (MIM) is developed. The proposed MIM can plan time-optimal jerk-limited trajectories with full state constraints, and can also plan near-optimal non-chattering higher-order trajectories with negligible extra motion time compared to optimal profiles. Numerical results indicate that the proposed MIM outperforms all baselines in computational time, computational accuracy, and trajectory quality by a large gap.
Submitted 28 March, 2024; v1 submitted 12 November, 2023;
originally announced November 2023.
-
Learning Predictive Safety Filter via Decomposition of Robust Invariant Set
Authors:
Zeyang Li,
Chuxiong Hu,
Weiye Zhao,
Changliu Liu
Abstract:
Ensuring safety of nonlinear systems under model uncertainty and external disturbances is crucial, especially for real-world control tasks. Predictive methods such as robust model predictive control (RMPC) require solving nonconvex optimization problems online, which leads to high computational burden and poor scalability. Reinforcement learning (RL) works well with complex systems, but pays the price of losing rigorous safety guarantee. This paper presents a theoretical framework that bridges the advantages of both RMPC and RL to synthesize safety filters for nonlinear systems with state- and action-dependent uncertainty. We decompose the robust invariant set (RIS) into two parts: a target set that aligns with terminal region design of RMPC, and a reach-avoid set that accounts for the rest of RIS. We propose a policy iteration approach for robust reach-avoid problems and establish its monotone convergence. This method sets the stage for an adversarial actor-critic deep RL algorithm, which simultaneously synthesizes a reach-avoid policy network, a disturbance policy network, and a reach-avoid value network. The learned reach-avoid policy network is utilized to generate nominal trajectories for online verification, which filters potentially unsafe actions that may drive the system into unsafe regions when worst-case disturbances are applied. We formulate a second-order cone programming (SOCP) approach for online verification using system level synthesis, which optimizes for the worst-case reach-avoid value of any possible trajectories. The proposed safety filter requires much lower computational complexity than RMPC and still enjoys persistent robust safety guarantee. The effectiveness of our method is illustrated through a numerical example.
Submitted 12 November, 2023;
originally announced November 2023.
-
Joint Angle and Delay Cramér-Rao Bound Optimization for ISAC
Authors:
Chao Hu,
Yuan Fang,
Ling Qiu
Abstract:
In this paper, we study a multi-input multi-output (MIMO) beamforming design in an integrated sensing and communication (ISAC) system, in which an ISAC base station (BS) communicates with multiple downlink users while the communication signals are simultaneously reused for sensing multiple targets. The sensing parameters of interest are the angle and delay information of the targets, which can be used to locate these targets. Under this consideration, we first derive the Cramér-Rao bound (CRB) for joint angle and delay estimation. Then, we optimize the transmit beamforming at the BS to minimize the CRB, subject to the communication rate requirement and the maximum transmit power constraint. In particular, we obtain the closed-form optimal solution in the single-target, single-user case, and in the multi-target, multi-user case we prove the sparsity of the optimal solution, which reduces the computational complexity of the optimization. The numerical results demonstrate that the optimized beamforming yields excellent positioning performance and effectively reduces the requirement for a large number of antennas at the BS.
Submitted 3 July, 2024; v1 submitted 9 November, 2023;
originally announced November 2023.
-
The High Energy X-ray Probe (HEX-P): Magnetars and Other Isolated Neutron Stars
Authors:
J. A. J. Alford,
G. A. Younes,
Z. Wadiasingh,
M. Abdelmaguid,
H. An,
M. Bachetti,
M. Baring,
A. Beloborodov,
A. Y. Chen,
T. Enoto,
J. A. García,
J. D. Gelfand,
E. V. Gotthelf,
A. Harding,
C. -P. Hu,
A. D. Jaodand,
V. Kaspi,
C. Kim,
C. Kouveliotou,
L. Kuiper,
K. Mori,
M. Nynka,
J. Park,
D. Stern,
J. Valverde
, et al. (1 additional authors not shown)
Abstract:
The hard X-ray emission from magnetars and other isolated neutron stars remains under-explored. An instrument with higher sensitivity to hard X-rays is critical to understanding the physics of neutron star magnetospheres and also the relationship between magnetars and Fast Radio Bursts (FRBs). High sensitivity to hard X-rays is required to determine the number of magnetars with hard X-ray tails, and to track transient non-thermal emission from these sources for years post-outburst. This sensitivity would also enable previously impossible studies of the faint non-thermal emission from middle-aged rotation-powered pulsars (RPPs), and detailed phase-resolved spectroscopic studies of younger, bright RPPs. The High Energy X-ray Probe (HEX-P) is a probe-class mission concept that will combine high spatial resolution X-ray imaging ($<5$ arcsec half-power diameter (HPD) at 0.2--25 keV) and broad spectral coverage (0.2--80 keV) with a sensitivity superior to current facilities (including XMM-Newton and NuSTAR). HEX-P has the required timing resolution to perform follow-up observations of sources identified by other facilities and positively identify candidate pulsating neutron stars. Here we discuss how HEX-P is ideally suited to address important questions about the physics of magnetars and other isolated neutron stars.
Submitted 8 November, 2023;
originally announced November 2023.
-
Spatial Correlation at the Boson Peak Frequency in Amorphous Materials
Authors:
X. Y. Li,
H. P. Zhang,
S. Lan,
D. L. Abernathy,
C. H. Hu,
L. R. Fan,
M. Z. Li,
X. -L. Wang
Abstract:
The Boson peak (BP), an excess of vibrational density of states, is ubiquitous in amorphous materials and is believed to hold the key to understanding the dynamics of glass and the glass transition. Previous studies have established an energy scale for the BP, which is ~1-10 meV or ~THz in frequency. However, so far, little is known about the momentum dependence or spatial correlation of the BP. Here, we report the observation of the BP in model Zr-Cu-Al metallic glasses over a wide range of momentum transfer, using inelastic neutron scattering, heat capacity, Raman scattering measurements, and molecular dynamics (MD) simulations. The BP energy is largely dispersionless; however, the BP intensity was found to scale with the static structure factor. Additional MD simulations with a generic Lennard-Jones potential confirmed the same. Based on these results, an analytical expression for the dynamic structure factor was formulated for the BP excitation. Further analysis of the simulated disordered structures suggests that the BP is related to local structure fluctuations (e.g., in shear strain). Our results offer insights into the nature of the BP and provide guidance for the development of theories of amorphous materials.
Submitted 6 November, 2023;
originally announced November 2023.
-
Recursive construction for expansions of tree Yang-Mills amplitudes from soft theorem
Authors:
Chang Hu,
Kang Zhou
Abstract:
In this paper, we have introduced a fundamentally different approach, based on a bottom-up methodology, to expand tree-level Yang-Mills (YM) amplitudes into Yang-Mills-scalar (YMS) amplitudes and Bi-adjoint-scalar (BAS) amplitudes. Our method relies solely on the intrinsic soft behavior of external gluons, eliminating the need for external aids such as Feynman rules or CHY rules. The recursive procedure consistently preserves explicit gauge invariance at every step, ultimately resulting in a manifest gauge-invariant outcome when the initial expression is already framed in a gauge-invariant manner. The resulting expansion can be directly analogized to the expansions of gravitational (GR) amplitudes using the double copy structure. When combined with the expansions of Einstein-Yang-Mills amplitudes obtained using the covariant double copy method from existing literature, the expansions presented in this note yield gauge-invariant BCJ numerators.
Submitted 6 November, 2023;
originally announced November 2023.
-
Software Engineering for OpenHarmony: A Research Roadmap
Authors:
Li Li,
Xiang Gao,
Hailong Sun,
Chunming Hu,
Xiaoyu Sun,
Haoyu Wang,
Haipeng Cai,
Ting Su,
Xiapu Luo,
Tegawendé F. Bissyandé,
Jacques Klein,
John Grundy,
Tao Xie,
Haibo Chen,
Huaimin Wang
Abstract:
Mobile software engineering has been a hot research topic for decades. Our fellow researchers have proposed various approaches (with over 7,000 publications for Android alone) in this field that essentially contributed to the great success of the current mobile ecosystem. Existing research efforts mainly focus on popular mobile platforms, namely Android and iOS. OpenHarmony, a newly open-sourced mobile platform, has rarely been considered, although it is the one requiring the most attention as OpenHarmony is expected to occupy one-third of the market in China (if not in the world). To fill the gap, we present to the mobile software engineering community a research roadmap for encouraging our fellow researchers to contribute promising approaches to OpenHarmony. Specifically, we start by presenting a literature review of mobile software engineering, attempting to understand what problems have been targeted by the mobile community and how they have been resolved. We then summarize the existing (limited) achievements of OpenHarmony and subsequently highlight the research gap between Android/iOS and OpenHarmony. This research gap eventually helps in forming the roadmap for conducting software engineering research for OpenHarmony.
Submitted 21 November, 2023; v1 submitted 2 November, 2023;
originally announced November 2023.
-
Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution
Authors:
Zhewei Huang,
Ailin Huang,
Xiaotao Hu,
Chen Hu,
Jun Xu,
Shuchang Zhou
Abstract:
The Space-Time Video Super-Resolution (STVSR) task aims to enhance the visual quality of videos by simultaneously performing video frame interpolation (VFI) and video super-resolution (VSR). However, facing the challenge of the additional temporal dimension and scale inconsistency, most existing STVSR methods are complex and inflexible in dynamically modeling different motion amplitudes. In this work, we find that choosing an appropriate processing scale achieves remarkable benefits in flow-based feature propagation. We propose a novel Scale-Adaptive Feature Aggregation (SAFA) network that adaptively selects sub-networks with different processing scales for individual samples. Experiments on four public STVSR benchmarks demonstrate that SAFA achieves state-of-the-art performance. Our SAFA network outperforms recent state-of-the-art methods such as TMNet and VideoINR by an average of over 0.5 dB in PSNR, while requiring fewer than half the parameters and only one-third of the computational cost.
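For readers unfamiliar with the metric, the PSNR figure quoted in this abstract can be related to mean squared error with a short sketch. This is a generic illustration of the standard PSNR definition, not code from the SAFA paper; the function names `psnr` and `mse_ratio_for_gain` and the assumption that pixel values lie in [0, 1] are hypothetical choices made here.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shape arrays.

    Assumes pixel values are scaled so that `max_value` is the peak signal.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return float(10.0 ** (-gain_db / 10.0))
```

Under this definition, a 0.5 dB gain corresponds to a reduction in mean squared error of roughly 11%.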
Submitted 27 November, 2023; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Does or did the supernova remnant Cassiopeia A operate as a PeVatron?
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
For decades, supernova remnants (SNRs) have been considered the prime sources of Galactic cosmic rays (CRs). But whether SNRs can accelerate CR protons to PeV energies and thus dominate the CR flux up to the knee is currently under intensive theoretical and phenomenological debate. A direct test of the ability of SNRs to operate as CR PeVatrons can be provided by ultrahigh-energy (UHE; $E_γ\geq 100$~TeV) $γ$-rays. In this context, the historical SNR Cassiopeia A (Cas A) is considered one of the most promising targets for UHE observations. This paper presents the observation of Cas A and its vicinity by the LHAASO KM2A detector. The exceptional sensitivity of LHAASO KM2A in the UHE band, combined with the young age of Cas A, enabled us to derive stringent model-independent limits on the energy budget of UHE protons and nuclei accelerated by Cas A at any epoch after the explosion. The results challenge the prevailing paradigm that Cas A-type SNRs are major suppliers of PeV CRs in the Milky Way.
Submitted 25 October, 2023;
originally announced October 2023.
-
Generative and Contrastive Paradigms Are Complementary for Graph Self-Supervised Learning
Authors:
Yuxiang Wang,
Xiao Yan,
Chuang Hu,
Fangcheng Fu,
Wentao Zhang,
Hao Wang,
Shuo Shang,
Jiawei Jiang
Abstract:
For graph self-supervised learning (GSSL), masked autoencoder (MAE) follows the generative paradigm and learns to reconstruct masked graph edges or node features. Contrastive Learning (CL) maximizes the similarity between augmented views of the same graph and is widely used for GSSL. However, MAE and CL are considered separately in existing works for GSSL. We observe that the MAE and CL paradigms are complementary and propose the graph contrastive masked autoencoder (GCMAE) framework to unify them. Specifically, by focusing on local edges or node features, MAE cannot capture global information of the graph and is sensitive to particular edges and features. On the contrary, CL excels in extracting global information because it considers the relation between graphs. As such, we equip GCMAE with an MAE branch and a CL branch, and the two branches share a common encoder, which allows the MAE branch to exploit the global information extracted by the CL branch. To force GCMAE to capture global graph structures, we train it to reconstruct the entire adjacency matrix instead of only the masked edges as in existing works. Moreover, a discrimination loss is proposed for feature reconstruction, which improves the disparity between node embeddings rather than reducing the reconstruction error to tackle the feature smoothing problem of MAE. We evaluate GCMAE on four popular graph tasks (i.e., node classification, node clustering, link prediction, and graph classification) and compare with 14 state-of-the-art baselines. The results show that GCMAE consistently provides good accuracy across these tasks, and the maximum accuracy improvement is up to 3.2% compared with the best-performing baseline.
Submitted 24 October, 2023;
originally announced October 2023.
-
Observation of the Antimatter Hypernucleus $^4_{\barΛ}\overline{\hbox{H}}$
Authors:
STAR Collaboration,
M. I. Abdulhamid,
B. E. Aboona,
J. Adam,
L. Adamczyk,
J. R. Adams,
I. Aggarwal,
M. M. Aggarwal,
Z. Ahammed,
E. C. Aschenauer,
S. Aslam,
J. Atchison,
V. Bairathi,
J. G. Ball Cap,
K. Barish,
R. Bellwied,
P. Bhagat,
A. Bhasin,
S. Bhatta,
S. R. Bhosale,
J. Bielcik,
J. Bielcikova,
J. D. Brandenburg,
C. Broodo,
X. Z. Cai
, et al. (342 additional authors not shown)
Abstract:
At the origin of the Universe, asymmetry between the amount of created matter and antimatter led to the matter-dominated Universe as we know it today. The origins of this asymmetry are not yet completely understood. High-energy nuclear collisions create conditions similar to the Universe microseconds after the Big Bang, with comparable amounts of matter and antimatter. Much of the created antimatter escapes the rapidly expanding fireball without annihilating, making such collisions an effective experimental tool to create heavy antimatter nuclear objects and study their properties, hoping to shed some light on existing questions on the asymmetry between matter and antimatter. Here we report the first observation of the antimatter hypernucleus \hbox{$^4_{\barΛ}\overline{\hbox{H}}$}, composed of a $\barΛ$, an antiproton and two antineutrons. The discovery was made through its two-body decay after production in ultrarelativistic heavy-ion collisions by the STAR experiment at the Relativistic Heavy Ion Collider. In total, 15.6 candidate \hbox{$^4_{\barΛ}\overline{\hbox{H}}$} antimatter hypernuclei are obtained with an estimated background count of 6.4. The lifetimes of the antihypernuclei \hbox{$^3_{\barΛ}\overline{\hbox{H}}$} and \hbox{$^4_{\barΛ}\overline{\hbox{H}}$} are measured and compared with the lifetimes of their corresponding hypernuclei, testing the symmetry between matter and antimatter. Various production yield ratios among (anti)hypernuclei and (anti)nuclei are also measured and compared with theoretical model predictions, shedding light on their production mechanisms.
Submitted 8 June, 2024; v1 submitted 19 October, 2023;
originally announced October 2023.
-
Predict the Future from the Past? On the Temporal Data Distribution Shift in Financial Sentiment Classifications
Authors:
Yue Guo,
Chenxi Hu,
Yi Yang
Abstract:
Temporal data distribution shift is prevalent in the financial text. How can a financial sentiment analysis system be trained in a volatile market environment that can accurately infer sentiment and be robust to temporal data distribution shifts? In this paper, we conduct an empirical study on the financial sentiment analysis system under temporal data distribution shifts using a real-world financial social media dataset that spans three years. We find that the fine-tuned models suffer from general performance degradation in the presence of temporal distribution shifts. Furthermore, motivated by the unique temporal nature of the financial text, we propose a novel method that combines out-of-distribution detection with time series modeling for temporal financial sentiment analysis. Experimental results show that the proposed method enhances the model's capability to adapt to evolving temporal shifts in a volatile financial market.
Submitted 19 October, 2023;
originally announced October 2023.
-
Architectural Implications of GNN Aggregation Programming Abstractions
Authors:
Yingjie Qi,
Jianlei Yang,
Ao Zhou,
Tong Qiao,
Chunming Hu
Abstract:
Graph neural networks (GNNs) have gained significant popularity due to their powerful capability to extract useful representations from graph data. As the need for efficient GNN computation intensifies, a variety of programming abstractions designed for optimizing GNN Aggregation have emerged to facilitate acceleration. However, there is no comprehensive evaluation and analysis of existing abstractions, and thus no clear consensus on which approach is better. In this letter, we classify existing programming abstractions for GNN Aggregation by the dimension of data organization and propagation method. By constructing these abstractions on a state-of-the-art GNN library, we perform a thorough and detailed characterization study to compare their performance and efficiency, and provide several insights on future GNN acceleration based on our analysis.
Submitted 20 October, 2023; v1 submitted 18 October, 2023;
originally announced October 2023.
-
Engineering band structures and topological invariants by transformation optics
Authors:
Xianghong Kong,
Chuanjie Hu,
Xingsi Liu,
Chunqi Zheng,
Jianfeng Chen,
Huanyang Chen,
Cheng-Wei Qiu
Abstract:
By introducing the transformation optics method to periodic systems, we show the tunability of the band structures by comparing the results from original spaces and transformed spaces. Interestingly, we find that the topological invariant Chern number changes sign when the orientation of the Brillouin zone is flipped. The new platform we provide for engineering the band diagram and topological invariant might lead to the development of both transformation optics and photonic topological states.
Submitted 18 October, 2023;
originally announced October 2023.
-
Multimodal Federated Learning in Healthcare: a Review
Authors:
Jacob Thrasher,
Alina Devkota,
Prasiddha Siwakotai,
Rohit Chivukula,
Pranav Poudel,
Chaunbo Hu,
Binod Bhattarai,
Prashnna Gyawali
Abstract:
Recent advancements in multimodal machine learning have empowered the development of accurate and robust AI systems in the medical domain, especially within centralized database systems. Simultaneously, Federated Learning (FL) has progressed, providing a decentralized mechanism where data need not be consolidated, thereby enhancing the privacy and security of sensitive healthcare data. The integration of these two concepts supports the ongoing progress of multimodal learning in healthcare while ensuring the security and privacy of patient records within local data-holding agencies. This paper offers a concise overview of the significance of FL in healthcare and outlines the current state-of-the-art approaches to Multimodal Federated Learning (MMFL) within the healthcare domain. It comprehensively examines the existing challenges in the field, shedding light on the limitations of present models. Finally, the paper outlines potential directions for future advancements in the field, aiming to bridge the gap between cutting-edge AI technology and the imperative need for patient data privacy in healthcare applications.
Submitted 27 February, 2024; v1 submitted 14 October, 2023;
originally announced October 2023.
-
Very high energy gamma-ray emission beyond 10 TeV from GRB 221009A
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
A. Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
The highest energy gamma-rays from gamma-ray bursts (GRBs) have important implications for their radiation mechanism. Here we report for the first time the detection of gamma-rays up to 13 TeV from the brightest GRB 221009A by the Large High Altitude Air-shower Observatory (LHAASO). The LHAASO-KM2A detector registered more than 140 gamma-rays with energies above 3 TeV during 230$-$900 s after the trigger. The intrinsic energy spectrum of gamma-rays can be described by a power-law after correcting for extragalactic background light (EBL) absorption. Such a hard spectrum challenges the synchrotron self-Compton (SSC) scenario of relativistic electrons for the afterglow emission above several TeV. Observations of gamma-rays up to 13 TeV from a source with a measured redshift of z=0.151 hint at more transparency in intergalactic space than previously expected. Alternatively, one may invoke new physics such as Lorentz Invariance Violation (LIV) or an axion origin of very high energy (VHE) signals.
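The EBL correction mentioned in this abstract is conventionally written as an exponential attenuation of the intrinsic spectrum. As a sketch in generic notation (the symbols $F_{\mathrm{obs}}$, $F_{\mathrm{int}}$, $\tau$, $N_0$, $E_0$, and $\Gamma$ are assumptions introduced here for illustration and are not taken from the paper):

```latex
F_{\mathrm{obs}}(E) = F_{\mathrm{int}}(E)\, e^{-\tau(E, z)}, \qquad
F_{\mathrm{int}}(E) = N_0 \left(\frac{E}{E_0}\right)^{-\Gamma}
```

Here $\tau(E, z)$ is the EBL optical depth for a photon of energy $E$ from a source at redshift $z$. In this notation, "more transparency" would correspond to a smaller $\tau$ at the highest energies than the assumed EBL model gives.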
Submitted 22 November, 2023; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Risk-informed Resilience Planning of Transmission Systems Against Ice Storms
Authors:
Chenxi Hu,
Yujia Li,
Yunhe Hou
Abstract:
Ice storms, known for their severity and predictability, necessitate proactive resilience enhancement in power systems. Traditional approaches often overlook the endogenous uncertainties inherent in human decisions and underutilize predictive information like forecast accuracy and preparation time. To bridge these gaps, we propose a two-stage risk-informed decision-dependent resilience planning (RIDDRP) model for transmission systems against ice storms. The model leverages predictive information to optimize resource allocation, considering decision-dependent line failure uncertainties introduced by planning decisions and exogenous ice storm-related uncertainties. We adopt a dual-objective approach to balance economic efficiency and system resilience across both normal and emergency conditions. The first stage of the RIDDRP model makes line hardening decisions, as well as the optimal siting and sizing of energy storage. The second stage evaluates the risk-informed operation costs, considering both pre-event preparation and emergency operations. Case studies demonstrate the model's ability to leverage predictive information, leading to more judicious investment decisions and optimized utilization of dispatchable resources. We also quantify the impact of different properties of predictive information on resilience enhancement. The RIDDRP model provides grid operators and planners valuable insights for making risk-informed infrastructure investments and operational strategy decisions, thereby improving preparedness and response to future extreme weather events.
Submitted 22 January, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Bridging the Gap between Newton-Raphson Method and Regularized Policy Iteration
Authors:
Zeyang Li,
Chuxiong Hu,
Yunan Wang,
Guojian Zhan,
Jie Li,
Shengbo Eben Li
Abstract:
Regularization is one of the most important techniques in reinforcement learning algorithms. The well-known soft actor-critic algorithm is a special case of regularized policy iteration where the regularizer is chosen as Shannon entropy. Despite some empirical success of regularized policy iteration, its theoretical underpinnings remain unclear. This paper proves that regularized policy iteration is strictly equivalent to the standard Newton-Raphson method when the Bellman equation is smoothed with strongly convex functions. This equivalence lays the foundation for a unified analysis of both the global and local convergence behaviors of regularized policy iteration. We prove that regularized policy iteration has global linear convergence with the rate being $γ$ (the discount factor). Furthermore, this algorithm converges quadratically once it enters a local region around the optimal value. We also show that a modified version of regularized policy iteration, i.e., with finite-step policy evaluation, is equivalent to an inexact Newton method where the Newton iteration formula is solved with truncated iterations. We prove that the associated algorithm achieves an asymptotic linear convergence rate of $γ^M$, in which $M$ denotes the number of steps carried out in policy evaluation. Our results take a solid step towards a better understanding of the convergence properties of regularized policy iteration algorithms.
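To make the object of study concrete, the following is a minimal sketch of regularized policy iteration with a Shannon-entropy regularizer on a small tabular MDP. It is an illustration written for this listing, not the authors' code: the function names, the temperature parameter `tau`, the array layout (`P` of shape states × actions × states, `r` of shape states × actions), and the use of exact policy evaluation via a linear solve are all assumptions made here.

```python
import numpy as np

def soft_bellman_residual(V, P, r, gamma, tau):
    """V minus the entropy-smoothed Bellman backup tau * logsumexp(Q / tau)."""
    Q = r + gamma * P @ V                      # shape (S, A)
    m = Q.max(axis=1)                          # subtract the max for stability
    soft_max = m + tau * np.log(np.exp((Q - m[:, None]) / tau).sum(axis=1))
    return V - soft_max

def regularized_policy_iteration(P, r, gamma, tau, iters=50):
    """Entropy-regularized policy iteration with exact policy evaluation."""
    S, A = r.shape
    pi = np.full((S, A), 1.0 / A)              # start from the uniform policy
    history = []
    for _ in range(iters):
        # Policy evaluation: solve (I - gamma * P_pi) V = r_pi + tau * H(pi).
        entropy = -(pi * np.log(pi)).sum(axis=1)
        r_pi = (pi * r).sum(axis=1) + tau * entropy
        P_pi = np.einsum("sa,sat->st", pi, P)
        V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
        history.append(V)
        # Policy improvement: softmax of Q / tau maximizes the regularized backup.
        Q = r + gamma * P @ V
        logits = (Q - Q.max(axis=1, keepdims=True)) / tau
        pi = np.exp(logits)
        pi /= pi.sum(axis=1, keepdims=True)
    return V, pi, history
```

Tracking the maximum distance between each entry of `history` and the final value gives a simple empirical check of the linear rate $γ$ stated in the abstract; the sketch does not implement the finite-step (inexact Newton) variant.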
Submitted 11 October, 2023;
originally announced October 2023.
-
Robust Safe Reinforcement Learning under Adversarial Disturbances
Authors:
Zeyang Li,
Chuxiong Hu,
Shengbo Eben Li,
Jia Cheng,
Yunan Wang
Abstract:
Safety is a primary concern when applying reinforcement learning to real-world control tasks, especially in the presence of external disturbances. However, existing safe reinforcement learning algorithms rarely account for external disturbances, limiting their applicability and robustness in practice. To address this challenge, this paper proposes a robust safe reinforcement learning framework that tackles worst-case disturbances. First, this paper presents a policy iteration scheme to solve for the robust invariant set, i.e., a subset of the safe set such that persistent safety is possible only for states within it. The key idea is to establish a two-player zero-sum game by leveraging the safety value function in Hamilton-Jacobi reachability analysis, in which the protagonist (i.e., control inputs) aims to maintain safety and the adversary (i.e., external disturbances) tries to break safety. This paper proves that the proposed policy iteration algorithm converges monotonically to the maximal robust invariant set. Second, this paper integrates the proposed policy iteration scheme into a constrained reinforcement learning algorithm that simultaneously synthesizes the robust invariant set and uses it for constrained policy optimization. This algorithm tackles both optimality and safety, i.e., learning a policy that attains high rewards while maintaining safety under worst-case disturbances. Experiments on classic control tasks show that the proposed method achieves zero constraint violation with learned worst-case adversarial disturbances, while other baseline algorithms violate the safety constraints substantially. Our proposed method also attains performance comparable to the baselines even in the absence of the adversary.
Submitted 11 October, 2023;
originally announced October 2023.
-
Dementia Assessment Using Mandarin Speech with an Attention-based Speech Recognition Encoder
Authors:
Zih-Jyun Lin,
Yi-Ju Chen,
Po-Chih Kuo,
Likai Huang,
Chaur-Jong Hu,
Cheng-Yu Chen
Abstract:
Dementia diagnosis requires a series of different testing methods, which is complex and time-consuming. Early detection of dementia is crucial as it can prevent further deterioration of the condition. This paper utilizes a speech recognition model to construct a dementia assessment system tailored for Mandarin speakers during the picture description task. By training an attention-based speech recognition model on voice data closely resembling real-world scenarios, we have significantly enhanced the model's recognition capabilities. Subsequently, we extracted the encoder from the speech recognition model and added a linear layer for dementia assessment. We collected Mandarin speech data from 99 subjects and acquired their clinical assessments from a local hospital. We achieved an accuracy of 92.04% in Alzheimer's disease detection and a mean absolute error of 9% in clinical dementia rating score prediction.
Submitted 15 December, 2023; v1 submitted 5 October, 2023;
originally announced October 2023.
-
AGN STORM 2. VI. Mapping Temperature Fluctuations in the Accretion Disk of Mrk 817
Authors:
Jack M. M. Neustadt,
Christopher S. Kochanek,
John Montano,
Jonathan Gelbord,
Aaron J. Barth,
Gisella De Rosa,
Gerard A. Kriss,
Edward M. Cackett,
Keith Horne,
Erin A. Kara,
Hermine Landt,
Hagai Netzer,
Nahum Arav,
Misty C. Bentz,
Elena Dalla Bonta,
Maryam Dehghanian,
Pu Du,
Rick Edelson,
Gary J. Ferland,
Carina Fian,
Travis Fischer,
Michael R. Goad,
Diego H. Gonzalez Buitrago,
Varoujan Gorjian,
Catherine J. Grier
, et al. (27 additional authors not shown)
Abstract:
We fit the UV/optical lightcurves of the Seyfert 1 galaxy Mrk 817 to produce maps of the accretion disk temperature fluctuations $δT$ resolved in time and radius. The $δT$ maps are dominated by coherent radial structures that move slowly ($v \ll c$) inwards and outwards, which conflicts with the idea that disk variability is driven only by reverberation. Instead, these slow-moving temperature fluctuations are likely due to variability intrinsic to the disk. We test how modifying the input lightcurves by smoothing and subtracting them changes the resulting $δT$ maps and find that most of the temperature fluctuations exist over relatively long timescales ($\sim$100s of days). We show how detrending AGN lightcurves can be used to separate the flux variations driven by the slow-moving temperature fluctuations from those driven by reverberation. We also simulate contamination of the continuum emission from the disk by continuum emission from the broad line region (BLR), which is expected to have spectral features localized in wavelength, such as the Balmer break contaminating the $U$ band. We find that a disk with a smooth temperature profile cannot produce a signal localized in wavelength and that any BLR contamination should appear as residuals in our model lightcurves. Given the observed residuals, we estimate that only $\sim$20% of the variable flux in the $U$ and $u$ lightcurves can be due to BLR contamination. Finally, we discuss how these maps not only describe the data, but can make predictions about other aspects of AGN variability.
Submitted 2 October, 2023;
originally announced October 2023.
-
REMEDI: REinforcement learning-driven adaptive MEtabolism modeling of primary sclerosing cholangitis DIsease progression
Authors:
Chang Hu,
Krishnakant V. Saboo,
Ahmad H. Ali,
Brian D. Juran,
Konstantinos N. Lazaridis,
Ravishankar K. Iyer
Abstract:
Primary sclerosing cholangitis (PSC) is a rare disease wherein altered bile acid metabolism contributes to sustained liver injury. This paper introduces REMEDI, a framework that captures bile acid dynamics and the body's adaptive response during PSC progression that can assist in exploring treatments. REMEDI merges a differential equation (DE)-based mechanistic model that describes bile acid metabolism with reinforcement learning (RL) to emulate the body's adaptations to PSC continuously. An objective of adaptation is to maintain homeostasis by regulating enzymes involved in bile acid metabolism. These enzymes correspond to the parameters of the DEs. REMEDI leverages RL to approximate adaptations in PSC, treating homeostasis as a reward signal and the adjustment of the DE parameters as the corresponding actions. On real-world data, REMEDI generated bile acid dynamics and parameter adjustments consistent with published findings. Also, our results support discussions in the literature that early administration of drugs that suppress bile acid synthesis may be effective in PSC treatment.
Submitted 2 October, 2023;
originally announced October 2023.
-
Coupled Active Perception and Manipulation Planning for a Mobile Manipulator in Precision Agriculture Applications
Authors:
Shuangyu Xie,
Chengsong Hu,
Di Wang,
Joe Johnson,
Muthukumar Bagavathiannan,
Dezhen Song
Abstract:
A mobile manipulator often finds itself in an application where it needs to take a close-up view before performing a manipulation task. Naming this a coupled active perception and manipulation (CAPM) problem, we model the uncertainty in the perception process and devise a key state/task planning approach that considers reachability conditions as task constraints of both perception and manipulation tasks for the mobile platform. By minimizing the expected energy usage in the body key state planning while satisfying task constraints, our algorithm achieves the best balance between the task success rate and energy usage. We have implemented the algorithm and tested it in both simulation and physical experiments. The results confirm that our algorithm consumes less energy than a two-stage decoupled approach, while still maintaining a success rate of 100% for the task.
Submitted 28 September, 2023;
originally announced September 2023.
-
Results on Elastic Cross Sections in Proton-Proton Collisions at $\sqrt{s} = 510$ GeV with the STAR Detector at RHIC
Authors:
STAR Collaboration,
M. I. Abdulhamid,
B. E. Aboona,
J. Adam,
L. Adamczyk,
J. R. Adams,
I. Aggarwal,
M. M. Aggarwal,
Z. Ahammed,
E. C. Aschenauer,
S. Aslam,
J. Atchison,
V. Bairathi,
J. G. Ball Cap,
K. Barish,
R. Bellwied,
P. Bhagat,
A. Bhasin,
S. Bhatta,
S. R. Bhosale,
J. Bielcik,
J. Bielcikova,
J. D. Brandenburg,
C. Broodo,
X. Z. Cai
, et al. (343 additional authors not shown)
Abstract:
We report results on an elastic cross section measurement in proton-proton collisions at a center-of-mass energy $\sqrt{s}=510$ GeV, obtained with the Roman Pot setup of the STAR experiment at the Relativistic Heavy Ion Collider (RHIC). The elastic differential cross section is measured in the four-momentum transfer squared range $0.23 \leq -t \leq 0.67$ GeV$^2$. We find that a constant slope $B$ does not fit the data in the aforementioned $t$ range, and we obtain a much better fit using a second-order polynomial for $B(t)$. The $t$ dependence of $B$ is determined using six subintervals of $t$ in the STAR measured $t$ range, and is in good agreement with the phenomenological models. The measured elastic differential cross section $\mathrm{d}σ/\mathrm{d}t$ agrees well with the results obtained at $\sqrt{s} = 546$ GeV for proton--antiproton collisions by the UA4 experiment. We also determine that the integrated elastic cross section within the STAR $t$-range is $σ^\mathrm{fid}_\mathrm{el} = 462.1 \pm 0.9 (\mathrm{stat.}) \pm 1.1 (\mathrm{syst.}) \pm 11.6 (\mathrm{scale})$~$μ\mathrm{b}$.
Submitted 6 May, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.