-
On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models
Authors:
Xinpeng Wang,
Shitong Duan,
Xiaoyuan Yi,
Jing Yao,
Shanlin Zhou,
Zhihua Wei,
Peng Zhang,
Dongkuan Xu,
Maosong Sun,
Xing Xie
Abstract:
Big models have achieved revolutionary breakthroughs in the field of AI, but they might also pose potential concerns. Addressing such concerns, alignment technologies were introduced to make these models conform to human preferences and values. Despite considerable advancements in the past year, various challenges lie in establishing the optimal alignment strategy, such as data cost and scalable oversight, and how to align remains an open question. In this survey paper, we comprehensively investigate value alignment approaches. We first unpack the historical context of alignment tracing back to the 1920s (where it comes from), then delve into the mathematical essence of alignment (what it is), shedding light on the inherent challenges. Following this foundation, we provide a detailed examination of existing alignment methods, which fall into three categories: Reinforcement Learning, Supervised Fine-Tuning, and In-context Learning, and demonstrate their intrinsic connections, strengths, and limitations, helping readers better understand this research area. In addition, two emerging topics, personal alignment and multimodal alignment, are also discussed as novel frontiers in this field. Looking forward, we discuss potential alignment paradigms and how they could handle remaining challenges, prospecting where future alignment will go.
Submitted 6 March, 2024;
originally announced March 2024.
-
ProMISe: Promptable Medical Image Segmentation using SAM
Authors:
Jinfeng Wang,
Sifan Song,
Xinkun Wang,
Yiyi Wang,
Yiyi Miao,
Jionglong Su,
S. Kevin Zhou
Abstract:
With the proposal of the Segment Anything Model (SAM), fine-tuning SAM for medical image segmentation (MIS) has become popular. However, due to the large size of the SAM model and the significant domain gap between natural and medical images, fine-tuning-based strategies are costly with potential risk of instability, feature damage and catastrophic forgetting. Furthermore, some methods of transferring SAM to a domain-specific MIS through fine-tuning strategies disable the model's prompting capability, severely limiting its utilization scenarios. In this paper, we propose an Auto-Prompting Module (APM), which provides SAM-based foundation model with Euclidean adaptive prompts in the target domain. Our experiments demonstrate that such adaptive prompts significantly improve SAM's non-fine-tuned performance in MIS. In addition, we propose a novel non-invasive method called Incremental Pattern Shifting (IPS) to adapt SAM to specific medical domains. Experimental results show that the IPS enables SAM to achieve state-of-the-art or competitive performance in MIS without the need for fine-tuning. By coupling these two methods, we propose ProMISe, an end-to-end non-fine-tuned framework for Promptable Medical Image Segmentation. Our experiments demonstrate that both using our methods individually or in combination achieves satisfactory performance in low-cost pattern shifting, with all of SAM's parameters frozen.
Submitted 18 March, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
-
Observation of the decay $h_{c}\to3(π^{+}π^{-})π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Based on $(2712.4\pm14.1)\times10^{6}$ $ψ(3686)$ events collected with the BESIII detector, we study the decays $h_{c}\to3(π^{+}π^{-})π^{0}$, $h_{c}\to2(π^{+}π^{-})ω$, $h_{c}\to2(π^{+}π^{-})π^{0}η$, $h_{c}\to2(π^{+}π^{-})η$, and $h_{c}\to p\bar{p}$ via $ψ(3686)\toπ^{0}h_{c}$. The decay channel $h_{c}\to3(π^{+}π^{-})π^{0}$ is observed for the first time, and its branching fraction is determined to be $\left( {9.28\pm 1.14 \pm 0.77} \right) \times {10^{ - 3}}$, where the first uncertainty is statistical and the second is systematic. In addition, first evidence is found for the modes $h_{c} \to 2(π^{+}π^{-})π^{0}η$ and $h_{c}\to2(π^{+}π^{-})ω$ with significances of 4.8$σ$ and 4.7$σ$, and their branching fractions are determined to be $(7.55\pm1.51\pm0.77)\times10^{-3}$ and $\left( {4.00 \pm 0.86 \pm 0.35}\right) \times {10^{ - 3}}$, respectively. No significant signals of $h_c\to 2(π^+π^-)η$ and $h_{c}\to p\bar{p}$ are observed, and the upper limits of the branching fractions of these decays are determined to be $<6.19\times10^{-4}$ and $<4.40\times10^{-5}$ at the 90% confidence level, respectively.
Submitted 6 March, 2024;
originally announced March 2024.
-
An EnKF-LSTM Assimilation Algorithm for Crop Growth Model
Authors:
Siqi Zhou,
Ling Wang,
Jie Liu,
Jinshan Tang
Abstract:
Accurate and timely prediction of crop growth is of great significance for ensuring crop yields, and researchers have developed several crop models for this purpose. However, there are large differences between the simulation results obtained by the crop models and the actual results. In this paper, we therefore propose to combine the simulation results with the collected crop data through data assimilation so that the accuracy of prediction is improved. Specifically, an EnKF-LSTM data assimilation method for various crops is proposed by combining the ensemble Kalman filter and an LSTM neural network, which effectively avoids the overfitting problem of existing data assimilation methods and eliminates the uncertainty of the measured data. The proposed EnKF-LSTM method was verified and compared with other data assimilation methods using datasets collected by sensor equipment deployed on a farm.
Submitted 5 March, 2024;
originally announced March 2024.
-
Bootstrapping AdS$_2 \times$ S$^2$ hypermultiplets: hidden four-dimensional conformal symmetry
Authors:
Konstantinos C. Rigatos,
Shaodong Zhou
Abstract:
We bootstrap the $4$-point amplitude of $\mathcal{N}=2$ hypermultiplets in $\text{AdS}_2 \times \text{S}^2$ at tree-level and for arbitrary external weights. We hereby explicitly demonstrate the existence of a hidden four-dimensional conformal symmetry that was used as an assumption in previous studies to derive this result.
Submitted 24 April, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
Aα-spectral radius and path-factor covered graphs
Authors:
Sizhong Zhou,
Hongxia Liu,
Qiuxiang Bian
Abstract:
Let $α\in[0,1)$, and let $G$ be a connected graph of order $n$ with $n\geq f(α)$, where $f(α)=14$ for $α\in[0,\frac{1}{2}]$, $f(α)=17$ for $α\in(\frac{1}{2},\frac{2}{3}]$, $f(α)=20$ for $α\in(\frac{2}{3},\frac{3}{4}]$ and $f(α)=\frac{5}{1-α}+1$ for $α\in(\frac{3}{4},1)$. A path factor is a spanning subgraph $F$ of $G$ such that every component of $F$ is a path with at least two vertices. Let $k\geq2$ be an integer. A $P_{\geq k}$-factor means a path-factor with each component being a path of order at least $k$. A graph $G$ is called a $P_{\geq k}$-factor covered graph if $G$ has a $P_{\geq k}$-factor containing $e$ for any $e\in E(G)$. Let $A_α(G)=αD(G)+(1-α)A(G)$, where $D(G)$ denotes the diagonal matrix of vertex degrees of $G$ and $A(G)$ denotes the adjacency matrix of $G$. The largest eigenvalue of $A_α(G)$ is called the $A_α$-spectral radius of $G$, which is denoted by $ρ_α(G)$. In this paper, it is proved that $G$ is a $P_{\geq2}$-factor covered graph if $ρ_α(G)>η(n)$, where $η(n)$ is the largest root of $x^{3}-((α+1)n+α-4)x^{2}+(αn^{2}+(α^{2}-2α-1)n-2α+1)x-α^{2}n^{2}+(5α^{2}-3α+2)n-10α^{2}+15α-8=0$. Furthermore, we provide a graph to show that the bound on $A_α$-spectral radius is optimal.
Submitted 5 March, 2024;
originally announced March 2024.
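The matrix $A_α(G)=αD(G)+(1-α)A(G)$ defined in the abstract above can be computed directly for small graphs. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the graph is assumed to be given as a symmetric 0/1 adjacency matrix (list of lists), and the largest eigenvalue is estimated with a shifted power iteration, a technique chosen here for illustration only. It does not evaluate the threshold $η(n)$ from the paper.

```python
def a_alpha_matrix(adj, alpha):
    """Build A_alpha(G) = alpha*D(G) + (1-alpha)*A(G) from a 0/1 adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]  # vertex degrees, i.e. the diagonal of D(G)
    return [
        [(alpha * deg[i] if i == j else 0.0) + (1.0 - alpha) * adj[i][j]
         for j in range(n)]
        for i in range(n)
    ]


def a_alpha_spectral_radius(adj, alpha, iters=500):
    """Estimate the largest eigenvalue of A_alpha(G) for 0 <= alpha < 1.

    Power iteration is run on A_alpha + I. The shift by the identity keeps
    the iteration from oscillating on bipartite graphs, where -rho is also
    an eigenvalue when alpha = 0.
    """
    m = a_alpha_matrix(adj, alpha)
    n = len(m)
    v = [1.0] * n  # positive start vector overlaps the Perron vector
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(n)) + v[i] for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * mv[i] for i in range(n))  # Rayleigh quotient of a unit vector


# Small illustrative graphs (assumptions, not taken from the paper).
K5 = [[0 if i == j else 1 for j in range(5)] for i in range(5)]
P3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

rho_k5 = a_alpha_spectral_radius(K5, 0.3)  # complete graph: n - 1 for every alpha
rho_p3 = a_alpha_spectral_radius(P3, 0.5)  # half the signless Laplacian radius
```

For $α=0$ the quantity reduces to the ordinary adjacency spectral radius, and for $α=\frac{1}{2}$ it is half the signless Laplacian spectral radius, which gives two easy sanity checks.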
-
Measurement of $CP$ asymmetries in $B^0 \rightarrow K^0_S K^0_S K^0_S$ decays at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
M. Bauer,
A. Baur,
A. Beaubien
, et al. (428 additional authors not shown)
Abstract:
We report a measurement of decay-time dependent charge-parity ($CP$) asymmetries in $B^0 \rightarrow K^0_S K^0_S K^0_S$ decays. We use $387 \times 10^6 B\bar{B}$ pairs collected at the $Υ(4S)$ resonance with the Belle II detector at the SuperKEKB asymmetric-energy electron-positron collider. We reconstruct 220 signal events and extract the $CP$-violating parameters $S$ and $C$ from a fit to the distribution of the decay-time difference between the two $B$ mesons. The resulting confidence region is consistent with previous measurements in $B^0 \rightarrow K^0_S K^0_S K^0_S$ and $B^0 \rightarrow (c\bar{c})K^0$ decays, and with predictions based on the standard model.
Submitted 4 March, 2024;
originally announced March 2024.
-
Perfect codes in circulant graphs of degree $p^l-1$
Authors:
Xiaomeng Wang,
Oriol Serra,
Shou-Jun Xu,
Sanming Zhou
Abstract:
A perfect code in a graph is an independent set of the graph such that every vertex outside the set is adjacent to exactly one vertex in the set. A circulant graph is a Cayley graph of a cyclic group. In this paper we study perfect codes in circulant graphs of degree $p^l - 1$, where $p$ is a prime and $l \ge 1$. We obtain a necessary and sufficient condition for such a circulant graph to admit perfect codes, give a construction of all such circulant graphs which admit perfect codes, and prove a lower bound on the number of distinct perfect codes in such a circulant graph. This extends known results for the case $l=1$ and provides insight on the general problem on the existence and structure of perfect codes in circulant graphs.
Submitted 4 March, 2024;
originally announced March 2024.
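The definition of a perfect code quoted in the abstract above is easy to check by brute force on a small circulant graph. The sketch below is illustrative only: the function names are hypothetical, and the circulant graph is assumed to be given as the Cayley graph of $\mathbb{Z}_n$ with an inverse-closed connection set that excludes 0. It verifies the definition and does not implement the paper's necessary and sufficient condition.

```python
def circulant_neighbors(n, conn, v):
    """Neighbours of v in the circulant graph Cay(Z_n, conn)."""
    return {(v + s) % n for s in conn}


def is_perfect_code(n, conn, code):
    """Check whether `code` is a perfect code in Cay(Z_n, conn).

    A perfect code is an independent set such that every vertex outside
    the set is adjacent to exactly one vertex inside it.
    """
    conn = {s % n for s in conn}
    if 0 in conn or any((-s) % n not in conn for s in conn):
        raise ValueError("connection set must exclude 0 and be closed under negation")
    code = {c % n for c in code}
    for v in range(n):
        hits = len(circulant_neighbors(n, conn, v) & code)
        if v in code:
            if hits != 0:   # two code vertices adjacent: not independent
                return False
        elif hits != 1:     # outside vertex dominated zero or several times
            return False
    return True


# The 6-cycle is the circulant graph Cay(Z_6, {1, 5}); its degree is 2 = 3^1 - 1.
good = is_perfect_code(6, {1, 5}, {0, 3})
bad = is_perfect_code(6, {1, 5}, {0, 2})   # vertex 1 is adjacent to both 0 and 2
```

Here `{0, 3}` passes because each remaining vertex of the 6-cycle has exactly one neighbour in the set, while `{0, 2}` fails because vertex 1 is dominated twice.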
-
A Simple Baseline for Efficient Hand Mesh Reconstruction
Authors:
Zhishan Zhou,
Shihao Zhou,
Zhi Lv,
Minqiang Zou,
Yao Tang,
Jiajun Liang
Abstract:
3D hand pose estimation has found broad application in areas such as gesture recognition and human-machine interaction tasks. As performance improves, the complexity of the systems also increases, which can limit the comparative analysis and practical implementation of these methods. In this paper, we propose a simple yet effective baseline that not only surpasses state-of-the-art (SOTA) methods but also demonstrates computational efficiency. To establish this baseline, we abstract existing work into two components: a token generator and a mesh regressor, and then examine their core structures. A core structure, in this context, is one that fulfills intrinsic functions, brings about significant improvements, and achieves excellent performance without unnecessary complexities. Our proposed approach is decoupled from any modifications to the backbone, making it adaptable to any modern models. Our method outperforms existing solutions, achieving state-of-the-art (SOTA) results across multiple datasets. On the FreiHAND dataset, our approach produced a PA-MPJPE of 5.7mm and a PA-MPVPE of 6.0mm. Similarly, on the DexYCB dataset, we observed a PA-MPJPE of 5.5mm and a PA-MPVPE of 5.0mm. As for performance speed, our method reached up to 33 frames per second (fps) when using HRNet and up to 70 fps when employing FastViT-MA36.
Submitted 4 March, 2024;
originally announced March 2024.
-
Observation of $ψ(3686)\to 3φ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (645 additional authors not shown)
Abstract:
Using $(2.712\pm0.014)\times 10^9$ $ψ(3686)$ events collected by the BESIII detector operating at the BEPCII collider, we report the first observation of $ψ(3686)\to 3φ$ decay with a significance larger than 10$σ$. The branching fraction of this decay is determined to be $(1.46\pm0.05\pm0.17)\times10^{-5}$, where the first uncertainty is statistical and the second is systematic. No significant structure is observed in the $φφ$ invariant mass spectra.
Submitted 4 March, 2024;
originally announced March 2024.
-
How long will the quasar UV/optical flickering be damped?
Authors:
Shuying Zhou,
Mouyuan Sun,
Zhen-Yi Cai,
Guowei Ren,
Jun-Xian Wang,
Yongquan Xue
Abstract:
The UV/optical light curves of Active Galactic Nuclei (AGNs) are commonly described by the Damped Random Walk (DRW) model. However, the physical interpretation of the damping timescale, a key parameter in the DRW model, remains unclear. Particularly, recent observations indicate a weak dependence of the damping timescale upon both wavelength and accretion rate, clearly being inconsistent with the accretion-disk theory. In this study, we investigate the damping timescale in the framework of the Corona Heated Accretion disk Reprocessing (CHAR) model, a physical model that describes AGN variability. We find that while the CHAR model can reproduce the observed power spectral densities of the 20-year light curves for 190 sources from \cite{Stone2022}, the observed damping timescale, as well as its weak dependence on wavelength, can also be well recovered through fitting the mock light curves with DRW. We further demonstrate that such weak dependence is artificial due to the effect of inadequate durations of light curves, which leads to best-fitting damping timescales lower than the intrinsic ones. After eliminating this effect, the CHAR model indeed yields a strong dependence of the intrinsic damping timescale on the bolometric luminosity and rest-frame wavelength. Our results highlight the demand for sufficiently long light curves in AGN variability studies and important applications of the CHAR model in such studies.
Submitted 3 March, 2024;
originally announced March 2024.
-
A family of symmetric graphs in relation to 2-point-transitive linear spaces
Authors:
Teng Fang,
Sanming Zhou,
Shenglin Zhou
Abstract:
A graph $Γ$ is $G$-symmetric if it admits $G$ as a group of automorphisms acting transitively on the set of arcs of $Γ$, where an arc is an ordered pair of adjacent vertices. Let $Γ$ be a $G$-symmetric graph such that its vertex set admits a nontrivial $G$-invariant partition ${\cal B}$, and let ${\cal D}(Γ, {\cal B})$ be the incidence structure with point set ${\cal B}$ and blocks $\{B\} \cup Γ_{\cal B}(α)$, for $B \in {\cal B}$ and $α\in B$, where $Γ_{\cal B}(α)$ is the set of blocks of ${\cal B}$ containing at least one neighbour of $α$ in $Γ$. In this paper we classify all $G$-symmetric graphs $Γ$ such that $Γ_{\cal B}(α) \ne Γ_{\cal B}(β)$ for distinct $α, β\in B$, the quotient graph of $Γ$ with respect to ${\cal B}$ is a complete graph, and ${\cal D}(Γ, {\cal B})$ is isomorphic to the complement of a $(G, 2)$-point-transitive linear space.
Submitted 2 March, 2024;
originally announced March 2024.
-
Stability of graph pairs involving cycles
Authors:
Xiaomeng Wang,
Shou-Jun Xu,
Sanming Zhou
Abstract:
A graph pair $(Γ, Σ)$ is called stable if $\aut(Γ)\times\aut(Σ)$ is isomorphic to $\aut(Γ\timesΣ)$ and unstable otherwise, where $Γ\timesΣ$ is the direct product of $Γ$ and $Σ$. A graph is called $R$-thin if distinct vertices have different neighbourhoods. $Γ$ and $Σ$ are said to be coprime if there is no nontrivial graph $Δ$ such that $Γ\cong Γ_1 \times Δ$ and $Σ\cong Σ_1 \times Δ$ for some graphs $Γ_1$ and $Σ_1$. An unstable graph pair $(Γ, Σ)$ is called nontrivially unstable if $Γ$ and $Σ$ are $R$-thin connected coprime graphs and at least one of them is non-bipartite. This paper contributes to the study of the stability of graph pairs with a focus on the case when $Σ= C_n$ is a cycle. We give two sufficient conditions for $(Γ, C_n)$ to be nontrivially unstable, where $n \ne 4$ and $Γ$ is an $R$-thin connected graph. In the case when $Γ$ is an $R$-thin connected non-bipartite graph, we obtain the following results: (i) if $(Γ, K_2)$ is unstable, then $(Γ, C_{n})$ is unstable for every even integer $n \geq 4$; (ii) if an even integer $n \ge 6$ is compatible with $Γ$ in some sense, then $(Γ, C_{n})$ is nontrivially unstable if and only if $(Γ, K_2)$ is unstable; (iii) if there is an even integer $n \ge 6$ compatible with $Γ$ such that $(Γ, C_{n})$ is nontrivially unstable, then $(Γ, C_{m})$ is unstable for all even integers $m \ge 6$. We also prove that if $Γ$ is an $R$-thin connected graph and $n \ge 3$ is an odd integer compatible with $Γ$, then $(Γ, C_{n})$ is stable.
Submitted 14 April, 2024; v1 submitted 2 March, 2024;
originally announced March 2024.
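Two of the notions used in the abstract above, the direct product $Γ\timesΣ$ and $R$-thinness, can be illustrated on small graphs. This is a minimal sketch under stated assumptions: graphs are represented as dictionaries mapping each vertex to its set of neighbours, and all function names are hypothetical. It does not compute automorphism groups, so it does not decide stability.

```python
def cycle(n):
    """The cycle C_n (n >= 3) as a dict: vertex -> set of neighbours."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def direct_product(g, h):
    """Direct product: (u, x) is adjacent to (v, y) iff u ~ v in g and x ~ y in h."""
    return {
        (u, x): {(v, y) for v in g[u] for y in h[x]}
        for u in g
        for x in h
    }


def is_r_thin(g):
    """A graph is R-thin if distinct vertices have different neighbourhoods."""
    nbhds = [frozenset(nb) for nb in g.values()]
    return len(set(nbhds)) == len(nbhds)


K2 = {0: {1}, 1: {0}}

c4_thin = is_r_thin(cycle(4))   # opposite vertices of C_4 share a neighbourhood
c5_thin = is_r_thin(cycle(5))
c3_k2 = direct_product(cycle(3), K2)   # six vertices, each of degree 2
```

In `cycle(4)` the vertices 0 and 2 both have neighbourhood `{1, 3}`, so $C_4$ is not $R$-thin, which is consistent with the abstract's restriction to $n \ne 4$.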
-
Depth Information Assisted Collaborative Mutual Promotion Network for Single Image Dehazing
Authors:
Yafei Zhang,
Shen Zhou,
Huafeng Li
Abstract:
Recovering a clear image from a single hazy image is an open inverse problem. Although significant research progress has been made, most existing methods ignore the effect that downstream tasks play in promoting upstream dehazing. From the perspective of the haze generation mechanism, there is a potential relationship between the depth information of the scene and the hazy image. Based on this, we propose a dual-task collaborative mutual promotion framework to achieve the dehazing of a single image. This framework integrates depth estimation and dehazing by a dual-task interaction mechanism and achieves mutual enhancement of their performance. To realize the joint optimization of the two tasks, an alternative implementation mechanism with the difference perception is developed. On the one hand, the difference perception between the depth maps of the dehazing result and the ideal image is proposed to promote the dehazing network to pay attention to the non-ideal areas of the dehazing. On the other hand, by improving the depth estimation performance in the difficult-to-recover areas of the hazy image, the dehazing network can explicitly use the depth information of the hazy image to assist the clear image recovery. To promote the depth estimation, we propose to use the difference between the dehazed image and the ground truth to guide the depth estimation network to focus on the dehazed unideal areas. It allows dehazing and depth estimation to leverage their strengths in a mutually reinforcing manner. Experimental results show that the proposed method can achieve better performance than that of the state-of-the-art approaches.
Submitted 12 July, 2024; v1 submitted 2 March, 2024;
originally announced March 2024.
-
Sharp interface limit for $1$D stochastic Allen-Cahn equation in full small noise regime
Authors:
Weijun Xu,
Wenhao Zhao,
Shuhan Zhou
Abstract:
We study the sharp interface limit for the $1$D stochastic Allen-Cahn equation, and extend earlier work by Funaki to the full small noise regime. The main new idea is the construction of a series of functional correctors, which are designed to recursively cancel potential divergences.
In addition, in order to show these correctors are well-behaved, we develop a systematic decomposition of functional derivatives of the deterministic Allen-Cahn flow of all orders. This decomposition is of its own interest, and may be useful in other situations as well.
Submitted 29 February, 2024;
originally announced February 2024.
-
Entanglement-enabled advantage for learning a bosonic random displacement channel
Authors:
Changhun Oh,
Senrui Chen,
Yat Wong,
Sisi Zhou,
Hsin-Yuan Huang,
Jens A. H. Nielsen,
Zheng-Hao Liu,
Jonas S. Neergaard-Nielsen,
Ulrik L. Andersen,
Liang Jiang,
John Preskill
Abstract:
We show that quantum entanglement can provide an exponential advantage in learning properties of a bosonic continuous-variable (CV) system. The task we consider is estimating a probabilistic mixture of displacement operators acting on $n$ bosonic modes, called a random displacement channel. We prove that if the $n$ modes are not entangled with an ancillary quantum memory, then the channel must be sampled a number of times exponential in $n$ in order to estimate its characteristic function to reasonable precision; this lower bound on sample complexity applies even if the channel inputs and measurements performed on channel outputs are chosen adaptively. On the other hand, we present a simple entanglement-assisted scheme that only requires a number of samples independent of $n$, given a sufficient amount of squeezing. This establishes an exponential separation in sample complexity. We then analyze the effect of photon loss and show that the entanglement-assisted scheme is still significantly more efficient than any lossless entanglement-free scheme under mild experimental conditions. Our work illuminates the role of entanglement in learning continuous-variable systems and points toward experimentally feasible demonstrations of provable entanglement-enabled advantage using CV quantum platforms.
Submitted 28 February, 2024;
originally announced February 2024.
-
Limits of noisy quantum metrology with restricted quantum controls
Authors:
Sisi Zhou
Abstract:
The Heisenberg limit (HL) and the standard quantum limit (SQL) are two quantum metrological limits, which describe the scalings of estimation precision $Δ\hatθ$ of an unknown parameter $θ$ with respect to $n$, the number of one-parameter quantum channels applied. It was known that the HL ($Δ\hatθ\propto 1/n$) is achievable using quantum error correction (QEC) strategies when the ``Hamiltonian-not-in-Kraus-span'' (HNKS) condition is satisfied; and when HNKS is violated, the SQL ($Δ\hatθ\propto 1/n^{1/2}$) is optimal and can be achieved with $n$ repeated measurements. However, it is unknown whether such limits are still achievable using restricted quantum devices where the required QEC operations are not available -- e.g., finite-size devices where only unitary controls are available or where noiseless ancilla is not available. In this work, we identify various new noisy metrological limits for estimating one-parameter qubit channels in different settings with restricted controls. The HL is proven to be unattainable in these cases, indicating the necessity of QEC in achieving the HL. Furthermore, we find a necessary and sufficient condition for qubit channels to attain the SQL, called the ``rotation-generators-not-in-Kraus-span'' (RGNKS) condition. When RGNKS is satisfied, the SQL is achievable using only unitary controls and a single measurement. When RGNKS is violated, the estimation precision (in most cases) has a constant floor when repeated measurements are not allowed. Demonstration of this separation in metrological powers is within reach of current quantum technologies.
Submitted 28 February, 2024;
originally announced February 2024.
-
Play like a Vertex: A Stackelberg Game Approach for Streaming Graph Partitioning
Authors:
Zezhong Ding,
Yongan Xiang,
Shangyou Wang,
Xike Xie,
S. Kevin Zhou
Abstract:
In the realm of distributed systems tasked with managing and processing large-scale graph-structured data, optimizing graph partitioning stands as a pivotal challenge. The primary goal is to minimize communication overhead and runtime cost. However, alongside the computational complexity associated with optimal graph partitioning, a critical factor to consider is memory overhead. Real-world graphs often reach colossal sizes, making it impractical and economically unviable to load the entire graph into memory for partitioning. This is also a fundamental premise in distributed graph processing, where accommodating a graph with non-distributed systems is unattainable. Currently, existing streaming partitioning algorithms exhibit a skew-oblivious nature, yielding satisfactory partitioning results exclusively for specific graph types. In this paper, we propose a novel streaming partitioning algorithm, the Skewness-aware Vertex-cut Partitioner S5P, designed to leverage the skewness characteristics of real graphs for achieving high-quality partitioning. S5P offers high partitioning quality by segregating the graph's edge set into two subsets, head and tail sets. Following processing by a skewness-aware clustering algorithm, these two subsets subsequently undergo a Stackelberg graph game. Our extensive evaluations conducted on substantial real-world and synthetic graphs demonstrate that, in all instances, the partitioning quality of S5P surpasses that of existing streaming partitioning algorithms, operating within the same load balance constraints. For example, S5P can bring up to a 51% improvement in partitioning quality compared to the top partitioner among the baselines. Lastly, we showcase that the implementation of S5P results in up to an 81% reduction in communication cost and a 130% increase in runtime efficiency for distributed graph processing tasks on PowerGraph.
Submitted 28 February, 2024;
originally announced February 2024.
-
Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks
Authors:
Zhewei Wu,
Ruilong Yu,
Qihe Liu,
Shuying Cheng,
Shilin Qiu,
Shijie Zhou
Abstract:
Adversarial attacks in visual object tracking have significantly degraded the performance of advanced trackers by introducing imperceptible perturbations into images. These attack methods have garnered considerable attention from researchers in recent years. However, there is still a lack of research on designing adversarial defense methods specifically for visual object tracking. To address these issues, we propose an effective additional pre-processing network called DuaLossDef that eliminates adversarial perturbations during the tracking process. DuaLossDef is deployed ahead of the search branch or template branch of the tracker to apply defensive transformations to the input images. Moreover, it can be seamlessly integrated with other visual trackers as a plug-and-play module without requiring any parameter adjustments. We train DuaLossDef using adversarial training, specifically employing Dua-Loss to generate adversarial samples that simultaneously attack the classification and regression branches of the tracker. Extensive experiments conducted on the OTB100, LaSOT, and VOT2018 benchmarks demonstrate that DuaLossDef maintains excellent defense robustness against adversarial attack methods in both adaptive and non-adaptive attack scenarios. Moreover, when transferring the defense network to other trackers, it exhibits reliable transferability. Finally, DuaLossDef achieves a processing time of up to 5ms/frame, allowing seamless integration with existing high-speed trackers without introducing significant computational overhead. We will make our code publicly available soon.
Submitted 27 February, 2024;
originally announced February 2024.
-
Toughness and Aα-spectral radius in graphs
Authors:
Sizhong Zhou,
Yuli Zhang,
Tao Zhang,
Hongxia Liu
Abstract:
Let $α\in[0,1)$, and let $G$ be a connected graph of order $n$ with $n\geq f(α)$, where $f(α)=6$ for $α\in[0,\frac{2}{3}]$ and $f(α)=\frac{4}{1-α}$ for $α\in(\frac{2}{3},1)$. A graph $G$ is said to be $t$-tough if $|S|\geq tc(G-S)$ for each subset $S$ of $V(G)$ with $c(G-S)\geq2$, where $c(G-S)$ is the number of connected components in $G-S$. The $A_α$-spectral radius of $G$ is denoted by $ρ_α(G)$. In this paper, it is verified that $G$ is a 1-tough graph unless $G=K_1\vee(K_{n-2}\cup K_1)$ if $ρ_α(G)\geqρ_α(K_1\vee(K_{n-2}\cup K_1))$, where $ρ_α(K_1\vee(K_{n-2}\cup K_1))$ equals the largest root of $x^{3}-((α+1)n+α-3)x^{2}+(αn^{2}+(α^{2}-α-1)n-2α+1)x-α^{2}n^{2}+(3α^{2}-α+1)n-4α^{2}+5α-3=0$. Further, we present an $A_α$-spectral radius condition for a graph to be a $t$-tough graph.
Submitted 27 February, 2024;
originally announced February 2024.
-
CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification
Authors:
Haoran Lai,
Qingsong Yao,
Zihang Jiang,
Rongsheng Wang,
Zhiyang He,
Xiaodong Tao,
S. Kevin Zhou
Abstract:
The advancement of Zero-Shot Learning in the medical domain has been driven forward by using pre-trained models on large-scale image-text pairs, focusing on image-text alignment. However, existing methods primarily rely on cosine similarity for alignment, which may not fully capture the complex relationship between medical images and reports. To address this gap, we introduce a novel approach called Cross-Attention Alignment for Radiology Zero-Shot Classification (CARZero). Our approach innovatively leverages cross-attention mechanisms to process image and report features, creating a Similarity Representation that more accurately reflects the intricate relationships in medical semantics. This representation is then linearly projected to form an image-text similarity matrix for cross-modality alignment. Additionally, recognizing the pivotal role of prompt selection in zero-shot learning, CARZero incorporates a Large Language Model-based prompt alignment strategy. This strategy standardizes diverse diagnostic expressions into a unified format for both training and inference phases, overcoming the challenges of manual prompt design. Our approach is simple yet effective, demonstrating state-of-the-art performance in zero-shot classification on five official chest radiograph diagnostic test sets, including remarkable results on datasets with long-tail distributions of rare diseases. This achievement is attributed to our new image-text alignment strategy, which effectively addresses the complex relationship between medical images and reports. Code and models are available at https://github.com/laihaoran/CARZero.
Submitted 24 March, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
Dual-Space Optimization: Improved Molecule Sequence Design by Latent Prompt Transformer
Authors:
Deqian Kong,
Yuhao Huang,
Jianwen Xie,
Edouardo Honig,
Ming Xu,
Shuanghong Xue,
Pei Lin,
San** Zhou,
Sheng Zhong,
Nanning Zheng,
Ying Nian Wu
Abstract:
Designing molecules with desirable properties, such as drug-likeliness and high binding affinities towards protein targets, is a challenging problem. In this paper, we propose the Dual-Space Optimization (DSO) method that integrates latent space sampling and data space selection to solve this problem. DSO iteratively updates a latent space generative model and a synthetic dataset in an optimization process that gradually shifts the generative model and the synthetic data towards regions of desired property values. Our generative model takes the form of a Latent Prompt Transformer (LPT) where the latent vector serves as the prompt of a causal transformer. Our extensive experiments demonstrate effectiveness of the proposed method, which sets new performance benchmarks across single-objective, multi-objective and constrained molecule design tasks.
Submitted 26 February, 2024;
originally announced February 2024.
-
Probing anyonic statistics via Mach-Zehnder interferometry in quantum computers
Authors:
Shiyu Zhou,
Yi Teng,
Claudio Chamon,
Claudio Castelnovo,
Armin Rahmani
Abstract:
We introduce a synthetic Mach-Zehnder interferometer for digitized quantum computing devices to probe fractional exchange statistics of anyonic excitations that appear in quantum spin liquids. Employing an IonQ quantum computer, we apply this scheme to the toric ladder, a quasi-one-dimensional reduction of the toric code. We observe interference patterns resulting from the movement of `electric' excitations in the presence and absence of `magnetic' ones. We model the noise in IonQ via depolarizing Lindbladian dynamics, and find quantitative agreement with the measurements obtained from the quantum device. The synthetic Mach-Zehnder interferometer can thus also serve as an effective means to probe the coherence length and time scales of multi-qubit noisy quantum devices.
Submitted 7 March, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
High-order topological pumping on a superconducting quantum processor
Authors:
Cheng-Lin Deng,
Yu Liu,
Yu-Ran Zhang,
Xue-Gang Li,
Tao Liu,
Chi-Tong Chen,
Tong Liu,
Cong-Wei Lu,
Yong-Yi Wang,
Tian-Ming Li,
Cai-** Fang,
Si-Yun Zhou,
Jia-Cheng Song,
Yue-Shan Xu,
Yang He,
Zheng-He Liu,
Kai-Xuan Huang,
Zhong-Cheng Xiang,
Jie-Ci Wang,
Dong-Ning Zheng,
Guang-Ming Xue,
Kai Xu,
H. F. Yu,
Heng Fan
Abstract:
High-order topological phases of matter refer to the systems of $n$-dimensional bulk with the topology of $m$-th order, exhibiting $(n-m)$-dimensional boundary modes and can be characterized by topological pumping. Here, we experimentally demonstrate two types of second-order topological pumps, forming four 0-dimensional corner localized states on a 4$\times$4 square lattice array of 16 superconducting qubits. The initial ground state of the system for half-filling, as a product of four identical entangled 4-qubit states, is prepared using an adiabatic scheme. During the pumping procedure, we adiabatically modulate the superlattice Bose-Hubbard Hamiltonian by precisely controlling both the hopping strengths and on-site potentials. At the half pumping period, the system evolves to a corner-localized state in a quadrupole configuration. The robustness of the second-order topological pump is also investigated by introducing different on-site disorder. Our work studies the topological properties of high-order topological phases from the dynamical transport picture using superconducting qubits, which would inspire further research on high-order topological phases.
Submitted 25 February, 2024;
originally announced February 2024.
-
How to identify earth pressures on in-service tunnel linings: A Bayesian learning perspective
Authors:
Zhiyao Tian,
Shunhua Zhou,
Anthony Lee,
Yao Shan,
Bettina Detmann
Abstract:
The identification of earth pressures acting on in-service transportation tunnel linings is essential for their health monitoring and performance prediction, especially for those exhibiting poor structural performance. Since pressure gauges incur substantial costs, the inversion of pressures based on easily observed structural responses, such as deformations, is desirable. The inherent challenge in this inverse problem lies in the non-uniqueness of solutions, which arises from the fact that various pressures can yield structural responses fitting equally well with the observed data. However, existing approaches for pressure inversion predominantly rely on a deterministic framework, often neglecting a detailed discussion on this non-uniqueness. In addressing this gap, this study introduces a Bayesian approach. The proposed statistical framework enables the quantification of uncertainty induced by non-uniqueness in inversion results. The analysis identifies the uniform component in distributed pressures as the primary source of non-uniqueness. The mitigation of solution non-uniqueness can be achieved by increasing the quantity of deformation data or incorporating an observation of internal normal force in a tunnel lining -- the latter proving to be notably more effective. The practical application in a numerical case demonstrates the effectiveness of this approach and the associated findings. In addition, our investigation recommends maintaining deformation measurement accuracy within the range of [-1, 1] mm to ensure satisfactory outcomes. Finally, deficiencies and potential future extensions of this approach are discussed.
Submitted 23 February, 2024;
originally announced February 2024.
-
How to Sustain a Scientific Open-Source Software Ecosystem: Learning from the Astropy Project
Authors:
Jiayi Sun,
Aarya Patil,
Youhai Li,
** L. C. Guo,
Shurui Zhou
Abstract:
Scientific open-source software (OSS) has greatly benefited research communities through its transparent and collaborative nature. Given its critical role in scientific research, ensuring the sustainability of such software has become vital. Earlier studies have proposed sustainability strategies for conventional scientific software and open-source communities. However, it remains unclear whether these solutions can be easily adapted to the integrated framework of scientific OSS and its larger ecosystem. This study examines the challenges and opportunities to enhance the sustainability of scientific OSS in the context of interdisciplinary collaboration, open-source community, and multi-project ecosystem. We conducted a case study on a widely-used software ecosystem in the astrophysics domain, the Astropy Project, using a mixed-methods design approach. This approach includes an interview with core contributors regarding their participation in an interdisciplinary team, a survey of disengaged contributors about their motivations for contribution, reasons for disengagement, and suggestions for sustaining the communities, and finally, an analysis of cross-referenced issues and pull requests to understand best practices for collaboration on the ecosystem level. Our study reveals the implications of major challenges for sustaining scientific OSS and proposes concrete suggestions for tackling these challenges.
Submitted 22 February, 2024;
originally announced February 2024.
-
On nonparametric estimation of the interaction function in particle system models
Authors:
Denis Belomestny,
Mark Podolskij,
Shi-Yuan Zhou
Abstract:
This paper delves into a nonparametric estimation approach for the interaction function within diffusion-type particle system models. We introduce two estimation methods based upon an empirical risk minimization. Our study encompasses an analysis of the stochastic and approximation errors associated with both procedures, along with an examination of certain minimax lower bounds. In particular, we show that there is a natural metric under which the corresponding minimax estimation error of the interaction function converges to zero with parametric rate. This result is rather surprising given the complexity of the underlying estimation problem and the rather large classes of interaction functions for which the above parametric rate holds.
Submitted 22 February, 2024;
originally announced February 2024.
-
A spectral condition for a graph having a strong parity factor
Authors:
Sizhong Zhou
Abstract:
A graph $G$ contains a strong parity factor $F$ if for every subset $X\subseteq V(G)$ with $|X|$ even, $G$ has a spanning subgraph $F$ satisfying $δ(F)\geq1$, $d_F(u)\equiv1$ (mod 2) for any $u\in X$, and $d_F(v)\equiv0$ (mod 2) for any $v\in V(G)\setminus X$. In this paper, we give a spectral radius condition to guarantee that a connected graph contains a strong parity factor.
Submitted 21 February, 2024;
originally announced February 2024.
-
Flexible Physical Camouflage Generation Based on a Differential Approach
Authors:
Yang Li,
Wenyi Tan,
Chenxing Zhao,
Shuangju Zhou,
Xinkai Liang,
Quan Pan
Abstract:
This study introduces a novel approach to neural rendering, specifically tailored for adversarial camouflage, within an extensive 3D rendering framework. Our method, named FPA, goes beyond traditional techniques by faithfully simulating lighting conditions and material variations, ensuring a nuanced and realistic representation of textures on a 3D target. To achieve this, we employ a generative approach that learns adversarial patterns from a diffusion model. This involves incorporating a specially designed adversarial loss and covert constraint loss to guarantee the adversarial and covert nature of the camouflage in the physical world. Furthermore, we showcase the effectiveness of the proposed camouflage in sticker mode, demonstrating its ability to cover the target without compromising adversarial information. Through empirical and physical experiments, FPA exhibits strong performance in terms of attack success rate and transferability. Additionally, the designed sticker-mode camouflage, coupled with a concealment constraint, adapts to the environment, yielding diverse styles of texture. Our findings highlight the versatility and efficacy of the FPA approach in adversarial camouflage applications.
Submitted 21 February, 2024;
originally announced February 2024.
-
Distributionally Robust Graph-based Recommendation System
Authors:
Bohao Wang,
Jiawei Chen,
Changdong Li,
Sheng Zhou,
Qihao Shi,
Yang Gao,
Yan Feng,
Chun Chen,
Can Wang
Abstract:
With the capacity to capture high-order collaborative signals, Graph Neural Networks (GNNs) have emerged as powerful methods in Recommender Systems (RS). However, their efficacy often hinges on the assumption that training and testing data share the same distribution (a.k.a. IID assumption), and exhibits significant declines under distribution shifts. Distribution shifts commonly arise in RS, often attributed to the dynamic nature of user preferences or ubiquitous biases during data collection in RS. Despite its significance, research on GNN-based recommendation against distribution shift is still sparse. To bridge this gap, we propose Distributionally Robust GNN (DR-GNN) that incorporates Distributional Robust Optimization (DRO) into the GNN-based recommendation. DR-GNN addresses two core challenges: 1) To enable DRO to cater to graph data intertwined with GNN, we reinterpret GNN as a graph smoothing regularizer, thereby facilitating the nuanced application of DRO; 2) Given the typically sparse nature of recommendation data, which might impede robust optimization, we introduce slight perturbations in the training distribution to expand its support. Notably, while DR-GNN involves complex optimization, it can be implemented easily and efficiently. Our extensive experiments validate the effectiveness of DR-GNN against three typical distribution shifts. The code is available at https://github.com/WANGBohaO-jpg/DR-GNN.
Submitted 21 February, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Designed spin-texture-lattice to control anisotropic magnon transport in antiferromagnets
Authors:
Peter Meisenheimer,
Maya Ramesh,
Sajid Husain,
Isaac Harris,
Hyeon Woo Park,
Shiyu Zhou,
Hossein Taghinejad,
Hongrui Zhang,
Lane W. Martin,
James Analytis,
Paul Stevenson,
Jorge Íñiguez-González,
Se Kwon Kim,
Darrell G. Schlom,
Lucas Caretta,
Zhi Yao,
Ramamoorthy Ramesh
Abstract:
Spin waves in magnetic materials are promising information carriers for future computing technologies due to their ultra-low energy dissipation and long coherence length. Antiferromagnets are strong candidate materials due, in part, to their stability to external fields and larger group velocities. Multiferroic antiferromagnets, such as BiFeO$_3$ (BFO), have an additional degree of freedom stemming from magnetoelectric coupling, allowing for control of the magnetic structure, and thus spin waves, with an electric field. Unfortunately, spin-wave propagation in BFO is not well understood due to the complexity of the magnetic structure. In this work, we explore long-range spin transport within an epitaxially engineered, electrically tunable, one-dimensional (1D) magnonic crystal. We discover a striking anisotropy in the spin transport parallel and perpendicular to the 1D crystal axis. Multiscale theory and simulation suggest that this preferential magnon conduction emerges from a combination of a population imbalance in its dispersion, as well as anisotropic structural scattering. This work provides a pathway to electrically-reconfigurable magnonic crystals in antiferromagnets.
Submitted 19 February, 2024;
originally announced February 2024.
-
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More
Authors:
Yuxuan Yue,
Zhihang Yuan,
Haojie Duanmu,
Sifan Zhou,
Jianlong Wu,
Liqiang Nie
Abstract:
Large Language Models (LLMs) face significant deployment challenges due to their substantial memory requirements and the computational demands of the auto-regressive text generation process. This paper addresses these challenges by focusing on the quantization of LLMs, a technique that reduces memory consumption by converting model parameters and activations into low-bit integers. We critically analyze the existing quantization approaches, identifying their limitations in balancing the accuracy and efficiency of the quantized LLMs. To advance beyond these limitations, we propose WKVQuant, a PTQ framework especially designed for quantizing weights and the key/value (KV) cache of LLMs. Specifically, we incorporate past-only quantization to improve the computation of attention. Additionally, we introduce a two-dimensional quantization strategy to handle the distribution of KV cache, along with a cross-block reconstruction regularization for parameter optimization. Experiments show that WKVQuant achieves almost comparable memory savings to weight-activation quantization, while also approaching the performance of weight-only quantization.
Submitted 20 February, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
-
Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction
Authors:
Sizhe Zhou,
Yu Meng,
Bowen **,
Jiawei Han
Abstract:
Relation extraction (RE), a crucial task in NLP, aims to identify semantic relationships between entities mentioned in texts. Despite significant advancements in this field, existing models typically rely on extensive annotated data for training, which can be both costly and time-consuming to acquire. Moreover, these models often struggle to adapt to new or unseen relationships. In contrast, few-shot learning settings, which aim to reduce annotation requirements, may offer incomplete and biased supervision for understanding target relation semantics, leading to degraded and unstable performance. To provide the model with accurate and explicit descriptions of the relation types while minimizing the annotation requirements, we study the definition-only zero-shot RE setting, where only relation definitions expressed in natural language are used to train an RE model. Motivated by the strong synthetic data generation power of LLMs, we propose a framework, REPaL, which consists of three stages: (1) We utilize LLMs to generate initial seed instances based on relation definitions and an unlabeled corpus. (2) We fine-tune a bidirectional Small Language Model (SLM) using these initial seeds to learn the relations for the target domain. (3) We enhance pattern coverage and mitigate bias resulting from the limited number of initial seeds by incorporating feedback acquired from the SLM's predictions on the unlabeled corpus. To accomplish this, we leverage the multi-turn conversation ability of LLMs to generate new instances in follow-up dialogues. Experiments on two datasets show that REPaL achieves better zero-shot performance, with large margins over baseline methods.
Submitted 16 February, 2024;
originally announced February 2024.
-
Control Color: Multimodal Diffusion-based Interactive Image Colorization
Authors:
Zhexin Liang,
Zhaochen Li,
Shangchen Zhou,
Chongyi Li,
Chen Change Loy
Abstract:
Despite the existence of numerous colorization methods, several limitations still exist, such as lack of user interaction, inflexibility in local colorization, unnatural color rendering, insufficient color variation, and color overflow. To solve these issues, we introduce Control Color (CtrlColor), a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model, offering promising capabilities in highly controllable interactive image colorization. While several diffusion-based methods have been proposed, supporting colorization in multiple modalities remains non-trivial. In this study, we aim to tackle both unconditional and conditional image colorization (text prompts, strokes, exemplars) and address color overflow and incorrect color within a unified framework. Specifically, we present an effective way to encode user strokes to enable precise local color manipulation and employ a practical way to constrain the color distribution similar to exemplars. Apart from accepting text prompts as conditions, these designs add versatility to our approach. We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring. Extensive comparisons show that our model outperforms state-of-the-art image colorization methods both qualitatively and quantitatively.
Submitted 16 February, 2024;
originally announced February 2024.
-
MRPD: Undersampled MRI reconstruction by prompting a large latent diffusion model
Authors:
Ziqi Gao,
S. Kevin Zhou
Abstract:
Implicit visual knowledge in a large latent diffusion model (LLDM) pre-trained on natural images is rich and hypothetically universal to natural and medical images. To test this hypothesis from a practical perspective, we propose a novel framework for undersampled MRI Reconstruction by Prompting a large latent Diffusion model (MRPD). While the existing methods trained on MRI datasets are typically of limited generalizability toward diverse data acquisition scenarios, MRPD supports unsupervised and universally adaptive MRI reconstruction. For unsupervised reconstruction, MRSampler guides LLDM with a random-phase-modulated hard-to-soft control. With any single- or multiple-source MRI dataset, MRPD's performance is boosted universally by a lightweight MRAdapter that only finetunes the LLDM's autoencoder. Experiments on FastMRI and IXI show that MRPD is the only model that supports both MRI database-free and database-available scenarios and attains the best generalizability towards out-of-domain (OOD) samplings, contrasts, and organs among compared unsupervised, supervised, and MRI diffusion methods. To our knowledge, MRPD is the first method that empirically shows the universal prowess of an LLDM pre-trained on vast natural images for MRI. Our official implementation is at https://github.com/Z7Gao/MRPD.
Submitted 5 July, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Learning-Augmented Skip Lists
Authors:
Chunkai Fu,
Jung Hoon Seo,
Samson Zhou
Abstract:
We study the integration of machine learning advice into the design of skip lists to improve upon traditional data structure design. Given access to a possibly erroneous oracle that outputs estimated fractional frequencies for search queries on a set of items, we construct a skip list that provably provides the optimal expected search time, within nearly a factor of two. In fact, our learning-augmented skip list is still optimal up to a constant factor, even if the oracle is only accurate within a constant factor. We show that if the search queries follow the ubiquitous Zipfian distribution, then the expected search time for an item by our skip list is only a constant, independent of the total number $n$ of items, i.e., $\mathcal{O}(1)$, whereas a traditional skip list will have an expected search time of $\mathcal{O}(\log n)$. We also demonstrate robustness by showing that our data structure achieves an expected search time that is within a constant factor of an oblivious skip list construction even when the predictions are arbitrarily incorrect. Finally, we empirically show that our learning-augmented skip list outperforms traditional skip lists on both synthetic and real-world datasets.
Submitted 16 February, 2024;
originally announced February 2024.
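To make the idea in the skip-list abstract above concrete, here is a minimal sketch of a skip list whose node heights are biased by predicted query frequencies. The height rule (a random geometric height plus a bonus of roughly log2(n * predicted frequency)) and the names `FrequencySkipList`, `predicted_freq`, and `max_height` are assumptions made for illustration only; the abstract does not specify the authors' construction, and this sketch carries none of their guarantees.

```python
import math
import random


class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height  # one forward pointer per level


class FrequencySkipList:
    """Skip list whose node heights grow with a predicted query frequency.

    Hypothetical illustration: the height rule below is an assumption,
    not the construction from the paper.
    """

    def __init__(self, predicted_freq, max_height=32, seed=0):
        self.rng = random.Random(seed)
        self.max_height = max_height
        self.head = Node(None, max_height)  # sentinel, taller than any node
        n = len(predicted_freq)
        for key in sorted(predicted_freq):
            self._insert(key, self._height(predicted_freq[key], n))

    def _height(self, freq, n):
        # Classic skip-list height: geometric with parameter 1/2.
        h = 1
        while h < self.max_height and self.rng.random() < 0.5:
            h += 1
        # Assumed bonus: keys predicted to be queried more often than the
        # uniform rate 1/n are promoted by about log2(n * freq) levels.
        bonus = math.ceil(math.log2(max(freq * n, 1.0)))
        return min(self.max_height, h + bonus)

    def _insert(self, key, height):
        # Keys are assumed to be unique (they come from a dict).
        node = Node(key, height)
        cur = self.head
        for level in range(self.max_height - 1, -1, -1):
            while cur.next[level] is not None and cur.next[level].key < key:
                cur = cur.next[level]
            if level < height:
                node.next[level] = cur.next[level]
                cur.next[level] = node

    def search(self, key):
        """Return (found, steps), where steps counts forward pointer moves."""
        cur = self.head
        steps = 0
        for level in range(self.max_height - 1, -1, -1):
            while cur.next[level] is not None and cur.next[level].key < key:
                cur = cur.next[level]
                steps += 1
        candidate = cur.next[0]
        return (candidate is not None and candidate.key == key), steps
```

Calling `FrequencySkipList({k: w for k, w in ...})` with normalized Zipfian weights and then `search(k)` returns whether the key is present together with the number of forward moves, which gives a rough way to compare frequent and rare keys empirically.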
-
Permittivity Estimation in Ray-tracing Using Path Loss Data based on GAMP
Authors:
Yuanhao Jiang,
Shidong Zhou,
Xiaofeng Zhong
Abstract:
In this paper, we propose a modified Generalized Approximate Message Passing (GAMP) algorithm to estimate permittivity parameters using path loss data in a ray-tracing model.
Submitted 14 February, 2024;
originally announced February 2024.
-
Adaptive tempered reversible jump algorithm for Bayesian curve fitting
Authors:
Zhiyao Tian,
Anthony Lee,
Shunhua Zhou
Abstract:
Bayesian curve fitting plays an important role in inverse problems, and is often addressed using the Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm. However, this algorithm can be computationally inefficient without appropriately tuned proposals. As a remedy, we present an adaptive RJMCMC algorithm for the curve fitting problems by extending the Adaptive Metropolis sampler from a fixed-dimensional to a trans-dimensional case. In this presented algorithm, both the size and orientation of the proposal function can be automatically adjusted in the sampling process. Specifically, the curve fitting setting allows for the approximation of the posterior covariance of the a priori unknown function on a representative grid of points. This approximation facilitates the definition of efficient proposals. In addition, we introduce an auxiliary-tempered version of this algorithm via non-reversible parallel tempering. To evaluate the algorithms, we conduct numerical tests involving a series of controlled experiments. The results demonstrate that the adaptive algorithms exhibit significantly higher efficiency compared to the conventional ones. Even in cases where the posterior distribution is highly complex, leading to ineffective convergence in the auxiliary-tempered conventional RJMCMC, the proposed auxiliary-tempered adaptive RJMCMC performs satisfactorily. Furthermore, we present a realistic inverse example to test the algorithms. The successful application of the adaptive algorithm distinguishes it again from the conventional one that fails to converge effectively even after millions of iterations.
Submitted 27 February, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Precise and Fast LIDAR via Electrical Asynchronous Sampling Based on a Single Femtosecond Laser
Authors:
Lizong Dong,
Qinggai Mi,
Siyu Zhou,
Guanhao Wu
Abstract:
LiDAR, using a laser-based ranging method for precise environmental 3D sensing, has numerous scientific and industrial applications. However, the difficulty of simultaneously enhancing precision and update rate hinders its application in more unexpected scenarios. To this end, an optical frequency comb with a stable repetition frequency and femtosecond pulse width was used as an advanced laser source. The LiDAR performance significantly improved in the micrometer and megahertz regimes using an asynchronous sampling ranging method of electrical pulses based on a single femtosecond laser. This overcame the limitation of traditional optical sampling approaches, achieving a 38.8 $μ$m Allan deviation at an update rate of 1 MHz and 8.06 $μ$m after 2 ms time-averaging. The proposed method used a single laser for fast metrology monitoring, 1 megapixel/s 3D imaging at the meter-level non-ambiguous range, and contactless vital sign detection at the hundred-micrometer scale.
Submitted 13 February, 2024;
originally announced February 2024.
-
Interrater agreement statistics under the two-rater dichotomous-response case with correlated decisions
Authors:
Zizhong Tian,
Vernon M. Chinchilli,
Chan Shen,
Shouhao Zhou
Abstract:
Measurement of the interrater agreement (IRA) is critical in various disciplines. To correct for potential confounding chance agreement in IRA, Cohen's kappa and many other methods have been proposed. However, owing to the varied strategies and assumptions across these methods, there is a lack of practical guidelines on which of these methods should be preferred, even for the common two-rater dichotomous rating. To fill the gaps in the literature, we systematically review nine IRA methods and propose a generalized framework that can simulate the correlated decision processes behind the two raters to compare those reviewed methods under comprehensive practical scenarios. Based on the new framework, an estimand of "true" chance-corrected IRA is defined by accounting for the "probabilistic certainty" and serves as the comparison benchmark. We carry out extensive simulations to evaluate the performance of the reviewed IRA measures, and an agglomerative hierarchical clustering analysis is conducted to assess the inter-relationships among the included methods and the benchmark metric. Recommendations for selecting appropriate IRA statistics in different practical conditions are provided and the needs for further advancements in IRA estimation methodologies are emphasized.
Submitted 12 February, 2024;
originally announced February 2024.
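As a reference point for the interrater-agreement entry above, the following is a minimal sketch of Cohen's kappa for the two-rater dichotomous case, using the standard definition kappa = (p_o - p_e) / (1 - p_e). The function name `cohens_kappa` and the 0/1 encoding of ratings are assumptions for illustration; the sketch covers only this one classical statistic, not the nine methods or the benchmark estimand discussed in the abstract.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters giving 0/1 ratings to the same items."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must rate the same non-empty item set")
    n = len(ratings_a)

    # Observed agreement: fraction of items on which the raters agree.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement from each rater's marginal rate of positive ratings.
    p_a1 = sum(ratings_a) / n
    p_b1 = sum(ratings_b) / n
    p_chance = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)

    if p_chance == 1.0:
        # Both raters always give the same single label; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)
```

For example, with ten items on which the raters agree eight times and each rates half the items positive, observed agreement is 0.8, chance agreement is 0.5, and kappa is approximately 0.6.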
-
Sensor Misalignment-tolerant AUV Navigation with Passive DoA and Doppler Measurements
Authors:
Bingbing Zhang,
Shuo Liu,
Shanmin Zhou,
Daxiong Ji,
Tao Wang,
Tian Xia,
Wen Xu
Abstract:
We present a sensor misalignment-tolerant AUV navigation method that leverages measurements from an acoustic array and dead reckoned information. Recent studies have demonstrated the potential use of passive acoustic Direction of Arrival (DoA) measurements for AUV navigation without requiring ranging measurements. However, the sensor misalignment between the acoustic array and the attitude sensor was not accounted for. Such misalignment may deteriorate the navigation accuracy. This paper proposes a novel approach that allows simultaneous AUV navigation, beacon localization, and sensor alignment. An Unscented Kalman Filter (UKF) that enables the necessary calculations to be completed at an affordable computational load is developed. A Nonlinear Least Squares (NLS)-based technique is employed to find an initial solution for beacon localization and sensor alignment as early as possible using a short-term window of measurements. Experimental results demonstrate the performance of the proposed method.
Submitted 11 February, 2024;
originally announced February 2024.
-
Discriminative Adversarial Unlearning
Authors:
Rohan Sharma,
Shijie Zhou,
Kaiyi Ji,
Changyou Chen
Abstract:
We introduce a novel machine unlearning framework founded upon the established principles of the min-max optimization paradigm. We capitalize on the capabilities of strong Membership Inference Attacks (MIA) to facilitate the unlearning of specific samples from a trained model. We consider the scenario of two networks, the attacker $\mathbf{A}$ and the trained defender $\mathbf{D}$ pitted against each other in an adversarial objective, wherein the attacker aims at teasing out the information of the data to be unlearned in order to infer membership, and the defender unlearns to defend the network against the attack, whilst preserving its general performance. The algorithm can be trained end-to-end using backpropagation, following the well-known iterative min-max approach in updating the attacker and the defender. We additionally incorporate a self-supervised objective effectively addressing the feature space discrepancies between the forget set and the validation set, enhancing unlearning performance. Our proposed algorithm closely approximates the ideal benchmark of retraining from scratch for both random sample forgetting and class-wise forgetting schemes on standard machine-unlearning datasets. Specifically, on the class unlearning scheme, the method demonstrates near-optimal performance, and it comprehensively outperforms known methods on the random sample forgetting scheme across all metrics and multiple network pruning strategies.
Submitted 13 February, 2024; v1 submitted 9 February, 2024;
originally announced February 2024.
-
Resource Allocation for Channel Estimation in Reconfigurable Intelligent Surface-Aided Multi-Cell Networks
Authors:
Yining Xu,
Sheng Zhou
Abstract:
Reconfigurable intelligent surface (RIS) is a promising solution to deal with the blockage-sensitivity of the millimeter wave band and reduce the high energy consumption caused by network densification. However, deploying large-scale RISs may not bring the expected performance gain due to significant channel estimation overhead and non-negligible reflected interference. In this paper, we derive the analytical expressions of the coverage probability, area spectrum efficiency (ASE) and energy efficiency (EE) of a downlink RIS-aided multi-cell network. In order to optimize the network performance, we investigate the conditions for the optimal number of training symbols of each antenna-to-antenna and antenna-to-element path (referred to as the optimal unit training overhead) in channel estimation. Our study shows that: 1) RIS deployment is not `the more, the better'; only when blockage objects are dense should one deploy more RISs; 2) the coverage probability is maximized when the unit training overhead is designed as large as possible; 3) however, the ASE-and-EE-optimal unit training overhead exists. It is a monotonically increasing function of the frame length and a monotonically decreasing function of the average signal-to-noise-ratio (in the high signal-to-noise-ratio region). Additionally, the optimal unit training overhead is smaller when communication ends deploy particularly few or many antennas.
Submitted 8 February, 2024;
originally announced February 2024.
-
Rethinking Propagation for Unsupervised Graph Domain Adaptation
Authors:
Meihan Liu,
Zeyu Fang,
Zhen Zhang,
Ming Gu,
Sheng Zhou,
Xin Wang,
Jiajun Bu
Abstract:
Unsupervised Graph Domain Adaptation (UGDA) aims to transfer knowledge from a labelled source graph to an unlabelled target graph in order to address the distribution shifts between graph domains. Previous works have primarily focused on aligning data from the source and target graph in the representation space learned by graph neural networks (GNNs). However, the inherent generalization capability of GNNs has been largely overlooked. Motivated by our empirical analysis, we reevaluate the role of GNNs in graph domain adaptation and uncover the pivotal role of the propagation process in GNNs for adapting to different graph domains. We provide a comprehensive theoretical analysis of UGDA and derive a generalization bound for multi-layer GNNs. By formulating GNN Lipschitz for k-layer GNNs, we show that the target risk bound can be tighter by removing propagation layers in source graph and stacking multiple propagation layers in target graph. Based on the empirical and theoretical analysis mentioned above, we propose a simple yet effective approach called A2GNN for graph domain adaptation. Through extensive experiments on real-world datasets, we demonstrate the effectiveness of our proposed A2GNN framework.
Submitted 8 February, 2024;
originally announced February 2024.
-
Cosmelkology: Elko fermions in FLRW space-time
Authors:
Cheng-Yang Lee,
Haomin Rao,
Wenqi Yu,
Siyi Zhou
Abstract:
Cosmelkology is the study of Elko in cosmology. Elko is a massive spin-half field of mass dimension one. Elko differs from the Dirac and Majorana fermions because it furnishes the irreducible representation of the extended Poincare group with a two-fold Wigner degeneracy where the particle and anti-particle states both have four degrees of freedom. Elko has a renormalizable quartic self interaction which makes it a candidate for self-interacting dark matter. We study Elko in the spatially flat FLRW space-time and find exact solutions in the de Sitter space. By choosing the appropriate solutions and phases, the fields satisfy the canonical anti-commutation relations and have the correct time evolutions in the flat space limit.
Submitted 3 March, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Spinning $Q$-ball Superradiance in 3+1D
Authors:
Guo-Dong Zhang,
Fu-Ming Chang,
Paul M. Saffin,
Qi-Xin Xie,
Shuang-Yong Zhou
Abstract:
Recently, it has been found that a $Q$-ball can amplify waves incident upon it, due to rotation in the internal space and the interaction of the two modes in the complex scalar field. While the spherically symmetric 3D case has been investigated previously, here we explore the 3D axi-symmetric case, which is numerically much more challenging. The difficulty comes because a partial wave expansion is needed, and the different partial waves can not be separated, for either the background spinning Q-ball solution or the perturbative scattering on top of it. A relaxation method and a high dimensional shooting method are applied to compute the Q-ball solutions and the amplification factors respectively. We also classify the behavior of the amplification factors and we discuss their bounds and the superradiance criteria.
Submitted 26 February, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Nuclear mass table in deformed relativistic Hartree-Bogoliubov theory in continuum, II: Even-$Z$ nuclei
Authors:
DRHBc Mass Table Collaboration,
Peng Guo,
Xiaojie Cao,
Kangmin Chen,
Zhihui Chen,
Myung-Ki Cheoun,
Yong-Beom Choi,
Pak Chung Lam,
Wenmin Deng,
Jianmin Dong,
Pengxiang Du,
Xiaokai Du,
Kangda Duan,
Xiaohua Fan,
Wei Gao,
Lisheng Geng,
Eunja Ha,
Xiao-Tao He,
**niu Hu,
**gke Huang,
Kun Huang,
Yanan Huang,
Zidan Huang,
Kim Da Hyung,
Hoi Yat Chan
, et al. (58 additional authors not shown)
Abstract:
The mass table in the deformed relativistic Hartree-Bogoliubov theory in continuum (DRHBc) with the PC-PK1 density functional has been established for even-$Z$ nuclei with $8\le Z\le120$, extended from the previous work for even-even nuclei [Zhang $\it{et~al.}$ (DRHBc Mass Table Collaboration), At. Data Nucl. Data Tables 144, 101488 (2022)]. The calculated binding energies, two-nucleon and one-neutron separation energies, root-mean-square (rms) radii of neutron, proton, matter, and charge distributions, quadrupole deformations, and neutron and proton Fermi surfaces are tabulated and compared with available experimental data. A total of 4829 even-$Z$ nuclei are predicted to be bound, with an rms deviation of 1.477 MeV from the 1244 mass data. Good agreement with the available experimental odd-even mass differences, $α$ decay energies, and charge radii is also achieved. The description accuracy for nuclear masses and nucleon separation energies as well as the prediction for drip lines is compared with the results obtained from other relativistic and nonrelativistic density functionals. The comparison shows that the DRHBc theory with PC-PK1 provides an excellent microscopic description for the masses of even-$Z$ nuclei. The systematics of the nucleon separation energies, odd-even mass differences, pairing energies, two-nucleon gaps, $α$ decay energies, rms radii, quadrupole deformations, potential energy curves, neutron density distributions, and neutron mean-field potentials are discussed.
Submitted 10 June, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
The second largest eigenvalue of some nonnormal Cayley graphs on symmetric groups
Authors:
Yuxuan Li,
Binzhou Xia,
Sanming Zhou
Abstract:
A Cayley graph on the symmetric group $S_n$ is said to have the Aldous property if its strictly second largest eigenvalue (that is, the largest eigenvalue strictly smaller than the degree) is attained by the standard representation of $S_n$. For $1\leq r < k < n$, let $C(n,k;r)$ be the set of $k$-cycles of $S_n$ moving every point in $\{1, \ldots, r\}$. Recently, Siemons and Zalesski [J. Algebraic Combin. 55 (2022) 989--1005] posed a conjecture which is equivalent to saying that for any $n \ge 5$ and $1\leq r<k<n$ the nonnormal Cayley graph $\mathrm{Cay}(S_n, C(n,k;r))$ on $S_n$ with connection set $C(n,k;r)$ has the Aldous property. Solving this conjecture, we prove that all these graphs have the Aldous property except when (i) $(n, k, r) = (6, 5, 1)$ or (ii) $n$ is odd, $k = n-1$, and $1 \le r < \frac{n}{2}$. Along the way we determine all irreducible representations of $S_n$ that can achieve the strictly second largest eigenvalue of $\mathrm{Cay}(S_n, C(n,n-1;r))$ as well as the smallest eigenvalue of this graph.
Submitted 4 February, 2024;
originally announced February 2024.
-
On Secure mmWave RSMA Systems
Authors:
Hongjiang Lei,
Sha Zhou,
Xinhu Chen,
Imran Shafique Ansari,
Yun Li,
Gaofeng Pan,
Mohamed-Slim Alouini
Abstract:
This work considers a multiple-input-single-output mmWave RSMA system wherein a base station serves two users in the presence of a passive eavesdropper. Different eavesdropping scenarios are considered corresponding to the overlapped resolvable paths between the main and the wiretap channels under the considered transmission schemes. The analytical expressions for the secrecy outage probability (SOP) are derived for each scenario through the Gaussian Chebyshev quadrature method. Monte Carlo simulation results are presented to validate the correctness of the derived analytical expressions and demonstrate the effects of system parameters on the SOP of the considered mmWave RSMA systems.
Submitted 25 February, 2024; v1 submitted 3 February, 2024;
originally announced February 2024.
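The quadrature rule named in the abstract above can be illustrated in isolation. The sketch below implements the standard Gauss-Chebyshev rule of the first kind, which approximates the weighted integral of f(x) / sqrt(1 - x^2) over [-1, 1] by (pi / n) times the sum of f at the Chebyshev nodes cos((2i - 1) pi / (2n)). The function name `gauss_chebyshev` and the test integrand are assumptions for illustration; how the authors map their secrecy outage integrals onto this form is not stated in the abstract.

```python
import math


def gauss_chebyshev(f, n):
    """Approximate the integral of f(x) / sqrt(1 - x**2) over [-1, 1].

    Uses the n-point Gauss-Chebyshev rule of the first kind, whose
    weights are all pi / n and whose nodes are the Chebyshev roots.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    total = 0.0
    for i in range(1, n + 1):
        node = math.cos((2 * i - 1) * math.pi / (2 * n))
        total += f(node)
    return math.pi / n * total
```

The rule is exact for polynomial f of degree up to 2n - 1, so `gauss_chebyshev(lambda x: x * x, 4)` reproduces pi / 2 up to floating-point rounding; for non-polynomial integrands, accuracy depends on n and on the smoothness of f.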
-
Robust Multi-Task Learning with Excess Risks
Authors:
Yifei He,
Shiji Zhou,
Guojun Zhang,
Hyokun Yun,
Yi Xu,
Belinda Zeng,
Trishul Chilimbi,
Han Zhao
Abstract:
Multi-task learning (MTL) considers learning a joint model for multiple tasks by optimizing a convex combination of all task losses. To solve the optimization problem, existing methods use an adaptive weight updating scheme, where task weights are dynamically adjusted based on their respective losses to prioritize difficult tasks. However, these algorithms face a great challenge whenever label noise is present, in which case excessive weights tend to be assigned to noisy tasks that have relatively large Bayes optimal errors, thereby overshadowing other tasks and causing performance to drop across the board. To overcome this limitation, we propose Multi-Task Learning with Excess Risks (ExcessMTL), an excess risk-based task balancing method that updates the task weights by their distances to convergence instead. Intuitively, ExcessMTL assigns higher weights to worse-trained tasks that are further from convergence. To estimate the excess risks, we develop an efficient and accurate method with Taylor approximation. Theoretically, we show that our proposed algorithm achieves convergence guarantees and Pareto stationarity. Empirically, we evaluate our algorithm on various MTL benchmarks and demonstrate its superior performance over existing methods in the presence of label noise.
Submitted 14 February, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.