Search | arXiv e-print repository

One-dimensional flat bands in phosphorene nanoribbons with pentagonal nature

Authors: Shuo Sun, Jing-Yang You, Zhihao Cai, Jie Su, Tong Yang, Xinnan Peng, Yihe Wang, Daiyu Geng, Jian Gou, Yuli Huang, Sisheng Duan, Lan Chen, Kehui Wu, Andrew T. S. Wee, Yuan Ping Feng, Jia Lin Zhang, Jiong Lu, Baojie Feng, Wei Chen

Abstract: Materials with topological flat bands can serve as a promising platform to investigate strongly interacting phenomena. However, experimental realization of ideal flat bands is mostly limited to artificial lattices or moiré systems. Here we report a general way to construct one-dimensional (1D) flat bands in phosphorene nanoribbons (PNRs) with pentagonal nature: penta-hexa-PNRs and penta-dodeca-PNRs, wherein the corresponding flat bands are directly verified by using angle-resolved photoemission spectroscopy. We confirm that the observed 1D flat bands originate from the electronic 1D sawtooth and Lieb lattices, respectively, as revealed by the combination of bond-resolved scanning tunneling microscopy, scanning tunneling spectroscopy, tight-binding models, and first-principles calculations. Our study demonstrates a general way to construct 1D flat bands in 1D solid material systems, which provides a robust platform to explore strongly interacting phases of matter.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 13 pages, 4 figures

arXiv:2407.08146 [pdf, other]

Extending the Takagi-Taupin equations for x-ray nanobeam Bragg coherent diffraction

Authors: T. Zhou, M. J. Cherukara, S. Kandel, M. Allain, N. Hua, O. Shpyrko, Y. Takamura, Z. Cai, S. O. Hruszkewycz, M. V. Holt

Abstract: We present a new approach for simulating x-ray nanobeam Bragg coherent diffraction patterns based on the Takagi-Taupin equations. Compared to conventional methods, the current approach can be universally applied to any weakly strained system including semi-infinite crystals that diffract dynamically. It addresses issues such as the curved wavefront and re-divergence of the focused incident beam. We show excellent agreement with experimental data on a strained La0.7Sr0.3MnO3 thin film on a SrTiO3 substrate, and a path to extracting physical information using automatic differentiation.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07343 [pdf]

Electrically Tuning Quasi-Bound States in the Continuum with Hybrid Graphene-Silicon Metasurfaces

Authors: Ziqiang Cai, Xianzhe Zhang, Tushar Sanjay Karnik, Yihao Xu, Tae Yoon Kim, Juejun Hu, Yongmin Liu

Abstract: Metasurfaces have become one of the most prominent research topics in the field of optics owing to their unprecedented properties and novel applications on an ultrathin platform. By combining graphene with metasurfaces, electrically tunable functions can be achieved with fast tuning speed, large modulation depth and broad tuning range. However, the tuning efficiency of hybrid graphene metasurfaces within the short-wavelength infrared (SWIR) spectrum is typically low because of the small resonance wavelength shift in this wavelength range. In this work, through the integration of graphene and silicon metasurfaces that support quasi-bound states in the continuum (quasi-BIC), we experimentally demonstrate significant transmittance tuning even with less than 30 nm resonance wavelength shift thanks to the high quality factor of quasi-BIC metasurfaces. The tunable transmittance spectrum was measured using Fourier Transform Infrared Spectroscopy (FTIR) with a modified reflective lens to improve the accuracy, and the electrical tuning was realized utilizing the cut-and-stick method of ion gel. At the wavelength of 3.0 μm, the measured change of transmittance T_max-T_min and modulation depth (T_max-T_min)/T_max can reach 22.2% and 28.9%, respectively, under a small bias voltage ranging from -2 V to +2 V. To the best of our knowledge, this work is the first experimental demonstration of tunable graphene/quasi-BIC metasurfaces, which have potential applications in optical modulation, reconfigurable photonic devices, and optical communications.

Submitted 9 July, 2024; originally announced July 2024.

Comments: 14 pages, 5 figures
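
Note on the quoted figures: the two values in the abstract above, T_max-T_min = 22.2% and (T_max-T_min)/T_max = 28.9%, together determine the individual transmittance extremes. The short Python sketch below works through that arithmetic. The function name is illustrative, and the resulting T_max and T_min are inferred here from the quoted numbers; they are not values reported in the abstract.

```python
def transmittance_extremes(delta_t: float, modulation_depth: float) -> tuple[float, float]:
    """Recover (T_max, T_min) from
    delta_t          = T_max - T_min
    modulation_depth = (T_max - T_min) / T_max
    """
    if not 0.0 < modulation_depth <= 1.0:
        raise ValueError("modulation depth must lie in (0, 1]")
    t_max = delta_t / modulation_depth  # from depth = delta_t / T_max
    t_min = t_max - delta_t
    return t_max, t_min


# Values quoted in the abstract: 22.2% change, 28.9% modulation depth.
t_max, t_min = transmittance_extremes(0.222, 0.289)
print(f"T_max = {t_max:.3f}, T_min = {t_min:.3f}")  # T_max = 0.768, T_min = 0.546
```

Under this reading, the transmittance at 3.0 μm would swing between roughly 55% and 77% across the -2 V to +2 V bias range.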

arXiv:2407.04365 [pdf, ps, other]

Simulation of Spin Chains with off-diagonal Coupling Using Inchworm Method

Authors: Yixiao Sun, Geshuo Wang, Zhenning Cai

Abstract: We study the dynamical simulation of an open quantum spin chain with nearest-neighbor coupling, where each spin in the chain is associated with a harmonic bath. This is an extension of our previous work [G. Wang and Z. Cai, J. Chem. Theory Comput., 19, 8523--8540, 2023] by generalizing the application of the inchworm method and the technique of modular path integrals from diagonally coupled cases to off-diagonally coupled cases. Additionally, to reduce computational and memory cost in long-time simulations, we apply the tensor-train representation to efficiently represent the reduced density matrix of the spin chains, and employ the transfer tensor method (TTM) to avoid exponential growth of computational cost with respect to time. Abundant numerical experiments are performed to validate our method.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.03597 [pdf, other]

doi 10.3390/universe10070282

The Host Galaxy Fluxes of Active Galaxy Nuclei Are Generally Overestimated by the Flux Variation Gradient Method

Authors: Minxuan Cai, Zhen Wan, Zhenyi Cai, Lulu Fan, Junxian Wang

Abstract: In terms of the variable nature of normal active galaxy nuclei (AGN) and luminous quasars, a so-called flux variation gradient (FVG) method has been widely utilized to estimate the underlying non-variable host galaxy fluxes. The FVG method assumes an invariable AGN color, but this assumption has been questioned by the intrinsic color variation of quasars and local Seyfert galaxies. Here, using an up-to-date thermal fluctuation model to simulate multi-wavelength AGN variability, we theoretically demonstrate that the FVG method generally overestimates the host galaxy flux, and that the overestimation is more significant for brighter AGN/quasars. Furthermore, we observationally confirm that the FVG method indeed overestimates the host galaxy flux by comparing it to that estimated through other independent methods. We thus caution that the FVG method should be applied carefully in the era of time-domain astronomy.

Submitted 3 July, 2024; originally announced July 2024.

Journal ref: Universe 2024, 10, 282

arXiv:2407.03567 [pdf, ps, other]

Charmless decays of the spin-2 partner of $X(3872)$

Authors: Zu-Xin Cai, Zhao-Sai Jia, Gang Li, Shi-Dong Liu, Ju-Jun Xie

Abstract: The Belle collaboration recently reported a promising candidate for the spin-2 $D^*\bar{D}^*$ partner of the $X(3872)$, called the $X_2$ for short, having a mass of $(4014.3 \pm 4.0 \pm 1.5)~\mathrm{MeV}$ and a width of $(4 \pm 11 \pm 6)~\mathrm{MeV}$. In the present work, we assume the $X_2$ to be a pure $D^*\bar{D}^*$ molecule under three cases, i.e., pure neutral components ($θ= 0$), isospin singlet ($θ= π/4$) and neutral components dominant ($θ= π/6$), where $θ$ is a phase angle describing the proportion of neutral and charged constituents. Using an effective Lagrangian approach, we calculated the partial widths of $X_2\to VV$ and $X_2 \to PP$ ($V$ and $P$ stand for light vector and pseudoscalar mesons, respectively). The predicted decay widths of $X_2 \to VV$ can reach a few hundred $\mathrm{keV}$, while the decay widths of $X_2 \to PP$ are several tens of $\mathrm{keV}$. In addition, the effects of the proportion of neutral and charged constituents on the decay widths of $X_2\to VV$ and $PP$ are also investigated. We hope that the present calculations will be checked experimentally in the future.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 8 pages, 7 figures, comments welcome

arXiv:2407.02973 [pdf, other]

NOEMA formIng Cluster survEy (NICE): Characterizing eight massive galaxy groups at $1.5 < z < 4$ in the COSMOS field

Authors: Nikolaj B. Sillassen, Shuowen Jin, Georgios E. Magdis, Emanuele Daddi, Tao Wang, Shiying Lu, Hanwen Sun, Vinod Arumugam, Daizhong Liu, Malte Brinch, Chiara D'Eugenio, Raphael Gobat, Carlos Gómez-Guijarro, Michael Rich, Eva Schinnerer, Veronica Strazzullo, Qinghua Tan, Francesco Valentino, Yijun Wang, Mengyuan Xiao, Luwenjia Zhou, David Blánquez-Sesé, Zheng Cai, Yanmei Chen, Laure Ciesla, et al. (19 additional authors not shown)

Abstract: The NOEMA formIng Cluster survEy (NICE) is a large program targeting 69 massive galaxy group candidates at $z>2$ in six deep fields. We report spectroscopic confirmation of eight groups at $1.65\leq z\leq3.61$ in COSMOS. Homogeneously selected as significant overdensities of red IRAC sources with red Herschel colors, four groups are confirmed by CO and [CI] with NOEMA 3mm observations, three are confirmed with ALMA, and one is confirmed by H$α$ from Subaru/FMOS. We constructed the integrated FIR SEDs for the eight groups, obtaining total IR SFR $=260-1300~{\rm M_\odot}$~yr$^{-1}$. We adopted six methods to estimate the dark matter masses, including stellar mass to halo mass relations, overdensity with galaxy bias, and NFW profile fitting to radial stellar mass density. We found that the radial stellar mass densities are consistent with an NFW profile, supporting that they are collapsed structures hosted by a single dark matter halo. The best halo mass estimates are $\log(M_{\rm h}/{\rm M_\odot})=12.8-13.7$ with uncertainty of 0.3 dex. From halo mass estimates, we derive baryonic accretion rate ${\rm BAR}=(1-8)\times10^{3}\,{\rm M_{\odot}/yr}$ for this sample. We find a quasi-linear correlation between the integrated SFR/BAR and the theoretical halo mass limit for cold streams, $M_{\rm stream}/M_{\rm h}$, with ${\rm SFR/BAR}=10^{-0.46\pm0.22}\left({M_{\rm stream}/M_{\rm h}}\right)^{0.71\pm0.16}$ with a scatter of $0.40\,{\rm dex}$. Further, we compare halo masses and stellar masses with simulations, and find all structures are consistent with being progenitors of $M_{\rm h}(z=0)>10^{14}\,{\rm M_{\odot}}$ galaxy clusters, and the most massive central galaxies have stellar masses consistent with brightest cluster galaxies (BCGs) progenitors in the TNG300 simulation. The results strongly suggest these structures are forming massive galaxy clusters via baryonic and dark matter accretion.

Submitted 5 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

Comments: 44 pages (27pp appendix), 32 figures, 18 tables, accepted for publication in A&A
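
Note on the fitted relation: the scaling quoted in the abstract above, SFR/BAR = 10^(-0.46±0.22) (M_stream/M_h)^(0.71±0.16), can be evaluated directly. The Python sketch below uses only the central values of the fit; the function and parameter names are illustrative, and the quoted 0.40 dex scatter and the parameter uncertainties are deliberately not propagated.

```python
def sfr_over_bar(mstream_over_mh: float,
                 log_norm: float = -0.46,
                 slope: float = 0.71) -> float:
    """Central-value evaluation of SFR/BAR = 10**log_norm * (M_stream/M_h)**slope.

    Defaults are the best-fit values quoted in the abstract; the scatter
    (0.40 dex) and the fit uncertainties are ignored here.
    """
    if mstream_over_mh <= 0.0:
        raise ValueError("M_stream/M_h must be positive")
    return 10.0 ** log_norm * mstream_over_mh ** slope


# Example mass ratios chosen for illustration only.
for ratio in (0.1, 1.0, 10.0):
    print(f"M_stream/M_h = {ratio:5.1f}  ->  SFR/BAR = {sfr_over_bar(ratio):.3f}")
```

At M_stream/M_h = 1 the relation reduces to its normalization, 10^(-0.46), or about 0.35.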

arXiv:2407.01496 [pdf, other]

Fast Iterative Solver For Neural Network Method: II. 1D Diffusion-Reaction Problems And Data Fitting

Authors: Zhiqiang Cai, Anastassia Doktorova, Robert D. Falgout, César Herrera

Abstract: This paper expands the damped block Newton (dBN) method introduced recently in [4] for 1D diffusion-reaction equations and least-squares data fitting problems. To determine the linear parameters (the weights and bias of the output layer) of the neural network (NN), the dBN method requires solving systems of linear equations involving the mass matrix. While the mass matrix for local hat basis functions is tri-diagonal and well-conditioned, the mass matrix for NNs is dense and ill-conditioned. For example, the condition number of the NN mass matrix for quasi-uniform meshes is at least ${\cal O}(n^4)$. We present a factorization of the mass matrix that enables solving the systems of linear equations in ${\cal O}(n)$ operations. To determine the non-linear parameters (the weights and bias of the hidden layer), one step of a damped Newton method is employed at each iteration. A Gauss-Newton method is used in place of Newton for the instances in which the Hessian matrices are singular. This modified dBN is referred to as dBGN. For both methods, the computational cost per iteration is ${\cal O}(n)$. Numerical results demonstrate the ability of dBN and dBGN to efficiently achieve accurate results and outperform BFGS for select examples.

Submitted 1 July, 2024; originally announced July 2024.

MSC Class: 65K10; 65F05
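
Note on the conditioning claim: the contrast drawn in the abstract above, between a well-conditioned tri-diagonal hat-function mass matrix and a dense, ill-conditioned NN mass matrix, can be observed numerically. The Python sketch below is an illustration under stated assumptions, not the authors' method: it assumes ReLU units relu(x - b_i) with uniform breakpoints on [0, 1] as the NN basis (the abstract does not name the activation), uses closed-form integrals for the Gram entries, and does not reproduce the paper's factorization or the dBN iteration. All function names are hypothetical.

```python
import numpy as np


def hat_mass_matrix(n: int) -> np.ndarray:
    """Mass matrix of the n interior piecewise-linear hat functions on a
    uniform mesh of [0, 1] with mesh size h = 1 / (n + 1). Tri-diagonal."""
    h = 1.0 / (n + 1)
    main = np.full(n, 2.0 * h / 3.0)
    off = np.full(n - 1, h / 6.0)
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)


def relu_mass_matrix(n: int) -> np.ndarray:
    """Gram matrix M_ij = integral over [0, 1] of relu(x - b_i) * relu(x - b_j),
    with assumed uniform breakpoints b_i = i / n, i = 0..n-1.

    For b_i <= b_j, with L = 1 - b_j and d = b_j - b_i, the integral is
    L**3 / 3 + d * L**2 / 2. The matrix is dense."""
    b = np.arange(n) / n
    length = 1.0 - np.maximum.outer(b, b)    # L: overlap of the two supports
    gap = np.abs(np.subtract.outer(b, b))    # d: distance between breakpoints
    return length**3 / 3.0 + gap * length**2 / 2.0


for n in (8, 16, 32):
    cond_hat = np.linalg.cond(hat_mass_matrix(n))
    cond_relu = np.linalg.cond(relu_mass_matrix(n))
    print(f"n = {n:2d}: cond(hat) = {cond_hat:.2f}, cond(ReLU) = {cond_relu:.3e}")
```

The hat-function condition number stays below 3 for every n, while the ReLU Gram matrix grows rapidly with n, which is the behavior the abstract describes.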

arXiv:2406.19280 [pdf, other]

HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale

Authors: Junying Chen, Ruyi Ouyang, Anningzhe Gao, Shunian Chen, Guiming Hardy Chen, Xidong Wang, Ruifei Zhang, Zhenyang Cai, Ke Ji, Guangjun Yu, Xiang Wan, Benyou Wang

Abstract: The rapid development of multimodal large language models (MLLMs), such as GPT-4V, has led to significant advancements. However, these models still face challenges in medical multimodal capabilities due to limitations in the quantity and quality of medical vision-text data, stemming from data privacy concerns and high annotation costs. While pioneering approaches utilize PubMed's large-scale, de-identified medical image-text pairs to address these limitations, they still fall short due to inherent data noise. To tackle this, we refined medical image-text pairs from PubMed and employed MLLMs (GPT-4V) in an 'unblinded' capacity to denoise and reformat the data, resulting in the creation of the PubMedVision dataset with 1.3 million medical VQA samples. Our validation demonstrates that: (1) PubMedVision can significantly enhance the medical multimodal capabilities of current MLLMs, showing significant improvement in benchmarks including the MMMU Health & Medicine track; (2) manual checks by medical experts and empirical results validate the superior data quality of our dataset compared to other data construction methods. Using PubMedVision, we train a 34B medical MLLM HuatuoGPT-Vision, which shows superior performance in medical multimodal scenarios among open-source MLLMs.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.18775 [pdf, other]

Probing the cosmic web in Ly$α$ emission over large scales: an Intensity Mapping forecast for DECaLS/BASS and DESI

Authors: Pablo Renard, Daniele Spinoso, Zechang Sun, Hu Zou, Paulo Montero-Camacho, Zheng Cai

Abstract: Being the most prominent HI line, Ly$α$ permeates the cosmic web in emission. Despite its potential as a cosmological probe, its detection on large scales remains elusive. We present a new methodology to perform Ly$α$ intensity mapping with broad-band optical images, by cross-correlating them with Ly$α$ forest data using a custom one-parameter estimator. We also develop an analytical large-scale Ly$α$ emission model with two parameters (average luminosity $\langle L_{\rm Lyα} \rangle$ and bias $b_{\rm e}$) that respects observational constraints from QSO luminosity functions. We compute a forecast for DECaLS/BASS $g$-band images cross-correlated with DESI Ly$α$ forest data, setting guidelines for reducing images into Ly$α$ intensity maps. Given the transversal scales of our cross-correlation (26.4 arcmin, $\sim$33 cMpc/h), our study effectively integrates Ly$α$ emission over all the cosmic volume inside the DESI footprint at $2.2 < z < 3.4$ (the $g$-band Ly$α$ redshift range). Over the parameter space ($\langle L_{\rm Lyα} \rangle$, $b_{\rm e}$) sampled by our forecast, we find a 3$σ$ detection of large-scale structure in Ly$α$ likely, with a probability of detection of 23.95\% for DESI-DECaLS/BASS, and 54.93\% for a hypothetical DESI phase II with twice as many Ly$α$ QSOs. Without a detection, we derive upper bounds on $\langle L_{\rm Lyα} \rangle$ competitive with optimistic literature estimates ($2.3 \pm 1 \cdot 10^{\rm 41}$ erg/s/cMpc$^3$ for DESI, and $\sim$35\% lower for its hypothetical phase II). Extrapolation to the DESI-Rubin overlap shows that a detection of large-scale structure with Ly$α$ intensity mapping using next-generation imaging surveys is certain. [abridged]

Submitted 26 June, 2024; originally announced June 2024.

Comments: 26 pages, 24 figures, submitted to MNRAS. Comments welcome!

arXiv:2406.16291 [pdf, other]

Integrated Study of X-ray Spectrum and Time Lags for HBL Mrk 421 within the Framework of the Multiple-Zone Leptonic Model

Authors: Wen Hu, Jia-Lai Kang, Zhen-Yi Cai, Jun-Xian Wang, Zhen-Bo Su, Guang-Cheng Xiao

Abstract: We present the timing analysis of 10 archived XMM-Newton observations with an exposure of $>40$ ks of Markarian 421. Mrk 421 is the brightest high-frequency-peaked BL Lac object (HBL) emitting in X-rays produced by electrons accelerated in the innermost regions of a relativistic jet pointing toward us. For each observation, we construct averaged X-ray spectra in the 0.5--10 keV band, as well as 100 s binned light curves (LCs) in various subbands. During these observations, the source exhibited various intensity states differing by close to an order of magnitude in flux, with the fractional variability amplitude increasing with energy through the X-ray band. Bayesian power spectral density analysis reveals that the X-ray variability can be characterized by a colored noise, with an index ranging from $\sim-1.9$ to $-3.0$. Moreover, both the standard cross-correlation function and cross-spectral methods indicate that the amount of time lags increases with the energy difference between two compared LCs. A time-dependent two-zone jet model is developed to extract physical information from the X-ray emission of Mrk 421. In the model, we assume that the jet emission mostly comprises a quasi-stationary component and a highly variable one. Our results show that the two-zone model can simultaneously provide a satisfactory description for both the X-ray spectra and time lags observed in different epochs, with the model parameters constrained in a fully acceptable interval. We suggest that shocks within the jets may be the primary energy dissipation process responsible for triggering the rapid variability, although magnetic reconnection cannot be excluded.

Submitted 23 June, 2024; originally announced June 2024.

Comments: 33 pages, 12 figures, 6 tables; Accepted for publication in ApJ supplement series

arXiv:2406.15743 [pdf, other]

CasModaTest: A Cascaded and Model-agnostic Self-directed Framework for Unit Test Generation

Authors: Chao Ni, Xiaoya Wang, Liushan Chen, Dehai Zhao, Zhengong Cai, Shaohua Wang, Xiaohu Yang

Abstract: Though many machine learning (ML)-based unit testing generation approaches have been proposed and indeed achieved remarkable performance, they still have several limitations in effectiveness and practical usage. More precisely, existing ML-based approaches (1) generate partial content of a unit test, mainly focusing on test oracle generation; (2) mismatch the test prefix with the test oracle semantically; and (3) are highly bound to closed-source models, eventually damaging data security. We propose CasModaTest, a cascaded, model-agnostic, and end-to-end unit test generation framework, to alleviate the above limitations with two cascaded stages: test prefix generation and test oracle generation. Then, we manually build large-scale demo pools to provide CasModaTest with high-quality examples of test prefixes and test oracles. Finally, CasModaTest automatically assembles the generated test prefixes and test oracles and compiles or executes them to check their effectiveness, optionally followed by several attempts to fix the errors occurring in the compiling and executing phases. To evaluate the effectiveness of CasModaTest, we conduct large-scale experiments on a widely used dataset (Defects4J) and compare it with four state-of-the-art (SOTA) approaches by considering two performance measures. The experimental results indicate that CasModaTest outperforms all SOTAs with a substantial improvement (i.e., 60.62%-352.55% in terms of accuracy, 2.83%-87.27% in terms of focal method coverage). Besides, we also conduct experiments of CasModaTest on different open-source LLMs and find that CasModaTest can also achieve significant improvements over SOTAs (39.82%-293.96% and 9.25%-98.95% in terms of accuracy and focal method coverage, respectively) in end-to-end unit test generation.

Submitted 22 June, 2024; originally announced June 2024.

Comments: 14 pages, 7 figures

arXiv:2406.15017 [pdf, other]

Observation of Continuous Time Crystal in a Spin Maser System

Authors: Weiyu Wang, Mingjun Feng, Qian** Ma, Zi Cai, Erwei Li, Guobin Liu

Abstract: Pair interaction potentials between atoms in a crystal are in general non-monotonic in distance, with a local minimum whose position gives the lattice constant of the crystal. A temporal analogue of this idea of crystal formation is still pending despite intensive studies on the time crystal phase. In a hybrid spin maser system with a time delay feedback, we report the observation of a continuous time crystal induced by a retarded interaction with a characteristic time scale. This nonequilibrium phase features a self-sustained oscillation with an emergent frequency other than the intrinsic Larmor precession frequency of the spin maser system. It is shown that the amplitude of the oscillation is robust against perturbation, while its time phase randomly distributes from 0 to $2π$ for different realizations, a signature of spontaneous continuous time translation symmetry breaking. This CTC phase emerges only when the feedback strength exceeds a critical value, at which the system experiences a first order phase transition. Such a retarded interaction induced CTC is closer to the original idea of crystal, compared to mechanisms in other time crystal proposals.

Submitted 21 June, 2024; originally announced June 2024.

Comments: 5 pages, 5 figures

arXiv:2406.14024 [pdf, other]

LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback

Authors: Bofei Gao, Zefan Cai, Runxin Xu, Peiyi Wang, Ce Zheng, Runji Lin, Keming Lu, Dayiheng Liu, Chang Zhou, Wen Xiao, Junjie Hu, Tianyu Liu, Baobao Chang

Abstract: Mathematical verifiers achieve success in mathematical reasoning tasks by validating the correctness of solutions. However, existing verifiers are trained with binary classification labels, which are not informative enough for the model to accurately assess the solutions. To mitigate the aforementioned insufficiency of binary labels, we introduce step-wise natural language feedback as rationale labels (i.e., the correctness of the current step and the explanations). In this paper, we propose Math-Minos, a natural language feedback enhanced verifier, by constructing automatically-generated training data and a two-stage training paradigm for effective training and efficient inference. Our experiments reveal that a small set (30k) of natural language feedback can significantly boost the accuracy of the verifier by 1.6\% (86.6\% $\rightarrow$ 88.2\%) on GSM8K and 0.8\% (37.8\% $\rightarrow$ 38.6\%) on MATH. We have released our code and data for further exploration.

Submitted 8 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

Comments: 9 pages

arXiv:2406.13963 [pdf, ps, other]

SSAD: Self-supervised Auxiliary Detection Framework for Panoramic X-ray based Dental Disease Diagnosis

Authors: Zijian Cai, Xinquan Yang, Xuguang Li, Xiaoling Luo, Xuechen Li, Linlin Shen, He Meng, Yongqiang Deng

Abstract: Panoramic X-ray is a simple and effective tool for diagnosing dental diseases in clinical practice. When deep learning models are developed to assist dentists in interpreting panoramic X-rays, most of their performance suffers from the limited annotated data, whose annotation requires a dentist's expertise and a lot of time. Although self-supervised learning (SSL) has been proposed to address this challenge, the two-stage process of pretraining and fine-tuning requires even more training time and computational resources. In this paper, we present a self-supervised auxiliary detection (SSAD) framework, which is plug-and-play and compatible with any detector. It consists of a reconstruction branch and a detection branch. Both branches are trained simultaneously, sharing the same encoder, without the need for fine-tuning. The reconstruction branch learns to restore the tooth texture of healthy or diseased teeth, while the detection branch utilizes these learned features for diagnosis. To enhance the encoder's ability to capture fine-grained features, we incorporate the image encoder of SAM to construct a texture consistency (TC) loss, which extracts image embeddings from the input and output of the reconstruction branch, and then enforces both embeddings into the same feature space. Extensive experiments on the public DENTEX dataset through three detection tasks demonstrate that the proposed SSAD framework achieves state-of-the-art performance compared to mainstream object detection methods and SSL methods. The code is available at https://github.com/Dylonsword/SSAD

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.13914 [pdf, other]

The Blue Multi Unit Spectroscopic Explorer (BlueMUSE) on the VLT: science drivers and overview of instrument design

Authors: Johan Richard, Rémi Giroud, Florence Laurent, Davor Krajnović, Alexandre Jeanneau, Roland Bacon, Manuel Abreu, Angela Adamo, Ricardo Araujo, Nicolas Bouché, Jarle Brinchmann, Zhemin Cai, Norberto Castro, Ariadna Calcines, Diane Chapuis, Adélaïde Claeyssens, Luca Cortese, Emanuele Daddi, Christopher Davison, Michael Goodwin, Robert Harris, Matthew Hayes, Mathilde Jauzac, Andreas Kelz, Jean-Paul Kneib , et al. (24 additional authors not shown)

Abstract: BlueMUSE is a blue-optimised, medium spectral resolution, panoramic integral field spectrograph under development for the Very Large Telescope (VLT). With an optimised transmission down to 350 nm, spectral resolution of R$\sim$3500 on average across the wavelength range, and a large FoV (1 arcmin$^2$), BlueMUSE will open up a new range of galactic and extragalactic science cases facilitated by its specific capabilities. The BlueMUSE consortium includes 9 institutes located in 7 countries and is led by the Centre de Recherche Astrophysique de Lyon (CRAL). The BlueMUSE project development is currently in Phase A, with an expected first light at the VLT in 2031. We introduce here the Top Level Requirements (TLRs) derived from the main science cases, and then present an overview of the BlueMUSE system and its subsystems fulfilling these TLRs. We specifically emphasize the tradeoffs that are made and the key distinctions compared to the MUSE instrument, upon which the system architecture is built.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 20 pages, 10 figures, proceedings of the SPIE astronomical telescopes and instrumentation conference, Yokohama, 16-21 June

arXiv:2406.13169 [pdf, other]

A surprising excess of radio emission in extremely stable quasars: a unique clue to jet launching?

Authors: Wen-Yong Kang, Jun-Xian Wang, Zhen-Yi Cai, Hao-Chen Wang, Wen-Ke Ren, Mai Liao, Feng Yuan, Andrzej Zdziarski, Xinwu Cao

Abstract: Quasars are generally divided into jetted radio-loud and non-jetted radio-quiet ones, but why only 10% of quasars are radio loud has been puzzling for decades. Other than jet-induced phenomena, black hole mass, or Eddington ratio, prominent differences between jetted and non-jetted quasars have scarcely been detected. Here we show a unique distinction between them, and the mystery of jet launching could be disclosed by a prominent excess of radio emission in extremely stable quasars (ESQs, i.e., type 1 quasars with extremely weak variability in UV/optical over 10 years). Specifically, we find that $>$ 25% of the ESQs are detected by the FIRST/VLASS radio survey, while only $\sim$ 6-8% of the control sample, matched in redshift, luminosity, and Eddington ratio, are radio-detected. The excess of radio detection in ESQs has a significance of 4.4 $σ$ (99.9995%), and dominantly occurs at intermediate radio loudness with R $\sim$ 10 - 60. The radio detection fraction of ESQs also tends to increase in the ESQ samples selected with more stringent thresholds. Our results are in contrast to the common view that RL quasars are likely more variable in UV/optical due to jet contribution. The new clues and challenges posed by our findings highlight the importance of extensive follow-up observations to probe the nature of jets in ESQs, and theoretical studies on the link between jet launching and ESQs. Moreover, our results make ESQs, an essential population which has never been explored, unique targets in the burgeoning era of time domain astronomy, like their opposite counterparts of quasars exhibiting extreme variability or changing-look features.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 11 pages, 16 figures, Accepted by ApJ

arXiv:2406.11533 [pdf, other]

High-Dimensional Subspace Expansion Using Classical Shadows

Authors: Gregory Boyd, Bálint Koczor, Zhenyu Cai

Abstract: We introduce a post-processing technique for classical shadow measurement data that enhances the precision of ground state estimation through high-dimensional subspace expansion; the dimensionality is only limited by the amount of classical post-processing resources rather than by quantum resources. Crucial steps of our approach are the efficient identification of useful observables from shadow data, followed by our regularised subspace expansion that is designed to be numerically stable even when using noisy data. We analytically investigate noise propagation within our method, and upper bound the statistical fluctuations due to the limited number of snapshots in classical shadows. In numerical simulations, our method can achieve a reduction in the energy estimation errors in many cases, sometimes by more than an order of magnitude. We also demonstrate that our performance improvements are robust against both coherent errors (bad initial state) and gate noise in the state-preparation circuits. Furthermore, performance is guaranteed to be at least as good - and in many cases better - than direct energy estimation without using additional quantum resources and the approach is thus a very natural alternative for estimating ground state energies directly from classical shadow data.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 13 pages, 6 figures

arXiv:2406.11116 [pdf]

Grammaticality Representation in ChatGPT as Compared to Linguists and Laypeople

Authors: Zhuang Qiu, Xufeng Duan, Zhenguang G. Cai

Abstract: Large language models (LLMs) have demonstrated exceptional performance across various linguistic tasks. However, it remains uncertain whether LLMs have developed human-like fine-grained grammatical intuition. This preregistered study (https://osf.io/t5nes) presents the first large-scale investigation of ChatGPT's grammatical intuition, building upon a previous study that collected laypeople's grammatical judgments on 148 linguistic phenomena that linguists judged to be grammatical, ungrammatical, or marginally grammatical (Sprouse, Schutze, & Almeida, 2013). Our primary focus was to compare ChatGPT with both laypeople and linguists in the judgement of these linguistic constructions. In Experiment 1, ChatGPT assigned ratings to sentences based on a given reference sentence. Experiment 2 involved rating sentences on a 7-point scale, and Experiment 3 asked ChatGPT to choose the more grammatical sentence from a pair. Overall, our findings demonstrate convergence rates ranging from 73% to 95% between ChatGPT and linguists, with an overall point-estimate of 89%. Significant correlations were also found between ChatGPT and laypeople across all tasks, though the correlation strength varied by task. We attribute these results to the psychometric nature of the judgment tasks and the differences in language processing styles between humans and LLMs.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 23 pages

arXiv:2406.10163 [pdf, other]

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

Authors: Yiwen Chen, Tong He, Di Huang, Weicai Ye, Si** Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang

Abstract: Recently, 3D assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement. However, this potential is largely unrealized because these assets always need to be converted to meshes for 3D industry applications, and the meshes produced by current mesh extraction methods are significantly inferior to Artist-Created Meshes (AMs), i.e., meshes created by human artists. Specifically, current mesh extraction methods rely on dense faces and ignore geometric features, leading to inefficiencies, complicated post-processing, and lower representation quality. To address these issues, we introduce MeshAnything, a model that treats mesh extraction as a generation problem, producing AMs aligned with specified shapes. By converting 3D assets in any 3D representation into AMs, MeshAnything can be integrated with various 3D asset production methods, thereby enhancing their application across the 3D industry. The architecture of MeshAnything comprises a VQ-VAE and a shape-conditioned decoder-only transformer. We first learn a mesh vocabulary using the VQ-VAE, then train the shape-conditioned decoder-only transformer on this vocabulary for shape-conditioned autoregressive mesh generation. Our extensive experiments show that our method generates AMs with hundreds of times fewer faces, significantly improving storage, rendering, and simulation efficiencies, while achieving precision comparable to previous methods.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Project Page: https://buaacyw.github.io/mesh-anything/ Code: https://github.com/buaacyw/MeshAnything

arXiv:2406.08443 [pdf, other]

Transformation-Dependent Adversarial Attacks

Authors: Yaoteng Tan, Zikui Cai, M. Salman Asif

Abstract: We introduce transformation-dependent adversarial attacks, a new class of threats where a single additive perturbation can trigger diverse, controllable mis-predictions by systematically transforming the input (e.g., scaling, blurring, compression). Unlike traditional attacks with static effects, our perturbations embed metamorphic properties to enable different adversarial attacks as a function of the transformation parameters. We demonstrate the transformation-dependent vulnerability across models (e.g., convolutional networks and vision transformers) and vision tasks (e.g., image classification and object detection). Our proposed geometric and photometric transformations enable a range of targeted errors from one crafted input (e.g., higher than 90% attack success rate for classifiers). We analyze effects of model architecture and type/variety of transformations on attack effectiveness. This work forces a paradigm shift by redefining adversarial inputs as dynamic, controllable threats. We highlight the need for robust defenses against such multifaceted, chameleon-like perturbations that current techniques are ill-prepared for.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.06541 [pdf, other]

Global and Local Attention-based Inception U-Net for Static IR Drop Estimation

Authors: Yilu Chen, Zhijie Cai, Min Wei, Zhifeng Lin, Jianli Chen

Abstract: Static IR drop analysis is a fundamental and critical task in chip design since the IR drop will significantly affect the design's functionality, performance, and reliability. However, the process of IR drop analysis can be time-consuming, potentially taking several hours. Furthermore, in the process of fixing violations, it is frequently imperative to do IR drop analysis iteratively, hence exacerbating the computational burden associated with the analysis. Therefore, a fast and accurate IR drop prediction is paramount for reducing the overall time invested in chip design. In this paper, we propose a global and local attention-based Inception U-Net for static IR drop estimation. Our U-Net incorporates components from the Transformer, CBAM, and Inception architectures to enhance its feature capture capability at different scales and improve the accuracy of predicted IR drop. Experimental results demonstrate that our proposed algorithm can achieve the best results among the winning teams of the ICCAD 2023 contest and the state-of-the-art algorithms.

Submitted 27 April, 2024; originally announced June 2024.

Comments: 7 pages, 8 figures

arXiv:2406.05707 [pdf, other]

QGEval: A Benchmark for Question Generation Evaluation

Authors: Wei** Fu, Bifan Wei, Jianxiang Hu, Zhongmin Cai, Jun Liu

Abstract: Automatically generated questions often suffer from problems such as unclear expression or factual inaccuracies, requiring a reliable and comprehensive evaluation of their quality. Human evaluation is frequently used in the field of question generation (QG) and is one of the most accurate evaluation methods. It also serves as the standard for automatic metrics. However, there is a lack of unified evaluation criteria, which hampers the development of both QG technologies and automatic evaluation methods. To address this, we propose QGEval, a multi-dimensional Evaluation benchmark for Question Generation, which evaluates both generated questions and existing automatic metrics across 7 dimensions: fluency, clarity, conciseness, relevance, consistency, answerability, and answer consistency. We demonstrate the appropriateness of these dimensions by examining their correlations and distinctions. Analysis with QGEval reveals that 1) most QG models perform unsatisfactorily in terms of answerability and answer consistency, and 2) existing metrics fail to align well with human assessments when evaluating generated questions across the 7 dimensions. We expect this work to foster the development of both QG technologies and automatic metrics for QG.

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.04702 [pdf, other]

Marking the Pace: A Blockchain-Enhanced Privacy-Traceable Strategy for Federated Recommender Systems

Authors: Zhen Cai, Tao Tang, Shuo Yu, Yunpeng Xiao, Feng Xia

Abstract: Federated recommender systems have been crucially enhanced through data sharing and continuous model updates, attributed to the pervasive connectivity and distributed computing capabilities of Internet of Things (IoT) devices. Given the sensitivity of IoT data, transparent data processing in data sharing and model updates is paramount. However, existing methods fall short in tracing the flow of shared data and the evolution of model updates. Consequently, data sharing is vulnerable to exploitation by malicious entities, raising significant data privacy concerns, while excluding data sharing will result in sub-optimal recommendations. To mitigate these concerns, we present LIBERATE, a privacy-traceable federated recommender system. We design a blockchain-based traceability mechanism, ensuring data privacy during data sharing and model updates. We further enhance privacy protection by incorporating local differential privacy in user-server communication. Extensive evaluations with a real-world dataset corroborate LIBERATE's capabilities in ensuring data privacy during data sharing and model updates while maintaining efficiency and performance. The results underscore the blockchain-based traceability mechanism as a promising solution for privacy preservation in federated recommender systems.

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2406.04249 [pdf, other]

Conv-INR: Convolutional Implicit Neural Representation for Multimodal Visual Signals

Authors: Zhicheng Cai

Abstract: Implicit neural representation (INR) has recently emerged as a promising paradigm for signal representations. Typically, INR is parameterized by a multilayer perceptron (MLP) which takes the coordinates as the inputs and generates corresponding attributes of a signal. However, MLP-based INRs face two critical issues: i) individually considering each coordinate while ignoring the connections; ii) suffering from the spectral bias, thus failing to learn high-frequency components. Since target visual signals usually exhibit strong local structures and neighborhood dependencies, and high-frequency components are significant in these signals, these issues harm the representational capacity of INRs. This paper proposes Conv-INR, the first INR model fully based on convolution. Due to the inherent attributes of convolution, Conv-INR can simultaneously consider adjacent coordinates and learn high-frequency components effectively. Compared to existing MLP-based INRs, Conv-INR has better representational capacity and trainability without requiring primary function expansion. We conduct extensive experiments on four tasks, including image fitting, CT/MRI reconstruction, and novel view synthesis; Conv-INR significantly surpasses existing MLP-based INRs on all of them, validating its effectiveness. Finally, we propose three reparameterization methods that can further enhance the performance of the vanilla Conv-INR without introducing any extra inference cost.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.04178 [pdf, other]

Encoding Semantic Priors into the Weights of Implicit Neural Representation

Authors: Zhicheng Cai, Qiu Shen

Abstract: Implicit neural representation (INR) has recently emerged as a promising paradigm for signal representations, which takes coordinates as inputs and generates corresponding signal values. Since these coordinates contain no semantic features, INR fails to take any semantic information into consideration. However, semantic information has been proven critical in many vision tasks, especially for visual signal representation. This paper proposes a reparameterization method termed SPW, which encodes the semantic priors into the weights of INR, thus making INR contain semantic information implicitly and enhancing its representational capacity. Specifically, SPW uses the Semantic Neural Network (SNN) to extract both low- and high-level semantic information of the target visual signal and generates the semantic vector, which is input into the Weight Generation Network (WGN) to generate the weights of the INR model. Finally, INR uses the generated weights with semantic priors to map the coordinates to the signal values. After training, we only retain the generated weights while abandoning both SNN and WGN, thus SPW introduces no extra costs in inference. Experimental results show that SPW can improve the performance of various INR models significantly on various tasks, including image fitting, CT reconstruction, MRI reconstruction, and novel view synthesis. Further experiments illustrate that a model with SPW has lower weight redundancy and learns more novel representations, validating the effectiveness of SPW.

Submitted 6 June, 2024; originally announced June 2024.

Comments: ICME 2024

arXiv:2406.02599 [pdf, other]

Privacy-Aware Randomized Quantization via Linear Programming

Authors: Zhongteng Cai, Xueru Zhang, Mohammad Mahdi Khalili

Abstract: Differential privacy mechanisms such as the Gaussian or Laplace mechanism have been widely used in data analytics for preserving individual privacy. However, they are mostly designed for continuous outputs and are unsuitable for scenarios where discrete values are necessary. Although various quantization mechanisms were proposed recently to generate discrete outputs under differential privacy, the outcomes are either biased or have an inferior accuracy-privacy trade-off. In this paper, we propose a family of quantization mechanisms that is unbiased and differentially private. It has a high degree of freedom and we show that some existing mechanisms can be considered as special cases of ours. To find the optimal mechanism, we formulate a linear optimization that can be solved efficiently using linear programming tools. Experiments show that our proposed mechanism can attain a better privacy-accuracy trade-off compared to baselines.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2406.02575 [pdf, other]

Cross-Modal Safety Alignment: Is textual unlearning all you need?

Authors: Trishna Chakraborty, Erfan Shayegani, Zikui Cai, Nael Abu-Ghazaleh, M. Salman Asif, Yue Dong, Amit K. Roy-Chowdhury, Chengyu Song

Abstract: Recent studies reveal that integrating new modalities into Large Language Models (LLMs), such as Vision-Language Models (VLMs), creates a new attack surface that bypasses existing safety training techniques like Supervised Fine-tuning (SFT) and Reinforcement Learning with Human Feedback (RLHF). While further SFT and RLHF-based safety training can be conducted in multi-modal settings, collecting multi-modal training datasets poses a significant challenge. Inspired by the structural design of recent multi-modal models, where, regardless of the combination of input modalities, all inputs are ultimately fused into the language space, we aim to explore whether unlearning solely in the textual domain can be effective for cross-modality safety alignment. Our evaluation across six datasets empirically demonstrates the transferability -- textual unlearning in VLMs significantly reduces the Attack Success Rate (ASR) to less than 8% and in some cases, even as low as nearly 2% for both text-based and vision-text-based attacks, alongside preserving the utility. Moreover, our experiments show that unlearning with a multi-modal dataset offers no potential benefits but incurs significantly increased computational demands, possibly up to 6 times higher.

Submitted 27 May, 2024; originally announced June 2024.

arXiv:2406.02397 [pdf, other]

One-arm Probabilities for Metric Graph Gaussian Free Fields below and at the Critical Dimension

Authors: Zhenhao Cai, Jian Ding

Abstract: For the critical level-set of the Gaussian free field on the metric graph of $\mathbb Z^d$, we consider the one-arm probability $θ_d(N)$, i.e., the probability that the boundary of a box of side length $2N$ is connected to the center. We prove that $θ_d(N)$ is $O(N^{-\frac{d}{2}+1})$ for $3\le d\le 5$, and is $N^{-2+o(1)}$ for $d=6$. Our upper bounds match the lower bounds in a previous work by Ding and Wirth up to a constant factor for $3\le d\le 5$, and match the exponent therein for $d=6$. Combined with our previous result that $θ_d(N) \asymp N^{-2}$ for $d>6$, this seems to present the first percolation model whose one-arm probabilities are essentially completely understood in all dimensions. In particular, these results fully confirm Werner's conjectures (2021) on the one-arm exponents: \begin{equation*} \text{(1) for}\ 3\le d<d_c=6,\ θ_d(N)=N^{-\frac{d}{2}+o(1)};\ \text{(2) for}\ d>d_c,\ θ_d(N)=N^{-2+o(1)}. \end{equation*} Prior to our work, Drewitz, Prévost and Rodriguez obtained upper bounds for $d\in \{3, 4\}$, which are very sharp although lose some diverging factors. In the same work, they conjectured that $θ_{d_c}(N) = N^{-2+o(1)}$, which is now established. In addition, in a recent concurrent work, Drewitz, Prévost and Rodriguez independently obtained the up-to-constant upper bound for $d=3$.

Submitted 12 July, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.02069 [pdf, other]

PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling

Authors: Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, Baobao Chang, Junjie Hu, Wen Xiao

Abstract: In this study, we investigate whether attention-based information flow inside large language models (LLMs) is aggregated through noticeable patterns for long context processing. Our observations reveal that LLMs aggregate information through Pyramidal Information Funneling, where attention is scattered widely in lower layers, progressively consolidates within specific contexts, and ultimately focuses on critical tokens (a.k.a. massive activation or attention sink) in higher layers. Motivated by these insights, we developed PyramidKV, a novel and effective KV cache compression method. This approach dynamically adjusts the KV cache size across different layers, allocating more cache in lower layers and less in higher ones, diverging from traditional methods that maintain a uniform KV cache size. Our experimental evaluations, utilizing the LongBench benchmark, show that PyramidKV matches the performance of models with a full KV cache while retaining only 12% of the KV cache, thus significantly reducing memory usage. In scenarios emphasizing memory efficiency, where only 0.7% of the KV cache is maintained, PyramidKV surpasses other KV cache compression techniques, achieving up to a 20.5 absolute accuracy improvement on TREC.

Submitted 16 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.01843 [pdf, other]

L-MAGIC: Language Model Assisted Generation of Images with Coherence

Authors: Zhipeng Cai, Matthias Mueller, Reiner Birkl, Diana Wofk, Shao-Yen Tseng, JunDa Cheng, Gabriela Ben-Melech Stan, Vasudev Lal, Michael Paulitsch

Abstract: In the current era of generative AI breakthroughs, generating panoramic scenes from a single input image remains a key challenge. Most existing methods use diffusion-based iterative or simultaneous multi-view inpainting. However, the lack of global scene layout priors leads to subpar outputs with duplicated objects (e.g., multiple beds in a bedroom) or requires time-consuming human text inputs for each view. We propose L-MAGIC, a novel method leveraging large language models for guidance while diffusing multiple coherent views of 360 degree panoramic scenes. L-MAGIC harnesses pre-trained diffusion and language models without fine-tuning, ensuring zero-shot performance. The output quality is further enhanced by super-resolution and multi-view fusion techniques. Extensive experiments demonstrate that the resulting panoramic scenes feature better scene layouts and perspective view rendering quality compared to related works, with >70% preference in human evaluations. Combined with conditional diffusion models, L-MAGIC can accept various input modalities, including but not limited to text, depth maps, sketches, and colored scripts. Applying depth estimation further enables 3D point cloud generation and dynamic scene exploration with fluid camera motion. Code is available at https://github.com/IntelLabs/MMPano. The video presentation is available at https://youtu.be/XDMNEzH4-Ec?list=PLG9Zyvu7iBa0-a7ccNLO8LjcVRAoMn57s.

Submitted 3 June, 2024; originally announced June 2024.

Comments: accepted to CVPR 2024

arXiv:2406.01391 [pdf, other]

Knowledge Graph in Astronomical Research with Large Language Models: Quantifying Driving Forces in Interdisciplinary Scientific Discovery

Authors: Zechang Sun, Yuan-Sen Ting, Yaobo Liang, Nan Duan, Song Huang, Zheng Cai

Abstract: Identifying and predicting the factors that contribute to the success of interdisciplinary research is crucial for advancing scientific discovery. However, there is a lack of methods to quantify the integration of new ideas and technological advancements in astronomical research and how these new technologies drive further scientific breakthroughs. Large language models, with their ability to extract key concepts from vast literature beyond keyword searches, provide a new tool to quantify such processes. In this study, we extracted concepts in astronomical research from 297,807 publications between 1993 and 2024 using large language models, resulting in a set of 24,939 concepts. These concepts were then used to form a knowledge graph, where the link strength between any two concepts was determined by their relevance through the citation-reference relationships. By calculating this relevance across different time periods, we quantified the impact of numerical simulations and machine learning on astronomical research. The knowledge graph demonstrates two phases of development: a phase where the technology was integrated and another where the technology was explored in scientific discovery. The knowledge graph reveals that although machine learning has made considerable inroads into astronomy, there is currently a lack of new concept development at the intersection of AI and astronomy, which may be the current bottleneck preventing machine learning from further transforming the field of astronomy.

Submitted 15 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

Comments: An interactive version of the knowledge graph is made publicly available at https://astrokg.github.io/. Accepted to IJCAI 2024 AI4Research Workshop. Comments are welcome

arXiv:2405.20343 [pdf, other]

Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image

Authors: Kailu Wu, Fangfu Liu, Zhihan Cai, Runjie Yan, Hanyang Wang, Yating Hu, Yueqi Duan, Kaisheng Ma

Abstract: In this work, we introduce Unique3D, a novel image-to-3D framework for efficiently generating high-quality 3D meshes from single-view images, featuring state-of-the-art generation fidelity and strong generalizability. Previous methods based on Score Distillation Sampling (SDS) can produce diversified 3D results by distilling 3D knowledge from large 2D diffusion models, but they usually suffer from long per-case optimization time with inconsistent issues. Recent works address the problem and generate better 3D results either by finetuning a multi-view diffusion model or training a fast feed-forward model. However, they still lack intricate textures and complex geometries due to inconsistency and limited generated resolution. To simultaneously achieve high fidelity, consistency, and efficiency in single image-to-3D, we propose a novel framework Unique3D that includes a multi-view diffusion model with a corresponding normal diffusion model to generate multi-view images with their normal maps, a multi-level upscale process to progressively improve the resolution of generated orthographic multi-views, as well as an instant and consistent mesh reconstruction algorithm called ISOMER, which fully integrates the color and geometric priors into mesh results. Extensive experiments demonstrate that our Unique3D significantly outperforms other image-to-3D baselines in terms of geometric and textural details.

Submitted 13 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Comments: Project page: https://wukailu.github.io/Unique3D

ACM Class: I.2.10

arXiv:2405.19730 [pdf]

Research on Foundation Model for Spatial Data Intelligence: China's 2024 White Paper on Strategic Development of Spatial Data Intelligence

Authors: Shaohua Wang, Xing Xie, Yong Li, Danhuai Guo, Zhi Cai, Yu Liu, Yang Yue, Xiao Pan, Feng Lu, Huayi Wu, Zhipeng Gui, Zhiming Ding, Bolong Zheng, Fuzheng Zhang, Tao Qin, Jingyuan Wang, Chuang Tao, Zhengchao Chen, Hao Lu, Jiayi Li, Hongyang Chen, Peng Yue, Wenhao Yu, Yao Yao, Leilei Sun, et al. (9 additional authors not shown)

Abstract: This report focuses on spatial data intelligent large models, delving into the principles, methods, and cutting-edge applications of these models. It provides an in-depth discussion on the definition, development history, current status, and trends of spatial data intelligent large models, as well as the challenges they face. The report systematically elucidates the key technologies of spatial data intelligent large models and their applications in urban environments, aerospace remote sensing, geography, transportation, and other scenarios. Additionally, it summarizes the latest application cases of spatial data intelligent large models in themes such as urban development, multimodal systems, remote sensing, smart transportation, and resource environments. Finally, the report concludes with an overview and outlook on the development prospects of spatial data intelligent large models.

Submitted 29 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Comments: in Chinese language

arXiv:2405.19256 [pdf, other]

Weak Generative Sampler to Efficiently Sample Invariant Distribution of Stochastic Differential Equation

Authors: Zhiqiang Cai, Yu Cao, Yuanfei Huang, Xiang Zhou

Abstract: Sampling invariant distributions from an Ito diffusion process presents a significant challenge in stochastic simulation. Traditional numerical solvers for stochastic differential equations require both a fine step size and a lengthy simulation period, resulting in both biased and correlated samples. Current deep learning-based method solves the stationary Fokker--Planck equation to determine the invariant probability density function in form of deep neural networks, but they generally do not directly address the problem of sampling from the computed density function. In this work, we introduce a framework that employs a weak generative sampler (WGS) to directly generate independent and identically distributed (iid) samples induced by a transformation map derived from the stationary Fokker--Planck equation. Our proposed loss function is based on the weak form of the Fokker--Planck equation, integrating normalizing flows to characterize the invariant distribution and facilitate sample generation from the base distribution. Our randomized test function circumvents the need for mini-max optimization in the traditional weak formulation. Distinct from conventional generative models, our method neither necessitates the computationally intensive calculation of the Jacobian determinant nor the invertibility of the transformation map. A crucial component of our framework is the adaptively chosen family of test functions in the form of Gaussian kernel functions with centres selected from the generated data samples. Experimental results on several benchmark examples demonstrate the effectiveness of our method, which offers both low computational costs and excellent capability in exploring multiple metastable states.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 24 pages, 10 figures

arXiv:2405.17792 [pdf, other]

JUNO Sensitivity to Invisible Decay Modes of Neutrons

Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Kai Adamowicz, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, Daniel Bick, et al. (635 additional authors not shown)

Abstract: We explore the bound neutrons decay into invisible particles (e.g., $n\rightarrow 3 ν$ or $nn \rightarrow 2 ν$) in the JUNO liquid scintillator detector. The invisible decay includes two decay modes: $ n \rightarrow { inv} $ and $ nn \rightarrow { inv} $. The invisible decays of $s$-shell neutrons in $^{12}{\rm C}$ will leave a highly excited residual nucleus. Subsequently, some de-excitation modes of the excited residual nuclei can produce a time- and space-correlated triple coincidence signal in the JUNO detector. Based on a full Monte Carlo simulation informed with the latest available data, we estimate all backgrounds, including inverse beta decay events of the reactor antineutrino $\barν_e$, natural radioactivity, cosmogenic isotopes and neutral current interactions of atmospheric neutrinos. Pulse shape discrimination and multivariate analysis techniques are employed to further suppress backgrounds. With two years of exposure, JUNO is expected to give an order of magnitude improvement compared to the current best limits. After 10 years of data taking, the JUNO expected sensitivities at a 90% confidence level are $τ/B( n \rightarrow { inv} ) > 5.0 \times 10^{31} \, {\rm yr}$ and $τ/B( nn \rightarrow { inv} ) > 1.4 \times 10^{32} \, {\rm yr}$.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 28 pages, 7 figures, 4 tables

arXiv:2405.16075 [pdf, other]

Continuous Temporal Domain Generalization

Authors: Zekun Cai, Guangji Bai, Renhe Jiang, Xuan Song, Liang Zhao

Abstract: Temporal Domain Generalization (TDG) addresses the challenge of training predictive models under temporally varying data distributions. Traditional TDG approaches typically focus on domain data collected at fixed, discrete time intervals, which limits their capability to capture the inherent dynamics within continuous-evolving and irregularly-observed temporal domains. To overcome this, this work formalizes the concept of Continuous Temporal Domain Generalization (CTDG), where domain data are derived from continuous times and are collected at arbitrary times. CTDG tackles critical challenges including: 1) Characterizing the continuous dynamics of both data and models, 2) Learning complex high-dimensional nonlinear dynamics, and 3) Optimizing and controlling the generalization across continuous temporal domains. To address them, we propose a Koopman operator-driven continuous temporal domain generalization (Koodos) framework. We formulate the problem within a continuous dynamic system and leverage the Koopman theory to learn the underlying dynamics; the framework is further enhanced with a comprehensive optimization strategy equipped with analysis and control driven by prior knowledge of the dynamics patterns. Extensive experiments demonstrate the effectiveness and efficiency of our approach.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.15222 [pdf, other]

Leveraging Unknown Objects to Construct Labeled-Unlabeled Meta-Relationships for Zero-Shot Object Navigation

Authors: Yanwei Zheng, Changrui Li, Chuanlin Lan, Yaling Li, Xiao Zhang, Yifei Zou, Dongxiao Yu, Zhipeng Cai

Abstract: Zero-shot object navigation (ZSON) addresses situation where an agent navigates to an unseen object that does not present in the training set. Previous works mainly train agent using seen objects with known labels, and ignore the seen objects without labels. In this paper, we introduce seen objects without labels, herein termed as ``unknown objects'', into training procedure to enrich the agent's knowledge base with distinguishable but previously overlooked information. Furthermore, we propose the label-wise meta-correlation module (LWMCM) to harness relationships among objects with and without labels, and obtain enhanced objects information. Specially, we propose target feature generator (TFG) to generate the features representation of the unlabeled target objects. Subsequently, the unlabeled object identifier (UOI) module assesses whether the unlabeled target object appears in the current observation frame captured by the camera and produces an adapted target features representation specific to the observed context. In meta contrastive feature modifier (MCFM), the target features is modified via approaching the features of objects within the observation frame while distancing itself from features of unobserved objects. Finally, the meta object-graph learner (MOGL) module is utilized to calculate the relationships among objects based on the features. Experiments conducted on AI2THOR and RoboTHOR platforms demonstrate the effectiveness of our proposed method.

Submitted 26 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.15205 [pdf, other]

Enhancing Generalized Fetal Brain MRI Segmentation using A Cascade Network with Depth-wise Separable Convolution and Attention Mechanism

Authors: Zhigao Cai, Xing-Ming Zhao

Abstract: Automatic segmentation of the fetal brain is still challenging due to the health state of fetal development, motion artifacts, and variability across gestational ages, since existing methods rely on high-quality datasets of healthy fetuses. In this work, we propose a novel cascade network called CasUNext to enhance the accuracy and generalization of fetal brain MRI segmentation. CasUNext incorporates depth-wise separable convolution, attention mechanisms, and a two-step cascade architecture for efficient high-precision segmentation. The first network localizes the fetal brain region, while the second network focuses on detailed segmentation. We evaluate CasUNext on 150 fetal MRI scans between 20 to 36 weeks from two scanners made by Philips and Siemens including axial, coronal, and sagittal views, and also validated on a dataset of 50 abnormal fetuses. Results demonstrate that CasUNext achieves improved segmentation performance compared to U-Nets and other state-of-the-art approaches. It obtains an average Dice coefficient of 96.1% and mean intersection over union of 95.9% across diverse scenarios. CasUNext shows promising capabilities for handling the challenges of multi-view fetal MRI and abnormal cases, which could facilitate various quantitative analyses and apply to multi-site data.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.15115 [pdf, other]

Towards Better Understanding of In-Context Learning Ability from In-Context Uncertainty Quantification

Authors: Shang Liu, Zhongze Cai, Guanting Chen, Xiaocheng Li

Abstract: Predicting simple function classes has been widely used as a testbed for developing theory and understanding of the trained Transformer's in-context learning (ICL) ability. In this paper, we revisit the training of Transformers on linear regression tasks, and different from all the existing literature, we consider a bi-objective prediction task of predicting both the conditional expectation $\mathbb{E}[Y|X]$ and the conditional variance Var$(Y|X)$. This additional uncertainty quantification objective provides a handle to (i) better design out-of-distribution experiments to distinguish ICL from in-weight learning (IWL) and (ii) make a better separation between the algorithms with and without using the prior information of the training distribution. Theoretically, we show that the trained Transformer reaches near Bayes-optimum, suggesting the usage of the information of the training distribution. Our method can be extended to other cases. Specifically, with the Transformer's context window $S$, we prove a generalization bound of $\tilde{\mathcal{O}}(\sqrt{\min\{S, T\}/(n T)})$ on $n$ tasks with sequences of length $T$, providing sharper analysis compared to previous results of $\tilde{\mathcal{O}}(\sqrt{1/n})$. Empirically, we illustrate that while the trained Transformer behaves as the Bayes-optimal solution as a natural consequence of supervised training in distribution, it does not necessarily perform a Bayesian inference when facing task shifts, in contrast to the \textit{equivalence} between these two proposed in many existing literature. We also demonstrate the trained Transformer's ICL ability over covariates shift and prompt-length shift and interpret them as a generalization over a meta distribution.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14216 [pdf, other]

Emergence of spatial patterns and synchronization in superconducting time crystals

Authors: Bo Fan, Zi Cai, Antonio M. García-García

Abstract: We identify a time crystal phase characterized by a frequency half of the driving frequency in disordered superconductors by employing the time dependent Bogoliubov-de Gennes formalism at zero temperature with a periodically driven coupling constant. After a period of exponential increase of spatial inhomogeneities and exponential suppression of the order parameter amplitude, the time crystal develops islands of different sizes. Each of these islands is a time crystal with the same frequency albeit with a phase shift $π$ with respect to the homogeneous time crystal. After its emergence, the island gradually becomes smaller, though the phase shift persists, until it is abruptly synchronized at a time that it depends on its initial size. We find a critical disorder strength, still deep in the metallic phase, at which the time crystal phase terminates. For even stronger disorder, the order parameter oscillates with the driving frequency in regions where localization effects are not important.

Submitted 20 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: 30 pages, 23 figures

arXiv:2405.14195 [pdf, other]

Enhanced Object Tracking by Self-Supervised Auxiliary Depth Estimation Learning

Authors: Zhenyu Wei, Yujie He, Zhanchuan Cai

Abstract: RGB-D tracking significantly improves the accuracy of object tracking. However, its dependency on real depth inputs and the complexity involved in multi-modal fusion limit its applicability across various scenarios. The utilization of depth information in RGB-D tracking inspired us to propose a new method, named MDETrack, which trains a tracking network with an additional capability to understand the depth of scenes, through supervised or self-supervised auxiliary Monocular Depth Estimation learning. The outputs of MDETrack's unified feature extractor are fed to the side-by-side tracking head and auxiliary depth estimation head, respectively. The auxiliary module will be discarded in inference, thus keeping the same inference speed. We evaluated our models with various training strategies on multiple datasets, and the results show an improved tracking accuracy even without real depth. Through these findings we highlight the potential of depth estimation in enhancing object tracking performance.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.13113 [pdf, other]

MAMMOTH-Subaru. II. Diverse Populations of Circumgalactic Ly$α$ Nebulae at Cosmic Noon

Authors: Mingyu Li, Haibin Zhang, Zheng Cai, Yongming Liang, Nobunari Kashikawa, Ke Ma, Xiaohui Fan, J. Xavier Prochaska, Bjorn H. C. Emonts, Xin Wang, Yunjing Wu, Shiwu Zhang, Qiong Li, Sean D. Johnson, Minghao Yue, Fabrizio Arrigoni Battaia, Sebastiano Cantalupo, Joseph F. Hennawi, Satoshi Kikuta, Yuanhang Ning, Masami Ouchi, Rhythm Shimakawa, Ben Wang, Weichen Wang, Zheng Zheng, et al. (1 additional author not shown)

Abstract: Circumgalactic Lyman-alpha (Ly$α$) nebulae are gaseous halos around galaxies exhibiting luminous extended Ly$α$ emission. This work investigates Ly$α$ nebulae from deep imaging of $\sim12~\mathrm{deg}^2$ sky, targeted by the MAMMOTH-Subaru survey. Utilizing the wide-field capability of Hyper Suprime-Cam (HSC), we present one of the largest blind Ly$α$ nebula selections, including QSO nebulae, Ly$α$ blobs, and radio galaxy nebulae down to typical $2σ$ Ly$α$ surface brightness of $(5-10)\times10^{-18}\mathrm{~erg~s^{-1}~cm^{-2}~arcsec^{-2}}$. The sample contains 117 nebulae with Ly$α$ sizes of 40 - 400 kpc, and the most gigantic one spans about 365 kpc, referred to as the Ivory Nebula. Combining with multiwavelength data, we investigate diverse nebula populations and associated galaxies. We find a small fraction of Ly$α$ nebulae have QSOs ($\sim7\%$), luminous infrared galaxies ($\sim1\%$), and radio galaxies ($\sim 2\%$). Remarkably, among the 28 enormous Ly$α$ nebulae (ELANe) exceeding 100 kpc, about $80\%$ are associated with UV-faint galaxies ($M_\mathrm{UV} > -22$), categorized as Type II ELANe. We underscore that Type II ELANe constitute the majority but remain largely hidden in current galaxy and QSO surveys. Dusty starburst and obscured AGN activity are proposed to explain the nature of Type II ELANe. The SED of stacking all Ly$α$ nebulae also reveals signs of massive dusty star-forming galaxies with obscured AGNs. We propose a model to explain the dusty nature where the diverse populations of Ly$α$ nebula capture massive galaxies at different evolutionary stages undergoing violent assembling. Ly$α$ nebulae provide critical insights into the formation and evolution of today's massive cluster galaxies at cosmic noon.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 26 pages, 10 figures, 3 tables, submitted to ApJS, comments welcome

arXiv:2405.11750 [pdf, other]

The Intermediate-Mass Black Hole Reverberation Mapping Project: Initial Results for a candidate IMBH in a nearby Seyfert 1 Galaxy

Authors: Wenwen Zuo, Hengxiao Guo, Jingbo Sun, Qi Yuan, Paulina Lira, Minfeng Gu, Philip G. Edwards, Alok C. Gupta, Shubham Kishore, Jamie Stevens, Tao An, Zhen-Yi Cai, Haicheng Feng, Luis C. Ho, Dragana Ilić, Andjelka B. Kovačević, ShaSha Li, Mar Mezcua, Luka Č. Popović, Mouyuan Sun, Tushar Tripathi, Vivian U., Oliver Vince, Jianguo Wang, Junxian Wang, et al. (3 additional authors not shown)

Abstract: To investigate the short-term variability and determine the size of the optical continuum emitting size of intermediate-mass black holes (IMBHs), we carried out high-cadence, multi-band photometric monitoring of a Seyfert 1 galaxy J0249-0815 across two nights, together with a one-night single-band preliminary test. The presence of the broad Ha component in our target was confirmed by recent Palomar/P200 spectroscopic observations, 23 years after Sloan Digital Sky Survey, ruling out the supernovae origin of the broad Ha line. The photometric experiment was primarily conducted utilizing four-channel imagers MuSCAT 3 & 4 mounted on 2-meter telescopes within the Las Cumbres Observatory Global Telescope Network. Despite the expectation of variability, we observed no significant variation (<1.4%) on timescales of 6-10 hours. This non-detection is likely due to substantial host galaxy light diluting the subtle AGN variability. Dual-band preliminary tests and tailored simulations may enhance the possibility of detecting variability and lag in future IMBH reverberation campaigns.

Submitted 19 May, 2024; originally announced May 2024.

Comments: 14 pages, 6 figures, submitted to ApJ, comments welcome

arXiv:2405.09153 [pdf, other]

Adapting Abstract Meaning Representation Parsing to the Clinical Narrative -- the SPRING THYME parser

Authors: Jon Z. Cai, Kristin Wright-Bettner, Martha Palmer, Guergana K. Savova, James H. Martin

Abstract: This paper is dedicated to the design and evaluation of the first AMR parser tailored for clinical notes. Our objective was to facilitate the precise transformation of the clinical notes into structured AMR expressions, thereby enhancing the interpretability and usability of clinical text data at scale. Leveraging the colon cancer dataset from the Temporal Histories of Your Medical Events (THYME) corpus, we adapted a state-of-the-art AMR parser utilizing continuous training. Our approach incorporates data augmentation techniques to enhance the accuracy of AMR structure predictions. Notably, through this learning strategy, our parser achieved an impressive F1 score of 88% on the THYME corpus's colon cancer dataset. Moreover, our research delved into the efficacy of data required for domain adaptation within the realm of clinical notes, presenting domain adaptation data requirements for AMR parsing. This exploration not only underscores the parser's robust performance but also highlights its potential in facilitating a deeper understanding of clinical narratives through structured semantic representations.

Submitted 15 May, 2024; originally announced May 2024.

Comments: Accepted to the 6th Clinical NLP Workshop at NAACL, 2024

arXiv:2405.08977 [pdf, other]

Constraints on the variation of the fine-structure constant at 3<z<10 with JWST emission-line galaxies

Authors: Linhua Jiang, Shuqi Fu, Feige Wang, Sarah E. I. Bosman, Zheng Cai, Hyunsung D. Jun, Zhiwei Pan, Fengwu Sun, Jinyi Yang, Huanian Zhang

Abstract: We present constraints on the spacetime variation of the fine-structure constant $α$ at redshifts $3<z<10$ using JWST emission-line galaxies. The galaxy sample consists of 572 high-quality spectra with strong and narrow [O III] $λλ$4959,5007 doublet emission lines from 522 galaxies, including 267 spectra at $z>5$. The [O III] doublet lines are arguably the best emission lines to probe the variation in $α$. We divide our sample into 5 subsamples based on redshift and calculate the relative variation $Δα/α$ for the individual subsamples. The calculated $Δα/α$ values are consistent with zero within $1σ$ at all redshifts, suggesting no time variation in $α$ above a level of $(1-2) \times10^{-4}$ ($1σ$) in the past 13.2 billion years. When the whole sample is combined, the constraint is improved to be $Δα/α= (0.4\pm0.7) \times10^{-4}$. We further test the spatial variation in $α$ using four subsamples of galaxies in four different directions on the sky. The measured $Δα/α$ values are consistent with zero at a $1σ$ level of $\sim10^{-4}$. While the constraints in this work are not as stringent as those from lower-redshift quasar absorption lines in previous studies, this work uses an independent tracer and provides the first constraints on $Δα/α$ at the highest redshifts. Our analyses also indicate that the relative wavelength calibration of the JWST spectra is robust. With the growing number of emission-line galaxies from JWST, we expect to achieve stronger constraints in the future.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 9 pages, 6 figures, submitted to ApJ

arXiv:2405.08588 [pdf, ps, other]

Sharing Quantum Steering via Standard Projective Measurements

Authors: Shufen Dong, Zinuo Cai, Chunfeng Wu, Changliang Ren

Abstract: We propose a scheme for the sharing of quantum steering among three observers, Alice, Bob, and Charlie using standard projective measurements. We show that in the unilateral sequential scenario, Alice can steer Bob's and Charlie's states and conversely, Bob and Charlie can steer Alice's state. Unlike the quantum steering sharing achieved through weak measurements, we use the standard projective measurements to enable quantum steering sharing. Quantum steering is demonstrated by the violations of the linear steering inequality among different observer combinations. We find that Alice can simultaneously steer both Bob's and Charlie's states, and Bob and Charlie can simultaneously steer Alice's state, regardless of whether they are in maximally entangled states or partially entangled states. The maximum double violation of the linear steering inequalities obtained from partially entangled states can be greater in some cases than that obtained from maximally entangled states when randomly combining the case of two projective measurements and the case of two identity measurements. Additionally, we verify hybrid quantum correlation sharing through the double violation of the Clauser-Horne-Shimony-Holt (CHSH) inequality and the linear steering inequality. Our results provide a new perspective for the study of quantum steering and may lead to applications in quantum random access code, randomness certification, and self-testing process.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.05231

DiskGNN: Bridging I/O Efficiency and Model Accuracy for Out-of-Core GNN Training

Authors: Renjie Liu, Yichuan Wang, Xiao Yan, Zhenkun Cai, Minjie Wang, Haitian Jiang, Bo Tang, Jinyang Li

Abstract: Graph neural networks (GNNs) are machine learning models specialized for graph data and widely used in many applications. To train GNNs on large graphs that exceed CPU memory, several systems store data on disk and conduct out-of-core processing. However, these systems suffer from either read amplification when reading node features that are usually smaller than a disk page or degraded model accuracy by treating the graph as disconnected partitions. To close this gap, we build a system called DiskGNN, which achieves high I/O efficiency and thus fast training without hurting model accuracy. The key technique used by DiskGNN is offline sampling, which helps decouple graph sampling from model computation. In particular, by conducting graph sampling beforehand, DiskGNN acquires the node features that will be accessed by model computation, and such information is utilized to pack the target node features contiguously on disk to avoid read amplification. Besides, DiskGNN also adopts designs including a four-level feature store to fully utilize the memory hierarchy to cache node features and reduce disk access, batched packing to accelerate the feature packing process, and pipelined training to overlap disk access with other operations. We compare DiskGNN with Ginex and MariusGNN, which are state-of-the-art systems for out-of-core GNN training. The results show that DiskGNN can speed up the baselines by over 8x while matching their best model accuracy.
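The offline-sampling idea described in the abstract can be sketched in a few lines. This is a toy, in-memory illustration only, not DiskGNN's actual implementation: the function names, the list standing in for the on-disk file, and the per-batch duplication of features are all assumptions made for this example.

```python
import random

def sample_batches(adjacency, seed_batches, fanout, rng):
    """Offline step: for each mini-batch of seed nodes, record every node
    whose feature the model would read (seeds plus sampled neighbours)."""
    accessed = []
    for seeds in seed_batches:
        nodes = []
        for s in seeds:
            nbrs = adjacency[s]
            nodes.append(s)
            nodes.extend(rng.sample(nbrs, min(fanout, len(nbrs))))
        # De-duplicate while keeping first-access order.
        accessed.append(list(dict.fromkeys(nodes)))
    return accessed

def pack_features(features, accessed):
    """Lay out each batch's features contiguously. `packed` is a list
    standing in for the on-disk file (hypothetical layout); `extents`
    holds one (offset, length) pair per mini-batch."""
    packed, extents = [], []
    for nodes in accessed:
        extents.append((len(packed), len(nodes)))
        packed.extend(features[n] for n in nodes)
    return packed, extents

def read_batch(packed, extents, i):
    """Training step: one contiguous read returns all features of batch i."""
    offset, length = extents[i]
    return packed[offset:offset + length]

if __name__ == "__main__":
    adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    features = {n: [float(n)] for n in adjacency}
    accessed = sample_batches(adjacency, [[0, 1], [2, 3]], 2, random.Random(0))
    packed, extents = pack_features(features, accessed)
    print(extents, read_batch(packed, extents, 0))
```

Because sampling happens before training, the access pattern of every mini-batch is known when the layout is built, so each batch can be served by a single sequential read rather than many small page-sized reads.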

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.03781

doi 10.3847/1538-4357/ad488a

Large Scale Overdensity of Lyman Break Galaxies Around the z=6.3 Ultraluminous Quasar J0100+2802

Authors: Maria Pudoka, Feige Wang, Xiaohui Fan, Jinyi Yang, Jaclyn Champagne, Victoria Jones, Fuyan Bian, Zheng Cai, Linhua Jiang, Dezi Liu, Xue-Bing Wu

Abstract: We study the environment of the z=6.33 ultraluminous quasar SDSS J010013.02+280225.8 (J0100) to understand its association with large-scale structure. Theoretical models propose high-redshift quasars as markers of galaxy overdensities residing in the most massive dark matter halos (DMHs) in the early universe. J0100 is an ultraluminous quasar with the most massive black hole known at z>6, suggesting a high likelihood of residing in a massive DMH. We present wide-field ($\sim$522 square arcminute) imaging in the r-, i-, and z-bands from the Large Binocular Camera on the Large Binocular Telescope, with Y- and J-band imaging from the Wide-field Infrared Camera on the Canada-France-Hawaii Telescope, centered on J0100. Applying color selections, we identify 23 objects as i-dropout Lyman Break Galaxy (LBG) candidates in the J0100 field. We use the deep photometric catalog in the 1.27 square degree COSMOS field to calculate the density of LBGs in a blank field, and to estimate the selection completeness and purity. The observed surface density of LBG candidates in the J0100 field corresponds to a galaxy overdensity of $\delta = 4$ (at $8.4\sigma$). This large-scale overdensity suggests that the $\sim$22 square arcminute overdensity found by Kashino et al. using JWST data extends out to much larger scales. We calculate the angular auto-correlation function of the candidates and find a positive correlation on $\lesssim$10 arcminute scales as well as evidence of asymmetries in their spatial distribution, further suggesting a direct detection of large-scale structure around the ultraluminous quasar J0100.
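For readers unfamiliar with the overdensity figure quoted in the abstract, the conventional definition is sketched below. The symbols $\Sigma_{\rm obs}$ (observed surface density of candidates) and $\Sigma_{\rm exp}$ (blank-field expectation) are assumed notation that does not appear in the abstract, and the paper's exact estimator, including its completeness and purity corrections, may differ.

```latex
\[
  \delta \;=\; \frac{\Sigma_{\rm obs} - \Sigma_{\rm exp}}{\Sigma_{\rm exp}}
  \qquad\Longrightarrow\qquad
  \frac{\Sigma_{\rm obs}}{\Sigma_{\rm exp}} \;=\; 1 + \delta \;=\; 5
  \quad \text{for } \delta = 4 .
\]
```

Under this definition, $\delta = 4$ means the J0100 field holds roughly five times the surface density of candidates expected in a blank field.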

Submitted 6 May, 2024; originally announced May 2024.

Comments: 21 pages, 11 figures, 3 tables, to be published in The Astrophysical Journal (ApJ)

arXiv:2405.03076

Traffic Performance GPT (TP-GPT): Real-Time Data Informed Intelligent ChatBot for Transportation Surveillance and Management

Authors: Bingzhang Wang, Zhiyu Cai, Muhammad Monjurul Karim, Chenxi Liu, Yinhai Wang

Abstract: The digitization of traffic sensing infrastructure has produced an extensive traffic data warehouse, which presents unprecedented challenges for transportation analytics. The complexities associated with querying large-scale multi-table databases require specialized programming expertise and labor-intensive development. Additionally, traditional analysis methods have focused mainly on numerical data, often neglecting the semantic aspects that could enhance interpretability and understanding. Furthermore, real-time traffic data access is typically limited due to privacy concerns. To bridge this gap, the integration of Large Language Models (LLMs) into the domain of traffic management presents a transformative approach to addressing the complexities and challenges inherent in modern transportation systems. This paper proposes an intelligent online chatbot, TP-GPT, for efficient customized transportation surveillance and management empowered by a large real-time traffic database. The innovative framework leverages contextual and generative intelligence of language models to generate accurate SQL queries and natural language interpretations by employing transportation-specialized prompts, Chain-of-Thought prompting, few-shot learning, a multi-agent collaboration strategy, and chat memory. An experimental study demonstrates that our approach outperforms state-of-the-art baselines such as GPT-4 and PaLM 2 on a challenging traffic-analysis benchmark, TransQuery. TP-GPT would aid researchers and practitioners in real-time transportation surveillance and management in a privacy-preserving, equitable, and customizable manner.

Submitted 5 May, 2024; originally announced May 2024.

Comments: 8 pages, 5 figures, submitted to 27th IEEE International Conference on Intelligent Transportation Systems (IEEE ITSC 2024)

Showing 1–50 of 989 results for author: Cai, Z