Search | arXiv e-print repository

arXiv:2405.19207 [pdf]

A Multi-Source Retrieval Question Answering Framework Based on RAG

Authors: Ridong Wu, Shuhong Chen, Xiangbiao Su, Yuankai Zhu, Yifei Liao, Jianming Wu

Abstract: With the rapid development of large-scale language models, Retrieval-Augmented Generation (RAG) has been widely adopted. However, existing RAG paradigms are inevitably influenced by erroneous retrieval information, thereby reducing the reliability and correctness of generated results. Therefore, to improve the relevance of retrieval information, this study proposes a method that replaces traditional retrievers with GPT-3.5, leveraging its vast corpus knowledge to generate retrieval information. We also propose a web retrieval based method to implement fine-grained knowledge retrieval, utilizing the powerful reasoning capability of GPT-3.5 to realize semantic partitioning of the problem. In order to mitigate the hallucination of GPT retrieval and reduce noise in web retrieval, we propose a multi-source retrieval framework, named MSRAG, which combines GPT retrieval with web retrieval. Experiments on multiple knowledge-intensive QA datasets demonstrate that the proposed framework performs better than existing RAG frameworks in enhancing the overall efficiency and accuracy of QA systems.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 4 pages, 3 figures

arXiv:2405.18816 [pdf, other]

Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching

Authors: Yasi Zhang, Peiyu Yu, Yaxuan Zhu, Yingshan Chang, Feng Gao, Ying Nian Wu, Oscar Leong

Abstract: Generative models based on flow matching have attracted significant attention for their simplicity and superior performance in high-resolution image synthesis. By leveraging the instantaneous change-of-variables formula, one can directly compute image likelihoods from a learned flow, making them enticing candidates as priors for downstream tasks such as inverse problems. In particular, a natural approach would be to incorporate such image probabilities in a maximum-a-posteriori (MAP) estimation problem. A major obstacle, however, lies in the slow computation of the log-likelihood, as it requires backpropagating through an ODE solver, which can be prohibitively slow for high-dimensional problems. In this work, we propose an iterative algorithm to approximate the MAP estimator efficiently to solve a variety of linear inverse problems. Our algorithm is mathematically justified by the observation that the MAP objective can be approximated by a sum of $N$ "local MAP" objectives, where $N$ is the number of function evaluations. By leveraging Tweedie's formula, we show that we can perform gradient steps to sequentially optimize these objectives. We validate our approach for various linear inverse problems, such as super-resolution, deblurring, inpainting, and compressed sensing, and demonstrate that we can outperform other methods based on flow matching.

Submitted 29 May, 2024; originally announced May 2024.
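
For readers unfamiliar with the two tools named in this abstract, their standard textbook forms are sketched below. The notation ($v_\theta$, $x_t$, $p_t$, $y$, $\sigma$) is an assumption made here for illustration and is not taken from the paper. The first line is the instantaneous change-of-variables formula for a flow $\dot{x}_t = v_\theta(x_t, t)$; the second is Tweedie's formula for a Gaussian-corrupted observation $y = x + \sigma\varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$, where $p(y)$ denotes the marginal density of $y$.

```latex
\begin{align}
  \frac{\mathrm{d}}{\mathrm{d}t} \log p_t(x_t) &= -\nabla \cdot v_\theta(x_t, t), \\
  \mathbb{E}[x \mid y] &= y + \sigma^2 \, \nabla_y \log p(y).
\end{align}
```

How the paper combines these into its "local MAP" objectives is not stated in the abstract, so no attempt is made to reproduce that here.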

arXiv:2405.18721 [pdf, other]

doi 10.1109/TPAMI.2024.3407759

Correctable Landmark Discovery via Large Models for Vision-Language Navigation

Authors: Bingqian Lin, Yunshuang Nie, Ziming Wei, Yi Zhu, Hang Xu, Shikui Ma, Jianzhuang Liu, Xiaodan Liang

Abstract: Vision-Language Navigation (VLN) requires the agent to follow language instructions to reach a target position. A key factor for successful navigation is to align the landmarks implied in the instruction with diverse visual observations. However, previous VLN agents fail to perform accurate modality alignment, especially in unexplored scenes, since they learn from limited navigation data and lack sufficient open-world alignment knowledge. In this work, we propose a new VLN paradigm, called COrrectable LaNdmark DiScOvery via Large ModEls (CONSOLE). In CONSOLE, we cast VLN as an open-world sequential landmark discovery problem by introducing a novel correctable landmark discovery scheme based on two large models, ChatGPT and CLIP. Specifically, we use ChatGPT to provide rich open-world landmark cooccurrence commonsense, and conduct CLIP-driven landmark discovery based on these commonsense priors. To mitigate the noise in the priors due to the lack of visual constraints, we introduce a learnable cooccurrence scoring module, which corrects the importance of each cooccurrence according to actual observations for accurate landmark discovery. We further design an observation enhancement strategy for an elegant combination of our framework with different VLN agents, where we utilize the corrected landmark features to obtain enhanced observation features for action decision. Extensive experimental results on multiple popular VLN benchmarks (R2R, REVERIE, R4R, RxR) show the significant superiority of CONSOLE over strong baselines. In particular, CONSOLE establishes new state-of-the-art results on R2R and R4R in unseen scenarios. Code is available at https://github.com/expectorlin/CONSOLE.

Submitted 5 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

Comments: Accepted by TPAMI 2024

arXiv:2405.18470 [pdf, other]

5-25 $μ$m Galaxy Number Counts from Deep JWST Data

Authors: Meredith A. Stone, Stacey Alberts, George H. Rieke, Andrew J. Bunker, Jianwei Lyu, Pablo G. Pérez-González, Irene Shivaei, Yongda Zhu

Abstract: Galaxy number counts probe the evolution of galaxies over cosmic time, and serve as a valuable comparison point to theoretical models of galaxy formation. We present new galaxy number counts in eight photometric bands between 5 and 25 $μ$m from the Systematic Mid-infrared Instrument Legacy Extragalactic Survey (SMILES) and the JWST Advanced Deep Extragalactic Survey (JADES) deep MIRI parallel, extending to unprecedented depth. By combining our new MIRI counts with existing data from Spitzer and AKARI, we achieve counts across 3-5 orders of magnitude in flux in all MIRI bands. Our counts diverge from predictions from recent semi-analytical models of galaxy formation, likely owing to their treatment of mid-infrared aromatic features. Finally, we integrate our combined JWST-Spitzer counts at 8 and 24 $μ$m to measure the cosmic infrared background (CIB) light at these wavelengths; our measured CIB fluxes are consistent with those from previous mid-infrared surveys, but larger than predicted by models based on TeV blazar data.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 13 pages, 4 figures, 2 tables. Submitted to ApJ

arXiv:2405.18462 [pdf, other]

JWST/MIRI photometric detection at $7.7\ μ\mathrm{m}$ of the stellar continuum and nebular emission in a galaxy at $z > 14$

Authors: Jakob M. Helton, George H. Rieke, Stacey Alberts, Zihao Wu, Daniel J. Eisenstein, Kevin N. Hainline, Stefano Carniani, Zhiyuan Ji, William M. Baker, Rachana Bhatawdekar, Andrew J. Bunker, Phillip A. Cargile, Stéphane Charlot, Jacopo Chevallard, Francesco D'Eugenio, Eiichi Egami, Benjamin D. Johnson, Gareth C. Jones, Jianwei Lyu, Roberto Maiolino, Pablo G. Pérez-González, Marcia J. Rieke, Brant Robertson, Aayush Saxena, Jan Scholtz, et al. (9 additional authors not shown)

Abstract: The James Webb Space Telescope (JWST) has spectroscopically confirmed numerous galaxies at $z > 10$. While weak rest-ultraviolet emission lines have only been seen in a handful of sources, the stronger rest-optical emission lines are highly diagnostic and accessible at mid-infrared wavelengths with the Mid-Infrared Instrument (MIRI) of JWST. We report the photometric detection of the most distant spectroscopically confirmed galaxy JADES-GS-z14-0 at $z = 14.32^{+0.08}_{-0.20}$ with MIRI at $7.7\ μ\mathrm{m}$. The most plausible solution for the stellar population properties is that this galaxy contains half a billion solar masses in stars with a strong burst of star formation in the most recent few million years. For this model, at least one-third of the flux at $7.7\ μ\mathrm{m}$ comes from the rest-optical emission lines $\mathrm{H}β$ and/or $\mathrm{[OIII]}λλ4959,5007$. The inferred properties of JADES-GS-z14-0 suggest rapid mass assembly and metal enrichment during the earliest phases of galaxy formation.

Submitted 28 May, 2024; originally announced May 2024.

Comments: Submitted; main text has 9 pages, 3 figures and 1 table; extended text has 13 pages, 5 figures, and 1 table

arXiv:2405.17778 [pdf, other]

Synthetic non-Abelian charges in degenerate Fermi gases

Authors: Qi-Dong Wang, Yan-Qing Zhu, Shi-Liang Zhu, Zhen Zheng

Abstract: Topological phases associated with non-Abelian charges can exhibit a distinguished bulk-edge correspondence compared to Abelian phases, although elucidating this relationship remains challenging in traditional solid-state systems. In this paper, we propose a theoretical framework for synthesizing non-Abelian quaternion charges in degenerate Fermi gases. By designing artificial spin-orbit coupling patterns, the topological edge modes demonstrate a clear correspondence with the band topology determined by various quaternion charges. This paves the way for observing the interface modes whose existence is attributed to the non-conservation multiplication relation, which is fundamental to non-Abelian charges. This scheme can be readily implemented using current ultracold atom techniques, offering a promising approach to explore the intriguing non-Abelian characteristics of the system.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 8 pages, 6 figures

arXiv:2405.17201 [pdf, other]

Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View

Authors: ** Wang, Shichao Dong, Yapeng Zhu, Kelu Yao, Weidong Zhao, Chao Li, ** Luo

Abstract: Compositional reasoning capabilities are usually considered as fundamental skills to characterize human perception. Recent studies show that current Vision Language Models (VLMs) surprisingly lack sufficient knowledge with respect to such capabilities. To this end, we propose to thoroughly diagnose the composition representations encoded by VLMs, systematically revealing the potential cause for this weakness. Specifically, we propose evaluation methods from a novel game-theoretic view to assess the vulnerability of VLMs on different aspects of compositional understanding, e.g., relations and attributes. Extensive experimental results demonstrate and validate several insights to understand the incapabilities of VLMs on compositional reasoning, which provide useful and reliable guidance for future studies. The deliverables will be updated at https://vlms-compositionality-gametheory.github.io/.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 21 pages, 8 figures

arXiv:2405.16851 [pdf, other]

Temporal Spiking Neural Networks with Synaptic Delay for Graph Reasoning

Authors: Mingqing Xiao, Yixin Zhu, Di He, Zhouchen Lin

Abstract: Spiking neural networks (SNNs) are investigated as biologically inspired models of neural computation, distinguished by their computational capability and energy efficiency due to precise spiking times and sparse spikes with event-driven computation. A significant question is how SNNs can emulate human-like graph-based reasoning of concepts and relations, especially leveraging the temporal domain optimally. This paper reveals that SNNs, when amalgamated with synaptic delay and temporal coding, are proficient in executing (knowledge) graph reasoning. It is elucidated that spiking time can function as an additional dimension to encode relation properties via a neural-generalized path formulation. Empirical results highlight the efficacy of temporal delay in relation processing and showcase exemplary performance in diverse graph reasoning tasks. The spiking model is theoretically estimated to achieve $20\times$ energy savings compared to non-spiking counterparts, deepening insights into the capabilities and potential of biologically inspired SNNs for efficient reasoning. The code is available at https://github.com/pkuxmq/GRSNN.

Submitted 27 May, 2024; originally announced May 2024.

Comments: Accepted by ICML 2024

arXiv:2405.16577 [pdf, other]

Reflected Flow Matching

Authors: Tianyu Xie, Yu Zhu, Longlin Yu, Tong Yang, Ziheng Cheng, Shiyue Zhang, Xiangyu Zhang, Cheng Zhang

Abstract: Continuous normalizing flows (CNFs) learn an ordinary differential equation to transform prior samples into data. Flow matching (FM) has recently emerged as a simulation-free approach for training CNFs by regressing a velocity model towards the conditional velocity field. However, on constrained domains, the learned velocity model may lead to undesirable flows that result in highly unnatural samples, e.g., oversaturated images, due to both flow matching error and simulation error. To address this, we add a boundary constraint term to CNFs, which leads to reflected CNFs that keep trajectories within the constrained domains. We propose reflected flow matching (RFM) to train the velocity model in reflected CNFs by matching the conditional velocity fields in a simulation-free manner, similar to the vanilla FM. Moreover, the analytical form of conditional velocity fields in RFM avoids potentially biased approximations, making it superior to existing score-based generative models on constrained domains. We demonstrate that RFM achieves comparable or better results on standard image benchmarks and produces high-quality class-conditioned samples under high guidance weight.

Submitted 26 May, 2024; originally announced May 2024.

Comments: ICML 2024 camera-ready
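
As a reminder of what "regressing a velocity model towards the conditional velocity field" means in vanilla flow matching, a minimal sketch of the usual conditional objective follows. It assumes the linear interpolation path $x_t = (1-t)\,x_0 + t\,x_1$ between a prior sample $x_0$ and a data sample $x_1$, which is one common choice and is not stated in the abstract; the symbols $v_\theta$, $p_0$, and $p_{\mathrm{data}}$ are likewise illustrative.

```latex
\begin{align}
  x_t &= (1 - t)\, x_0 + t\, x_1, \qquad t \sim \mathcal{U}[0, 1], \\
  \mathcal{L}_{\mathrm{CFM}}(\theta) &=
    \mathbb{E}_{t,\; x_0 \sim p_0,\; x_1 \sim p_{\mathrm{data}}}
    \bigl\| v_\theta(x_t, t) - (x_1 - x_0) \bigr\|^2 .
\end{align}
```

The reflected variant replaces this conditional velocity field with one that respects the domain boundary; its analytical form is not given in the abstract and is not reproduced here.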

arXiv:2405.16440 [pdf, other]

MambaTS: Improved Selective State Space Models for Long-term Time Series Forecasting

Authors: Xiuding Cai, Yaoyao Zhu, Xueyao Wang, Yu Yao

Abstract: In recent years, Transformers have become the de-facto architecture for long-term sequence forecasting (LTSF), but face challenges such as quadratic complexity and permutation-invariant bias. A recent model, Mamba, based on selective state space models (SSMs), has emerged as a competitive alternative to Transformers, offering comparable performance with higher throughput and linear complexity with respect to sequence length. In this study, we analyze the limitations of current Mamba in LTSF and propose four targeted improvements, leading to MambaTS. We first introduce variable scan along time to arrange the historical information of all the variables together. We suggest that causal convolution in Mamba is not necessary for LTSF and propose the Temporal Mamba Block (TMB). We further incorporate a dropout mechanism for selective parameters of TMB to mitigate model overfitting. Moreover, we tackle the issue of variable scan order sensitivity by introducing variable permutation training. We further propose variable-aware scan along time to dynamically discover variable relationships during training and decode the optimal variable scan order by solving the shortest path visiting all nodes problem during inference. Extensive experiments conducted on eight public datasets demonstrate that MambaTS achieves new state-of-the-art performance.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16381 [pdf, other]

Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups

Authors: Yuchen Zhu, Tianrong Chen, Lingkai Kong, Evangelos A. Theodorou, Molei Tao

Abstract: The generative modeling of data on manifolds is an important task, for which diffusion models in flat spaces typically need nontrivial adaptations. This article demonstrates how a technique called 'trivialization' can transfer the effectiveness of diffusion models in Euclidean spaces to Lie groups. In particular, an auxiliary momentum variable was algorithmically introduced to help transport the position variable between data distribution and a fixed, easy-to-sample distribution. Normally, this would incur further difficulty for manifold data because momentum lives in a space that changes with the position. However, our trivialization technique creates a new momentum variable that stays in a simple $\textbf{fixed vector space}$. This design, together with a manifold preserving integrator, simplifies implementation and avoids inaccuracies created by approximations such as projections to tangent space and manifold, which were typically used in prior work, hence facilitating generation with high fidelity and efficiency. The resulting method achieves state-of-the-art performance on protein and RNA torsion angle generation and sophisticated torus datasets. We also, arguably for the first time, tackle the generation of data on high-dimensional Special Orthogonal and Unitary groups, the latter essential for quantum problems.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.16178 [pdf, other]

Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection

Authors: Yun Zhu, Jia-Chen Gu, Caitlin Sikora, Ho Ko, Yinxiao Liu, Chu-Cheng Lin, Lei Shu, Liangchen Luo, Lei Meng, Bang Liu, **dong Chen

Abstract: Large language models (LLMs) augmented with retrieval exhibit robust performance and extensive versatility by incorporating external contexts. However, the input length grows linearly in the number of retrieved documents, causing a dramatic increase in latency. In this paper, we propose a novel paradigm named Sparse RAG, which seeks to cut computation costs through sparsity. Specifically, Sparse RAG encodes retrieved documents in parallel, which eliminates latency introduced by long-range attention of retrieved documents. Then, LLMs selectively decode the output by only attending to highly relevant caches auto-regressively, which are chosen via prompting LLMs with special control tokens. It is notable that Sparse RAG combines the assessment of each individual document and the generation of the response into a single process. The designed sparse mechanism in a RAG system can facilitate the reduction of the number of documents loaded during decoding for accelerating the inference of the RAG system. Additionally, filtering out undesirable contexts enhances the model's focus on relevant context, inherently improving its generation quality. Evaluation results on two datasets show that Sparse RAG can strike an optimal balance between generation quality and computational efficiency, demonstrating its generalizability across both short- and long-form generation tasks.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.16144 [pdf, other]

GreenCOD: A Green Camouflaged Object Detection Method

Authors: Hong-Shuo Chen, Yao Zhu, Suya You, Azad M. Madni, C. -C. Jay Kuo

Abstract: We introduce GreenCOD, a green method for detecting camouflaged objects, distinct in its avoidance of backpropagation techniques. GreenCOD leverages gradient boosting and deep features extracted from pre-trained Deep Neural Networks (DNNs). Traditional camouflaged object detection (COD) approaches often rely on complex deep neural network architectures, seeking performance improvements through backpropagation-based fine-tuning. However, such methods are typically computationally demanding and exhibit only marginal performance variations across different models. This raises the question of whether effective training can be achieved without backpropagation. Addressing this, our work proposes a new paradigm that utilizes gradient boosting for COD. This approach significantly simplifies the model design, resulting in a system that requires fewer parameters and operations and maintains high performance compared to state-of-the-art deep learning models. Remarkably, our models are trained without backpropagation and achieve the best performance with fewer than 20G Multiply-Accumulate Operations (MACs). This new, more efficient paradigm opens avenues for further exploration in green, backpropagation-free model training.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.15972 [pdf, other]

SMILES Initial Data Release: Unveiling the Obscured Universe with MIRI Multi-band Imaging

Authors: Stacey Alberts, Jianwei Lyu, Irene Shivaei, George H. Rieke, Pablo G. Perez-Gonzalez, Nina Bonventura, Yongda Zhu, Jakob M. Helton, Zhiyuan Ji, Jane Morrison, Brant E. Robertson, Meredith A. Stone, Yang Sun, Christina C. Williams, Christopher N. A. Willmer

Abstract: The James Webb Space Telescope (JWST) is revolutionizing our view of the Universe through unprecedented sensitivity and resolution in the infrared, with some of the largest gains realized at its longest wavelengths. We present the Systematic Mid-infrared Instrument (MIRI) Legacy Extragalactic Survey (SMILES), an eight-band MIRI survey with Near-Infrared Spectrograph (NIRSpec) spectroscopic follow-up in the GOODS-S/HUDF region. SMILES takes full advantage of MIRI's continuous coverage from $5.6-25.5\,μ$m over a $\sim34$ arcmin$^2$ area to greatly expand our understanding of the obscured Universe up to cosmic noon and beyond. This work, together with a companion paper by Rieke et al., covers the SMILES science drivers and technical design, early results with SMILES, data reduction, photometric catalog creation, and the first data release. As part of the discussion on early results, we additionally present a high-level science demonstration on how MIRI's wavelength coverage and resolution will advance our understanding of cosmic dust using the full range of polycyclic aromatic hydrocarbon (PAH) emission features from $3.3-18\,μ$m. Using custom background subtraction, we produce robust reductions of the MIRI imaging that maximize the depths reached with our modest exposure times ($\sim0.6 - 2.2$ ks per filter). Included in our initial data release are (1) eight MIRI imaging mosaics reaching depths of $0.2-18\,μ$Jy ($5σ$) and (2) a $5-25.5\,μ$m photometric catalog with over 3,000 sources. Building upon the rich legacy of extensive photometric and spectroscopic coverage of GOODS-S/HUDF from the X-ray to the radio, SMILES greatly expands our investigative power in understanding the obscured Universe.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 23 pages, 19 figures, submitted to ApJ. Comments welcome! Data release will go live at https://archive.stsci.edu/hlsp/smiles in the next few weeks

arXiv:2405.15967 [pdf, other]

3-Minute Oscillations in the Upper Corona: Evidence from Parker Solar Probe

Authors: Zesen Huang, Marco Velli, Chen Shi, Yingjie Zhu, B. D. G. Chandran, Victor Réville, Trevor Bowen, Nikos Sioulas, Marc Pulupa, Jia Huang, Sheng Huang

Abstract: Recent observations of Parker Solar Probe (PSP) from around the Alfvén surface have shown that the trace magnetic power spectral density (PSD) is often characterized by a shallow-inertial double power law, where in the low frequency energy injection range, the power spectrum is shallow (flatter than $1/f$), and in the inertial range the spectrum is steep, with a scaling index of [1.5, 1.67]. Consequently, close to the Sun, the majority of the fluctuation energy concentrates in a small frequency range around the low frequency power spectral break. In this work, we conduct a systematic survey of PSP observations for the first 17 encounters to statistically study the energy behaviors of the magnetic fluctuations. Our results show that the center frequency of fluctuation energy systematically drifts to around 3 minutes for the most pristine solar wind (smallest solar wind advection time). Moreover, the center frequency rapidly drifts to lower frequency as solar wind advection time increases, as expected for active turbulence. The concentration of fluctuation energy around 3 minutes suggests that Alfvénic fluctuations in solar wind might mostly be coming from resonant p-mode oscillations in the photosphere, though other potential sources are discussed.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.15830 [pdf, other]

Diff-DTI: Fast Diffusion Tensor Imaging Using A Feature-Enhanced Joint Diffusion Model

Authors: Lang Zhang, **ling He, Dong Liang, Hairong Zheng, Yanjie Zhu

Abstract: Magnetic resonance diffusion tensor imaging (DTI) is a critical tool for neural disease diagnosis. However, long scan time greatly hinders the widespread clinical use of DTI. To accelerate image acquisition, a feature-enhanced joint diffusion model (Diff-DTI) is proposed to obtain accurate DTI parameter maps from a limited number of diffusion-weighted images (DWIs). Diff-DTI introduces a joint diffusion model that directly learns the joint probability distribution of DWIs with DTI parametric maps for conditional generation. Additionally, a feature enhancement fusion mechanism (FEFM) is designed and incorporated into the generative process of Diff-DTI to preserve fine structures in the generated DTI maps. A comprehensive evaluation of the performance of Diff-DTI was conducted on the Human Connectome Project dataset. The results demonstrate that Diff-DTI outperforms existing state-of-the-art fast DTI imaging methods in terms of visual quality and quantitative metrics. Furthermore, Diff-DTI has shown the ability to produce high-fidelity DTI maps with only three DWIs, thus overcoming the requirement of a minimum of six DWIs for DTI.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 11 pages, 7 figures

arXiv:2405.14251 [pdf, other]

Efficient Navigation of a Robotic Fish Swimming Across the Vortical Flow Field

Authors: Haodong Feng, Dehan Yuan, Jiale Miao, Jie You, Yue Wang, Yi Zhu, Dixia Fan

Abstract: Navigating efficiently across vortical flow fields presents a significant challenge in various robotic applications. The dynamic and unsteady nature of vortical flows often disturbs the control of underwater robots, complicating their operation in hydrodynamic environments. Conventional control methods, which depend on accurate modeling, fail in these settings due to the complexity of fluid-structure interactions (FSI) caused by unsteady hydrodynamics. This study proposes a deep reinforcement learning (DRL) algorithm, trained in a data-driven manner, to enable efficient navigation of a robotic fish swimming across vortical flows. Our proposed algorithm incorporates the LSTM architecture and uses several recent consecutive observations as the state to address the issue of partial observation, often due to sensor limitations. We present a numerical study of navigation within a Karman vortex street, created by placing a stationary cylinder in a uniform flow, utilizing the immersed boundary-lattice Boltzmann method (IB-LBM). The aim is to train the robotic fish to discover efficient navigation policies, enabling it to reach a designated target point across the Karman vortex street from various initial positions. After training, the fish demonstrates the ability to rapidly reach the target from different initial positions, showcasing the effectiveness and robustness of our proposed algorithm. Analysis of the results reveals that the robotic fish can leverage velocity gains and pressure differences induced by the vortices to reach the target, underscoring the potential of our proposed algorithm in enhancing navigation in complex hydrodynamic environments.
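The abstract above describes using several recent consecutive observations as the state to cope with partial observation. A minimal sketch of that idea follows, assuming a fixed window length and zero-padding at the start of an episode; the class name `ObservationStack`, the window size, and the observation layout are hypothetical and are not taken from the paper.

```python
from collections import deque


class ObservationStack:
    """Keep the k most recent observations and expose them as one state.

    Hypothetical helper: the paper's actual state construction is not
    described in the abstract.
    """

    def __init__(self, k, obs_dim):
        self.k = k
        self.obs_dim = obs_dim
        # Zero-padding gives the state a fixed shape from the first step on.
        self.buffer = deque([[0.0] * obs_dim for _ in range(k)], maxlen=k)

    def push(self, obs):
        """Append a new observation and return the updated stacked state."""
        if len(obs) != self.obs_dim:
            raise ValueError("unexpected observation size")
        self.buffer.append(list(obs))  # the oldest entry is dropped automatically
        return self.state()

    def state(self):
        """Return a (k, obs_dim) nested list, oldest observation first."""
        return [list(o) for o in self.buffer]


stack = ObservationStack(k=3, obs_dim=2)
stack.push([1.0, 0.5])
state = stack.push([2.0, 0.25])
print(state)
```

A sequence in this shape is what a recurrent network such as an LSTM would typically consume, one observation per time step.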

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14238 [pdf, other]

Surveying Image Segmentation Approaches in Astronomy

Authors: Duo Xu, Ye Zhu

Abstract: Image segmentation plays a critical role in unlocking the mysteries of the universe, providing astronomers with a clearer perspective on celestial objects within complex astronomical images and data cubes. Manual segmentation, while traditional, is not only time-consuming but also susceptible to biases introduced by human intervention. As a result, automated segmentation methods have become essential for achieving robust and consistent results in astronomical studies. This review begins by summarizing traditional and classical segmentation methods widely used in astronomical tasks. Despite the significant improvements these methods have brought to segmentation outcomes, they fail to meet astronomers' expectations, requiring additional human correction, further intensifying the labor-intensive nature of the segmentation process. The review then focuses on the transformative impact of machine learning, particularly deep learning, on segmentation tasks in astronomy. It introduces state-of-the-art machine learning approaches, highlighting their applications and the remarkable advancements they bring to segmentation accuracy in both astronomical images and data cubes. As the field of machine learning continues to evolve rapidly, it is anticipated that astronomers will increasingly leverage these sophisticated techniques to enhance segmentation tasks in their research projects. In essence, this review serves as a comprehensive guide to the evolution of segmentation methods in astronomy, emphasizing the transition from classical approaches to cutting-edge machine learning methodologies. We encourage astronomers to embrace these advancements, fostering a more streamlined and accurate segmentation process that aligns with the ever-expanding frontiers of astronomical exploration.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Invited review to appear in Astronomy & Computing's special issue on machine learning methods in modern astronomy

arXiv:2405.14205 [pdf, other]

Agent Planning with World Knowledge Model

Authors: Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

Abstract: Recent endeavors towards directly using large language models (LLMs) as agent models to execute interactive planning tasks have shown commendable results. Despite their achievements, however, they still struggle with brainless trial-and-error in global planning and generating hallucinatory actions in local planning due to their poor understanding of the "real" physical world. Imitating humans' mental world knowledge model, which provides global prior knowledge before the task and maintains local dynamic knowledge during the task, in this paper we introduce a parametric World Knowledge Model (WKM) to facilitate agent planning. Concretely, we steer the agent model to self-synthesize knowledge from both expert and sampled trajectories. Then we develop WKM, providing prior task knowledge to guide the global planning and dynamic state knowledge to assist the local planning. Experimental results on three complex real-world simulated datasets with three state-of-the-art open-source LLMs, Mistral-7B, Gemma-7B, and Llama-3-8B, demonstrate that our method can achieve superior performance compared to various strong baselines. In addition, our analysis illustrates that WKM can effectively alleviate the blind trial-and-error and hallucinatory action issues, providing strong support for the agent's understanding of the world. Other interesting findings include: 1) our instance-level task knowledge can generalize better to unseen tasks, 2) weak WKM can guide strong agent model planning, and 3) unified WKM training has promising potential for further development. Code will be available at https://github.com/zjunlp/WKM.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Work in progress

arXiv:2405.13576 [pdf, other]

FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research

Authors: Jiajie Jin, Yutao Zhu, Xinyu Yang, Chenghao Zhang, Zhicheng Dou

Abstract: With the advent of Large Language Models (LLMs), the potential of Retrieval Augmented Generation (RAG) techniques has garnered considerable research attention. Numerous novel algorithms and models have been introduced to enhance various aspects of RAG systems. However, the absence of a standardized framework for implementation, coupled with the inherently intricate RAG process, makes it challenging and time-consuming for researchers to compare and evaluate these approaches in a consistent environment. Existing RAG toolkits like LangChain and LlamaIndex, while available, are often heavy and unwieldy, failing to meet the personalized needs of researchers. In response to this challenge, we propose FlashRAG, an efficient and modular open-source toolkit designed to assist researchers in reproducing existing RAG methods and in developing their own RAG algorithms within a unified framework. Our toolkit implements 12 advanced RAG methods and has gathered and organized 32 benchmark datasets. Our toolkit has various features, including a customizable modular framework, a rich collection of pre-implemented RAG works, comprehensive datasets, efficient auxiliary pre-processing scripts, and extensive and standard evaluation metrics. Our toolkit and resources are available at https://github.com/RUC-NLPIR/FlashRAG.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 8 pages

arXiv:2405.13315 [pdf, other]

Study of the decays $χ_{cJ}\toΛ\barΛω$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, we present the first observation of the decays $χ_{cJ}\toΛ\barΛω$, where $J=0, 1, 2$, with statistical significances of $11.7 σ, 11.2 σ$, and $11.8 σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\toΛ\barΛω)=({2.37 \pm 0.22 \pm 0.23}) \times 10^{-4}$, $\mathcal{B}(χ_{c1}\toΛ\barΛω)=({1.01 \pm 0.10 \pm 0.11}) \times 10^{-4}$, and $\mathcal{B}(χ_{c2}\toΛ\barΛω)=({1.40 \pm 0.13 \pm 0.17}) \times 10^{-4}$, where the first uncertainties are statistical and the second are systematic. We observe no clear intermediate structures.
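The abstract above quotes statistical and systematic uncertainties separately. As a reading aid, the sketch below combines them in quadrature, a common convention that assumes the two are independent; the abstract itself does not quote a combined uncertainty, so the totals printed here are illustrative only. The function name `combine_in_quadrature` and the dictionary keys are hypothetical.

```python
import math


def combine_in_quadrature(stat, syst):
    """Total uncertainty, assuming independent statistical and systematic parts."""
    return math.sqrt(stat ** 2 + syst ** 2)


# (central value, statistical, systematic), all in units of 10^-4,
# copied from the abstract above.
measurements = {
    "chi_c0": (2.37, 0.22, 0.23),
    "chi_c1": (1.01, 0.10, 0.11),
    "chi_c2": (1.40, 0.13, 0.17),
}

for name, (value, stat, syst) in measurements.items():
    total = combine_in_quadrature(stat, syst)
    print(f"{name}: ({value:.2f} +/- {total:.2f}) x 10^-4")
```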

Submitted 21 May, 2024; originally announced May 2024.

Comments: 11 pages, 10 figures

arXiv:2405.12809 [pdf, other]

Precision measurement of the branching fraction of $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (604 additional authors not shown)

Abstract: Using a sample of $448.1 \times 10^6$ $ψ(2S)$ events collected with the BESIII detector, we perform a study of the decay $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$. The branching fraction of $J/ψ\rightarrow K^+K^-$ is determined to be $\mathcal{B}_{K^+K^-}=(3.072\pm 0.023({\rm stat.})\pm 0.050({\rm syst.}))\times 10^{-4}$, which is consistent with previous measurements but with significantly improved precision.

Submitted 21 May, 2024; originally announced May 2024.

Comments: to be submitted to PRD

arXiv:2405.12679 [pdf]

Observation of Spin Splitting in Room-Temperature Metallic Antiferromagnet CrSb

Authors: Meng Zeng, Ming-Yuan Zhu, Yu-Peng Zhu, Xiang-Rui Liu, Xiao-Ming Ma, Yu-Jie Hao, Pengfei Liu, Gexing Qu, Yichen Yang, Zhicheng Jiang, Kohei Yamagami, Masashi Arita, Xiaoqian Zhang, Tian-Hao Shao, Yue Dai, Kenya Shimada, Zhengtai Liu, Mao Ye, Yaobo Huang, Qihang Liu, Chang Liu

Abstract: Recently, unconventional antiferromagnets that enable the splitting of electronic spins have been theoretically proposed and experimentally realized, where the magnetic sublattices containing moments pointing in different directions are connected by a novel set of symmetries. Such spin splitting (SS) is substantial, $k$-dependent, and independent of the spin-orbit coupling strength, making these magnets promising materials for antiferromagnetic spintronics. Here, combining angle-resolved photoemission spectroscopy (ARPES) and density functional theory (DFT) calculations, we perform a systematic study of CrSb, a metallic spin-split antiferromagnet candidate with $T_N$ = 703 K. Our data reveal the electronic structure of CrSb along both out-of-plane and in-plane momentum directions, which shows anisotropic $k$-dependent SS and agrees well with the calculated results. The magnitude of this SS reaches at least 0.8 eV at non-high-symmetry momentum points, which is significantly higher than the largest known SOC-induced SS. This compound expands the choice of materials in the field of antiferromagnetic spintronics and is likely to stimulate subsequent investigations of high-efficiency spintronic devices that are functional at room temperature.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 14 pages, 4 figures

arXiv:2405.12535 [pdf, ps, other]

PhiBE: A PDE-based Bellman Equation for Continuous Time Policy Evaluation

Authors: Yuhua Zhu

Abstract: In this paper, we address the problem of continuous-time reinforcement learning in scenarios where the dynamics follow a stochastic differential equation. When the underlying dynamics remain unknown and we have access only to discrete-time information, how can we effectively conduct policy evaluation? We first highlight that the commonly used Bellman equation (BE) is not always a reliable approximation to the true value function. We then introduce a new Bellman equation, PhiBE, which integrates the discrete-time information into a PDE formulation. The new Bellman equation offers a more accurate approximation to the true value function, especially in scenarios where the underlying dynamics change slowly. Moreover, we extend PhiBE to higher orders, providing increasingly accurate approximations. We conduct the error analysis for both BE and PhiBE with explicit dependence on the discount coefficient, the reward and the dynamics. Additionally, we present a model-free algorithm to solve PhiBE when only discrete-time trajectory data is available. Numerical experiments are provided to validate the theoretical guarantees we propose.
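For orientation, the standard objects that the abstract above contrasts can be written out as follows. This is textbook background under assumed notation, not the paper's own formulation: the discount coefficient $\beta$, reward $r$, drift $\mu$, diffusion $\sigma$, and time step $\Delta t$ are symbols chosen here, and the PhiBE equation itself is not reproduced because the abstract does not give its form.

```latex
% Dynamics (assumed notation): ds_t = \mu(s_t)\,dt + \sigma(s_t)\,dB_t
% Continuous-time value function being evaluated:
V(s) = \mathbb{E}\left[ \int_0^\infty e^{-\beta t}\, r(s_t)\, dt \;\middle|\; s_0 = s \right]

% PDE satisfied by the true value function when \mu and \sigma are known:
\beta V(s) = r(s) + \mu(s) \cdot \nabla V(s)
           + \tfrac{1}{2}\, \sigma(s)\sigma(s)^{\top} : \nabla^2 V(s)

% Discrete-time Bellman equation (BE) built from samples spaced \Delta t apart:
\hat{V}(s) = r(s)\, \Delta t
           + e^{-\beta \Delta t}\, \mathbb{E}\left[ \hat{V}(s_{\Delta t}) \;\middle|\; s_0 = s \right]
```

The BE needs only transitions observed at spacing $\Delta t$, while the PDE needs the unknown $\mu$ and $\sigma$; the abstract describes PhiBE as a PDE formulation that uses the discrete-time information instead.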

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.12275 [pdf, other]

doi 10.1093/mnrasl/slae061

Damping Wing-Like Features in the Stacked Ly$α$ Forest: Potential Neutral Hydrogen Islands at $z<6$

Authors: Yongda Zhu, George D. Becker, Sarah E. I. Bosman, Christopher Cain, Laura C. Keating, Fahad Nasir, Valentina D'Odorico, Eduardo Bañados, Fuyan Bian, Manuela Bischetti, James S. Bolton, Huanqing Chen, Anson D'Aloisio, Frederick B. Davies, Rebecca L. Davies, Anna-Christina Eilers, Xiaohui Fan, Prakash Gaikwad, Bradley Greig, Martin G. Haehnelt, Girish Kulkarni, Samuel Lai, Ewald Puchwein, Yuxiang Qin, Emma V. Ryan-Weber , et al. (6 additional authors not shown)

Abstract: Recent quasar absorption line observations suggest that reionization may end as late as $z \approx 5.3$. As a means to search for large neutral hydrogen islands at $z<6$, we revisit long dark gaps in the Ly$β$ forest in VLT/X-Shooter and Keck/ESI quasar spectra. We stack the Ly$α$ forest corresponding to both edges of these Ly$β$ dark gaps and identify a damping wing-like extended absorption profile. The average redshift of the stacked forest is $z=5.8$. By comparing these observations with reionization simulations, we infer that such a damping wing-like feature can be naturally explained if these gaps are at least partially created by neutral islands. Conversely, simulated dark gaps lacking neutral hydrogen struggle to replicate the observed damping wing features. Furthermore, this damping wing-like profile implies that the volume-averaged neutral hydrogen fraction must be $\langle x_{\rm HI} \rangle \geq 6.1 \pm 3.9\%$ at $z = 5.8$. Our results offer robust evidence that reionization extends below $z=6$.

Submitted 28 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 8 pages, 5 figures, 1 table; accepted for publication in MNRAS Letters

arXiv:2405.12273 [pdf, other]

Damping wings in the Lyman-α forest: a model-independent measurement of the neutral fraction at 5.4<z<6.1

Authors: Benedetta Spina, Sarah E. I. Bosman, Frederick B. Davies, Prakash Gaikwad, Yongda Zhu

Abstract: Recent observations have positioned the endpoint of the Epoch of Reionisation (EoR) at redshift $z \sim 5.3$. However, observations of the Lyman-$α$ forest have not yet been able to discern whether reionisation occurred slowly and late, with substantial neutral hydrogen persisting at redshift $\sim 6$, or rapidly and earlier, with the apparent late end driven by the fluctuating UV background. Gunn-Peterson (GP) absorption troughs are solid indicators that reionisation is not complete until $z=5.3$, but whether they contain significantly neutral gas has not yet been proven. We aim to answer this question by directly measuring, for the first time, the neutral hydrogen fraction ($x_\mathrm{HI}$) at the end of the EoR ($5 \lesssim z \lesssim 6$) in high-redshift quasar spectra. For high neutral fractions $x_\mathrm{HI}\gtrsim0.1$, GP troughs exhibit damping wing (DW) absorption extending over $1000$ km s$^{-1}$ beyond the troughs. While conclusively detected in Lyman-$α$ emission lines of quasars at $z\geq7$, DWs are challenging to observe in the general Lyman-$α$ forest due to absorption complexities and small-scale stochastic transmission features. We report the first successful identification of the stochastic DW signal adjacent to GP troughs at redshift $z=5.6$ through careful stacking of the dark gaps in the Lyman-$α$ forest. We use the signal to present a measurement of the corresponding global $x_\mathrm{HI}=0.19\pm0.07$ $(_{-0.16}^{+0.11})$ at $1σ$ $(2σ)$ at $z=5.6$ and a limit $x_\mathrm{HI}<0.44$ at $z=5.9$. The detection of this signal demonstrates the existence of substantially neutral islands near the conclusion of the EoR, unequivocally signaling a late-and-slow reionization scenario.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 7 pages, 4 figures, submitted to A&A Letters

arXiv:2405.11841 [pdf, other]

Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities

Authors: Junqi Wang, Chunhui Zhang, Jiapeng Li, Yuxi Ma, Lixing Niu, Jiaheng Han, Yujia Peng, Yixin Zhu, Lifeng Fan

Abstract: Facing the current debate on whether Large Language Models (LLMs) attain near-human intelligence levels (Mitchell & Krakauer, 2023; Bubeck et al., 2023; Kosinski, 2023; Shiffrin & Mitchell, 2023; Ullman, 2023), the current study introduces a benchmark for evaluating social intelligence, one of the most distinctive aspects of human cognition. We developed a comprehensive theoretical framework for social dynamics and introduced two evaluation tasks: Inverse Reasoning (IR) and Inverse Inverse Planning (IIP). Our approach also encompassed a computational model based on recursive Bayesian inference, adept at elucidating diverse human behavioral patterns. Extensive experiments and detailed analyses revealed that humans surpassed the latest GPT models in overall performance, zero-shot learning, one-shot generalization, and adaptability to multi-modalities. Notably, GPT models demonstrated social intelligence only at the most basic order (order = 0), in stark contrast to human social intelligence (order >= 2). Further examination indicated a propensity of LLMs to rely on pattern recognition for shortcuts, casting doubt on their possession of authentic human-level social intelligence. Our codes, dataset, appendix and human data are released at https://github.com/bigai-ai/Evaluate-n-Model-Social-Intelligence.

Submitted 20 May, 2024; originally announced May 2024.

Comments: Also published in Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci), 2024

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen , et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, the stability of reconstructed parameters, and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper introduces the control system and its application to the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time-averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.11761 [pdf, other]

Strongly coupled magneto-exciton condensates in large-angle twisted double bilayer graphene

Authors: Qingxin Li, Yiwei Chen, LingNan Wei, Hong Chen, Yan Huang, Yujian Zhu, Wang Zhu, Dongdong An, Junwei Song, Qikang Gan, Qi Zhang, Kenji Watanabe, Takashi Taniguchi, Xiaoyang Shi, Kostya S. Novoselov, Rui Wang, Geliang Yu, Lei Wang

Abstract: Excitons, the bosonic quasiparticles emerging from the Coulomb interaction between electrons and holes, will undergo Bose-Einstein condensation (BEC) and transition into a superfluid state with global phase coherence at low temperatures. An important platform to study such excitonic physics is built on double-layer quantum wells or recent two-dimensional material heterostructures, where two parallel planes of electrons and holes are separated by a thin insulating layer. Lowering this separation distance ($d$) enhances the interlayer Coulomb interaction and thereby strengthens the exciton binding energy. However, an exceedingly small $d$ will lead to undesired interlayer tunneling, which results in the annihilation of excitons. Here, we report the observation of a sequence of robust exciton condensates (ECs) in double bilayer graphene twisted to $\sim 10^\circ$ with no insulating mid-layer. The large momentum mismatch between the two graphene layers suppresses the interlayer tunneling well, allowing us to reach the separation lower limit $\sim$ 0.334 nm and investigate ECs in the extreme coupling regime. Carrying out transport measurements on the bulk and edge of the devices, we find incompressible states corresponding to ECs when both layers are half-filled in the $N=0$ and $N=1$ Landau levels (LLs). The comparison between these ECs and theoretical calculations suggests that the low-energy charged excitation of ECs can be a meron-antimeron or particle-hole pair, depending on both LL index and carrier type. Our results establish large-angle twisted bilayers as an experimental platform with extreme coupling strength for studying quantum bosonic phases and their low-energy excitations.

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.11585 [pdf, other]

Improved measurement of the branching fraction of $h_{c}\rightarrowγη^\prime/η$ and search for $h_{c}\rightarrowγπ^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (645 additional authors not shown)

Abstract: The processes $h_c\rightarrowγP$ $(P = η^\prime,~η,~π^{0})$ are studied with a sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. The branching fractions of $h_c\rightarrowγη^\prime$ and $h_c\rightarrowγη$ are measured to be $(1.40\pm0.11\pm0.04\pm0.10)\times10^{-3}$ and $(3.77\pm0.55\pm0.13\pm0.26)\times10^{-4}$, respectively, where the first uncertainties are statistical, the second systematic, and the third from the branching fraction of $ψ(3686)\rightarrowπ^{0}h_c$. The ratio $R_{h_c}=\frac{\mathscr{B}(h_c\rightarrowγη)}{\mathscr{B}(h_c\rightarrowγη^\prime)}$ is calculated to be $(27.0\pm4.4\pm1.0)\%$. The measurements are consistent with the previous results, with precision improved by a factor of 2. The results are valuable for gaining a deeper understanding of $η-η^\prime$ mixing and its manifestation within quantum chromodynamics. No significant signal is found for the decay $h_c\rightarrowγπ^{0}$, and an upper limit is placed on its branching fraction of $\mathscr{B}(h_c\rightarrowγπ^{0})<5.0\times10^{-5}$ at the 90% confidence level.

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.11153 [pdf]

Dual-color Coherent Perfect Absorber

Authors: Boyi Xue, Jintian Lin, Jiankun Hou, Yicheng Zhu, Ruixin Ma, Xianfeng Chen, Ya Cheng, Li Ge, Wenjie Wan

Abstract: Perfect absorption of light critically affects light-matter interaction for various applications. Coherent perfect absorbers (CPA) gain the unique capability of controlling light with light in a linear fashion. Multi-color CPAs [Phys. Rev. Lett. 107, 033901] are highly desirable for broadband and nonlinear light-to-light coherent control; however, their experimental demonstration has remained elusive. Here we experimentally observe a dual-color version of CPA (DC-CPA) through second harmonic generation in a single whispering-gallery-mode microcavity. The DC-CPA enables simultaneous perfect absorption of both the incoming fundamental wave and its second harmonic. Similar to its linear counterpart, coherent control in the DC-CPA can also be realized by tuning the relative phase and intensity between the two-colored waves, through nonlinear interference instead of linear interference. This scheme extends the traditional CPA beyond the linear regime into a multi-frequency domain and paves the way toward all-optical signal processing and quantum information.

Submitted 17 May, 2024; originally announced May 2024.

arXiv:2405.10344 [pdf, ps, other]

Feasibility of Nash-Moser iteration for Cheng-Yau-type gradient estimates of nonlinear equations on complete Riemannian manifolds

Authors: Bin Shen, Yuhan Zhu

Abstract: In this manuscript, we employ the Nash-Moser iteration technique to determine a condition under which the positive solution $u$ of the generalized nonlinear Poisson equation $$\operatorname{div} (\varphi(|\nabla u|^2)\nabla u) + ψ(u^2)u = 0,$$ on a complete Riemannian manifold with Ricci curvature bounded from below can be shown to satisfy a Cheng-Yau-type gradient estimate. We define a class of $\varphi$-Laplacian operators by $Δ_{\varphi}(u):=\operatorname{div} (\varphi(|\nabla u|^2)\nabla u)$, where $\varphi$ is a $C^2$ function under certain growth conditions. This can be regarded as a natural generalization of the $p$-Laplacian, the $(p,q)$-Laplacian and the exponential Laplacian, as well as having a close connection to the prescribed mean curvature problem. We illustrate the feasibility of applying the Nash-Moser iteration to such Poisson equations to get Cheng-Yau-type gradient estimates in different cases with various $\varphi$ and $ψ$. Utilizing these estimates, we prove the related Harnack inequalities and a series of Liouville theorems. Our results cover a wide range of quasilinear Laplace operators (e.g. the $p$-Laplacian for $\varphi(t)=t^{p/2-1}$) and Lichnerowicz-type nonlinear equations (i.e. $ψ(t) = At^{p} + Bt^{q} + Ct\log t + D$).

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.09855 [pdf, other]

Density-based clustering algorithm for galaxy group/cluster identification

Authors: Hai-Xia Ma, Tsutomu T. Takeuchi, Suchetha Cooray, Yongda Zhu

Abstract: A direct approach to studying the galaxy-halo connection is the analysis of observed groups and clusters of galaxies that trace the underlying dark matter halos, making identifying galaxy clusters and their associated brightest cluster galaxies (BCGs) crucial. We test and propose a robust density-based clustering algorithm that outperforms the traditional Friends-of-Friends (FoF) algorithm in the currently available galaxy group/cluster catalogs. Our new approach is a modified version of the Ordering Points To Identify the Clustering Structure (OPTICS) algorithm, which accounts for line-of-sight positional uncertainties due to redshift space distortions by incorporating a scaling factor, and is thereby referred to as sOPTICS. When tested on both a galaxy group catalog based on semi-analytic galaxy formation simulations and observational data, our algorithm demonstrated robustness to outliers and relative insensitivity to hyperparameter choices. In total, we compared the results of eight clustering algorithms. The proposed density-based clustering method, sOPTICS, outperforms FoF in accurately identifying giant galaxy clusters and their associated BCGs in various environments with higher purity and recovery rate, also successfully recovering 115 BCGs out of 118 reliable BCGs from a large galaxy sample.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.09286 [pdf, other]

MVBIND: Self-Supervised Music Recommendation For Videos Via Embedding Space Binding

Authors: Jiajie Teng, Huiyu Duan, Yucheng Zhu, Sijing Wu, Guangtao Zhai

Abstract: Recent years have witnessed the rapid development of short videos, which usually contain both visual and audio modalities. Background music is important to short videos and can significantly influence the emotions of viewers. However, at present, the background music of short videos is generally chosen by the video producer, and there is a lack of automatic music recommendation methods for short videos. This paper introduces MVBind, an innovative Music-Video embedding space Binding model for cross-modal retrieval. MVBind operates as a self-supervised approach, acquiring inherent knowledge of intermodal relationships directly from data, without the need for manual annotations. Additionally, to compensate for the lack of a corresponding musical-visual pair dataset for short videos, we construct a dataset, SVM-10K (Short Video with Music-10K), which mainly consists of meticulously selected short videos. On this dataset, MVBind manifests significantly improved performance compared to other baseline methods. The constructed dataset and code will be released to facilitate future research.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.09200 [pdf, ps, other]

Performance Analysis of RIS-aided MISO Systems with EMI and Channel Aging

Authors: Taoyu Song, Enyu Shi, Yu Lu, Yiyang Zhu, Jiayi Zhang, Bo Ai

Abstract: In this paper, we investigate a reconfigurable intelligent surface (RIS)-aided multiple-input single-output (MISO) system in the presence of electromagnetic interference (EMI) and channel aging with a Rician fading channel model between the base station (BS) and user equipment (UE). Specifically, we derive the closed-form expression for downlink spectral efficiency (SE) with maximum ratio transmission (MRT) precoding. The Monte-Carlo simulation supports the theoretical results, demonstrating that amplifying the weight of the line-of-sight (LoS) component in Rician fading channels can boost SE, while EMI has a detrimental impact. Furthermore, continuously increasing the number of RIS elements is not an optimal choice when EMI exists. Nonetheless, RIS can be deployed to compensate for SE degradation caused by channel aging effects. Finally, enlarging the RIS element size can significantly improve system performance.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.09066 [pdf, other]

Search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, V. Batozskaya, D. Becker, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko , et al. (559 additional authors not shown)

Abstract: We present the first search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ by analyzing a data sample of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.178 and 4.226 GeV, corresponding to an integrated luminosity of 6.32~fb$^{-1}$. No significant signal is observed. The upper limits on the branching fractions for $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ are set to be $1.1 \times 10^{-5}$ and $4.3 \times 10^{-6}$ at 90\% confidence level, respectively.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 14 pages, 7 figures

arXiv:2405.08920 [pdf, other]

Neural Collapse Meets Differential Privacy: Curious Behaviors of NoisyGD with Near-perfect Representation Learning

Authors: Chendi Wang, Yuqing Zhu, Weijie J. Su, Yu-Xiang Wang

Abstract: A recent study by De et al. (2022) has reported that large-scale representation learning through pre-training on a public dataset significantly enhances differentially private (DP) learning in downstream tasks, despite the high dimensionality of the feature space. To theoretically explain this phenomenon, we consider the setting of a layer-peeled model in representation learning, which results in interesting phenomena related to learned features in deep learning and transfer learning, known as Neural Collapse (NC). Within the framework of NC, we establish an error bound indicating that the misclassification error is independent of dimension when the distance between actual features and the ideal ones is smaller than a threshold. Additionally, the quality of the features in the last layer is empirically evaluated under different pre-trained models within the framework of NC, showing that a more powerful transformer leads to a better feature representation. Furthermore, we reveal that DP fine-tuning is less robust compared to fine-tuning without DP, particularly in the presence of perturbations. These observations are supported by both theoretical analyses and experimental evaluation. Moreover, to enhance the robustness of DP fine-tuning, we suggest several strategies, such as feature normalization or employing dimension reduction methods like Principal Component Analysis (PCA). Empirically, we demonstrate a significant improvement in testing accuracy by conducting PCA on the last-layer features.

Submitted 16 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: To appear in ICML 2024

arXiv:2405.08885 [pdf, other]

Damping wing absorption associated with a giant Ly$α$ trough at $z < 6$: direct evidence for late-ending reionization

Authors: George D. Becker, James S. Bolton, Yongda Zhu, Seyedazim Hashemi

Abstract: Multiple observations now suggest that the hydrogen reionization may have ended well below redshift six. While there has previously been no conclusive proof of extended neutral islands in the $z < 6$ intergalactic medium, it is possible that such islands give rise to the giant Ly$α$ absorption troughs seen in the spectra of high-redshift quasars. Here we present evidence that the deepest and longest-known Ly$α$ trough at $z < 6$, towards ULAS J0148+0600 (J0148), is associated with damping wing absorption. The evidence comes from a window of strong Ly$α$ transmission at the edge of the J0148 proximity zone. We show that the relatively smooth profile of this transmission window is highly unlikely to arise from resonant absorption alone, but is consistent with the presence of a damping wing. We further argue that the damping wing is unlikely to arise from a compact source due to the lack of associated metal lines, and is more likely to arise from an extended neutral island associated with the giant Ly$α$ trough. We investigate the physical conditions that may give rise to the strong transmission window, and speculate that it may signal an unusually deep void, nearby ionizing sources, and/or the recent passage of an ionization front.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 17 pages, 18 figures; resubmitted to MNRAS after addressing referee comments

arXiv:2405.08814 [pdf, other]

Performance of wave function and Green's functions based methods for non equilibrium many-body dynamics

Authors: Cian C. Reeves, Gaurav Harsha, Avijit Shee, Yuanran Zhu, Chao Yang, K Birgitta Whaley, Dominika Zgid, Vojtech Vlcek

Abstract: Theoretical descriptions of non equilibrium dynamics of quantum many-body systems essentially employ either (i) explicit treatments, relying on truncation of the expansion of the many-body wave function, (ii) compressed representations of the many-body wave function, or (iii) evolution of an effective (downfolded) representation through Green's functions. In this work, we select representative cases of each of the methods and address how these complementary approaches capture the dynamics driven by intense field perturbations to non equilibrium states. Under strong driving, the systems are characterized by strong entanglement of the single particle density matrix and natural populations approaching those of a strongly interacting equilibrium system. We generate a representative set of results that are numerically exact and form a basis for critical comparison of the distinct families of methods. We demonstrate that the compressed formulation based on similarity transformed Hamiltonians (coupled cluster approach) is practically exact in weak fields and, hence, weakly or moderately correlated systems. Coupled cluster, however, struggles for strong driving fields, under which the system exhibits strongly correlated behavior, as measured by the von Neumann entropy of the single particle density matrix. The dynamics predicted by Green's functions in the (widely popular) GW approximation are less accurate but improve significantly upon the mean-field results in the strongly driven regime.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 11 pages, 2 figures in main text. 11 page supplemental information with 6 figures

arXiv:2405.08638 [pdf, other]

vMFER: Von Mises-Fisher Experience Resampling Based on Uncertainty of Gradient Directions for Policy Improvement

Authors: Yiwen Zhu, Jinyi Liu, Wenya Wei, Qianyi Fu, Yujing Hu, Zhou Fang, Bo An, Jianye Hao, Tangjie Lv, Changjie Fan

Abstract: Reinforcement Learning (RL) is a widely employed technique in decision-making problems, encompassing two fundamental operations -- policy evaluation and policy improvement. Enhancing learning efficiency remains a key challenge in RL, with many efforts focused on using ensemble critics to boost policy evaluation efficiency. However, when using multiple critics, the actor in the policy improvement process can obtain different gradients. Previous studies have combined these gradients without considering their disagreements. Therefore, optimizing the policy improvement process is crucial to enhance learning efficiency. This study focuses on investigating the impact of gradient disagreements caused by ensemble critics on policy improvement. We introduce the concept of uncertainty of gradient directions as a means to measure the disagreement among gradients utilized in the policy improvement process. Through measuring the disagreement among gradients, we find that transitions with lower uncertainty of gradient directions are more reliable in the policy improvement process. Building on this analysis, we propose a method called von Mises-Fisher Experience Resampling (vMFER), which optimizes the policy improvement process by resampling transitions and assigning higher confidence to transitions with lower uncertainty of gradient directions. Our experiments demonstrate that vMFER significantly outperforms the benchmark and is particularly well-suited for ensemble structures in RL.

Submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted by IJCAI 2024, with appendix

arXiv:2405.07878 [pdf, other]

Effective medium properties of stealthy hyperuniform photonic structures using multiscale physics-informed neural networks

Authors: Roberto Riganti, Yilin Zhu, Wei Cai, Salvatore Torquato, Luca Dal Negro

Abstract: In this article, we employ multiscale physics-informed neural networks (MscalePINNs) for the inverse retrieval of the effective permittivity and homogenization of finite-size photonic media with stealthy hyperuniform (SHU) disordered geometries. Specifically, we show that MscalePINNs are capable of capturing the fast spatial variations of complex fields scattered by arrays of dielectric nanocylinders arranged according to isotropic SHU point patterns, thus enabling a systematic methodology to inverse retrieve their effective dielectric profiles. Our approach extends the recently developed high-frequency homogenization theory of hyperuniform media and retrieves more general permittivity profiles for applications-relevant finite-size SHU systems, unveiling unique features related to their isotropic nature. In particular, we demonstrate the existence of a transparency region beyond the long-wavelength approximation, enabling effective and isotropic homogenization even without disorder-averaging, in contrast to the case of uncorrelated Poisson random patterns. We believe that the multiscale network approach introduced here enables the efficient inverse design of general effective media and finite-size metamaterials with isotropic electromagnetic responses beyond the limitations of traditional homogenization theories.

Submitted 14 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.07741 [pdf, other]

Search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko , et al. (635 additional authors not shown)

Abstract: Using 9.0 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies from 4.178 to 4.278 GeV with the BESIII detector at the BEPCII collider, we perform the first search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$. No $χ_{c1}(3872)\toγψ_2(3823)$ signal is observed. The upper limit on the ratio of branching fractions $\mathcal{B}(χ_{c1}(3872)\toγψ_2(3823), ψ_2(3823)\toγχ_{c1})/\mathcal{B}(χ_{c1}(3872)\toπ^+π^- J/ψ)$ is set as 0.075 at the 90\% confidence level. Our result contradicts theoretical predictions under the assumption that the $χ_{c1}(3872)$ is the pure charmonium state $χ_{c1}(2P)$.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 8 pages, 2 figures

arXiv:2405.07011 [pdf, other]

Fair Graph Representation Learning via Sensitive Attribute Disentanglement

Authors: Yuchang Zhu, Jintang Li, Zibin Zheng, Liang Chen

Abstract: Group fairness for Graph Neural Networks (GNNs), which emphasizes algorithmic decisions neither favoring nor harming certain groups defined by sensitive attributes (e.g., race and gender), has gained considerable attention. In particular, the objective of group fairness is to ensure that the decisions made by GNNs are independent of the sensitive attribute. To achieve this objective, most existing approaches involve eliminating sensitive attribute information in node representations or algorithmic decisions. However, such approaches may also eliminate task-related information due to its inherent correlation with the sensitive attribute, leading to a sacrifice in utility. In this work, we focus on improving the fairness of GNNs while preserving task-related information and propose a fair GNN framework named FairSAD. Instead of eliminating sensitive attribute information, FairSAD enhances the fairness of GNNs via Sensitive Attribute Disentanglement (SAD), which separates the sensitive attribute-related information into an independent component to mitigate its impact. Additionally, FairSAD utilizes a channel masking mechanism to adaptively identify the sensitive attribute-related component and subsequently decorrelates it. Overall, FairSAD minimizes the impact of the sensitive attribute on GNN outcomes rather than eliminating sensitive attributes, thereby preserving task-related information associated with the sensitive attribute. Furthermore, experiments conducted on several real-world datasets demonstrate that FairSAD outperforms other state-of-the-art methods by a significant margin in terms of both fairness and utility performance. Our source code is available at https://github.com/ZzoomD/FairSAD.

Submitted 11 May, 2024; originally announced May 2024.

Comments: Accepted by WWW 2024

arXiv:2405.06988 [pdf]

Enforced symmetry breaking for valley polarization in two-dimensional hexagonal lattices

Authors: Yongqian Zhu, Jia-Tao Sun, Jinbo Pan, Jun Deng, Shixuan Du

Abstract: The generation and manipulation of the valley polarization in solids are crucial for valleytronics, which is mainly limited to the analysis of inversion and time-reversal symmetries for two-dimensional (2D) hexagonal systems. Here, through group theory analysis, we propose general rules for the generation and manipulation of valley polarization in 2D hexagonal lattices. The generation of valley polarization requires breaking the specific enforced symmetry that is associated with different valleys or reverses the sign of Berry curvature. Further manipulation of valley polarization requires asymmetry operators connecting two states with opposite signs of Berry curvature. These rules for generating and manipulating valley polarization are extendable to generic points in momentum space. Combined with first-principles calculations, we realize the controllable valley polarization in three representative systems, i.e., monolayer FeCl2, bilayer TcGeSe3, and monolayer CrOBr. Our work provides symmetry rules for designing valleytronic materials that could facilitate the experimental detection and realistic applications.

Submitted 11 May, 2024; originally announced May 2024.

Comments: 16 pages, 4 figures

arXiv:2405.06914 [pdf, other]

Non-confusing Generation of Customized Concepts in Diffusion Models

Authors: Wang Lin, Jingyuan Chen, Jiaxin Shi, Yichen Zhu, Chen Liang, Junzhong Miao, Tao Jin, Zhou Zhao, Fei Wu, Shuicheng Yan, Hanwang Zhang

Abstract: We tackle the common challenge of inter-concept visual confusion in compositional concept generation using text-guided diffusion models (TGDMs). It becomes even more pronounced in the generation of customized concepts, due to the scarcity of user-provided concept visual examples. By revisiting the two major stages leading to the success of TGDMs -- 1) contrastive image-language pre-training (CLIP) for a text encoder that encodes visual semantics, and 2) training a TGDM that decodes the textual embeddings into pixels -- we point out that existing customized generation methods only focus on fine-tuning the second stage while overlooking the first one. To this end, we propose a simple yet effective solution called CLIF: contrastive image-language fine-tuning. Specifically, given a few samples of customized concepts, we obtain non-confusing textual embeddings of a concept by fine-tuning CLIP via contrasting a concept and the over-segmented visual regions of other concepts. Experimental results demonstrate the effectiveness of CLIF in preventing the confusion of multi-customized concept generation.

Submitted 11 May, 2024; originally announced May 2024.

arXiv:2405.06393 [pdf, other]

Measurement of the ${e}^{+}{e}^{-}\to p \bar{p}π^{0}$ cross section at $\sqrt{s}=2.1000-3.0800$ GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: The process $e^{+}e^{-}\to p\bar{p}π^{0}$ is studied at 20 center-of-mass energies ranging from 2.1000 to 3.0800 GeV using 636.8 pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$ are measured with high precision. Since the lowest center-of-mass energy, 2.1000 GeV, is less than 90 MeV above the $p\bar{p}π^0$ energy threshold, we can probe the threshold behavior for this reaction. However, no anomalous threshold enhancement is found in the cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$.

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.06212 [pdf]

Realized Stable BP-N at Ambient Pressure by Phosphorus Doping

Authors: Guo Chen, Chengfeng Zhang, Yuanqin Zhu, Bingqing Cao, Jie Zhang, Xianlong Wang

Abstract: Black phosphorus nitrogen (BP-N) is an attractive high-energy-density material. However, high-pressure synthesized BP-N will decompose at low pressure and cannot be quenched to ambient conditions. Finding a method to stabilize it at 0 GPa is of great significance for its practical applications. However, unlike cg-N, LP-N, and HLP-N, it is always a metastable phase at high pressure up to 260 GPa, and decomposes into chains at 23 GPa. Here, based on first-principles simulations, we find that P atom doping can effectively reduce the synthesis pressure of BP-N and maintain its stability at 0 GPa. Uniform distribution of P atom dopants within the layer helps maintain the structural stability of the BP-N layer at 0 GPa, while interlayer electrostatic interaction induced by N-P dipoles enhances its dynamic stability by eliminating interlayer slipping. Furthermore, pressure is conducive to enhancing the stability of BP-N and its doped forms by suppressing N-chain dissociation. For the configuration with 12.5% doping concentration, a gravimetric energy density of 8.07 kJ/g can be realized, which is nearly two times higher than TNT.

Submitted 19 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: 27 pages, 6 figures

arXiv:2405.05985 [pdf, other]

TrafficGPT: Towards Multi-Scale Traffic Analysis and Generation with Spatial-Temporal Agent Framework

Authors: Jinhui Ouyang, Yijie Zhu, Xiang Yuan, Di Wu

Abstract: The precise prediction of multi-scale traffic is a ubiquitous challenge in the urbanization process for car owners, road administrators, and governments. In the case of complex road networks, current and past traffic information from both upstream and downstream roads is crucial, since various road networks have different semantic information about traffic. Rationalizing the utilization of semantic information can realize short-term, long-term, and unseen road traffic prediction. As the demands of multi-scale traffic analysis increase, on-demand interactions and visualizations are expected to be available for transportation participants. We have designed a multi-scale traffic generation system, namely TrafficGPT, using three AI agents to process multi-scale traffic data, conduct multi-scale traffic analysis, and present multi-scale visualization results. TrafficGPT consists of three essential AI agents: 1) a text-to-demand agent that is employed with Question & Answer AI to interact with users and extract prediction tasks through texts; 2) a traffic prediction agent that leverages multi-scale traffic data to generate temporal features and similarity, and fuses them with limited spatial features and similarity, to achieve accurate prediction of three tasks; and 3) a suggestion and visualization agent that uses the prediction results to generate suggestions and visualizations, providing users with a comprehensive understanding of traffic conditions. Our TrafficGPT system focuses on addressing concerns about traffic prediction from transportation participants, and we conducted extensive experiments on five real-world road datasets to demonstrate its superior predictive and interactive performance.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.05983 [pdf]

Real-Time Pill Identification for the Visually Impaired Using Deep Learning

Authors: Bo Dang, Wenchao Zhao, Yufeng Li, Danqing Ma, Qixuan Yu, Elly Yijun Zhu

Abstract: The prevalence of mobile technology offers unique opportunities for addressing healthcare challenges, especially for individuals with visual impairments. This paper explores the development and implementation of a deep learning-based mobile application designed to assist blind and visually impaired individuals in real-time pill identification. Utilizing the YOLO framework, the application aims to accurately recognize and differentiate between various pill types through real-time image processing on mobile devices. The system incorporates Text-to-Speech (TTS) to provide immediate auditory feedback, enhancing usability and independence for visually impaired users. Our study evaluates the application's effectiveness in terms of detection accuracy and user experience, highlighting its potential to improve medication management and safety among the visually impaired community. Keywords: Deep Learning; YOLO Framework; Mobile Application; Visual Impairment; Pill Identification; Healthcare

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.05942 [pdf, other]

Improved Evolutionary Algorithms for Submodular Maximization with Cost Constraints

Authors: Yanhui Zhu, Samik Basu, A Pavan

Abstract: We present an evolutionary algorithm evo-SMC for the problem of Submodular Maximization under Cost constraints (SMC). Our algorithm achieves $1/2$-approximation with a high probability $1-1/n$ within $\mathcal{O}(n^2K_β)$ iterations, where $K_β$ denotes the maximum size of a feasible solution set with cost constraint $β$. To the best of our knowledge, this is the best approximation guarantee offered by evolutionary algorithms for this problem. We further refine evo-SMC, and develop st-evo-SMC. This stochastic version yields a significantly faster algorithm while maintaining the approximation ratio of $1/2$, with probability $1-ε$. The required number of iterations reduces to $\mathcal{O}(nK_β\log{(1/ε)}/p)$, where the user-defined parameter $p \in (0,1]$ represents the stochasticity probability, and $ε\in (0,1]$ denotes the error threshold. Finally, the empirical evaluations carried out through extensive experimentation substantiate the efficiency and effectiveness of our proposed algorithms. Our algorithms consistently outperform existing methods, producing higher-quality solutions.

Submitted 9 May, 2024; originally announced May 2024.

Comments: IJCAI 2024

Showing 101–150 of 4,086 results for author: Zhu, Y