
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems

Authors: Qihuang Zhong, Kang Wang, Ziyang Xu, Juhua Liu, Liang Ding, Bo Du, Dacheng Tao

Abstract: Chain-of-Thought (CoT) prompting has enhanced the performance of Large Language Models (LLMs) across various reasoning tasks. However, CoT still falls short in dealing with complex math word problems, as it usually suffers from three pitfalls: semantic misunderstanding errors, calculation errors and step-missing errors. Prior studies involve addressing the calculation errors and step-missing errors, but neglect the semantic misunderstanding errors, which is the major factor limiting the LLMs' performance. To this end, we propose a simple-yet-effective method, namely Deeply Understanding the Problems (DUP), to improve the LLMs' math problem-solving ability by addressing semantic misunderstanding errors. The core of our method is to encourage the LLMs to deeply understand the problems and extract the key problem-solving information used for better reasoning. Extensive experiments on 10 diverse reasoning benchmarks show that our DUP method consistently outperforms the other counterparts by a large margin. More encouragingly, DUP achieves a new SOTA result on the GSM8K benchmark, with an accuracy of 97.1% under zero-shot setting.

Submitted 29 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

Comments: Work in progress

arXiv:2404.14446 [pdf, other]

Spatio-temporal Joint Analysis of PM2.5 and Ozone in California with INLA

Authors: Jianan Pan, Kunyang He, Kai Wang, Qing Mu, Chengxiu Ling

Abstract: The substantial threat of concurrent air pollutants to public health is increasingly severe under climate change. To identify the common drivers and extent of spatio-temporal similarity of PM2.5 and ozone, this paper proposed a log Gaussian-Gumbel Bayesian hierarchical model allowing for sharing a SPDE-AR(1) spatio-temporal interaction structure. The proposed model outperforms in terms of estimation accuracy and prediction capacity for its increased parsimony and reduced uncertainty, especially for the shared ozone sub-model. Besides the consistently significant influence of temperature (positive), extreme drought (positive), fire burnt area (positive), and wind speed (negative) on both PM2.5 and ozone, surface pressure and GDP per capita (precipitation) demonstrate only positive associations with PM2.5 (ozone), while population density relates to neither. In addition, our results show the distinct spatio-temporal interactions and different seasonal patterns of PM2.5 and ozone, with peaks of PM2.5 and ozone in cold and hot seasons, respectively. Finally, with the aid of the excursion function, we see that the areas around the intersection of San Luis Obispo and Santa Barbara counties are likely to exceed the unhealthy ozone level for sensitive groups throughout the year. Our findings provide new insights for regional and seasonal strategies in the co-control of PM2.5 and ozone. Our methodology is expected to be utilized when interest lies in multiple interrelated processes in the fields of environment and epidemiology.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.14073 [pdf, other]

Towards Robust Trajectory Representations: Isolating Environmental Confounders with Causal Learning

Authors: Kang Luo, Yuanshao Zhu, Wei Chen, Kun Wang, Zhengyang Zhou, Sijie Ruan, Yuxuan Liang

Abstract: Trajectory modeling refers to characterizing human movement behavior, serving as a pivotal step in understanding mobility patterns. Nevertheless, existing studies typically ignore the confounding effects of geospatial context, leading to the acquisition of spurious correlations and limited generalization capabilities. To bridge this gap, we initially formulate a Structural Causal Model (SCM) to decipher the trajectory representation learning process from a causal perspective. Building upon the SCM, we further present a Trajectory modeling framework (TrajCL) based on Causal Learning, which leverages the backdoor adjustment theory as an intervention tool to eliminate the spurious correlations between geospatial context and trajectories. Extensive experiments on two real-world datasets verify that TrajCL markedly enhances performance in trajectory classification tasks while showcasing superior generalization and interpretability.

Submitted 22 April, 2024; originally announced April 2024.

Comments: The paper has been accepted by IJCAI 2024

arXiv:2404.14043 [pdf, other]

LLMs Know What They Need: Leveraging a Missing Information Guided Framework to Empower Retrieval-Augmented Generation

Authors: Keheng Wang, Feiyu Duan, Peiguang Li, Sirui Wang, Xunliang Cai

Abstract: Retrieval-Augmented Generation (RAG) demonstrates great value in alleviating outdated knowledge or hallucination by supplying LLMs with updated and relevant knowledge. However, there are still several difficulties for RAG in understanding complex multi-hop queries and retrieving relevant documents, which require LLMs to perform reasoning and retrieve step by step. Inspired by the human reasoning process, in which people gradually search for the required information, it is natural to ask whether LLMs could notice the missing information in each reasoning step. In this work, we first experimentally verified the ability of LLMs to extract information as well as to know what is missing. Based on the above discovery, we propose a Missing Information Guided Retrieve-Extraction-Solving paradigm (MIGRES), where we leverage the identification of missing information to generate a targeted query that steers the subsequent knowledge retrieval. Besides, we design a sentence-level re-ranking filtering approach to filter the irrelevant content out of documents, along with the information extraction capability of LLMs to extract useful information from cleaned-up documents, which in turn bolsters the overall efficacy of RAG. Extensive experiments conducted on multiple public datasets reveal the superiority of the proposed MIGRES method, and analytical experiments demonstrate the effectiveness of our proposed modules.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.13840 [pdf, other]

Study of $e^+e^-\toωX(3872)$ and $γX(3872)$ from 4.66 to 4.95 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: Using data samples with an integrated luminosity of $4.5~\text{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies ranging from 4.66 to 4.95 GeV, we study the processes of $e^+e^-\toωX(3872)$ and $e^+e^-\toγX(3872)$. With the $e^+e^-\toωX(3872)$ process, the branching fraction ratio $R\equiv\frac{\mathcal{B}(X(3872)\toγJ/ψ)}{\mathcal{B}(X(3872)\toπ^+π^- J/ψ)}$ is measured to be $0.38\pm0.20_\text{stat.}\pm0.01_\text{syst.}$ ($R< 0.83$ at 90\% confidence level). In addition, we measure the ratio of the average cross section of $e^+e^-\toωX(3872)$ to $e^+e^-\toωχ_{c1}(ωχ_{c2})$ to be $σ_{ωX(3872)}/σ_{ωχ_{c1}}~(σ_{ωX(3872)}/σ_{ωχ_{c2}})=5.2\pm1.0_\text{stat.}\pm1.9_\text{syst.}~ (5.5\pm1.1_\text{stat.}\pm2.4_\text{syst.})$. Finally, we search for the process of $e^+e^-\toγX(3872)$, and no obvious signal is observed. The upper limit on the ratio of the average cross section of $e^+e^-\toγX(3872)$ to $e^+e^-\toωX(3872)$ is set as $σ_{γX(3872)}/σ_{ωX(3872)}<0.23$ at 90\% confidence level.

Submitted 21 April, 2024; originally announced April 2024.

Comments: 19 pages, 10 figures

arXiv:2404.13405 [pdf]

Field-free switching of perpendicular magnetization by cooperation of planar Hall and orbital Hall effects

Authors: Zelalem Abebe Bekele, Yuan-Yuan Jiang, Kun Lei, Xiukai Lan, Xiangyu Liu, Hui Wen, Ding-Fu Shao, Kaiyou Wang

Abstract: Spin-orbit torques (SOTs) generated through the conventional spin Hall effect and/or Rashba-Edelstein effect are promising for manipulating magnetization. However, this approach typically exhibits non-deterministic and inefficient behaviour when it comes to switching perpendicular ferromagnets. This limitation poses a challenge for write-in operations in high-density magnetic memory devices. Here, we determine an effective solution to overcome this challenge by simultaneously leveraging both a planar Hall effect (PHE) and an orbital Hall effect (OHE). Using a representative Co/PtGd/Mo trilayer SOT device, we demonstrate that the PHE of Co is enhanced by the interfacial coupling of Co/PtGd, giving rise to a finite out-of-plane damping-like torque within the Co layer. Simultaneously, the OHE in the Mo layer induces a strong out-of-plane orbital current, significantly amplifying the in-plane damping-like torque through orbital-to-spin conversion. While either the PHE or OHE alone proves insufficient for reversing the perpendicular magnetization of Co, their collaborative action enables high-efficiency field-free deterministic switching. Our work provides a straightforward strategy to realize high-speed and low-power spintronics.

Submitted 20 April, 2024; originally announced April 2024.

Comments: 13 pages, 3 figures, submitted to Nat. Commun

arXiv:2404.13238 [pdf, other]

Personalized Wireless Federated Learning for Large Language Models

Authors: Feibo Jiang, Li Dong, Siwei Tu, Yubo Peng, Kezhi Wang, Kun Yang, Cunhua Pan, Dusit Niyato

Abstract: Large Language Models (LLMs) have revolutionized natural language processing tasks. However, their deployment in wireless networks still faces challenges, i.e., a lack of privacy and security protection mechanisms. Federated Learning (FL) has emerged as a promising approach to address these challenges. Yet, it suffers from issues including inefficient handling of big and heterogeneous data, resource-intensive training, and high communication overhead. To tackle these issues, we first compare different learning stages and their features of LLMs in wireless networks. Next, we introduce two personalized wireless federated fine-tuning methods with low communication overhead, i.e., (1) Personalized Federated Instruction Tuning (PFIT), which employs reinforcement learning to fine-tune local LLMs with diverse reward models to achieve personalization; (2) Personalized Federated Task Tuning (PFTT), which can leverage global adapters and local Low-Rank Adaptations (LoRA) to collaboratively fine-tune local LLMs, where the local LoRAs can be applied to achieve personalization without aggregation. Finally, we perform simulations to demonstrate the effectiveness of the proposed two methods and comprehensively discuss open issues.

Submitted 19 April, 2024; originally announced April 2024.

Comments: 8 pages, 5 figures

arXiv:2404.12859 [pdf]

Circular Photocurrents in Centrosymmetric Semiconductors with Hidden Spin Polarization

Authors: Kexin Wang, Butian Zhang, Chengyu Yan, Luojun Du, Shun Wang

Abstract: Centrosymmetric materials with site inversion asymmetries possess hidden spin polarization, which remains challenging to convert into spin currents because the global inversion symmetry is still conserved. This study demonstrates the spin-polarized DC circular photocurrents (CPC) in centrosymmetric transition metal dichalcogenides (TMDCs) at normal incidence without applying electric bias. The global inversion symmetry is broken by using a spatially-varying circularly polarized light beam, which could generate a spin gradient owing to the hidden spin polarization. The dependences of the CPC on electrode configuration, illumination position, and beam spot size indicate an emergence of circulating electric current under spatially inhomogeneous light, which is associated with the deflection of spin-polarized current through the inverse spin Hall effect (ISHE). The CPC is subsequently utilized to probe the spin polarization and ISHE under different excitation wavelengths and temperatures. The results of this study demonstrate the feasibility of using centrosymmetric materials with hidden spin polarization and non-vanishing Berry curvature for spintronic device applications.

Submitted 20 May, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.12794 [pdf, other]

MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model

Authors: Kang Zeng, Hao Shi, Jiacheng Lin, Siyu Li, Jintao Cheng, Kaiwei Wang, Zhiyong Li, Kailun Yang

Abstract: LiDAR-based Moving Object Segmentation (MOS) aims to locate and segment moving objects in point clouds of the current scan using motion information from previous scans. Despite the promising results achieved by previous MOS methods, several key issues, such as the weak coupling of temporal and spatial information, still need further study. In this paper, we propose a novel LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model, termed MambaMOS. Firstly, we develop a novel embedding module, the Time Clue Bootstrapping Embedding (TCBE), to enhance the coupling of temporal and spatial information in point clouds and alleviate the issue of overlooked temporal clues. Secondly, we introduce the Motion-aware State Space Model (MSSM) to endow the model with the capacity to understand the temporal correlations of the same object across different time steps. Specifically, MSSM emphasizes the motion states of the same object at different time steps through two distinct temporal modeling and correlation steps. We utilize an improved state space model to represent these motion differences, significantly modeling the motion states. Finally, extensive experiments on the SemanticKITTI-MOS and KITTI-Road benchmarks demonstrate that the proposed MambaMOS achieves state-of-the-art performance. The source code of this work will be made publicly available at https://github.com/Terminal-K/MambaMOS.

Submitted 19 April, 2024; originally announced April 2024.

Comments: The source code will be made publicly available at https://github.com/Terminal-K/MambaMOS

arXiv:2404.12536 [pdf]

Asteroid (101955) Bennu in the Laboratory: Properties of the Sample Collected by OSIRIS-REx

Authors: Dante S. Lauretta, Harold C. Connolly, Jr., Joseph E. Aebersold, Conel M. O. D. Alexander, Ronald-L. Ballouz, Jessica J. Barnes, Helena C. Bates, Carina A. Bennett, Laurinne Blanche, Erika H. Blumenfeld, Simon J. Clemett, George D. Cody, Daniella N. DellaGiustina, Jason P. Dworkin, Scott A. Eckley, Dionysis I. Foustoukos, Ian A. Franchi, Daniel P. Glavin, Richard C. Greenwood, Pierre Haenecour, Victoria E. Hamilton, Dolores H. Hill, Takahiro Hiroi, Kana Ishimaru, Fred Jourdan , et al. (28 additional authors not shown)

Abstract: On 24 September 2023, the NASA OSIRIS-REx mission dropped a capsule to Earth containing approximately 120 g of pristine carbonaceous regolith from Bennu. We describe the delivery and initial allocation of this asteroid sample and introduce its bulk physical, chemical, and mineralogical properties from early analyses. The regolith is very dark overall, with higher-reflectance inclusions and particles interspersed. Particle sizes range from sub-micron dust to a stone about 3.5 cm long. Millimeter-scale and larger stones typically have hummocky or angular morphologies. A subset of the stones appears mottled by brighter material that occurs as veins and crusts. Hummocky stones have the lowest densities and mottled stones have the highest. Remote sensing of the surface of Bennu detected hydrated phyllosilicates, magnetite, organic compounds, carbonates, and scarce anhydrous silicates, all of which the sample confirms. We also find sulfides, presolar grains, and, less expectedly, Na-rich phosphates, as well as other trace phases. The sample composition and mineralogy indicate substantial aqueous alteration and resemble those of Ryugu and the most chemically primitive, low-petrologic-type carbonaceous chondrites. Nevertheless, we find distinct hydrogen, nitrogen, and oxygen isotopic compositions, and some of the material we analyzed is enriched in fluid-mobile elements. Our findings underscore the value of sample return, especially for low-density material that may not readily survive atmospheric entry, and lay the groundwork for more comprehensive analyses.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 73 pages, 22 figures

arXiv:2404.12127 [pdf, other]

Personalized Forgetting Mechanism with Concept-Driven Knowledge Tracing

Authors: Shanshan Wang, Ying Hu, Xun Yang, Zhongzhou Zhang, Keyang Wang, Xingyi Zhang

Abstract: Knowledge Tracing (KT) aims to trace changes in students' knowledge states throughout their entire learning process by analyzing their historical learning data and predicting their future learning performance. Existing forgetting curve theory based knowledge tracing models only consider the general forgetting caused by time intervals, ignoring the individualization of students and the causal relationship of the forgetting process. To address these problems, we propose a Concept-driven Personalized Forgetting knowledge tracing model (CPF) which integrates hierarchical relationships between knowledge concepts and incorporates students' personalized cognitive abilities. First, we integrate the students' personalized capabilities into both the learning and forgetting processes to explicitly distinguish students' individual learning gains and forgetting rates according to their cognitive abilities. Second, we take into account the hierarchical relationships between knowledge points and design a precursor-successor knowledge concept matrix to simulate the causal relationship in the forgetting process, while also integrating the potential impact of forgetting prior knowledge points on subsequent ones. The proposed personalized forgetting mechanism can be applied not only to the learning of specific knowledge concepts but also to the life-long learning process. Extensive experimental results on three public datasets show that our CPF outperforms current forgetting curve theory based methods in predicting student performance, demonstrating CPF can better simulate changes in students' knowledge status through the personalized forgetting mechanism.

Submitted 25 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

Comments: under review

arXiv:2404.11943 [pdf, other]

AgentCoord: Visually Exploring Coordination Strategy for LLM-based Multi-Agent Collaboration

Authors: Bo Pan, Jiaying Lu, Ke Wang, Li Zheng, Zhen Wen, Yingchaojie Feng, Minfeng Zhu, Wei Chen

Abstract: The potential of automatic task-solving through Large Language Model (LLM)-based multi-agent collaboration has recently garnered widespread attention from both the research community and industry. While utilizing natural language to coordinate multiple agents presents a promising avenue for democratizing agent technology for general users, designing coordination strategies remains challenging with existing coordination frameworks. This difficulty stems from the inherent ambiguity of natural language for specifying the collaboration process and the significant cognitive effort required to extract crucial information (e.g. agent relationship, task dependency, result correspondence) from a vast amount of text-form content during exploration. In this work, we present a visual exploration framework to facilitate the design of coordination strategies in multi-agent collaboration. We first establish a structured representation for LLM-based multi-agent coordination strategy to regularize the ambiguity of natural language. Based on this structure, we devise a three-stage generation method that leverages LLMs to convert a user's general goal into an executable initial coordination strategy. Users can further intervene at any stage of the generation process, utilizing LLMs and a set of interactions to explore alternative strategies. Whenever a satisfactory strategy is identified, users can commence the collaboration and examine the visually enhanced execution result. We develop AgentCoord, a prototype interactive system, and conduct a formal user study to demonstrate the feasibility and effectiveness of our approach.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.11749 [pdf, ps, other]

Weyl group twists and representations of quantum affine Borel algebras

Authors: Keyu Wang

Abstract: We define categories $\mathcal{O}^w$ of representations of Borel subalgebras $\mathcal{U}_q\mathfrak{b}$ of quantum affine algebras $\mathcal{U}_q\hat{\mathfrak{g}}$, which come from the category $\mathcal{O}$ twisted by Weyl group elements $w$. We construct inductive systems of finite-dimensional $\mathcal{U}_q\mathfrak{b}$-modules twisted by $w$, which provide representations in the category $\mathcal{O}^w$. We also establish a classification of simple modules in these categories $\mathcal{O}^w$. We explore convergent phenomenon of $q$-characters of representations of quantum affine algebras, which conjecturally give the $q$-characters of representations in $\mathcal{O}^w$. Furthermore, we propose a conjecture concerning the relationship between the category $\mathcal{O}$ and the twisted category $\mathcal{O}^w$, and we propose a possible connection with shifted quantum affine algebras.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 30 pages

arXiv:2404.11565 [pdf, other]

MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation

Authors: Kuan-Chieh Wang, Daniil Ostashev, Yuwei Fang, Sergey Tulyakov, Kfir Aberman

Abstract: We introduce a new architecture for personalization of text-to-image diffusion models, coined Mixture-of-Attention (MoA). Inspired by the Mixture-of-Experts mechanism utilized in large language models (LLMs), MoA distributes the generation workload between two attention pathways: a personalized branch and a non-personalized prior branch. MoA is designed to retain the original model's prior by fixing its attention layers in the prior branch, while minimally intervening in the generation process with the personalized branch that learns to embed subjects in the layout and context generated by the prior branch. A novel routing mechanism manages the distribution of pixels in each layer across these branches to optimize the blend of personalized and generic content creation. Once trained, MoA facilitates the creation of high-quality, personalized images featuring multiple subjects with compositions and interactions as diverse as those generated by the original model. Crucially, MoA enhances the distinction between the model's pre-existing capability and the newly augmented personalized intervention, thereby offering a more disentangled subject-context control that was previously unattainable. Project page: https://snap-research.github.io/mixture-of-attention

Submitted 6 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

Comments: Project Website: https://snap-research.github.io/mixture-of-attention, Same as previous version, only updated metadata because bib was missing an author name

arXiv:2404.11118 [pdf, other]

MHLR: Moving Haar Learning Rate Scheduler for Large-scale Face Recognition Training with One GPU

Authors: Xueyuan Gong, Yain-whar Si, Zheng Zhang, Xiaochen Yuan, Ke Wang, Xinyuan Zhang, Cong Lin, Xiaoxiang Liu

Abstract: Face recognition (FR) has seen significant advancements due to the utilization of large-scale datasets. Training deep FR models on large-scale datasets with multiple GPUs is now a common practice. In fact, computing power has evolved into a foundational and indispensable resource in the area of deep learning. It is nearly impossible to train a deep FR model without holding adequate hardware resources. Recognizing this challenge, some FR approaches have started exploring ways to reduce the time complexity of the fully-connected layer in FR models. Unlike other approaches, this paper introduces a simple yet highly effective approach, Moving Haar Learning Rate (MHLR) scheduler, for scheduling the learning rate promptly and accurately in the training process. MHLR supports large-scale FR training with only one GPU, which is able to accelerate the model to 1/4 of its original training time without sacrificing more than 1% accuracy. More specifically, MHLR only needs $30$ hours to train the model ResNet100 on the dataset WebFace12M containing more than 12M face images with 0.6M identities. Extensive experiments validate the efficiency and effectiveness of MHLR.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.11014 [pdf, other]

Towards Multi-agent Reinforcement Learning based Traffic Signal Control through Spatio-temporal Hypergraphs

Authors: Kang Wang, Zhishu Shen, Zhen Lei, Tiehua Zhang

Abstract: Traffic signal control systems (TSCSs) are integral to intelligent traffic management, fostering efficient vehicle flow. Traditional approaches often simplify road networks into standard graphs, which results in a failure to consider the dynamic nature of traffic data at neighboring intersections, thereby neglecting higher-order interconnections necessary for real-time control. To address this, we propose a novel TSCS framework to realize intelligent traffic control. This framework collaborates with multiple neighboring edge computing servers to collect traffic information across the road network. To elevate the efficiency of traffic signal control, we have crafted a multi-agent soft actor-critic (MA-SAC) reinforcement learning algorithm. Within this algorithm, individual agents are deployed at each intersection with a mandate to optimize traffic flow across the entire road network collectively. Furthermore, we introduce hypergraph learning into the critic network of MA-SAC to enable the spatio-temporal interactions from multiple intersections in the road network. This method fuses hypergraph and spatio-temporal graph structures to encode traffic data and capture the complex spatial and temporal correlations between multiple intersections. Our empirical evaluation, tested on varied datasets, demonstrates the superiority of our framework in minimizing average vehicle travel times and sustaining high-throughput performance. This work facilitates the development of more intelligent and reactive urban traffic management solutions.

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2404.10445 [pdf, other]

SparseDM: Toward Sparse Efficient Diffusion Models

Authors: Kafeng Wang, Jianfei Chen, He Li, Zhenpeng Mi, Jun Zhu

Abstract: Diffusion models have been extensively used in data generation tasks and are recognized as one of the best generative models. However, their time-consuming deployment, long inference time, and requirements on large memory limit their application on mobile devices. In this paper, we propose a method based on the improved Straight-Through Estimator to improve the deployment efficiency of diffusion models. Specifically, we add sparse masks to the Convolution and Linear layers in a pre-trained diffusion model, then design progressive sparsity for model training in the fine-tuning stage, and switch the inference mask on and off, which supports a flexible choice of sparsity during inference according to the FID and MACs requirements. Experiments on four datasets conducted on a state-of-the-art Transformer-based diffusion model demonstrate that our method reduces MACs by $50\%$ while increasing FID by only 1.5 on average. Under other MACs conditions, the FID is also lower by 1$\sim$137 compared to other methods.

Submitted 30 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

arXiv:2404.09894 [pdf, ps, other]

Glitch Tokens in Large Language Models: Categorization Taxonomy and Effective Detection

Authors: Yuxi Li, Yi Liu, Gelei Deng, Ying Zhang, Wenjia Song, Ling Shi, Kailong Wang, Yuekang Li, Yang Liu, Haoyu Wang

Abstract: With the expanding application of Large Language Models (LLMs) in various domains, it becomes imperative to comprehensively investigate their unforeseen behaviors and consequent outcomes. In this study, we introduce and systematically explore the phenomenon of "glitch tokens", which are anomalous tokens produced by established tokenizers and could potentially compromise the models' quality of response. Specifically, we experiment on seven popular LLMs utilizing three distinct tokenizers and involving a total of 182,517 tokens. We present categorizations of the identified glitch tokens and symptoms exhibited by LLMs when interacting with glitch tokens. Based on our observation that glitch tokens tend to cluster in the embedding space, we propose GlitchHunter, a novel iterative clustering-based technique, for efficient glitch token detection. The evaluation shows that our approach notably outperforms three baseline methods on eight open-source LLMs. To the best of our knowledge, we present the first comprehensive study on glitch tokens. Our new detection further provides valuable insights into mitigating tokenization-related errors in LLMs.

Submitted 19 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.09237 [pdf, other]

Some new bistable transition fronts with changing shape

Authors: Hongjun Guo, Kelei Wang

Abstract: We construct entire solutions of bistable reaction-diffusion equations by mixing finite planar fronts, which form a finite-dimensional manifold. These entire solutions are generalized traveling fronts, that is, transition fronts. We also show their uniqueness and stability. Furthermore, we prove that transition fronts with level sets having finite facets are determined by finite planar fronts and they are in the class of entire solutions constructed by us.

Submitted 14 April, 2024; originally announced April 2024.

arXiv:2404.09219 [pdf, ps, other]

Observation of $D \to a_{0}(980)π$ in the decays $D^{0} \rightarrow π^{+}π^{-}η$ and $D^{+} \rightarrow π^{+}π^{0}η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: We report the first amplitude analysis of the decays $D^{0} \to π^{+} π^{-} η$ and $D^{+} \rightarrow π^{+}π^{0}η$ using a data sample taken with the BESIII detector at the center-of-mass energy of 3.773 GeV, corresponding to an integrated luminosity of 7.9 ${\rm fb}^{-1}$. The contribution from the process $D^{0(+)} \to a_{0}(980)^{+} π^{-(0)}$ is significantly larger than the $D^{0(+)} \to a_{0}(980)^{-(0)} π^{+}$ contribution. The ratios $\mathcal{B}(D^{0} \rightarrow a_{0}(980)^{+}π^{-})/\mathcal{B}(D^{0} \rightarrow a_{0}(980)^{-}π^{+})$ and $\mathcal{B}(D^{+} \rightarrow a_{0}(980)^{+}π^{0})/\mathcal{B}(D^{+} \rightarrow a_{0}(980)^{0}π^{+})$ are measured to be $7.5^{+2.5}_{-0.8\,\mathrm{stat.}}\pm1.7_{\mathrm{syst.}}$ and $2.6\pm0.6_{\mathrm{stat.}}\pm0.3_{\mathrm{syst.}}$, respectively. The measured $D^{0}$ ratio disagrees with the theoretical predictions by orders of magnitude, thus implying a substantial contribution from final-state interactions.

Submitted 14 April, 2024; originally announced April 2024.

arXiv:2404.08521 [pdf, other]

The magnetism measurements of the two-dimensional van der Waals antiferromagnet CrPS4 using dynamic cantilever magnetometry

Authors: Qi Li, Weili Zhen, Ning Wang, Yang Yu, Senyang Pan, Lin Deng, Jiaqiang Cai, Kang Wang, Lvkuan Zou, Zhongming Zeng, Jinglei Zhang, Haifeng Du

Abstract: The exploration of van der Waals (vdWs) magnetic materials has sparked great interest in spintronics. However, conventional methods often face challenges in characterizing the magnetic properties of small-sized vdWs materials, especially for antiferromagnets with extremely small magnetic moments. Here, we demonstrate the efficacy of dynamic cantilever magnetometry (DCM) in characterizing the magnetic properties of vdWs magnets, using an antiferromagnetic semiconductor CrPS4. We observe continuous spin axis rotation under a magnetic field, accurately modelled by considering the existence of marked magnetic anisotropies. Furthermore, the dominance of out-of-plane magnetic anisotropy in spin reorientation behavior at low temperatures transitions to the prevalence of in-plane anisotropy with increasing temperature, leading to a sign reversal of the frequency shift in measurements. The peculiar magnetic phase transitions make CrPS4 an intriguing platform for studying two-dimensional magnetism. Our findings underscore the effectiveness of DCM in characterizing magnetic anisotropies and phase transitions in vdWs magnets.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2404.08135 [pdf, other]

SciFlow: Empowering Lightweight Optical Flow Models with Self-Cleaning Iterations

Authors: Jamie Menjay Lin, Jisoo Jeong, Hong Cai, Risheek Garrepalli, Kai Wang, Fatih Porikli

Abstract: Optical flow estimation is crucial to a variety of vision tasks. Despite substantial recent advancements, achieving real-time on-device optical flow estimation remains a complex challenge. First, an optical flow model must be sufficiently lightweight to meet computation and memory constraints to ensure real-time performance on devices. Second, the necessity for real-time on-device operation imposes constraints that weaken the model's capacity to adequately handle ambiguities in flow estimation, thereby intensifying the difficulty of preserving flow accuracy. This paper introduces two synergistic techniques, Self-Cleaning Iteration (SCI) and Regression Focal Loss (RFL), designed to enhance the capabilities of optical flow models, with a focus on addressing optical flow regression ambiguities. These techniques prove particularly effective in mitigating error propagation, a prevalent issue in optical flow models that employ iterative refinement. Notably, these techniques add negligible to zero overhead in model parameters and inference latency, thereby preserving real-time on-device efficiency. The effectiveness of our proposed SCI and RFL techniques, collectively referred to as SciFlow for brevity, is demonstrated across two distinct lightweight optical flow model architectures in our experiments. Remarkably, SciFlow enables substantial reduction in error metrics (EPE and Fl-all) over the baseline models by up to 6.3% and 10.5% for in-domain scenarios and by up to 6.2% and 13.5% for cross-domain scenarios on the Sintel and KITTI 2015 datasets, respectively.

Submitted 11 April, 2024; originally announced April 2024.

Comments: CVPRW 2024

arXiv:2404.07874 [pdf, other]

Artificial Chemotaxis under Electrodiffusiophoresis

Authors: Carlos A. Silvera Batista, Kun Wang, Hannah Blake, Vivian Nwosu-Madueke, Sophie Marbach

Abstract: Diffusiophoretic motion induced by gradients of dissolved species has enabled the manipulation of colloids over large distances, spanning hundreds of microns. Nonetheless, studies have primarily focused on simple geometries that feature 1D gradients of solutes generated by reactions or selective dissolution. Thus, our understanding of 3D diffusiophoresis remains elusive despite its importance in wide-ranging scenarios, such as cellular transport and nanofluidics. Herein, we present a strategy to generate 3D chemical gradients under electric fields. In this approach, faradaic reactions at electrodes induce global pH gradients that drive long-range transport through electrodiffusiophoresis. Simultaneously, the electric field induces local pH gradients by driving the particle's double layer far from equilibrium. As a result, while global pH gradients lead to 2D focusing away from electrodes, local pH gradients induce aggregation in the third dimension. Resulting interparticle interactions display a strong dependence on surface chemistry and particle size. Furthermore, pH gradients can be readily tuned by adjusting the voltage and frequency of the electric field. For large Péclet numbers, we observed a chemotactic-like collapse. Remarkably, such collapse occurs without reactions at a particle's surface. By mixing particles with different sizes, we also demonstrate the emergence of non-reciprocal interactions through experiments and Brownian dynamics simulations. These findings suggest a wide array of possibilities for the dynamic assembly of materials and the design of responsive matter.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.07840 [pdf, other]

On Training Data Influence of GPT Models

Authors: Qingyi Liu, Yekun Chai, Shuohuan Wang, Yu Sun, Qiwei Peng, Keze Wang, Hua Wu

Abstract: Amidst the rapid advancements in generative language models, the investigation of how training data shapes the performance of GPT models is still emerging. This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models. Our approach not only traces the influence of individual training instances on performance trajectories, such as loss and other key metrics, on targeted test points but also enables a comprehensive comparison with existing methods across various training scenarios in GPT models, ranging from 14 million to 2.8 billion parameters, across a range of downstream tasks. Contrary to earlier methods that struggle with generalization to new data, GPTfluence introduces a parameterized simulation of training dynamics, demonstrating robust generalization capabilities to unseen training data. This adaptability is evident across both fine-tuning and instruction-tuning scenarios, spanning tasks in natural language understanding and generation. We will make our code and data publicly available.

Submitted 16 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.07787 [pdf, other]

Research on fine co-focus adjustment method for segmented solar telescope

Authors: Kunyan Wang, Yichun Dai, Bin Wang, Xu Tan, Dehua Yang, Zhenyu Jin

Abstract: For segmented telescopes, achieving fine co-focus adjustment is essential for realizing co-phase adjustment and maintenance, which involves adjusting the millimeter-scale piston between segments to fall within the capture range of the co-phase detection system. CGST proposes using a SHWFS for piston detection during the co-focus adjustment stage. However, the residual piston after adjustment exceeds the capture range of the broadband PSF phasing algorithm ($\pm 30$ μm), and the multi-wavelength PSF algorithm requires even higher precision in co-focus adjustment. To improve the co-focus adjustment accuracy of CGST, a fine co-focus adjustment based on cross-calibration is proposed. This method utilizes a high-precision detector to calibrate and fit the measurements from the SHWFS, thereby reducing the impact of atmospheric turbulence and systematic errors on piston measurement accuracy during co-focus adjustment. Simulation results using CGST demonstrate that the proposed method significantly enhances adjustment accuracy compared to the SHWFS detection method. Additionally, the residual piston after fine co-focus adjustment using this method falls within the capture range of the multi-wavelength PSF algorithm. To verify the feasibility of this method, experiments were conducted on an 800 mm ring segmented mirror system, successfully achieving fine co-focus adjustment where the remaining piston of all segments fell within $\pm 15$ μm.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.07766 [pdf, other]

doi 10.3390/photonics10050548

RMAFF-PSN: A Residual Multi-Scale Attention Feature Fusion Photometric Stereo Network

Authors: Kai Luo, Yakun Ju, Lin Qi, Kaixuan Wang, Junyu Dong

Abstract: Predicting accurate normal maps of objects from two-dimensional images in regions of complex structure and spatial material variations is challenging using photometric stereo methods due to the influence of surface reflection properties caused by variations in object geometry and surface materials. To address this issue, we propose a photometric stereo network called RMAFF-PSN that uses residual multiscale attentional feature fusion to handle the "difficult" regions of the object. Unlike previous approaches that only use stacked convolutional layers to extract deep features from the input image, our method integrates feature information from different resolution stages and scales of the image. This approach preserves more physical information, such as texture and geometry of the object in complex regions, through shallow-deep stage feature extraction, double branching enhancement, and attention optimization. To test the network structure under real-world conditions, we propose a new real dataset called Simple PS data, which contains multiple objects with varying structures and materials. Experimental results on a publicly available benchmark dataset demonstrate that our method outperforms most existing calibrated photometric stereo methods for the same number of input images, especially in the case of highly non-convex object structures. Our method also obtains good results under sparse lighting conditions.

Submitted 14 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

Comments: 17 pages, 12 figures

Journal ref: Photonics 2023, 10(5), 548

arXiv:2404.07436 [pdf, other]

Measurement of $e^{+}e^{-}\to ωη^{\prime}$ cross sections at $\sqrt{s}=$ 2.000 to 3.080 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (599 additional authors not shown)

Abstract: The Born cross sections for the process $e^{+}e^{-}\to ωη^{\prime}$ are measured at 22 center-of-mass energies from 2.000 to 3.080 GeV using data collected with the BESIII detector at the BEPCII collider. A resonant structure is observed with a statistical significance of 9.6$σ$. A Breit-Wigner fit determines its mass to be $M_R=(2153\pm30\pm31)~{\rm{MeV}}/c^{2}$ and its width to be $Γ_{R}=(167\pm77\pm7)~\rm{MeV}$, where the first uncertainties are statistical and the second are systematic.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.07430 [pdf, other]

Geometric deformation and redshift structure caused by plane gravitational waves

Authors: Ke Wang, Chao-Jun Feng

Abstract: The curved spacetime induced by gravitational waves can give rise to visual effects such as geometric distortions and redshift structures in the observed image. By establishing a mapping from the object's surface coordinates to the observer's screen coordinates, we study these effects in the context of plane gravitational waves. The simulation reveals that the image of an object doesn't merely seem compressed or stretched, but rather appears twisted and wobbled. Furthermore, the redshift structure on the object's surface appears to rotate as a whole. This outcome offers an intuitive depiction of the lensing effect in plane gravitational wave spacetimes.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 6 pages 2 columns, 3 figures

arXiv:2404.06722 [pdf, other]

Fuel-optimal powered descent guidance for lunar pinpoint landing using neural networks

Authors: Kun Wang, Zheng Chen, Jun Li

Abstract: This paper presents a Neural Network (NN) based approach for designing the Fuel-Optimal Powered Descent Guidance (FOPDG) for lunar pinpoint landing. According to Pontryagin's Minimum Principle, the optimality conditions are first derived. To generate the dataset of optimal trajectories for training NNs, we formulate a parameterized system, which allows for generating each optimal trajectory by a simple propagation without using any optimization method. Then, a dataset containing the optimal state and optimal thrust vector pairs can be readily collected. Since it is challenging for NNs to approximate the bang-bang (or discontinuous) type of optimal thrust magnitude, we introduce a regularization function to the switching function so that the regularized switching function approximated by a simple NN can be used to represent the optimal thrust magnitude. Meanwhile, two other well-trained NNs are used to predict the thrust steering angle and time of flight given a flight state. Finally, numerical simulations show that the proposed method is capable of generating the FOPDG that steers the lunar lander to the desired landing site with acceptable landing errors.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.06718 [pdf, other]

Measurement of the Born cross section for $e^{+}e^{-}\to ηh_c $ at center-of-mass energies between 4.1 and 4.6\,GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: We measure the Born cross section for the reaction $e^{+}e^{-} \rightarrow ηh_c$ from $\sqrt{s} = 4.129$ to $4.600$~GeV using data sets collected by the BESIII detector running at the BEPCII collider. A resonant structure in the cross section line shape near 4.200~GeV is observed with a statistical significance of 7$σ$. The parameters of this resonance are measured to be \MeasMass\ and \MeasWidth, where the first uncertainties are statistical and the second systematic.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.06056 [pdf, other]

Demonstration of Lossy Linear Transformations and Two-Photon Interference on a Photonic Chip

Authors: Kai Wang, Simon J. U. White, Alexander Szameit, Andrey A. Sukhorukov, Alexander S. Solntsev

Abstract: Studying quantum correlations in the presence of loss is of critical importance for the physical modeling of real quantum systems. Here, we demonstrate the control of spatial correlations between entangled photons in a photonic chip, designed and modeled using the singular value decomposition approach. We show that engineered loss, using an auxiliary waveguide, allows one to invert the spatial statistics from bunching to antibunching. Furthermore, we study the photon statistics within the loss-emulating channel and observe photon coincidences, which may provide insights into the design of quantum photonic integrated chips.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 5 pages, 4 figures

arXiv:2404.05973 [pdf, ps, other]

Search for the Rare Decays $D_s^+\to h^+(h^{0})e^+e^-$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (618 additional authors not shown)

Abstract: Using 7.33~fb$^{-1}$ of $e^{+}e^{-}$ collision data collected by the BESIII detector at center-of-mass energies in the range of $\sqrt{s}=4.128 - 4.226$~GeV, we search for the rare decays $D_{s}^+\to h^+(h^{0})e^{+}e^{-}$, where $h$ represents a kaon or pion. By requiring the $e^{+}e^{-}$ invariant mass to be consistent with a $φ(1020)$, $0.98<M(e^{+}e^{-})<1.04$ ~GeV/$c^2$, the decay $D_s^+\toπ^+φ,φ\to e^{+}e^{-}$ is observed with a statistical significance of 7.8$σ$, and evidence for the decay $D_s^+\toρ^+φ,φ\to e^{+}e^{-}$ is found for the first time with a statistical significance of 4.4$σ$. The decay branching fractions are measured to be $\mathcal{B}(D_s^+\toπ^+φ, φ\to e^{+}e^{-} )=(1.17^{+0.23}_{-0.21}\pm0.03)\times 10^{-5}$, and $\mathcal{B}(D_s^+\toρ^+φ, φ\to e^{+}e^{-} )=(2.44^{+0.67}_{-0.62}\pm 0.16)\times 10^{-5}$, where the first uncertainties are statistical and the second systematic. No significant signal for the three four-body decays of $D_{s}^{+}\to π^{+}π^{0}e^{+}e^{-},\ D_{s}^{+}\to K^{+}π^{0}e^{+}e^{-}$, and $D_{s}^{+}\to K_{S}^{0}π^{+}e^{+}e^{-}$ is observed. For $D_{s}^{+}\to π^{+}π^{0}e^{+}e^{-}$, the $φ$ mass region is vetoed to minimize the long-distance effects. The 90$\%$ confidence level upper limits set on the branching fractions of these decays are in the range of $(7.0-8.1)\times 10^{-5}$.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 10 pages, 2 figures, 1 table

arXiv:2404.05960 [pdf, other]

EasyTrack: Efficient and Compact One-stream 3D Point Clouds Tracker

Authors: Baojie Fan, Wuyang Zhou, Kai Wang, Shijun Zhou, Fengyu Xu, Jiandong Tian

Abstract: Most 3D single object trackers (SOT) in point clouds follow the two-stream multi-stage 3D Siamese or motion tracking paradigms, which process the template and search area point clouds with two parallel branches, built on supervised point cloud backbones. In this work, beyond typical 3D Siamese or motion tracking, we propose a neat and compact one-stream transformer 3D SOT paradigm from a novel perspective, termed \textbf{EasyTrack}, which consists of three special designs: 1) A 3D point clouds tracking feature pre-training module is developed to exploit the masked autoencoding for learning 3D point clouds tracking representations. 2) A unified 3D tracking feature learning and fusion network is proposed to simultaneously learn target-aware 3D features and extensively capture mutual correlation through the flexible self-attention mechanism. 3) A target location network in the dense bird's eye view (BEV) feature space is constructed for target classification and regression. Moreover, we develop an enhanced version named EasyTrack++, which designs the center points interaction (CPI) strategy to reduce the ambiguous targets caused by the noise point cloud background information. The proposed EasyTrack and EasyTrack++ set a new state-of-the-art performance ($\textbf{18\%}$, $\textbf{40\%}$ and $\textbf{3\%}$ success gains) in KITTI, NuScenes, and Waymo while running at \textbf{52.6fps} with few parameters (\textbf{1.3M}). The code will be available at https://github.com/KnightApple427/Easytrack.

Submitted 12 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.05340 [pdf, other]

doi 10.1021/acsphotonics.3c01287

Robust Classical and Quantum Polarimetry with a Single Nanostructured Metagrating

Authors: Shaun Lung, Kai Wang, Nicolas R. H. Pedersen, Frank Setzpfandt, Andrey A. Sukhorukov

Abstract: We formulate a new conceptual approach for one-shot complete polarization state measurement with nanostructured metasurfaces applicable to classical light and multi-photon quantum states, by drawing on the principles of generalized quantum measurements based on positive operator-valued measures (POVMs). Accurate polarization reconstruction from a combination of photon counts or correlations from several diffraction orders is robust with respect to even strong fabrication inaccuracies, requiring only a single classical calibration of metasurface transmission. Furthermore, this approach operates with a single metagrating without interleaving, allowing for metasurface size reduction while preserving the high transmission efficiency and output beam quality. We theoretically obtained original metasurface designs, fabricated the metasurface from amorphous silicon nanostructures deposited on glass, and experimentally confirmed accurate polarization reconstruction for laser beams. We also anticipate robust operation under changes in environmental conditions, opening new possibilities for space-based imaging and satellite optics.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 27 pages including bibliography and supplementary; 6 figures including supplementary

Journal ref: ACS Photonics 2024, 11, 3, 1060-1067

arXiv:2404.04917 [pdf, ps, other]

Search for $η_c(2S)\to 2(π^+π^-)$ and improved measurement of $χ_{cJ}\to 2(π^+π^-)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: We search for the hadronic decay $η_c(2S)\to 2(π^+π^-)$ in the $ψ(3686)\toγη_c(2S)$ radiative decay using $(27.12\pm 0.14)\times 10^8$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. No significant signal is found, and the upper limit of $\mathcal{B}[ψ(3686)\toγη_c(2S)]\mathcal{B}[η_c(2S)\to 2(π^+π^-)]$ is determined to be $0.78\times 10^{-6}$ at the 90\% confidence level. Using $ψ(3686)\toγχ_{cJ}$ transitions, we also measure the branching fractions of $\mathcal{B}[χ_{cJ(J=0,1,2)}\to 2(π^+π^-)]$, which are $\mathcal{B}[χ_{c0}\to 2(π^+π^-)]=(2.127\pm 0.002~(\mathrm{stat.})\pm 0.101~(\mathrm{syst.}))$\%, $\mathcal{B}[χ_{c1}\to 2(π^+π^-)]=(0.685\pm 0.001~(\mathrm{stat.})\pm 0.031~(\mathrm{syst.}))$\%, and $\mathcal{B}[χ_{c2}\to 2(π^+π^-)]=(1.153\pm 0.001~(\mathrm{stat.})\pm 0.063~(\mathrm{syst.}))$\%.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.04801 [pdf, ps, other]

doi 10.1007/s41605-024-00467-8

LHAASO-KM2A detector simulation using Geant4

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (254 additional authors not shown)

Abstract: KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is an important foundation for estimating detector performance and for data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with a large altitude difference (30 m) and huge coverage (1.3 km^2). In this paper, the design of the KM2A simulation code G4KM2A based on Geant4 is introduced. The process of G4KM2A is optimized mainly in memory consumption to avoid memory overflow. Some simplifications are used to significantly speed up the execution of G4KM2A. The running time is reduced by at least 30 times compared to full detector simulation. The particle distributions and the core/angle resolution comparison between simulation and experimental data of the full KM2A array are also presented, which show good agreement.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.04640 [pdf, other]

Search for di-photon decays of an axion-like particle in radiative decays of J/psi

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko , et al. (604 additional authors not shown)

Abstract: We search for the di-photon decay of a light pseudoscalar axion-like particle, $a$, in radiative decays of the $J/ψ$, using 10 billion $J/ψ$ events collected with the BESIII detector. We find no evidence of a narrow resonance and set upper limits at the $95\%$ confidence level on the product branching fraction $\mathcal{B}(J/ψ\to γa) \times \mathcal{B}(a \to γγ)$ and the axion-like particle photon coupling constant $g_{a γγ}$ in the ranges of $(3.6-49.8) \times 10^{-8}$ and $(2.2 -103.8)\times 10^{-4}$ GeV$^{-1}$, respectively, for $0.18 \le m_a \le 2.85~$ GeV/$c^2$. These are the most stringent limits to date in this mass region.

Submitted 6 April, 2024; originally announced April 2024.

Comments: 9 pages, 5 figures, Submitted to Phys. Rev. D (Letter)

Report number: BESIII Analysis Memo - 671

arXiv:2404.04545 [pdf, other]

TCAN: Text-oriented Cross Attention Network for Multimodal Sentiment Analysis

Authors: Ming Zhou, Weize Quan, Ziqi Zhou, Kai Wang, Tong Wang, Dong-Ming Yan

Abstract: Multimodal Sentiment Analysis (MSA) endeavors to understand human sentiment by leveraging language, visual, and acoustic modalities. Despite the remarkable performance exhibited by previous MSA approaches, the presence of inherent multimodal heterogeneities poses a challenge, with the contribution of different modalities varying considerably. Past research predominantly focused on improving representation learning techniques and feature fusion strategies. However, many of these efforts overlooked the variation in semantic richness among different modalities, treating each modality uniformly. This approach may lead to underestimating the significance of strong modalities while overemphasizing the importance of weak ones. Motivated by these insights, we introduce a Text-oriented Cross-Attention Network (TCAN), emphasizing the predominant role of the text modality in MSA. Specifically, for each multimodal sample, by taking unaligned sequences of the three modalities as inputs, we initially allocate the extracted unimodal features into a visual-text and an acoustic-text pair. Subsequently, we implement self-attention on the text modality and apply text-queried cross-attention to the visual and acoustic modalities. To mitigate the influence of noise signals and redundant features, we incorporate a gated control mechanism into the framework. Additionally, we introduce unimodal joint learning to gain a deeper understanding of homogeneous emotional tendencies across diverse modalities through backpropagation. Experimental results demonstrate that TCAN consistently outperforms state-of-the-art MSA methods on two datasets (CMU-MOSI and CMU-MOSEI).

Submitted 6 April, 2024; originally announced April 2024.

arXiv:2404.03926 [pdf, other]

Fuel-Optimal Trajectory Planning for Lunar Vertical Landing

Authors: Kun Wang, Zheng Chen, Jun Li

Abstract: In this paper, we consider a trajectory planning problem arising from a lunar vertical landing with minimum fuel consumption. The vertical landing requirement is written as a final steering angle constraint, and a nonnegative regularization term is proposed to modify the cost functional. In this way, the final steering angle constraint will be inherently satisfied according to Pontryagin's Minimum Principle. As a result, the modified optimal steering angle has to be determined by solving a transcendental equation. To this end, a transforming procedure is employed, which allows for finding the desired optimal steering angle by a simple bisection method. Consequently, the modified optimal control problem can be solved by the indirect shooting method. Finally, some numerical examples are presented to demonstrate and verify the developments of the paper.

Submitted 5 April, 2024; originally announced April 2024.

arXiv:2404.03858 [pdf, other]

INvestigations of massive Filaments ANd sTar formation (INFANT). I. Core Identification and Core Mass Function

Authors: Yu Cheng, Xing Lu, Patricio Sanhueza, Hauyu Baobab Liu, Qizhou Zhang, Roberto Galván-Madrid, Ke Wang, Fumitaka Nakamura, Tie Liu, Siyi Feng, Shanghuo Li, Sihan Jiao, Kei E. I. Tanaka, Xunchuan Liu, Pak Shing Li, Qiuyi Luo, Qilao Gu, Yuxin Lin, András E. Guzmán

Abstract: Filamentary structures are ubiquitously found in high-mass star-forming clouds. To investigate the relationship between filaments and star formation, we carry out the INFANT (INvestigations of massive Filaments ANd sTar formation) survey, a multi-scale, multi-wavelength survey of massive filamentary clouds with ALMA band 3/band 6 and VLA K band. In this first paper, we present the ALMA band 6 continuum observations toward a sample of 8 high-mass star-forming filaments. We covered each target with an approximately rectangular mosaic field of view with two 12-m array configurations, achieving an angular resolution of $\sim$0.6" (2700 AU at 4.5 kpc) and a continuum rms of $\sim$0.1 mJy/beam ($\sim$0.06 Msun in gas mass assuming 15 K). We identify cores using getsf and astrodendro and find the former is more robust in terms of both identification and measuring flux densities. We identify in total 183 dense cores (15--36 cores in each cloud) and classify their star formation states via outflow and warm gas tracers. The protostellar cores are statistically more massive than the prestellar cores, possibly indicating further accretion onto cores after formation of protostars. For the high-mass end ($M_\text{core}$ $>$ 1.5 Msun) of the core mass function (CMF) we derive a power-law index of $-$1.15 $\pm$ 0.12 for the whole sample, and $-$1.70 $\pm$ 0.25 for the prestellar population. We also find a steepening trend in CMF with cloud evolution ($-$0.89 $\pm$ 0.15 for the young group vs. $-$1.44 $\pm$ 0.25 for the evolved group) and discuss its implication for cluster formation.

Submitted 4 April, 2024; originally announced April 2024.

Comments: 25 pages, 8 figures, accepted for ApJ

arXiv:2404.03253 [pdf, other]

A dataset of primary nasopharyngeal carcinoma MRI with multi-modalities segmentation

Authors: Yin Li, Qi Chen, Kai Wang, Meige Li, Li** Si, Yingwei Guo, Yu Xiong, Qixing Wang, Yang Qin, Ling Xu, Patrick van der Smagt, Jun Tang, Nutan Chen

Abstract: Multi-modality magnetic resonance imaging data with various sequences facilitate the early diagnosis, tumor segmentation, and disease staging in the management of nasopharyngeal carcinoma (NPC). The lack of publicly available, comprehensive datasets limits advancements in diagnosis, treatment planning, and the development of machine learning algorithms for NPC. Addressing this critical need, we introduce the first comprehensive NPC MRI dataset, encompassing MR axial imaging of 277 primary NPC patients. This dataset includes T1-weighted, T2-weighted, and contrast-enhanced T1-weighted sequences, totaling 831 scans. In addition to the corresponding clinical data, manually annotated and labeled segmentations by experienced radiologists offer high-quality data resources from untreated primary NPC.

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2404.03223 [pdf, ps, other]

Blow up analysis for a parabolic MEMS problem, I: Hölder estimate

Authors: Kelei Wang, Guangzeng Yi

Abstract: This is the first in a series of papers devoted to the blow up analysis for the quenching phenomena in a parabolic MEMS equation. In this paper, we first give an optimal Hölder estimate for solutions to this equation by using the blow up method and some Liouville theorems on stationary two-valued caloric functions, and then establish a convergence theory for sequences of uniformly Hölder continuous solutions. These results are also used to prove a stratification theorem on the rupture set $\{u=0\}$.

Submitted 4 April, 2024; originally announced April 2024.

Comments: 34 pages. Comments welcome

MSC Class: 35K58; 35B44; 35B45

arXiv:2404.03217 [pdf, other]

Evidence of the $h_c\to K_S^0 K^+π^-+c.c.$ decay

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Based on $(2.712\pm0.014)\times10^9$ $ψ(3686)$ events collected by the BESIII collaboration, evidence of the hadronic decay $h_c\to K_S^0K^+π^-+c.c.$ is found with a significance of $4.3σ$ in the $ψ(3686)\toπ^0 h_c$ process. The branching fraction of $h_c\to K_S^0 K^+π^- +c.c.$ is measured to be $(7.3\pm0.8\pm1.8)\times10^{-4}$, where the first and second uncertainties are statistical and systematic, respectively. Combining with the exclusive decay width of $η_c\to K\bar{K}π$, our result indicates inconsistencies with both pQCD and NRQCD predictions.

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2404.02432 [pdf]

GNSS Spoofing Detection by Crowdsourcing Double Differential Pseudorange Spatial Distribution

Authors: Xin Chen, Kai Wang

Abstract: It is widely known that spoofing is a major threat that adversely impacts the reliability and accuracy of GNSS applications. In this study, a crowdsourcing double differential pseudorange spatial (D2SP) random set is constructed and the distribution of the set is derived. Based on the variance of the D2SP set, a tri-level hypothesis detection algorithm is designed to classify spoofing-free, fully-spoofed, and partially-spoofed cases in the region of interest (ROI). It does not require prior knowledge of the truth positions or relative distances of the receivers. Simulation test results show that the proposed D2SP spoofing detection method has the advantages of lower computational complexity and higher tolerance for multipath errors compared with the generalized likelihood ratio test (GLRT) method, which is the current mainstream spoofing detection algorithm based on multiple receivers' differential pseudoranges. Moreover, it also shows better flexibility for different sizes of ROI and numbers of crowdsourcing receivers.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.02275 [pdf, other]

The ALMA-QUARKS Survey: II. the ACA 1.3 mm continuum source catalog and the assembly of dense gas in massive star-forming clumps

Authors: Fengwei Xu, Ke Wang, Tie Liu, Lei Zhu, Guido Garay, Xunchuan Liu, Paul Goldsmith, Qizhou Zhang, Patricio Sanhueza, Shengli Qin, **hua He, Mika Juvela, Anandmayee Tej, Hongli Liu, Shanghuo Li, Kaho Morii, Siju Zhang, Jianwen Zhou, Amelia Stutz, Neal J. Evans, Kim Kee-Tae, Shengyuan Liu, Diego Mardones, Guangxing Li, Leonardo Bronfman , et al. (8 additional authors not shown)

Abstract: Leveraging the high resolution, high sensitivity, and wide frequency coverage of the Atacama Large Millimeter/submillimeter Array (ALMA), the QUARKS survey, standing for "Querying Underlying mechanisms of massive star formation with ALMA-Resolved gas Kinematics and Structures", is observing 139 massive star-forming clumps at ALMA Band 6 ($λ\sim$ 1.3 mm). This paper introduces the Atacama Compact Array (ACA) 7-m data. Combining multi-wavelength data, we provide the first edition of the QUARKS atlas, offering insights into the multiscale and multiphase interstellar medium in high-mass star formation. The ACA 1.3 mm catalog includes 207 continuum sources that are called ACA sources. Their gas kinetic temperatures are estimated using three formaldehyde (H$_2$CO) transitions with a non-LTE radiation transfer model, and the mass and density are derived from a dust emission model. The ACA sources are massive (16-84 percentile values of 6-160 $M_{\odot}$), gravity-dominated ($M\propto R^{1.1}$) fragments within massive clumps, with supersonic turbulence ($\mathcal{M}>1$) and embedded star-forming protoclusters. We find a linear correlation between the masses of the fragments and the massive clumps, with a ratio of 6% between the two. When considering the fragments as representative of dense gas, the ratio indicates a dense gas fraction (DGF) of 6%, although with a wide scatter ranging from 1% to 10%. If we consider the QUARKS massive clumps to be what is observed at various scales, then the size-independent DGF indicates a self-similar fragmentation or collapsing mode in protocluster formation. With the ACA data over four orders of magnitude of luminosity-to-mass ratio ($L/M$), we find that the DGF increases significantly with $L/M$, which indicates clump evolutionary stage. We observed limited fragmentation at the subclump scale, which can be explained by a dynamic global collapse process.

Submitted 4 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 24 pages, 7 figures. Accepted for publication in Research in Astronomy and Astrophysics. QUARKS atlas link: https://drive.google.com/file/d/1KTqXxCDduYepvLd9kIvZVSSytK48OmfL/view?usp=sharing

arXiv:2404.02033 [pdf, other]

Search for $C$-even states decaying to $D_{s}^{\pm}D_{s}^{*\mp}$ with masses between $4.08$ and $4.32$ $\rm GeV/{\it c}^{2}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Six $C$-even states, denoted as $X$, with quantum numbers $J^{PC}=0^{-+}$, $1^{\pm+}$, or $2^{\pm+}$, are searched for via the $e^+e^-\toγD_{s}^{\pm}D_{s}^{*\mp}$ process using $(1667.39\pm8.84)~\mathrm{pb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII storage ring at center-of-mass energy of $\sqrt{s}=(4681.92\pm0.30)~\mathrm{MeV}$. No statistically significant signal is observed in the mass range from $4.08$ to $4.32~\mathrm{GeV}/c^{2}$. The upper limits of $σ[e^+e^-\toγX]\cdot \mathcal{B}[X \to D_{s}^{\pm}D_{s}^{*\mp}]$ at a $90\%$ confidence level are determined.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.01969 [pdf]

Analytical photoresponses of gated nanowire photoconductors

Authors: Yinchu Shen, Jia**g He, Yang Xu, Kaiyou Wang, Ya** Dan

Abstract: Low-dimensional photoconductors have extraordinarily high photoresponse and gain, which can be modulated by gate voltages as shown in the literature. However, the physics of gate modulation remains elusive. In this work, we investigated the physics of gate modulation in silicon nanowire photoconductors with the analytical photoresponse equations. It was found that the impact of gate voltage varies vastly for nanowires of different sizes. For wide nanowires that cannot be pinched off by a high gate voltage, we found that the photoresponses are enhanced by at least one order of magnitude due to the gate-induced electric passivation. For narrow nanowires that start with a pinched-off channel, the gate voltage has no electric passivation effect but increases the potential barrier between source and drain, resulting in a decrease in dark and photo current. For nanowires of intermediate size, the channel is continuous but can be pinched off by a high gate voltage. The photoresponsivity and photodetectivity are maximized during the transition from the continuous channel to the pinched-off one. This work provides important insights on how to design high-performance photoconductors.

Submitted 2 April, 2024; originally announced April 2024.

Comments: 4 figures, 18 pages

Journal ref: ACS Nano 2024

arXiv:2404.01360 [pdf, other]

Harnessing Data and Physics for Deep Learning Phase Recovery

Authors: Kaiqiang Wang, Edmund Y. Lam

Abstract: Phase recovery, calculating the phase of a light wave from its intensity measurements, is essential for various applications, such as coherent diffraction imaging, adaptive optics, and biomedical imaging. It enables the reconstruction of an object's refractive index distribution or topography as well as the correction of imaging system aberrations. In recent years, deep learning has been proven to be highly effective in addressing phase recovery problems. Two main deep learning phase recovery strategies are data-driven (DD) with supervised learning mode and physics-driven (PD) with self-supervised learning mode. DD and PD achieve the same goal in different ways and lack the necessary study to reveal similarities and differences. Therefore, in this paper, we comprehensively compare these two deep learning phase recovery strategies in terms of time consumption, accuracy, generalization ability, ill-posedness adaptability, and prior capacity. What's more, we propose a co-driven (CD) strategy of combining datasets and physics for the balance of high- and low-frequency information. The codes for DD, PD, and CD are publicly available at https://github.com/kqwang/DLPR.

Submitted 1 April, 2024; originally announced April 2024.

Comments: 26 pages, 10 figures

arXiv:2404.01027 [pdf]

Easy-to-configure zero-dimensional valley-chiral modes in a graphene point junction

Authors: Konstantin Davydov, Xi Zhang, Wei Ren, Matthew Coles, Logan Kline, Bryan Zucker, Kenji Watanabe, Takashi Taniguchi, Ke Wang

Abstract: The valley degree of freedom in 2D materials can be manipulated for low-dissipation quantum electronics called valleytronics. At the boundary between two regions of bilayer graphene with different atomic or electrostatic configuration, valley-polarized current has been realized. However, the demanding fabrication and operation requirements limit device reproducibility and scalability toward more advanced valleytronics circuits. We demonstrate a new device architecture of a point junction where a valley-chiral 0D PN junction is easily configured, switchable, and capable of carrying valley current with an estimated polarization of ~80%. This work provides a new building block in manipulating valley quantum numbers and scalable valleytronics.

Submitted 1 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

arXiv:2404.00729 [pdf, other]

Nonparametric End-to-End Probabilistic Forecasting of Distributed Generation Outputs Considering Missing Data Imputation

Authors: Minghui Chen, Zichao Meng, Yan** Liu, Longbo Luo, Ye Guo, Kang Wang

Abstract: In this paper, we introduce a nonparametric end-to-end method for probabilistic forecasting of distributed renewable generation outputs while including missing data imputation. Firstly, we employ a nonparametric probabilistic forecast model utilizing the long short-term memory (LSTM) network to model the probability distributions of distributed renewable generations' outputs. Secondly, we design an end-to-end training process that includes missing data imputation through iterative imputation and iterative loss-based training procedures. This two-step modeling approach effectively combines the strengths of the nonparametric method with the end-to-end approach. Consequently, our approach demonstrates exceptional capabilities in probabilistic forecasting for the outputs of distributed renewable generations while effectively handling missing values. Simulation results confirm the superior performance of our approach compared to existing alternatives.

Submitted 31 March, 2024; originally announced April 2024.

Showing 151–200 of 3,737 results for author: Wang, K