-
Galaxy Mergers in the Epoch of Reionization I: A JWST Study of Pair Fractions, Merger Rates, and Stellar Mass Accretion Rates at $z = 4.5-11.5$
Authors:
Qiao Duan,
Christopher J. Conselice,
Qiong Li,
Duncan Austin,
Thomas Harvey,
Nathan J. Adams,
Kenneth J. Duncan,
James Trussler,
Leonardo Ferreira,
Lewi Westcott,
Honor Harris,
Rogier A. Windhorst,
Benne W. Holwerda,
Thomas J. Broadhurst,
Dan Coe,
Seth H. Cohen,
Simon P. Driver,
Brenda Frye,
Norman A. Grogin,
Nimish P. Hathi,
Rolf A. Jansen,
Anton M. Koekemoer,
Madeline A. Marshall,
Mario Nonino,
Rafael Ortiz III
, et al. (7 additional authors not shown)
Abstract:
We present a full analysis of galaxy major merger pair fractions, merger rates, and mass accretion rates, thus uncovering the role of mergers in galaxy formation at the earliest previously unexplored epoch of $4.5<z<11.5$. We target galaxies with masses $\log_{10}(\mathrm{M}_*/\mathrm{M}_\odot) = 8.0 - 10.0$, utilizing data from eight JWST Cycle-1 fields (CEERS, JADES GOODS-S, NEP-TDF, NGDEEP, GLASS, El-Gordo, SMACS-0723, MACS-0416), covering an unmasked area of 189.36 $\mathrm{arcmin}^2$. We develop a new probabilistic pair-counting methodology that integrates full photometric redshift posteriors and corrects for detection incompleteness to quantify close pairs with physical projected separations between 20 and 50 kpc. Our analysis reveals an increase in pair fractions up to $z = 8$, reaching $0.211 \pm 0.065$, followed by a statistically flat evolution to $z = 11.5$. We find that the galaxy merger rate increases from the local Universe up to $z = 6$ and then stabilizes at a value of $\sim 6$ Gyr$^{-1}$ up to $z = 11.5$. We fit both a power-law and a power-law + exponential model to our pair fraction and merger rate redshift evolution, finding that the latter model describes the trends more accurately, particularly at $z = 8.0 - 11.5$. In addition, we measure that the average galaxy increases its stellar mass due to mergers by a factor of $2.77 \pm 0.99$ from redshift $z = 10.5$ to $z = 5.0$. Lastly, we investigate the impact of mergers on galaxy stellar mass growth, revealing that mergers contribute $71 \pm 25\%$ as much to galaxy stellar mass increases as star formation from gas. This indicates that mergers drive about half of galaxy assembly at high redshift.
Submitted 12 July, 2024;
originally announced July 2024.
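A rough reading of the last two sentences of the abstract above, as a worked sketch: if mergers add a fraction $r = 0.71 \pm 0.25$ of the stellar mass that star formation adds, then the merger share of the total growth is $r/(1+r)$. The symbols $r$ and $f_{\mathrm{merge}}$ are introduced here for illustration only, and the uncertainty uses first-order error propagation, which is an assumption and not necessarily how the authors arrive at "about half".

```latex
% r: merger-driven mass growth relative to star-formation-driven growth (from the abstract)
% f_merge: share of total stellar mass growth due to mergers (illustrative symbol)
\begin{align}
  f_{\mathrm{merge}} &= \frac{r}{1 + r} = \frac{0.71}{1.71} \approx 0.42, \\
  \sigma_{f} &\approx \frac{\sigma_r}{(1 + r)^2} = \frac{0.25}{1.71^2} \approx 0.09 .
\end{align}
```

Under these assumptions the merger share is roughly $0.42 \pm 0.09$, which is compatible with the abstract's statement that mergers drive about half of galaxy assembly.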
-
Provable Privacy Advantages of Decentralized Federated Learning via Distributed Optimization
Authors:
Wenrui Yu,
Qiongxiu Li,
Milan Lopuhaä-Zwakenberg,
Mads Græsbøll Christensen,
Richard Heusdens
Abstract:
Federated learning (FL) emerged as a paradigm designed to improve data privacy by enabling data to reside at its source, thus embedding privacy as a core consideration in FL architectures, whether centralized or decentralized. Contrasting with recent findings by Pasquini et al., which suggest that decentralized FL does not empirically offer any additional privacy or security benefits over centralized models, our study provides compelling evidence to the contrary. We demonstrate that decentralized FL, when deploying distributed optimization, provides enhanced privacy protection - both theoretically and empirically - compared to centralized approaches. The challenge of quantifying privacy loss through iterative processes has traditionally constrained the theoretical exploration of FL protocols. We overcome this by conducting a pioneering in-depth information-theoretical privacy analysis for both frameworks. Our analysis, considering both eavesdropping and passive adversary models, successfully establishes bounds on privacy leakage. We show information theoretically that the privacy loss in decentralized FL is upper bounded by the loss in centralized FL. Compared to the centralized case where local gradients of individual participants are directly revealed, a key distinction of optimization-based decentralized FL is that the relevant information includes differences of local gradients over successive iterations and the aggregated sum of different nodes' gradients over the network. This information complicates the adversary's attempt to infer private data. To bridge our theoretical insights with practical applications, we present detailed case studies involving logistic regression and deep neural networks. These examples demonstrate that while privacy leakage remains comparable in simpler models, complex models like deep neural networks exhibit lower privacy risks under decentralized FL.
Submitted 12 July, 2024;
originally announced July 2024.
-
Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs
Authors:
Aobo Kong,
Shiwan Zhao,
Hao Chen,
Qicheng Li,
Yong Qin,
Ruiqi Sun,
Xin Zhou,
Jiaming Zhou,
Haoqin Sun
Abstract:
Recent advancements in LLMs have showcased their remarkable role-playing capabilities, able to accurately simulate the dialogue styles and cognitive processes of various roles based on different instructions and contexts. Studies indicate that assigning LLMs the roles of experts, a strategy known as role-play prompting, can enhance their performance in the corresponding domains. However, the prompt needs to be manually designed for the given problem, requiring certain expertise and iterative modifications. To this end, we propose self-prompt tuning, making LLMs themselves generate role-play prompts through fine-tuning. Leveraging the LIMA dataset as our foundational corpus, we employ GPT-4 to annotate role-play prompts for each data point, resulting in the creation of the LIMA-Role dataset. We then fine-tune LLMs like Llama-2-7B and Mistral-7B on LIMA-Role. Consequently, the self-prompt tuned LLMs can automatically generate expert role prompts for any given question. We extensively evaluate self-prompt tuned LLMs on widely used NLP benchmarks and an open-ended question test. Our empirical results illustrate that self-prompt tuned LLMs outperform standard instruction tuned baselines across most datasets. This highlights the great potential of utilizing fine-tuning to enable LLMs to self-prompt, thereby automating complex prompting strategies. We release the dataset, models, and code at https://anonymous.4open.science/r/Self-Prompt-Tuning-739E/.
Submitted 12 July, 2024;
originally announced July 2024.
-
Bora: Biomedical Generalist Video Generation Model
Authors:
Weixiang Sun,
Xiaocao You,
Ruizhe Zheng,
Zhengqing Yuan,
Xiang Li,
Lifang He,
Quanzheng Li,
Lichao Sun
Abstract:
Generative models hold promise for revolutionizing medical education, robot-assisted surgery, and data augmentation for medical AI development. Diffusion models can now generate realistic images from text prompts, while recent advancements have demonstrated their ability to create diverse, high-quality videos. However, these models often struggle with generating accurate representations of medical procedures and detailed anatomical structures. This paper introduces Bora, the first spatio-temporal diffusion probabilistic model designed for text-guided biomedical video generation. Bora leverages Transformer architecture and is pre-trained on general-purpose video generation tasks. It is fine-tuned through model alignment and instruction tuning using a newly established medical video corpus, which includes paired text-video data from various biomedical fields. To the best of our knowledge, this is the first attempt to establish such a comprehensive annotated biomedical video dataset. Bora is capable of generating high-quality video data across four distinct biomedical domains, adhering to medical expert standards and demonstrating consistency and diversity. This generalist video generative model holds significant potential for enhancing medical consultation and decision-making, particularly in resource-limited settings. Additionally, Bora could pave the way for immersive medical training and procedure planning. Extensive experiments on distinct medical modalities such as endoscopy, ultrasound, MRI, and cell tracking validate the effectiveness of our model in understanding biomedical instructions and its superior performance across subjects compared to state-of-the-art generation models.
Submitted 11 July, 2024;
originally announced July 2024.
-
Distributed Backdoor Attacks on Federated Graph Learning and Certified Defenses
Authors:
Yuxin Yang,
Qiang Li,
Jinyuan Jia,
Yuan Hong,
Binghui Wang
Abstract:
Federated graph learning (FedGL) is an emerging federated learning (FL) framework that extends FL to learn graph data from diverse sources. FL for non-graph data has been shown to be vulnerable to backdoor attacks, which inject a shared backdoor trigger into the training data such that the trained backdoored FL model can predict the testing data containing the trigger as the attacker desires. However, FedGL against backdoor attacks is largely unexplored, and no effective defense exists.
In this paper, we aim to address this significant deficiency. First, we propose an effective, stealthy, and persistent backdoor attack on FedGL. Our attack uses a subgraph as the trigger and designs an adaptive trigger generator that can derive the effective trigger location and shape for each graph. Our attack shows that empirical defenses struggle to detect or remove our generated triggers. To mitigate it, we further develop a certified defense for any backdoored FedGL model against the trigger with any shape at any location. Our defense involves carefully dividing a testing graph into multiple subgraphs and designing a majority vote-based ensemble classifier on these subgraphs. We then derive the deterministic certified robustness based on the ensemble classifier and prove its tightness. We extensively evaluate our attack and defense on six graph datasets. Our attack results show our attack can obtain > 90% backdoor accuracy in almost all datasets. Our defense results show, in certain cases, the certified accuracy for clean testing graphs against an arbitrary trigger with size 20 can be close to the normal accuracy under no attack, while there is a moderate gap in other cases. Moreover, the certified backdoor accuracy is always 0 for backdoored testing graphs generated by our attack, implying our defense can fully mitigate the attack. Source code is available at: https://github.com/Yuxin104/Opt-GDBA.
Submitted 11 July, 2024;
originally announced July 2024.
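The defense described in the abstract above (divide a testing graph into subgraphs, then take a majority vote) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper or its repository: `split_edges` uses a simple deterministic hash of each edge, `base_classifier` stands in for any trained graph classifier, and `is_certified` applies a generic vote-margin argument (each corrupted subgraph can shift the gap between two labels by at most 2), which is not the paper's theorem.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable

Edge = tuple[int, int]


def split_edges(edges: Iterable[Edge], num_parts: int) -> list[list[Edge]]:
    """Assign every edge to exactly one subgraph with a deterministic hash (assumed rule)."""
    parts: list[list[Edge]] = [[] for _ in range(num_parts)]
    for u, v in edges:
        a, b = min(u, v), max(u, v)  # treat edges as undirected
        parts[(a * 1_000_003 + b) % num_parts].append((u, v))
    return parts


def majority_vote(labels: Iterable[Hashable]) -> tuple[Hashable, int]:
    """Return the winning label and its vote margin over the runner-up."""
    counts = Counter(labels)
    # Sort by votes (descending), then by label, so ties break deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top_label, top_votes = ranked[0]
    runner_up_votes = ranked[1][1] if len(ranked) > 1 else 0
    return top_label, top_votes - runner_up_votes


def ensemble_predict(
    edges: Iterable[Edge],
    num_parts: int,
    base_classifier: Callable[[list[Edge]], Hashable],
) -> tuple[Hashable, int]:
    """Classify each subgraph with the base classifier and combine by majority vote."""
    return majority_vote(base_classifier(part) for part in split_edges(edges, num_parts))


def is_certified(margin: int, max_corrupted_parts: int) -> bool:
    """Generic margin check: the vote cannot flip if the margin exceeds 2 * corrupted parts."""
    return margin > 2 * max_corrupted_parts
```

Because each edge lands in exactly one subgraph under this splitting rule, a trigger that touches a bounded number of edges can corrupt only a bounded number of votes, which is what makes a margin-based check possible in this sketch.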
-
MetaUrban: A Simulation Platform for Embodied AI in Urban Spaces
Authors:
Wayne Wu,
Honglin He,
Yiran Wang,
Chenda Duan,
Jack He,
Zhizheng Liu,
Quanyi Li,
Bolei Zhou
Abstract:
Public urban spaces like streetscapes and plazas serve residents and accommodate social life in all its vibrant variations. Recent advances in Robotics and Embodied AI make public urban spaces no longer exclusive to humans. Food delivery bots and electric wheelchairs have started sharing sidewalks with pedestrians, while diverse robot dogs and humanoids have recently emerged in the street. Ensuring the generalizability and safety of these forthcoming mobile machines is crucial when navigating through the bustling streets in urban spaces. In this work, we present MetaUrban, a compositional simulation platform for Embodied AI research in urban spaces. MetaUrban can construct an infinite number of interactive urban scenes from compositional elements, covering a vast array of ground plans, object placements, pedestrians, vulnerable road users, and other mobile agents' appearances and dynamics. We design point navigation and social navigation tasks as the pilot study using MetaUrban for embodied AI research and establish various baselines of Reinforcement Learning and Imitation Learning. Experiments demonstrate that the compositional nature of the simulated environments can substantially improve the generalizability and safety of the trained mobile agents. MetaUrban will be made publicly available to provide more research opportunities and foster safe and trustworthy embodied AI in urban spaces.
Submitted 11 July, 2024;
originally announced July 2024.
-
Tactics, Techniques, and Procedures (TTPs) in Interpreted Malware: A Zero-Shot Generation with Large Language Models
Authors:
Ying Zhang,
Xiaoyan Zhou,
Hui Wen,
Wenjia Niu,
Jiqiang Liu,
Haining Wang,
Qiang Li
Abstract:
Nowadays, the open-source software (OSS) ecosystem suffers from security threats of software supply chain (SSC) attacks. Interpreted OSS malware plays a vital role in SSC attacks, as criminals have an arsenal of attack vectors to deceive users into installing malware and executing malicious activities. In this paper, we introduce tactics, techniques, and procedures (TTPs) proposed by MITRE ATT&CK into the interpreted malware analysis to characterize different phases of an attack lifecycle. Specifically, we propose GENTTP, a zero-shot approach to extracting a TTP of an interpreted malware package. GENTTP leverages large language models (LLMs) to automatically generate a TTP, where the input is a malicious package, and the output is a deceptive tactic and an execution tactic of attack vectors. To validate the effectiveness of GENTTP, we collect two datasets for evaluation: a dataset with ground truth labels and a large dataset in the wild. Experimental results show that GENTTP can generate TTPs with high accuracy and efficiency. To demonstrate GENTTP's benefits, we build an LLM-based Chatbot from 3,700+ PyPI malware's TTPs. We further conduct a quantitative analysis of malware's TTPs at a large scale. Our main findings include: (1) many OSS malicious packages share a relatively stable TTP, even with the increasing emergence of malware and attack campaigns, (2) a TTP reflects characteristics of a malware-based attack, and (3) an attacker's intent behind the malware is linked to a TTP.
Submitted 11 July, 2024;
originally announced July 2024.
-
Spatially-Variant Degradation Model for Dataset-free Super-resolution
Authors:
Shaojie Guo,
Haofei Song,
Qingli Li,
Yan Wang
Abstract:
This paper focuses on the dataset-free Blind Image Super-Resolution (BISR). Unlike existing dataset-free BISR methods that focus on obtaining a degradation kernel for the entire image, we are the first to explicitly design a spatially-variant degradation model for each pixel. Our method also benefits from having a significantly smaller number of learnable parameters compared to data-driven spatially-variant BISR methods. Concretely, each pixel's degradation kernel is expressed as a linear combination of a learnable dictionary composed of a small number of spatially-variant atom kernels. The coefficient matrices of the atom degradation kernels are derived using membership functions of fuzzy set theory. We construct a novel Probabilistic BISR model with a tailored likelihood function and prior terms. Subsequently, we employ the Monte Carlo EM algorithm to infer the degradation kernels for each pixel. Our method achieves a significant improvement over other state-of-the-art BISR methods, with an average improvement of 1 dB (2x). Code will be released at https://github.com/shaojieguoECNU/SVDSR.
Submitted 11 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15\sigma$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
Submitted 10 July, 2024;
originally announced July 2024.
-
Beyond Benchmarking: A New Paradigm for Evaluation and Assessment of Large Language Models
Authors:
** Liu,
Qingquan Li,
Wenlong Du
Abstract:
In current benchmarks for evaluating large language models (LLMs), there are issues such as evaluation content restriction, untimely updates, and lack of optimization guidance. In this paper, we propose a new paradigm for the measurement of LLMs: Benchmarking-Evaluation-Assessment. Our paradigm shifts the "location" of LLM evaluation from the "examination room" to the "hospital". By conducting a "physical examination" on LLMs, it utilizes specific task-solving as the evaluation content, performs deep attribution of existing problems within LLMs, and provides recommendations for optimization.
Submitted 10 July, 2024;
originally announced July 2024.
-
Ferromagnetic polar metals via epitaxial strain: a case study of SrCoO$_3$
Authors:
Zhiwei Liu,
Qiuyue Li,
Hanghui Chen
Abstract:
While polar metals are a metallic analogue of ferroelectrics, magnetic polar metals can be considered as a metallic analogue of multiferroics. There have been a number of attempts to integrate magnetism into a polar metal by synthesizing new materials or heterostructures. Here we use a simple yet widely used approach--epitaxial strain in the search for intrinsic magnetic polar metals. Via first-principles calculations, we study strain engineering of a ferromagnetic metallic oxide SrCoO$_3$, whose bulk form crystallizes in a cubic structure. We find that under an experimentally feasible biaxial strain on the $ab$ plane, collective Co polar displacements are stabilized in SrCoO$_3$. Specifically, a compressive strain stabilizes Co polar displacements along the $c$ axis, while a tensile strain stabilizes Co polar displacements along the diagonal line in the $ab$ plane. In both cases, we find an intrinsic ferromagnetic polar metallic state in SrCoO$_3$. In addition, we also find that a sufficiently large biaxial strain ($> 4\%$) can yield a ferromagnetic-to-antiferromagnetic transition in SrCoO$_3$. Our work demonstrates that in addition to yielding emergent multiferroics, epitaxial strain is also a viable approach to inducing magnetic polar metallic states in quantum materials.
Submitted 10 July, 2024;
originally announced July 2024.
-
Using Galaxy Evolution as Source of Physics-Based Ground Truth for Generative Models
Authors:
Yun Qi Li,
Tuan Do,
Evan Jones,
Bernie Boscoe,
Kevin Alfaro,
Zooey Nguyen
Abstract:
Generative models producing images have enormous potential to advance discoveries across scientific fields and require metrics capable of quantifying the high dimensional output. We propose that astrophysics data, such as galaxy images, can test generative models with additional physics-motivated ground truths in addition to human judgment. For example, galaxies in the Universe form and change over billions of years, following physical laws and relationships that are both easy to characterize and difficult to encode in generative models. We build a conditional denoising diffusion probabilistic model (DDPM) and a conditional variational autoencoder (CVAE) and test their ability to generate realistic galaxies conditioned on their redshifts (galaxy ages). This is one of the first studies to probe these generative models using physically motivated metrics. We find that both models produce comparable realistic galaxies based on human evaluation, but our physics-based metrics are better able to discern the strengths and weaknesses of the generative models. Overall, the DDPM model performs better than the CVAE on the majority of the physics-based metrics. Ultimately, if we can show that generative models can learn the physics of galaxy evolution, they have the potential to unlock new astrophysical discoveries.
Submitted 9 July, 2024;
originally announced July 2024.
-
Hypergraph based Understanding for Document Semantic Entity Recognition
Authors:
Qiwei Li,
Zuchao Li,
** Wang,
Haojun Ai,
Hai Zhao
Abstract:
Semantic entity recognition is an important task in the field of visually-rich document understanding. It distinguishes the semantic types of text by analyzing the position relationship between text nodes and the relation between text content. The existing document understanding models mainly focus on entity categories while ignoring the extraction of entity boundaries. We build a novel hypergraph attention document semantic entity recognition framework, HGA, which uses hypergraph attention to focus on entity boundaries and entity categories at the same time. It can conduct a more detailed analysis of the document text representation analyzed by the upstream model and achieves a better performance of semantic information. We apply this method on the basis of GraphLayoutLM to construct a new semantic entity recognition model, HGALayoutLM. Our experimental results on FUNSD, CORD, XFUND and SROIE show that our method can effectively improve the performance of semantic entity recognition tasks based on the original model. HGALayoutLM reaches new state-of-the-art results on FUNSD and XFUND.
Submitted 9 July, 2024;
originally announced July 2024.
-
Reactivity of ultra-thin Kagome Metal FeSn towards Oxygen and Water
Authors:
James Blyth,
Sadhana Sridhar,
Mengting Zhao,
Sajid Ali,
Thi Hai Yen Vu,
Qile Li,
Johnathon Maniatis,
Grace Causer,
Michael S. Fuhrer,
Nikhil V. Medhekar,
Anton Tadich,
Mark Edmonds
Abstract:
The kagome metal FeSn consists of alternating layers of kagome-lattice Fe3Sn and honeycomb Sn2, and exhibits great potential for applications in future low energy electronics and spintronics because of an ideal combination of novel topological phases and high-temperature magnetic ordering. Robust synthesis methods for ultra-thin FeSn films, as well as an understanding of their air stability, are crucial for its development and long-term operation in future devices. In this work, we realize large area, sub-10 nm epitaxial FeSn thin films, and explore the oxidation process via synchrotron-based photoelectron spectroscopy using in-situ oxygen and water dosing, as well as ex-situ air exposure. Upon exposure to atmosphere the FeSn films are shown to be highly reactive, with a stable ~3 nm thick oxide layer forming at the surface within 10 minutes. Notably, the surface Fe remains largely unoxidized when compared to Sn, which undergoes near-complete oxidation. This is further confirmed with controlled in-situ dosing of O2 and H2O where only the Sn2 (stanene) inter-layers within the FeSn lattice oxidize, suggesting the Fe3Sn kagome layers remain almost pristine. These results are in excellent agreement with first principles calculations, which show Fe-O bonds to the Fe3Sn layer are energetically unfavorable, and furthermore, a large formation energy preference of 1.37 eV for Sn-O bonds in the stanene Sn2 layer over Sn-O bonds in the kagome Fe3Sn layer. The demonstration that oxidation only occurs within the stanene layers may provide new avenues in how to engineer, handle and prepare future kagome metal devices.
Submitted 9 July, 2024;
originally announced July 2024.
-
Superconductivity up to 14.2 K in MnB$_4$ under pressure
Authors:
Zhe-Ning Xiang,
Ying-Jie Zhang,
Qing Lu,
Qing Li,
Yiwen Li,
Tianheng Huang,
Yijie Zhu,
Yongze Ye,
Jian Sun,
Hai-Hu Wen
Abstract:
The discovery of superconductivity in 3$d$-transition metal compounds with strong magnetism is interesting but rare. Especially for Mn-based compounds, there exist only very limited materials that show superconductivity. Here, we report the discovery of superconductivity up to 14.2 K in a Mn-based material MnB$_4$. By applying high pressures, we found the continuous suppression of a weak insulating behavior and the occurrence of superconductivity after about 30 GPa. With further increasing pressure, $T_\text{c}$ is gradually enhanced and reaches the maximum value of about 14.2 K at 150 GPa with a Fermi-Liquid behavior in the normal states. The synchrotron X-ray diffraction data reveal the unchanged monoclinic (S.G: $P2_1/c$) symmetry but an unusual crossover of the lattice parameters $b$ and $c$. Theoretical calculations based on the electron-phonon coupling picture reveal a very low $T_\text{c}$ (less than 1 K), manifesting an exotic pairing mechanism beyond the Bardeen-Cooper-Schrieffer (BCS) theory. Our findings show a promising way to explore high $T_\text{c}$ superconductivity by combining the 3d-transition metal magnetic elements and light elements.
Submitted 8 July, 2024;
originally announced July 2024.
-
Nonrigid Reconstruction of Freehand Ultrasound without a Tracker
Authors:
Qi Li,
Ziyi Shen,
Qianye Yang,
Dean C. Barratt,
Matthew J. Clarkson,
Tom Vercauteren,
Yipeng Hu
Abstract:
Reconstructing 2D freehand Ultrasound (US) frames into 3D space without using a tracker has recently seen advances with deep learning. Predicting good frame-to-frame rigid transformations is often accepted as the learning objective, especially when the ground-truth labels from spatial tracking devices are inherently rigid transformations. Motivated by a) the observed nonrigid deformation due to soft tissue motion during scanning, and b) the highly sensitive prediction of rigid transformation, this study investigates the methods and their benefits in predicting nonrigid transformations for reconstructing 3D US. We propose a novel co-optimisation algorithm for simultaneously estimating rigid transformations among US frames, supervised by ground-truth from a tracker, and a nonrigid deformation, optimised by a regularised registration network. We show that these two objectives can be either optimised using meta-learning or combined by weighting. A fast scattered data interpolation is also developed for enabling frequent reconstruction and registration of non-parallel US frames, during training. With a new data set containing over 357,000 frames in 720 scans, acquired from 60 subjects, the experiments demonstrate that, due to an expanded thus easier-to-optimise solution space, the generalisation is improved with the added deformation estimation, with respect to the rigid ground-truth. The global pixel reconstruction error (assessing accumulative prediction) is lowered from 18.48 to 16.51 mm, compared with baseline rigid-transformation-predicting methods. Using manually identified landmarks, the proposed co-optimisation also shows potentials in compensating nonrigid tissue motion at inference, which is not measurable by tracker-provided ground-truth. The code and data used in this paper are made publicly available at https://github.com/QiLi111/NR-Rec-FUS.
Submitted 8 July, 2024;
originally announced July 2024.
-
Cross Prompting Consistency with Segment Anything Model for Semi-supervised Medical Image Segmentation
Authors:
Juzheng Miao,
Cheng Chen,
Keli Zhang,
Jie Chuai,
Quanzheng Li,
Pheng-Ann Heng
Abstract:
Semi-supervised learning (SSL) has achieved notable progress in medical image segmentation. To achieve effective SSL, a model needs to learn efficiently from limited labeled data and effectively exploit knowledge from abundant unlabeled data. Recent developments in visual foundation models, such as the Segment Anything Model (SAM), have demonstrated remarkable adaptability with improved sample efficiency. To harness the power of foundation models for application in SSL, we propose a cross prompting consistency method with segment anything model (CPC-SAM) for semi-supervised medical image segmentation. Our method employs SAM's unique prompt design and introduces a cross-prompting strategy within a dual-branch framework to automatically generate prompts and supervisions across two decoder branches, enabling effective learning from both scarce labeled and valuable unlabeled data. We further design a novel prompt consistency regularization to reduce the prompt position sensitivity and to enhance the output invariance under different prompts. We validate our method on two medical image segmentation tasks. Extensive experiments with different labeled-data ratios and modalities demonstrate the superiority of our proposed method over state-of-the-art SSL methods, with more than 9% Dice improvement on the breast cancer segmentation task.
Submitted 7 July, 2024;
originally announced July 2024.
-
FM-OSD: Foundation Model-Enabled One-Shot Detection of Anatomical Landmarks
Authors:
Juzheng Miao,
Cheng Chen,
Keli Zhang,
Jie Chuai,
Quanzheng Li,
Pheng-Ann Heng
Abstract:
One-shot detection of anatomical landmarks is gaining significant attention for its efficiency in using minimal labeled data to produce promising results. However, the success of current methods heavily relies on the employment of extensive unlabeled data to pre-train an effective feature extractor, which limits their applicability in scenarios where a substantial amount of unlabeled data is unavailable. In this paper, we propose the first foundation model-enabled one-shot landmark detection (FM-OSD) framework for accurate landmark detection in medical images by utilizing solely a single template image without any additional unlabeled data. Specifically, we use the frozen image encoder of visual foundation models as the feature extractor, and introduce dual-branch global and local feature decoders to increase the resolution of extracted features in a coarse to fine manner. The introduced feature decoders are efficiently trained with a distance-aware similarity learning loss to incorporate domain knowledge from the single template image. Moreover, a novel bidirectional matching strategy is developed to improve both robustness and accuracy of landmark detection in the case of scattered similarity map obtained by foundation models. We validate our method on two public anatomical landmark detection datasets. By using solely a single template image, our method demonstrates significant superiority over strong state-of-the-art one-shot landmark detection methods.
Submitted 7 July, 2024;
originally announced July 2024.
-
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale
Authors:
Haozhe Zhao,
Xiaojian Ma,
Liang Chen,
Shuzheng Si,
Rujie Wu,
Kaikai An,
Peiyu Yu,
Minjia Zhang,
Qing Li,
Baobao Chang
Abstract:
This paper presents UltraEdit, a large-scale (approximately 4 million editing samples), automatically generated dataset for instruction-based image editing. Our key idea is to address the drawbacks in existing image editing datasets like InstructPix2Pix and MagicBrush, and provide a systematic approach to producing massive and high-quality image editing samples. UltraEdit offers several distinct advantages: 1) It features a broader range of editing instructions by leveraging the creativity of large language models (LLMs) alongside in-context editing examples from human raters; 2) Its data sources are based on real images, including photographs and artworks, which provide greater diversity and reduced bias compared to datasets solely generated by text-to-image models; 3) It also supports region-based editing, enhanced by high-quality, automatically produced region annotations. Our experiments show that canonical diffusion-based editing baselines trained on UltraEdit set new records on MagicBrush and Emu-Edit benchmarks. Our analysis further confirms the crucial role of real image anchors and region-based editing data. The dataset, code, and models can be found in https://ultra-editing.github.io.
Submitted 7 July, 2024;
originally announced July 2024.
-
Tilted Disk Precession and Negative Superhumps in HS 2325+8205: A Multi-Window Analysis
Authors:
Qi-Bin Sun,
Sheng-Bang Qian,
Li-Ying Zhu,
Qin-Mei Li,
Min-Yu Li,
** Li
Abstract:
Tilted disk precession exists in different objects. Negative superhumps (NSHs) in cataclysmic variable stars (CVs) are hypothesized to arise from the interaction between the reverse precession of a tilted disk and the streams from the secondary star. Utilizing TESS photometry, we present a comprehensive investigation into the tilted disk precession and NSHs in the dwarf nova (DN) HS 2325+8205, employing eclipse minima, eclipse depths, NSH frequencies, NSH amplitudes, and the correlations between them as the windows. We report the discovery of NSHs in HS 2325+8205 with a period of 0.185671(17) d. The NSH frequency was found to vary with a period of 3.943(9) d, similar to the tilted disk precession period validated in a nova-like star (NL, SDSS J0812) and an intermediate polar (IP, TV Col). The O-C of the eclipse minima was similarly found to vary cyclically with a period of 4.135(5) d, showing a faster rise than fall. Furthermore, NSH amplitude varies parabolically, linearly increasing with periodic variations, potentially linked to changes in disk radius, mass transfer rate, and apparent area of the hot spot. Additionally, for the first time in DNe, we observe bi-periodic variations in eclipse depth (P1 = 4.131(4) d and P2 = 2.065(2) d ~ Pprec/2), resembling those seen in IPs, suggesting that variations with P2 are not attributable to the accretion curtain. Moreover, NSH amplitude and eclipse depth decrease with increasing NSH frequency, while NSH amplitude correlates positively with eclipse depth. These complex variations observed across multiple observational windows provide substantial evidence for understanding tilted disk precession and NSHs.
Submitted 5 July, 2024;
originally announced July 2024.
-
On Evaluating The Performance of Watermarked Machine-Generated Texts Under Adversarial Attacks
Authors:
Zesen Liu,
Tianshuo Cong,
Xinlei He,
Qi Li
Abstract:
Large Language Models (LLMs) excel in various applications, including text generation and complex tasks. However, the misuse of LLMs raises concerns about the authenticity and ethical implications of the content they produce, such as deepfake news, academic fraud, and copyright infringement. Watermarking techniques, which embed identifiable markers in machine-generated text, offer a promising solution to these issues by allowing for content verification and origin tracing. Unfortunately, the robustness of current LLM watermarking schemes under potential watermark removal attacks has not been comprehensively explored.
In this paper, to fill this gap, we first systematically comb the mainstream watermarking schemes and removal attacks on machine-generated texts, and then we categorize them into pre-text (before text generation) and post-text (after text generation) classes so that we can conduct diversified analyses. In our experiments, we evaluate eight watermarks (five pre-text, three post-text) and twelve attacks (two pre-text, ten post-text) across 87 scenarios. Evaluation results indicate that (1) KGW and Exponential watermarks offer high text quality and watermark retention but remain vulnerable to most attacks; (2) Post-text attacks are found to be more efficient and practical than pre-text attacks; (3) Pre-text watermarks are generally more imperceptible, as they do not alter text fluency, unlike post-text watermarks; (4) Additionally, combined attack methods can significantly increase effectiveness, highlighting the need for more robust watermarking solutions. Our study underscores the vulnerabilities of current techniques and the necessity for developing more resilient schemes.
Submitted 5 July, 2024;
originally announced July 2024.
-
Jailbreak Attacks and Defenses Against Large Language Models: A Survey
Authors:
Sibo Yi,
Yule Liu,
Zhen Sun,
Tianshuo Cong,
Xinlei He,
Jiaxing Song,
Ke Xu,
Qi Li
Abstract:
Large Language Models (LLMs) have performed exceptionally in various text-generative tasks, including question answering, translation, code completion, etc. However, the over-assistance of LLMs has raised the challenge of "jailbreaking", which induces the model to generate malicious responses against the usage policy and society by designing adversarial prompts. With the emergence of jailbreak attack methods exploiting different vulnerabilities in LLMs, the corresponding safety alignment measures are also evolving. In this paper, we propose a comprehensive and detailed taxonomy of jailbreak attack and defense methods. For instance, the attack methods are divided into black-box and white-box attacks based on the transparency of the target model. Meanwhile, we classify defense methods into prompt-level and model-level defenses. Additionally, we further subdivide these attack and defense methods into distinct sub-classes and present a coherent diagram illustrating their relationships. We also conduct an investigation into the current evaluation methods and compare them from different perspectives. Our findings aim to inspire future research and practical implementations in safeguarding LLMs against adversarial attacks. Above all, although jailbreak remains a significant concern within the community, we believe that our work enhances the understanding of this domain and provides a foundation for developing more secure LLMs.
Submitted 5 July, 2024;
originally announced July 2024.
-
Measurement Embedded Schrödinger Bridge for Inverse Problems
Authors:
Yuang Wang,
Pengfei **,
Siyeop Yoon,
Matthew Tivnan,
Quanzheng Li,
Li Zhang,
Dufan Wu
Abstract:
Score-based diffusion models are frequently employed as structural priors in inverse problems. However, their iterative denoising process, initiated from Gaussian noise, often results in slow inference speeds. The Image-to-Image Schrödinger Bridge (I$^2$SB), which begins with the corrupted image, presents a promising alternative as a prior for addressing inverse problems. In this work, we introduce the Measurement Embedded Schrödinger Bridge (MESB). MESB establishes Schrödinger Bridges between the distribution of corrupted images and the distribution of clean images given observed measurements. Based on optimal transport theory, we derive the forward and backward processes of MESB. Through validation on diverse inverse problems, our proposed approach exhibits superior performance compared to existing Schrödinger Bridge-based inverse problems solvers in both visual quality and quantitative metrics.
Submitted 22 May, 2024;
originally announced July 2024.
-
Raw Text is All you Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Model
Authors:
Xia Hou,
Qifeng Li,
Jian Yang,
Tongliang Li,
Linzheng Chai,
Xianjie Wu,
Hangyuan Ji,
Zhoujun Li,
Jixuan Nie,
**gbo Dun,
Wenfeng Song
Abstract:
Instruction tuning as an effective technique aligns the outputs of large language models (LLMs) with human preference. But how to generate the seasonal multi-turn dialogues from raw documents for instruction tuning still requires further exploration. In this paper, we present a novel framework named R2S that leverages the CoD-Chain of Dialogue logic to guide large language models (LLMs) in generating knowledge-intensive multi-turn dialogues for instruction tuning. By integrating raw documents from both open-source datasets and domain-specific web-crawled documents into a benchmark K-BENCH, we cover diverse areas such as Wikipedia (English), Science (Chinese), and Artifacts (Chinese). Our approach first decides the logic flow of the current dialogue and then prompts LLMs to produce key phrases for sourcing relevant response content. This methodology enables the creation of the GINSTRUCT instruction dataset, retaining raw document knowledge within dialogue-style interactions. Utilizing this dataset, we fine-tune GLLM, a model designed to transform raw documents into structured multi-turn dialogues, thereby injecting comprehensive domain knowledge into the SFT model for enhanced instruction tuning. This work signifies a stride towards refining the adaptability and effectiveness of LLMs in processing and generating more accurate, contextually nuanced responses across various fields.
Submitted 3 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
Constraints on real space representations of Chern bands
Authors:
Qingchen Li,
Junkai Dong,
Patrick J. Ledwith,
Eslam Khalaf
Abstract:
A Chern band is characterized by a Wannier obstruction indicating the absence of a basis of complete, orthogonal, and exponentially-localized states. Here, we study the properties of real space bases of a Chern band obtained by relaxing either exponential localization or orthogonality and completeness. This yields two distinct real space representations of a band with Chern number $C$: (i) a basis of complete orthogonal Wannier states which decay as power-law and (ii) a basis of exponentially-localized overcomplete non-orthogonal coherent states. For (i), we show that the power-law tail only depends on the Chern number and provide an explicit gauge choice leading to the universal asymptotic $w({\boldsymbol r}) \approx \frac{C e^{-i C \varphi_{\boldsymbol r}}}{2π|{\boldsymbol r}|^2}$ up to a normalized Bloch-periodic spinor. For (ii), we prove a rigorous lower bound on the spatial spread that can always be saturated for ideal bands. We provide an explicit construction of the maximally localized coherent state by mapping the problem to a dual Landau level problem where the Berry curvature and trace of the quantum metric take the roles of an effective magnetic field and scalar potential, respectively. Our coherent state result rigorously bounds the spatial spread of any localized state constructed as a linear superposition of wavefunctions within the Chern band. Remarkably, we find that such bound does not generically scale with the Chern number and provide an explicit example of an exponentially localized state in a Chern $C$ band whose size does not increase with $|C|$. Our results show that band topology can be encoded in a real space description and set the stage for a systematic study of interaction effects in topological bands in real space.
Submitted 2 July, 2024;
originally announced July 2024.
-
Light-SLAM: A Robust Deep-Learning Visual SLAM System Based on LightGlue under Challenging Lighting Conditions
Authors:
Zhiqi Zhao,
Chang Wu,
Xiaotong Kong,
Zejie Lv,
Xiaoqi Du,
Qiyan Li
Abstract:
Simultaneous Localization and Mapping (SLAM) has become a critical technology for intelligent transportation systems and autonomous robots and is widely used in autonomous driving. However, traditional manual feature-based methods in challenging lighting environments make it difficult to ensure robustness and accuracy. Some deep learning-based methods show potential but still have significant drawbacks. To address this problem, we propose a novel hybrid system for visual SLAM based on the LightGlue deep learning network. It uses deep local feature descriptors to replace traditional hand-crafted features and a more efficient and accurate deep network to achieve fast and precise feature matching. Thus, we use the robustness of deep learning to improve the whole system. We have combined traditional geometry-based approaches to introduce a complete visual SLAM system for monocular, binocular, and RGB-D sensors. We thoroughly tested the proposed system on four public datasets: KITTI, EuRoC, TUM, and 4Season, as well as on actual campus scenes. The experimental results show that the proposed method exhibits better accuracy and robustness in adapting to low-light and strongly light-varying environments than traditional manual features and deep learning-based methods. It can also run on GPU in real time.
Submitted 10 May, 2024;
originally announced July 2024.
-
RVISA: Reasoning and Verification for Implicit Sentiment Analysis
Authors:
Wenna Lai,
Haoran Xie,
Guandong Xu,
Qing Li
Abstract:
With an increasing social demand for fine-grained sentiment analysis (SA), implicit sentiment analysis (ISA) poses a significant challenge with the absence of salient cue words in expressions. It necessitates reliable reasoning to understand how the sentiment is aroused and thus determine implicit sentiments. In the era of Large Language Models (LLMs), Encoder-Decoder (ED) LLMs have gained popularity to serve as backbone models for SA applications, considering impressive text comprehension and reasoning ability among diverse tasks. On the other hand, Decoder-only (DO) LLMs exhibit superior natural language generation and in-context learning capabilities. However, their responses may contain misleading or inaccurate information. To identify implicit sentiment with reliable reasoning, this study proposes RVISA, a two-stage reasoning framework that harnesses the generation ability of DO LLMs and the reasoning ability of ED LLMs to train an enhanced reasoner. Specifically, we adopt three-hop reasoning prompting to explicitly furnish sentiment elements as cues. The generated rationales are utilized to fine-tune an ED LLM into a skilled reasoner. Additionally, we develop a straightforward yet effective verification mechanism to ensure the reliability of the reasoning learning. We evaluated the proposed method on two benchmark datasets and achieved state-of-the-art results in ISA performance.
Submitted 2 July, 2024;
originally announced July 2024.
-
CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models
Authors:
Ying Nie,
Binwei Yan,
Tianyu Guo,
Hao Liu,
Haoyu Wang,
Wei He,
Binfan Zheng,
Weihao Wang,
Qiang Li,
Weijian Sun,
Yunhe Wang,
Dacheng Tao
Abstract:
Large language models (LLMs) have achieved remarkable performance on various NLP tasks, yet their potential in more challenging and domain-specific tasks, such as finance, has not been fully explored. In this paper, we present CFinBench: a meticulously crafted and, to date, the most comprehensive evaluation benchmark for assessing the financial knowledge of LLMs in a Chinese context. In practice, to better align with the career trajectory of Chinese financial practitioners, we build a systematic evaluation from 4 first-level categories: (1) Financial Subject: whether LLMs can memorize the necessary basic knowledge of financial subjects, such as economics, statistics and auditing. (2) Financial Qualification: whether LLMs can obtain the needed financial qualified certifications, such as certified public accountant, securities qualification and banking qualification. (3) Financial Practice: whether LLMs can fulfill the practical financial jobs, such as tax consultant, junior accountant and securities analyst. (4) Financial Law: whether LLMs can meet the requirement of financial laws and regulations, such as tax law, insurance law and economic law. CFinBench comprises 99,100 questions spanning 43 second-level categories with 3 question types: single-choice, multiple-choice and judgment. We conduct extensive experiments on 50 representative LLMs of various model sizes on CFinBench. The results show that GPT4 and some Chinese-oriented models lead the benchmark, with the highest average accuracy being 60.16%, highlighting the challenge presented by CFinBench. The dataset and evaluation code are available at https://cfinbench.github.io/.
Submitted 2 July, 2024;
originally announced July 2024.
-
Origin of the Chromospheric Umbral Waves in Sunspots
Authors:
Xinsheng Zhang,
Xiaoli Yan,
Zhike Xue,
**cheng Wang,
Zhe Xu,
Qiaoling Li,
Yang Peng,
Li** Yang
Abstract:
Oscillations are ubiquitous in sunspots and the associated higher atmospheres. However, it is still unclear whether these oscillations are driven by the external acoustic waves (p-modes) or generated by the internal magnetoconvection. To obtain clues about the driving source of umbral waves in sunspots, we analyzed the spiral wave patterns (SWPs) in two sunspots registered by IRIS MgII 2796 Å slit-jaw images. By tracking the motion of the SWPs, we find for the first time that two one-armed SWPs coexist in the umbra, and they can rotate either in the same or opposite directions. Furthermore, by analyzing the spatial distribution of the oscillation centers of the one-armed SWPs within the umbra (the oscillation center is defined as the location where the SWP first appears), we find that the chromospheric umbral waves repeatedly originate from the regions with high oscillation power and most of the umbral waves occur in the dark nuclei and strong magnetic field regions of the umbra. Our study results indicate that the chromospheric umbral waves are likely excited by the p-mode oscillations.
Submitted 3 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
Looking From the Future: Multi-order Iterations Can Enhance Adversarial Attack Transferability
Authors:
Zijian Ying,
Qianmu Li,
Tao Wang,
Zhichao Lian,
Shunmei Meng,
Xuyun Zhang
Abstract:
Various methods try to enhance adversarial transferability by improving the generalization from different perspectives. In this paper, we rethink the optimization process and propose a novel sequence optimization concept, which is named Looking From the Future (LFF). LFF makes use of the original optimization process to refine the very first local optimization choice. Adapting the LFF concept to the adversarial attack task, we further propose an LFF attack as well as an MLFF attack with better generalization ability. Furthermore, guided by the LFF concept, we propose an $LFF^{\mathcal{N}}$ attack which extends the LFF attack to a multi-order attack, further enhancing the transfer attack ability. All our proposed methods can be directly applied to the iteration-based attack methods. We evaluate our proposed method on the ImageNet1k dataset by applying several SOTA adversarial attack methods under four kinds of tasks. Experimental results show that our proposed method can greatly enhance the attack transferability. Ablation experiments are also applied to verify the effectiveness of each component. The source code will be released after this paper is accepted.
Submitted 1 July, 2024;
originally announced July 2024.
-
Bulk and fracture process zone contribution to the rate-dependent adhesion amplification in viscoelastic broad-band materials
Authors:
Ali Maghami,
Qingao Wang,
Michele Tricarico,
Michele Ciavarella,
Qunyang Li,
Antonio Papangelo
Abstract:
The contact between a rigid Hertzian indenter and an adhesive broad-band viscoelastic substrate is considered. The material behaviour is described by a modified power law model, which is characterized by only four parameters: the glassy and rubbery elastic moduli, a characteristic exponent n and a timescale $τ_0$. The maximum adherence force that can be reached while unloading the rigid indenter from a relaxed viscoelastic half-space is studied by means of a numerical implementation based on the boundary element method, as a function of the unloading velocity and preload, and by varying the broadness of the viscoelastic material spectrum. Through a comprehensive numerical analysis we have determined the minimum contact radius that is needed to achieve the maximum amplification of the pull-off force at a specified unloading rate and for different material exponents n. The numerical results are then compared with the predictions of the Persson and Brener viscoelastic crack propagation theory, providing excellent agreement. However, comparison against experimental tests for a glass lens indenting a PDMS substrate shows that the data can be fitted with the linear theory only up to an unloading rate of about $100~μ\mathrm{m/s}$, indicating that the fracture process zone rate-dependent contribution to the energy enhancement is of the same order as the bulk dissipation contribution. Hence, the limitations of the current numerical and theoretical models for viscoelastic adhesion are discussed in light of the most recent literature results.
Submitted 1 July, 2024;
originally announced July 2024.
-
Universal Approximation Theory: The basic theory for large language models
Authors:
Wei Wang,
Qing Li
Abstract:
Language models have emerged as a critical area of focus in artificial intelligence, particularly with the introduction of groundbreaking innovations like ChatGPT. Large-scale Transformer networks have quickly become the leading approach for advancing natural language processing algorithms. Built on the Transformer architecture, these models enable interactions that closely mimic human communication and, equipped with extensive knowledge, can even assist in guiding human tasks. Despite their impressive capabilities and growing complexity, a key question remains: what are the theoretical foundations of large language models (LLMs)? What makes Transformer so effective for powering intelligent language applications, such as translation and coding? What underlies LLMs' ability for In-Context Learning (ICL)? How does the LoRA scheme enhance the fine-tuning of LLMs? And what supports the practicality of pruning LLMs? To address these critical questions and explore the technological strategies within LLMs, we leverage the Universal Approximation Theory (UAT) to offer a theoretical backdrop, shedding light on the mechanisms that underpin these advancements.
Submitted 1 July, 2024;
originally announced July 2024.
-
Robust and Reliable Early-Stage Website Fingerprinting Attacks via Spatial-Temporal Distribution Analysis
Authors:
Xinhao Deng,
Qi Li,
Ke Xu
Abstract:
Website Fingerprinting (WF) attacks identify the websites visited by users by performing traffic analysis, compromising user privacy. Particularly, DL-based WF attacks demonstrate impressive attack performance. However, the effectiveness of DL-based WF attacks relies on the collected complete and pure traffic during the page loading, which impacts the practicality of these attacks. The WF performance is rather low under dynamic network conditions and various WF defenses, particularly when the analyzed traffic is only a small part of the complete traffic. In this paper, we propose Holmes, a robust and reliable early-stage WF attack. Holmes utilizes temporal and spatial distribution analysis of website traffic to effectively identify websites in the early stages of page loading. Specifically, Holmes develops adaptive data augmentation based on the temporal distribution of website traffic and utilizes a supervised contrastive learning method to extract the correlations between the early-stage traffic and the pre-collected complete traffic. Holmes accurately identifies traffic in the early stages of page loading by computing the correlation of the traffic with the spatial distribution information, which ensures robust and reliable detection according to early-stage traffic. We extensively evaluate Holmes using six datasets. Compared to nine existing DL-based WF attacks, Holmes improves the F1-score of identifying early-stage traffic by an average of 169.18%. Furthermore, we replay the traffic of visiting real-world dark web websites. Holmes successfully identifies dark web websites when the ratio of page loading on average is only 21.71%, with an average precision improvement of 169.36% over the existing WF attacks.
Submitted 30 June, 2024;
originally announced July 2024.
-
Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models
Authors:
Weihong Zhong,
Xiaocheng Feng,
Liang Zhao,
Qiming Li,
Lei Huang,
Yuxuan Gu,
Weitao Ma,
Yuan Xu,
Bing Qin
Abstract:
Though advanced in understanding visual information with human languages, Large Vision-Language Models (LVLMs) still suffer from multimodal hallucinations. A natural concern is that during multimodal interaction, the generated hallucinations could influence the LVLMs' subsequent generation. Thus, we raise a question: When presented with a query relevant to the previously generated hallucination, will LVLMs be misled and respond incorrectly, even though the ground visual information exists? To answer this, we propose a framework called MMHalSnowball to evaluate LVLMs' behaviors when encountering generated hallucinations, where LVLMs are required to answer specific visual questions within a curated hallucinatory conversation. Crucially, our experiment shows that the performance of open-source LVLMs drops by at least $31\%$, indicating that LVLMs are prone to accept the generated hallucinations and make false claims that they would not have supported without distractions. We term this phenomenon Multimodal Hallucination Snowballing. To mitigate this, we further propose a training-free method called Residual Visual Decoding, where we revise the output distribution of LVLMs with the one derived from the residual visual input, providing models with direct access to the visual information. Experiments show that our method can mitigate more than $24\%$ of the snowballed multimodal hallucination while maintaining capabilities.
Submitted 29 June, 2024;
originally announced July 2024.
-
Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
S. Ahmed,
M. Albrecht,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
X. H. Bai,
Y. Bai,
O. Bakina,
R. Baldini Ferroli,
I. Balossino,
Y. Ban,
K. Begzsuren,
N. Berger,
M. Bertani,
D. Bettoni,
F. Bianchi,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (495 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.
Submitted 2 July, 2024; v1 submitted 28 June, 2024;
originally announced July 2024.
-
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents
Authors:
Zihao Wang,
Shaofei Cai,
Zhancun Mu,
Haowei Lin,
Ceyao Zhang,
Xuejie Liu,
Qing Li,
Anji Liu,
Xiaojian Ma,
Yitao Liang
Abstract:
We present OmniJARVIS, a novel Vision-Language-Action (VLA) model for open-world instruction-following agents in open-world Minecraft. Compared to prior works that either emit textual goals to separate controllers or produce the control command directly, OmniJARVIS seeks a different path to ensure both strong reasoning and efficient decision-making capabilities via unified tokenization of multimodal interaction data. First, we introduce a self-supervised approach to learn a behavior encoder that produces discretized tokens for behavior trajectories $τ$ = {$o_0$, $a_0$, $\dots$} and an imitation learning (IL) policy decoder conditioned on these tokens. These additional behavior tokens will be augmented to the vocabulary of pretrained Multimodal Language Models (MLMs). With this encoder, we then pack long-term multimodal interactions involving task instructions, memories, thoughts, observations, textual responses, behavior trajectories, etc. into unified token sequences and model them with autoregressive transformers. Thanks to the semantically meaningful behavior tokens, the resulting VLA model, OmniJARVIS, can reason (by producing chain-of-thoughts), plan, answer questions, and act (by producing behavior tokens for the IL policy decoder). OmniJARVIS demonstrates excellent performances on a comprehensive collection of atomic, programmatic, and open-ended tasks in open-world Minecraft. Our analysis further unveils the crucial design principles in interaction data formation, unified tokenization, and its scaling potentials.
Submitted 27 June, 2024;
originally announced July 2024.
-
Twist angle driven electronic structure evolution of twisted bilayer graphene
Authors:
Jiawei Yu,
Guihao Jia,
Qian Li,
Yuyang Wang,
Kebin Xiao,
Yongkang Ju,
Hongyun Zhang,
Zhiqiang Hu,
Yunkai Guo,
Biao Lian,
Peizhe Tang,
Shuyun Zhou,
Qi-Kun Xue,
Wei Li
Abstract:
In twisted bilayer graphene (TBG) devices, local strains often coexist and entangle with the twist-angle dependent moiré superlattice, both of which can significantly affect the electronic properties of TBG. Here, using low-temperature scanning tunneling microscopy, we investigate the fine evolution of the electronic structures of a TBG device with continuous variation of twist angles from 0.32° to 1.29°, spanning the first (1.1°), second (0.5°) and third (0.3°) magic angles. We reveal the exotic behavior of the flat bands and remote bands in both the energy space and real space near the magic angles. Interestingly, we observe an anomalous spectral weight transfer between the two flat band peaks in the tunneling spectra when approaching the first magic angle, suggesting strong inter-flat-bands interactions. The position of the remote band peak can be an index for the twist angle in TBG, since it positively correlates with the twist angle but is insensitive to the strain. Moreover, influences of the twist angle gradient on symmetry breaking of the flat bands are also studied.
Submitted 28 June, 2024;
originally announced June 2024.
-
Zero-Query Adversarial Attack on Black-box Automatic Speech Recognition Systems
Authors:
Zheng Fang,
Tao Wang,
Lingchen Zhao,
Shenyi Zhang,
Bowen Li,
Yunjie Ge,
Qi Li,
Chao Shen,
Qian Wang
Abstract:
In recent years, extensive research has been conducted on the vulnerability of ASR systems, revealing that black-box adversarial example attacks pose significant threats to real-world ASR systems. However, most existing black-box attacks rely on queries to the target ASRs, which is impractical when queries are not permitted. In this paper, we propose ZQ-Attack, a transfer-based adversarial attack on ASR systems in the zero-query black-box setting. Through a comprehensive review and categorization of modern ASR technologies, we first meticulously select surrogate ASRs of diverse types to generate adversarial examples. Following this, ZQ-Attack initializes the adversarial perturbation with a scaled target command audio, rendering it relatively imperceptible while maintaining effectiveness. Subsequently, to achieve high transferability of adversarial perturbations, we propose a sequential ensemble optimization algorithm, which iteratively optimizes the adversarial perturbation on each surrogate model, leveraging collaborative information from other models. We conduct extensive experiments to evaluate ZQ-Attack. In the over-the-line setting, ZQ-Attack achieves a 100% success rate of attack (SRoA) with an average signal-to-noise ratio (SNR) of 21.91dB on 4 online speech recognition services, and attains an average SRoA of 100% and SNR of 19.67dB on 16 open-source ASRs. For commercial intelligent voice control devices, ZQ-Attack also achieves a 100% SRoA with an average SNR of 15.77dB in the over-the-air setting.
Submitted 27 June, 2024;
originally announced June 2024.
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
Submitted 27 June, 2024;
originally announced June 2024.
-
Enhancing interfacial thermal transport by nanostructures: Monte Carlo simulations with ab initio phonon properties
Authors:
Wenzhu Luo,
Neng Wang,
Wenlei Lian,
Ershuai Yin,
Qiang Li
Abstract:
Recent experiments have indicated that employing nanostructures can enhance interfacial heat transport, but the mechanism by which different structural morphologies and dimensions contribute to the full-spectrum phonon interfacial transport remains unclear. In this paper, a multiscale method to study the thermal transfer at nanostructured interfaces is developed by combining density functional calculation, Monte Carlo simulation, and the diffuse mismatch method. The changes in the transport paths and contributions to thermal conductance of different frequency phonons caused by changes in nanostructure morphology and size are investigated. The results show that, compared to the triangular and trapezoidal nanostructures, the rectangular nanostructures are more beneficial in enhancing the probability of the reflected phonons encountering the interface, and thus the phonon interfacial transmittance. The nanostructure makes the interfacial heat flow extremely heterogeneous, with significant transverse heat flow occurring at the sidewalls, resulting in a new thermal conduction pathway. The phenomena of multiple reflections and double transmission together lead to the existence of an optimal dimension that maximizes the nanostructure's enhancement effect on interfacial heat transfer. The optimal nanostructure width is 100 nm when the height is 100 nm, and the maximum interfacial thermal conductance enhancement ratio is 1.31. These results can guide the design of heat transfer enhancement structures at the interfaces of actual high-power chips.
Submitted 27 June, 2024;
originally announced June 2024.
-
CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI
Authors:
Zi Wang,
Fanwen Wang,
Chen Qin,
Jun Lyu,
Ouyang Cheng,
Shuo Wang,
Yan Li,
Mengyao Yu,
Haoyu Zhang,
Kunyuan Guo,
Zhang Shi,
Qirong Li,
Ziqiang Xu,
Yajing Zhang,
Hao Li,
Sha Hua,
Binghua Chen,
Longyu Sun,
Mengting Sun,
Qin Li,
Ying-Hua Chu,
Wenjia Bai,
Jing Qin,
Xiahai Zhuang,
Claudia Prieto
, et al. (7 additional authors not shown)
Abstract:
Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, which requires advanced image reconstruction approaches to recover high-quality, clinically interpretable images from undersampled measurements. However, the lack of publicly available cardiac MRI k-space datasets in terms of both quantity and diversity has severely hindered substantial technological progress, particularly for data-driven artificial intelligence. Here, we provide a standardized, diverse, and high-quality CMRxRecon2024 dataset to facilitate the technical development, fair evaluation, and clinical transfer of cardiac MRI reconstruction approaches, towards promoting universal frameworks that enable fast and robust reconstructions across different cardiac MRI protocols in clinical practice. To the best of our knowledge, the CMRxRecon2024 dataset is the largest and most diverse publicly available cardiac k-space dataset. It is acquired from 330 healthy volunteers, covering commonly used modalities, anatomical views, and acquisition trajectories in clinical cardiac MRI workflows. In addition, an open platform with tutorials, benchmarks, and data processing tools is provided to facilitate data usage, advanced method development, and fair performance evaluation.
Submitted 27 June, 2024;
originally announced June 2024.
-
Measurement of dynamic nonlocal deformation using nanodiamond sensors
Authors:
Yue Cui,
Weng-Hang Leong,
Guoli Zhu,
Ren-Bao Liu,
Quan Li
Abstract:
Nonlocal deformation sensing achieved by integrating atomic force microscopy indentation with nanodiamond-based orientation tracking features high precision and high spatial resolution, providing a useful technique for studying the mechanical properties of soft biological systems. However, this technique is currently limited to lifeless systems because it cannot differentiate the indentation-induced deformation from that associated with live activities or other external perturbations. Here we develop a dynamic nonlocal deformation sensing method using oscillatory nanoindentation and spectroscopic analysis to overcome this limitation. The method realizes both temporally and spatially resolved mechanical analysis, with tens of microsecond time-lag precision, nanometer vertical deformation precision, and sub-hundred nanometer lateral spatial resolution, leading to the disclosure of surface/interface effects in the mechanical response of viscoelastic materials and live cells. Neglecting surface tension would underestimate the liquid-like characteristics of the materials. This work demonstrates nanodiamond sensors as a useful tool for spatial-temporal mechanical analysis of soft, complex bio-relevant materials.
Submitted 5 June, 2024;
originally announced June 2024.
-
Functional knockoffs selection with applications to functional data analysis in high dimensions
Authors:
Xinghao Qiao,
Mingya Long,
Qizhai Li
Abstract:
The knockoffs is a recently proposed powerful framework that effectively controls the false discovery rate (FDR) for variable selection. However, none of the existing knockoff solutions are directly suited to handle multivariate or high-dimensional functional data, which has become increasingly prevalent in various scientific applications. In this paper, we propose a novel functional model-X knockoffs selection framework tailored to sparse high-dimensional functional models, and show that our proposal can achieve the effective FDR control for any sample size. Furthermore, we illustrate the proposed functional model-X knockoffs selection procedure along with the associated theoretical guarantees for both FDR control and asymptotic power using examples of commonly adopted functional linear additive regression models and the functional graphical model. In the construction of functional knockoffs, we integrate essential components including the correlation operator matrix, the Karhunen-Loève expansion, and semidefinite programming, and develop executable algorithms. We demonstrate the superiority of our proposed methods over the competitors through both extensive simulations and the analysis of two brain imaging datasets.
Submitted 27 June, 2024; v1 submitted 26 June, 2024;
originally announced June 2024.
-
Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.
Submitted 26 June, 2024;
originally announced June 2024.
-
Artificial Immune System of Secure Face Recognition Against Adversarial Attacks
Authors:
Min Ren,
Yunlong Wang,
Yuhao Zhu,
Yongzhen Huang,
Zhenan Sun,
Qi Li,
Tieniu Tan
Abstract:
Insect production for food and feed presents a promising supplement to ensure food safety and address the adverse impacts of agriculture on climate and environment in the future. However, optimisation is required for insect production to realise its full potential. This can be achieved by targeted improvement of traits of interest through selective breeding, an approach which has so far been underexplored and underutilised in insect farming. Here we present a comprehensive review of the selective breeding framework in the context of insect production. We systematically evaluate adjustments of selective breeding techniques to the realm of insects and highlight the essential components integral to the breeding process. The discussion covers every step of a conventional breeding scheme, such as formulation of breeding objectives, phenotyping, estimation of genetic parameters and breeding values, selection of appropriate breeding strategies, and mitigation of issues associated with genetic diversity depletion and inbreeding. This review combines knowledge from diverse disciplines, bridging the gap between animal breeding, quantitative genetics, evolutionary biology, and entomology, offering an integrated view of the insect breeding research area and uniting knowledge which has previously remained scattered across diverse fields of expertise.
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0.04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.
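The asymmetry $R$ defined in this abstract can be evaluated numerically with a short sketch. The function name `ks_kl_asymmetry` is hypothetical, and the uncertainty uses first-order propagation under the assumption that the two branching fractions are uncorrelated; the collaboration's actual treatment of correlated systematics is not described in the abstract and may differ. The $K_S^0$ input below is a placeholder, since the abstract does not quote it.

```python
import math


def ks_kl_asymmetry(b_s, sigma_s, b_l, sigma_l):
    """Return (R, sigma_R) for R = (B_S - B_L) / (B_S + B_L).

    Assumes B_S and B_L are uncorrelated (an assumption of this sketch).
    Partial derivatives: dR/dB_S = 2*B_L/(B_S+B_L)^2,
                         dR/dB_L = -2*B_S/(B_S+B_L)^2.
    """
    total = b_s + b_l
    r = (b_s - b_l) / total
    sigma_r = 2.0 * math.hypot(b_l * sigma_s, b_s * sigma_l) / total**2
    return r, sigma_r


if __name__ == "__main__":
    # K_L branching fraction (in %) from the abstract, with statistical and
    # systematic uncertainties combined in quadrature.
    b_l, sigma_l = 1.67, math.hypot(0.06, 0.04)
    # Hypothetical placeholder for the K_S branching fraction (in %);
    # this value is NOT taken from the abstract.
    b_s, sigma_s = 1.60, 0.07
    r, sigma_r = ks_kl_asymmetry(b_s, sigma_s, b_l, sigma_l)
    print(f"R = {r:.3f} +/- {sigma_r:.3f}")
```

Because the placeholder input is illustrative, the printed value should not be compared directly with the quoted results.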
Submitted 26 June, 2024;
originally announced June 2024.
-
Representations of domains via closure spaces in the quantale-valued setting
Authors:
Guojun Wu,
Wei Yao,
Qingguo Li
Abstract:
With a commutative unital quantale $L$ as the truth value table, this study focuses on the representations of $L$-domains by means of $L$-closure spaces. First, the notions of interpolative generalized $L$-closure spaces and directed closed sets are introduced. It is proved that in an interpolative generalized $L$-closure space (resp., $L$-closure space), the collection of directed closed sets with respect to the inclusion $L$-order forms a continuous $L$-dcpo (resp., an algebraic $L$-dcpo). Conversely, it is shown that every continuous $L$-dcpo (resp., algebraic $L$-dcpo) can be reconstructed by an interpolative generalized $L$-closure space (resp., $L$-closure space). Second, when $L$ is integral, the notion of dense subspaces of generalized $L$-closure spaces is introduced. By means of dense subspaces, an alternative representation for algebraic $L$-dcpos is given. Moreover, the concept of $L$-approximable relations between interpolative generalized $L$-closure spaces is introduced. Consequently, a categorical equivalence between the category of interpolative generalized $L$-closure spaces (resp., $L$-closure spaces) with $L$-approximable relations and that of continuous $L$-dcpos (resp., algebraic $L$-dcpos) with Scott continuous mappings is established.
Submitted 25 June, 2024;
originally announced June 2024.
-
Comparison of the origin of Short Gamma ray Bursts with or without extended emission
Authors:
Qin-Mei Li,
Qi-Bin Sun
Abstract:
The merger of compact binary stars produces short gamma-ray bursts (sGRBs), involving channels such as neutron star - neutron star (BNS) and neutron star - black hole (NS-BH). The association between sGRB 170817A and gravitational wave GW 170817 provides reliable evidence for the BNS channel. The spatial distribution and merger rate differ between BNS mergers and NS-BH mergers. It has been speculated that sGRBs with extended emission (EE) may represent another distinct population. We compared the offset distributions of these two types of samples and found that they follow the same distribution. Utilizing non-parametric methods, we investigated the origin of these burst types in terms of their formation rate. We examined the luminosity function and formation rate of sGRBs without making any assumptions. The luminosity function can be described as $ψ(L_{0}) \propto L_{0}^{-0.09 \pm 0.01}$ for $L_{0} < L_0^b$ ($ψ(L_{0}) \propto L_{0}^{-0.57 \pm 0.02}$ for $L_{0} > L_0^b$) for standard sGRBs and $ψ(L_{0}) \propto L_{0}^{-0.11 \pm 0.004}$ for $L_{0} < L_0^b$ ($ψ(L_{0}) \propto L_{0}^{-0.61 \pm 0.01}$ for $L_{0} > L_0^b$) for sGRBs with EE. The formation rate is characterized as $ρ(z) \propto (1 + z)^{-4.21 \pm 0.22}$ for $z < 0.8$ and $ρ(z) \propto (1 + z)^{-0.22 \pm 0.74}$ for $0.8 < z < 3$ for standard sGRBs, while for sGRBs with EE, it is $ρ(z) \propto (1 + z)^{-4.30 \pm 0.13}$ for $z < 0.8$ and $ρ(z) \propto (1 + z)^{-0.33 \pm 0.66}$ for $0.8 < z < 3$. Based on these findings, we suggest that there is no significant difference in the progenitor stars of sGRBs with and without EE in terms of either spatial offset or formation rate.
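The piecewise formation rate quoted in this abstract can be written as a broken power law in $(1+z)$. The sketch below uses the central indices for standard sGRBs from the abstract. The function name `formation_rate`, the normalization `rho0 = 1` at $z = 0$, and the requirement that the two segments join continuously at $z = 0.8$ are assumptions made for illustration; the abstract gives only proportionalities.

```python
def formation_rate(z, z_break=0.8, low_index=-4.21, high_index=-0.22, rho0=1.0):
    """Broken power law in (1 + z), in arbitrary units.

    Indices default to the central values quoted for standard sGRBs.
    Assumptions (not from the abstract): rho(0) = rho0 and the two
    segments are matched so that rho is continuous at z_break.
    """
    if z < 0:
        raise ValueError("redshift must be non-negative")
    if z < z_break:
        return rho0 * (1.0 + z) ** low_index
    # Value at the break, used to normalize the high-redshift segment.
    rho_break = rho0 * (1.0 + z_break) ** low_index
    return rho_break * ((1.0 + z) / (1.0 + z_break)) ** high_index


# Relative rates at a few redshifts, standard sGRBs versus sGRBs with EE.
for z in (0.0, 0.4, 0.8, 2.0):
    standard = formation_rate(z)
    with_ee = formation_rate(z, low_index=-4.30, high_index=-0.33)
    print(f"z = {z:.1f}: standard = {standard:.4f}, EE = {with_ee:.4f}")
```

The abstract states the fit only for $z < 3$, so values beyond that range are an extrapolation.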
Submitted 26 June, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
Cross-Modal Spherical Aggregation for Weakly Supervised Remote Sensing Shadow Removal
Authors:
Kaichen Chi,
Wei Jing,
Junjie Li,
Qiang Li,
Qi Wang
Abstract:
Remote sensing shadow removal, which aims to recover contaminated surface information, is tricky since shadows typically display overwhelmingly low illumination intensities. In contrast, the infrared image is robust toward significant light changes, providing visual clues complementary to the visible image. Nevertheless, the existing methods ignore the collaboration between heterogeneous modalities, leading to undesired quality degradation. To fill this gap, we propose a weakly supervised shadow removal network with a spherical feature space, dubbed S2-ShadowNet, to explore the best of both worlds for visible and infrared modalities. Specifically, we employ a modal translation (visible-to-infrared) model to learn the cross-domain mapping, thus generating realistic infrared samples. Then, Swin Transformer is utilized to extract strong representational visible/infrared features. Simultaneously, the extracted features are mapped to the smooth spherical manifold, which alleviates the domain shift through regularization. Well-designed similarity loss and orthogonality loss are embedded into the spherical space, prompting the separation of private visible/infrared features and the alignment of shared visible/infrared features through constraints on both representation content and orientation. Such a manner encourages implicit reciprocity between modalities, thus providing a novel insight into shadow removal. Notably, ground truth is not available in practice, thus S2-ShadowNet is trained by cropping shadow and shadow-free patches from the shadow image itself, avoiding stereotypical and strict pair data acquisition. More importantly, we contribute a large-scale weakly supervised shadow removal benchmark, including 4000 shadow images with corresponding shadow masks.
Submitted 25 June, 2024;
originally announced June 2024.