-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
Operando monitoring of strain field distribution in lithium battery anode via ultra-high spatial resolution optical frequency domain reflectometer
Authors:
Kaijun Liu,
Zhijuan Zou,
Guolu Yin,
Yingze Song,
Zeheng Zhang,
Yuyang Lou,
Zixuan Zhong,
Huafeng Lu,
Duidui Li,
Tao Zhu
Abstract:
The cycling performance of lithium-ion batteries is closely related to the expansion effect of anode materials during charge and discharge processes. Studying the mechanical field evolution of anode materials is crucial for evaluating battery performance. Here, we propose a phase-sensitive ultra-high spatial resolution optical frequency domain reflectometry technique, in which the test fiber is embedded into the anode of a lithium-ion battery to monitor the mechanical evolution of the anode material during cycling. We investigated the strain evolution of the anode material under different loading levels and used this method to infer the morphological changes of the material. Furthermore, combining this with battery capacity information provides a new approach for assessing the performance of lithium-ion batteries.
Submitted 3 July, 2024;
originally announced July 2024.
-
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
Authors:
Huiqiang Jiang,
Yucheng Li,
Chengruidong Zhang,
Qianhui Wu,
Xufang Luo,
Surin Ahn,
Zhenhua Han,
Amir H. Abdi,
Dongsheng Li,
Chin-Yew Lin,
Yuqing Yang,
Lili Qiu
Abstract:
The computational challenges of Large Language Model (LLM) inference remain a significant barrier to their widespread deployment, especially as prompt lengths continue to increase. Due to the quadratic complexity of the attention computation, it takes 30 minutes for an 8B LLM to process a prompt of 1M tokens (i.e., the pre-filling stage) on a single A100 GPU. Existing methods for speeding up pre-filling often fail to maintain acceptable accuracy or efficiency when applied to long-context LLMs. To address this gap, we introduce MInference (Milliontokens Inference), a sparse calculation method designed to accelerate pre-filling of long-sequence processing. Specifically, we identify three unique patterns in long-context attention matrices (the A-shape, Vertical-Slash, and Block-Sparse) that can be leveraged for efficient sparse computation on GPUs. We determine the optimal pattern for each attention head offline and dynamically build sparse indices based on the assigned pattern during inference. With the pattern and sparse indices, we perform efficient sparse attention calculations via our optimized GPU kernels to significantly reduce the latency in the pre-filling stage of long-context LLMs. Our proposed technique can be directly applied to existing LLMs without any modifications to the pre-training setup or additional fine-tuning. By evaluating on a wide range of downstream tasks, including InfiniteBench, RULER, PG-19, and Needle In A Haystack, and models including LLaMA-3-1M, GLM4-1M, Yi-200K, Phi-3-128K, and Qwen2-128K, we demonstrate that MInference effectively reduces inference latency by up to 10x for pre-filling on an A100, while maintaining accuracy. Our code is available at https://aka.ms/MInference.
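The three pattern names in the abstract above can be illustrated with a small NumPy sketch. The abstract does not define the patterns, so the definitions here are assumptions read off the names: "A-shape" is taken to mean a few initial tokens plus a local window, "Vertical-Slash" a set of key columns plus diagonals at fixed offsets, and "Block-Sparse" a set of kept tiles. All function names and parameters are hypothetical, and this is not the MInference implementation. The sketch also applies the mask to a dense score matrix, so it shows the patterns only and saves no computation; the speedup described in the abstract comes from dedicated GPU kernels.

```python
import numpy as np


def causal_mask(n):
    """Lower-triangular mask: query i may attend to keys 0..i."""
    return np.tril(np.ones((n, n), dtype=bool))


def a_shape_mask(n, n_sink=4, window=8):
    """Assumed A-shape: the first n_sink keys plus a local window of recent keys."""
    q = np.arange(n)[:, None]
    k = np.arange(n)[None, :]
    keep = (k < n_sink) | (q - k < window)
    return keep & causal_mask(n)


def vertical_slash_mask(n, vertical_cols, slash_offsets):
    """Assumed Vertical-Slash: chosen key columns plus diagonals at fixed q - k offsets."""
    q = np.arange(n)[:, None]
    k = np.arange(n)[None, :]
    keep = np.isin(k, vertical_cols) | np.isin(q - k, slash_offsets)
    return keep & causal_mask(n)


def block_sparse_mask(n, block, kept_blocks):
    """Assumed Block-Sparse: keep only the listed (query_block, key_block) tiles."""
    keep = np.zeros((n, n), dtype=bool)
    for qb, kb in kept_blocks:
        keep[qb * block:(qb + 1) * block, kb * block:(kb + 1) * block] = True
    return keep & causal_mask(n)


def masked_attention(q, k, v, mask):
    """Softmax attention restricted to mask; every row must keep at least one key."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 16, 4
    q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
    mask = a_shape_mask(n, n_sink=2, window=3)
    out = masked_attention(q, k, v, mask)
    print(out.shape, f"kept fraction: {mask.mean():.2f}")
```

In this sketch the diagonal must stay in the mask (offset 0 for the slash pattern, diagonal tiles for the block pattern); otherwise a query row has no keys and the softmax is undefined.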
Submitted 2 July, 2024;
originally announced July 2024.
-
Turbulent Diffuse Molecular Media with Non-ideal Magnetohydrodynamics and Consistent Thermochemistry: Numerical Simulations and Dynamic Characteristics
Authors:
Nannan Yue,
Lile Wang,
Thomas Bisbas,
Donghui Quan,
Di Li
Abstract:
Turbulent diffuse molecular clouds can exhibit complicated morphologies caused by the interactions among radiation, chemistry, fluids, and fields. We performed full 3D simulations for turbulent diffuse molecular interstellar media, featuring time-dependent non-equilibrium thermochemistry co-evolved with magnetohydrodynamics (MHD). Simulation results show that the relative abundances of key chemical species (e.g., C, CO, OH) vary by more than one order of magnitude for the "premature" epoch of chemical evolution ($t\lesssim 2\times 10^5~{\rm yr}$). Various simulations are also conducted to study the impacts of physical parameters. Non-ideal MHD effects are essential in shaping the behavior of gases, and strong magnetic fields ($\sim 10~μ{\rm G}$) tend to inhibit vigorous compressions and thus reduce the fraction of warm gases ($T\gtrsim 10^2~{\rm K}$). Thermodynamical and chemical conditions of the gas are sensitive to modulation by dynamic conditions, especially the energy injection by turbulence. Chemical features, including ionization (cosmic ray and diffuse interstellar radiation), would not directly affect the turbulence power spectra. Nonetheless, their effects are prominent in the distribution profiles of temperatures and gas densities. Comprehensive observations are necessary and useful to eliminate the degeneracies of physical parameters and constrain the properties of diffuse molecular clouds with confidence.
Submitted 2 July, 2024;
originally announced July 2024.
-
Optimising robotic operation speed with edge computing over 5G networks: Insights from selective harvesting robots
Authors:
Usman A. Zahidi,
Arshad Khan,
Tsvetan Zhivkov,
Johann Dichtl,
Dom Li,
Soran Parsa,
Marc Hanheide,
Grzegorz Cielniak,
Elizabeth I. Sklar,
Simon Pearson,
Amir Ghalamzan
Abstract:
Selective harvesting by autonomous robots will be a critical enabling technology for future farming. Increases in inflation and shortages of skilled labour are driving factors that can help encourage user acceptability of robotic harvesting. For example, robotic strawberry harvesting requires real-time high-precision fruit localisation, 3D mapping and path planning for 3D cluster manipulation. Whilst industry and academia have developed multiple strawberry harvesting robots, none have yet achieved human-cost parity. Achieving this goal requires increased picking speed (perception, control and movement), accuracy and the development of low-cost robotic system designs. We propose the edge-server over 5G for Selective Harvesting (E5SH) system, which integrates a high-bandwidth, low-latency Fifth Generation (5G) mobile network into a crop harvesting robotic platform, and which we view as an enabler for future robotic harvesting systems. We also consider processing scale and speed in conjunction with system environmental and energy costs. A system architecture is presented and evaluated with support from quantitative results from a series of experiments that compare the performance of the system in response to different architecture choices, including image segmentation models, network infrastructure (5G vs WiFi) and messaging protocols such as Message Queuing Telemetry Transport (MQTT) and Transport Control Protocol Robot Operating System (TCPROS). Our results demonstrate that the E5SH system delivers a step-change peak processing speedup of more than 18-fold over a stand-alone embedded computing Nvidia Jetson Xavier NX (NJXN) system.
Submitted 1 July, 2024;
originally announced July 2024.
-
Proceedings of 3rd Workshop on Heterogeneous Composable and Disaggregated Systems
Authors:
Christian Pinto,
Dong Li,
Thaleia Dimitra Doudali,
Christina Giannoula,
Jie Ren
Abstract:
The future of computing systems is inevitably embracing a disaggregated and composable pattern: from clusters of computers to pools of resources that can be dynamically combined and tailored around application requirements. Transitioning to this new paradigm requires ground-breaking research, ranging from new hardware architectures up to new models and abstractions at all levels of the software stack. Recent hardware advancements in CPU and interconnection technologies have enabled the disaggregation of peripherals and system memory. As memory system heterogeneity further increases, composability and disaggregation help to increase memory capacity and improve memory utilization in a cost-effective way, and to reduce total cost of ownership. Heterogeneous and Composable Disaggregated Systems (HCDS) provide a system design approach for reducing the imbalance between workload resource requirements and the static availability of resources in a computing system. The HCDS workshop aims at exploring novel research ideas around composable disaggregated systems and their integration with operating systems and software runtimes to maximize the benefit perceived by user workloads.
Submitted 22 April, 2024;
originally announced July 2024.
-
Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
S. Ahmed,
M. Albrecht,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
X. H. Bai,
Y. Bai,
O. Bakina,
R. Baldini Ferroli,
I. Balossino,
Y. Ban,
K. Begzsuren,
N. Berger,
M. Bertani,
D. Bettoni,
F. Bianchi,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (495 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.
Submitted 2 July, 2024; v1 submitted 28 June, 2024;
originally announced July 2024.
-
Pistis-RAG: A Scalable Cascading Framework Towards Content-Centric Retrieval-Augmented Generation
Authors:
Yu Bai,
Yukai Miao,
Li Chen,
Dan Li,
Yanyu Ren,
Hongtao Xie,
Ce Yang,
Xuhui Cai
Abstract:
In Greek mythology, Pistis symbolized good faith, trust, and reliability. Drawing inspiration from these principles, Pistis-RAG is a scalable multi-stage framework designed to address the challenges of large-scale retrieval-augmented generation (RAG) systems. This framework consists of distinct stages: matching, pre-ranking, ranking, reasoning, and aggregating. Each stage contributes to narrowing the search space, prioritizing semantically relevant documents, aligning with the large language model's (LLM) preferences, supporting complex chain-of-thought (CoT) methods, and combining information from multiple sources.
Our ranking stage introduces a significant innovation by recognizing that semantic relevance alone may not lead to improved generation quality, due to the sensitivity of the few-shot prompt order, as noted in previous research. This critical aspect is often overlooked in current RAG frameworks.
We argue that the alignment issue between LLMs and external knowledge ranking methods is tied to the model-centric paradigm dominant in RAG systems. We propose a content-centric approach, emphasizing seamless integration between LLMs and external information sources to optimize content transformation for specific tasks.
Our novel ranking stage is designed specifically for RAG systems, incorporating principles of information retrieval while considering the unique business scenarios reflected in LLM preferences and user feedback. We simulated feedback signals on the MMLU benchmark, resulting in a 9.3% performance improvement. Our model and code will be open-sourced on GitHub. Additionally, experiments on real-world, large-scale data validate the scalability of our framework.
Submitted 3 July, 2024; v1 submitted 21 June, 2024;
originally announced July 2024.
-
Preble: Efficient Distributed Prompt Scheduling for LLM Serving
Authors:
Vikranth Srivatsa,
Zijian He,
Reyna Abhyankar,
Dongming Li,
Yiying Zhang
Abstract:
Prompts to large language models (LLMs) have evolved beyond simple user questions. For LLMs to solve complex problems, today's practices include domain-specific instructions, illustration of tool usages, and long context, such as textbook chapters in prompts. As such, many parts of prompts are repetitive across requests, and their attention computation results can be reused. However, today's LLM serving systems treat every request in isolation, missing the opportunity of computation reuse.
This paper proposes Preble, the first distributed LLM serving platform that targets and optimizes for prompt sharing. We perform a study on five popular LLM workloads. Based on our study results, we designed a distributed scheduling system that co-optimizes computation reuse and load balancing. Our evaluation of Preble on two to eight GPUs with real workloads and request arrival patterns on two open-source LLMs shows that Preble outperforms the state of the art by 1.5X to 14.5X on average latency and by 2X to 10X on p99 latency.
Submitted 8 May, 2024;
originally announced July 2024.
-
FAST survey of H I and OH absorption towards extragalactic radio sources
Authors:
Yogesh Chandola,
D. J. Saikia,
Yin-Zhe Ma,
Zheng Zheng,
Chao-Wei Tsai,
Di Li,
Denis Tramonte,
Hengxing Pan
Abstract:
Neutral atomic hydrogen and molecular gas in the host galaxies of radio active galactic nuclei (AGN) can be traced using H I 21-cm and OH-1667 MHz absorption lines to understand the fueling and feedback processes. We present the results of an H I and OH absorption survey with the Five-hundred-meter Aperture Spherical radio Telescope (FAST) towards 40 radio sources of low-intermediate radio luminosity ($\sim$10$^{23}$-10$^{26}$ W Hz$^{-1}$ at 1.4 GHz), red mid-infrared color (W2[4.6 $μ$m]$-$W3[12 $μ$m] $>$ 2.5 mag) and redshift up to 0.35. From 13 sources with good data at H I observing frequencies, we report the detection of H I absorption towards 8 sources, 5 of which are new detections including 4 in the redshift range 0.25 to 0.35. Our detection rates are consistent with our previous results with dependence on the star-formation history of the host galaxy reflected in the mid-infrared \textit{WISE} W2$-$W3 colors and the compactness of the radio source. We find no significant dependence of detection rates on radio luminosity or redshift. We also find that H I column densities are anti-correlated with the low-frequency spectral indices ($α_{\rm 150 MHz}^{\rm 1.4 GHz}$, $S_ν\propto ν^{-α}$). We do not have any detection from 23 sources with good data at OH observing frequencies. However, by stacking the spectra we estimate the 3$σ$ upper limit of OH column density to be 2.27$\times$10$^{14}$$T_{\rm ex}$/10 K $\times$1/$f_{\rm c}$ cm$^{-2}$. By stacking the OH spectra for 7 associated H I absorbers, we get a 3$σ$ upper limit of 3.47$\times$10$^{14}$ $T_{\rm ex}$/10 K $\times$1/$f_{\rm c}$ cm$^{-2}$ on OH column density and 1.78$\times$10$^{-7}$ on [OH]/[H I] ratio.
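The stacked OH limits quoted above are expressed as a scaling relation in the excitation temperature $T_{\rm ex}$ and covering factor $f_{\rm c}$. A minimal sketch of how that scaling can be evaluated follows; the function name and argument names are hypothetical, and restricting $f_{\rm c}$ to the interval (0, 1] is an assumption not stated in the abstract.

```python
def oh_column_upper_limit(t_ex_kelvin, covering_factor, coefficient=2.27e14):
    """Scale a quoted 3-sigma OH column density limit (cm^-2).

    coefficient is the value quoted for T_ex = 10 K and f_c = 1:
    2.27e14 for the full stack, 3.47e14 for the 7 associated H I absorbers.
    """
    # Assumption: the covering factor is a fraction in (0, 1].
    if not 0.0 < covering_factor <= 1.0:
        raise ValueError("covering_factor must be in (0, 1]")
    return coefficient * (t_ex_kelvin / 10.0) / covering_factor


if __name__ == "__main__":
    # Reference case: reproduces the quoted coefficient by construction.
    print(f"{oh_column_upper_limit(10.0, 1.0):.3e} cm^-2")
    # Same scaling applied to the limit from the H I absorber stack.
    print(f"{oh_column_upper_limit(10.0, 1.0, coefficient=3.47e14):.3e} cm^-2")
```

The limit loosens linearly with higher $T_{\rm ex}$ and with smaller $f_{\rm c}$, which is why the abstract quotes it with both factors left explicit.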
Submitted 28 June, 2024;
originally announced June 2024.
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
Submitted 27 June, 2024;
originally announced June 2024.
-
$n$ qubits can be entangled in two different ways
Authors:
Dafa Li
Abstract:
In [M. Walter et al., Science 340, 1205, 7 June (2013)], a sufficient condition for genuinely entangled pure states was given via polytopes, and SLOCC classification was discussed. In this paper, we study entanglement classification of pure states of $n$ qubits via the basis state matrix (BSM) whose rows are the basis states. We propose a canonical form of the BSM, obtained by exchanging columns (i.e., permuting qubits) and rows of the BSM, and then give a necessary and sufficient condition for a genuinely entangled state of $n$ qubits via this canonical form. Thus, for any $n$ qubits, the genuinely entangled states can be partitioned into two families. One family includes all states whose BSM cannot be transformed into the canonical form. States with such a BSM are always genuinely entangled no matter what the non-zero coefficients are. GHZ and W states belong to this family. The other includes all states whose BSM can be transformed into the canonical form, but for any canonical form of BSM, some two columns or rows of the corresponding coefficient matrix are not proportional. The cluster state belongs to this family.
Submitted 27 June, 2024;
originally announced June 2024.
-
Existence and uniqueness of weak solutions to a parabolic nonlocal 1-Laplacian equation
Authors:
Dingding Li,
Chao Zhang
Abstract:
We consider a class of parabolic nonlocal $1$-Laplacian equations \begin{align*} u_t+(-Δ)^s_1u=f \quad \text{ in }Ω\times(0,T]. \end{align*} By employing the Rothe time-discretization method, we establish the existence and uniqueness of weak solutions to the equation above. In particular, in contrast to previous results on the local case, we infer that the weak solution maintains $\frac{1}{2}$-Hölder continuity in time.
Submitted 27 June, 2024;
originally announced June 2024.
-
Retain, Blend, and Exchange: A Quality-aware Spatial-Stereo Fusion Approach for Event Stream Recognition
Authors:
Lan Chen,
Dong Li,
Xiao Wang,
Pengpeng Shao,
Wei Zhang,
Yaowei Wang,
Yonghong Tian,
** Tang
Abstract:
Existing event stream-based pattern recognition models usually represent the event stream as a point cloud, voxel, image, etc., and design various deep neural networks to learn their features. Although considerable results can be achieved in simple cases, the model performance may be limited by monotonous modality expressions, sub-optimal fusion, and readout mechanisms. In this paper, we propose a novel dual-stream framework for event stream-based pattern recognition via differentiated fusion, termed EFV++. It models two common event representations simultaneously, i.e., event images and event voxels. The spatial and three-dimensional stereo information can be learned separately by utilizing a Transformer and a Graph Neural Network (GNN). We believe the features of each representation still contain both efficient and redundant features, and a sub-optimal solution may be obtained if we directly fuse them without differentiation. Thus, we divide each feature into three levels and retain high-quality features, blend medium-quality features, and exchange low-quality features. The enhanced dual features will be fed into the fusion Transformer together with bottleneck features. In addition, we introduce a novel hybrid interaction readout mechanism to enhance the diversity of features as final representations. Extensive experiments demonstrate that our proposed framework achieves state-of-the-art performance on multiple widely used event stream-based classification datasets. Specifically, we achieve new state-of-the-art performance on the Bullying10k dataset, i.e., $90.51\%$, which exceeds the second place by $+2.21\%$. The source code of this paper has been released on \url{https://github.com/Event-AHU/EFV_event_classification/tree/EFVpp}.
Submitted 26 June, 2024;
originally announced June 2024.
-
Memorized Images in Diffusion Models share a Subspace that can be Located and Deleted
Authors:
Ruchika Chavhan,
Ondrej Bohdal,
Yongshuo Zong,
Da Li,
Timothy Hospedales
Abstract:
Large-scale text-to-image diffusion models excel in generating high-quality images from textual inputs, yet concerns arise as research indicates their tendency to memorize and replicate exact training samples, raising copyright infringement and privacy issues. Efforts within the text-to-image community to address memorization explore causes such as data duplication, replicated captions, or trigger tokens, proposing per-prompt inference-time or training-time mitigation strategies. In this paper, we focus on the feed-forward layers and begin by contrasting neuron activations of a set of memorized and non-memorized prompts. Experiments reveal a surprising finding: many different sets of memorized prompts significantly activate a common subspace in the model, demonstrating, for the first time, that memorization in diffusion models lies in a special subspace. Subsequently, we introduce a novel post-hoc method for editing pre-trained models, whereby memorization is mitigated through the straightforward pruning of weights in specialized subspaces, avoiding the need to disrupt the training or inference process as seen in prior research. Finally, we demonstrate the robustness of the pruned model against training data extraction attacks, thereby unveiling new avenues for a practical and one-for-all solution to memorization.
Submitted 1 June, 2024;
originally announced June 2024.
-
Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0.04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.
Submitted 26 June, 2024;
originally announced June 2024.
-
Automated Clinical Data Extraction with Knowledge Conditioned LLMs
Authors:
Diya Li,
Asim Kadav,
Ai**g Gao,
Rui Li,
Richard Bourgon
Abstract:
The extraction of lung lesion information from clinical and medical imaging reports is crucial for research on and clinical care of lung-related diseases. Large language models (LLMs) can be effective at interpreting unstructured text in reports, but they often hallucinate due to a lack of domain-specific knowledge, leading to reduced accuracy and posing challenges for use in clinical settings. To address this, we propose a novel framework that aligns generated internal knowledge with external knowledge through in-context learning (ICL). Our framework employs a retriever to identify relevant units of internal or external knowledge and a grader to evaluate the truthfulness and helpfulness of the retrieved internal-knowledge rules, to align and update the knowledge bases. Our knowledge-conditioned approach also improves the accuracy and reliability of LLM outputs by addressing the extraction task in two stages: (i) lung lesion finding detection and primary structured field parsing, followed by (ii) further parsing of lesion description text into additional structured fields. Experiments with expert-curated test datasets demonstrate that this ICL approach can increase the F1 score for key fields (lesion size, margin and solidity) by an average of 12.9% over existing ICL methods.
Submitted 25 June, 2024;
originally announced June 2024.
-
Fish Tracking, Counting, and Behaviour Analysis in Digital Aquaculture: A Comprehensive Review
Authors:
Meng Cui,
Xubo Liu,
Haohe Liu,
**zheng Zhao,
Daoliang Li,
Wenwu Wang
Abstract:
Digital aquaculture leverages advanced technologies and data-driven methods, providing substantial benefits over traditional aquaculture practices. Fish tracking, counting, and behaviour analysis are crucial components of digital aquaculture, which are essential for optimizing production efficiency, enhancing fish welfare, and improving resource management. Previous reviews have focused on single modalities, limiting their ability to address the diverse challenges encountered in these tasks comprehensively. This review provides a comprehensive analysis of the current state of aquaculture digital technologies, including vision-based, acoustic-based, and biosensor-based methods. We examine the advantages, limitations, and applications of these methods, highlighting recent advancements and identifying critical research gaps. The scarcity of comprehensive fish datasets and the lack of unified evaluation standards, which make it difficult to compare the performance of different technologies, are identified as major obstacles hindering progress in this field. To overcome current limitations and improve the accuracy, robustness, and efficiency of fish monitoring systems, we explore the potential of emerging technologies such as multimodal data fusion and deep learning. Additionally, we contribute to the field by providing a summary of existing datasets available for fish tracking, counting, and behaviour analysis. Future research directions are outlined, emphasizing the need for comprehensive datasets and evaluation standards to facilitate meaningful comparisons between technologies and promote their practical implementation in real-world aquaculture settings.
Submitted 20 June, 2024;
originally announced June 2024.
-
Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (649 additional authors not shown)
Abstract:
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and determine the branching fractions $\mathcal{B}(D_s^+\toπ^+π^+π^-π^0|_{{\rm non}-η})=(2.04\pm0.08_{\rm stat.}\pm0.05_{\rm syst.})\%$ and $\mathcal{B}(D_s^+\toηπ^+)=(1.56\pm0.09_{\rm stat.}\pm0.04_{\rm syst.})\%$. Moreover, we measure the relative branching fraction between $φ\toπ^+π^-π^0$ and $φ\to K^+K^-$ to be $\frac{\mathcal{B}(φ(1020) \to π^+π^-π^0)}{\mathcal{B}(φ(1020) \to K^+K^-)}=0.230 \pm 0.014_{\rm stat.} \pm 0.010_{\rm syst.}$, which deviates from the world average value by more than $4σ$.
Submitted 25 June, 2024;
originally announced June 2024.
-
Essential connectivity and spectral radius of graphs
Authors:
Wenxiu Ding,
Dan Li,
Yu Wang,
Jixiang Meng
Abstract:
A graph is trivial if it contains one vertex and no edges. The essential connectivity $κ^{\prime}$ of $G$ is defined to be the minimum number of vertices of $G$ whose removal produces a disconnected graph with at least two non-trivial components. Let $\mathcal{A}_n^{κ',δ}$ be the set of graphs of order $n$ with minimum degree $δ$ and essential connectivity $κ'$. In this paper, we determine the graphs attaining the maximum spectral radii among all graphs in $\mathcal{A}_n^{κ',δ}$ and characterize the corresponding extremal graphs. In addition, we also determine the digraphs which achieve the maximum spectral radii among all strongly connected digraphs with given essential connectivity and give the exact values of the spectral radii of these digraphs.
Submitted 25 June, 2024;
originally announced June 2024.
-
Toward Ubiquitous 3D Object Digitization: A Wearable Computing Framework for Non-Invasive Physical Property Acquisition
Authors:
Yunxiang Zhang,
Xin Sun,
Dengfeng Li,
Xinge Yu,
Qi Sun
Abstract:
Accurately digitizing physical objects is central to many applications, including virtual/augmented reality, industrial design, and e-commerce. Prior research has demonstrated efficient and faithful reconstruction of objects' geometric shapes and visual appearances, which suffice for digitally representing rigid objects. In comparison, physical properties, such as elasticity and pressure, are also indispensable to the behavioral fidelity of digitized deformable objects. However, existing approaches to acquiring these quantities either rely on invasive specimen collection or expensive/bulky laboratory setups, making them inapplicable to consumer-level usage.
To fill this gap, we propose a wearable and non-invasive computing framework that allows users to conveniently estimate the material elasticity and internal pressure of deformable objects through finger touches. This is achieved by modeling their local surfaces as pressurized elastic shells and analytically deriving the two physical properties from finger-induced wrinkling patterns. Together with photogrammetry-reconstructed geometry and textures, the two estimated physical properties enable us to faithfully replicate the motion and deformation behaviors of several deformable objects. For the pressure estimation, our model achieves a relative error of 3.5%. In the interaction experiments, the virtual-physical deformation discrepancy measures less than 10.1%. Generalization to objects of irregular shape further demonstrates the potential of our approach in practical applications. We envision this work to provide insights for and motivate research toward democratizing the ubiquitous and pervasive digitization of our physical surroundings in daily, industrial, and scientific scenarios.
Submitted 24 June, 2024;
originally announced June 2024.
-
Scaling Laws for Linear Complexity Language Models
Authors:
Xuyang Shen,
Dong Li,
Ruitao Leng,
Zhen Qin,
Weigao Sun,
Yiran Zhong
Abstract:
The interest in linear complexity models for large language models is on the rise, although their scaling capacity remains uncertain. In this study, we present the scaling laws for linear complexity language models to establish a foundation for their scalability. Specifically, we examine the scaling behaviors of three efficient linear architectures. These include TNL, a linear attention model with data-independent decay; HGRN2, a linear RNN with data-dependent decay; and cosFormer2, a linear attention model without decay. We also include LLaMA as a baseline architecture for softmax attention for comparison. These models were trained with six variants, ranging from 70M to 7B parameters on a 300B-token corpus, and evaluated with a total of 1,376 intermediate checkpoints on various downstream tasks. These tasks include validation loss, commonsense reasoning, and information retrieval and generation. The study reveals that existing linear complexity language models exhibit similar scaling capabilities as conventional transformer-based models while also demonstrating superior linguistic proficiency and knowledge retention.
Submitted 24 June, 2024;
originally announced June 2024.
-
Further results on equivalence of multivariate polynomial matrices
Authors:
Jiancheng Guan,
**wang Liu,
Dongmei Li,
Tao Wu
Abstract:
This paper investigates equivalence of square multivariate polynomial matrices with the determinant being some power of a univariate irreducible polynomial. We first generalized a global-local theorem of Vaserstein. Then we proved these matrices are equivalent to their Smith forms by the generalized global-local theorem.
Submitted 24 June, 2024;
originally announced June 2024.
-
Gigantic-oxidative atomically layered epitaxy for designed complex oxides
Authors:
Guangdi Zhou,
Haoliang Huang,
Fengzhe Wang,
Heng Wang,
Qishuo Yang,
Zihao Nie,
Wei Lv,
Cui Ding,
Yueying Li,
Danfeng Li,
Yujie Sun,
Junhao Lin,
Guang-Ming Zhang,
Qi-Kun Xue,
Zhuoyu Chen
Abstract:
In designing material functionality within the intricate realm of transition metal oxides, lattice structure and d-orbital occupancy are two principal determinants of the correlated physical properties, such as superconductivity. However, the modulation of these two factors is inherently limited by the need to balance thermodynamic stability, kinetic mobility, and synthesis precision, particularly for oxidation-demanding phases. We introduce a methodology, namely the gigantic-oxidative atomically layered epitaxy (GOAL-Epitaxy), enhancing oxidation power 3-4 orders of magnitude beyond oxide molecular beam epitaxy (OMBE) and pulsed laser deposition (PLD), while ensuring atomic-layer-by-layer growth of designed complex structures. Consequently, thermodynamic stability is markedly augmented at elevated temperatures, improving growth kinetics. We demonstrate the accurate synthesis of complex nickelates and cuprates, especially an artificially designed structure as a parent of high-temperature superconductivity, in which alternating single and double NiO2 layers possess distinct nominal d-orbital occupancy. The GOAL-Epitaxy enables material discovery within the vastly broadened growth parameter space.
Submitted 24 June, 2024;
originally announced June 2024.
-
KEHRL: Learning Knowledge-Enhanced Language Representations with Hierarchical Reinforcement Learning
Authors:
Dongyang Li,
Taolin Zhang,
Longtao Huang,
Chengyu Wang,
Xiaofeng He,
Hui Xue
Abstract:
Knowledge-enhanced pre-trained language models (KEPLMs) leverage relation triples from knowledge graphs (KGs) and integrate these external data sources into language models via self-supervised learning. Previous works treat knowledge enhancement as two independent operations, i.e., knowledge injection and knowledge integration. In this paper, we propose to learn Knowledge-Enhanced language representations with Hierarchical Reinforcement Learning (KEHRL), which jointly addresses the problems of detecting positions for knowledge injection and integrating external knowledge into the model in order to avoid injecting inaccurate or irrelevant knowledge. Specifically, a high-level reinforcement learning (RL) agent utilizes both internal and prior knowledge to iteratively detect essential positions in texts for knowledge injection, which filters out less meaningful entities to avoid diverting the knowledge learning direction. Once the entity positions are selected, a relevant triple filtration module is triggered to perform low-level RL to dynamically refine the triples associated with polysemic entities through binary-valued actions. Experiments validate KEHRL's effectiveness in probing factual knowledge and enhancing the model's performance on various natural language understanding tasks.
Submitted 24 June, 2024;
originally announced June 2024.
-
UniPSDA: Unsupervised Pseudo Semantic Data Augmentation for Zero-Shot Cross-Lingual Natural Language Understanding
Authors:
Dongyang Li,
Taolin Zhang,
Jiali Deng,
Longtao Huang,
Chengyu Wang,
Xiaofeng He,
Hui Xue
Abstract:
Cross-lingual representation learning transfers knowledge from resource-rich data to resource-scarce ones to improve the semantic understanding abilities of different languages. However, previous works rely on shallow unsupervised data generated by token surface matching, regardless of the global context-aware semantics of the surrounding text tokens. In this paper, we propose an Unsupervised Pseudo Semantic Data Augmentation (UniPSDA) mechanism for cross-lingual natural language understanding to enrich the training data without human interventions. Specifically, to retrieve the tokens with similar meanings for the semantic data augmentation across different languages, we propose a sequential clustering process in 3 stages: within a single language, across multiple languages of a language family, and across languages from multiple language families. Meanwhile, considering the multi-lingual knowledge infusion with context-aware semantics while alleviating computation burden, we directly replace the key constituents of the sentences with the above-learned multi-lingual family knowledge, viewed as pseudo-semantic. The infusion process is further optimized via three de-biasing techniques without introducing any neural parameters. Extensive experiments demonstrate that our model consistently improves the performance on general zero-shot cross-lingual natural language understanding tasks, including sequence classification, information extraction, and question answering.
Submitted 24 June, 2024;
originally announced June 2024.
-
On the Role of Long-tail Knowledge in Retrieval Augmented Large Language Models
Authors:
Dongyang Li,
Junbing Yan,
Taolin Zhang,
Chengyu Wang,
Xiaofeng He,
Longtao Huang,
Hui Xue,
Jun Huang
Abstract:
Retrieval augmented generation (RAG) exhibits outstanding performance in promoting the knowledge capabilities of large language models (LLMs) with retrieved documents related to user queries. However, RAG only focuses on improving the response quality of LLMs via enhancing queries indiscriminately with retrieved information, paying little attention to what type of knowledge LLMs really need to answer original queries more accurately. In this paper, we suggest that long-tail knowledge is crucial for RAG as LLMs have already remembered common world knowledge during large-scale pre-training. Based on our observation, we propose a simple but effective long-tail knowledge detection method for LLMs. Specifically, the novel Generative Expected Calibration Error (GECE) metric is derived to measure the ``long-tailness'' of knowledge based on both statistics and semantics. Hence, we retrieve relevant documents and infuse them into the model for patching knowledge loopholes only when the input query relates to long-tail knowledge. Experiments show that, compared to existing RAG pipelines, our method achieves over 4x speedup in average inference time and consistent performance improvement in downstream tasks.
Submitted 24 June, 2024;
originally announced June 2024.
-
RuleR: Improving LLM Controllability by Rule-based Data Recycling
Authors:
Ming Li,
Han Chen,
Chenguang Wang,
Dang Nguyen,
Dianqi Li,
Tianyi Zhou
Abstract:
Large language models (LLMs) still lack delicate controllability over their responses, which is critical to enhancing their performance and the user experience. However, curating supervised fine-tuning (SFT) datasets to improve LLM controllability usually relies on human experts or proprietary LLMs, which requires additional costs. To bridge this gap, we propose Rule-based Data Recycling (RuleR), a data augmentation method incorporating multiple constraints into the original data samples according to predefined rules, which creates new training tasks to consolidate the controllability of LLMs. Instead of creating new data from scratch, RuleR ``recycles'' existing data by simply applying rule-based edits to their responses and appending the rule-instructions in their original instructions. Experimental results demonstrate RuleR's effectiveness in improving LLM controllability while maintaining general instruction-following capabilities. The code will be released on https://github.com/MingLiiii/RuleR.
Submitted 22 June, 2024;
originally announced June 2024.
-
A comprehensive overview of diffuse correlation spectroscopy: theoretical framework, recent advances in hardware, analysis, and applications
Authors:
Quan Wang,
Mingliang Pan,
Lucas Kreiss,
Saeed Samaei,
Stefan A. Carp,
Johannes D. Johansson,
Yuanzhe Zhang,
Melissa Wu,
Roarke Horstmeyer,
Mamadou Diop,
David Day-Uei Li
Abstract:
Diffuse correlation spectroscopy (DCS) is a powerful tool for assessing microvascular hemodynamics in deep tissues. Recent advances in sensors, lasers, and deep learning have further boosted the development of new DCS methods. However, newcomers might feel overwhelmed, not only by the already complex DCS theoretical framework but also by the broad range of component options and system architectures. To facilitate new entry into this exciting field, we present a comprehensive review of DCS hardware architectures (continuous-wave, frequency-domain, and time-domain) and summarize corresponding theoretical models. Further, we discuss new applications of highly integrated silicon single-photon avalanche diode (SPAD) sensors in DCS, compare SPADs with existing sensors, and review other components (lasers, fibers, and correlators), as well as new data analysis tools, including deep learning. Potential applications in medical diagnosis are discussed, and an outlook on future directions is provided, to offer effective guidance for embarking on DCS research.
Submitted 18 May, 2024;
originally announced June 2024.
-
Rapid and Accurate Diagnosis of Acute Aortic Syndrome using Non-contrast CT: A Large-scale, Retrospective, Multi-center and AI-based Study
Authors:
Yujian Hu,
Yilang Xiang,
Yan-Jie Zhou,
Yangyan He,
Shifeng Yang,
Xiaolong Du,
Chunlan Den,
Youyao Xu,
Gaofeng Wang,
Zhengyao Ding,
**gyong Huang,
Wenjun Zhao,
Xuejun Wu,
Donglin Li,
Qianqian Zhu,
Zhenjiang Li,
Chenyang Qiu,
Ziheng Wu,
Yunjun He,
Chen Tian,
Yihui Qiu,
Zuodong Lin,
Xiaolong Zhang,
Yuan He,
Zhenpeng Yuan
, et al. (15 additional authors not shown)
Abstract:
Chest pain symptoms are highly prevalent in emergency departments (EDs), where acute aortic syndrome (AAS) is a catastrophic cardiovascular emergency with a high fatality rate, especially when timely and accurate treatment is not administered. However, current triage practices in the ED can cause up to approximately half of patients with AAS to have an initially missed diagnosis or be misdiagnosed as having other acute chest pain conditions. Subsequently, these AAS patients will undergo clinically inaccurate or suboptimal differential diagnosis. Fortunately, even under these suboptimal protocols, nearly all these patients underwent non-contrast CT covering the aorta anatomy at the early stage of differential diagnosis. In this study, we developed an artificial intelligence model (DeepAAS) using non-contrast CT, which is highly accurate for identifying AAS and provides interpretable results to assist in clinical decision-making. Performance was assessed in two major phases: a multi-center retrospective study (n = 20,750) and an exploration in real-world emergency scenarios (n = 137,525). In the multi-center cohort, DeepAAS achieved a mean area under the receiver operating characteristic curve of 0.958 (95% CI 0.950-0.967). In the real-world cohort, DeepAAS detected 109 AAS patients with misguided initial suspicion, achieving 92.6% (95% CI 76.2%-97.5%) in mean sensitivity and 99.2% (95% CI 99.1%-99.3%) in mean specificity. Our AI model performed well on non-contrast CT at all applicable early stages of differential diagnosis workflows, effectively reduced the overall missed diagnosis and misdiagnosis rate from 48.8% to 4.8% and shortened the diagnosis time for patients with misguided initial suspicion from an average of 681.8 (74-11,820) mins to 68.5 (23-195) mins. DeepAAS could effectively fill the gap in the current clinical workflow without requiring additional tests.
Submitted 24 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies of 4.914 and 4.946 GeV by the BESIII detector, a search for the $e^+e^- \to φχ_{c1}(3872)$ process is performed for the first time. No significant signal is observed, and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^+π^- J/ψ]$ at 4.914 and 4.946 GeV are set to be 0.85 and 0.96 pb, respectively. These measurements provide useful information on the production of the $χ_{c1}(3872)$ at $e^+e^-$ colliders and deepen our understanding of the nature of this particle.
Submitted 21 June, 2024;
originally announced June 2024.
-
Harvesting Efficient On-Demand Order Pooling from Skilled Couriers: Enhancing Graph Representation Learning for Refining Real-time Many-to-One Assignments
Authors:
Yile Liang,
Jiuxia Zhao,
Donghui Li,
Jie Feng,
Chen Zhang,
Xuetao Ding,
**ghua Hao,
Renqing He
Abstract:
The recent past has witnessed a notable surge in on-demand food delivery (OFD) services, offering delivery fulfillment within dozens of minutes after an order is placed. In OFD, pooling multiple orders for simultaneous delivery in real-time order assignment is a pivotal efficiency source, which may in turn extend delivery time. Constructing high-quality order pooling to harmonize platform efficiency with the experiences of consumers and couriers is crucial to OFD platforms. However, the complexity and real-time nature of order assignment, which make extensive calculations impractical, significantly limit the potential for order consolidation. Moreover, the offline environment is frequently riddled with unknown factors, posing challenges for the platform's perceptibility and pooling decisions. Nevertheless, the delivery behaviors of skilled couriers (SCs), who know the environment well, can improve system awareness and effectively inform decisions. Hence, an SC delivery network (SCDN) is constructed, based on an enhanced attributed heterogeneous network embedding approach tailored for OFD. It aims to extract features from rich temporal and spatial information, and uncover the latent potential for order combinations embedded within SC trajectories. Accordingly, the vast search space of order assignment can be effectively pruned through scalable similarity calculations of low-dimensional vectors, making comprehensive and high-quality pooling outcomes more easily identified in real time. SCDN has now been deployed in the Meituan dispatch system. Online tests reveal that with SCDN, the pooling quality and extent have been greatly improved. Our system can boost couriers' efficiency by 45-55% during noon peak hours, while upholding the timely delivery commitment.
Submitted 20 June, 2024;
originally announced June 2024.
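The pruning step mentioned in the abstract above (similarity calculations on low-dimensional vectors) can be illustrated with a minimal sketch. This is not the SCDN implementation: the function names, the toy two-dimensional embeddings, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def prune_candidates(query, embeddings, k):
    """Keep only the k order ids whose embeddings are most similar to the query.

    `embeddings` maps a hypothetical order id to its low-dimensional vector.
    """
    ranked = sorted(embeddings.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [order_id for order_id, _ in ranked[:k]]

# Toy embeddings (assumed values): A and B point roughly the same way as the query.
orders = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0], "D": [-1.0, 0.0]}
shortlist = prune_candidates([1.0, 0.05], orders, k=2)
print(shortlist)
```

Only the shortlisted orders would then be passed to a more expensive pooling evaluation, which is the sense in which the search space is pruned.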
-
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors
Authors:
Tinghao Xie,
Xiangyu Qi,
Yi Zeng,
Yangsibo Huang,
Udari Madhushani Sehwag,
Kaixuan Huang,
Luxi He,
Boyi Wei,
Dacheng Li,
Ying Sheng,
Ruoxi Jia,
Bo Li,
Kai Li,
Danqi Chen,
Peter Henderson,
Prateek Mittal
Abstract:
Evaluating aligned large language models' (LLMs) ability to recognize and reject unsafe user requests is crucial for safe, policy-compliant deployments. Existing evaluation efforts, however, face three limitations that we address with SORRY-Bench, our proposed benchmark. First, existing methods often use coarse-grained taxonomies of unsafe topics, and are over-representing some fine-grained topics. For example, among the ten existing datasets that we evaluated, tests for refusals of self-harm instructions are over 3x less represented than tests for fraudulent activities. SORRY-Bench improves on this by using a fine-grained taxonomy of 45 potentially unsafe topics, and 450 class-balanced unsafe instructions, compiled through human-in-the-loop methods. Second, linguistic characteristics and formatting of prompts are often overlooked, like different languages, dialects, and more -- which are only implicitly considered in many evaluations. We supplement SORRY-Bench with 20 diverse linguistic augmentations to systematically examine these effects. Third, existing evaluations rely on large LLMs (e.g., GPT-4) for evaluation, which can be computationally expensive. We investigate design choices for creating a fast, accurate automated safety evaluator. By collecting 7K+ human annotations and conducting a meta-evaluation of diverse LLM-as-a-judge designs, we show that fine-tuned 7B LLMs can achieve accuracy comparable to GPT-4 scale LLMs, with lower computational cost. Putting these together, we evaluate over 40 proprietary and open-source LLMs on SORRY-Bench, analyzing their distinctive refusal behaviors. We hope our effort provides a building block for systematic evaluations of LLMs' safety refusal capabilities, in a balanced, granular, and efficient manner.
Submitted 20 June, 2024;
originally announced June 2024.
-
Overview of the CAIL 2023 Argument Mining Track
Authors:
Jingcong Liang,
Junlong Wang,
Xinyu Zhai,
Yungui Zhuang,
Yiyang Zheng,
Xin Xu,
Xiandong Ran,
Xiaozheng Dong,
Honghui Rong,
Yanlun Liu,
Hao Chen,
Yuhan Wei,
Donghai Li,
Jiajie Peng,
Xuanjing Huang,
Chongde Shi,
Yansong Feng,
Yun Song,
Zhongyu Wei
Abstract:
We give a detailed overview of the CAIL 2023 Argument Mining Track, one of the Chinese AI and Law Challenge (CAIL) 2023 tracks. The main goal of the track is to identify and extract interacting argument pairs in trial dialogs. It mainly uses summarized judgment documents but can also refer to trial recordings. The track consists of two stages, and we introduce the tasks designed for each stage; we also extend the data from previous events into a new dataset -- CAIL2023-ArgMine -- with annotated new cases from various causes of action. We outline several submissions that achieve the best results, including their methods for different stages. While all submissions rely on language models, they have incorporated strategies that may benefit future work in this field.
Submitted 20 June, 2024;
originally announced June 2024.
-
EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms
Authors:
Siyu Yuan,
Kaitao Song,
Jiangjie Chen,
Xu Tan,
Dongsheng Li,
Deqing Yang
Abstract:
The rise of powerful large language models (LLMs) has spurred a new trend in building LLM-based autonomous agents for solving complex tasks, especially multi-agent systems. Despite the remarkable progress, we notice that existing works are heavily dependent on human-designed frameworks, which greatly limits the functional scope and scalability of agent systems. How to automatically extend the specialized agent to multi-agent systems to improve task-solving capability still remains a significant challenge. In this paper, we introduce EvoAgent, a generic method to automatically extend expert agents to multi-agent systems via the evolutionary algorithm, thereby improving the effectiveness of LLM-based agents in solving tasks. Specifically, we consider the existing agent frameworks as the initial individual and then apply a series of evolutionary operators (e.g., mutation, crossover, selection, etc.) to generate multiple agents with diverse agent settings. EvoAgent can be generalized to any LLM-based agent framework, and can automatically extend the existing agent framework to multi-agent systems without any extra human designs. Experimental results across various tasks have shown that EvoAgent can automatically generate multiple expert agents and significantly enhance the task-solving capabilities of LLM-based agents.
Submitted 20 June, 2024;
originally announced June 2024.
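The evolutionary operators named in the abstract above (mutation, crossover, selection) can be sketched generically. This is not EvoAgent itself: here an "agent setting" is assumed to be a plain list of numbers, and `toy_fitness` is a hypothetical stand-in for whatever task score a real system would use.

```python
import random

def evolve(initial, fitness, generations=20, population_size=8, seed=0):
    """Generic elitist evolutionary loop starting from one initial individual."""
    rng = random.Random(seed)
    population = [list(initial)]
    for _ in range(generations):
        children = []
        while len(children) < population_size:
            parent_a = rng.choice(population)
            parent_b = rng.choice(population)
            # Crossover: each gene is copied from one of the two parents.
            child = [rng.choice(pair) for pair in zip(parent_a, parent_b)]
            # Mutation: perturb one randomly chosen gene.
            index = rng.randrange(len(child))
            child[index] += rng.gauss(0.0, 0.5)
            children.append(child)
        # Selection: keep the fittest individuals among parents and children.
        population = sorted(population + children, key=fitness, reverse=True)
        population = population[:population_size]
    return population

def toy_fitness(settings):
    """Hypothetical score: negative squared distance to an assumed target."""
    target = [1.0, -2.0, 0.5]
    return -sum((s - t) ** 2 for s, t in zip(settings, target))

start = [0.0, 0.0, 0.0]
final_population = evolve(start, toy_fitness)
best = final_population[0]
print(best, toy_fitness(best))
```

Because parents compete with their children in the selection step, the best score never decreases from one generation to the next. The final population holds several distinct individuals, loosely mirroring the idea of generating multiple agents from a single starting one.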
-
EduQate: Generating Adaptive Curricula through RMABs in Education Settings
Authors:
Sidney Tio,
Dexun Li,
Pradeep Varakantham
Abstract:
There has been significant interest in the development of personalized and adaptive educational tools that cater to a student's individual learning progress. A crucial aspect in developing such tools is in exploring how mastery can be achieved across a diverse yet related range of content in an efficient manner. While Reinforcement Learning and Multi-armed Bandits have shown promise in educational settings, existing works often assume the independence of learning content, neglecting the prevalent interdependencies between such content. In response, we introduce Education Network Restless Multi-armed Bandits (EdNetRMABs), utilizing a network to represent the relationships between interdependent arms. Subsequently, we propose EduQate, a method employing interdependency-aware Q-learning to make informed decisions on arm selection at each time step. We establish the optimality guarantee of EduQate and demonstrate its efficacy compared to baseline policies, using students modeled from both synthetic and real-world data.
Submitted 20 June, 2024;
originally announced June 2024.
-
Fully Nonlinear Elliptic Equations With Periodic Data
Authors:
Dongsheng Li,
Lichun Liang
Abstract:
In this paper, we study solutions $u$ of fully nonlinear elliptic equations of the form $F(D^2u)=f$ in $\mathbb{R}^n$, where $f$ is periodic. We establish the existence and Liouville type results for entire quadratic polynomial growth solutions, that is, the solution is a quadratic polynomial plus a periodic function. As a consequence, we consider applications to $k$-Hessian equations.
Submitted 19 June, 2024;
originally announced June 2024.
-
Amphista: Accelerate LLM Inference with Bi-directional Multiple Drafting Heads in a Non-autoregressive Style
Authors:
Zeping Li,
Xinlong Yang,
Ziheng Gao,
Ji Liu,
Zhuang Liu,
Dong Li,
Jinzhang Peng,
Lu Tian,
Emad Barsoum
Abstract:
Large Language Models (LLMs) inherently use autoregressive decoding, which lacks parallelism in inference and results in significantly slow inference speeds, especially when hardware parallel accelerators and memory bandwidth are not fully utilized. In this work, we propose Amphista, a speculative decoding algorithm that adheres to a non-autoregressive decoding paradigm. Owing to the increased parallelism, our method demonstrates higher efficiency in inference compared to autoregressive methods. Specifically, Amphista models an Auto-embedding Block capable of parallel inference, incorporating bi-directional attention to enable interaction between different drafting heads. Additionally, Amphista implements Staged Adaptation Layers to facilitate the transition of semantic information from the base model's autoregressive inference to the drafting heads' non-autoregressive speculation, thereby achieving paradigm transformation and feature fusion. We conduct a series of experiments on a suite of Vicuna models using MT-Bench and Spec-Bench. For the Vicuna 33B model, Amphista achieves up to 2.75$\times$ and 1.40$\times$ wall-clock acceleration compared to vanilla autoregressive decoding and Medusa, respectively, while preserving lossless generation quality.
Submitted 18 June, 2024;
originally announced June 2024.
-
$C^1$-robust homoclinic tangencies
Authors:
Dongchen Li
Abstract:
The aim of this paper is twofold. First, motivated by the nearly-affine blender system found in [LT24], we introduce standard blenders and their variations, and prove their fundamental properties on the generation of $C^1$-robust tangencies. Next, as an application, we show that unfolding a homoclinic tangency to a hyperbolic periodic point can produce uncountably many $C^1$-robust homoclinic tangencies, provided that either this point is involved in a coindex-1 heterodimensional cycle, or the central dynamics near it are not essentially two-dimensional.
Submitted 3 July, 2024; v1 submitted 18 June, 2024;
originally announced June 2024.
-
MCSD: An Efficient Language Model with Diverse Fusion
Authors:
Hua Yang,
Duohai Li,
Shiman Li
Abstract:
Transformers excel in Natural Language Processing (NLP) due to their prowess in capturing long-term dependencies but suffer from exponential resource consumption with increasing sequence lengths. To address these challenges, we propose MCSD model, an efficient language model with linear scaling and fast inference speed. MCSD model leverages diverse feature fusion, primarily through the multi-channel slope and decay (MCSD) block, to robustly represent features. This block comprises slope and decay sections that extract features across diverse temporal receptive fields, facilitating capture of both local and global information. In addition, MCSD block conducts element-wise fusion of diverse features to further enhance the delicate feature extraction capability. For inference, we formulate the inference process into a recurrent representation, slashing space complexity to $O(1)$ and time complexity to $O(N)$ respectively. Our experiments show that MCSD attains higher throughput and lower GPU memory consumption compared to Transformers, while maintaining comparable performance to larger-scale language learning models on benchmark tests. These attributes position MCSD as a promising base for edge deployment and embodied intelligence.
Submitted 17 June, 2024;
originally announced June 2024.
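The abstract above states that recasting inference as a recurrence brings space complexity to $O(1)$ and time complexity to $O(N)$. The sketch below shows that general idea on a simple exponentially decayed sum. It is not the MCSD block, whose equations are not given in the abstract; the decay form and the function names are assumptions.

```python
def decayed_sums_parallel(xs, decay):
    """Direct form: y_t = sum over s <= t of decay**(t - s) * x_s.

    Needs the whole history for every output (quadratic work overall).
    """
    return [sum(decay ** (t - s) * xs[s] for s in range(t + 1))
            for t in range(len(xs))]

def decayed_sums_recurrent(xs, decay):
    """Recurrent form: h_t = decay * h_{t-1} + x_t.

    Carries a single scalar state, so memory per step is constant and
    total work is linear in the sequence length.
    """
    state = 0.0
    for x in xs:
        state = decay * state + x
        yield state

xs = [1.0, 2.0, 3.0, 4.0]
full = decayed_sums_parallel(xs, 0.5)
streamed = list(decayed_sums_recurrent(xs, 0.5))
print(full, streamed)
```

Both forms produce the same outputs; only the recurrent one can process tokens one at a time without keeping earlier inputs.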
-
Scintillation velocity and arc observations of FRB 20201124A
Authors:
Ziwei Wu,
Weiwei Zhu,
Bing Zhang,
Yi Feng,
JinLin Han,
Di Li,
Dongzi Li,
Rui Luo,
Chenhui Niu,
Jiarui Niu,
Bojun Wang,
Fayin Wang,
Pei Wang,
Weiyang Wang,
Heng Xu,
Yuanpei Yang,
Yongkun Zhang,
Dejiang Zhou,
Yuhao Zhu,
Can-Min Deng,
Yonghua Xu
Abstract:
We present the scintillation velocity measurements of FRB~20201124A from the FAST observations, which reveal an annual variation. This annual variation is further supported by changes detected in the scintillation arc as observed from the secondary spectrum. We attribute the annual velocity variation to the presence of a moderately anisotropic scattering screen located at a distance of 0.4$\pm$0.1~kpc from Earth. Our results prove that the scintillation of this FRB is mainly caused by material close to Earth on a Galactic scale. However, scintillation observations of other FRBs may expose their surrounding environment or uncover possible orbital motion if scintillation is caused by materials in their host galaxy.
Submitted 17 June, 2024;
originally announced June 2024.
-
Applications of Explainable artificial intelligence in Earth system science
Authors:
Feini Huang,
Shijie Jiang,
Lu Li,
Yongkun Zhang,
Ye Zhang,
Ruqing Zhang,
Qingliang Li,
Danxi Li,
Wei Shangguan,
Yongjiu Dai
Abstract:
In recent years, artificial intelligence (AI) rapidly accelerated its influence and is expected to promote the development of Earth system science (ESS) if properly harnessed. In application of AI to ESS, a significant hurdle lies in the interpretability conundrum, an inherent problem of black-box nature arising from the complexity of AI algorithms. To address this, explainable AI (XAI) offers a set of powerful tools that make the models more transparent. The purpose of this review is twofold: First, to provide ESS scholars, especially newcomers, with a foundational understanding of XAI, serving as a primer to inspire future research advances; second, to encourage ESS professionals to embrace the benefits of AI, free from preconceived biases due to its lack of interpretability. We begin with elucidating the concept of XAI, along with typical methods. We then delve into a review of XAI applications in the ESS literature, highlighting the important role that XAI has played in facilitating communication with AI model decisions, improving model diagnosis, and uncovering scientific insights. We identify four significant challenges that XAI faces within the ESS, and propose solutions. Furthermore, we provide a comprehensive illustration of multifaceted perspectives. Given the unique challenges in ESS, an interpretable hybrid approach that seamlessly integrates AI with domain-specific knowledge appears to be a promising way to enhance the utility of AI in ESS. A visionary outlook for ESS envisions a harmonious blend where process-based models govern the known, AI models explore the unknown, and XAI bridges the gap by providing explanations.
Submitted 12 June, 2024;
originally announced June 2024.
-
DRIP: Discriminative Rotation-Invariant Pole Landmark Descriptor for 3D LiDAR Localization
Authors:
Dingrui Li,
Dedi Guo,
Kanji Tanaka
Abstract:
In 3D LiDAR-based robot self-localization, pole-like landmarks are gaining popularity as lightweight and discriminative landmarks. This work introduces a novel approach called "discriminative rotation-invariant poles," which enhances the discriminability of pole-like landmarks while maintaining their lightweight nature. Unlike conventional methods that model a pole landmark as a 3D line segment perpendicular to the ground, we propose a simple yet powerful approach that includes not only the line segment's main body but also its surrounding local region of interest (ROI) as part of the pole landmark. Specifically, we describe the appearance, geometry, and semantic features within this ROI to improve the discriminability of the pole landmark. Since such pole landmarks are no longer rotation-invariant, we introduce a novel rotation-invariant convolutional neural network that automatically and efficiently extracts rotation-invariant features from input point clouds for recognition. Furthermore, we train a pole dictionary through unsupervised learning and use it to compress poles into compact pole words, thereby significantly reducing real-time costs while maintaining optimal self-localization performance. Monte Carlo localization experiments using publicly available NCLT dataset demonstrate that the proposed method improves a state-of-the-art pole-based localization framework.
Submitted 17 June, 2024;
originally announced June 2024.
-
Magnetospheric origin of a fast radio burst constrained using scintillation
Authors:
Kenzie Nimmo,
Ziggy Pleunis,
Paz Beniamini,
Pawan Kumar,
Adam E. Lanman,
D. Z. Li,
Robert Main,
Mawson W. Sammons,
Shion Andrew,
Mohit Bhardwaj,
Shami Chatterjee,
Alice P. Curtin,
Emmanuel Fonseca,
B. M. Gaensler,
Ronniy C. Joseph,
Zarif Kader,
Victoria M. Kaspi,
Mattias Lazda,
Calvin Leung,
Kiyoshi W. Masui,
Ryan Mckinven,
Daniele Michilli,
Ayush Pandhi,
Aaron B. Pearlman,
Masoud Rafiei-Ravandi
, et al. (4 additional authors not shown)
Abstract:
Fast radio bursts (FRBs) are micro-to-millisecond duration radio transients that originate mostly from extragalactic distances. The emission mechanism responsible for these high luminosity, short duration transients remains debated. The models are broadly grouped into two classes: physical processes that occur within close proximity to a central engine; and central engines that release energy which moves to large radial distances and subsequently interacts with surrounding media producing radio waves. The expected emission region sizes are notably different between these two types of models. FRB emission size constraints can therefore be used to distinguish between these competing models and inform on the physics responsible. Here we present the measurement of two mutually coherent scintillation scales in the frequency spectrum of FRB 20221022A: one originating from a scattering screen located within the Milky Way, and the second originating from a scattering screen located within its host galaxy or local environment. We use the scattering media as an astrophysical lens to constrain the size of the lateral emission region, $R_{\star\mathrm{obs}} \lesssim 3\times10^{4}$ km. We find that this is inconsistent with the expected emission sizes for the large radial distance models, and is more naturally explained with an emission process that operates within or just beyond the magnetosphere of a central compact object. Recently, FRB 20221022A was found to exhibit an S-shaped polarisation angle swing, supporting a magnetospheric emission process. The scintillation results presented in this work independently support this conclusion, while highlighting scintillation as a useful tool in our understanding of FRB emission physics and progenitors.
Submitted 16 June, 2024;
originally announced June 2024.
-
AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models
Authors:
Xiyang Wu,
Tianrui Guan,
Dianqi Li,
Shuaiyi Huang,
Xiaoyu Liu,
Xijun Wang,
Ruiqi Xian,
Abhinav Shrivastava,
Furong Huang,
Jordan Lee Boyd-Graber,
Tianyi Zhou,
Dinesh Manocha
Abstract:
Large vision-language models (LVLMs) hallucinate: certain context cues in an image may trigger the language module's overconfident and incorrect reasoning on abnormal or hypothetical objects. Though a few benchmarks have been developed to investigate LVLM hallucinations, they mainly rely on hand-crafted corner cases whose fail patterns may hardly generalize, and finetuning on them could undermine their validity. These motivate us to develop the first automatic benchmark generation approach, AUTOHALLUSION, that harnesses a few principal strategies to create diverse hallucination examples. It probes the language modules in LVLMs for context cues and uses them to synthesize images by: (1) adding objects abnormal to the context cues; (2) for two co-occurring objects, keeping one and excluding the other; or (3) removing objects closely tied to the context cues. It then generates image-based questions whose ground-truth answers contradict the language module's prior. A model has to overcome contextual biases and distractions to reach correct answers, while incorrect or inconsistent answers indicate hallucinations. AUTOHALLUSION enables us to create new benchmarks at the minimum cost and thus overcomes the fragility of hand-crafted benchmarks. It also reveals common failure patterns and reasons, providing key insights to detect, avoid, or control hallucinations. Comprehensive evaluations of top-tier LVLMs, e.g., GPT-4V(ision), Gemini Pro Vision, Claude 3, and LLaVA-1.5, show a 97.7% and 98.7% success rate of hallucination induction on synthetic and real-world datasets of AUTOHALLUSION, paving the way for a long battle against hallucinations.
Submitted 16 June, 2024;
originally announced June 2024.
-
On the refined analyticity radius of 3-D generalized Navier-Stokes equations
Authors:
Dong Li,
Ping Zhang
Abstract:
We analyze the instantaneous growth of analyticity radius for three dimensional generalized Navier-Stokes equations. For the subcritical $H^γ(\mathbb R^3)$ case with $γ>\frac12,$ we prove that there exists a positive time $t_0$ so that for any $t\in]0, t_0]$, the radius of analyticity of the solution $u$ satisfies the pointwise-in-time lower bound $${\mathrm{rad}}(u)(t)\ge \sqrt{(2γ-1)t\bigl(|\ln t|+\ln|\ln t|+K_t\bigr)},$$ where $K_t \to \infty$ as $t\to 0^+$. This in particular gives a nontrivial improvement of the previous result by Herbst and Skibsted in \cite{HS} for the case $γ\in ]1/2,3/2[$ and also settles the decade-long open question in \cite{HS}, namely, whether or not
$\liminf_{t\to 0^+}\frac {\mathrm{ rad}(u)(t)}{\sqrt{t|\ln t|}}\ge \sqrt{2γ-1}$ for all $γ\ge \frac32$. For the critical case $H^{\frac 12}(\mathbb R^3)$, we prove that there exists $t_1>0$ so that for any $t\in ]0, t_1],$ ${\mathrm {rad}}(u)(t)\ge λ(t)\sqrt{t}$ with $λ(t)$ satisfying $\lim_{t\to 0^+}λ(t)=\infty.$
Submitted 16 June, 2024;
originally announced June 2024.
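As a reader's remark, not taken from the paper, the pointwise lower bound in the abstract above implies the $\liminf$ statement by a one-line computation. The only assumption is that $t$ is small enough that $|\ln t|\ge 1$ (so $\ln|\ln t|\ge 0$) and $K_t\ge 0$, which holds eventually because $K_t\to\infty$ as $t\to 0^+$.

```latex
% Divide the lower bound for rad(u)(t) by \sqrt{t|\ln t|} and drop the
% nonnegative correction term (valid once |\ln t| \ge 1 and K_t \ge 0).
\frac{\mathrm{rad}(u)(t)}{\sqrt{t|\ln t|}}
  \ge \sqrt{2γ-1}\,\sqrt{1+\frac{\ln|\ln t|+K_t}{|\ln t|}}
  \ge \sqrt{2γ-1},
\qquad\text{hence}\qquad
\liminf_{t\to 0^+}\frac{\mathrm{rad}(u)(t)}{\sqrt{t|\ln t|}}\ge\sqrt{2γ-1}.
```

The first inequality is strict for such $t$, which is the sense in which the stated bound refines the $\sqrt{(2γ-1)\,t|\ln t|}$ rate.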
-
PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery
Authors:
Libo Wang,
Dongxu Li,
Sijun Dong,
Xiaoliang Meng,
Xiaokang Zhang,
Danfeng Hong
Abstract:
Semantic segmentation, as a basic tool for intelligent interpretation of remote sensing images, plays a vital role in many Earth Observation (EO) applications. Nowadays, accurate semantic segmentation of remote sensing images remains a challenge due to the complex spatial-temporal scenes and multi-scale geo-objects. Driven by the wave of deep learning (DL), CNN- and Transformer-based semantic segmentation methods have been explored widely, and these two architectures both revealed the importance of multi-scale feature representation for strengthening semantic information of geo-objects. However, the actual multi-scale feature fusion often comes with the semantic redundancy issue due to homogeneous semantic contents in pyramid features. To handle this issue, we propose a novel Mamba-based segmentation network, namely PyramidMamba. Specifically, we design a plug-and-play decoder, which develops a dense spatial pyramid pooling (DSPP) to encode rich multi-scale semantic features and a pyramid fusion Mamba (PFM) to reduce semantic redundancy in multi-scale feature fusion. Comprehensive ablation experiments illustrate the effectiveness and superiority of the proposed method in enhancing multi-scale feature representation as well as the great potential for real-time semantic segmentation. Moreover, our PyramidMamba yields state-of-the-art performance on three publicly available datasets, i.e. the OpenEarthMap (70.8% mIoU), ISPRS Vaihingen (84.8% mIoU) and Potsdam (88.0% mIoU) datasets. The code will be available at https://github.com/WangLibo1995/GeoSeg.
Submitted 16 June, 2024;
originally announced June 2024.
-
From Words to Worlds: Transforming One-line Prompt into Immersive Multi-modal Digital Stories with Communicative LLM Agent
Authors:
Samuel S. Sohn,
Danrui Li,
Sen Zhang,
Che-Jui Chang,
Mubbasir Kapadia
Abstract:
Digital storytelling, essential in entertainment, education, and marketing, faces challenges in production scalability and flexibility. The StoryAgent framework, introduced in this paper, utilizes Large Language Models and generative tools to automate and refine digital storytelling. Employing a top-down story drafting and bottom-up asset generation approach, StoryAgent tackles key issues such as manual intervention, interactive scene orchestration, and narrative consistency. This framework enables efficient production of interactive and consistent narratives across multiple modalities, democratizing content creation and enhancing engagement. Our results demonstrate the framework's capability to produce coherent digital stories without reference videos, marking a significant advancement in automated digital storytelling.
Submitted 21 June, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
-
Fine-Grained Urban Flow Inference with Multi-scale Representation Learning
Authors:
Shilu Yuan,
Dongfeng Li,
Wei Liu,
Xinxin Zhang,
Meng Chen,
Junjie Zhang,
Yongshun Gong
Abstract:
Fine-grained urban flow inference (FUFI) is a crucial transportation service aimed at improving traffic efficiency and safety. FUFI can infer fine-grained urban traffic flows based solely on observed coarse-grained data. However, most existing methods focus on the influence of single-scale static geographic information on FUFI, neglecting the interactions and dynamic information between different-scale regions within the city. Different-scale geographical features can capture redundant information from the same spatial areas. In order to effectively learn multi-scale information across time and space, we propose an effective fine-grained urban flow inference model called UrbanMSR, which uses self-supervised contrastive learning to obtain dynamic multi-scale representations of neighborhood-level and city-level geographic information, and fuses multi-scale representations to improve fine-grained accuracy. We validate the performance through extensive experiments on three real-world datasets. The results, compared with state-of-the-art methods, demonstrate the superiority of the proposed model.
Submitted 14 June, 2024;
originally announced June 2024.