-
DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation
Authors:
Shuting Wang,
Jiongnan Liu,
Shiren Song,
Jiehan Cheng,
Yuqi Fu,
Peidong Guo,
Kun Fang,
Yutao Zhu,
Zhicheng Dou
Abstract:
Retrieval-Augmented Generation (RAG) offers a promising solution to address various limitations of Large Language Models (LLMs), such as hallucination and difficulties in keeping up with real-time updates. This approach is particularly critical in expert and domain-specific applications where LLMs struggle to cover expert knowledge. Therefore, evaluating RAG models in such scenarios is crucial, yet current studies often rely on general knowledge sources like Wikipedia to assess the models' abilities in solving common-sense problems. In this paper, we evaluated LLMs by RAG settings in a domain-specific context, college enrollment. We identified six required abilities for RAG models, including the ability in conversational RAG, analyzing structural information, faithfulness to external knowledge, denoising, solving time-sensitive problems, and understanding multi-document interactions. Each ability has an associated dataset with shared corpora to evaluate the RAG models' performance. We evaluated popular LLMs such as Llama, Baichuan, ChatGLM, and GPT models. Experimental results indicate that existing closed-book LLMs struggle with domain-specific questions, highlighting the need for RAG models to solve expert problems. Moreover, there is room for RAG models to improve their abilities in comprehending conversational history, analyzing structural information, denoising, processing multi-document interactions, and faithfulness in expert knowledge. We expect future studies could solve these problems better.
Submitted 16 June, 2024; v1 submitted 9 June, 2024;
originally announced June 2024.
-
Autoregressive Diffusion Transformer for Text-to-Speech Synthesis
Authors:
Zhijun Liu,
Shuai Wang,
Sho Inoue,
Qibing Bai,
Haizhou Li
Abstract:
Audio language models have recently emerged as a promising approach for various audio generation tasks, relying on audio tokenizers to encode waveforms into sequences of discrete symbols. Audio tokenization often poses a necessary compromise between code bitrate and reconstruction accuracy. When dealing with low-bitrate audio codes, language models are constrained to process only a subset of the information embedded in the audio, which in turn restricts their generative capabilities. To circumvent these issues, we propose encoding audio as vector sequences in continuous space $\mathbb R^d$ and autoregressively generating these sequences using a decoder-only diffusion transformer (ARDiT). Our findings indicate that ARDiT excels in zero-shot text-to-speech and exhibits performance that compares to or even surpasses that of state-of-the-art models. High-bitrate continuous speech representation enables almost flawless reconstruction, allowing our model to achieve nearly perfect speech editing. Our experiments reveal that employing Integral Kullback-Leibler (IKL) divergence for distillation at each autoregressive step significantly boosts the perceived quality of the samples. Simultaneously, it condenses the iterative sampling process of the diffusion model into a single step. Furthermore, ARDiT can be trained to predict several continuous vectors in one step, significantly reducing latency during sampling. Impressively, one of our models can generate $170$ ms of $24$ kHz speech per evaluation step with minimal degradation in performance. Audio samples are available at http://ardit-tts.github.io/ .
Submitted 8 June, 2024;
originally announced June 2024.
-
Verbalized Probabilistic Graphical Modeling with Large Language Models
Authors:
Hengguan Huang,
Xing Shen,
Songtao Wang,
Dianbo Liu,
Hao Wang
Abstract:
Faced with complex problems, the human brain demonstrates a remarkable capacity to transcend sensory input and form latent understandings of perceived world patterns. However, this cognitive capacity is not explicitly considered or encoded in current large language models (LLMs). As a result, LLMs often struggle to capture latent structures and model uncertainty in complex compositional reasoning tasks. This work introduces a novel Bayesian prompting approach that facilitates training-free Bayesian inference with LLMs by using a verbalized Probabilistic Graphical Model (PGM). While traditional Bayesian approaches typically depend on extensive data and predetermined mathematical structures for learning latent factors and dependencies, our approach efficiently reasons latent variables and their probabilistic dependencies by prompting LLMs to adhere to Bayesian principles. We evaluated our model on several compositional reasoning tasks, both close-ended and open-ended. Our results indicate that the model effectively enhances confidence elicitation and text generation quality, demonstrating its potential to improve AI language understanding systems, especially in modeling uncertainty.
Submitted 8 June, 2024;
originally announced June 2024.
-
SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner
Authors:
Xunguang Wang,
Daoyuan Wu,
Zhenlan Ji,
Zongjie Li,
**chuan Ma,
Shuai Wang,
Yingjiu Li,
Yang Liu,
Ning Liu,
Juergen Rahmel
Abstract:
Jailbreaking is an emerging adversarial attack that bypasses the safety alignment deployed in off-the-shelf large language models (LLMs) and has evolved into four major categories: optimization-based attacks such as Greedy Coordinate Gradient (GCG), jailbreak template-based attacks such as "Do-Anything-Now", advanced indirect attacks like DrAttack, and multilingual jailbreaks. However, delivering a practical jailbreak defense is challenging because it needs to not only handle all the above jailbreak attacks but also incur negligible delay to user prompts, as well as be compatible with both open-source and closed-source LLMs.
Inspired by how the traditional security concept of shadow stacks defends against memory overflow attacks, this paper introduces a generic LLM jailbreak defense framework called SelfDefend, which establishes a shadow LLM defense instance to concurrently protect the target LLM instance in the normal stack and collaborate with it for checkpoint-based access control. The effectiveness of SelfDefend builds upon our observation that existing LLMs (both target and defense LLMs) have the capability to identify harmful prompts or intentions in user queries, which we empirically validate using the commonly used GPT-3.5/4 models across all major jailbreak attacks. Our measurements show that SelfDefend enables GPT-3.5 to suppress the attack success rate (ASR) by 8.97-95.74% (average: 60%) and GPT-4 by even 36.36-100% (average: 83%), while incurring negligible effects on normal queries. To further improve the defense's robustness and minimize costs, we employ a data distillation approach to tune dedicated open-source defense models. These models outperform four SOTA defenses and match the performance of GPT-4-based SelfDefend, with significantly lower extra delays. We also empirically show that the tuned models are robust to targeted GCG and prompt injection attacks.
Submitted 8 June, 2024;
originally announced June 2024.
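The shadow-instance idea described in the SelfDefend abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `target_llm` and `shadow_llm_flags` are hypothetical stand-ins (a real defense instance would be an LLM judging the prompt, not a string match), and the refusal text is invented. The sketch only shows the control flow: both instances run concurrently, and the target's answer is released only if the shadow check passes.

```python
from concurrent.futures import ThreadPoolExecutor


def target_llm(prompt):
    # Hypothetical stand-in for the protected target LLM instance.
    return f"answer to: {prompt}"


def shadow_llm_flags(prompt):
    # Hypothetical stand-in for the shadow defense instance.
    # A real defense would ask an LLM whether the prompt is harmful.
    return "ignore all previous instructions" in prompt.lower()


def guarded_answer(prompt):
    # Run the target and the shadow check concurrently so the check
    # adds little extra delay to normal queries.
    with ThreadPoolExecutor(max_workers=2) as pool:
        answer = pool.submit(target_llm, prompt)
        flagged = pool.submit(shadow_llm_flags, prompt)
        # Checkpoint: release the answer only if the shadow check passes.
        if flagged.result():
            return "Request refused."
        return answer.result()
```

Calling `guarded_answer` with a benign prompt returns the target's answer unchanged, while a flagged prompt yields the refusal string regardless of what the target produced.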
-
MatrixGate: A High-performance Data Ingestion Tool for Time-series Databases
Authors:
Shuhui Wang,
Zihan Sun,
Chaochen Hu,
Chao Li,
Yong Zhang,
Yandong Yao,
Hao Wang,
Chunxiao Xing
Abstract:
Recent years have seen massive time-series data generated in many areas. This scenario brings new challenges, particularly in data ingestion, where existing technologies struggle to handle such massive time-series data, leading to low loading speed and poor timeliness. To address these challenges, this paper presents MatrixGate, a new and efficient data ingestion approach for massive time-series data. MatrixGate implements both single-instance and multi-instance parallel procedures, which are based on its unique ingestion strategies. First, MatrixGate uses policies to tune the slots that are synchronized with segments to ingest data, which eliminates the cost of starting transactions and enhances efficiency. Second, multiple coroutines are responsible for transferring data, which can significantly increase the degree of parallelism. Third, lock-free queues are used to enable direct data transfer without the need for disk storage or lodging in the master instance. Experimental results on multiple datasets show that MatrixGate outperforms state-of-the-art methods by 3 to 100 times in loading speed and cuts query latency by about 80%. Furthermore, MatrixGate scales out efficiently under a distributed architecture, achieving scalability of 86%.
Submitted 8 June, 2024;
originally announced June 2024.
-
Multiple peaks in gravitational waves induced from primordial curvature perturbations with non-Gaussianity
Authors:
Xiang-Xi Zeng,
Rong-Gen Cai,
Shao-Jiang Wang
Abstract:
First-order primordial curvature perturbations are known to induce gravitational waves at second order, which can in turn probe the small-scale curvature perturbations near the end of inflation. In this work, we extend the previous analysis of the Gaussian case to the non-Gaussian case, with particular effort to obtain some rules of thumb for sandwiching the associated peaks in gravitational waves induced from multiple peaks of non-Gaussian curvature perturbations.
Submitted 7 June, 2024;
originally announced June 2024.
-
RU-AI: A Large Multimodal Dataset for Machine Generated Content Detection
Authors:
Liting Huang,
Zhihao Zhang,
Yiran Zhang,
Xiyue Zhou,
Shou** Wang
Abstract:
The recent advancements in generative AI models, which can create realistic and human-like content, are significantly transforming how people communicate, create, and work. While the appropriate use of generative AI models can benefit the society, their misuse poses significant threats to data reliability and authentication. However, due to a lack of aligned multimodal datasets, effective and robust methods for detecting machine-generated content are still in the early stages of development. In this paper, we introduce RU-AI, a new large-scale multimodal dataset designed for the robust and efficient detection of machine-generated content in text, image, and voice. Our dataset is constructed from three large publicly available datasets: Flickr8K, COCO, and Places205, by combining the original datasets and their corresponding machine-generated pairs. Additionally, experimental results show that our proposed unified model, which incorporates a multimodal embedding module with a multilayer perceptron network, can effectively determine the origin of the data (i.e., original data samples or machine-generated ones) from RU-AI. However, future work is still required to address the remaining challenges posed by RU-AI. The source code and dataset are available at https://github.com/ZhihaoZhang97/RU-AI.
Submitted 7 June, 2024;
originally announced June 2024.
-
3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views
Authors:
Xiaobiao Du,
Haiyang Sun,
Shuyun Wang,
Zhuojie Wu,
Hongwei Sheng,
Jiaying Ying,
Ming Lu,
Tianqing Zhu,
Kun Zhan,
Xin Yu
Abstract:
3D cars are commonly used in self-driving systems, virtual/augmented reality, and games. However, existing 3D car datasets are either synthetic or low-quality, presenting a significant gap toward the high-quality real-world 3D car datasets and limiting their applications in practical scenarios. In this paper, we propose the first large-scale 3D real car dataset, termed 3DRealCar, offering three distinctive features. (1) High-Volume: 2,500 cars are meticulously scanned by 3D scanners, obtaining car images and point clouds with real-world dimensions; (2) High-Quality: Each car is captured in an average of 200 dense, high-resolution 360-degree RGB-D views, enabling high-fidelity 3D reconstruction; (3) High-Diversity: The dataset contains various cars from over 100 brands, collected under three distinct lighting conditions, including reflective, standard, and dark. Additionally, we offer detailed car parsing maps for each instance to promote research in car parsing tasks. Moreover, we remove background point clouds and standardize the car orientation to a unified axis for the reconstruction only on cars without background and controllable rendering. We benchmark 3D reconstruction results with state-of-the-art methods across each lighting condition in 3DRealCar. Extensive experiments demonstrate that the standard lighting condition part of 3DRealCar can be used to produce a large number of high-quality 3D cars, improving various 2D and 3D tasks related to cars. Notably, our dataset brings insight into the fact that recent 3D reconstruction methods face challenges in reconstructing high-quality 3D cars under reflective and dark lighting conditions. Our dataset is available at https://xiaobiaodu.github.io/3drealcar/ .
Submitted 7 June, 2024;
originally announced June 2024.
-
RiskMap: A Unified Driving Context Representation for Autonomous Motion Planning in Urban Driving Environment
Authors:
Ren Xin,
Sheng Wang,
Yingbing Chen,
Jie Cheng,
Ming Liu
Abstract:
Planning is complicated by the combination of perception and map information, particularly when driving in heavy traffic. Developing an extendable and efficient representation that visualizes sensor noise and provides constraints to real-time planning tasks is desirable. We aim to develop an extendable map representation offering prior to cost in planning tasks to simplify the planning process of dealing with complex driving scenarios and visualize sensor noise. In this paper, we illustrate a unified context representation empowered by a modern deep learning motion prediction model, representing statistical cognition of motion prediction for human beings. A sampling-based planner is adopted to train and compare the difference in risk map generation methods. The training tools and model structures are investigated illustrating their efficiency in this task.
Submitted 6 June, 2024;
originally announced June 2024.
-
A Comprehensive Study of Quantum Arithmetic Circuits
Authors:
Siyi Wang,
Xiufan Li,
Wei Jie Bryan Lee,
Suman Deb,
Eugene Lim,
Anupam Chattopadhyay
Abstract:
In recent decades, the field of quantum computing has experienced remarkable progress. This progress is marked by the superior performance of many quantum algorithms compared to their classical counterparts, with Shor's algorithm serving as a prominent illustration. Quantum arithmetic circuits, which are the fundamental building blocks in numerous quantum algorithms, have attracted much attention. Despite extensive exploration of various designs in the existing literature, researchers remain keen on developing novel designs and improving existing ones.
In this review article, we aim to provide a systematically organized and easily comprehensible overview of the current state-of-the-art in quantum arithmetic circuits. Specifically, this study covers fundamental operations such as addition, subtraction, multiplication, division and modular exponentiation. We delve into the detailed quantum implementations of these prominent designs and evaluate their efficiency considering various objectives. We also discuss potential applications of presented arithmetic circuits and suggest future research directions.
Submitted 6 June, 2024;
originally announced June 2024.
-
Optical biomarker of metabolism for breast tumor diagnosis: Insights from subcellular dynamics
Authors:
Zichen Yin,
Shuwei Zhang,
Bin He,
Houpu Yang,
Zhengyu Chen,
Zhangwei Hu,
Yejiong Shi,
Ruizhi Xue,
Panqi Yang,
Yuzhe Ying,
Chengming Wang,
Shu Wang,
** Xue
Abstract:
Label-free metabolic dynamics contrast is highly appealing but difficult to achieve in biomedical imaging. Interference offers a highly sensitive mechanism for capturing the metabolic dynamics of the subcellular scatterers. However, traditional interference detection methods fail to isolate pure metabolic dynamics, as the dynamic signals are coupled with scatterer reflectivity and other uncontrollable imaging factors. Here, we demonstrate active phase modulation-assisted dynamic full-field optical coherence tomography (APMD-FFOCT) that decouples and quantifies the metabolic dynamics by adding a reference movement for all interferential scatterers. This novel technique enables imaging and dynamic analysis of subcellular structures along with their changes during the apoptotic process in tumor tissues. Furthermore, the nucleus-to-cytoplasm dynamic intensity ratio could serve as an optical biomarker for breast tumor grading, enhancing intraoperative diagnosis.
Submitted 6 June, 2024;
originally announced June 2024.
-
FastGAS: Fast Graph-based Annotation Selection for In-Context Learning
Authors:
Zihan Chen,
Song Wang,
Cong Shen,
Jundong Li
Abstract:
In-context learning (ICL) empowers large language models (LLMs) to tackle new tasks by using a series of training instances as prompts. Since generating the prompts needs to sample from a vast pool of instances and annotate them (e.g., add labels in classification task), existing methods have proposed to select a subset of unlabeled examples for annotation, thus enhancing the quality of prompts and concurrently mitigating annotation costs. However, these methods often require a long time to select instances due to their complexity, hindering their practical viability. To address this limitation, we propose a graph-based selection method, FastGAS, designed to efficiently identify high-quality instances while minimizing computational overhead. Initially, we construct a data similarity graph based on instance similarities. Subsequently, employing a graph partitioning algorithm, we partition the graph into pieces. Within each piece (i.e., subgraph), we adopt a greedy approach to pick the most representative nodes. By aggregating nodes from diverse pieces and annotating the corresponding instances, we identify a set of diverse and representative instances for ICL. Compared to prior approaches, our method not only exhibits superior performance on different tasks but also significantly reduces selection time. In addition, we demonstrate the efficacy of our approach in LLMs of larger sizes.
Submitted 6 June, 2024;
originally announced June 2024.
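The three steps named in the FastGAS abstract above (build a similarity graph, partition it, greedily pick representative nodes in each piece) can be sketched as follows. This is not the authors' implementation: the k-nearest-neighbour graph, the use of connected components as a stand-in for the graph partitioning algorithm, the max-coverage greedy rule, and all function names and parameters (`k`, `per_piece`) are assumptions chosen to keep the example small.

```python
import math
from collections import deque


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def knn_graph(vectors, k):
    # Undirected k-nearest-neighbour graph stored as adjacency sets.
    n = len(vectors)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(vectors[i], vectors[j]),
            reverse=True,
        )
        for j in ranked[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def partition(adj):
    # Assumed stand-in partitioner: connected components found by BFS.
    seen, pieces = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, piece = deque([start]), []
        while queue:
            node = queue.popleft()
            piece.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        pieces.append(piece)
    return pieces


def greedy_pick(adj, piece, budget):
    # Greedy max-coverage: pick the node that covers the most
    # not-yet-covered nodes (itself plus its neighbours).
    uncovered, chosen = set(piece), []
    while uncovered and len(chosen) < budget:
        best = max(sorted(piece), key=lambda n: len(({n} | adj[n]) & uncovered))
        chosen.append(best)
        uncovered -= {best} | adj[best]
    return chosen


def select_for_annotation(vectors, k=2, per_piece=1):
    # Aggregate the picks from every piece into one annotation set.
    adj = knn_graph(vectors, k)
    selected = []
    for piece in partition(adj):
        selected.extend(greedy_pick(adj, piece, per_piece))
    return sorted(selected)
```

Given embeddings that form two well-separated clusters, `select_for_annotation` returns one index per cluster, which is the diversity-plus-representativeness behaviour the abstract describes. A production partitioner would also need to split large connected graphs, which connected components alone cannot do.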
-
What is the Best Way for ChatGPT to Translate Poetry?
Authors:
Shanshan Wang,
Derek F. Wong,
**gming Yao,
Lidia S. Chao
Abstract:
Machine translation (MT) has historically faced significant challenges when applied to literary works, particularly in the domain of poetry translation. The advent of Large Language Models such as ChatGPT holds potential for innovation in this field. This study examines ChatGPT's capabilities in English-Chinese poetry translation tasks, utilizing targeted prompts and small sample scenarios to ascertain optimal performance. Despite promising outcomes, our analysis reveals persistent issues in the translations generated by ChatGPT that warrant attention. To address these shortcomings, we propose an Explanation-Assisted Poetry Machine Translation (EAPMT) method, which leverages monolingual poetry explanation as guiding information for the translation process. Furthermore, we refine existing evaluation criteria to better suit the nuances of modern poetry translation. We engaged a panel of professional poets for assessments, complemented by evaluations using GPT-4. The results from both human and machine evaluations demonstrate that our EAPMT method outperforms traditional translation methods of ChatGPT and the existing online systems. This paper validates the efficacy of our method and contributes a novel perspective to machine-assisted literary translation.
Submitted 5 June, 2024;
originally announced June 2024.
-
Joint Association, Beamforming, and Resource Allocation for Multi-IRS Enabled MU-MISO Systems With RSMA
Authors:
Chunjie Wang,
Xuhui Zhang,
Huijun Xing,
Liang Xue,
Shuqiang Wang,
Yanyan Shen,
Bo Yang,
** Guan
Abstract:
Intelligent reflecting surface (IRS) and rate-splitting multiple access (RSMA) technologies are at the forefront of enhancing spectrum and energy efficiency in next-generation multi-antenna communication systems. This paper explores an RSMA system with multiple IRSs and proposes two purpose-driven scheduling schemes, i.e., the exhaustive IRS-aided (EIA) and opportunistic IRS-aided (OIA) schemes. The aim is to optimize the system weighted energy efficiency (EE) under each of the two schemes. Specifically, the Dinkelbach, branch and bound, successive convex approximation, and semidefinite relaxation methods are exploited within the alternating optimization framework to obtain effective solutions to the considered problems. The numerical findings indicate that the EIA scheme exhibits better performance compared to the OIA scheme in diverse scenarios when considering the weighted EE, and the proposed algorithm demonstrates superior performance in comparison to the baseline algorithms.
Submitted 5 June, 2024;
originally announced June 2024.
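Of the methods listed in the abstract above, the Dinkelbach method is the one that handles the ratio form of an energy-efficiency objective. The sketch below shows only that building block on a toy problem and is not the paper's algorithm: the single-user rate model, the channel gain of 4, the circuit power of 1, and the finite grid of transmit powers are all assumptions for illustration. The method maximizes N(x)/D(x) by repeatedly maximizing N(x) - lam * D(x) and updating lam to the ratio achieved.

```python
import math


def dinkelbach(candidates, numerator, denominator, tol=1e-9, max_iter=100):
    # Maximize numerator(x) / denominator(x) over a finite candidate set,
    # assuming denominator(x) > 0 for every candidate.
    lam = 0.0
    best = candidates[0]
    for _ in range(max_iter):
        # Solve the parametric subproblem for the current ratio estimate.
        best = max(candidates, key=lambda x: numerator(x) - lam * denominator(x))
        gap = numerator(best) - lam * denominator(best)
        if gap < tol:
            break  # lam already equals the best achievable ratio
        lam = numerator(best) / denominator(best)
    return best, lam


# Toy energy-efficiency problem (all values assumed for illustration).
powers = [i / 10 for i in range(1, 101)]      # candidate transmit powers


def rate(p):
    return math.log2(1.0 + 4.0 * p)           # rate with channel gain 4


def consumed(p):
    return p + 1.0                            # transmit plus circuit power


best_power, best_ee = dinkelbach(powers, rate, consumed)
```

On a finite candidate set the result matches a brute-force search over the ratio; the value of the method in practice is that each subproblem is a difference rather than a ratio, which is typically easier to optimize over continuous variables.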
-
Calibrated absolute optical contrast for high-throughput characterization of horizontally aligned carbon nanotube arrays
Authors:
Yue Li,
Ying Xie,
Jian** Wang,
Yang Xu,
Shurui Wang,
Yunbiao Zhao,
Liu Qian,
Ziqiang Zhao,
** Zhang
Abstract:
Horizontally aligned carbon nanotube (HACNT) arrays hold significant potential for various applications in nanoelectronics and material science. However, their high-throughput characterization remains challenging due to the lack of methods with both high efficiency and high accuracy. Here, we present a novel technique, Calibrated Absolute Optical Contrast (CAOC), achieved through the implementation of differential principles to filter out stray signals and high-resolution calibration to endow optical contrast with physical significance. CAOC offers major advantages over previous characterization techniques, providing consistent and reliable measurements of HACNT array density with high throughput and non-destructive assessment. To validate its utility, we demonstrate wafer-scale uniformity assessment by rapid density mapping. This technique not only facilitates the practical evaluation of HACNT arrays but also provides insights into balancing high throughput and high resolution in nanomaterial characterization.
Submitted 5 June, 2024;
originally announced June 2024.
-
Measurements of the branching fractions of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^-π^0/η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Based on $(2712.4\pm 14.3)\times10^{6}$ $ψ(3686)$ events, we investigate four hadronic decay modes of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^- π^0/η$ ($h=π$ or $K$) via the process $ψ(3686) \to π^{0}h_c$ at BESIII. The $h_c \to π^+ π^- π^0$ decay is observed with a significance of 9.6$σ$ after taking into account systematic uncertainties. Evidences for $h_c \to K^+ K^- π^0$ and $h_c \to K^+ K^- η$ are found with significances of $3.5σ$ and $3.3σ$, respectively, after considering the systematic uncertainties. The branching fractions of these decays are measured to be $\mathcal{B}(h_c \to π^+ π^- π^0)=(1.36\pm0.16\pm0.14)\times10^{-3}$, $\mathcal{B}(h_c \to K^+ K^- π^0)=(3.26\pm0.84\pm0.36)\times10^{-4}$, and $\mathcal{B}(h_c \to K^+ K^- η)=(3.13\pm1.08\pm0.38)\times10^{-4}$, where the first uncertainties are statistical and the second are systematic. No significant signal of $h_c\toπ^+π^-η$ is found, and the upper limit of its decay branching fraction is determined to be $\mathcal{B}(h_c\toπ^+π^-η) < 4.0 \times 10^{-4}$ at 90% confidence level.
Submitted 5 June, 2024;
originally announced June 2024.
-
Controllable Talking Face Generation by Implicit Facial Keypoints Editing
Authors:
Dong Zhao,
Jiaying Shi,
Wenjun Li,
Shudong Wang,
Shenghui Xu,
Zhaoming Pan
Abstract:
Audio-driven talking face generation has garnered significant interest within the domain of digital human research. Existing methods are encumbered by intricate model architectures that are intricately dependent on each other, complicating the process of re-editing image or video inputs. In this work, we present ControlTalk, a talking face generation method to control face expression deformation based on driven audio, which can construct the head pose and facial expression including lip motion for both single image or sequential video inputs in a unified manner. By utilizing a pre-trained video synthesis renderer and proposing the lightweight adaptation, ControlTalk achieves precise and naturalistic lip synchronization while enabling quantitative control over mouth opening shape. Our experiments show that our method is superior to state-of-the-art performance on widely used benchmarks, including HDTF and MEAD. The parameterized adaptation demonstrates remarkable generalization capabilities, effectively handling expression deformation across same-ID and cross-ID scenarios, and extending its utility to out-of-domain portraits, regardless of languages.
Submitted 4 June, 2024;
originally announced June 2024.
-
Local control and mixed dimensions: Exploring high-temperature superconductivity in optical lattices
Authors:
Henning Schlömer,
Hannah Lange,
Titus Franz,
Thomas Chalopin,
Petar Bojović,
Si Wang,
Immanuel Bloch,
Timon A. Hilker,
Fabian Grusdt,
Annabelle Bohrdt
Abstract:
The simulation of high-temperature superconducting materials by implementing strongly correlated fermionic models in optical lattices is one of the major objectives in the field of analog quantum simulation. Here we show that local control and optical bilayer capabilities create a versatile toolbox to study both nickelate and cuprate high-temperature superconductors. On the one hand, we present a scheme to implement a mixed-dimensional (mixD) bilayer model that has been proposed to capture the essential pairing physics of pressurized bilayer nickelates. This allows for the long-sought realization of a state with long-range superconducting order in current lattice quantum simulation machines. In particular, we show how coherent pairing correlations can be accessed in a partially particle-hole transformed and rotated basis. On the other hand, we demonstrate that control of local gates enables the observation of $d$-wave pairing order in the two-dimensional (single-layer) repulsive Fermi-Hubbard model through the simulation of a system with attractive interactions. Lastly, we introduce a scheme to measure momentum-resolved dopant densities, providing access to observables complementary to solid-state experiments -- which is of particular interest for future studies of the enigmatic pseudogap phase appearing in cuprates.
Submitted 4 June, 2024;
originally announced June 2024.
-
Kolmogorov-Arnold Networks for Time Series: Bridging Predictive Power and Interpretability
Authors:
Kunpeng Xu,
Lifei Chen,
Shengrui Wang
Abstract:
Kolmogorov-Arnold Networks (KAN) is a groundbreaking model recently proposed by the MIT team, representing a revolutionary approach with the potential to be a game-changer in the field. This innovative concept has rapidly garnered worldwide interest within the AI community. Inspired by the Kolmogorov-Arnold representation theorem, KAN utilizes spline-parametrized univariate functions in place of traditional linear weights, enabling them to dynamically learn activation patterns and significantly enhancing interpretability. In this paper, we explore the application of KAN to time series forecasting and propose two variants: T-KAN and MT-KAN. T-KAN is designed to detect concept drift within time series and can explain the nonlinear relationships between predictions and previous time steps through symbolic regression, making it highly interpretable in dynamically changing environments. MT-KAN, on the other hand, improves predictive performance by effectively uncovering and leveraging the complex relationships among variables in multivariate time series. Experiments validate the effectiveness of these approaches, demonstrating that T-KAN and MT-KAN significantly outperform traditional methods in time series forecasting tasks, not only enhancing predictive accuracy but also improving model interpretability. This research opens new avenues for adaptive forecasting models, highlighting the potential of KAN as a powerful and interpretable tool in predictive analytics.
Submitted 4 June, 2024;
originally announced June 2024.
-
PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection
Authors:
Ronghui Xu,
Hao Miao,
Senzhang Wang,
Philip S. Yu,
Jianxin Wang
Abstract:
With the proliferation of mobile sensing techniques, huge amounts of time series data are generated and accumulated in various domains, fueling plenty of real-world applications. In this setting, time series anomaly detection is practically important. It endeavors to identify deviant samples from the normal sample distribution in time series. Existing approaches generally assume that all the time series are available at a central location. However, we are witnessing the decentralized collection of time series due to the deployment of various edge devices. To bridge the gap between the decentralized time series data and the centralized anomaly detection algorithms, and in light of increasing privacy concerns, we propose a Parameter-efficient Federated Anomaly Detection framework named PeFAD. PeFAD for the first time employs a pre-trained language model (PLM) as the body of the client's local model, which can benefit from its cross-modality knowledge transfer capability. To reduce the communication overhead and local model adaptation cost, we propose a parameter-efficient federated training module such that clients only need to fine-tune small-scale parameters and transmit them to the server for update. PeFAD utilizes a novel anomaly-driven mask selection strategy to mitigate the impact of neglected anomalies during training. A knowledge distillation operation on a synthetic privacy-preserving dataset that is shared by all the clients is also proposed to address the data heterogeneity issue across clients. We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74\%.
Submitted 4 June, 2024;
originally announced June 2024.
-
Traffic Response Functions: Patterns, Propagation and Congestion
Authors:
Sebastian Gartzke,
Shanshan Wang,
Thomas Guhr,
Michael Schreckenberg
Abstract:
Using empirical data gathered on motorways in Germany, we follow a new approach by further exploring response functions as a possible tool to study traffic dynamics in motorway networks. We uncover the basic characteristics of responses of flow and density to given signals and the capability of responses to capture the correlation between these fundamental observables. Furthermore, we uncover the potential use of responses to characterize traffic patterns. We are able to demonstrate the differentiation of congestion patterns and the determination of the propagation velocity of moving congestion.
Submitted 4 June, 2024;
originally announced June 2024.
-
Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts
Authors:
Haodong Hong,
Sen Wang,
Zi Huang,
Qi Wu,
Jiajun Liu
Abstract:
Current Vision-and-Language Navigation (VLN) tasks mainly employ textual instructions to guide agents. However, being inherently abstract, the same textual instruction can be associated with different visual signals, causing severe ambiguity and limiting the transfer of prior knowledge in the vision domain from the user to the agent. To fill this gap, we propose Vision-and-Language Navigation with Multi-modal Prompts (VLN-MP), a novel task augmenting traditional VLN by integrating both natural language and images in instructions. VLN-MP not only maintains backward compatibility by effectively handling text-only prompts but also consistently shows advantages with different quantities and relevance of visual prompts. Possible forms of visual prompts include both exact and similar object images, providing adaptability and versatility in diverse navigation scenarios. To evaluate VLN-MP under a unified framework, we implement a new benchmark that offers: (1) a training-free pipeline to transform textual instructions into multi-modal forms with landmark images; (2) diverse datasets with multi-modal instructions for different downstream tasks; (3) a novel module designed to process various image prompts for seamless integration with state-of-the-art VLN models. Extensive experiments on four VLN benchmarks (R2R, RxR, REVERIE, CVDN) show that incorporating visual prompts significantly boosts navigation performance. While maintaining efficiency with text-only prompts, VLN-MP enables agents to navigate in the pre-explore setting and outperform text-based models, showing its broader applicability.
Submitted 4 June, 2024;
originally announced June 2024.
-
LongSSM: On the Length Extension of State-space Models in Language Modelling
Authors:
Shida Wang
Abstract:
In this paper, we investigate the length extension of state-space models (SSMs) in language modeling. Length extension involves training models on short sequences and testing them on longer ones. We show that state-space models trained with zero hidden-state initialization have difficulty with length extension. We explain this difficulty by pointing out that length extension is equivalent to polynomial extrapolation. Based on the theory, we propose a simple yet effective method, changing the hidden-state initialization scheme, to improve length extension. Moreover, our method shows that using a long training sequence length is beneficial but not necessary for length extension. Changing the hidden-state initialization enables the efficient training of long-memory models with a smaller training context length.
Submitted 4 June, 2024;
originally announced June 2024.
-
DrEureka: Language Model Guided Sim-To-Real Transfer
Authors:
Yecheng Jason Ma,
William Liang,
Hung-Ju Wang,
Sam Wang,
Yuke Zhu,
Linxi Fan,
Osbert Bastani,
Dinesh Jayaraman
Abstract:
Transferring policies learned in simulation to the real world is a promising strategy for acquiring robot skills at scale. However, sim-to-real approaches typically rely on manual design and tuning of the task reward function as well as the simulation physics parameters, rendering the process slow and human-labor intensive. In this paper, we investigate using Large Language Models (LLMs) to automate and accelerate sim-to-real design. Our LLM-guided sim-to-real approach, DrEureka, requires only the physics simulation for the target task and automatically constructs suitable reward functions and domain randomization distributions to support real-world transfer. We first demonstrate that our approach can discover sim-to-real configurations that are competitive with existing human-designed ones on quadruped locomotion and dexterous manipulation tasks. Then, we showcase that our approach is capable of solving novel robot tasks, such as quadruped balancing and walking atop a yoga ball, without iterative manual design.
Submitted 4 June, 2024;
originally announced June 2024.
-
Optimal Transport Guided Correlation Assignment for Multimodal Entity Linking
Authors:
Zefeng Zhang,
Jiawei Sheng,
Chuang Zhang,
Yunzhi Liang,
Wenyuan Zhang,
Siqi Wang,
Tingwen Liu
Abstract:
Multimodal Entity Linking (MEL) aims to link ambiguous mentions in multimodal contexts to entities in a multimodal knowledge graph. A pivotal challenge is to fully leverage multi-element correlations between mentions and entities to bridge the modality gap and enable fine-grained semantic matching. Existing methods attempt several local correlative mechanisms, relying heavily on automatically learned attention weights, which may over-concentrate on partial correlations. To mitigate this issue, we formulate the correlation assignment problem as an optimal transport (OT) problem, and propose a novel MEL framework, namely OT-MEL, with OT-guided correlation assignment. Thereby, we exploit the correlation between multimodal features to enhance multimodal fusion, and the correlation between mentions and entities to enhance fine-grained matching. To accelerate model prediction, we further leverage knowledge distillation to transfer OT assignment knowledge to the attention mechanism. Experimental results show that our model significantly outperforms previous state-of-the-art baselines and confirm the effectiveness of the OT-guided correlation assignment.
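To make the idea of treating correlation assignment as an optimal transport problem more concrete, the following is a minimal sketch that is not taken from the paper. It assumes an entropic-regularized OT problem solved with generic Sinkhorn iterations; the cost matrix, the uniform marginals, the `reg` value, and the function name `sinkhorn` are all illustrative assumptions, and the abstract does not state which cost or solver OT-MEL actually uses.

```python
import numpy as np


def sinkhorn(cost, a, b, reg=0.5, iters=200):
    """Entropic-regularized OT plan between marginals a and b (generic Sinkhorn).

    Hypothetical helper for illustration only; not the OT-MEL implementation.
    """
    K = np.exp(-cost / reg)          # Gibbs kernel derived from the cost matrix
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(iters):
        u = a / (K @ v)              # rescale rows towards marginal a
        v = b / (K.T @ u)            # rescale columns towards marginal b
    return u[:, None] * K * v[None, :]


# Assumed toy setup: 3 mention-side elements, 2 entity-side elements,
# with cost = 1 - similarity (values invented for the example).
cost = np.array([[0.1, 0.9],
                 [0.8, 0.2],
                 [0.5, 0.5]])
a = np.full(3, 1.0 / 3.0)            # uniform mass on mention-side elements
b = np.full(2, 0.5)                  # uniform mass on entity-side elements

P = sinkhorn(cost, a, b)
print(P)                             # transport plan: soft correlation assignment
```

Unlike a row-wise softmax over similarities, the transport plan `P` must respect both marginals at once, so mass cannot concentrate on one entity-side element. Under these assumptions, that is one way to read the abstract's concern about attention weights over-concentrating on partial correlations.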
Submitted 5 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
PDHG-Unrolled Learning-to-Optimize Method for Large-Scale Linear Programming
Authors:
Bingheng Li,
Linxin Yang,
Yupeng Chen,
Senmiao Wang,
Qian Chen,
Haitao Mao,
Yao Ma,
Akang Wang,
Tian Ding,
Jiliang Tang,
Ruoyu Sun
Abstract:
Solving large-scale linear programming (LP) problems is an important task in various areas such as communication networks, power systems, finance and logistics. Recently, two distinct approaches have emerged to expedite LP solving: (i) first-order methods (FOMs); (ii) learning to optimize (L2O). In this work, we propose an FOM-unrolled neural network (NN) called PDHG-Net, and propose a two-stage L2O method to solve large-scale LP problems. The new architecture PDHG-Net is designed by unrolling the recently emerged PDHG method into a neural network, combined with channel-expansion techniques borrowed from graph neural networks. We prove that the proposed PDHG-Net can recover the PDHG algorithm and can thus approximate optimal solutions of LP instances with a polynomial number of neurons. We propose a two-stage inference approach: first use PDHG-Net to generate an approximate solution, and then apply the PDHG algorithm to further improve the solution. Experiments show that our approach can significantly accelerate LP solving, achieving up to a 3$\times$ speedup compared to FOMs for large-scale LP problems.
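For readers unfamiliar with the PDHG iteration that the network unrolls, the following is a minimal sketch of plain PDHG on a standard-form LP, $\min c^\top x$ subject to $Ax=b$, $x\ge 0$. It is not the PDHG-Net architecture and does not reproduce the paper's two-stage method; the function name `pdhg_lp`, the step-size choice, the iteration count, and the toy problem are assumptions made for illustration.

```python
import numpy as np


def pdhg_lp(c, A, b, iters=2000):
    """Approximately solve  min c@x  s.t.  A@x = b, x >= 0  with plain PDHG.

    Hypothetical helper for illustration only; not the PDHG-Net model.
    """
    m, n = A.shape
    x = np.zeros(n)                  # primal iterate
    y = np.zeros(m)                  # dual iterate (multipliers for A@x = b)
    # Equal step sizes chosen so that tau * sigma * ||A||_2^2 < 1.
    tau = sigma = 0.9 / np.linalg.norm(A, 2)
    for _ in range(iters):
        # Primal step: gradient step on the Lagrangian, then project onto x >= 0.
        x_new = np.maximum(x - tau * (c - A.T @ y), 0.0)
        # Dual step evaluated at the extrapolated point 2*x_new - x.
        y = y + sigma * (b - A @ (2.0 * x_new - x))
        x = x_new
    return x, y


# Assumed toy problem: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0.
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

x, y = pdhg_lp(c, A, b)
print(x, c @ x)                      # approximate solution and objective value
```

Each loop iteration uses only matrix-vector products and a projection. That property is what makes it natural to unroll a fixed number of such iterations into network layers, and then to run the classical iteration from the predicted starting point, as the abstract's two-stage approach describes.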
Submitted 6 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Simulation of DAMPE silicon microstrip detectors in the $\rm Allpix^{2}$ framework
Authors:
Yu-Xin Cui,
Xiang Li,
Shen Wang,
Chuan Yue,
Qiang Wan,
Shi-Jun Lei,
Guan-Wen Yuan,
Yi-Ming Hu,
Jia-Ju Wei,
Jian-Hua Guo
Abstract:
Silicon strip detectors have been widely utilized in space experiments for gamma-ray and cosmic-ray detections thanks to their high spatial resolution and stable performance. For a silicon micro-strip detector, the Monte Carlo simulation is recognized as a practical and cost-effective approach to verify the detector performance. In this study, a technique for the simulation of the silicon micro-strip detector with the $\rm Allpix^{2}$ framework is developed. By incorporating the electric field into the particle transport simulation based on Geant4, this framework could precisely emulate the carrier drift in the silicon micro-strip detector. The simulation results are validated using the beam test data as well as the flight data of the DAMPE experiment, which suggests that the $\rm Allpix^{2}$ framework is a powerful tool to obtain the performance of the silicon micro-strip detector.
Submitted 3 June, 2024;
originally announced June 2024.
-
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
Authors:
Xutong Liu,
Siwei Wang,
**hang Zuo,
Han Zhong,
Xuchuang Wang,
Zhiyong Wang,
Shuai Li,
Mohammad Hajiesmaili,
John C. S. Lui,
Wei Chen
Abstract:
We introduce a novel framework of combinatorial multi-armed bandits (CMAB) with multivariant and probabilistically triggering arms (CMAB-MT), where the outcome of each arm is a $d$-dimensional multivariant random variable and the feedback follows a general arm triggering process. Compared with existing CMAB works, CMAB-MT not only enhances the modeling power but also allows improved results by leveraging distinct statistical properties for multivariant random variables. For CMAB-MT, we propose a general 1-norm multivariant and triggering probability-modulated smoothness condition, and an optimistic CUCB-MT algorithm built upon this condition. Our framework can include many important problems as applications, such as episodic reinforcement learning (RL) and probabilistic maximum coverage for goods distribution, all of which meet the above smoothness condition and achieve matching or improved regret bounds compared to existing works. Through our new framework, we build the first connection between the episodic RL and CMAB literature, by offering a new angle to solve the episodic RL through the lens of CMAB, which may encourage more interactions between these two important directions.
Submitted 3 June, 2024;
originally announced June 2024.
-
Measurements of the branching fractions of semileptonic $D^{+}_s$ decays via $e^+e^-\to D_s^{*+}D_s^{*-}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
We measure the absolute branching fractions of semileptonic $D^+_s$ decays via the $e^+e^-\to D_s^{*+}D_s^{*-}$ process using $e^+e^-$ collision data corresponding to an integrated luminosity of $10.64~\mathrm{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies between 4.237 and 4.699 GeV. The branching fractions are ${\mathcal B}(D_s^+\to ηe^+ν_e)=(2.35\pm0.11_{\rm stat}\pm 0.10_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to η^\prime e^+ν_e)=(0.82\pm0.09_{\rm stat}\pm 0.04_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to φe^+ν_e)=(2.21\pm0.16_{\rm stat}\pm 0.11_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to f_0(980) e^+ν_e,f_0(980)\toπ^+π^-)=(0.15\pm0.02_{\rm stat}\pm 0.01_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to K^0 e^+ν_e)=(0.24\pm0.04_{\rm stat}\pm 0.01_{\rm syst})\%,$ and ${\mathcal B}(D_s^+\to K^{*0} e^+ν_e)=(0.19\pm0.03_{\rm stat}\pm 0.01_{\rm syst})\%.$ These results are consistent with those measured via the $e^+e^-\to D_s^{*\pm}D_s^{\mp}$ process by BESIII and CLEO. The hadronic transition form factors $D^+_s\to ηe^+ν_e$, $D^+_s\to η^\prime e^+ν_e$, and $D^+_s\to K^0 e^+ν_e$ at four-momentum transfer squared $q^2$ = 0 are determined to be $f^η_+(0) = 0.482 \pm 0.011_{\rm stat} \pm 0.009_{\rm syst}\pm0.004_{\rm input},$ $f^{η^{\prime}}_+(0) = 0.562 \pm 0.031_{\rm stat} \pm 0.014_{\rm syst}\pm0.003_{\rm input},$ and $f^{K^0}_+(0) = 0.624 \pm 0.052_{\rm stat} \pm 0.013_{\rm syst}\pm0.002_{\rm input}.$
Submitted 4 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
EduNLP: Towards a Unified and Modularized Library for Educational Resources
Authors:
Zhenya Huang,
Yuting Ning,
Longhu Qin,
Shiwei Tong,
Shangzi Xue,
Tong Xiao,
Xin Lin,
Jiayu Liu,
Qi Liu,
Enhong Chen,
Shi**g Wang
Abstract:
Educational resource understanding is vital to online learning platforms, which have demonstrated growing applications recently. However, researchers and developers often struggle when using existing general natural language toolkits or domain-specific models. This raises the need for an effective and easy-to-use toolkit that benefits AI education-related research and applications. To bridge this gap, we present a unified, modularized, and extensive library, EduNLP, focusing on educational resource understanding. In the library, we decouple the whole workflow into four key modules with consistent interfaces: data configuration, processing, model implementation, and model evaluation. We also provide a configurable pipeline to unify data usage and model usage in standard ways, which users can customize to their own needs. In the current version, we primarily provide 10 typical models from four categories, and 5 common downstream evaluation tasks in the education domain covering 8 subjects. The project is released at: https://github.com/bigdata-ustc/EduNLP.
Submitted 4 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
LLM and GNN are Complementary: Distilling LLM for Multimodal Graph Learning
Authors:
Junjie Xu,
Zongyu Wu,
Minhua Lin,
Xiang Zhang,
Suhang Wang
Abstract:
Recent progress in Graph Neural Networks (GNNs) has greatly enhanced the ability to model complex molecular structures for predicting properties. Nevertheless, molecular data encompasses more than just graph structures, including textual and visual information that GNNs do not handle well. To bridge this gap, we present an innovative framework that utilizes multimodal molecular data to extract insights from Large Language Models (LLMs). We introduce GALLON (Graph Learning from Large Language Model Distillation), a framework that synergizes the capabilities of LLMs and GNNs by distilling multimodal knowledge into a unified Multilayer Perceptron (MLP). This method integrates the rich textual and visual data of molecules with the structural analysis power of GNNs. Extensive experiments reveal that our distilled MLP model notably improves the accuracy and efficiency of molecular property predictions.
Submitted 3 June, 2024;
originally announced June 2024.
-
Improving Segment Anything on the Fly: Auxiliary Online Learning and Adaptive Fusion for Medical Image Segmentation
Authors:
Tianyu Huang,
Tao Zhou,
Weidi Xie,
Shuo Wang,
Qi Dou,
Yizhe Zhang
Abstract:
The current variants of the Segment Anything Model (SAM), which include the original SAM and Medical SAM, still lack the capability to produce sufficiently accurate segmentation for medical images. In medical imaging contexts, it is not uncommon for human experts to rectify segmentations of specific test samples after SAM generates its segmentation predictions. These rectifications typically entail manual or semi-manual corrections employing state-of-the-art annotation tools. Motivated by this process, we introduce a novel approach that leverages the advantages of online machine learning to enhance Segment Anything (SA) during test time. We employ rectified annotations to perform online learning, with the aim of improving the segmentation quality of SA on medical images. To improve the effectiveness and efficiency of online learning when integrated with large-scale vision models like SAM, we propose a new method called Auxiliary Online Learning (AuxOL). AuxOL creates and applies a small auxiliary model (specialist) in conjunction with SAM (generalist), and entails adaptive online-batch and adaptive segmentation fusion. Experiments conducted on eight datasets covering four medical imaging modalities validate the effectiveness of the proposed method. Our work proposes and validates a new, practical, and effective approach for enhancing SA on downstream segmentation tasks (e.g., medical image segmentation).
Submitted 2 June, 2024;
originally announced June 2024.
-
Cross-Dimensional Medical Self-Supervised Representation Learning Based on a Pseudo-3D Transformation
Authors:
Fei Gao,
Siwen Wang,
Churan Wang,
Fandong Zhang,
Hong-Yu Zhou,
Yizhou Wang,
Gang Yu,
Yizhou Yu
Abstract:
Medical image analysis suffers from a shortage of data, whether annotated or not. This becomes even more pronounced when it comes to 3D medical images. Self-Supervised Learning (SSL) can partially ease this situation by using unlabeled data. However, most existing SSL methods can only make use of data in a single dimensionality (e.g. 2D or 3D), and are incapable of enlarging the training dataset by using data with differing dimensionalities jointly. In this paper, we propose a new cross-dimensional SSL framework based on a pseudo-3D transformation (CDSSL-P3D), that can leverage both 2D and 3D data for joint pre-training. Specifically, we introduce an image transformation based on the im2col algorithm, which converts 2D images into a format consistent with 3D data. This transformation enables seamless integration of 2D and 3D data, and facilitates cross-dimensional self-supervised learning for 3D medical image analysis. We run extensive experiments on 13 downstream tasks, including 2D and 3D classification and segmentation. The results indicate that our CDSSL-P3D achieves superior performance, outperforming other advanced SSL methods.
Submitted 2 June, 2024;
originally announced June 2024.
-
Modeling the refractive index profile n(z) of polar ice for ultra-high energy neutrino experiments
Authors:
S. Ali,
P. Allison,
S. Archambault,
J. J. Beatty,
D. Z. Besson,
A. Bishop,
P. Chen,
Y. C. Chen,
B. A. Clark,
W. Clay,
A. Connolly,
K. Couberly,
L. Cremonesi,
A. Cummings,
P. Dasgupta,
R. Debolt,
S. de Kockere,
K. D. de Vries,
C. Deaconu,
M. A. DuVernois,
J. Flaherty,
E. Friedman,
R. Gaior,
P. Giri,
J. Hanson
, et al. (45 additional authors not shown)
Abstract:
We develop an in-situ index of refraction profile using the transit time of radio signals broadcast from an englacial transmitter to radio-frequency receivers 2-5 km away, deployed at depths up to 200 m. Maxwell's equations generally admit two ray propagation solutions from a given transmitter, corresponding to a direct path (D) and a refracted path (R); the measured D vs. R (dt(D,R)) timing differences provide constraints on the index of refraction profile near the South Pole, where the Askaryan Radio Array (ARA) neutrino observatory is located. We constrain the refractive index profile by simulating D and R ray paths via ray tracing and comparing those to measured dt(D,R) signals. Using previous ice density data as a proxy for n(z), we demonstrate that our data strongly favor a glaciologically-motivated three-phase densification model rather than a single exponential scale height model. Simulations show that the single exponential model overestimates ARA neutrino sensitivity compared to the three-phase model.
Submitted 11 June, 2024; v1 submitted 2 June, 2024;
originally announced June 2024.
-
Maximum-Entropy Regularized Decision Transformer with Reward Relabelling for Dynamic Recommendation
Authors:
Xiaocong Chen,
Siyu Wang,
Lina Yao
Abstract:
Reinforcement learning-based recommender systems have recently gained popularity. However, due to the typical limitations of simulation environments (e.g., data inefficiency), most of the work cannot be broadly applied in all domains. To counter these challenges, recent advancements have leveraged offline reinforcement learning methods, notable for their data-driven approach utilizing offline datasets. A prominent example of this is the Decision Transformer. Despite its popularity, the Decision Transformer approach has inherent drawbacks, particularly evident in recommendation methods based on it. This paper identifies two key shortcomings in existing Decision Transformer-based methods: a lack of stitching capability and limited effectiveness in online adoption. In response, we introduce a novel methodology named Max-Entropy enhanced Decision Transformer with Reward Relabeling for Offline RLRS (EDT4Rec). Our approach begins with a max entropy perspective, leading to the development of a max entropy enhanced exploration strategy. This strategy is designed to facilitate more effective exploration in online environments. Additionally, to augment the model's capability to stitch sub-optimal trajectories, we incorporate a unique reward relabeling technique. To validate the effectiveness and superiority of EDT4Rec, we have conducted comprehensive experiments across six real-world offline datasets and in an online simulator.
Submitted 2 June, 2024;
originally announced June 2024.
-
Gyrokinetic simulation of the spontaneous toroidal rotation of plasma in a stochastic magnetic field
Authors:
**xiang You,
Shaojie Wang
Abstract:
Since the DIII-D resonant magnetic perturbation experiment [Nucl. Fusion $\bm{59}$, 126010 (2019)] suggests that the neoclassical toroidal viscosity due to the collisional effects associated with the non-resonant magnetic perturbations is not enough to explain the observed toroidal rotation, it is of interest to investigate the toroidal rotation induced by the anomalous diffusion due to the resonant magnetic perturbations. Gyrokinetic simulation of the toroidal rotation of plasma in a stochastic magnetic field is carried out to investigate the effects of resonant magnetic perturbations on toroidal rotation. The simulation results suggest that, in a stochastic magnetic field, resonant magnetic perturbations drive the plasma to rotate toroidally through the ambipolar radial electric field. It is found that this spontaneous flow, driven on a time scale shorter than an ion-ion collision time, is the parallel return flow of the $\bm{E}_r\times\bm{B}_0$ drift, which is due to the ambipolar radial electric field induced by the non-ambipolar radial diffusion in the stochastic magnetic field. Collisional effects change the plasma toroidal rotation from the return flow to the rigid-body flow after a few ion-ion collision times. The toroidal rotation observed in the DIII-D resonant magnetic perturbation experiment [Nucl. Fusion $\bm{59}$, 126010 (2019)] can be explained by the rigid-body rotation driven by the ambipolar radial electric field generated by the stochastic magnetic field layer.
Submitted 2 June, 2024;
originally announced June 2024.
-
SuperGaussian: Repurposing Video Models for 3D Super Resolution
Authors:
Yuan Shen,
Duygu Ceylan,
Paul Guerrero,
Zexiang Xu,
Niloy J. Mitra,
Shenlong Wang,
Anna Frühstück
Abstract:
We present a simple, modular, and generic method that upsamples coarse 3D models by adding geometric and appearance details. While generative 3D models now exist, they do not yet match the quality of their counterparts in image and video domains. We demonstrate that it is possible to directly repurpose existing (pretrained) video models for 3D super-resolution and thus sidestep the problem of the shortage of large repositories of high-quality 3D training models. We describe how to repurpose video upsampling models, which are not 3D consistent, and combine them with 3D consolidation to produce 3D-consistent results. As output, we produce high quality Gaussian Splat models, which are object centric and effective. Our method is category agnostic and can be easily incorporated into existing 3D workflows. We evaluate our proposed SuperGaussian on a variety of 3D inputs, which are diverse both in terms of complexity and representation (e.g., Gaussian Splats or NeRFs), and demonstrate that our simple method significantly improves the fidelity of the final 3D models. Check our project website for details: supergaussian.github.io
Submitted 4 June, 2024; v1 submitted 1 June, 2024;
originally announced June 2024.
-
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
Authors:
Tianci Liu,
Haoyu Wang,
Shiyang Wang,
Yu Cheng,
**g Gao
Abstract:
Large language models (LLMs) have achieved impressive performance on various natural language generation tasks. Nonetheless, they tend to generate negative and harmful content that is biased against certain demographic groups (e.g., female), raising severe fairness concerns. As remedies, prior works intervened in the generation by removing attitude or demographic information, inevitably degrading the generation quality and resulting in notable \textit{fairness-fluency} trade-offs. However, it is still under-explored to what extent the fluency \textit{has to} be affected in order to achieve a desired level of fairness. In this work, we conduct the first formal study from an information-theoretic perspective. We show that previous approaches are excessive for debiasing and propose LIDAO, a general framework to debias a (L)LM with provably better fluency. We further robustify LIDAO in adversarial scenarios, where a carefully crafted prompt may stimulate LLMs exhibiting instruction-following abilities to generate texts whose fairness issues appear only when the prompt is also taken into account. Experiments on three LMs ranging from 0.7B to 7B parameters demonstrate the superiority of our method.
Submitted 1 June, 2024;
originally announced June 2024.
-
Optimal Transmission Power Scheduling for Networked Control System under DoS Attack
Authors:
Siyi Wang,
Yulong Gao,
Sandra Hirche
Abstract:
Designing networked control systems that are reliable and resilient against adversarial threats is essential for ensuring the security of cyber-physical systems. This paper addresses the communication-control co-design problem for networked control systems under denial-of-service (DoS) attacks. In the wireless channel, a transmission power scheduler periodically determines the power level for sensory data transmission. Yet DoS attacks render data packets unavailable by disrupting the communication channel. This paper co-designs the control and power scheduling laws in the presence of DoS attacks and aims to minimize the sum of regulation control performance and transmission power consumption. Both finite- and infinite-horizon discounted cost criteria are addressed. By exploiting the information structure between the controller and the power scheduler under attack, the original co-design problem is divided into two subproblems that can be solved individually without compromising optimality. The optimal control is shown to be certainty equivalent, and the optimal transmission power scheduling is solved using a dynamic programming approach. Moreover, in the infinite-horizon scenario, we analyze the performance of the designed scheduling policy and develop an upper bound on the total cost. Finally, a numerical example is provided to demonstrate the theoretical results.
Submitted 1 June, 2024;
originally announced June 2024.
-
Representation and De-interleaving of Mixtures of Hidden Markov Processes
Authors:
Jiadi Bao,
Mengtao Zhu,
Yunjie Li,
Shafei Wang
Abstract:
De-interleaving of mixtures of Hidden Markov Processes (HMPs) generally depends on the representation model of the mixture. Existing representation models consider mixtures of Markov chains rather than of hidden Markov processes, resulting in a lack of robustness to non-ideal situations such as observation noise or missing observations. Besides, existing de-interleaving methods rely on a search-based strategy, which is time-consuming. To address these issues, this paper proposes a novel representation model and corresponding de-interleaving methods for mixtures of HMPs. First, a generative model for representing mixtures of HMPs is designed, and the de-interleaving process is formulated as posterior inference for the generative model. Second, an exact inference method is developed to maximize the likelihood of the complete data, and two approximate inference methods are developed to maximize the evidence lower bound by creating tractable structures. Then, a theoretical error probability lower bound is derived using the likelihood ratio test, and the algorithms are shown to get reasonably close to the bound. Finally, simulation results demonstrate that the proposed methods are highly effective and robust in non-ideal situations, outperforming baseline methods on simulated and real-life data.
Submitted 1 June, 2024;
originally announced June 2024.
-
Real-Time State Modulation and Acquisition Circuit in Neuromorphic Memristive Systems
Authors:
Shengbo Wang,
Cong Li,
Tongming Pu,
Jian Zhang,
Weihao Ma,
Luigi Occhipinti,
Arokia Nathan,
Shuo Gao
Abstract:
Memristive neuromorphic systems are designed to emulate human perception and cognition, where the memristor states represent essential historical information to perform both low-level and high-level tasks. However, current systems face challenges with the separation of state modulation and acquisition, leading to undesired time delays that impact real-time performance. To overcome this issue, we introduce a dual-function circuit that concurrently modulates and acquires memristor state information. This is achieved through two key features: 1) a feedback operational amplifier (op-amp) based circuit that ensures precise voltage application on the memristor while converting the passing current into a voltage signal; 2) a division calculation circuit that acquires state information from the modulation voltage and the converted voltage, improving stability by leveraging the intrinsic threshold characteristics of memristors. This circuit has been evaluated in a memristor-based nociceptor and a memristor crossbar, demonstrating exceptional performance. For instance, it achieves mean absolute acquisition errors below 1 Ω during the modulation process in the nociceptor application. These results demonstrate that the proposed circuit can operate at different scales, holding the potential to enhance a wide range of neuromorphic applications.
Submitted 1 June, 2024;
originally announced June 2024.
-
Approaching 100% Confidence in Stream Summary through ReliableSketch
Authors:
Yuhan Wu,
Hanbo Wu,
Xilai Liu,
Yikai Zhao,
Tong Yang,
Kaicheng Yang,
Sha Wang,
Lihua Miao,
Gaogang Xie
Abstract:
To approximate sums of values in key-value data streams, sketches are widely used in databases and networking systems. They offer high-confidence approximations for any given key while ensuring low time and space overhead. While existing sketches are proficient in estimating individual keys, they struggle to maintain this high confidence across all keys collectively, an objective that is critically important in both algorithm theory and its practical applications. We propose ReliableSketch, the first to control the error of all keys to less than $Λ$ with a small failure probability $Δ$, requiring only $O(1 + Δ\ln\ln(\frac{N}Λ))$ amortized time and $O(\frac{N}Λ + \ln(\frac{1}Δ))$ space. Furthermore, its simplicity makes it hardware-friendly, and we implement it on CPU servers, FPGAs, and programmable switches. Our experiments show that under the same small space, ReliableSketch not only keeps all keys' errors below $Λ$ but also achieves near-optimal throughput, outperforming competitors with thousands of uncontrolled estimations. We have made our source code publicly available.
Submitted 1 June, 2024;
originally announced June 2024.
-
Search for $e^{+}e^{-}\toη'ψ(2S)$ at center-of-mass energies from 4.66 to 4.95 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using data samples with an integrated luminosity of $4.67~\mathrm{fb}^{-1}$ collected by the BESIII detector operating at the BEPCII collider, we search for the process $e^+e^- \rightarrow η' ψ(2S)$ at center-of-mass energies from $4.66$ to $4.95~\mathrm{GeV}$. No significant signal is observed, and upper limits for the Born cross sections $σ^B(e^+e^-\rightarrowη'ψ(2S))$ at the 90\% confidence level are determined.
Submitted 31 May, 2024;
originally announced May 2024.
-
Realization of a cold atom gyroscope in space
Authors:
**ting Li,
Xi Chen,
Danfang Zhang,
Wenzhang Wang,
Yang Zhou,
Meng He,
Jie Fang,
Lin Zhou,
Chuan He,
Junjie Jiang,
Huanyao Sun,
Qunfeng Chen,
Lei Qin,
Xiao Li,
Yibo Wang,
Xiaowei Zhang,
Jiaqi Zhong,
Runbing Li,
Meizhen An,
Long Zhang,
Shuquan Wang,
Zongfeng Li,
** Wang,
Mingsheng Zhan
Abstract:
High-precision gyroscopes in space are important for sophisticated scientific experiments and deep space navigation. Microgravity in space provides an ideal condition for the operation of a cold atom gyroscope. To demonstrate this advantage, an atom interferometer (AI) was launched and installed in the China Space Station in 2022. Here we report a realization of the cold atom gyroscope with this AI. By applying point source interferometry, spatial fringes are obtained and acceleration and rotation are extracted. The angles of the Raman lasers are precisely calibrated to avoid measurement error, and other systematic errors are also considered for the rotation measurement. The evaluated rotation measurement is $(-115.64 \pm 1.71)\times 10^{-5}$ rad/s in space, and an acceleration measurement resolution of $1.03\times 10^{-6}$ m/s$^2$ is also obtained for a single image. This study demonstrates the first AI-based gyroscope in space and paves the way for future space-based AI experiments.
Submitted 31 May, 2024;
originally announced May 2024.
-
Study of the decays $χ_{cJ} \rightarrow Λ\barΛφ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (637 additional authors not shown)
Abstract:
Based on $(2712.4 \pm 14.3) \times 10^{6}$ $ e^{+}e^{-}\toψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, we report the first evidence of $χ_{c0}\to Λ\bar Λφ$ decays and the first observation of $χ_{c1,2}\to Λ\bar Λφ$ decays, with significances of $4.5σ$, $11.3σ$ and $13.0σ$, respectively. The decay branching fractions of $χ_{c0,1,2}\to Λ\bar Λφ$ are measured to be $( 2.99\pm1.24\pm0.19) \times 10^{-5}$, $(6.01\pm0.90\pm0.40 )\times 10^{-5}$, and $(7.13\pm0.81\pm0.36) \times 10^{-5}$, where the first uncertainties are statistical and the second systematic. No obvious enhancement near the $Λ\barΛ$ production threshold or excited $Λ$ state is found in the $Λφ$ (or $\barΛφ$) system.
Submitted 31 May, 2024;
originally announced May 2024.
-
Fast Algorithm for Multiplication on the Skein Algebra of One-hole Torus
Authors:
Sike Wang,
Helen Wong
Abstract:
The Kauffman bracket skein algebra of a surface is a generalization of the Jones polynomial invariant for links and plays a principal role in the Witten-Reshetikhin-Turaev topological quantum field theory. However, the multiplicative structure of the skein algebra is not well understood, with a priori exponential complexity. We consider the case of the one-hole torus and provide a polynomial algorithm for computing the multiplication of any two skein elements. Some closed-form formulas for the multiplication of curves with low crossing number are also given.
Submitted 30 May, 2024;
originally announced May 2024.
-
NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models
Authors:
Kai Wu,
Boyuan Jiang,
Zhengkai Jiang,
Qingdong He,
Donghao Luo,
Shengzhi Wang,
Qingwen Liu,
Chengjie Wang
Abstract:
Multimodal large language models (MLLMs) provide a powerful mechanism for understanding visual information by building on large language models. However, MLLMs are notorious for suffering from hallucinations, especially when generating lengthy, detailed descriptions for images. Our analysis reveals that hallucinations stem from the inherent summarization mechanism of large language models, leading to excessive dependence on linguistic tokens while neglecting vision information. In this paper, we propose NoiseBoost, a broadly applicable and simple method for alleviating hallucinations in MLLMs through the integration of noise feature perturbations. Noise perturbation acts as a regularizer, facilitating a balanced distribution of attention weights among visual and linguistic tokens. Despite its simplicity, NoiseBoost consistently enhances the performance of MLLMs across common training strategies, including supervised fine-tuning and reinforcement learning. Further, NoiseBoost is the first to enable semi-supervised learning for MLLMs, unleashing the power of unlabeled data. Comprehensive experiments demonstrate that NoiseBoost improves dense caption accuracy by 8.1% with human evaluation and achieves comparable results with 50% of the data by mining unlabeled data. Code and models are available at https://kaiwu5.github.io/noiseboost.
Submitted 31 May, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
A Larger Sample Confirms Small Planets Around Hot Stars Are Misaligned
Authors:
Emma M. Louden,
Songhu Wang,
Joshua N. Winn,
Erik A. Petigura,
Howard Isaacson,
Luke Handley,
Samuel W. Yee,
Corey Beard,
Joseph M. Akana Murphy,
Gregory Laughlin
Abstract:
The distribution of stellar obliquities provides critical insight into the formation and evolution pathways of exoplanets. In the past decade, it was found that hot stars hosting hot Jupiters are more likely to have high obliquities than cool stars, but it is not clear whether this trend exists only for hot Jupiters or holds for other types of planets. In this work, we extend the study of the obliquities of hot (6250-7000\,K) stars with transiting super-Earth and sub-Neptune-sized planets. We constrain the obliquity distribution based on measurements of the stars' projected rotation velocities. Our sample consists of 170 TESS and \textit{Kepler} planet-hosting stars and 180 control stars chosen to have indistinguishable spectroscopic characteristics. In our analysis, we find evidence suggesting that the planet hosts have a systematically higher $\langle \sin i \rangle$ compared to the control sample. This result implies that the planet hosts tend to have lower obliquities. However, the observed difference in $\langle \sin i \rangle$ is not significant enough to confirm spin-orbit alignment, as it is 3.8$σ$ away from perfect alignment. We also find evidence that within the planet-hosting stars there is a trend of higher obliquity (lower $\langle \sin i\rangle$) for the hotter stars ($T_{\rm eff} > 6250$ K) than for the cooler stars in the sample. This suggests that hot stars hosting smaller planets exhibit a broader obliquity distribution ($\langle \sin i\rangle = 0.79 \pm 0.053$) than cooler planet-hosting stars, indicating that high obliquities are not exclusive to hot Jupiters and instead are more broadly tied to hot stars.
Submitted 30 May, 2024;
originally announced May 2024.
-
X-Instruction: Aligning Language Model in Low-resource Languages with Self-curated Cross-lingual Instructions
Authors:
Chong Li,
Wen Yang,
Jiajun Zhang,
**liang Lu,
Shaonan Wang,
Chengqing Zong
Abstract:
Large language models respond well in high-resource languages like English but struggle in low-resource languages. This may arise from the lack of high-quality instruction-following data in these languages. Directly translating English samples into these languages can be a solution but is unreliable, leading to responses that contain translation errors and lack language-specific or cultural knowledge. To address this issue, we propose a novel method to construct cross-lingual instruction-following samples with instructions in English and responses in low-resource languages. Specifically, the language model first learns to generate appropriate English instructions for natural web texts in other languages, which serve as the responses. The candidate cross-lingual instruction tuning samples are further refined and diversified. We have employed this method to build a large-scale cross-lingual instruction tuning dataset covering 10 languages, namely X-Instruction. The instruction data built using our method incorporate more language-specific knowledge compared with the naive translation method. Experimental results show that the response quality of the model tuned on X-Instruction greatly exceeds that of the model distilled from a powerful teacher model, reaching or even surpassing that of ChatGPT. In addition, we find that models tuned on cross-lingual instruction-following samples can follow the instruction in the output language without further tuning.
Submitted 30 May, 2024;
originally announced May 2024.
-
Research on Foundation Model for Spatial Data Intelligence: China's 2024 White Paper on Strategic Development of Spatial Data Intelligence
Authors:
Shaohua Wang,
Xing Xie,
Yong Li,
Danhuai Guo,
Zhi Cai,
Yu Liu,
Yang Yue,
Xiao Pan,
Feng Lu,
Huayi Wu,
Zhipeng Gui,
Zhiming Ding,
Bolong Zheng,
Fuzheng Zhang,
Tao Qin,
**gyuan Wang,
Chuang Tao,
Zhengchao Chen,
Hao Lu,
Jiayi Li,
Hongyang Chen,
Peng Yue,
Wenhao Yu,
Yao Yao,
Leilei Sun
, et al. (9 additional authors not shown)
Abstract:
This report focuses on spatial data intelligent large models, delving into the principles, methods, and cutting-edge applications of these models. It provides an in-depth discussion on the definition, development history, current status, and trends of spatial data intelligent large models, as well as the challenges they face. The report systematically elucidates the key technologies of spatial data intelligent large models and their applications in urban environments, aerospace remote sensing, geography, transportation, and other scenarios. Additionally, it summarizes the latest application cases of spatial data intelligent large models in themes such as urban development, multimodal systems, remote sensing, smart transportation, and resource environments. Finally, the report concludes with an overview and outlook on the development prospects of spatial data intelligent large models.
Submitted 29 June, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.