-
Improving the Consistency in Cross-Lingual Cross-Modal Retrieval with 1-to-K Contrastive Learning
Authors:
Zhijie Nie,
Richong Zhang,
Zhangchi Feng,
Hailang Huang,
Xudong Liu
Abstract:
Cross-lingual Cross-modal Retrieval (CCR) is an essential task in web search, which aims to break the barriers between modality and language simultaneously and achieve image-text retrieval in the multi-lingual scenario with a single model. In recent years, excellent progress has been made based on cross-lingual cross-modal pre-training; particularly, the methods based on contrastive learning on large-scale data have significantly improved retrieval tasks. However, these methods directly follow the existing pre-training methods in the cross-lingual or cross-modal domain, leading to two problems of inconsistency in CCR. The methods with cross-lingual style suffer from intra-modal error propagation, resulting in inconsistent recall performance across languages in the whole dataset. The methods with cross-modal style suffer from inter-modal optimization direction bias, resulting in inconsistent rank across languages within each instance, which cannot be reflected by Recall@K. To solve these problems, we propose a simple but effective 1-to-K contrastive learning method, which treats each language equally and eliminates error propagation and optimization bias. In addition, we propose a new evaluation metric, Mean Rank Variance (MRV), to reflect the rank inconsistency across languages within each instance. Extensive experiments on four CCR datasets show that our method improves both recall rates and MRV with smaller-scale pre-trained data, achieving a new state of the art.
Submitted 26 June, 2024;
originally announced June 2024.
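The abstract above describes Mean Rank Variance (MRV) only as a metric for rank inconsistency across languages within each instance. The sketch below is one plausible reading of that description, not the paper's definition: for each instance, take the population variance of the rank its correct match receives under each language, then average over instances. The function name, the input layout, and the use of population variance are all assumptions made for illustration.

```python
from statistics import pvariance


def mean_rank_variance(ranks_per_instance):
    """Average, over instances, of the variance of ranks across languages.

    ranks_per_instance[i][k] is assumed to hold the rank of the correct
    match for instance i when the query is written in language k.
    This layout is hypothetical, chosen only for the illustration.
    """
    if not ranks_per_instance:
        raise ValueError("need at least one instance")
    per_instance = [pvariance(ranks) for ranks in ranks_per_instance]
    return sum(per_instance) / len(per_instance)


# Every rank in both examples is within the top 10, so Recall@10 is
# identical for the two, yet only the first is consistent across languages.
consistent = [[1, 1, 1], [5, 5, 5]]
inconsistent = [[1, 1, 10], [5, 5, 5]]

print(mean_rank_variance(consistent))
print(mean_rank_variance(inconsistent))
```

Under this reading, a lower value means the languages agree more closely on where the correct match is ranked, which is the kind of inconsistency the abstract says Recall@K cannot reflect.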
-
A Text is Worth Several Tokens: Text Embedding from LLMs Secretly Aligns Well with The Key Tokens
Authors:
Zhijie Nie,
Richong Zhang,
Zhanyu Wu
Abstract:
Text embeddings from large language models (LLMs) have achieved excellent results in tasks such as information retrieval, semantic textual similarity, etc. In this work, we show an interesting finding: when feeding a text into the embedding LLMs, the obtained text embedding will be able to be aligned with the key tokens in the input text. We first fully analyze this phenomenon on eight embedding LLMs and show that this phenomenon is universal and is not affected by model architecture, training strategy, and embedding method. With a deeper analysis, we then find that the main change in embedding space between the embedding LLMs and their original generative LLMs is in the first principal component. By adjusting the first principal component, we can align text embedding with the key tokens. Finally, we give several examples to demonstrate the vast application potential of this finding: (1) we propose a simple and practical sparse retrieval method based on the aligned tokens, which can achieve 80% of the dense retrieval effect of the same model while reducing the computation significantly; (2) we show that our findings provide a fresh perspective to help understand fuzzy concepts (e.g., semantic relatedness vs. semantic similarity) and emerging technologies (e.g., instruction-following embedding) in this field.
Submitted 25 June, 2024;
originally announced June 2024.
-
Gigantic-oxidative atomically layered epitaxy for designed complex oxides
Authors:
Guangdi Zhou,
Haoliang Huang,
Fengzhe Wang,
Heng Wang,
Qishuo Yang,
Zihao Nie,
Wei Lv,
Cui Ding,
Yueying Li,
Danfeng Li,
Yujie Sun,
Junhao Lin,
Guang-Ming Zhang,
Qi-Kun Xue,
Zhuoyu Chen
Abstract:
In designing material functionality within the intricate realm of transition metal oxides, lattice structure and d-orbital occupancy are two principal determinants of the correlated physical properties, such as superconductivity. However, the modulation of these two factors is inherently limited by the need to balance thermodynamic stability, kinetic mobility, and synthesis precision, particularly for oxidation-demanding phases. We introduce a methodology, namely the gigantic-oxidative atomically layered epitaxy (GOAL-Epitaxy), enhancing oxidation power 3-4 orders of magnitude beyond oxide molecular beam epitaxy (OMBE) and pulsed laser deposition (PLD), while ensuring atomic-layer-by-layer growth of designed complex structures. Consequently, thermodynamic stability is markedly augmented at elevated temperatures, improving growth kinetics. We demonstrate the accurate synthesis of complex nickelates and cuprates, especially an artificially designed structure as a parent of high-temperature superconductivity, in which alternating single and double NiO2 layers possess distinct nominal d-orbital occupancy. The GOAL-Epitaxy enables material discovery within the vastly broadened growth parameter space.
Submitted 24 June, 2024;
originally announced June 2024.
-
Energy efficiency analysis of ammonia-fueled power systems for vehicles considering residual heat recovery
Authors:
Zexin Nie,
Yi Huang,
Guangyu Tian
Abstract:
Ammonia, known as a good hydrogen carrier, shows great potential for use as a zero-carbon fuel for vehicles. However, both the internal combustion engine (ICE) and the proton exchange membrane fuel cell (PEMFC), the engines currently available for vehicles, require hydrogen decomposed from ammonia. On-board hydrogen production is an energy-intensive process that significantly reduces system efficiency. Therefore, energy recovery from the system's residual heat is essential to promote system efficiency. ICEs and FCs require different amounts of hydrogen, and they produce residual heat of different quality and quantity, so the system efficiency is determined not only by the engine operating point but also by the measures and ratios of residual heat recovery. To thoroughly understand the relationships between system energy efficiency, system configuration, and system parameters, this paper takes three typical power systems with different configurations as its objects of study. Models of the three systems are set up for energy efficiency analysis, and simulations are carried out under different conditions to obtain system output power and energy efficiency. By analyzing the simulation results, the factors that most significantly impact system efficiency are identified, and guidelines for system design and parameter optimization are proposed.
Submitted 17 June, 2024;
originally announced June 2024.
-
Learning Multi-view Molecular Representations with Structured and Unstructured Knowledge
Authors:
Yizhen Luo,
Kai Yang,
Massimo Hong,
Xing Yi Liu,
Zikun Nie,
Hao Zhou,
Zaiqing Nie
Abstract:
Capturing molecular knowledge with representation learning approaches holds significant potential in vast scientific fields such as chemistry and life science. An effective and generalizable molecular representation is expected to capture the consensus and complementary molecular expertise from diverse views and perspectives. However, existing works fall short in learning multi-view molecular representations, due to challenges in explicitly incorporating view information and handling molecular knowledge from heterogeneous sources. To address these issues, we present MV-Mol, a molecular representation learning model that harvests multi-view molecular expertise from chemical structures, unstructured knowledge from biomedical texts, and structured knowledge from knowledge graphs. We utilize text prompts to model view information and design a fusion architecture to extract view-based molecular representations. We develop a two-stage pre-training procedure, exploiting heterogeneous data of varying quality and quantity. Through extensive experiments, we show that MV-Mol provides improved representations that substantially benefit molecular property prediction. Additionally, MV-Mol exhibits state-of-the-art performance in multi-modal comprehension of molecular structures and texts. Code and data are available at https://github.com/PharMolix/OpenBioMed.
Submitted 14 June, 2024;
originally announced June 2024.
-
Dynamical and thermodynamic crossovers in the supercritical region of a holographic superfluid model
Authors:
Zi-Qiang Zhao,
Zhang-Yu Nie,
Jing-Fei Zhang,
Xin Zhang,
Matteo Baggioli
Abstract:
Many physical systems, including classical fluids, present in their phase diagram the competition between two phases that are separated by a line of first-order phase transitions which terminates at a so-called critical point. Despite several proposals, in the supercritical region beyond the critical point, whether the two phases can still be distinguished and by which criterion remain open questions. In this work, we study the thermodynamics and linear dynamics of a holographic superfluid model with nonlinear potential terms in the supercritical region. We identify the presence of a dynamical crossover, akin to the liquid-like to gas-like Frenkel transition in supercritical fluids, and we define other separation lines of thermodynamic origin based on higher order derivatives of the free energy with respect to the charge density. Our results highlight the universal dynamical and thermodynamic features of supercritical systems from nuclear matter and classical fluids to superfluid systems.
Submitted 11 June, 2024; v1 submitted 8 June, 2024;
originally announced June 2024.
-
ProtFAD: Introducing function-aware domains as implicit modality towards protein function perception
Authors:
Mingqing Wang,
Zhiwei Nie,
Yonghong He,
Zhixiang Ren
Abstract:
Protein function prediction is currently achieved by encoding its sequence or structure, where the sequence-to-function transcendence and high-quality structural data scarcity lead to obvious performance bottlenecks. Protein domains are "building blocks" of proteins that are functionally independent, and their combinations determine the diverse biological functions. However, most existing studies have yet to thoroughly explore the intricate functional information contained in the protein domains. To fill this gap, we propose a synergistic integration approach for a function-aware domain representation, and a domain-joint contrastive learning strategy to distinguish different protein functions while aligning the modalities. Specifically, we associate domains with the GO terms as function priors to pre-train domain embeddings. Furthermore, we partition proteins into multiple sub-views based on continuous joint domains for contrastive training under the supervision of a novel triplet InfoNCE loss. Our approach significantly and comprehensively outperforms the state-of-the-art methods on various benchmarks, and clearly differentiates proteins carrying distinct functions compared to the competitor.
Submitted 23 May, 2024;
originally announced May 2024.
-
Line intensities of CO near 1560 nm measured with absorption and dispersion spectroscopy
Authors:
Q. Huang,
Y. Tan,
R.-H. Yin,
Z.-L. Nie,
J. Wang,
S.-M. Hu
Abstract:
High-precision line intensities are of great value in various applications, such as greenhouse gas metrology, planetary atmospheric analysis, and trace gas detection. Here we report simultaneous measurements of cavity-enhanced absorption and dispersion spectroscopy of the prototype molecule $^{12}$C$^{16}$O using the same optical resonant cavity. Nine lines were measured in the R branch of the $v=3-0$ band. The absorption and dispersion spectra were fitted separately with speed-dependent Voigt profiles, and the line intensities obtained by the two methods agree within the experimental uncertainty of about 1‰. The results demonstrate the feasibility of SI-traceable molecular density measurements based on laser spectroscopy.
Submitted 13 May, 2024;
originally announced May 2024.
-
LangCell: Language-Cell Pre-training for Cell Identity Understanding
Authors:
Suyuan Zhao,
Jiahuan Zhang,
Yushuai Wu,
Yizhen Luo,
Zaiqing Nie
Abstract:
Cell identity encompasses various semantic aspects of a cell, including cell type, pathway information, disease information, and more, which are essential for biologists to gain insights into its biological characteristics. Understanding cell identity from the transcriptomic data, such as annotating cell types, has become an important task in bioinformatics. As these semantic aspects are determined by human experts, it is impossible for AI models to effectively carry out cell identity understanding tasks without the supervision signals provided by single-cell and label pairs. The single-cell pre-trained language models (PLMs) currently used for this task are trained only on a single modality, transcriptomics data, and lack an understanding of cell identity knowledge. As a result, they have to be fine-tuned for downstream tasks and struggle when lacking labeled data with the desired semantic labels. To address this issue, we propose an innovative solution by constructing a unified representation of single-cell data and natural language during the pre-training phase, allowing the model to directly incorporate insights related to cell identity. More specifically, we introduce LangCell, the first Language-Cell pre-training framework. LangCell utilizes texts enriched with cell identity information to gain a profound comprehension of cross-modal knowledge. Results from experiments conducted on different benchmarks show that LangCell is the only single-cell PLM that can work effectively in zero-shot cell identity understanding scenarios, and also significantly outperforms existing models in few-shot and fine-tuning cell identity understanding scenarios.
Submitted 11 June, 2024; v1 submitted 9 May, 2024;
originally announced May 2024.
-
Bidirectional cascaded superfluorescent lasing in air enabled by resonant third harmonic photon exchange from nitrogen to argon
Authors:
Zan Nie,
Noa Nambu,
Kenneth A. Marsh,
Daniel Matteo,
C. Kumar Patel,
Chaojie Zhang,
Yipeng Wu,
Stefanos Carlström,
Felipe Morales,
Serguei Patchkovskii,
Olga Smirnova,
Misha Ivanov,
Chan Joshi
Abstract:
Cavity-free lasing in atmospheric air has stimulated intense research towards fundamental understanding of underlying physical mechanisms. In this Letter, we identify a new mechanism -- third harmonic photon mediated resonant energy transfer pathway leading to population inversion in argon via initial three-photon excitation of nitrogen molecules irradiated by intense 261 nm pulses -- that enables bidirectional two-color cascaded lasing in atmospheric air. By making pump-probe measurements, we conclusively show that such cascaded lasing results from superfluorescence (SF) rather than amplified spontaneous emission (ASE). Such cascaded lasing with the capability of producing bidirectional multicolor coherent pulses opens additional possibilities for remote sensing applications.
Submitted 7 May, 2024;
originally announced May 2024.
-
Correlations between X-rays, Visible Light and Drive-Beam Energy Loss Observed in Plasma Wakefield Acceleration Experiments at FACET-II
Authors:
Chaojie Zhang,
Doug Storey,
Pablo San Miguel Claveria,
Zan Nie,
Ken A. Marsh,
Warren B. Mori,
Erik Adli,
Weiming An,
Robert Ariniello,
Gevy J. Cao,
Christine Clark,
Sebastien Corde,
Thamine Dalichaouch,
Christopher E. Doss,
Claudio Emma,
Henrik Ekerfelt,
Elias Gerstmayr,
Spencer Gessner,
Claire Hansel,
Alexander Knetsch,
Valentina Lee,
Fei Li,
Mike Litos,
Brendan O'Shea,
Glen White
, et al. (4 additional authors not shown)
Abstract:
This study documents several correlations observed during the first run of the plasma wakefield acceleration experiment E300 conducted at FACET-II, using a single drive electron bunch. The established correlations include those between the measured maximum energy loss of the drive electron beam and the integrated betatron x-ray signal, the calculated total beam energy deposited in the plasma and the integrated x-ray signal, among three visible light emission measuring cameras, and between the visible plasma light and x-ray signal. The integrated x-ray signal correlates almost linearly with both the maximum energy loss of the drive beam and the energy deposited into the plasma, demonstrating its usability as a measure of energy transfer from the drive beam to the plasma. Visible plasma light is found to be a useful indicator of the presence of wake at three locations that overall are two meters apart. Despite the complex dynamics and vastly different timescales, the x-ray radiation from the drive bunch and visible light emission from the plasma may prove to be effective non-invasive diagnostics for monitoring the energy transfer from the beam to the plasma in future high-repetition-rate experiments.
Submitted 29 April, 2024;
originally announced April 2024.
-
Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives
Authors:
Zhangchi Feng,
Richong Zhang,
Zhijie Nie
Abstract:
The Composed Image Retrieval (CIR) task aims to retrieve target images using a composed query consisting of a reference image and a modified text. Advanced methods often utilize contrastive learning as the optimization objective, which benefits from adequate positive and negative examples. However, the triplet for CIR incurs high manual annotation costs, resulting in limited positive examples. Furthermore, existing methods commonly use in-batch negative sampling, which reduces the negative number available for the model. To address the problem of lack of positives, we propose a data generation method by leveraging a multi-modal large language model to construct triplets for CIR. To introduce more negatives during fine-tuning, we design a two-stage fine-tuning framework for CIR, whose second stage introduces plenty of static representations of negatives to optimize the representation space rapidly. The above two improvements can be effectively stacked and designed to be plug-and-play, easily applied to existing CIR models without changing their original architectures. Extensive experiments and ablation analysis demonstrate that our method effectively scales positives and negatives and achieves state-of-the-art results on both FashionIQ and CIRR datasets. In addition, our methods also perform well in zero-shot composed image retrieval, providing a new CIR solution for the low-resources scenario.
Submitted 17 April, 2024;
originally announced April 2024.
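The abstract above notes that in-batch negative sampling limits how many negatives the model sees. A minimal sketch of a generic in-batch InfoNCE loss makes the limit visible: with a batch of B query-target pairs, each query is contrasted against only the B - 1 other targets in the same batch. This is the standard textbook formulation, not the paper's two-stage framework; the function name, the similarity-matrix input, and the default temperature are illustrative assumptions.

```python
import math


def in_batch_info_nce(sim, temperature=0.07):
    """Mean InfoNCE loss over a batch using in-batch negatives.

    sim[i][j] is the similarity between query i and target j.
    The diagonal entry sim[i][i] is the positive pair; the other
    len(sim) - 1 entries in row i are the only negatives available.
    """
    total = 0.0
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(sim)


# Well-separated pairs give a low loss; indistinguishable pairs give log(B).
well_separated = [[1.0, 0.0], [0.0, 1.0]]
indistinguishable = [[0.0, 0.0], [0.0, 0.0]]

print(in_batch_info_nce(well_separated))
print(in_batch_info_nce(indistinguishable))
```

Because the denominator runs over a single row of the batch similarity matrix, adding negatives in this formulation requires a larger batch, which is the constraint the abstract's second fine-tuning stage is said to address with static negative representations.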
-
Amplitude-Phase Fusion for Enhanced Electrocardiogram Morphological Analysis
Authors:
Shuaicong Hu,
Yanan Wang,
Jian Liu,
Jingyu Lin,
Shengmei Qin,
Zhenning Nie,
Zhifeng Yao,
Wenjie Cai,
Cuiwei Yang
Abstract:
Considering the variability of amplitude and phase patterns in electrocardiogram (ECG) signals due to cardiac activity and individual differences, existing entropy-based studies have not fully utilized these two patterns and lack integration. To address this gap, this paper proposes, for the first time, a novel fusion entropy metric, morphological ECG entropy (MEE), specifically designed for ECG morphology, to comprehensively describe the fusion of amplitude and phase patterns. MEE is computed based on beat-level samples, enabling detailed analysis of each cardiac cycle. Experimental results demonstrate that MEE achieves rapid, accurate, and label-free localization of abnormal ECG arrhythmia regions. Furthermore, MEE provides a method for assessing sample diversity, facilitating compression of imbalanced training sets (via representative sample selection), and outperforms random pruning. Additionally, MEE exhibits the ability to describe areas of poor quality. A further discussion demonstrates the robustness of the MEE calculation to noise interference and its low computational complexity. Finally, we integrate this method into a clinical interactive interface to provide a more convenient and intuitive user experience. These findings indicate that MEE serves as a valuable clinical descriptor for ECG characterization. The implementation code can be referenced at the following link: https://github.com/fdu-harry/ECG-MEE-metric.
Submitted 15 April, 2024;
originally announced April 2024.
-
Coexistence of interacting charge density waves in a layered semiconductor
Authors:
B. Q. Lv,
Alfred Zong,
Dong Wu,
Zhengwei Nie,
Yifan Su,
Dongsung Choi,
Batyr Ilyas,
Bryan T. Fichera,
Jiarui Li,
Edoardo Baldini,
Masataka Mogi,
Y.-B. Huang,
Hoi Chun Po,
Sheng Meng,
Yao Wang,
N. L. Wang,
Nuh Gedik
Abstract:
Coexisting orders are key features of strongly correlated materials and underlie many intriguing phenomena from unconventional superconductivity to topological orders. Here, we report the coexistence of two interacting charge-density-wave (CDW) orders in EuTe4, a layered crystal that has drawn considerable attention owing to its anomalous thermal hysteresis and a semiconducting CDW state despite the absence of perfect FS nesting. By accessing unoccupied conduction bands with time- and angle-resolved photoemission measurements, we find that mono- and bi-layers of Te in the unit cell host different CDWs that are associated with distinct energy gaps. The two gaps display dichotomous evolutions following photoexcitation, where the larger bilayer CDW gap exhibits less renormalization and faster recovery. Surprisingly, the CDW in the Te monolayer displays an additional momentum-dependent gap renormalization that cannot be captured by density-functional theory calculations. This phenomenon is attributed to interlayer interactions between the two CDW orders, which account for the semiconducting nature of the equilibrium state. Our findings not only offer microscopic insights into the correlated ground state of EuTe4 but also provide a general non-equilibrium approach to understand coexisting, layer-dependent orders in a complex system.
Submitted 14 April, 2024;
originally announced April 2024.
-
A Critique of Du's "A Polynomial-Time Algorithm for 3-SAT"
Authors:
Yumeng He,
Matan Kotler-Berkowitz,
Harry Liuson,
Zeyu Nie
Abstract:
In this paper, we examine the claims made by the paper "A polynomial-time algorithm for 3-SAT" by Lizhi Du. The paper claims to provide a polynomial-time algorithm for solving the NP-complete problem 3-SAT. In examining the paper's argument, we find a flaw in one of the main sections of its algorithm. We argue that this flaw causes the paper's algorithm to incorrectly decide that an infinite family of satisfiable 3-CNF boolean formulas are not satisfiable. Therefore, the paper does not establish that P = NP.
Submitted 5 April, 2024;
originally announced April 2024.
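To make the terms in the abstract above concrete, the sketch below decides satisfiability of a 3-CNF formula by exhaustive search. It is neither Du's algorithm nor the critique's counterexample construction: it is a reference checker that runs in exponential time, against which any claimed polynomial-time decision procedure could be compared on small inputs. The signed-integer clause encoding and the function name are assumptions chosen for illustration.

```python
from itertools import product


def is_satisfiable(clauses, num_vars):
    """Brute-force satisfiability check for a CNF formula.

    clauses: list of tuples of nonzero ints; literal v means variable v
    is true, and -v means variable v is false (variables are 1-indexed).
    Tries all 2**num_vars assignments, so it is exponential in num_vars.
    """
    for assignment in product([False, True], repeat=num_vars):
        if all(
            any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return True
    return False


# Satisfiable: at least one variable true and at least one false.
satisfiable_formula = [(1, 2, 3), (-1, -2, -3)]

# Unsatisfiable: all eight sign patterns over three variables, so every
# assignment falsifies exactly one clause.
unsatisfiable_formula = [
    tuple(sign * var for sign, var in zip(signs, (1, 2, 3)))
    for signs in product((1, -1), repeat=3)
]

print(is_satisfiable(satisfiable_formula, 3))
print(is_satisfiable(unsatisfiable_formula, 3))
```

A decision procedure that reports "not satisfiable" for a formula on which this checker returns true is incorrect on that input, which is the kind of failure the critique argues occurs for an infinite family of formulas.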
-
End-to-End Autonomous Driving through V2X Cooperation
Authors:
Haibao Yu,
Wenxian Yang,
Jiaru Zhong,
Zhenwei Yang,
Siqi Fan,
Ping Luo,
Zaiqing Nie
Abstract:
Cooperatively utilizing both ego-vehicle and infrastructure sensor data via V2X communication has emerged as a promising approach for advanced autonomous driving. However, current research mainly focuses on improving individual modules, rather than taking end-to-end learning to optimize final planning performance, resulting in underutilized data potential. In this paper, we introduce UniV2X, a pioneering cooperative autonomous driving framework that seamlessly integrates all key driving modules across diverse views into a unified network. We propose a sparse-dense hybrid data transmission and fusion mechanism for effective vehicle-infrastructure cooperation, offering three advantages: 1) Effective for simultaneously enhancing agent perception, online mapping, and occupancy prediction, ultimately improving planning performance. 2) Transmission-friendly for practical and limited communication conditions. 3) Reliable data fusion with interpretability of this hybrid data. We implement UniV2X and reproduce several benchmark methods on the challenging DAIR-V2X, a real-world cooperative driving dataset. Experimental results demonstrate the effectiveness of UniV2X in significantly enhancing planning performance, as well as all intermediate output performance. Code is at https://github.com/AIR-THU/UniV2X.
Submitted 19 April, 2024; v1 submitted 31 March, 2024;
originally announced April 2024.
-
ESM All-Atom: Multi-scale Protein Language Model for Unified Molecular Modeling
Authors:
Kangjie Zheng,
Siyu Long,
Tianyu Lu,
Junwei Yang,
Xinyu Dai,
Ming Zhang,
Zaiqing Nie,
Wei-Ying Ma,
Hao Zhou
Abstract:
Protein language models have demonstrated significant potential in the field of protein engineering. However, current protein language models primarily operate at the residue scale, which limits their ability to provide information at the atom level. This limitation prevents us from fully exploiting the capabilities of protein language models for applications involving both proteins and small molecules. In this paper, we propose ESM-AA (ESM All-Atom), a novel approach that enables atom-scale and residue-scale unified molecular modeling. ESM-AA achieves this by pre-training on multi-scale code-switch protein sequences and utilizing a multi-scale position encoding to capture relationships among residues and atoms. Experimental results indicate that ESM-AA surpasses previous methods in protein-molecule tasks, demonstrating the full utilization of protein language models. Further investigations reveal that through unified molecular modeling, ESM-AA not only gains molecular knowledge but also retains its understanding of proteins. The source codes of ESM-AA are publicly released at https://github.com/zhengkangjie/ESM-AA.
Submitted 12 June, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
RCooper: A Real-world Large-scale Dataset for Roadside Cooperative Perception
Authors:
Ruiyang Hao,
Siqi Fan,
Yingru Dai,
Zhenlin Zhang,
Chenxi Li,
Yuntian Wang,
Haibao Yu,
Wenxian Yang,
Jirui Yuan,
Zaiqing Nie
Abstract:
The value of roadside perception, which could extend the boundaries of autonomous driving and traffic management, has gradually become more prominent and acknowledged in recent years. However, existing roadside perception approaches only focus on the single-infrastructure sensor system, which cannot realize a comprehensive understanding of a traffic area because of the limited sensing range and blind spots. To achieve high-quality roadside perception, we need Roadside Cooperative Perception (RCooper), which provides practical area-coverage roadside perception for restricted traffic areas. RCooper has its own domain-specific challenges, but further exploration is hindered by the lack of datasets. We hence release the first real-world, large-scale RCooper dataset to foster research on practical roadside cooperative perception, including detection and tracking. The manually annotated dataset comprises 50k images and 30k point clouds, including two representative traffic scenes (i.e., intersection and corridor). The constructed benchmarks prove the effectiveness of roadside cooperative perception and demonstrate the direction of further research. Codes and dataset can be accessed at: https://github.com/AIR-THU/DAIR-RCooper.
Submitted 31 March, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
-
Breaking Abbe's diffraction limit with harmonic deactivation microscopy
Authors:
Kevin Murzyn,
Maarten L. S. van der Geest,
Leo Guery,
Zhonghui Nie,
Pieter van Essen,
Stefan Witte,
Peter M. Kraus
Abstract:
Nonlinear optical microscopy provides elegant means for label-free imaging of biological samples and condensed matter systems. The widespread areas of application could even be increased if resolution was improved, which is currently limited by the famous Abbe diffraction limit. Super-resolution techniques can break the diffraction limit but rely on fluorescent labeling. This makes them incompatible with (sub-)femtosecond temporal resolution and applications that demand the absence of labeling. Here, we introduce harmonic deactivation microscopy (HADES) for breaking the diffraction limit in non-fluorescent samples. By controlling the harmonic generation process on the quantum level with a second donut-shaped pulse, we confine the third harmonic generation to three times below the original focus size and use this pulse for scanning microscopy. We demonstrate that resolution improvement by deactivation is more efficient for higher harmonic orders, and only limited by the maximum applicable deactivation-pulse fluence. This provides a route towards sub-100 nm resolution in a regular nonlinear microscope. The new capability of label-free super-resolution can find immediate applications in condensed matter physics, semiconductor metrology, and biomedical imaging.
Submitted 11 March, 2024;
originally announced March 2024.
-
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval
Authors:
Hailang Huang,
Zhijie Nie,
Ziqiao Wang,
Ziyu Shang
Abstract:
Current image-text retrieval methods have demonstrated impressive performance in recent years. However, they still face two problems: the inter-modal matching missing problem and the intra-modal semantic loss problem. These problems can significantly affect the accuracy of image-text retrieval. To address these challenges, we propose a novel method called Cross-modal and Uni-modal Soft-label Alignment (CUSA). Our method leverages the power of uni-modal pre-trained models to provide soft-label supervision signals for the image-text retrieval model. Additionally, we introduce two alignment techniques, Cross-modal Soft-label Alignment (CSA) and Uni-modal Soft-label Alignment (USA), to overcome false negatives and enhance similarity recognition between uni-modal samples. Our method is designed to be plug-and-play, meaning it can be easily applied to existing image-text retrieval models without changing their original architectures. Through extensive experiments on various image-text retrieval models and datasets, we demonstrate that our method can consistently improve the performance of image-text retrieval and achieve new state-of-the-art results. Furthermore, our method can also boost the uni-modal retrieval performance of image-text retrieval models, enabling them to achieve universal retrieval. The code and supplementary files can be found at https://github.com/lerogo/aaai24_itr_cusa.
Submitted 8 March, 2024;
originally announced March 2024.
-
DeepCRE: Transforming Drug R&D via AI-Driven Cross-drug Response Evaluation
Authors:
Yushuai Wu,
Ting Zhang,
Hao Zhou,
Hainan Wu,
Hanwen Sunchu,
Lei Hu,
Xiaofang Chen,
Suyuan Zhao,
Gaochao Liu,
Chao Sun,
Jiahuan Zhang,
Yizhen Luo,
Peng Liu,
Zaiqing Nie,
Yushuai Wu
Abstract:
The fields of therapeutic application and drug research and development (R&D) both face substantial challenges, i.e., the therapeutic domain calls for more treatment alternatives, while numerous promising pre-clinical drugs have failed in clinical trials. One of the reasons is the inadequacy of Cross-drug Response Evaluation (CRE) during the late stages of drug R&D. Although in-silico CRE models bring a promising solution, existing methodologies are restricted to early stages of drug R&D, such as target and cell-line levels, offering limited improvement to clinical success rates. Herein, we introduce DeepCRE, a pioneering AI model designed to predict CRE effectively in the late stages of drug R&D. DeepCRE outperforms the existing best models by achieving an average performance improvement of 17.7% in patient-level CRE, and a 5-fold increase in indication-level CRE, facilitating more accurate personalized treatment predictions and better pharmaceutical value assessment for indications, respectively. Furthermore, DeepCRE has identified a set of six drug candidates that show significantly greater effectiveness than a comparator set of two approved drugs in 5/8 colorectal cancer organoids. This demonstrates the capability of DeepCRE to systematically uncover a spectrum of drug candidates with enhanced therapeutic effects, highlighting its potential to transform drug R&D.
Submitted 18 March, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
-
Towards Better Understanding of Contrastive Sentence Representation Learning: A Unified Paradigm for Gradient
Authors:
Mingxin Li,
Richong Zhang,
Zhijie Nie
Abstract:
Sentence Representation Learning (SRL) is a crucial task in Natural Language Processing (NLP), where contrastive Self-Supervised Learning (SSL) is currently a mainstream approach. However, the reasons behind its remarkable effectiveness remain unclear. Specifically, many studies have investigated the similarities between contrastive and non-contrastive SSL from a theoretical perspective. Such similarities can be verified in classification tasks, where the two approaches achieve comparable performance. But in ranking tasks (i.e., Semantic Textual Similarity (STS) in SRL), contrastive SSL significantly outperforms non-contrastive SSL. Therefore, two questions arise: First, *what commonalities enable various contrastive losses to achieve superior performance in STS?* Second, *how can we make non-contrastive SSL also effective in STS?* To address these questions, we start from the perspective of gradients and discover that four effective contrastive losses can be integrated into a unified paradigm, which depends on three components: the **Gradient Dissipation**, the **Weight**, and the **Ratio**. Then, we conduct an in-depth analysis of the roles these components play in optimization and experimentally demonstrate their significance for model performance. Finally, by adjusting these components, we enable non-contrastive SSL to achieve outstanding performance in STS.
Submitted 5 June, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Towards complete all-optical emission control of high-harmonic generation from solids
Authors:
Pieter J. van Essen,
Zhonghui Nie,
Brian de Keijzer,
Peter M. Kraus
Abstract:
Optical modulation of high-harmonics generation in solids enables the detection of material properties such as the band structure and promising new applications such as super-resolution imaging in semiconductors. Various recent studies have shown optical modulation of high-harmonics generation in solids, in particular, suppression of high-harmonics generation has been observed by synchronized or delayed multi-pulse sequences. Here we provide an overview of the underlying mechanisms attributed to this suppression and provide a perspective on the challenges and opportunities regarding these mechanisms. All-optical control of high-harmonic generation allows for femtosecond, and in the future possibly subfemtosecond, switching, which has numerous possible applications: These range from super-resolution microscopy, to nanoscale controlled chemistry, and highly tunable nonlinear light sources.
Submitted 23 February, 2024;
originally announced February 2024.
-
Exploring the Impact: How Decentralized Exchange Designs Shape Traders' Behavior on Perpetual Future Contracts
Authors:
Erdong Chen,
Mengzhong Ma,
Zixin Nie
Abstract:
In this paper, we analyze traders' behavior within both centralized exchanges (CEXs) and decentralized exchanges (DEXs), focusing on the volatility of Bitcoin prices and the trading activity of investors engaged in perpetual future contracts. We categorize the architecture of perpetual future exchanges into three distinct models, each exhibiting unique patterns of trader behavior in relation to trading volume, open interest, liquidation, and leverage. Our detailed examination of DEXs, especially those utilizing the Virtual Automated Market Making (VAMM) Model, uncovers a differential impact of open interest on long versus short positions. In exchanges which operate under the Oracle Pricing Model, we find that traders primarily act as price takers, with their trading actions reflecting direct responses to price movements of the underlying assets. Furthermore, our research highlights a significant propensity among less informed traders to overreact to positive news, as demonstrated by an increase in long positions. This study contributes to the understanding of market dynamics in digital asset exchanges, offering insights into the behavioral finance for future innovation of decentralized finance.
Submitted 25 April, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
UNeR3D: Versatile and Scalable 3D RGB Point Cloud Generation from 2D Images in Unsupervised Reconstruction
Authors:
Hongbin Lin,
Juangui Xu,
Qingfeng Xu,
Zhengyu Hu,
Handing Xu,
Yunzhi Chen,
Yongjun Hu,
Zhenguo Nie
Abstract:
In the realm of 3D reconstruction from 2D images, a persisting challenge is to achieve high-precision reconstructions devoid of 3D Ground Truth data reliance. We present UNeR3D, a pioneering unsupervised methodology that sets a new standard for generating detailed 3D reconstructions solely from 2D views. Our model significantly cuts down the training costs tied to supervised approaches and introduces RGB coloration to 3D point clouds, enriching the visual experience. Employing an inverse distance weighting technique for color rendering, UNeR3D ensures seamless color transitions, enhancing visual fidelity. Our model's flexible architecture supports training with any number of views, and uniquely, it is not constrained by the number of views used during training when performing reconstructions. It can infer with an arbitrary count of views during inference, offering unparalleled versatility. Additionally, the model's continuous spatial input domain allows the generation of point clouds at any desired resolution, empowering the creation of high-resolution 3D RGB point clouds. We solidify the reconstruction process with a novel multi-view geometric loss and color loss, demonstrating that our model excels with single-view inputs and beyond, thus reshaping the paradigm of unsupervised learning in 3D vision. Our contributions signal a substantial leap forward in 3D vision, offering new horizons for content creation across diverse applications. Code is available at https://github.com/HongbinLin3589/UNeR3D.
Submitted 10 December, 2023;
originally announced December 2023.
-
Evaluating the Claims of "SAT Requires Exhaustive Search"
Authors:
Michael C. Chavrimootoo,
Yumeng He,
Matan Kotler-Berkowitz,
Harry Liuson,
Zeyu Nie
Abstract:
In this paper, we take a closer look at the claims made by Xu and Zhou in their paper "SAT Requires Exhaustive Search" [XZ23], which claims to provide a lower bound on the complexity of the so-called Model RB. Xu and Zhou conclude that their result implies a separation between P and NP, since the lower bound purportedly proves that the Strong Exponential Time Hypothesis (SETH) is true. In examining Xu and Zhou's arguments, we find a flaw in their main theorems. The authors assume that an algorithm for Model RB must have a certain structure that can leverage downward self-reducibility, and argue that such an algorithm cannot run in polynomial time. We argue that this structure is not guaranteed to exist and thus their paper neither proves SETH to be true nor proves P $\neq$ NP.
Submitted 4 December, 2023;
originally announced December 2023.
-
Efficient generation of intense spatial and spatiotemporal vortex harmonics using plasma mirrors
Authors:
Yipeng Wu,
Zan Nie,
Fei Li,
Chaojie Zhang,
Ken A Marsh,
Warren B. Mori,
Chan Joshi
Abstract:
Intense spatial or spatiotemporal vortex pulses from the extreme ultraviolet to soft X-ray spectral windows are expected to provide new degrees of freedom for a variety of key applications since they carry longitudinal or transverse orbital angular momentum (OAM), respectively. Plasma-based high harmonic generation driven by a near-infrared spatial or spatiotemporal optical vortex offers a promising route to such novel light sources. However, the energy conversion efficiency from the incident vortex beam to the vortex harmonics is rather low because of the limited driving intensities available in practice. Here, we propose and demonstrate through simulations that by adding a readily available relativistic Gaussian pump beam as a source of energy, the energy conversion efficiency can be increased by several orders of magnitude. In addition, the proposed scheme allows independent control over the frequency and OAM of the vortex harmonics.
Submitted 23 November, 2023;
originally announced November 2023.
-
Efficient generation and amplification of intense vortex and vector laser pulses via strongly coupled stimulated Brillouin scattering in plasmas
Authors:
Yipeng Wu,
Chaojie Zhang,
Zan Nie,
Mitchell Sinclair,
Audrey Farrell,
Kenneth A Marsh,
E. Paulo Alves,
Frank Tsung,
Warren B. Mori,
Chan Joshi
Abstract:
The past decade has seen tremendous progress in the production and utilization of vortex and vector laser pulses. Although both are considered as structured light beams, the vortex lasers have helical phase fronts and phase singularities, while the vector lasers have spatially variable polarization states and polarization singularities. In contrast to the vortex pulses that carry orbital angular momentum (OAM), the vector laser pulses have a complex spin angular momentum (SAM) and OAM coupling. Despite many potential applications enabled by such pulses, the generation of high-power/-intensity vortex and vector beams remains challenging. Here, we demonstrate using theory and three-dimensional simulations that the strongly-coupled stimulated Brillouin scattering (SC-SBS) process in plasmas can be used as a promising amplification technique with up to 65% energy transfer efficiency from the pump beam to the seed beam for both vortex and vector pulses. We also show that SC-SBS is strongly polarization-dependent in plasmas, enabling an all-optical polarization control of the amplified seed beam. Additionally, the interaction of such structured lasers with plasmas leads to various angular momentum couplings and decouplings that produce intense new light structures with controllable OAM and SAM. This scheme paves the way for novel optical devices such as plasma-based amplifiers and light field manipulators.
Submitted 23 November, 2023;
originally announced November 2023.
-
Dynamical evolution of spinodal decomposition in holographic superfluids
Authors:
Xin Zhao,
Zi-Qiang Zhao,
Zhang-Yu Nie,
Hua-Bi Zeng,
Yu Tian,
Matteo Baggioli
Abstract:
We study the nonlinear dynamical evolution of spinodal decomposition in a first-order superfluid phase transition using a simple holographic model in the probe limit. We first confirm the linear stability analysis based on quasinormal modes and verify the existence of a critical length scale related to a gradient instability -- negative speed of sound squared -- of the superfluid sound mode, which is a consequence of a negative thermodynamic charge susceptibility. We present a comparison between our case and the standard Cahn-Hilliard equation for spinodal instability, in which a critical length scale can be also derived based on a diffusive instability. We then perform several numerical tests which include the nonlinear time evolution directly from an unstable state and fast quenches from a stable to an unstable state in the spinodal region. Our numerical results provide a real time description of spinodal decomposition and phase separation in one and two spatial dimensions. We reveal the existence of four different stages in the dynamical evolution, and characterize their main properties. Finally, we investigate the strength of dynamical heterogeneity using the spatial variance of the local chemical potential and we correlate the latter to other features of the dynamical evolution.
Submitted 14 November, 2023;
originally announced November 2023.
-
Flow-Based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection
Authors:
Haibao Yu,
Yingjuan Tang,
Enze Xie,
Jilei Mao,
Ping Luo,
Zaiqing Nie
Abstract:
Cooperatively utilizing both ego-vehicle and infrastructure sensor data can significantly enhance autonomous driving perception abilities. However, the uncertain temporal asynchrony and limited communication conditions can lead to fusion misalignment and constrain the exploitation of infrastructure data. To address these issues in vehicle-infrastructure cooperative 3D (VIC3D) object detection, we propose the Feature Flow Net (FFNet), a novel cooperative detection framework. FFNet is a flow-based feature fusion framework that uses a feature flow prediction module to predict future features and compensate for asynchrony. Instead of transmitting feature maps extracted from still-images, FFNet transmits feature flow, leveraging the temporal coherence of sequential infrastructure frames. Furthermore, we introduce a self-supervised training approach that enables FFNet to generate feature flow with feature prediction ability from raw infrastructure sequences. Experimental results demonstrate that our proposed method outperforms existing cooperative detection methods while only requiring about 1/100 of the transmission cost of raw data and covers all latency in one model on the DAIR-V2X dataset. The code is available at https://github.com/haibao-yu/FFNet-VIC3D.
Submitted 2 November, 2023;
originally announced November 2023.
-
Learning Cooperative Trajectory Representations for Motion Forecasting
Authors:
Hongzhi Ruan,
Haibao Yu,
Wenxian Yang,
Siqi Fan,
Yingjuan Tang,
Zaiqing Nie
Abstract:
Motion forecasting is an essential task for autonomous driving, and the effective utilization of information from infrastructure and other vehicles can enhance motion forecasting capabilities. Existing research has primarily focused on leveraging single-frame cooperative information to enhance the limited perception capability of the ego vehicle, while underutilizing the motion and interaction information of traffic participants observed from cooperative devices. In this paper, we first propose the cooperative trajectory representations learning paradigm. Specifically, we present V2X-Graph, the first interpretable and end-to-end learning framework for cooperative motion forecasting. V2X-Graph employs an interpretable graph to fully leverage the cooperative motion and interaction contexts. Experimental results on the vehicle-to-infrastructure (V2I) motion forecasting dataset, V2X-Seq, demonstrate the effectiveness of V2X-Graph. To further evaluate on V2X scenarios, we construct the first real-world vehicle-to-everything (V2X) motion forecasting dataset, V2X-Traj, and the performance shows the advantage of our method. We hope both V2X-Graph and V2X-Traj can facilitate the further development of cooperative motion forecasting. Find project at https://github.com/AIR-THU/V2X-Graph, find data at https://github.com/AIR-THU/DAIR-V2X-Seq.
Submitted 1 November, 2023;
originally announced November 2023.
-
Chainmail links and non-left-orderability
Authors:
Zipei Nie
Abstract:
We prove that the alternating surgeries on flat fully augmented chainmail links yield total L-spaces. We also study the non-left-orderability of surgeries on the connected sum with an L-space knot using order detection.
Submitted 25 October, 2023;
originally announced October 2023.
-
Petal diagram from simple braids
Authors:
Zipei Nie
Abstract:
We construct petal diagrams from simple braids. This approach allows us to confirm a conjecture proposed by Kim, No and Yoo, which states that the petal number of the nontrivial torus knot $T_{r,s}$ ($r<s$) is at most $2s-2\lfloor\frac{s}{r}\rfloor+1$. As a consequence, we deduce that the petal number of a nontrivial torus knot $T_{r,s}$ is equal to $2s-1$ if and only if $r<s<2r$.
Submitted 14 October, 2023;
originally announced October 2023.
-
Wakefield Generation in Hydrogen and Lithium Plasmas at FACET-II: Diagnostics and First Beam-Plasma Interaction Results
Authors:
D. Storey,
C. Zhang,
P. San Miguel Claveria,
G. J. Cao,
E. Adli,
L. Alsberg,
R. Ariniello,
C. Clarke,
S. Corde,
T. N. Dalichaouch,
H. Ekerfelt,
C. Emma,
E. Gerstmayr,
S. Gessner,
M. Gilljohann,
C. Hast,
A. Knetsch,
V. Lee,
M. Litos,
R. Loney,
K. A. Marsh,
A. Matheron,
W. B. Mori,
Z. Nie,
B. O'Shea
, et al. (6 additional authors not shown)
Abstract:
Plasma Wakefield Acceleration (PWFA) provides ultrahigh acceleration gradients of 10s of GeV/m, providing a novel path towards efficient, compact, TeV-scale linear colliders and high brightness free electron lasers. Critical to the success of these applications is demonstrating simultaneously high gradient acceleration, high energy transfer efficiency, and preservation of emittance, charge, and energy spread. Experiments at the FACET-II National User Facility at SLAC National Accelerator Laboratory aim to achieve all of these milestones in a single stage plasma wakefield accelerator, providing a 10 GeV energy gain in a <1 m plasma with high energy transfer efficiency. Such a demonstration depends critically on diagnostics able to measure emittance with mm-mrad accuracy, energy spectra to determine both %-level energy spread and broadband energy gain and loss, incoming longitudinal phase space, and matching dynamics. This paper discusses the experimental setup at FACET-II, including the incoming beam parameters from the FACET-II linac, plasma sources, and diagnostics developed to meet this challenge. Initial progress on the generation of beam ionized wakes in meter-scale hydrogen gas is discussed, as well as commissioning of the plasma sources and diagnostics.
Submitted 9 October, 2023;
originally announced October 2023.
-
Automated reasoning for proving non-orderability of groups
Authors:
Alexei Lisitsa,
Zipei Nie,
Alexei Vernitski
Abstract:
We demonstrate how a generic automated theorem prover can be applied to establish the non-orderability of groups. Our approach incorporates various tools such as positive cones, torsions, generalised torsions and cofinal elements.
Submitted 9 October, 2023;
originally announced October 2023.
-
Generation of meter-scale hydrogen plasmas and efficient, pump-depletion-limited wakefield excitation using 10 GeV electron bunches
Authors:
C. Zhang,
D. Storey,
P. San Miguel Claveria,
Z. Nie,
K. A. Marsh,
M. Hogan,
W. B. Mori,
E. Adli,
W. An,
R. Ariniello,
G. J. Cao,
C. Clarke,
S. Corde,
T. Dalichaouch,
C. E. Doss,
C. Emma,
H. Ekerfelt,
E. Gerstmayr,
S. Gessner,
C. Hansel,
A. Knetsch,
V. Lee,
F. Li,
M. Litos,
B. O'Shea
, et al. (4 additional authors not shown)
Abstract:
High repetition rates and efficient energy transfer to the accelerating beam are important for a future linear collider based on the beam-driven plasma wakefield acceleration scheme (PWFA-LC). This paper reports the first results from the Plasma Wakefield Acceleration Collaboration (E300) that are beginning to address both of these issues using the recently commissioned FACET-II facility at SLAC. We have generated meter-scale hydrogen plasmas using time-structured 10 GeV electron bunches from FACET-II, which hold the promise of dramatically increasing the repetition rate of PWFA by rapidly replenishing the gas between each shot compared to the hitherto used lithium plasmas that operate at 1-10 Hz. Furthermore, we have excited wakes in such plasmas that are suitable for high gradient particle acceleration with high drive-bunch to wake energy transfer efficiency -- a first step in achieving a high overall energy transfer efficiency. We have done this by using time-structured electron drive bunches that typically have one or more ultra-high current (>30 kA) femtosecond spike(s) superimposed on a longer (~0.4 ps) lower current (<10 kA) bunch structure. The first spike effectively field-ionizes the gas and produces a meter-scale (30-160 cm) plasma, whereas the subsequent beam charge creates a wake. The length and amplitude of the wake depend on the longitudinal current profile of the bunch and the plasma density. We find that the onset of pump depletion, when some of the drive beam electrons are nearly fully depleted of their energy, occurs for hydrogen pressure >1.5 Torr. We also show that some electrons in the rear of the bunch can gain several GeV of energy from the wake. These results are reproduced by particle-in-cell simulations using the QPAD code. At a pressure of ~2 Torr, simulation results and experimental data show that the beam transfers about 60% of its energy to the wake.
Submitted 9 October, 2023;
originally announced October 2023.
-
Narrowing the Gap between Supervised and Unsupervised Sentence Representation Learning with Large Language Model
Authors:
Mingxin Li,
Richong Zhang,
Zhijie Nie,
Yongyi Mao
Abstract:
Sentence Representation Learning (SRL) is a fundamental task in Natural Language Processing (NLP), with the Contrastive Learning of Sentence Embeddings (CSE) being the mainstream technique due to its superior performance. An intriguing phenomenon in CSE is the significant performance gap between supervised and unsupervised methods, with their only difference lying in the training data. Previous works attribute this performance gap to differences in two representation properties (alignment and uniformity). However, since alignment and uniformity only measure the results, they fail to answer "What aspects of the training data contribute to the performance gap?" and "How can the performance gap be narrowed?" In this paper, we conduct empirical experiments to answer these "What" and "How" questions. We first answer the "What" question by thoroughly comparing the behavior of supervised and unsupervised CSE during their respective training processes. From the comparison, we identify the similarity pattern as a key factor in the performance gap, and introduce a metric, called Relative Fitting Difficulty (RFD), to measure the complexity of the similarity pattern. Then, based on the insights gained from the "What" question, we tackle the "How" question by increasing the pattern complexity of the training data. We achieve this by leveraging the In-Context Learning (ICL) capability of the Large Language Model (LLM) to generate data that simulates complex patterns. By utilizing the hierarchical patterns in the LLM-generated data, we effectively narrow the gap between supervised and unsupervised CSE. We release our code and appendix at https://github.com/BDBC-KG-NLP/NGCSE.
Submitted 19 December, 2023; v1 submitted 12 September, 2023;
originally announced September 2023.
-
Code-Style In-Context Learning for Knowledge-Based Question Answering
Authors:
Zhijie Nie,
Richong Zhang,
Zhongyuan Wang,
Xudong Liu
Abstract:
Current methods for Knowledge-Based Question Answering (KBQA) usually rely on complex training techniques and model frameworks, leading to many limitations in practical applications. Recently, the emergence of In-Context Learning (ICL) capabilities in Large Language Models (LLMs) provides a simple and training-free semantic parsing paradigm for KBQA: given a small number of questions and their labeled logical forms as demo examples, LLMs can understand the task intent and generate the logical form for a new question. However, current powerful LLMs have little exposure to logical forms during pre-training, resulting in a high format error rate. To solve this problem, we propose a code-style in-context learning method for KBQA, which converts the generation process of unfamiliar logical forms into the more familiar code generation process for LLMs. Experimental results on three mainstream datasets show that our method dramatically mitigates the formatting error problem in generating logical forms while achieving a new SOTA on WebQSP, GrailQA, and GraphQ under the few-shot setting. The code and supplementary files are released at https://github.com/Arthurizijar/KB-Coder.
Submitted 5 January, 2024; v1 submitted 9 September, 2023;
originally announced September 2023.
-
Semidiscrete optical vortex droplets in quasi-phase-matched photonic crystals
Authors:
Xiaoxi Xu,
Feiyan Zhao,
Jiayao Huang,
Hehe Xiang,
Li Zhang,
Zhaopin Chen,
Zhongquan Nie,
Boris A Malomed,
Yongyao Li
Abstract:
A new scheme for producing semidiscrete self-trapped vortices ("swirling photon droplets") in photonic crystals with competing quadratic ($χ^{(2)}$) and self-defocusing cubic ($χ^{(3)}$) nonlinearities is proposed. The photonic crystal is designed with a striped structure, in the form of spatially periodic modulation of the $χ^{(2)}$ susceptibility, which is imposed by the quasi-phase-matching technique. Unlike previous realizations of semidiscrete optical modes in composite media, built as combinations of continuous and arrayed discrete waveguides, the semidiscrete vortex droplets are produced here in the fully continuous medium. This work reveals that the system supports two types of semidiscrete vortex droplets, viz., onsite- and intersite-centered ones, which feature, respectively, odd and even numbers of stripes, $\mathcal{N}$. Stability areas for the states with different values of $\mathcal{N}$ are identified in the system's parameter space. Some stability areas overlap with each other, giving rise to multistability of states with different $\mathcal{N}$. The coexisting states are mutually degenerate, featuring equal values of the Hamiltonian and propagation constant. An experimental scheme to realize the droplets is outlined, suggesting new possibilities for the long-distance transmission of structured light carrying orbital angular momentum in nonlinear media.
Submitted 15 September, 2023; v1 submitted 31 August, 2023;
originally announced August 2023.
-
BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine
Authors:
Yizhen Luo,
Jiahuan Zhang,
Siqi Fan,
Kai Yang,
Yushuai Wu,
Mu Qiao,
Zaiqing Nie
Abstract:
Foundation models (FMs) have exhibited remarkable performance across a wide range of downstream tasks in many domains. Nevertheless, general-purpose FMs often face challenges when confronted with domain-specific problems, due to their limited access to the proprietary training data in a particular domain. In biomedicine, there are various biological modalities, such as molecules, proteins, and cells, which are encoded by the language of life and exhibit significant modality gaps with human natural language. In this paper, we introduce BioMedGPT, an open multimodal generative pre-trained transformer (GPT) for biomedicine, to bridge the gap between the language of life and human natural language. BioMedGPT allows users to easily "communicate" with diverse biological modalities through free text, which is the first of its kind. BioMedGPT aligns different biological modalities with natural language via a large generative language model, namely, BioMedGPT-LM. We publish BioMedGPT-10B, which unifies the feature spaces of molecules, proteins, and natural language via encoding and alignment. Through fine-tuning, BioMedGPT-10B outperforms or is on par with human and significantly larger general-purpose foundation models on the biomedical QA task. It also demonstrates promising performance in the molecule QA and protein QA tasks, which could greatly accelerate the discovery of new drugs and therapeutic targets. In addition, BioMedGPT-LM-7B is the first large generative language model based on Llama2 in the biomedical domain, and is therefore commercially friendly. Both BioMedGPT-10B and BioMedGPT-LM-7B are open-sourced to the research community. In addition, we publish the datasets that are meticulously curated for the alignment of multi-modalities, i.e., PubChemQA and UniProtQA. All the models, code, and datasets are available at https://github.com/PharMolix/OpenBioMed.
Submitted 21 August, 2023; v1 submitted 18 August, 2023;
originally announced August 2023.
-
Simpler Analyses of Union-Find
Authors:
Zhiyi Huang,
Chris Lambert,
Zipei Nie,
Richard Peng
Abstract:
We analyze union-find using potential functions motivated by continuous algorithms, and give alternate proofs of the $O(\log\log{n})$, $O(\log^{*}n)$, $O(\log^{**}n)$, and $O(α(n))$ amortized cost upper bounds. The proof of the $O(\log\log{n})$ amortized bound goes as follows. Let each node's potential be the square root of its size, i.e., the size of the subtree rooted at it. The overall potential increase is $O(n)$ because the node sizes increase geometrically along any tree path. When compressing a path, each node on the path satisfies one of two conditions: either its potential decreases by $Ω(1)$, or its child's size along the path is less than the square root of its size. The latter can happen at most $O(\log\log{n})$ times along any tree path.
Submitted 17 August, 2023;
originally announced August 2023.
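The potential function described in the union-find abstract above can be explored with a small sketch. The following is a minimal illustration, not the authors' code: it assumes union by size together with full path compression, and the class and method names (`UnionFind`, `potential`) are hypothetical. Each node's `size` is kept equal to its current subtree size, so it shrinks when path compression moves descendants away, and `potential()` returns the sum of the square roots of all sizes.

```python
import math


class UnionFind:
    """Union by size with path compression, tracking current subtree sizes."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n  # size[v] = number of nodes in the subtree rooted at v

    def find(self, x):
        # Collect the non-root nodes on the path from x up to the root.
        path = []
        while self.parent[x] != x:
            path.append(x)
            x = self.parent[x]
        root = x
        # After compression every path node hangs directly off the root, so each
        # path[i] loses the subtree of path[i - 1]. Go top-down so that the size
        # being subtracted has not been modified yet.
        for i in range(len(path) - 1, 0, -1):
            self.size[path[i]] -= self.size[path[i - 1]]
        for node in path:
            self.parent[node] = root
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree below the larger root
        self.size[ra] += self.size[rb]
        return True

    def potential(self):
        # Sum over all nodes of sqrt(subtree size), as in the abstract's potential.
        return sum(math.sqrt(s) for s in self.size)


uf = UnionFind(8)
for a, b in [(0, 1), (2, 3), (4, 5), (6, 7), (0, 2), (4, 6), (0, 4)]:
    uf.union(a, b)
before = uf.potential()
uf.find(7)  # compresses the path 7 -> 6 -> 4 -> 0
after = uf.potential()
```

Comparing `before` and `after` shows how a single compression lowers the total potential, which is the quantity the amortized argument charges against.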
-
QUEST: Query Stream for Practical Cooperative Perception
Authors:
Siqi Fan,
Haibao Yu,
Wenxian Yang,
Jirui Yuan,
Zaiqing Nie
Abstract:
Cooperative perception can effectively enhance individual perception performance by providing additional viewpoints and expanding the sensing field. Existing cooperation paradigms are either interpretable (result cooperation) or flexible (feature cooperation). In this paper, we propose the concept of query cooperation to enable interpretable instance-level flexible feature interaction. To explain the concept concretely, we propose a cooperative perception framework, termed QUEST, which lets query streams flow among agents. The cross-agent queries interact via fusion for co-aware instances and complementation for individually unaware instances. Taking camera-based vehicle-infrastructure perception as a typical practical application scene, the experimental results on the real-world dataset, DAIR-V2X-Seq, demonstrate the effectiveness of QUEST and further reveal the advantage of the query cooperation paradigm in transmission flexibility and robustness to packet dropout. We hope our work can further facilitate cross-agent representation interaction for better cooperative perception in practice.
Submitted 22 May, 2024; v1 submitted 3 August, 2023;
originally announced August 2023.
-
LiveRetro: Visual Analytics for Strategic Retrospect in Livestream E-Commerce
Authors:
Yuchen Wu,
Yuansong Xu,
Shenghan Gao,
Xingbo Wang,
Wenkai Song,
Zhiheng Nie,
Xiaomeng Fan,
Quan Li
Abstract:
Livestream e-commerce integrates live streaming and online shopping, allowing viewers to make purchases while watching. However, effective marketing strategies remain a challenge due to limited empirical research and subjective biases from the absence of quantitative data. Current tools fail to capture the interdependence between live performances and feedback. This study identified computational features, formulated design requirements, and developed LiveRetro, an interactive visual analytics system. It enables comprehensive retrospective analysis of livestream e-commerce for streamers, viewers, and merchandise. LiveRetro employs enhanced visualization and time-series forecasting models to align performance features and feedback, identifying influences at channel, merchandise, feature, and segment levels. Through case studies and expert interviews, the system provides deep insights into the relationship between live performance and streaming statistics, enabling efficient strategic analysis from multiple perspectives.
Submitted 2 August, 2023; v1 submitted 22 July, 2023;
originally announced July 2023.
-
MolFM: A Multimodal Molecular Foundation Model
Authors:
Yizhen Luo,
Kai Yang,
Massimo Hong,
Xing Yi Liu,
Zaiqing Nie
Abstract:
Molecular knowledge resides within three different modalities of information sources: molecular structures, biomedical documents, and knowledge bases. Effective incorporation of molecular knowledge from these modalities holds paramount significance in facilitating biomedical research. However, existing multimodal molecular foundation models exhibit limitations in capturing intricate connections between molecular structures and texts, and more importantly, none of them attempt to leverage a wealth of molecular expertise derived from knowledge graphs. In this study, we introduce MolFM, a multimodal molecular foundation model designed to facilitate joint representation learning from molecular structures, biomedical texts, and knowledge graphs. We propose cross-modal attention between atoms of molecular structures, neighbors of molecule entities and semantically related texts to facilitate cross-modal comprehension. We provide theoretical analysis that our cross-modal pre-training captures local and global molecular knowledge by minimizing the distance in the feature space between different modalities of the same molecule, as well as molecules sharing similar structures or functions. MolFM achieves state-of-the-art performance on various downstream tasks. On cross-modal retrieval, MolFM outperforms existing models with 12.13% and 5.04% absolute gains under the zero-shot and fine-tuning settings, respectively. Furthermore, qualitative analysis showcases MolFM's implicit ability to provide grounding from molecular substructures and knowledge graphs. Code and models are available on https://github.com/BioFM/OpenBioMed.
Submitted 21 July, 2023; v1 submitted 6 June, 2023;
originally announced July 2023.
-
The holographic s+p model in 4D and 5D Einstein-Gauss-Bonnet gravity
Authors:
Xing-Kun Zhang,
Zhang-Yu Nie,
Hui Zeng,
Qiyuan Pan
Abstract:
We study the holographic s+p model in both four dimensional (4D) and five dimensional (5D) Einstein-Gauss-Bonnet (EGB) gravity. We first show a phase diagram with the Gauss-Bonnet parameter fixed to a small value $α=10^{-7}$ to choose appropriate values of $q_p/q_s$. Then we fix the value of $q_p/q_s$ and plot $α-μ$ phase diagrams to show the influence of the Gauss-Bonnet term on the phase transitions in the 4D and 5D bulk, respectively. The phase diagrams in 4D and 5D present the same qualitative features, indicating the similarity of 4D Einstein-Gauss-Bonnet gravity to the 5D case in holography. We also study the influence of the Gauss-Bonnet parameter on the special values of the fourth order nonlinear term parameters $λ_s$ and $λ_p$, below which the condensate grows in a different direction near the critical point, which is important in realizing 1st order superfluid phase transitions. In particular, we notice that these special values are different in the canonical and grand canonical ensembles, which is closely related to the study of the spinodal region, where phase separation occurs with the linear instability at finite wave vector.
Submitted 23 June, 2023;
originally announced June 2023.
-
Large-Scale Cell Representation Learning via Divide-and-Conquer Contrastive Learning
Authors:
Suyuan Zhao,
Jiahuan Zhang,
Zaiqing Nie
Abstract:
Single-cell RNA sequencing (scRNA-seq) data is a potent tool for comprehending the "language of life" and can provide insights into various downstream biomedical tasks. Large-scale language models (LLMs) are starting to be used for cell representation learning. However, current LLM-based cell representation learning methods depend solely on the BERT architecture, causing an anisotropic embedding space that leads to inefficient semantic representation. Contrastive learning alleviates this problem by distributing the embeddings uniformly. As a larger batch size in contrastive learning results in better representation, the practical application of contrastive learning in cell representation learning is hampered by the high dimensionality of scRNA-seq data and the large parameter volume of LLMs. To address the batch size limitation, we propose a novel divide-and-conquer contrastive learning approach to decouple the batch size from the GPU memory size for cell representation learning. Based on our divide-and-conquer contrastive learning approach, we introduce Single-Cell Language Model CellLM, a large-scale cell representation learning model to handle high-dimensional scRNA-seq data with tens of thousands of genes. CellLM has over 50 million parameters trained with 2 million scRNA-seq data and makes the first attempt to learn cell language models from both normal cells and cancer cells. CellLM achieves new state-of-the-art (SOTA) results in all evaluated downstream tasks: including a 71.8 F_1-score for cell type annotation (a 3.0% absolute improvement over scBERT), an average F_1-score of 88.9 for single-cell drug sensitivity prediction in a few-shot scenario (an 8.3% absolute improvement), and a 93.4 Pearson's correlation for single-omics cell line drug sensitivity prediction (a 6.2% absolute improvement).
Submitted 7 June, 2023;
originally announced June 2023.
-
V2X-Seq: A Large-Scale Sequential Dataset for Vehicle-Infrastructure Cooperative Perception and Forecasting
Authors:
Haibao Yu,
Wenxian Yang,
Hongzhi Ruan,
Zhenwei Yang,
Yingjuan Tang,
Xu Gao,
Xin Hao,
Yifeng Shi,
Yifeng Pan,
Ning Sun,
Juan Song,
Jirui Yuan,
** Luo,
Zaiqing Nie
Abstract:
Utilizing infrastructure and vehicle-side information to track and forecast the behaviors of surrounding traffic participants can significantly improve decision-making and safety in autonomous driving. However, the lack of real-world sequential datasets limits research in this area. To address this issue, we introduce V2X-Seq, the first large-scale sequential V2X dataset, which includes data frames, trajectories, vector maps, and traffic lights captured from natural scenery. V2X-Seq comprises two parts: the sequential perception dataset, which includes more than 15,000 frames captured from 95 scenarios, and the trajectory forecasting dataset, which contains about 80,000 infrastructure-view scenarios, 80,000 vehicle-view scenarios, and 50,000 cooperative-view scenarios captured from 28 intersections' areas, covering 672 hours of data. Based on V2X-Seq, we introduce three new tasks for vehicle-infrastructure cooperative (VIC) autonomous driving: VIC3D Tracking, Online-VIC Forecasting, and Offline-VIC Forecasting. We also provide benchmarks for the introduced tasks. Find data, code, and more up-to-date information at https://github.com/AIR-THU/DAIR-V2X-Seq.
Submitted 10 May, 2023;
originally announced May 2023.
-
Towards Unified AI Drug Discovery with Multiple Knowledge Modalities
Authors:
Yizhen Luo,
Xing Yi Liu,
Kai Yang,
Kui Huang,
Massimo Hong,
Jiahuan Zhang,
Yushuai Wu,
Zaiqing Nie
Abstract:
In recent years, AI models that mine intrinsic patterns from molecular structures and protein sequences have shown promise in accelerating drug discovery. However, these methods partly lag behind real-world pharmaceutical approaches of human experts that additionally grasp structured knowledge from knowledge bases and unstructured knowledge from biomedical literature. To bridge this gap, we propose KEDD, a unified, end-to-end, and multimodal deep learning framework that optimally incorporates both structured and unstructured knowledge for vast AI drug discovery tasks. The framework first extracts underlying characteristics from heterogeneous inputs, and then applies multimodal fusion for accurate prediction. To mitigate the problem of missing modalities, we leverage multi-head sparse attention and a modality masking mechanism to extract relevant information robustly. Benefiting from integrated knowledge, our framework achieves a deeper understanding of molecule entities, brings significant improvements over state-of-the-art methods on a wide range of tasks and benchmarks, and reveals its promising potential in assisting real-world drug discovery.
Submitted 14 October, 2023; v1 submitted 17 April, 2023;
originally announced May 2023.
-
Affine Toda system of $\mathbf{A}$ and $\mathbf{C}^t$ type: compactness and affine Weyl group
Authors:
Leilei Cui,
Zhaohu Nie,
Wen Yang
Abstract:
The local mass is a fundamental quantized information that characterizes the blow-up solution to the Toda system and has a profound relationship with its underlying algebraic structure. In \cite{Lin-Yang-Zhong-2020}, it was observed that the associated Weyl group can be employed to represent this information for the $\mathbf{A}_n$, $\mathbf{B}_n$, $\mathbf{C}_n$ and $\mathbf{G}_2$ type Toda systems. The relationship between the local mass of blow-up solution and the corresponding affine Weyl group is further explored for some affine $\mathbf{B}$ type Toda systems in \cite{Cui-Wei-Yang-Zhang-2022}, where the possible local masses are explicitly expressed in terms of $8$ types. The current work presents a comprehensive study of the general affine $\mathbf{A}$ and $\mathbf{C}^t$ type Toda systems with arbitrary rank. At each stage of the blow-up process (via scaling), we can employ certain elements (known as "set chains") in the corresponding affine Weyl group to measure the variation of local mass. Consequently, we obtain the a priori estimate of the affine $\mathbf{A}$ and $\mathbf{C}^t$ type Toda systems with arbitrary number of singularities.
Submitted 2 May, 2023;
originally announced May 2023.
-
Euclidean Capacitated Vehicle Routing in Random Setting: A $1.55$-Approximation Algorithm
Authors:
Zipei Nie,
Hang Zhou
Abstract:
We study the unit-demand capacitated vehicle routing problem in the random setting of the Euclidean plane. The objective is to visit $n$ random terminals in a square using a set of tours of minimum total length, such that each tour visits the depot and at most $k$ terminals.
We design an elegant algorithm combining the classical sweep heuristic and Arora's framework for the Euclidean traveling salesman problem [Journal of the ACM 1998]. We show that our algorithm is a polynomial-time approximation of ratio at most $1.55$ asymptotically almost surely. This improves on previous approximation ratios of $1.995$ due to Bompadre, Dror, and Orlin [Journal of Applied Probability 2007] and $1.915$ due to Mathieu and Zhou [Random Structures and Algorithms 2022]. In addition, we conjecture that, for any $\varepsilon>0$, our algorithm is a $(1+\varepsilon)$-approximation asymptotically almost surely.
Submitted 21 April, 2023;
originally announced April 2023.
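The classical sweep heuristic mentioned in the vehicle routing abstract above can be sketched in a few lines. This is only a minimal illustration of the sweep partition step, not the paper's algorithm: it does not implement Arora's framework, it simply visits each group in angular order as a stand-in for solving a traveling salesman problem per tour, and the function names `sweep_tours` and `tour_length` are hypothetical.

```python
import math


def sweep_tours(depot, terminals, k):
    """Sort terminals by polar angle around the depot and cut the
    sorted sequence into consecutive groups of at most k terminals."""
    dx, dy = depot
    ordered = sorted(terminals, key=lambda p: math.atan2(p[1] - dy, p[0] - dx))
    return [ordered[i:i + k] for i in range(0, len(ordered), k)]


def tour_length(depot, tour):
    """Length of depot -> tour stops in the given order -> depot."""
    stops = [depot, *tour, depot]
    return sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))


depot = (0.0, 0.0)
terminals = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
tours = sweep_tours(depot, terminals, k=2)
total = sum(tour_length(depot, t) for t in tours)
```

Here every tour starts and ends at the depot and serves at most `k` terminals, matching the capacity constraint stated in the abstract; the quality of the within-tour ordering is where a proper TSP routine would replace the angular order used in this sketch.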