-
Tensor Network Computations That Capture Strict Variationality, Volume Law Behavior, and the Efficient Representation of Neural Network States
Authors:
Wen-Yuan Liu,
Si-Jing Du,
Ruojing Peng,
Johnnie Gray,
Garnet Kin-Lic Chan
Abstract:
We introduce a change of perspective on tensor network states that is defined by the computational graph of the contraction of an amplitude. The resulting class of states, which we refer to as tensor network functions, inherit the conceptual advantages of tensor network states while removing computational restrictions arising from the need to converge approximate contractions. We use tensor network functions to compute strict variational estimates of the energy on loopy graphs, analyze their expressive power for ground-states, show that we can capture aspects of volume law time evolution, and provide a mapping of general feed-forward neural nets onto efficient tensor network functions. Our work expands the realm of computable tensor networks to ones where accurate contraction methods are not available, and opens up new avenues to use tensor networks.
Submitted 21 May, 2024; v1 submitted 6 May, 2024;
originally announced May 2024.
-
Are We in The Zone? Exploring The Features and Method of Detecting Simultaneous Flow Experiences Based on EEG Signals
Authors:
Baiqiao Zhang,
Xiangxian Li,
Yunfan Zhou,
Juan Liu,
Weiying Liu,
Chao Zhou,
Yulong Bian
Abstract:
When executing interdependent personal tasks for the team's purpose, simultaneous individual flow (simultaneous flow) is the antecedent condition of achieving shared team flow. Detecting simultaneous flow helps to better understand the status of team members, which is thus important for optimizing multi-user interaction systems. However, there is currently a lack of exploration of objective features and methods for detecting simultaneous flow. Based on the brain mechanism of flow in teamwork and previous studies on electroencephalogram (EEG)-based individual flow detection, this study aims to explore the significant EEG features related to simultaneous flow, as well as effective detection methods based on EEG signals. First, a two-player simultaneous flow task is designed, based on which we construct the first multi-EEG signals dataset of simultaneous flow. Then, we explore the potential EEG signal features that may be related to individual and simultaneous flow and validate their effectiveness in simultaneous flow detection with various machine learning models. The results show that 1) the inter-brain synchrony features are relevant to simultaneous flow, as they enhance the models' performance in detecting different types of simultaneous flow; 2) the features from the frontal lobe area seem to be given priority attention when detecting simultaneous flows; 3) Random Forests performed best in binary classification while Neural Network and Deep Neural Network3 performed best in ternary classification.
Submitted 3 May, 2024;
originally announced May 2024.
-
Beyond Single-Event Extraction: Towards Efficient Document-Level Multi-Event Argument Extraction
Authors:
Wanlong Liu,
Li Zhou,
Dingyi Zeng,
Yichen Xiao,
Shaohuan Cheng,
Chen Zhang,
Grandee Lee,
Malu Zhang,
Wenyu Chen
Abstract:
Recent mainstream event argument extraction methods process each event in isolation, resulting in inefficient inference and ignoring the correlations among multiple events. To address these limitations, here we propose a multiple-event argument extraction model DEEIA (Dependency-guided Encoding and Event-specific Information Aggregation), capable of extracting arguments from all events within a document simultaneously. The proposed DEEIA model employs a multi-event prompt mechanism, comprising DE and EIA modules. The DE module is designed to improve the correlation between prompts and their corresponding event contexts, whereas the EIA module provides event-specific information to improve contextual understanding. Extensive experiments show that our method achieves new state-of-the-art performance on four public datasets (RAMS, WikiEvents, MLEE, and ACE05), while significantly reducing inference time compared to the baselines. Further analyses demonstrate the effectiveness of the proposed modules.
Submitted 16 June, 2024; v1 submitted 3 May, 2024;
originally announced May 2024.
-
Solving the train-platforming problem via a two-level Lagrangian Relaxation approach
Authors:
Qin Zhang,
Richard Martin Lusby,
Pan Shang,
Chang Liu,
Wenqian Liu
Abstract:
High-speed railway stations are crucial junctions in high-speed railway networks. Compared to operations on the tracks between stations, trains have more routing possibilities within stations. As a result, track allocation at a station is relatively complicated. In this study, we aim to solve the train platforming problem for a busy high-speed railway station by considering comprehensive track resources and interlocking configurations. A two-level space-time network is constructed to capture infrastructure information at various levels of detail from both macroscopic and microscopic perspectives. Additionally, we propose a nonlinear programming model that minimizes a weighted sum of total travel time and total deviation time for trains at the station. We apply a Two-level Lagrangian Relaxation (2-L LR) to a linearized version of the model and demonstrate how this induces a decomposable train-specific path choice problem at the macroscopic level that is guided by Lagrange multipliers associated with microscopic resource capacity violation. As case studies, the proposed model and solution approach are applied to a small virtual railway station and a high-speed railway hub station located on the busiest high-speed railway line in China. Through a comparison with other approaches, including Logic-based Benders Decomposition (LBBD), we highlight the superiority of the proposed method; on realistic instances, the 2-L LR method finds solutions that are, on average, approximately 2% from optimality. Finally, we test algorithm performance at the operational level and obtain near-optimal solutions, with optimality gaps of approximately 1%, in a very short time.
Submitted 2 May, 2024;
originally announced May 2024.
-
A matter of performance & criticality: a review of rare-earth-based magnetocaloric intermetallic compounds for hydrogen liquefaction
Authors:
Wei Liu,
Tino Gottschall,
Franziska Scheibel,
Eduard Bykov,
Alex Aubert,
Nuno Fortunato,
Benedikt Beckmann,
Allan M. Döring,
Hongbin Zhang,
Konstantin Skokov,
Oliver Gutfleisch
Abstract:
The low efficiency of conventional liquefaction technologies based on the Joule-Thomson expansion makes liquid hydrogen currently not attractive enough for large-scale energy-related technologies that are important for the transition to a carbon-neutral society. Magnetocaloric hydrogen liquefaction has great potential to achieve higher efficiency and is therefore a crucial enabler for affordable liquid hydrogen. Cost-effective magnetocaloric materials with large magnetic entropy and adiabatic temperature changes in the temperature range of 77 $\sim$ 20 K under commercially practicable magnetic fields are the foundation for the success of magnetocaloric hydrogen liquefaction. Heavy rare-earth-based magnetocaloric intermetallic compounds generally show excellent magnetocaloric performances, but the heavy rare-earth elements (Gd, Tb, Dy, Ho, Er, and Tm) are highly critical in resources. Yttrium and light rare-earth elements (La, Ce, Pr, and Nd) are relatively abundant, but their alloys generally show less excellent magnetocaloric properties. A dilemma appears: higher performance or lower criticality? In this review, we study how cryogenic temperature influences magnetocaloric performance by first reviewing heavy rare-earth-based intermetallic compounds. Next, we look at light rare-earth-based, "mixed" rare-earth-based, and Gd-based intermetallic compounds with the nature of the phase transition order taken into consideration, and summarize ways to resolve the dilemma.
Submitted 16 May, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
Reed-Solomon Codes over Cyclic Polynomial Ring with Lower Encoding/Decoding Complexity
Authors:
Wenhao Liu,
Zhengyi Jiang,
Zhongyi Huang,
Linqi Song,
Hanxu Hou
Abstract:
Reed-Solomon (RS) codes, which are constructed over a finite field, have been widely employed in storage and communication systems. Many fast encoding/decoding algorithms, such as the fast Fourier transform (FFT) and the modular approach, are designed for RS codes to reduce the encoding/decoding complexity, defined as the number of XORs involved in the encoding/decoding procedure. In this paper, we present the construction of RS codes over the cyclic polynomial ring $ \mathbb{F}_2[x]/(1+x+\ldots+x^{p-1})$ and show that our codes are maximum distance separable (MDS) codes. Moreover, we propose the FFT and modular approach over the ring that can be employed in our codes for encoding/decoding complexity reduction. We show that our codes have 17.9\% encoding complexity reduction and 7.5\% decoding complexity reduction compared with RS codes over a finite field, for $(n,k)=(2048,1984)$.
Submitted 2 May, 2024;
originally announced May 2024.
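As a rough illustration of why arithmetic in the ring $\mathbb{F}_2[x]/(1+x+\ldots+x^{p-1})$ named in the entry above can be done with XORs and shifts only, here is a minimal Python sketch. It is not taken from the paper: the function names `reduce_mod_mp` and `ring_mul` and the bitmask representation are assumptions made for this example. It relies only on the standard identity $x^p + 1 = (x+1)(1+x+\ldots+x^{p-1})$ over $\mathbb{F}_2$, so a product can first be reduced modulo $x^p + 1$ (a cyclic shift) and then modulo the all-ones polynomial (one conditional XOR).

```python
def reduce_mod_mp(v: int, p: int) -> int:
    """Reduce a polynomial of degree < p modulo M_p(x) = 1 + x + ... + x^(p-1).

    Polynomials over F_2 are stored as bitmasks: bit i is the coefficient of x^i.
    (Hypothetical helper for illustration, not the paper's algorithm.)
    """
    all_ones = (1 << p) - 1          # bitmask of M_p(x)
    if (v >> (p - 1)) & 1:           # coefficient of x^(p-1) is set
        v ^= all_ones                # subtract M_p(x); over F_2 this is XOR
    return v


def ring_mul(a: int, b: int, p: int) -> int:
    """Multiply a and b in F_2[x]/(M_p(x)); both inputs must fit in p bits."""
    mask = (1 << p) - 1
    acc = 0
    for i in range(p):
        if (b >> i) & 1:
            # a * x^i modulo x^p + 1 is a cyclic left shift of the p-bit word
            rotated = ((a << i) | (a >> (p - i))) & mask
            acc ^= rotated           # addition over F_2 is XOR
    return reduce_mod_mp(acc, p)


if __name__ == "__main__":
    p = 5
    # x^2 * x^2 = x^4, which reduces to 1 + x + x^2 + x^3
    print(bin(ring_mul(0b00100, 0b00100, p)))
```

Representing each ring element as a `p`-bit word means every multiplication by a power of `x` is a rotation, which is the kind of structure XOR-counting complexity measures reward; how the paper's FFT and modular approach exploit the ring is not shown here.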
-
Virtual Psychedelia
Authors:
Jacob Yenney,
Weichen Liu,
Ying C. Wu
Abstract:
We present an approach to designing 3D Iterated Function Systems (IFS) within the Unity Editor and rendering them to VR in real time. Objects are modeled as a hierarchical tree of primitive shapes and operators, editable through a graphical user interface that allows artists to develop psychedelic scenes with little to no coding knowledge. The system is also easily extensible, so more advanced users can add their own primitive shapes and operators.
Submitted 1 May, 2024;
originally announced May 2024.
-
Adaptive Bidirectional Displacement for Semi-Supervised Medical Image Segmentation
Authors:
Hanyang Chi,
Jian Pang,
Bingfeng Zhang,
Weifeng Liu
Abstract:
Consistency learning is a central strategy to tackle unlabeled data in semi-supervised medical image segmentation (SSMIS), which enforces the model to produce consistent predictions under the perturbation. However, most current approaches solely focus on utilizing a specific single perturbation, which can only cope with limited cases, while employing multiple perturbations simultaneously is hard to guarantee the quality of consistency learning. In this paper, we propose an Adaptive Bidirectional Displacement (ABD) approach to solve the above challenge. Specifically, we first design a bidirectional patch displacement based on reliable prediction confidence for unlabeled data to generate new samples, which can effectively suppress uncontrollable regions and still retain the influence of input perturbations. Meanwhile, to enforce the model to learn the potentially uncontrollable content, a bidirectional displacement operation with inverse confidence is proposed for the labeled images, which generates samples with more unreliable information to facilitate model learning. Extensive experiments show that ABD achieves new state-of-the-art performances for SSMIS, significantly improving different baselines. Source code is available at https://github.com/chy-upc/ABD.
Submitted 1 May, 2024;
originally announced May 2024.
-
Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners
Authors:
Chun Feng,
Joy Hsu,
Weiyu Liu,
Jiajun Wu
Abstract:
3D visual grounding is a challenging task that often requires direct and dense supervision, notably the semantic label for each object in the scene. In this paper, we instead study the naturally supervised setting that learns from only 3D scene and QA pairs, where prior works underperform. We propose the Language-Regularized Concept Learner (LARC), which uses constraints from language as regularization to significantly improve the accuracy of neuro-symbolic concept learners in the naturally supervised setting. Our approach is based on two core insights: the first is that language constraints (e.g., a word's relation to another) can serve as effective regularization for structured representations in neuro-symbolic models; the second is that we can query large language models to distill such constraints from language properties. We show that LARC improves performance of prior works in naturally supervised 3D visual grounding, and demonstrates a wide range of 3D visual reasoning capabilities, from zero-shot composition to data efficiency and transferability. Our method represents a promising step towards regularizing structured visual reasoning frameworks with language-based priors, for learning in settings without dense supervision.
Submitted 30 April, 2024;
originally announced April 2024.
-
UniFS: Universal Few-shot Instance Perception with Point Representations
Authors:
Sheng Jin,
Ruijie Yao,
Lumin Xu,
Wentao Liu,
Chen Qian,
Ji Wu,
Ping Luo
Abstract:
Instance perception tasks (object detection, instance segmentation, pose estimation, counting) play a key role in industrial applications of visual models. As supervised learning methods suffer from high labeling cost, few-shot learning methods which effectively learn from a limited number of labeled examples are desired. Existing few-shot learning methods primarily focus on a restricted set of tasks, presumably due to the challenges involved in designing a generic model capable of representing diverse tasks in a unified manner. In this paper, we propose UniFS, a universal few-shot instance perception model that unifies a wide range of instance perception tasks by reformulating them into a dynamic point representation learning framework. Additionally, we propose Structure-Aware Point Learning (SAPL) to exploit the higher-order structural relationship among points to further enhance representation learning. Our approach makes minimal assumptions about the tasks, yet it achieves competitive results compared to highly specialized and well optimized specialist models. Codes will be released soon.
Submitted 30 April, 2024;
originally announced April 2024.
-
Two-way Homogeneity Pursuit for Quantile Network Vector Autoregression
Authors:
Wenyang Liu,
Ganggang Xu,
Jianqing Fan,
Xuening Zhu
Abstract:
While the Vector Autoregression (VAR) model has received extensive attention for modelling complex time series, quantile VAR analysis remains relatively underexplored for high-dimensional time series data. To address this disparity, we introduce a two-way grouped network quantile (TGNQ) autoregression model for time series collected on large-scale networks, known for their significant heterogeneous and directional interactions among nodes. Our proposed model simultaneously conducts node clustering and model estimation to balance complexity and interpretability. To account for the directional influence among network nodes, each network node is assigned two latent group memberships that can be consistently estimated using our proposed estimation procedure. Theoretical analysis demonstrates the consistency of membership and parameter estimators even with an overspecified number of groups. With the correct group specification, estimated parameters are proven to be asymptotically normal, enabling valid statistical inferences. Moreover, we propose a quantile information criterion for consistently selecting the number of groups. Simulation studies show promising finite sample performance, and we apply the methodology to analyze connectedness and risk spillover effects among Chinese A-share stocks.
Submitted 29 April, 2024;
originally announced April 2024.
-
Entanglement enhancement of two giant atoms with multiple connection points in bidirectional-chiral quantum waveguide-QED system
Authors:
Jie Liu,
Yue Cai,
Kang-Jie Ma,
Lei Tan,
Wu-Ming Liu
Abstract:
We study the entanglement generation of two giant atoms within a one-dimensional bidirectional-chiral waveguide quantum electrodynamics (QED) system, where the initial state of the two giant atoms is $|e_a,g_b\rangle $. Here, each giant atom is coupled to the waveguide through three connection points, with the configurations divided into five types based on the arrangement of coupling points between the giant atoms and the waveguide: separate, fully braided, partially braided, fully nested, and partially nested. We explore the entanglement generation process within each configuration in both nonchiral and chiral coupling cases. It is demonstrated that entanglement can be controlled as needed by either adjusting the phase shift or selecting different configurations. For nonchiral coupling, the entanglement of each configuration exhibits steady-state properties attributable to the presence of a dark state. In addition, we find that steady-state entanglement can be obtained at more phase shifts in certain configurations by increasing the number of coupling points between the giant atoms and the bidirectional waveguide. In the case of chiral coupling, the entanglement is maximally enhanced compared to that of the nonchiral case. Especially in the fully braided configuration, the concurrence reaches its peak value of 1, which is robust to chirality. We further show the influence of atomic initial states on the evolution of interatomic entanglement. Our scheme can be used for entanglement generation in chiral quantum networks of giant-atom waveguide-QED systems, with potential applications in quantum networks and quantum communications.
Submitted 28 April, 2024;
originally announced April 2024.
-
Studies on Topological High-fold Degenerate Semimetal with Chiral Structure
Authors:
Yan Wang,
Xiaosong Bai,
Wujun Shi,
Wenjian Liu,
Qiunan Xu
Abstract:
In recent years, a type of topological semimetals (TSMs) that can host new fermions with high-fold degeneracy has attracted considerable interest. Among them, those with a chiral structure particularly catch our attention. Such chiral high-fold degenerate semimetals always have a larger topological charge and longer Fermi arcs, which bring about some special properties. In this work, we found 147 chiral materials with exotic fermions near the Fermi level by high-throughput calculation and screening. We selected some typical examples to analyse their topological properties, such as topological surface states (TSSs) and Berry curvature. Our results help provide a promising platform for exploring the physical properties of chiral fermions and the application of chiral TSMs.
Submitted 26 April, 2024;
originally announced April 2024.
-
Atomas: Hierarchical Alignment on Molecule-Text for Unified Molecule Understanding and Generation
Authors:
Yikun Zhang,
Geyan Ye,
Chaohao Yuan,
Bo Han,
Long-Kai Huang,
Jianhua Yao,
Wei Liu,
Yu Rong
Abstract:
Molecule-and-text cross-modal representation learning has emerged as a promising direction for enhancing the quality of molecular representation, thereby improving performance in various scientific fields, including drug discovery and materials science. Existing studies adopt a global alignment approach to learn the knowledge from different modalities. These global alignment approaches fail to capture fine-grained information, such as molecular fragments and their corresponding textual descriptions, which is crucial for downstream tasks. Furthermore, such information cannot be modeled with a similar global alignment strategy, because existing datasets contain little paired data with local part annotations. In this paper, we propose Atomas, a multi-modal molecular representation learning framework to jointly learn representations from SMILES strings and text. We design a Hierarchical Adaptive Alignment model to concurrently learn the fine-grained fragment correspondence between two modalities and align these representations of fragments in three levels. Additionally, Atomas's end-to-end training framework incorporates the tasks of understanding and generating molecules, thereby supporting a wider range of downstream tasks. In the retrieval task, Atomas exhibits robust generalization ability and outperforms the baseline by 30.8% of recall@1 on average. In the generation task, Atomas achieves state-of-the-art results in both the molecule captioning task and the molecule generation task. Moreover, the visualization of the Hierarchical Adaptive Alignment model further confirms the chemical significance of our approach. Our codes can be found at https://anonymous.4open.science/r/Atomas-03C3.
Submitted 23 April, 2024;
originally announced April 2024.
-
Functional Protein Design with Local Domain Alignment
Authors:
Chaohao Yuan,
Songyou Li,
Geyan Ye,
Yikun Zhang,
Long-Kai Huang,
Wenbing Huang,
Wei Liu,
Jianhua Yao,
Yu Rong
Abstract:
The core challenge of de novo protein design lies in creating proteins with specific functions or properties, guided by certain conditions. Current models explore generating proteins using structural and evolutionary guidance, which only provide indirect conditions concerning functions and properties. However, textual annotations of proteins, especially the annotations for protein domains, which directly describe the protein's high-level functionalities, properties, and their correlation with target amino acid sequences, remain unexplored in the context of protein design tasks. In this paper, we propose Protein-Annotation Alignment Generation (PAAG), a multi-modality protein design framework that integrates the textual annotations extracted from protein databases for controllable generation in sequence space. Specifically, within a multi-level alignment module, PAAG can explicitly generate proteins containing specific domains conditioned on the corresponding domain annotations, and can even design novel proteins with flexible combinations of different kinds of annotations. Our experimental results underscore the superiority of the aligned protein representations from PAAG over 7 prediction tasks. Furthermore, PAAG demonstrates a nearly sixfold increase in generation success rate (24.7% vs 4.7% in zinc finger, and 54.3% vs 8.7% in the immunoglobulin domain) in comparison to the existing model.
Submitted 27 May, 2024; v1 submitted 18 April, 2024;
originally announced April 2024.
-
Machine Unlearning in Large Language Models
Authors:
Kongyang Chen,
Zixin Wang,
Bing Mi,
Waixi Liu,
Shaowei Wang,
Xiaojun Ren,
Jiaxing Shen
Abstract:
Recently, large language models (LLMs) have emerged as a notable field, attracting significant attention for their ability to automatically generate intelligent content for various application domains. However, LLMs still suffer from significant security and privacy issues. For example, LLMs might expose user privacy through hacking attacks or targeted prompts. To address this problem, this paper introduces a novel machine unlearning framework into LLMs. Our objectives are to make LLMs not produce harmful, hallucinatory, or privacy-compromising responses, while retaining their standard output capabilities. To accomplish this, we use an evaluative model to pinpoint dialogues needing unlearning. We also establish a distance loss to function as the model's negative loss, diverting it from previous undesirable outputs. Furthermore, we determine the expected output's cluster mean to formulate a positive loss, directing the model's outputs toward preferable outcomes without compromising its reasoning abilities and performance. Experimental results show that our approach effectively meets unlearning objectives without substantially compromising model performance.
Submitted 3 February, 2024;
originally announced April 2024.
-
Energy-Latency Manipulation of Multi-modal Large Language Models via Verbose Samples
Authors:
Kuofeng Gao,
Jindong Gu,
Yang Bai,
Shu-Tao Xia,
Philip Torr,
Wei Liu,
Zhifeng Li
Abstract:
Despite the exceptional performance of multi-modal large language models (MLLMs), their deployment requires substantial computational resources. Once malicious users induce high energy consumption and latency time (energy-latency cost), it will exhaust computational resources and harm availability of service. In this paper, we investigate this vulnerability for MLLMs, particularly image-based and video-based ones, and aim to induce high energy-latency cost during inference by crafting an imperceptible perturbation. We find that high energy-latency cost can be manipulated by maximizing the length of generated sequences, which motivates us to propose verbose samples, including verbose images and videos. Concretely, two modality non-specific losses are proposed, including a loss to delay end-of-sequence (EOS) token and an uncertainty loss to increase the uncertainty over each generated token. In addition, improving diversity is important to encourage longer responses by increasing the complexity, which inspires the following modality specific loss. For verbose images, a token diversity loss is proposed to promote diverse hidden states. For verbose videos, a frame feature diversity loss is proposed to increase the feature diversity among frames. To balance these losses, we propose a temporal weight adjustment algorithm. Experiments demonstrate that our verbose samples can largely extend the length of generated sequences.
Submitted 25 April, 2024;
originally announced April 2024.
-
Boosting Architectural Generation via Prompts: Report
Authors:
Xin Zhang,
Wenwen Liu
Abstract:
In the realm of AI architectural design, the importance of prompts is becoming increasingly prominent. With advancements in artificial intelligence and large-scale model technology, more design tasks are being delegated to machine learning algorithms. This necessitates a method for designers to guide algorithms in producing their desired designs. Prompts serve as a guiding and motivational mechanism, playing a crucial role in AI-generated architectural design. This paper categorizes and summarizes common vocabulary used in architectural design, discussing how to craft effective prompts and their impact on the quality and creativity of generated results. Through careful prompt design, designers can better control the generated architectural design images, thereby achieving designs that are more aligned with requirements and innovative.
Submitted 24 April, 2024;
originally announced April 2024.
-
NMBEnet: Efficient Near-field mmWave Beam Training for Multiuser OFDM Systems Using Sub-6 GHz Pilots
Authors:
Wang Liu,
Cunhua Pan,
Hong Ren,
Cheng-Xiang Wang,
Jiangzhou Wang,
Xiaohu You
Abstract:
Combining millimetre-wave (mmWave) communications with an extremely large-scale antenna array (ELAA) presents a promising avenue for meeting the spectral efficiency demands of the future sixth generation (6G) mobile communications. However, beam training for mmWave ELAA systems is challenged by excessive pilot overheads as well as insufficient accuracy, as the huge near-field codebook has to be accounted for. In this paper, inspired by the similarity between far-field sub-6 GHz channels and near-field mmWave channels, we propose to leverage sub-6 GHz uplink pilot signals to directly estimate the optimal near-field mmWave codeword, which aims to reduce pilot overhead and bypass the channel estimation. Moreover, we adopt deep learning to perform this dual mapping function, i.e., sub-6 GHz to mmWave, far-field to near-field, and a novel neural network structure called NMBEnet is designed to enhance the precision of beam training. Specifically, when considering the orthogonal frequency division multiplexing (OFDM) communication scenarios with high user density, correlations arise both between signals from different users and between signals from different subcarriers. Accordingly, the convolutional neural network (CNN) module and graph neural network (GNN) module included in the proposed NMBEnet can leverage these two correlations to further enhance the precision of beam training.
Submitted 23 April, 2024;
originally announced April 2024.
-
Re-Thinking Inverse Graphics With Large Language Models
Authors:
Peter Kulits,
Haiwen Feng,
Weiyang Liu,
Victoria Abrevaya,
Michael J. Black
Abstract:
Inverse graphics -- the task of inverting an image into physical variables that, when rendered, enable reproduction of the observed scene -- is a fundamental challenge in computer vision and graphics. Disentangling an image into its constituent elements, such as the shape, color, and material properties of the objects of the 3D scene that produced it, requires a comprehensive understanding of the environment. This requirement limits the ability of existing carefully engineered approaches to generalize across domains. Inspired by the zero-shot ability of large language models (LLMs) to generalize to novel contexts, we investigate the possibility of leveraging the broad world knowledge encoded in such models in solving inverse-graphics problems. To this end, we propose the Inverse-Graphics Large Language Model (IG-LLM), an inverse-graphics framework centered around an LLM, that autoregressively decodes a visual embedding into a structured, compositional 3D-scene representation. We incorporate a frozen pre-trained visual encoder and a continuous numeric head to enable end-to-end training. Through our investigation, we demonstrate the potential of LLMs to facilitate inverse graphics through next-token prediction, without the use of image-space supervision. Our analysis opens up new possibilities for precise spatial reasoning about images that exploit the visual knowledge of LLMs. We will release our code and data to ensure the reproducibility of our investigation and to facilitate future research at https://ig-llm.is.tue.mpg.de/
Submitted 23 April, 2024;
originally announced April 2024.
-
Simulation-Free Determination of Microstructure Representative Volume Element Size via Fisher Scores
Authors:
Wei Liu,
Satyajit Mojumder,
Wing Kam Liu,
Wei Chen,
Daniel W. Apley
Abstract:
A representative volume element (RVE) is a reasonably small unit of microstructure that can be simulated to obtain the same effective properties as the entire microstructure sample. Finite element (FE) simulation of RVEs, as opposed to much larger samples, saves computational expense, especially in multiscale modeling. Therefore, it is desirable to have a framework that determines RVE size prior to FE simulations. Existing methods select the RVE size based on when the FE-simulated properties of samples of increasing size converge with insignificant statistical variations, with the drawback that many samples must be simulated. We propose a simulation-free alternative that determines RVE size based only on a micrograph. The approach utilizes a machine learning model trained to implicitly characterize the stochastic nature of the input micrograph. The underlying rationale is to view RVE size as the smallest moving window size for which the stochastic nature of the microstructure within the window is stationary as the window moves across a large micrograph. For this purpose, we adapt a recently developed Fisher score-based framework for microstructure nonstationarity monitoring. Because the resulting RVE size is based solely on the micrograph and does not involve any FE simulation of specific properties, it constitutes an RVE for any property of interest that solely depends on the microstructure characteristics. Through numerical experiments of simple and complex microstructures, we validate our approach and show that our selected RVE sizes are consistent with when the chosen FE-simulated properties converge.
Submitted 7 April, 2024;
originally announced April 2024.
-
Spin-mechanical coupling in 2D antiferromagnet CrSBr
Authors:
Fan Fei,
Yulu Mao,
Wuzhang Fang,
Wenhao Liu,
Jack P. Rollins,
Aswin L. N. Kondusamy,
Bing Lv,
Yuan **,
Ying Wang,
Jun Xiao
Abstract:
Spin-mechanical coupling is vital in diverse fields including spintronics, sensing and quantum transduction. Two-dimensional (2D) magnetic materials provide a unique platform for investigating spin-mechanical coupling, attributed to their mechanical flexibility and novel spin orderings. However, studying spin-mechanical coupling in 2D magnets presents challenges in probing mechanical deformation and changes in thermodynamic properties at the nanoscale. Here we use nano opto-electro-mechanical interferometry to mechanically detect the phase transition and magnetostriction effect in multilayer CrSBr, an air-stable antiferromagnet with large magnon-exciton coupling. The transitions among antiferromagnetism, spin-canted ferromagnetism and paramagnetism are visualized by optomechanical frequency anomalies. Nontrivial magnetostriction coefficient 2.3x10^(-5) and magnetoelastic coupling strength on the order of 10^6 J/m^3 have been found. Moreover, we demonstrate the substantial tunability of the magnetoelastic constant by nearly 50% via gate-induced strain. Our findings demonstrate the strong spin-mechanical coupling in CrSBr and pave the way for developing sensitive magnetic sensing and efficient quantum transduction at the atomically thin limit.
Submitted 23 April, 2024;
originally announced April 2024.
-
A Short Review for Ontology Learning: Stride to Large Language Models Trend
Authors:
Rick Du,
Huilong An,
Keyu Wang,
Weidong Liu
Abstract:
Ontologies provide a formal representation of knowledge shared within Semantic Web applications. Ontology learning involves the construction of ontologies from a given corpus. In the past years, ontology learning has progressed through shallow learning and deep learning methodologies, each offering distinct advantages and limitations in the quest for knowledge extraction and representation. A new trend among these approaches is to rely on large language models (LLMs) to enhance ontology learning. This paper reviews the approaches and challenges of ontology learning. It analyzes the methodologies and limitations of shallow-learning-based and deep-learning-based techniques for ontology learning, and provides comprehensive knowledge for the frontier work of using LLMs to enhance ontology learning. In addition, it proposes several noteworthy future directions for further exploration into the integration of LLMs with ontology learning tasks.
Submitted 17 June, 2024; v1 submitted 23 April, 2024;
originally announced April 2024.
-
Unified Unsupervised Salient Object Detection via Knowledge Transfer
Authors:
Yao Yuan,
Wutao Liu,
Pan Gao,
Qun Dai,
Jie Qin
Abstract:
Recently, unsupervised salient object detection (USOD) has gained increasing attention due to its annotation-free nature. However, current methods mainly focus on specific tasks such as RGB and RGB-D, neglecting the potential for task migration. In this paper, we propose a unified USOD framework for generic USOD tasks. Firstly, we propose a Progressive Curriculum Learning-based Saliency Distilling (PCL-SD) mechanism to extract saliency cues from a pre-trained deep network. This mechanism starts with easy samples and progressively moves towards harder ones, to avoid initial interference caused by hard samples. Afterwards, the obtained saliency cues are utilized to train a saliency detector, and we employ a Self-rectify Pseudo-label Refinement (SPR) mechanism to improve the quality of pseudo-labels. Finally, an adapter-tuning method is devised to transfer the acquired saliency knowledge, leveraging shared knowledge to attain superior transfer performance on the target tasks. Extensive experiments on five representative SOD tasks confirm the effectiveness and feasibility of our proposed method. Code and supplementary materials are available at https://github.com/I2-Multimedia-Lab/A2S-v3.
Submitted 23 April, 2024;
originally announced April 2024.
-
Pure skin effect obeying power partition in directed graphs
Authors:
Wenwen Liu,
Oubo You,
Bumki Min,
Shuang Zhang
Abstract:
Non-Hermitian physics has received great attention recently. In particular, band structures in non-Hermitian systems can be engineered to exhibit various topological effects. Among them, one of the most intriguing phenomena is the non-Hermitian skin effect (NHSE). Here, we investigate NHSE in systems featuring directed chains or directed graphs, where the arrows denote the directions of the non-reciprocal hopping between neighbouring nodes. We show that the systems exhibit pure skin modes with non-oscillatory wavefunctions, in contrast to previously studied NHSE. Interestingly, the sum of the decay constants along different directions for each skin mode obeys a power partition rule, i.e. their sum is a fixed value and the value of each constant only depends on the ratio between the non-reciprocal hopping parameters and is independent of detailed graph configurations. Such Pure Skin Effect (PSE) can be explained by using a generalized method for solving the Generalized Brillouin-zone with multiple bulk states.
Submitted 23 April, 2024;
originally announced April 2024.
-
Fast Monte Carlo Dose Calculation in Proton Therapy
Authors:
Jason Holmes,
Hongying Feng,
Lian Zhang,
Michael Fix,
Steve B. Jiang,
Wei Liu
Abstract:
This article examines the critical role of fast Monte Carlo dose calculations in advancing proton therapy techniques, particularly in the context of increasing treatment customization and precision. As adaptive radiotherapy and other patient-specific approaches evolve, the need for accurate and precise dose calculations, essential for techniques like proton-based stereotactic radiosurgery, becomes more prominent. These calculations, however, are time-intensive, with the treatment planning/optimization process constrained by the achievable speed of dose computations. Thus, enhancing the speed of Monte Carlo methods is vital, as it not only facilitates the implementation of novel treatment modalities but also improves the optimality of treatment plans. Today, the state-of-the-art in Monte Carlo dose calculation speeds is $10^6$ - $10^7$ protons per second. This review highlights the latest advancements in fast Monte Carlo dose calculations that have led to such speeds, including emerging artificial intelligence-based techniques, and discusses their application in both current and emerging proton therapy strategies.
Submitted 22 April, 2024;
originally announced April 2024.
-
JWST ERS Program Q3D: The pitfalls of virial BH mass constraints shown in a z = 3 quasar with an ultramassive host
Authors:
Caroline Bertemes,
Dominika Wylezalek,
David S. N. Rupke,
Nadia L. Zakamska,
Sylvain Veilleux,
Benjamin Beckmann,
Andrey Vayner,
Swetha Sankar,
Yuzo Ishikawa,
Nadiia Diachenko,
Weizhe Liu,
Yu-Ching Chen,
Jerome Seebeck,
Dieter Lutz,
Guilin Liu
Abstract:
We present JWST MIRI/NIRSpec observations of the extremely red quasar SDSS J165202.64+172852.3 at z~3, one of the most luminous quasars known to date, driving powerful outflows and hosting a clumpy starburst, amidst several interacting companions. We estimate the black hole (BH) mass of the system based on the broad H$α$ and H$β$ lines, as well as the Pa$β$ emission in the IR and MgII in the UV. We recover a very broad range of mass estimates, with constraints ranging between log $M_{\rm BH}$=9 and 10.1, which is exacerbated if imposing a uniform BLR geometry at all wavelengths. Several factors may contribute to the large spread: measurement uncertainties (insufficient sensitivity to detect the broadest component of the faint Pa$β$ line, spectral blending, ambiguities in the broad/narrow component distinction), lack of virial equilibrium (in a system characterised by powerful outflows and rapid accretion), and uncertainties on the luminosity-inferred size of the broad line region, a.o. given central dust obscuration. We constrain the stellar mass via SED fitting, suggesting the host to be extremely massive at $10^{12.8\pm 0.5} M_\odot$ - ~2 dex above the characteristic mass of the Schechter fit to the z=3 stellar mass function. Notably, J1652's central BH might be interpreted as being either undermassive, overmassive, or in line with the BH mass-stellar mass relation, depending on the choice of assumptions. The recovered Eddington ratio varies accordingly, but exceeds 10% in any case. We put our results into context by providing an extensive overview and discussion of recent literature results and their associated assumptions. Our findings provide an important demonstration of the uncertainties inherent in virial BH mass estimates, which are of particular relevance in the JWST era given the growing number of studies on rapidly accreting quasars at high redshift.
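For orientation, virial BH mass estimates of the kind discussed in this abstract generally rest on the standard relations below. These are the textbook forms, not the specific calibrations adopted by the authors; the virial factor $f$, the broad-line-region radius $R_{\rm BLR}$, the line width $\Delta v$, and the bolometric luminosity $L_{\rm bol}$ are generic symbols introduced here for illustration.

```latex
M_{\rm BH} = f \, \frac{R_{\rm BLR} \, (\Delta v)^2}{G},
\qquad
\lambda_{\rm Edd} = \frac{L_{\rm bol}}{L_{\rm Edd}},
\qquad
L_{\rm Edd} \approx 1.26 \times 10^{38}
\left( \frac{M_{\rm BH}}{M_\odot} \right) {\rm erg\,s^{-1}}
```

Because $R_{\rm BLR}$ is typically inferred from a luminosity-size relation and $f$ encodes the assumed BLR geometry, uncertainty in either term propagates directly into $M_{\rm BH}$ and, through $L_{\rm Edd}$, into the Eddington ratio.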
Submitted 22 April, 2024;
originally announced April 2024.
-
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Authors:
Marah Abdin,
Sam Ade Jacobs,
Ammar Ahmad Awan,
Jyoti Aneja,
Ahmed Awadallah,
Hany Awadalla,
Nguyen Bach,
Amit Bahree,
Arash Bakhtiari,
Jianmin Bao,
Harkirat Behl,
Alon Benhaim,
Misha Bilenko,
Johan Bjorck,
Sébastien Bubeck,
Qin Cai,
Martin Cai,
Caio César Teodoro Mendes,
Weizhu Chen,
Vishrav Chaudhary,
Dong Chen,
Dongdong Chen,
Yen-Chun Chen,
Yi-Ling Chen,
Parul Chopra
, et al. (90 additional authors not shown)
Abstract:
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset for training, a scaled-up version of the one used for phi-2, composed of heavily filtered publicly available web data and synthetic data. The model is also further aligned for robustness, safety, and chat format. We also provide some initial parameter-scaling results with 7B and 14B models trained for 4.8T tokens, called phi-3-small and phi-3-medium, both significantly more capable than phi-3-mini (e.g., respectively 75% and 78% on MMLU, and 8.7 and 8.9 on MT-bench). Moreover, we also introduce phi-3-vision, a 4.2 billion parameter model based on phi-3-mini with strong reasoning capabilities for image and text prompts.
Submitted 23 May, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
Multi-Level Sequence Denoising with Cross-Signal Contrastive Learning for Sequential Recommendation
Authors:
Xiaofei Zhu,
Liang Li,
Weidong Liu,
Xin Luo
Abstract:
Sequential recommender systems (SRSs) aim to suggest the next item for a user based on her historical interaction sequences. Recently, many research efforts have been devoted to attenuating the influence of noisy items in sequences by either assigning them lower attention weights or discarding them directly. The major limitation of these methods is that the former would still be prone to overfitting noisy items while the latter may overlook informative items. To this end, in this paper, we propose a novel model named Multi-level Sequence Denoising with Cross-signal Contrastive Learning (MSDCCL) for sequential recommendation. To be specific, we first introduce a target-aware user interest extractor to simultaneously capture users' long and short term interest with the guidance of target items. Then, we develop a multi-level sequence denoising module to alleviate the impact of noisy items by employing both soft and hard signal denoising strategies. Additionally, we extend existing curriculum learning by simulating the learning pattern of human beings. It is worth noting that our proposed model can be seamlessly integrated with a majority of existing recommendation models and significantly boost their effectiveness. Experimental studies on five public datasets are conducted and the results demonstrate that the proposed MSDCCL is superior to the state-of-the-art baselines. The source code is publicly available at https://github.com/lalunex/MSDCCL/tree/main.
Submitted 19 June, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
Study of $e^+e^-\toωX(3872)$ and $γX(3872)$ from 4.66 to 4.95 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Using data samples with an integrated luminosity of $4.5~\text{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies ranging from 4.66 to 4.95 GeV, we study the processes of $e^+e^-\toωX(3872)$ and $e^+e^-\toγX(3872)$. With the $e^+e^-\toωX(3872)$ process, the branching fraction ratio $R\equiv\frac{\mathcal{B}(X(3872)\toγJ/ψ)}{\mathcal{B}(X(3872)\toπ^+π^- J/ψ)}$ is measured to be $0.38\pm0.20_\text{stat.}\pm0.01_\text{syst.}$ ($R< 0.83$ at 90\% confidence level). In addition, we measure the ratio of the average cross section of $e^+e^-\toωX(3872)$ to $e^+e^-\toωχ_{c1}(ωχ_{c2})$ to be $σ_{ωX(3872)}/σ_{ωχ_{c1}}~(σ_{ωX(3872)}/σ_{ωχ_{c2}})=5.2\pm1.0_\text{stat.}\pm1.9_\text{syst.}~ (5.5\pm1.1_\text{stat.}\pm2.4_\text{syst.})$. Finally, we search for the process of $e^+e^-\toγX(3872)$, and no obvious signal is observed. The upper limit on the ratio of the average cross section of $e^+e^-\toγX(3872)$ to $e^+e^-\toωX(3872)$ is set as $σ_{γX(3872)}/σ_{ωX(3872)}<0.23$ at 90\% confidence level.
Submitted 21 April, 2024;
originally announced April 2024.
-
Variational Bayesian Optimal Experimental Design with Normalizing Flows
Authors:
Jiayuan Dong,
Christian Jacobsen,
Mehdi Khalloufi,
Maryam Akram,
Wanjiao Liu,
Karthik Duraisamy,
Xun Huan
Abstract:
Bayesian optimal experimental design (OED) seeks experiments that maximize the expected information gain (EIG) in model parameters. Directly estimating the EIG using nested Monte Carlo is computationally expensive and requires an explicit likelihood. Variational OED (vOED), in contrast, estimates a lower bound of the EIG without likelihood evaluations by approximating the posterior distributions with variational forms, and then tightens the bound by optimizing its variational parameters. We introduce the use of normalizing flows (NFs) for representing variational distributions in vOED; we call this approach vOED-NFs. Specifically, we adopt NFs with a conditional invertible neural network architecture built from compositions of coupling layers, and enhanced with a summary network for data dimension reduction. We present Monte Carlo estimators to the lower bound along with gradient expressions to enable a gradient-based simultaneous optimization of the variational parameters and the design variables. The vOED-NFs algorithm is then validated in two benchmark problems, and demonstrated on a partial differential equation-governed application of cathodic electrophoretic deposition and an implicit likelihood case with stochastic modeling of aphid population. The findings suggest that a composition of 4--5 coupling layers is able to achieve lower EIG estimation bias, under a fixed budget of forward model runs, compared to previous approaches. The resulting NFs produce approximate posteriors that agree well with the true posteriors, able to capture non-Gaussian and multi-modal features effectively.
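The lower bound referred to in this abstract can be written out as follows. This is the standard posterior-based variational bound on the EIG, given here as an assumption about the form the authors use, since the abstract does not state it explicitly; $d$ denotes the design, $\theta$ the model parameters, $y$ the observations, and $q_\lambda$ the variational posterior with parameters $\lambda$.

```latex
\mathrm{EIG}(d)
= \mathbb{E}_{p(\theta, y \mid d)}
  \left[ \ln p(\theta \mid y, d) - \ln p(\theta) \right]
\;\ge\;
\mathbb{E}_{p(\theta, y \mid d)}
  \left[ \ln q_\lambda(\theta \mid y, d) - \ln p(\theta) \right],

\mathrm{EIG}(d) - \text{(lower bound)}
= \mathbb{E}_{p(y \mid d)}
  \left[ \mathrm{KL}\!\left( p(\theta \mid y, d) \,\|\, q_\lambda(\theta \mid y, d) \right) \right]
\;\ge\; 0 .
```

The bound needs only samples from the joint $p(\theta, y \mid d)$ and evaluations of $q_\lambda$, which is why no likelihood evaluations are required, and it becomes tight when $q_\lambda$ matches the true posterior, which is the motivation for a flexible family such as normalizing flows.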
Submitted 8 April, 2024;
originally announced April 2024.
-
AutoInspect: Towards Long-Term Autonomous Industrial Inspection
Authors:
Michal Staniaszek,
Tobit Flatscher,
Joseph Rowell,
Hanlin Niu,
Wenxing Liu,
Yang You,
Robert Skilton,
Maurice Fallon,
Nick Hawes
Abstract:
We give an overview of AutoInspect, a ROS-based software system for robust and extensible mission-level autonomy. Over the past three years AutoInspect has been deployed in a variety of environments, including at a mine, a chemical plant, a mock oil rig, decommissioned nuclear power plants, and a fusion reactor for durations ranging from hours to weeks. The system combines robust mapping and localisation with graph-based autonomous navigation, mission execution, and scheduling to achieve a complete autonomous inspection system. The time from arrival at a new site to autonomous mission execution can be under an hour. It is deployed on a Boston Dynamics Spot robot using a custom sensing and compute payload called Frontier. In this work we go into detail of the system's performance in two long-term deployments of 49 days at a robotics test facility, and 35 days at the Joint European Torus (JET) fusion reactor in Oxfordshire, UK.
Submitted 23 April, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene Representation
Authors:
Wenkai Liu,
Tao Guan,
Bin Zhu,
Lili Ju,
Zikai Song,
Dan Li,
Yuesong Wang,
Wei Yang
Abstract:
In the domain of 3D scene representation, 3D Gaussian Splatting (3DGS) has emerged as a pivotal technology. However, its application to large-scale, high-resolution scenes (exceeding 4k$\times$4k pixels) is hindered by the excessive computational requirements for managing a large number of Gaussians. Addressing this, we introduce 'EfficientGS', an advanced approach that optimizes 3DGS for high-resolution, large-scale scenes. We analyze the densification process in 3DGS and identify areas of Gaussian over-proliferation. We propose a selective strategy, limiting Gaussian increase to key primitives, thereby enhancing the representational efficiency. Additionally, we develop a pruning mechanism to remove redundant Gaussians, those that are merely auxiliary to adjacent ones. For further enhancement, we integrate a sparse order increment for Spherical Harmonics (SH), designed to alleviate storage constraints and reduce training overhead. Our empirical evaluations, conducted on a range of datasets including extensive 4K+ aerial images, demonstrate that 'EfficientGS' not only expedites training and rendering times but also achieves this with a model size approximately tenfold smaller than conventional 3DGS while maintaining high rendering fidelity.
Submitted 19 April, 2024;
originally announced April 2024.
-
Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation
Authors:
Song Wang,
Jiawei Yu,
Wentong Li,
Wenyu Liu,
Xiaolu Liu,
Junbo Chen,
Jianke Zhu
Abstract:
Semantic scene completion, also known as semantic occupancy prediction, can provide dense geometric and semantic information for autonomous vehicles, which attracts the increasing attention of both academia and industry. Unfortunately, existing methods usually formulate this task as a voxel-wise classification problem and treat each voxel equally in 3D space during training. As the hard voxels have not been paid enough attention, the performance in some challenging regions is limited. The 3D dense space typically contains a large number of empty voxels, which are easy to learn but require amounts of computation due to handling all the voxels uniformly for the existing models. Furthermore, the voxels in the boundary region are more challenging to differentiate than those in the interior. In this paper, we propose HASSC approach to train the semantic scene completion model with hardness-aware design. The global hardness from the network optimization process is defined for dynamical hard voxel selection. Then, the local hardness with geometric anisotropy is adopted for voxel-wise refinement. Besides, self-distillation strategy is introduced to make training process stable and consistent. Extensive experiments show that our HASSC scheme can effectively promote the accuracy of the baseline model without incurring the extra inference cost. Source code is available at: https://github.com/songw-zju/HASSC.
△ Less
Submitted 18 April, 2024;
originally announced April 2024.
-
Toward Short-Term Glucose Prediction Solely Based on CGM Time Series
Authors:
Ming Cheng,
Xingjian Diao,
Ziyi Zhou,
Yanjun Cui,
Wenjun Liu,
Shitong Cheng
Abstract:
The global diabetes epidemic highlights the importance of maintaining good glycemic control. Glucose prediction is a fundamental aspect of diabetes management, facilitating real-time decision-making. Recent research has introduced models focusing on long-term glucose trend prediction, which are unsuitable for real-time decision-making and result in delayed responses. Conversely, models designed to respond to immediate glucose level changes cannot analyze glucose variability comprehensively. Moreover, contemporary research generally integrates various physiological parameters (e.g. insulin doses, food intake, etc.), which inevitably raises data privacy concerns. To bridge such a research gap, we propose TimeGlu -- an end-to-end pipeline for short-term glucose prediction solely based on CGM time series data. We implement four baseline methods to conduct a comprehensive comparative analysis of the model's performance. Through extensive experiments on two contrasting datasets (CGM Glucose and Colas dataset), TimeGlu achieves state-of-the-art performance without the need for additional personal data from patients, providing effective guidance for real-world diabetic glucose management.
Submitted 18 April, 2024;
originally announced April 2024.
-
Generation of optical toroidal vortex with circular asymmetric gratings
Authors:
Weichao Liu,
Jie Cheng,
Chenhao Wan
Abstract:
Toroidal vortices, topological structures commonly observed in nature, exist in various forms, such as bubbles produced by dolphins and the air flow surrounding a flying dandelion. A toroidal vortex corresponds to a spatiotemporal wave packet in the shape of a donut that propagates in the direction perpendicular to the plane of the ring. In this work, we propose a circular asymmetric grating to generate vortex rings. A cylindrical vector wave packet is transformed by the device into a transmitted toroidal vortex pulse. Such a compact toroidal vortex generator may find applications in optical topology research and high-dimensional optical communications.
Submitted 18 April, 2024;
originally announced April 2024.
-
RD2Bench: Toward Data-Centric Automatic R&D
Authors:
Haotian Chen,
Xinjie Shen,
Zeqi Ye,
Xiao Yang,
Xu Yang,
Weiqing Liu,
Jiang Bian
Abstract:
The progress of humanity is driven by successful discoveries accompanied by countless failed experiments. Researchers often seek potential research directions by reading and then verify them through experiments. The process imposes a significant burden on researchers. In the past decade, data-driven black-box deep learning methods have demonstrated their effectiveness in a wide range of real-world scenarios, which exacerbates the experimental burden on researchers and thus leaves potentially successful discoveries veiled. Therefore, automating such a research and development (R&D) process is an urgent need. In this paper, we make the first effort to formalize the goal by proposing a Real-world Data-centric automatic R&D Benchmark, namely RD2Bench. RD2Bench benchmarks all the operations in data-centric automatic R&D (D-CARD) as a whole to navigate future work toward our goal directly. We focus on evaluating the interaction and synergistic effects of various model capabilities and on helping to select well-performing, trustworthy models. Although RD2Bench is very challenging for the state-of-the-art (SOTA) large language model (LLM) GPT-4, indicating ample research opportunities and the need for more research effort, LLMs show promising potential to bring more significant development to D-CARD: they are able to implement some simple methods without adopting any additional techniques. We appeal to future work to consider developing techniques for tackling automatic R&D, thus bringing the opportunity of a potentially revolutionary upgrade to human productivity.
Submitted 17 April, 2024;
originally announced April 2024.
-
Kinetic network in Milestoning: Clustering, reduction, and transition path analysis
Authors:
Ru Wang,
Xiaojun Ji,
Hao Wang,
Wenjian Liu
Abstract:
We present a reduction of Milestoning (ReM) algorithm to analyze the high-dimensional Milestoning kinetic network. The algorithm reduces the Milestoning network to low dimensions but preserves essential kinetic information, such as local residence time, exit time, and mean first passage time between any two states. This is achieved in three steps. First, nodes (milestones) in the high-dimensional Milestoning network are grouped into clusters based on the metastability identified by an auxiliary continuous-time Markov chain. Our clustering method is applicable not only to time-reversible networks but also to non-reversible networks generated from practical simulations with statistical fluctuations. Second, a reduced network is established via network transformation, containing only the core sets of clusters as nodes. Finally, transition pathways are analyzed in the reduced network based on the transition path theory. The algorithm is illustrated using a toy model and a solvated alanine dipeptide in two and four dihedral angles.
Submitted 16 April, 2024;
originally announced April 2024.
-
Efficient 6-dimensional phase space reconstruction from experimental measurements using generative machine learning
Authors:
Ryan Roussel,
Juan Pablo Gonzalez-Aguilera,
Auralee Edelen,
Eric Wisniewski,
Alex Ody,
Wanming Liu,
Young-Kee Kim,
John Power
Abstract:
Next-generation accelerator concepts, which hinge on the precise shaping of beam distributions, demand equally precise diagnostic methods capable of reconstructing beam distributions within 6-dimensional position-momentum spaces. However, the characterization of intricate features within 6-dimensional beam distributions using conventional diagnostic techniques necessitates hundreds of measurements, using many hours of valuable beam time. Novel phase space reconstruction techniques are needed to substantially reduce the number of measurements required to reconstruct detailed, high-dimensional beam features in order to resolve complex beam phenomena, and as feedback in precision beam shaping applications. In this study, we present a novel approach to reconstructing detailed 6-dimensional phase space distributions from experimental measurements using generative machine learning and differentiable beam dynamics simulations. We demonstrate that, for a collection of synthetic beam distribution test cases, this approach can be used to resolve 6-dimensional phase space distributions using basic beam manipulations and as few as 20 2-dimensional measurements of the beam profile, without the need for prior data collection or model training. We also demonstrate an application of the reconstruction method in an experimental setting at the Argonne Wakefield Accelerator, where it is able to reconstruct the beam distribution and accurately predict previously unseen measurements 75x faster than previous methods.
Submitted 15 May, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
Authors:
Shusheng Xu,
Wei Fu,
Jiaxuan Gao,
Wenjie Ye,
Weilin Liu,
Zhiyu Mei,
Guangju Wang,
Chao Yu,
Yi Wu
Abstract:
Reinforcement Learning from Human Feedback (RLHF) is currently the most widely used method to align large language models (LLMs) with human preferences. Existing RLHF methods can be roughly categorized as either reward-based or reward-free. Novel applications such as ChatGPT and Claude leverage reward-based methods that first learn a reward model and apply actor-critic algorithms, such as Proximal Policy Optimization (PPO). However, in academic benchmarks, state-of-the-art results are often achieved via reward-free methods, such as Direct Preference Optimization (DPO). Is DPO truly superior to PPO? Why does PPO perform poorly on these benchmarks? In this paper, we first conduct both theoretical and empirical studies on the algorithmic properties of DPO and show that DPO may have fundamental limitations. Moreover, we also comprehensively examine PPO and reveal the key factors for the best performances of PPO in fine-tuning LLMs. Finally, we benchmark DPO and PPO across a collection of RLHF testbeds, ranging from dialogue to code generation. Experiment results demonstrate that PPO is able to surpass other alignment methods in all cases and achieve state-of-the-art results in challenging code competitions.
Submitted 21 April, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Graph Neural Networks for Protein-Protein Interactions -- A Short Survey
Authors:
Mingda Xu,
Peisheng Qian,
Ziyuan Zhao,
Zeng Zeng,
Jianguo Chen,
Weide Liu,
Xulei Yang
Abstract:
Protein-protein interactions (PPIs) play key roles in a broad range of biological processes. Numerous strategies have been proposed for predicting PPIs, and among them, graph-based methods have demonstrated promising outcomes owing to the inherent graph structure of PPI networks. This paper reviews various graph-based methodologies, and discusses their applications in PPI prediction. We classify these approaches into two primary groups based on their model structures. The first category employs Graph Neural Networks (GNN) or Graph Convolutional Networks (GCN), while the second category utilizes Graph Attention Networks (GAT), Graph Auto-Encoders and Graph-BERT. We highlight the distinctive methodologies of each approach in managing the graph-structured data inherent in PPI networks and anticipate future research directions in this domain.
Submitted 16 April, 2024;
originally announced April 2024.
-
Engineering software 2.0 by interpolating neural networks: unifying training, solving, and calibration
Authors:
Chanwook Park,
Sourav Saha,
Jiachen Guo,
Xiaoyu Xie,
Satyajit Mojumder,
Miguel A. Bessa,
Dong Qian,
Wei Chen,
Gregory J. Wagner,
Jian Cao,
Wing Kam Liu
Abstract:
The evolution of artificial intelligence (AI) and neural network theories has revolutionized the way software is programmed, shifting from a hard-coded series of codes to a vast neural network. However, this transition in engineering software has faced challenges such as data scarcity, multi-modality of data, low model accuracy, and slow inference. Here, we propose a new network based on interpolation theories and tensor decomposition, the interpolating neural network (INN). Instead of interpolating training data, a common notion in computer science, INN interpolates interpolation points in the physical space whose coordinates and values are trainable. It can also extrapolate if the interpolation points reside outside of the range of training data and the interpolation functions have a larger support domain. INN features orders of magnitude fewer trainable parameters, faster training, a smaller memory footprint, and higher model accuracy compared to feed-forward neural networks (FFNN) or physics-informed neural networks (PINN). INN is poised to usher in Engineering Software 2.0, a unified neural network that spans various domains of space, time, parameters, and initial/boundary conditions. This has previously been computationally prohibitive due to the exponentially growing number of trainable parameters, easily exceeding the parameter size of ChatGPT, which is over 1 trillion. INN addresses this challenge by leveraging tensor decomposition and tensor product, with adaptable network architecture.
Submitted 22 April, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Kilometer-Level Coupled Modeling Using 40 Million Cores: An Eight-Year Journey of Model Development
Authors:
Xiaohui Duan,
Yuxuan Li,
Zhao Liu,
Bin Yang,
Juepeng Zheng,
Haohuan Fu,
Shaoqing Zhang,
Shiming Xu,
Yang Gao,
Wei Xue,
Di Wei,
Xiao**g Lv,
Lifeng Yan,
Haopeng Huang,
Haitian Lu,
Lingfeng Wan,
Haoran Lin,
Qixin Chang,
Chenlin Li,
Quanjie He,
Zeyu Song,
Xuantong Wang,
Yangyang Yu,
Xilong Fan,
Zhaopeng Qu
, et al. (16 additional authors not shown)
Abstract:
With current and future leading systems adopting heterogeneous architectures, adapting existing models for heterogeneous supercomputers is urgently needed to improve model resolution and reduce modeling uncertainty. This paper presents our three-week effort on porting a complex earth system model, CESM 2.2, to a 40-million-core Sunway supercomputer. Taking a non-intrusive approach that tries to minimize manual code modifications, our project aims to achieve both improved performance and consistency of the model code. By using a hierarchical grid system and an OpenMP-based offloading toolkit, our porting and parallelization effort covers over 80% of the code, and achieves a simulation speed of 340 SDPD (simulated days per day) for 5-km atmosphere, 265 SDPD for 3-km ocean, and 222 SDPD for a coupled model, thus making multi-year or even multi-decadal experiments at such high resolution possible.
Submitted 15 April, 2024;
originally announced April 2024.
-
Nehari manifold optimization and its application for finding unstable solutions of semilinear elliptic PDEs
Authors:
Zhaoxing Chen,
Wei Liu,
Ziqing Xie,
Wenfan Yi
Abstract:
A Nehari manifold optimization method (NMOM) is introduced for finding 1-saddles, i.e., saddle points with the Morse index equal to one, of a generic nonlinear functional in Hilbert spaces. It is based on the variational characterization that 1-saddles of the generic functional are local minimizers of the same functional restricted to the associated Nehari manifold. The framework contains two important ingredients: one is the retraction mapping, which keeps the iteration points on the Nehari manifold; the other is the tangential search direction, which decreases the generic functional with suitable step-size search rules. In particular, global convergence is rigorously established by virtue of some crucial analysis techniques (including a weak convergence method) that overcome difficulties in the infinite-dimensional setting. In practice, combined with an easy-to-implement Nehari retraction and the negative Riemannian gradient direction, the NMOM is successfully applied to compute the unstable ground-state solutions of a class of typical semilinear elliptic PDEs such as the Hénon equation and the stationary nonlinear Schrödinger equation. In particular, the symmetry-breaking phenomenon of the ground states of the Hénon equation is explored numerically in 1D and 2D, and interesting numerical findings on the critical value of symmetry breaking are reported.
Submitted 15 April, 2024;
originally announced April 2024.
-
Observation of $D \to a_{0}(980)π$ in the decays $D^{0} \rightarrow π^{+}π^{-}η$ and $D^{+} \rightarrow π^{+}π^{0}η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
We report the first amplitude analysis of the decays $D^{0} \to π^{+} π^{-} η$ and $D^{+} \rightarrow π^{+}π^{0}η$ using a data sample taken with the BESIII detector at the center-of-mass energy of 3.773 GeV, corresponding to an integrated luminosity of 7.9 ${\rm fb}^{-1}$. The contribution from the process $D^{0(+)} \to a_{0}(980)^{+} π^{-(0)}$ is significantly larger than the $D^{0(+)} \to a_{0}(980)^{-(0)} π^{+}$ contribution. The ratios $\mathcal{B}(D^{0} \rightarrow a_{0}(980)^{+}π^{-})/\mathcal{B}(D^{0} \rightarrow a_{0}(980)^{-}π^{+})$ and $\mathcal{B}(D^{+} \rightarrow a_{0}(980)^{+}π^{0})/\mathcal{B}(D^{+} \rightarrow a_{0}(980)^{0}π^{+})$ are measured to be $7.5^{+2.5}_{-0.8\,\mathrm{stat.}}\pm1.7_{\mathrm{syst.}}$ and $2.6\pm0.6_{\mathrm{stat.}}\pm0.3_{\mathrm{syst.}}$, respectively. The measured $D^{0}$ ratio disagrees with the theoretical predictions by orders of magnitude, thus implying a substantial contribution from final-state interactions.
Submitted 14 April, 2024;
originally announced April 2024.
-
Temporal-Spatial Manipulation of Bi-Focal Bi-Chromatic Fields for Terahertz Radiations
Authors:
**g**g Zhao,
Yizhu Zhang,
Yanjun Gao,
Meng Li,
Xiaokun Liu,
Weimin Liu,
Tian-Min Yan,
Yuhai Jiang
Abstract:
Mixing the fundamental ($ω$) and the second harmonic (2$ω$) waves in the gas phase is a widely employed technique for emitting terahertz (THz) pulses. The THz generation driven by bi-chromatic fields can be described by the photocurrent model, where the THz generation is attributed to free electrons ionized by the $ω$ field, and the 2$ω$ field provides a perturbation to break the symmetry of the asymptotic momentum of free electrons. However, we find that the THz radiation is amplified by one order of magnitude when driven by bi-focal bi-chromatic fields, contradicting the common understanding of the photocurrent model. Meanwhile, the present measurements demonstrate that the THz radiation mainly originates from the plasma created by the 2$ω$ pulses instead of the $ω$ pulses. Energy transfer from the 2$ω$ beam to the THz beam during the THz generation has been observed, validating the major contribution of the 2$ω$ beam. Furthermore, the THz bandwidth has been observed to far exceed the bandwidth of the pump pulse, which likewise cannot be explained by the photocurrent model. These counterintuitive results indicate that undiscovered physical mechanisms are involved in bi-chromatic THz generation in plasma, presenting a significant challenge for understanding strong-field nonlinear optics and simultaneously expanding various applications.
Submitted 12 April, 2024;
originally announced April 2024.
-
Your Finetuned Large Language Model is Already a Powerful Out-of-distribution Detector
Authors:
Andi Zhang,
Tim Z. Xiao,
Weiyang Liu,
Robert Bamler,
Damon Wischik
Abstract:
We revisit the likelihood ratio between a pretrained large language model (LLM) and its finetuned variant as a criterion for out-of-distribution (OOD) detection. The intuition behind such a criterion is that the pretrained LLM has prior knowledge about OOD data due to its large amount of training data, and once finetuned with the in-distribution data, the LLM has sufficient knowledge to distinguish their difference. Leveraging the power of LLMs, we show that, for the first time, the likelihood ratio can serve as an effective OOD detector. Moreover, we apply the proposed LLM-based likelihood ratio to detect OOD questions in question-answering (QA) systems, which can be used to improve the performance of specialized LLMs for general questions. Given that the likelihood can be easily obtained from the loss functions within contemporary neural network frameworks, it is straightforward to implement this approach in practice. Since both pretrained LLMs and their various finetuned models are available, our proposed criterion can be effortlessly incorporated for OOD detection without the need for further training. We conduct a comprehensive evaluation across multiple settings, including far OOD, near OOD, spam detection, and QA scenarios, to demonstrate the effectiveness of the method.
Submitted 7 April, 2024;
originally announced April 2024.
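The likelihood-ratio criterion described in the abstract above can be illustrated with a minimal sketch. It assumes that per-token log-probabilities for the same input have already been obtained from both models. The function names, the sign convention (pretrained minus finetuned, higher meaning more OOD), and the threshold value are all hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import Sequence


def sequence_log_likelihood(token_log_probs: Sequence[float]) -> float:
    """Sum per-token log-probabilities into a sequence log-likelihood."""
    return float(sum(token_log_probs))


def ood_score(
    pretrained_token_log_probs: Sequence[float],
    finetuned_token_log_probs: Sequence[float],
) -> float:
    """Log-likelihood ratio log p_pre(x) - log p_ft(x).

    Assumed convention (hypothetical): a higher score means the pretrained
    model explains the input better than the finetuned one, which suggests
    the input is out-of-distribution for the finetuning data.
    """
    return sequence_log_likelihood(pretrained_token_log_probs) - sequence_log_likelihood(
        finetuned_token_log_probs
    )


def is_ood(
    pretrained_token_log_probs: Sequence[float],
    finetuned_token_log_probs: Sequence[float],
    threshold: float = 0.0,  # hypothetical; would be tuned on validation data
) -> bool:
    """Flag the input as OOD when the score exceeds the threshold."""
    return ood_score(pretrained_token_log_probs, finetuned_token_log_probs) > threshold
```

In practice the per-token log-probabilities would come from the negative cross-entropy loss that each model assigns to the input, as the abstract notes; the sketch leaves that step out so that it does not depend on any particular framework.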
-
The photoinduced hidden metallic phase of monoclinic VO2 driven by local nucleation via a self-amplification process
Authors:
Feng-Wu Guo,
Wen-Hao Liu,
Zhi Wang,
Shu-Shen Li,
Lin-Wang Wang,
Jun-Wei Luo
Abstract:
The insulator-to-metal transition (IMT) in vanadium dioxide (VO2) has garnered extensive attention for its potential applications in ultrafast switches, neuronal network architectures, and storage technologies. However, a significant controversy persists regarding the formation of the IMT, specifically concerning whether a complete structural phase transition from monoclinic (M1) to rutile (R) phase is necessary. Here we employ the real-time time-dependent density functional theory (rt-TDDFT) to track the dynamic evolution of atomic and electronic structures in photoexcited VO2, revealing the emergence of a long-lived monoclinic metal phase (MM) under low electronic excitation. The emergence of the metal phase in the monoclinic structure originates from the dissociation of the local V-V dimer, driven by the self-trapped and self-amplified dynamics of photoexcited holes, rather than by a pure electron-electron correction. On the other hand, the M1-to-R phase transition does appear at higher electronic excitation. Our findings validate the existence of MM phase and provide a comprehensive picture of the IMT in photoexcited VO2.
Submitted 11 April, 2024;
originally announced April 2024.
-
Electro-optically Modulated Nonlinear Metasurfaces
Authors:
Zhengqing He,
Lun Qu,
Wei Wu,
Jikun Liu,
**gfei You,
Weiye Liu,
Lu Bai,
Chunyan **,
Chenxiong Wang,
Zhidong Gu,
Wei Cai,
Mengxin Ren,
**gjun Xu
Abstract:
Tunable nonlinearity facilitates the creation of reconfigurable nonlinear metasurfaces, enabling innovative applications in signal processing, light switching, and sensing. This paper presents a novel approach to electrically modulate second-harmonic generation (SHG) from a lithium niobate (LN) metasurface, exploiting the electro-optical (EO) effect. By fabricating a nanohole array metasurface on a thin LN film and applying an electric field, we demonstrate the alteration of the material's refractive index, resulting in resonance shifts and modulation of SHG intensity at specific wavelengths. Our findings provide valuable insights for the development of electrically tunable nonlinear light sources, quantum optics, dynamic nonlinear holography, and nonlinear information processing.
Submitted 11 April, 2024;
originally announced April 2024.
-
WESE: Weak Exploration to Strong Exploitation for LLM Agents
Authors:
Xu Huang,
Weiwen Liu,
Xiaolong Chen,
Xingmei Wang,
Defu Lian,
Yasheng Wang,
Ruiming Tang,
Enhong Chen
Abstract:
Recently, large language models (LLMs) have demonstrated remarkable potential as intelligent agents. However, existing research mainly focuses on enhancing the agent's reasoning or decision-making abilities through well-designed prompt engineering or task-specific fine-tuning, ignoring the procedure of exploration and exploitation. When addressing complex tasks within open-world interactive environments, these methods exhibit limitations. First, the lack of global information about the environment leads to greedy decisions, resulting in sub-optimal solutions. Second, irrelevant information acquired from the environment not only introduces noise, but also incurs additional cost. This paper proposes a novel approach, Weak Exploration to Strong Exploitation (WESE), to enhance LLM agents in solving open-world interactive tasks. Concretely, WESE involves decoupling the exploration and exploitation processes, employing a cost-effective weak agent to perform exploration tasks for global knowledge. A knowledge graph-based strategy is then introduced to store the acquired knowledge and extract task-relevant knowledge, enhancing the stronger agent in success rate and efficiency for the exploitation task. Our approach is flexible enough to incorporate diverse tasks, and obtains significant improvements in both success rates and efficiency across four interactive benchmarks.
Submitted 10 April, 2024;
originally announced April 2024.