-
Thoroughly Modeling Multi-domain Pre-trained Recommendation as Language
Authors:
Zekai Qu,
Ruobing Xie,
Chaojun Xiao,
Yuan Yao,
Zhiyuan Liu,
Fengzong Lian,
Zhanhui Kang,
Jie Zhou
Abstract:
With the thriving of pre-trained language models (PLMs), widely verified in various NLP tasks, pioneering efforts attempt to explore the possible cooperation of the general textual information in PLM with the personalized behavioral information in user historical behavior sequences to enhance sequential recommendation (SR). However, despite the commonalities of input format and task goal, there are huge gaps between the behavioral and textual information, which obstruct thoroughly modeling SR as language modeling via PLM. To bridge the gap, we propose a novel Unified pre-trained language model enhanced sequential recommendation (UPSR), aiming to build a unified pre-trained recommendation model for multi-domain recommendation tasks. We formally design five key indicators, namely naturalness, domain consistency, informativeness, noise & ambiguity, and text length, to guide the text-item adaptation and behavior sequence-text sequence adaptation differently for pre-training and fine-tuning stages, which are essential but under-explored by previous works. In experiments, we conduct extensive evaluations on seven datasets with both tuning and zero-shot settings and achieve the overall best performance. Comprehensive model analyses also provide valuable insights for behavior modeling via PLM, shedding light on large pre-trained recommendation models. The source code will be released in the future.
Submitted 27 November, 2023; v1 submitted 20 October, 2023;
originally announced October 2023.
-
BRFL: A Blockchain-based Byzantine-Robust Federated Learning Model
Authors:
Yang Li,
Chunhe Xia,
Chang Li,
Tianbo Wang
Abstract:
With the increasing importance of machine learning, the privacy and security of training data have become critical. Federated learning, which stores data in distributed nodes and shares only model parameters, has gained significant attention for addressing this concern. However, a challenge arises in federated learning due to the Byzantine Attack Problem, where malicious local models can compromise the global model's performance during aggregation. This article proposes the Blockchain-based Byzantine-Robust Federated Learning (BRFL) model that combines federated learning with blockchain technology. This integration enables traceability of malicious models and provides incentives for locally trained clients. Our approach involves selecting the aggregation node based on Pearson's correlation coefficient, and we perform spectral clustering and calculate the average gradient within each cluster, validating its accuracy using the local dataset of the aggregation nodes. Experimental results on public datasets demonstrate the superior Byzantine robustness of our secure aggregation algorithm compared to other baseline Byzantine-robust aggregation methods, and show the effectiveness of our proposed model in addressing the resource consumption problem.
Submitted 20 October, 2023;
originally announced October 2023.
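The BRFL abstract above mentions Pearson's correlation coefficient as the basis for selecting the aggregation node. As a minimal sketch of that ingredient only, the following computes the coefficient between flattened client updates using the standard library. The function names, the toy update vectors, and the ranking rule (highest mean correlation with the other clients) are illustrative assumptions; the abstract does not specify how the coefficient is actually used.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def most_correlated_client(updates):
    """Hypothetical rule: index of the update with the highest mean
    correlation to all other updates (not taken from the paper)."""
    scores = []
    for i, u in enumerate(updates):
        others = [pearson(u, v) for j, v in enumerate(updates) if j != i]
        scores.append(sum(others) / len(others))
    return max(range(len(updates)), key=scores.__getitem__)

# Toy flattened updates: clients 0 and 1 agree, client 2 points the other way.
updates = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
]
```

In this toy setting the outlying update receives the lowest score, which is the intuition behind using correlation to flag updates that disagree with the majority.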
-
Correlation functions for the $N^*(1535)$ and the inverse problem
Authors:
Raquel Molina,
Chu-Wen Xiao,
Wei-Hong Liang,
Eulogio Oset
Abstract:
The $N^*(1535)$ can be dynamically generated in the chiral unitary approach with the coupled channels, $K^0 Σ^+, K^+ Σ^0, K^+ Λ$ and $ηp$. In this work we evaluate the correlation functions for every channel and face the inverse problem. Assuming the correlation functions to correspond to real measurements, we conduct a fit to the data within a general framework in order to extract the information contained in these correlation functions. The bootstrap method is used to determine the uncertainties of the different observables, and we find that, assuming errors of the same order as in present measurements of correlation functions, one can determine the scattering length and effective range of all channels with very good accuracy. Most remarkable is the fact that the method predicts the existence of a bound state of isospin $\frac{1}{2}$ nature around the mass of the $N^*(1535)$ with an accuracy of $6\; \rm MeV$. These results should encourage the actual measurement of these correlation functions (only the $K^+ Λ$ one has been measured so far), which can shed valuable light on the relationship of the $N^*(1535)$ state to these coupled channels, a subject of continuous debate.
Submitted 8 February, 2024; v1 submitted 19 October, 2023;
originally announced October 2023.
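The abstract above relies on the bootstrap method to estimate uncertainties of fitted observables. A generic sketch of the idea, resampling with replacement and taking the spread of the recomputed statistic, is given below. It is not the paper's fitting procedure: the function name, the default statistic (the mean), the number of resamples, and the seed are all assumptions chosen for illustration.

```python
import random
import statistics

def bootstrap_std(values, statistic=statistics.fmean, n_resamples=1000, seed=0):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    estimates = [
        statistic(rng.choices(values, k=n))  # one bootstrap resample
        for _ in range(n_resamples)
    ]
    return statistics.stdev(estimates)
```

In a fitting context, `statistic` would be replaced by a function that refits the model to each resampled data set and returns the observable of interest.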
-
Hierarchical accompanying and inhibiting patterns on the spatial arrangement of taxis' local hotspots
Authors:
Xiao-Jian Chen,
Changjiang Xiao,
Zhou Huang,
Keli Wang,
Weiyu Zhang,
Yu Liu
Abstract:
Due to the large volume of recording, the complete spontaneity, and the flexible pick-up and drop-off locations, taxi data portrays a realistic and detailed picture of urban space use to a certain extent. The spatial arrangement of pick-up and drop-off hotspots reflects the organizational space, which has received attention in urban structure studies. Previous studies mainly explore the hotspots at a large scale by visual analysis or some simple indexes, where the hotspots usually cover the entire central business district, train stations, or dense residential areas, reaching a radius of hundreds or even thousands of meters. However, the spatial arrangement patterns of small-scale hotspots, reflecting the specific popular pick-up and drop-off locations, have not received much attention. Using two taxi trajectory datasets in Wuhan and Beijing, China, this study quantitatively explores the spatial arrangement of fine-grained pick-up and drop-off local hotspots with different levels of popularity, where the sizes are adaptively set as 90m*90m in Wuhan and 105m*105m in Beijing according to the local hotspot identification method. Results show that popular hotspots tend to be surrounded by less popular hotspots, but the existence of less popular hotspots is inhibited in regions with a large number of popular hotspots. We use the terms hierarchical accompanying and inhibiting patterns for these two spatial configurations. Finally, to uncover the underlying mechanism, a KNN-based model is proposed to reproduce the spatial distribution of other less popular hotspots according to the most popular ones. These findings help decision-makers construct reasonable urban minimum units for precise traffic and disease control, as well as plan a more humane spatial arrangement of points of interest.
Submitted 18 October, 2023;
originally announced October 2023.
-
Magnetic eight-fold nodal-point and nodal-network fermions in MnB2
Authors:
Yongheng Ge,
Ziming Zhu,
Zeying Zhang,
Weikang Wu,
Cong Xiao,
Shengyuan A. Yang
Abstract:
Realizing topological semimetal states with novel emergent fermions in magnetic materials is a focus of current research. Based on first-principle calculations and symmetry analysis, we reveal interesting magnetic emergent fermions in an existing material MnB2. In the temperature range from 157 K to 760 K, MnB2 is a collinear antiferromagnet. We find the coexistence of eightfold nodal points and nodal net close to the Fermi level, which are protected by the spin group in the absence of spin-orbit coupling. Depending on the Neel vector orientation, consideration of spin-orbit coupling will either open small gaps at these nodal features, or transform them into magnetic linear and quadratic Dirac points and nodal rings. Below 157 K, MnB2 acquires weak ferromagnetism due to spin tilting. We predict that this transition is accompanied by a drastic change in anomalous Hall response, from zero above 157 K to 200 $Ω\cdot \text{cm}^{-1}$ below 157 K.
Submitted 17 October, 2023;
originally announced October 2023.
-
Alzheimer Disease is Associated with Isotropic Ocular Enlargement
Authors:
Shuyue Ma,
Qihui Ye,
Chufan Xiao,
Haifei Guan,
Zhicheng Du,
Peiwu Qin
Abstract:
Recent studies have documented ocular changes in dementia patients, especially Alzheimer Disease (AD). In this study, we explored the change of eye size and eye shape in dementia, including AD patients. The eyeball volume and diameters were estimated via T1-weighted brain magnetic resonance (MR) images in the OASIS-3 database, which included 83 AD, 247 non-AD dementia and 336 normal-aging participants qualified for this study. After adjustment for age, sex, race, apolipoprotein E genotypes, anisotropic ratio and intracranial volume, we observed that the eyeball volume of the AD group was significantly larger than both the normal control (6871 mm3 vs 6415 mm3, p < 0.001) and the non-AD dementia group (6871 mm3 vs 6391 mm3, p < 0.001), but there was no difference between the non-AD dementia group and the normal control (6391 mm3 vs 6415 mm3, p = 0.795). Similar results were observed for the axial, transverse and vertical length. No group differences were observed in the anisotropic ratio, indicating an isotropic volume increase consistent with previous changes induced by ocular hypertension (OH), which suggested possible elevation of the intraocular pressure (IOP) in AD. In consideration of the recent findings in ocular changes of dementia, our findings emphasize routine eye examinations and eye care for AD patients in the clinic.
Submitted 13 October, 2023;
originally announced October 2023.
-
HiCL: Hierarchical Contrastive Learning of Unsupervised Sentence Embeddings
Authors:
Zhuofeng Wu,
Chaowei Xiao,
VG Vinod Vydiswaran
Abstract:
In this paper, we propose a hierarchical contrastive learning framework, HiCL, which considers local segment-level and global sequence-level relationships to improve training efficiency and effectiveness. Traditional methods typically encode a sequence in its entirety for contrast with others, often neglecting local representation learning, leading to challenges in generalizing to shorter texts. Conversely, HiCL improves its effectiveness by dividing the sequence into several segments and employing both local and global contrastive learning to model segment-level and sequence-level relationships. Further, considering the quadratic time complexity of transformers over input tokens, HiCL boosts training efficiency by first encoding short segments and then aggregating them to obtain the sequence representation. Extensive experiments show that HiCL enhances the prior top-performing SNCSE model across seven extensively evaluated STS tasks, with an average increase of +0.2% observed on BERT-large and +0.44% on RoBERTa-large.
Submitted 14 October, 2023;
originally announced October 2023.
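The HiCL abstract above attributes its efficiency gain to the quadratic cost of transformers over input tokens. A back-of-the-envelope sketch of that argument: full self-attention over n tokens touches n² token pairs, while encoding k segments separately touches roughly n²/k. The function names are hypothetical, and the count ignores the cost of the later aggregation step and all non-attention layers.

```python
def attention_pairs(n_tokens):
    """Number of token pairs scored by full self-attention over one sequence."""
    return n_tokens * n_tokens

def segmented_attention_pairs(n_tokens, n_segments):
    """Token pairs scored when the sequence is split into near-equal
    segments and each segment is encoded on its own."""
    base, extra = divmod(n_tokens, n_segments)
    # `extra` segments get one additional token so the lengths sum to n_tokens.
    lengths = [base + 1] * extra + [base] * (n_segments - extra)
    return sum(attention_pairs(m) for m in lengths)
```

For example, a 512-token sequence split into 4 segments of 128 tokens needs 4 × 128² = 65,536 pair scores instead of 512² = 262,144.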
-
Incorporating Domain Knowledge Graph into Multimodal Movie Genre Classification with Self-Supervised Attention and Contrastive Learning
Authors:
Jiaqi Li,
Guilin Qi,
Chuanyi Zhang,
Yongrui Chen,
Yiming Tan,
Chenlong Xia,
Ye Tian
Abstract:
Multimodal movie genre classification has always been regarded as a demanding multi-label classification task due to the diversity of multimodal data such as posters, plot summaries, trailers and metadata. Although existing works have made great progress in modeling and combining each modality, they still face three issues: 1) unutilized group relations in metadata, 2) unreliable attention allocation, and 3) indiscriminative fused features. Given that the knowledge graph has been proven to contain rich information, we present a novel framework that exploits the knowledge graph from various perspectives to address the above problems. As a preparation, the metadata is processed into a domain knowledge graph. A translate model for knowledge graph embedding is adopted to capture the relations between entities. Firstly we retrieve the relevant embedding from the knowledge graph by utilizing group relations in metadata and then integrate it with other modalities. Next, we introduce an Attention Teacher module for reliable attention allocation based on self-supervised learning. It learns the distribution of the knowledge graph and produces rational attention weights. Finally, a Genre-Centroid Anchored Contrastive Learning module is proposed to strengthen the discriminative ability of fused features. The embedding space of anchors is initialized from the genre entities in the knowledge graph. To verify the effectiveness of our framework, we collect a larger and more challenging dataset named MM-IMDb 2.0 compared with the MM-IMDb dataset. The experimental results on two datasets demonstrate that our model is superior to the state-of-the-art methods. We will release the code in the near future.
Submitted 12 October, 2023;
originally announced October 2023.
-
Leveraging Hierarchical Feature Sharing for Efficient Dataset Condensation
Authors:
Haizhong Zheng,
Jiachen Sun,
Shutong Wu,
Bhavya Kailkhura,
Zhuoqing Mao,
Chaowei Xiao,
Atul Prakash
Abstract:
Given a real-world dataset, data condensation (DC) aims to synthesize a significantly smaller dataset that captures the knowledge of this dataset for model training with high performance. Recent works propose to enhance DC with data parameterization, which condenses data into parameterized data containers rather than pixel space. The intuition behind data parameterization is to encode shared features of images to avoid additional storage costs. In this paper, we recognize that images share common features in a hierarchical way due to the inherent hierarchical structure of the classification system, which is overlooked by current data parameterization methods. To better align DC with this hierarchical nature and encourage more efficient information sharing inside data containers, we propose a novel data parameterization architecture, Hierarchical Memory Network (HMN). HMN stores condensed data in a three-tier structure, representing the dataset-level, class-level, and instance-level features. Another helpful property of the hierarchical architecture is that HMN naturally ensures good independence among images despite achieving information sharing. This enables instance-level pruning for HMN to reduce redundant information, thereby further minimizing redundancy and enhancing performance. We evaluate HMN on four public datasets (SVHN, CIFAR10, CIFAR100, and Tiny-ImageNet) and compare HMN with eight DC baselines. The evaluation results show that our proposed method outperforms all baselines, even when trained with a batch-based loss consuming less GPU memory.
Submitted 11 October, 2023;
originally announced October 2023.
-
Anyview: Generalizable Indoor 3D Object Detection with Variable Frames
Authors:
Zhenyu Wu,
Xiuwei Xu,
Ziwei Wang,
Chong Xia,
Linqing Zhao,
Jiwen Lu,
Haibin Yan
Abstract:
In this paper, we propose a novel network framework for indoor 3D object detection to handle variable input frame numbers in practical scenarios. Existing methods only consider fixed frames of input data for a single detector, such as monocular RGB-D images or point clouds reconstructed from dense multi-view RGB-D images. In practical application scenes such as robot navigation and manipulation, by contrast, the raw input to the 3D detectors is RGB-D images with variable frame numbers rather than the reconstructed scene point cloud. Previous approaches can only handle fixed-frame input data and perform poorly with variable-frame input. To make 3D object detection methods suitable for practical tasks, we present a novel 3D detection framework named AnyView, which generalizes well across different numbers of input frames with a single model. To be specific, we propose a geometric learner to mine the local geometric features of each input RGB-D image frame and implement local-global feature interaction through a designed spatial mixture module. Meanwhile, we further utilize a dynamic token strategy to adaptively adjust the number of extracted features for each frame, which ensures consistent global feature density and further enhances the generalization after fusion. Extensive experiments on the ScanNet dataset show our method achieves both great generalizability and high detection accuracy with a simple and clean architecture containing a similar number of parameters to the baselines.
Submitted 8 October, 2023;
originally announced October 2023.
-
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
Authors:
Shuaiwen Leon Song,
Bonnie Kruft,
Minjia Zhang,
Conglong Li,
Shiyang Chen,
Chengming Zhang,
Masahiro Tanaka,
Xiaoxia Wu,
Jeff Rasley,
Ammar Ahmad Awan,
Connor Holmes,
Martin Cai,
Adam Ghanem,
Zhongzhu Zhou,
Yuxiong He,
Pete Luferenko,
Divya Kumar,
Jonathan Weyn,
Ruixiong Zhang,
Sylwester Klocek,
Volodymyr Vragov,
Mohammed AlQuraishi,
Gustaf Ahdritz,
Christina Floristean,
Cristina Negri
, et al. (67 additional authors not shown)
Abstract:
In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present the DeepSpeed4Science initiative (deepspeed4science.ai), which aims to build unique capabilities through AI system technology innovations to help domain experts unlock today's biggest science mysteries. By leveraging DeepSpeed's current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). In this paper, we showcase the early progress we made with DeepSpeed4Science in addressing two of the critical system challenges in structural biology research.
Submitted 11 October, 2023; v1 submitted 6 October, 2023;
originally announced October 2023.
-
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models
Authors:
Xiaogeng Liu,
Nan Xu,
Muhao Chen,
Chaowei Xiao
Abstract:
The aligned Large Language Models (LLMs) are powerful language understanding and decision-making tools that are created through extensive alignment with human feedback. However, these large models remain susceptible to jailbreak attacks, where adversaries manipulate prompts to elicit malicious outputs that should not be given by aligned LLMs. Investigating jailbreak prompts can lead us to delve into the limitations of LLMs and further guide us to secure them. Unfortunately, existing jailbreak techniques suffer from either (1) scalability issues, where attacks heavily rely on manual crafting of prompts, or (2) stealthiness problems, as attacks depend on token-based algorithms to generate prompts that are often semantically meaningless, making them susceptible to detection through basic perplexity testing. In light of these challenges, we intend to answer this question: Can we develop an approach that can automatically generate stealthy jailbreak prompts? In this paper, we introduce AutoDAN, a novel jailbreak attack against aligned LLMs. AutoDAN can automatically generate stealthy jailbreak prompts by the carefully designed hierarchical genetic algorithm. Extensive evaluations demonstrate that AutoDAN not only automates the process while preserving semantic meaningfulness, but also demonstrates superior attack strength in cross-model transferability, and cross-sample universality compared with the baseline. Moreover, we also compare AutoDAN with perplexity-based defense methods and show that AutoDAN can bypass them effectively.
Submitted 20 March, 2024; v1 submitted 3 October, 2023;
originally announced October 2023.
-
CSI: Enhancing the Robustness of 3D Point Cloud Recognition against Corruption
Authors:
Zhuoyuan Wu,
Jiachen Sun,
Chaowei Xiao
Abstract:
Despite recent advancements in deep neural networks for point cloud recognition, real-world safety-critical applications present challenges due to unavoidable data corruption. Current models often fall short in generalizing to unforeseen distribution shifts. In this study, we harness the inherent set property of point cloud data to introduce a novel critical subset identification (CSI) method, aiming to bolster recognition robustness in the face of data corruption. Our CSI framework integrates two pivotal components: density-aware sampling (DAS) and self-entropy minimization (SEM), which cater to static and dynamic CSI, respectively. DAS ensures efficient robust anchor point sampling by factoring in local density, while SEM is employed during training to accentuate the most salient point-to-point attention. Evaluations reveal that our CSI approach yields error rates of 18.4% and 16.3% on ModelNet40-C and PointCloud-C, respectively, marking a notable improvement over state-of-the-art methods by margins of 5.2% and 4.2% on the respective benchmarks. Code is available at https://github.com/masterwu2115/CSI/tree/main
Submitted 5 October, 2023;
originally announced October 2023.
-
HarmonyDream: Task Harmonization Inside World Models
Authors:
Haoyu Ma,
Jialong Wu,
Ningya Feng,
Chenjun Xiao,
Dong Li,
Jianye Hao,
Jianmin Wang,
Mingsheng Long
Abstract:
Model-based reinforcement learning (MBRL) holds the promise of sample-efficient learning by utilizing a world model, which models how the environment works and typically encompasses components for two tasks: observation modeling and reward modeling. In this paper, through a dedicated empirical investigation, we gain a deeper understanding of the role each task plays in world models and uncover the overlooked potential of sample-efficient MBRL by mitigating the domination of either observation or reward modeling. Our key insight is that while prevalent approaches of explicit MBRL attempt to restore abundant details of the environment via observation models, it is difficult due to the environment's complexity and limited model capacity. On the other hand, reward models, while dominating implicit MBRL and adept at learning compact task-centric dynamics, are inadequate for sample-efficient learning without richer learning signals. Motivated by these insights and discoveries, we propose a simple yet effective approach, HarmonyDream, which automatically adjusts loss coefficients to maintain task harmonization, i.e. a dynamic equilibrium between the two tasks in world model learning. Our experiments show that the base MBRL method equipped with HarmonyDream gains 10%-69% absolute performance boosts on visual robotic tasks and sets a new state-of-the-art result on the Atari 100K benchmark. Code is available at https://github.com/thuml/HarmonyDream.
Submitted 5 June, 2024; v1 submitted 30 September, 2023;
originally announced October 2023.
-
Encountered-Type Haptic Display via Tracking Calibrated Robot
Authors:
Chenxi Xiao,
Yuan Tian
Abstract:
In the past decades, a variety of haptic devices have been developed to facilitate high-fidelity human-computer interaction (HCI) in virtual reality (VR). In particular, passive haptic feedback can create a compelling sensation based on real objects spatially overlapping with their virtual counterparts. However, these approaches require pre-deployment efforts, hindering their democratizing use in practice. We propose the Tracking Calibrated Robot (TCR), a novel and general haptic approach to free developers from deployment efforts, which can be potentially deployed in any scenario. Specifically, we augment the VR with a collaborative robot that renders haptic contact in the real world while the user touches a virtual object in the virtual world. The distance between the user's finger and the robot end-effector is controlled over time. The distance starts to smoothly reduce to zero when the user intends to touch the virtual object. A mock user study tested users' perception of three virtual objects, and the result shows that TCR is effective in terms of conveying discriminative shape information.
Submitted 28 September, 2023;
originally announced September 2023.
-
Experimental Study of the Nematic Transition in Granular Spherocylinder Packings under Tapping
Authors:
Haitao Yu,
Zhikun Zeng,
Ye Yuan,
Shuyang Zhang,
Chengjie Xia,
Yujie Wang
Abstract:
Using x-ray tomography, we experimentally investigate the nematic transition in granular spherocylinder packings induced by tapping. Upon the validation of the Edwards ensemble framework in spherocylinders, we introduce an empirical free energy that accounts for the influence of gravity and the mechanical stability requirements specific to granular systems. This free energy can predict not only the correct phase transition behavior of the system from a disordered state to a nematic phase, but also a phase coexistence range and nucleation energy barriers that agree with experimental observations.
Submitted 26 September, 2023;
originally announced September 2023.
-
Multiferroic Magnon Spin-Torque Based Reconfigurable Logic-In-Memory
Authors:
Yahong Chai,
Yuhan Liang,
Cancheng Xiao,
Yue Wang,
Bo Li,
Dingsong Jiang,
Pratap Pal,
Yongjian Tang,
Hetian Chen,
Yuejie Zhang,
Witold Skowroński,
Qinghua Zhang,
Lin Gu,
Jing Ma,
Pu Yu,
Jianshi Tang,
Yuan-Hua Lin,
Di Yi,
Daniel C. Ralph,
Chang-Beom Eom,
Huaqiang Wu,
Tianxiang Nan
Abstract:
Magnons, bosonic quasiparticles carrying angular momentum, can flow through insulators for information transmission with minimal power dissipation. However, it remains challenging to develop a magnon-based logic due to the lack of efficient electrical manipulation of magnon transport. Here we present a magnon logic-in-memory device in a spin-source/multiferroic/ferromagnet structure, where multiferroic magnon modes can be electrically excited and controlled. In this device, magnon information is encoded to ferromagnetic bits by the magnon-mediated spin torque. We show that the ferroelectric polarization can electrically modulate the magnon spin-torque by controlling the non-collinear antiferromagnetic structure in multiferroic bismuth ferrite thin films with coupled antiferromagnetic and ferroelectric orders. By manipulating the two coupled non-volatile state variables (ferroelectric polarization and magnetization), we further demonstrate reconfigurable logic-in-memory operations in a single device. Our findings highlight the potential of multiferroics for controlling magnon information transport and offer a pathway towards room-temperature voltage-controlled, low-power, scalable magnonics for in-memory computing.
Submitted 25 September, 2023;
originally announced September 2023.
-
Hybrid Strangeon Stars
Authors:
Chen Zhang,
Yong Gao,
Cheng-Jun Xia,
Renxin Xu
Abstract:
It was conjectured that the basic units of the ground state of bulk strong matter may be strange-clusters called strangeons, and they can form self-bound strangeon stars that are highly compact. Strangeon stars can develop a strange quark matter (SQM) core at high densities, particularly in the color-flavor-locking phase, yielding a branch of hybrid strangeon stars. We explore the stellar structure and astrophysical implications of hybrid strangeon stars. We find that hybrid strangeon stars can meet various astrophysical constraints on pulsar masses, radii, and tidal deformabilities. Finally, we show that the strangeon-SQM mixed phase is not preferred if the charge-neutrality condition is imposed at the strangeon-SQM transition region.
Submitted 8 January, 2024; v1 submitted 25 September, 2023;
originally announced September 2023.
-
Strangelets formation in high energy heavy-ion collisions
Authors:
Huai-Min Chen,
Cheng-Jun Xia,
Guang-Xiong Peng
Abstract:
The properties of the phase diagram of strange quark matter in equilibrium with hadronic matter at finite temperature are studied, where the quark phase and hadron phase are treated by the baryon density-dependent quark mass model and the hadron resonance gas model with a hard core repulsion factor, respectively. The thermodynamic conditions for the formation of metastable strange quark droplets ("strangelets") in relativistic nuclear collisions are discussed. We obtain a rich structure of the phase diagram at finite temperature, and study the dynamical trajectories of an expanding strange fireball. Our results indicate that the strangeness fraction fs, perturbation parameter C, and confinement parameter D have a strong influence on the properties of the phase diagram and the formation of strangelets. Considering the isentropic expansion process, we find that an initial entropy per baryon less than or equal to 5 gives a large probability for the formation of strangelets. Furthermore, a sufficiently large strangeness fraction fs and one-gluon-exchange interaction, together with a sufficiently small confinement interaction, create possibilities for the formation of strangelets. On the contrary, the fireball will always complete the hadronization process when fs=0 or C>=0 or D^{1/2}>=170 MeV.
Submitted 29 February, 2024; v1 submitted 24 September, 2023;
originally announced September 2023.
-
Effective Distillation of Table-based Reasoning Ability from LLMs
Authors:
Bohao Yang,
Chen Tang,
Kun Zhao,
Chenghao Xiao,
Chenghua Lin
Abstract:
Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks. However, their enormous parameter size and extremely high requirements for compute power pose challenges for their practical deployment. Recent research has revealed that specific capabilities of LLMs, such as numerical reasoning, can be transferred to smaller models through distillation. Some studies explore the potential of leveraging LLMs to perform table-based reasoning. However, there has been no prior work focusing on table reasoning skills in smaller models specifically tailored for scientific table-to-text generation tasks. In this paper, we propose a novel table-based reasoning distillation approach, with the aim of distilling LLMs into tailored smaller models. Our experimental results have shown that a 220 million parameter model (Flan-T5-base) fine-tuned using distilled data, not only achieves a significant improvement compared to traditionally fine-tuned baselines, but also surpasses specific LLMs on a scientific table-to-text generation dataset. Our code is available at https://github.com/Bernard-Yang/DistillTableCoT.
Submitted 25 March, 2024; v1 submitted 22 September, 2023;
originally announced September 2023.
-
Audio Contrastive based Fine-tuning
Authors:
Yang Wang,
Qibin Liang,
Chenghao Xiao,
Yizhi Li,
Noura Al Moubayed,
Chenghua Lin
Abstract:
Audio classification plays a crucial role in speech and sound processing tasks with a wide range of applications. There still remains a challenge of striking the right balance between fitting the model to the training data (avoiding overfitting) and enabling it to generalise well to a new domain. Leveraging the transferability of contrastive learning, we introduce Audio Contrastive-based Fine-tuning (AudioConFit), an efficient approach characterised by robust generalisability. Empirical experiments on a variety of audio classification tasks demonstrate the effectiveness and robustness of our approach, which achieves state-of-the-art results in various settings.
Submitted 19 October, 2023; v1 submitted 21 September, 2023;
originally announced September 2023.
-
Periodic solution for transport of intense and coupled coasting beams through quadrupole channels
Authors:
Chen Xiao,
Lars Groening
Abstract:
Imposing defined spinning to a particle beam increases its stability against perturbations from space charge [Y.-L. Cheon et al., Effects of beam spinning on the fourth-order particle resonance of 3D bunched beams in high-intensity linear accelerators, Phys. Rev. Accel. & Beams 25, 064002 (2022)]. In order to fully explore this potential, proper matching of intense coupled beams along regular lattices is mandatory. Herein, a novel procedure assuring matched transport is described and benchmarked through simulations. The concept of matched transport along periodic lattices has been extended from uncoupled beams to those with considerable coupling between the two transverse degrees of freedom. For coupled beams, matching means extension of cell-to-cell periodicity from just transverse envelopes to the coupled beam moments and to quantities being derived from these.
Submitted 20 September, 2023;
originally announced September 2023.
-
GAME: Generalized deep learning model towards multimodal data integration for early screening of adolescent mental disorders
Authors:
Zhicheng Du,
Chenyao Jiang,
Xi Yuan,
Shiyao Zhai,
Zhengyang Lei,
Shuyue Ma,
Yang Liu,
Qihui Ye,
Chufan Xiao,
Qiming Huang,
Ming Xu,
Dongmei Yu,
Peiwu Qin
Abstract:
The timely identification of mental disorders in adolescents is a global public health challenge. A single factor can hardly reveal the abnormality because of its complex and subtle nature. Additionally, generalized multimodal Computer-Aided Screening (CAS) systems with interactive robots for adolescent mental disorders are not available. Here, we design an Android application with mini-games and chat recording deployed in a portable robot to screen 3,783 middle school students and construct the multimodal screening dataset, including facial images, physiological signs, voice recordings, and textual transcripts. We develop a model called GAME (Generalized Model with Attention and Multimodal EmbraceNet) with a novel attention mechanism that integrates cross-modal features into the model. GAME evaluates adolescent mental conditions with high accuracy (73.34%-92.77%) and F1-Score (71.32%-91.06%). We find that each modality contributes dynamically to the mental disorders screening and comorbidities among various mental disorders, indicating the feasibility of an explainable model. This study provides a system capable of acquiring multimodal information and constructs a generalized multimodal integration algorithm with novel attention mechanisms for the early screening of adolescent mental disorders.
Submitted 18 September, 2023;
originally announced September 2023.
-
Quasi-periodic oscillations during magnetar giant flares in the strangeon star model
Authors:
Hong-Bo Li,
Yacheng Kang,
Zexin Hu,
Lijing Shao,
Cheng-Jun Xia,
Ren-Xin Xu
Abstract:
Soft gamma-ray repeaters (SGRs) are widely understood as slowly rotating isolated neutron stars. Their generally large spin-down rates, high magnetic fields, and strong outburst energies render them different from ordinary pulsars. In a few giant flares (GFs) and short bursts of SGRs, high-confidence quasi-periodic oscillations (QPOs) were observed. Although this remains an open question, many theoretical studies suggest that the torsional oscillations caused by starquakes could explain QPOs. Motivated by this scenario, we systematically investigate torsional oscillation frequencies based on the strangeon-star (SS) model with various values of harmonic indices and overtones. To characterize the strong-repulsive interaction at short distances and the non-relativistic nature of strangeons, a phenomenological Lennard-Jones model is adopted. We show that, owing to the large shear modulus of SSs, our results explain well the high-frequency QPOs ($\gtrsim 150\,\mathrm{Hz}$) during the GFs. The low-frequency QPOs ($\lesssim 150\,\mathrm{Hz}$) can also be interpreted when the ocean-crust interface modes are included. We also discuss possible effects of the magnetic field on the torsional mode frequencies. Considering realistic models with general-relativistic corrections and magnetic fields, we further calculate torsional oscillation frequencies for quark stars. We show that it would be difficult for quark stars to explain all QPOs in GFs. Our work advances the understanding of the nature of QPOs and magnetar asteroseismology.
Submitted 28 October, 2023; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Semantic Adversarial Attacks via Diffusion Models
Authors:
Chenan Wang,
Jinhao Duan,
Chaowei Xiao,
Edward Kim,
Matthew Stamm,
Kaidi Xu
Abstract:
Traditional adversarial attacks concentrate on manipulating clean examples in the pixel space by adding adversarial perturbations. By contrast, semantic adversarial attacks focus on changing semantic attributes of clean examples, such as color, context, and features, which are more feasible in the real world. In this paper, we propose a framework to quickly generate a semantic adversarial attack by leveraging recent diffusion models since semantic information is included in the latent space of well-trained diffusion models. Then there are two variants of this framework: 1) the Semantic Transformation (ST) approach fine-tunes the latent space of the generated image and/or the diffusion model itself; 2) the Latent Masking (LM) approach masks the latent space with another target image and local backpropagation-based interpretation methods. Additionally, the ST approach can be applied in either white-box or black-box settings. Extensive experiments are conducted on CelebA-HQ and AFHQ datasets, and our framework demonstrates great fidelity, generalizability, and transferability compared to other baselines. Our approaches achieve approximately 100% attack success rate in multiple settings with the best FID as 36.61. Code is available at https://github.com/steven202/semantic_adv_via_dm.
Submitted 13 September, 2023;
originally announced September 2023.
-
Distribution Grid Line Outage Identification with Unknown Pattern and Performance Guarantee
Authors:
Chenhan Xiao,
Yizheng Liao,
Yang Weng
Abstract:
Line outage identification in distribution grids is essential for sustainable grid operation. In this work, we propose a practical yet robust detection approach that utilizes only readily available voltage magnitudes, eliminating the need for costly phase angles or power flow data. Given the sensor data, many existing detection methods based on change-point detection require prior knowledge of outage patterns, which are unknown for real-world outage scenarios. To remove this impractical requirement, we propose a data-driven method to learn the parameters of the post-outage distribution through gradient descent. However, directly using gradient descent presents feasibility issues. To address this, we modify our approach by adding a Bregman divergence constraint to control the trajectory of the parameter updates, which eliminates the feasibility problems. As timely operation is the key nowadays, we prove that the optimal parameters can be learned with convergence guarantees via leveraging the statistical and physical properties of voltage data. We evaluate our approach using many representative distribution grids and real load profiles with 17 outage configurations. The results show that we can detect and localize the outage in a timely manner with only voltage magnitudes and without assuming a prior knowledge of outage patterns.
Submitted 10 September, 2023;
originally announced September 2023.
-
A real-time hole depth diagnostic based on coherent imaging with plasma amendment during femtosecond laser hole-drilling
Authors:
** Xu,
Yi Yu,
Chijie Xiao,
Ruijia Liu,
Kang Zha,
Lin Zhou,
Yongtao Liu,
Zhou Xu
Abstract:
An in-process coherent imaging diagnostic has been developed to real-time measure the hole depth during air-film hole drilling by a femtosecond laser. A super-luminescent diode with a wavelength of 830~13 nm is chosen as the coherent light source which determines a depth resolution of 12 μm. The drilled hole is coupled as a part of the sample arm and the depth variation can be extracted from the length variation of the optical path. Interference is realized in the detection part and a code has been written to discriminate the interference fringes. Density of plasma in the hole is diagnosed to evaluate its amendment to the optical path length and the depth measurement error induced by plasma is non-ignorable when drilling deep holes.
Submitted 17 June, 2023;
originally announced September 2023.
-
Towards High-Quality Specular Highlight Removal by Leveraging Large-Scale Synthetic Data
Authors:
Gang Fu,
Qing Zhang,
Lei Zhu,
Chunxia Xiao,
Ping Li
Abstract:
This paper aims to remove specular highlights from a single object-level image. Although previous methods have made some progress, their performance remains somewhat limited, particularly for real images with complex specular highlights. To this end, we propose a three-stage network to address them. Specifically, given an input image, we first decompose it into the albedo, shading, and specular residue components to estimate a coarse specular-free image. Then, we further refine the coarse result to alleviate its visual artifacts such as color distortion. Finally, we adjust the tone of the refined result to match that of the input as closely as possible. In addition, to facilitate network training and quantitative evaluation, we present a large-scale synthetic dataset of object-level images, covering diverse objects and illumination conditions. Extensive experiments illustrate that our network is able to generalize well to unseen real object-level images, and even produce good results for scene-level images with multiple background objects and complex lighting.
Submitted 12 September, 2023;
originally announced September 2023.
-
Proximity-induced interfacial room-temperature ferromagnetism in semiconducting Fe3GeTe2
Authors:
Qianwen Zhao,
Yingmei Zhu,
Hanying Zhang,
Baiqing Jiang,
Yuan Wang,
Tunan Xie,
Kaihua Lou,
ChaoChao Xia,
Hongxin Yang,
C. Bi
Abstract:
The discoveries of two-dimensional ferromagnetism and magnetic semiconductors highly enrich the magnetic material family for constructing spin-based electronic devices but with an acknowledged challenge that the Curie temperature (Tc) is usually far below room temperature. Many efforts such as voltage control and magnetic ion doping are currently underway to enhance the functional temperature, in which the involvement of additional electrodes or extra magnetic ions limits their application in practical devices. Here we demonstrate that the magnetic proximity, a robust effect but with elusive mechanisms, can induce room-temperature ferromagnetism at the interface between sputtered Pt and semiconducting Fe3GeTe2, both of which do not show ferromagnetism at 300 K. The independent electrical and magnetization measurements, structure analysis, and control samples with Ta highlighting the role of Pt confirm that the ferromagnetism with the Tc of above 400 K arises from the Fe3GeTe2/Pt interfaces, rather than Fe aggregation or other artificial effects. Moreover, contrary to conventional ferromagnet/Pt structures, the spin current generated by the Pt layer is enhanced more than two times at the Fe3GeTe2/Pt interfaces, indicating the potential applications of the unique proximity effect in building highly efficient spintronic devices. These results may pave a new avenue to create room-temperature functional spin devices based on low-Tc materials and provide clear evidence of magnetic proximity effects by using non-ferromagnetic materials.
Submitted 12 September, 2023;
originally announced September 2023.
-
Privacy-Preserving Line Outage Detection in Distribution Grids: An Efficient Approach with Uncompromised Performance
Authors:
Chenhan Xiao,
Yizheng Liao,
Yang Weng
Abstract:
Recent advancements in research have shown the efficacy of employing sensor measurements, such as voltage and power data, in identifying line outages within distribution grids. However, these measurements inadvertently pose privacy risks to electricity customers by potentially revealing their sensitive information, such as household occupancy and economic status, to adversaries. To safeguard raw data from direct exposure to third-party adversaries, this paper proposes a novel decentralized data encryption scheme. The effectiveness of this encryption strategy is validated via demonstration of its differential privacy attributes by studying the Gaussian differential privacy. Recognizing that the encryption of raw data could affect the efficacy of outage detection, this paper analyzes the performance degradation by examining the Kullback-Leibler divergence between data distributions before and after the line outage. This analysis allows us to further alleviate the performance degradation by designing an innovative detection statistic that accurately approximates the optimal one. Manipulating the variance of this statistic, we demonstrate its ability to approach the optimal detection performance. The proposed privacy-aware detection procedure is evaluated using representative distribution grids and real load profiles, covering 17 distinct outage configurations. Our empirical results confirm the privacy-preserving nature of our approach and show that it achieves comparable detection performance to the optimal baseline.
Submitted 5 June, 2024; v1 submitted 10 September, 2023;
originally announced September 2023.
-
From Text to Mask: Localizing Entities Using the Attention of Text-to-Image Diffusion Models
Authors:
Changming Xiao,
Qi Yang,
Feng Zhou,
Changshui Zhang
Abstract:
Diffusion models have revolutionized the field of text-to-image generation recently. The unique way of fusing text and image information contributes to their remarkable capability of generating highly text-related images. From another perspective, these generative models imply clues about the precise correlation between words and pixels. In this work, a simple but effective method is proposed to utilize the attention mechanism in the denoising network of text-to-image diffusion models. Without re-training or inference-time optimization, the semantic grounding of phrases can be attained directly. We evaluate our method on Pascal VOC 2012 and Microsoft COCO 2014 under the weakly-supervised semantic segmentation setting, and our method achieves superior performance to prior methods. In addition, the acquired word-pixel correlation is found to be generalizable for the learned text embedding of customized generation methods, requiring only a few modifications. To validate our discovery, we introduce a new practical task called "personalized referring image segmentation" with a new dataset. Experiments in various situations demonstrate the advantages of our method compared to strong baselines on this task. In summary, our work reveals a novel way to extract the rich multi-modal knowledge hidden in diffusion models for segmentation.
Submitted 8 September, 2023;
originally announced September 2023.
-
Two states for the $Ξ(1820)$ resonance
Authors:
R. Molina,
Wei-Hong Liang,
Chu-Wen Xiao,
Zhi-Feng Sun,
E. Oset
Abstract:
We recall that the chiral unitary approach for the interaction of pseudoscalar mesons with the baryons of the decuplet predicts two states for the $Ξ(1820)$ resonance, one with a narrow width and the other one with a large width. We contrast this fact with the recent BESIII measurement of the $K^- Λ$ mass distribution in the $ψ(3686)$ decay to $K^- Λ\barΞ^+ $, which demands a width much larger than the average of the PDG, and show how the consideration of the two $Ξ(1820)$ states provides a natural explanation to this apparent contradiction.
Submitted 7 September, 2023;
originally announced September 2023.
-
All Labels Together: Low-shot Intent Detection with an Efficient Label Semantic Encoding Paradigm
Authors:
Jiangshu Du,
Congying Xia,
Wenpeng Yin,
Tingting Liang,
Philip S. Yu
Abstract:
In intent detection tasks, leveraging meaningful semantic information from intent labels can be particularly beneficial for few-shot scenarios. However, existing few-shot intent detection methods either ignore the intent labels (e.g. treating intents as indices) or do not fully utilize this information (e.g. only using part of the intent labels). In this work, we present an end-to-end One-to-All system that enables the comparison of an input utterance with all label candidates. The system can then fully utilize label semantics in this way. Experiments on three few-shot intent detection tasks demonstrate that One-to-All is especially effective when the training resource is extremely scarce, achieving state-of-the-art performance in 1-, 3- and 5-shot settings. Moreover, we present a novel pretraining strategy for our model that utilizes indirect supervision from paraphrasing, enabling zero-shot cross-domain generalization on intent detection tasks. Our code is at https://github.com/jiangshdd/AllLablesTogether.
Submitted 7 September, 2023; v1 submitted 7 September, 2023;
originally announced September 2023.
-
XGen-7B Technical Report
Authors:
Erik Nijkamp,
Tian Xie,
Hiroaki Hayashi,
Bo Pang,
Congying Xia,
Chen Xing,
Jesse Vig,
Semih Yavuz,
Philippe Laban,
Ben Krause,
Senthil Purushwalkam,
Tong Niu,
Wojciech Kryściński,
Lidiya Murakhovs'ka,
Prafulla Kumar Choubey,
Alex Fabbri,
Ye Liu,
Rui Meng,
Lifu Tu,
Meghana Bhat,
Chien-Sheng Wu,
Silvio Savarese,
Yingbo Zhou,
Shafiq Joty,
Caiming Xiong
Abstract:
Large Language Models (LLMs) have become ubiquitous across various domains, transforming the way we interact with information and conduct research. However, most high-performing LLMs remain confined behind proprietary walls, hindering scientific progress. Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over an input context. To address this, we have trained XGen, a series of 7B parameter models on up to 8K sequence length for up to 1.5T tokens. We have also finetuned the XGen models on public-domain instructional data, creating their instruction-tuned counterparts (XGen-Inst). We open-source our models for both research advancements and commercial applications. Our evaluation on standard benchmarks shows that XGen models achieve comparable or better results when compared with state-of-the-art open-source LLMs. Our targeted evaluation on long sequence modeling tasks shows the benefits of our 8K-sequence models over 2K-sequence open-source LLMs.
Submitted 6 September, 2023;
originally announced September 2023.
-
Zero-Resource Hallucination Prevention for Large Language Models
Authors:
Junyu Luo,
Cao Xiao,
Fenglong Ma
Abstract:
The prevalent use of large language models (LLMs) in various domains has drawn attention to the issue of "hallucination," which refers to instances where LLMs generate factually inaccurate or ungrounded information. Existing techniques for hallucination detection in language assistants rely on intricate fuzzy, specific free-language-based chain of thought (CoT) techniques or parameter-based methods that suffer from interpretability issues. Additionally, the methods that identify hallucinations post-generation cannot prevent their occurrence and suffer from inconsistent performance due to the influence of the instruction format and model style. In this paper, we introduce a novel pre-detection self-evaluation technique, referred to as SELF-FAMILIARITY, which focuses on evaluating the model's familiarity with the concepts present in the input instruction and withholding the generation of a response in case of unfamiliar concepts. This approach emulates the human ability to refrain from responding to unfamiliar topics, thus reducing hallucinations. We validate SELF-FAMILIARITY across four different large language models, demonstrating consistently superior performance compared to existing techniques. Our findings propose a significant shift towards preemptive strategies for hallucination mitigation in LLM assistants, promising improvements in reliability, applicability, and interpretability.
Submitted 7 October, 2023; v1 submitted 5 September, 2023;
originally announced September 2023.
-
Data-constrained Magnetohydrodynamic Simulation of an Intermediate Solar Filament Eruption
Authors:
Yang Guo,
**han Guo,
Yiwei Ni,
M. D. Ding,
P. F. Chen,
Chun Xia,
Rony Keppens,
Kai E. Yang
Abstract:
Solar eruptive activities could occur in weak magnetic field environments and over large spatial scales, especially relevant to eruptions involving intermediate or quiescent solar filaments. To handle the large scales, we implement and apply a flux rope embedding method using regularized Biot-Savart laws in the spherical coordinate system. Combined with a potential field source surface model and a magneto-frictional method, a nonlinear force-free field comprising a flux rope embedded in a potential field is constructed. Using the combined nonlinear force-free field as the initial condition, we then perform a zero-$β$ data-constrained magnetohydrodynamic (MHD) simulation for an M8.7 flare at 03:38 UT on 2012 January 23. The MHD model reproduces the eruption process, flare ribbon evolution (represented by the quasi-separatrix layer evolution) and kinematics of the flux rope. This approach could potentially model global-scale eruptions from weak field regions.
Submitted 3 September, 2023;
originally announced September 2023.
-
Reinforcement Learning with Human Feedback for Realistic Traffic Simulation
Authors:
Yulong Cao,
Boris Ivanovic,
Chaowei Xiao,
Marco Pavone
Abstract:
In light of the challenges and costs of real-world testing, autonomous vehicle developers often rely on testing in simulation for the creation of reliable systems. A key element of effective simulation is the incorporation of realistic traffic models that align with human knowledge, an aspect that has proven challenging due to the need to balance realism and diversity. This work aims to address this by developing a framework that employs reinforcement learning with human preference (RLHF) to enhance the realism of existing traffic models. This study also identifies two main challenges: capturing the nuances of human preferences on realism and the unification of diverse traffic simulation models. To tackle these issues, we propose using human feedback for alignment and employ RLHF due to its sample efficiency. We also introduce the first dataset for realism alignment in traffic modeling to support such research. Our framework, named TrafficRLHF, demonstrates its proficiency in generating realistic traffic scenarios that are well-aligned with human preferences, as corroborated by comprehensive evaluations on the nuScenes dataset.
Submitted 1 September, 2023;
originally announced September 2023.
-
Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair
Authors:
Yuxiang Wei,
Chunqiu Steven Xia,
Lingming Zhang
Abstract:
During Automated Program Repair (APR), it can be challenging to synthesize correct patches for real-world systems in general-purpose programming languages. Recent Large Language Models (LLMs) have been shown to be helpful "copilots" in assisting developers with various coding tasks, and have also been directly applied for patch synthesis. However, most LLMs treat programs as sequences of tokens, meaning that they are ignorant of the underlying semantic constraints of the target programming language. This results in plenty of statically invalid generated patches, impeding the practicality of the technique. Therefore, we propose Repilot, a general code generation framework to further copilot the AI "copilots" (i.e., LLMs) by synthesizing more valid patches during the repair process. Our key insight is that many LLMs produce outputs autoregressively (i.e., token by token), resembling human writing programs, which can be significantly boosted and guided through a Completion Engine. Repilot synergistically synthesizes a candidate patch through the interaction between an LLM and a Completion Engine, which 1) prunes away infeasible tokens suggested by the LLM and 2) proactively completes the token based on the suggestions provided by the Completion Engine. Our evaluation on a subset of the widely-used Defects4j 1.2 and 2.0 datasets shows that Repilot outperforms state-of-the-art techniques by fixing 27% and 47% more bugs, respectively. Moreover, Repilot produces more valid and correct patches than the base LLM with the same budget. While we focus on leveraging Repilot for APR in this work, the overall approach is also generalizable to other code generation tasks.
Submitted 8 November, 2023; v1 submitted 1 September, 2023;
originally announced September 2023.
-
Hydrodynamic limit and Newtonian limit from the relativistic Boltzmann equation to the classical Euler equations
Authors:
Yong Wang,
Changguo Xiao
Abstract:
The hydrodynamic limit and Newtonian limit are important in the relativistic kinetic theory. We justify rigorously the validity of the two independent limits from the special relativistic Boltzmann equation to the classical Euler equations without assuming any dependence between the Knudsen number $\varepsilon$ and the light speed $\mathfrak{c}$. The convergence rates are also obtained. This is achieved by a Hilbert expansion of the relativistic Boltzmann equation. New difficulties arise when tackling the uniform in $\mathfrak{c}$ and $\varepsilon$ estimates for the Hilbert expansion, which have been overcome by establishing some uniform-in-$\mathfrak{c}$ estimate for relativistic Boltzmann operators.
Submitted 31 August, 2023;
originally announced August 2023.
-
Large Language Models as Data Preprocessors
Authors:
Haochen Zhang,
Yuyang Dong,
Chuan Xiao,
Masafumi Oyamada
Abstract:
Large Language Models (LLMs), typified by OpenAI's GPT series and Meta's LLaMA variants, have marked a significant advancement in artificial intelligence. Trained on vast amounts of text data, LLMs are capable of understanding and generating human-like text across a diverse range of topics. This study expands on the applications of LLMs, exploring their potential in data preprocessing, a critical stage in data mining and analytics applications. We delve into the applicability of state-of-the-art LLMs such as GPT-3.5, GPT-4, and Vicuna-13B for error detection, data imputation, schema matching, and entity matching tasks. Alongside showcasing the inherent capabilities of LLMs, we highlight their limitations, particularly in terms of computational expense and inefficiency. We propose an LLM-based framework for data preprocessing, which integrates cutting-edge prompt engineering techniques, coupled with traditional methods like contextualization and feature selection, to improve the performance and efficiency of these models. The effectiveness of LLMs in data preprocessing is evaluated through an experimental study spanning 12 datasets. GPT-4 emerged as a standout, achieving 100% accuracy or F1 score on 4 datasets, suggesting LLMs' immense potential in these tasks. Despite certain limitations, our study underscores the promise of LLMs in this domain and anticipates future developments to overcome current hurdles.
Submitted 30 August, 2023;
originally announced August 2023.
-
Pionic and radiative transitions from $T_{c\bar{s}0}^+(2900)$ to $D_{s1}^+(2460)$ as a probe of the structure of $D_{s1}^+(2460)$
Authors:
Zi-Li Yue,
Cheng-Jian Xiao,
Dian-Yong Chen
Abstract:
In this work, we evaluated the widths of the pionic and radiative transitions from the $T_{c\bar{s}0}^{+}(2900)$ to the $D_{s1}^{+}(2460)$ in the $D_{s1}^{+}(2460)$ molecular frame and the $D_{s1}^{+}(2460)$ charmed-strange meson frame. Our estimations demonstrate that the transition widths in the $D_{s1}^{+}(2460)$ molecular frame are much larger than those in the $D_{s1}^{+}(2460)$ charmed-strange meson frame. Specifically, the ratio of the widths of $Γ(T_{c\bar{s}0}^{+}(2900)\to D_{s1}^{+} π^{0})$ and $Γ(T_{c\bar{s}0}^{+}(2900)\to D^{+(0)}K^{0(+)})$ is estimated to be around 0.1 in the $D_{s1}^{+}(2460)$ charmed-strange meson frame, whereas the lower limit of this ratio is 0.67 in the $D_{s1}^{+}(2460)$ molecular frame. Thus, the aforementioned ratio could be employed as a tool for testing the nature of the $D_{s1}^{+}(2460)$.
Submitted 3 September, 2023; v1 submitted 29 August, 2023;
originally announced August 2023.
-
Exciton-exciton Interaction in Monolayer MoSe$_2$ from Mutual Screening of Coulomb Binding
Authors:
Ke Xiao,
Tengfei Yan,
Chengxin Xiao,
Feng-ren Fan,
Ruihuan Duan,
Zheng Liu,
Kenji Watanabe,
Takashi Taniguchi,
Wang Yao,
Xiaodong Cui
Abstract:
The potential for low-threshold optical nonlinearity has received significant attention in the fields of photonics and conceptual optical neuron networks. Excitons in two-dimensional (2D) semiconductors are particularly promising in this regard as reduced screening and dimensional confinement foster their pronounced many-body interactions towards nonlinearity. However, experimental determination of the interactions remains ambiguous, as optical pumping in general creates a mixture of excitons and unbound carriers, where the impacts of band gap renormalization and carrier screening on exciton energy counteract each other. Here, by comparing the influences on exciton ground and excited state energies in the photoluminescence spectroscopy of monolayer MoSe$_2$, we are able to identify separately the screening of Coulomb binding by the neutral excitons and by charge carriers. The energy difference between exciton ground state (A-1s) and excited state (A-2s) red-shifts by 5.5 meV when the neutral exciton density increases from 0 to $4\times 10^{11}$ cm$^{-2}$, in contrast to the blue shifts with the increase of either electron or hole density. This energy difference change is attributed to the mutual screening of Coulomb binding of neutral excitons, from which we extract an exciton polarizability of $α_{2D}^{\rm exciton} = 2.55\times 10^{-17}$ eV(m/V)$^2$. Our finding uncovers a new mechanism that dominates the repulsive part of many-body interaction between neutral excitons.
Submitted 28 August, 2023;
originally announced August 2023.
-
DiffSmooth: Certifiably Robust Learning via Diffusion Models and Local Smoothing
Authors:
Jiawei Zhang,
Zhongzhu Chen,
Huan Zhang,
Chaowei Xiao,
Bo Li
Abstract:
Diffusion models have been leveraged to perform adversarial purification and thus provide both empirical and certified robustness for a standard model. On the other hand, different robustly trained smoothed models have been studied to improve the certified robustness. Thus, it raises a natural question: Can diffusion models be used to achieve improved certified robustness on those robustly trained smoothed models? In this work, we first theoretically show that recovered instances by diffusion models are in the bounded neighborhood of the original instance with high probability; and the "one-shot" denoising diffusion probabilistic models (DDPM) can approximate the mean of the generated distribution of a continuous-time diffusion model, which approximates the original instance under mild conditions. Inspired by our analysis, we propose a certifiably robust pipeline DiffSmooth, which first performs adversarial purification via diffusion models and then maps the purified instances to a common region via a simple yet effective local smoothing strategy. We conduct extensive experiments on different datasets and show that DiffSmooth achieves SOTA-certified robustness compared with eight baselines. For instance, DiffSmooth improves the SOTA-certified accuracy from $36.0\%$ to $53.0\%$ under $\ell_2$ radius $1.5$ on ImageNet. The code is available at https://github.com/javyduck/DiffSmooth.
Submitted 28 August, 2023;
originally announced August 2023.
-
Orbital Chern Insulator at $ν=-2$ in Twisted MoTe$_{2}$
Authors:
Feng-Ren Fan,
Cong Xiao,
Wang Yao
Abstract:
In twisted MoTe$_{2}$, the latest transport measurement has reported observation of the quantum anomalous Hall effect at hole filling $ν=-1$, which undergoes a topological phase transition to a trivial ferromagnet as layer hybridization gets suppressed by interlayer bias $D$. Here we show that this underlies the existence of an orbital Chern insulating state with gate ($D$) switchable sign in an antiferromagnetic spin background at hole filling $ν=-2$. From momentum-space Hartree-Fock calculations, we find this state has a topological phase diagram complementary to that of the $ν=-1$ one: by sweeping $D$ from negative to positive, the Chern number of this $ν=-2$ state can be switched between $+1$, $0$, and $-1$, accompanied by a sign change of a sizable orbital magnetization. In the range of $D$ where this antiferromagnet is the ground state, the orbital magnetization allows magnetic field initialization of the spin antiferromagnetic order and the Chern number.
Submitted 20 December, 2023; v1 submitted 22 August, 2023;
originally announced August 2023.
-
"Guinea Pig Trials" Utilizing GPT: A Novel Smart Agent-Based Modeling Approach for Studying Firm Competition and Collusion
Authors:
Xu Han,
Zengqing Wu,
Chuan Xiao
Abstract:
Firm competition and collusion involve complex dynamics, particularly when considering communication among firms. Such issues can be modeled as problems of complex systems, traditionally approached through experiments involving human subjects or agent-based modeling methods. We propose an innovative framework called Smart Agent-Based Modeling (SABM), wherein smart agents, supported by GPT-4 technologies, represent firms, and interact with one another. We conducted a controlled experiment to study firm price competition and collusion behaviors under various conditions. SABM is more cost-effective and flexible compared to conducting experiments with human subjects. Smart agents possess an extensive knowledge base for decision-making and exhibit human-like strategic abilities, surpassing traditional ABM agents. Furthermore, smart agents can simulate human conversation and be personalized, making them ideal for studying complex situations involving communication. Our results demonstrate that, in the absence of communication, smart agents consistently reach tacit collusion, leading to prices converging at levels higher than the Bertrand equilibrium price but lower than monopoly or cartel prices. When communication is allowed, smart agents achieve a higher-level collusion with prices close to cartel prices. Collusion forms more quickly with communication, while price convergence is smoother without it. These results indicate that communication enhances trust between firms, encouraging frequent small price deviations to explore opportunities for a higher-level win-win situation and reducing the likelihood of triggering a price war. We also assigned different personas to firms to analyze behavioral differences and tested variant models under diverse market structures. The findings showcase the effectiveness and robustness of SABM and provide intriguing insights into competition and collusion.
Submitted 31 January, 2024; v1 submitted 21 August, 2023;
originally announced August 2023.
-
Numerical strategy on the grid orientation effect in the simulation for two-phase flow in porous media by using the adaptive artificial viscosity method
Authors:
Xiao-Hong Wang,
Meng-Chen Yue,
Zhi-Feng Liu,
Wei-Dong Cao,
Yong Wang,
Jun Hu,
Chang-Hao Xiao,
Yao-Yong Li
Abstract:
It is a challenge to numerically solve nonlinear partial differential equations whose solution involves discontinuity. In the context of numerical simulators for multi-phase flow in porous media, there exists a long-standing issue known as Grid Orientation Effect (GOE), wherein different numerical solutions can be obtained when considering grids with different orientations under certain unfavorable conditions. Our perspective is that GOE arises due to numerical instability near displacement fronts, where spurious oscillations accompanied by sharp fronts, if not adequately suppressed, lead to GOE. To reduce or even eliminate GOE, we propose augmenting adaptive artificial viscosity when solving the saturation equation. It has been demonstrated that appropriate artificial viscosity can effectively reduce or even eliminate GOE. The proposed numerical method can be easily applied in practical engineering problems.
Submitted 13 August, 2023;
originally announced August 2023.
-
Fuzz4All: Universal Fuzzing with Large Language Models
Authors:
Chunqiu Steven Xia,
Matteo Paltenghi,
Jia Le Tian,
Michael Pradel,
Lingming Zhang
Abstract:
Fuzzing has achieved tremendous success in discovering bugs and vulnerabilities in various software systems. Systems under test (SUTs) that take in programming or formal language as inputs, e.g., compilers, runtime engines, constraint solvers, and software libraries with accessible APIs, are especially important as they are fundamental building blocks of software development. However, existing fuzzers for such systems often target a specific language, and thus cannot be easily applied to other languages or even other versions of the same language. Moreover, the inputs generated by existing fuzzers are often limited to specific features of the input language, and thus can hardly reveal bugs related to other or new features. This paper presents Fuzz4All, the first fuzzer that is universal in the sense that it can target many different input languages and many different features of these languages. The key idea behind Fuzz4All is to leverage large language models (LLMs) as an input generation and mutation engine, which enables the approach to produce diverse and realistic inputs for any practically relevant language. To realize this potential, we present a novel autoprompting technique, which creates LLM prompts that are well-suited for fuzzing, and a novel LLM-powered fuzzing loop, which iteratively updates the prompt to create new fuzzing inputs. We evaluate Fuzz4All on nine systems under test that take in six different languages (C, C++, Go, SMT2, Java and Python) as inputs. The evaluation shows, across all six languages, that universal fuzzing achieves higher coverage than existing, language-specific fuzzers. Furthermore, Fuzz4All has identified 98 bugs in widely used systems, such as GCC, Clang, Z3, CVC5, OpenJDK, and the Qiskit quantum computing platform, with 64 bugs already confirmed by developers as previously unknown.
Submitted 15 January, 2024; v1 submitted 9 August, 2023;
originally announced August 2023.
-
Accurate, Explainable, and Private Models: Providing Recourse While Minimizing Training Data Leakage
Authors:
Catherine Huang,
Chelse Swoopes,
Christina Xiao,
Jiaqi Ma,
Himabindu Lakkaraju
Abstract:
Machine learning models are increasingly utilized across impactful domains to predict individual outcomes. As such, many models provide algorithmic recourse to individuals who receive negative outcomes. However, recourse can be leveraged by adversaries to disclose private information. This work presents the first attempt at mitigating such attacks. We present two novel methods to generate differentially private recourse: Differentially Private Model (DPM) and Laplace Recourse (LR). Using logistic regression classifiers and real world and synthetic datasets, we find that DPM and LR perform well in reducing what an adversary can infer, especially at low FPR. When training dataset size is large enough, we find particular success in preventing privacy leakage while maintaining model and recourse accuracy with our novel LR method.
Submitted 8 August, 2023;
originally announced August 2023.
-
Multi-scale Alternated Attention Transformer for Generalized Stereo Matching
Authors:
Wei Miao,
Hong Zhao,
Tongjia Chen,
Wei Huang,
Changyan Xiao
Abstract:
Recent stereo matching networks achieve dramatic performance by introducing an epipolar line constraint to limit the matching range of dual-view. However, in complicated real-world scenarios, the feature information based on the intra-epipolar line alone is too weak to facilitate stereo matching. In this paper, we present a simple but highly effective network called Alternated Attention U-shaped Transformer (AAUformer) to balance the impact of the epipolar line in dual and single view respectively for excellent generalization performance. Compared to other models, our model has several main designs: 1) to better liberate the local semantic features of the single-view at pixel level, we introduce window self-attention to break the limits of intra-row self-attention and completely replace the convolutional network for denser features before cross-matching; 2) the multi-scale alternated attention backbone network was designed to extract invariant features in order to achieve the coarse-to-fine matching process for hard-to-discriminate regions. We performed a series of both comparative studies and ablation studies on several mainstream stereo matching datasets. The results demonstrate that our model achieves state-of-the-art on the Scene Flow dataset, and the fine-tuning performance is competitive on the KITTI 2015 dataset. In addition, for cross generalization experiments on synthetic and real-world datasets, our model outperforms several state-of-the-art works.
Submitted 6 August, 2023;
originally announced August 2023.
-
Passively Adaptive Radiative Switch for Thermoregulation in Buildings
Authors:
Charles Xiao,
Bolin Liao,
Elliot W. Hawkes
Abstract:
With the ever-growing need to reduce energy consumption, building materials that passively heat or cool are gaining importance. However, many buildings require both heating and cooling, even within the same day. To date, few technologies can automatically switch between passive heating and cooling, and those that can require a large temperature range to cycle states (>15 °C), making them ineffective for daily switching. We present a passively adaptive radiative switch that leverages the expansion in phase-change energy storage materials to actuate the motion of louvers and can cycle states in less than 3 °C. The black selective-absorber louvers induce high heat gain when closed, yet when open, expose a white, emissive surface for low heat gain. During an outdoor test in which temperature was held steady, our device reduced the energetic cost of cooling by 3.1x and heating by 2.6x compared to non-switching devices. Our concept opens the door for passively adaptive thermoregulating building materials.
Submitted 8 October, 2023; v1 submitted 3 August, 2023;
originally announced August 2023.