-
ScribblePolyp: Scribble-Supervised Polyp Segmentation through Dual Consistency Alignment
Authors:
Zixun Zhang,
Yuncheng Jiang,
Jun Wei,
Hannah Cui,
Zhen Li
Abstract:
Automatic polyp segmentation models play a pivotal role in the clinical diagnosis of gastrointestinal diseases. In previous studies, most methods relied on fully supervised approaches, necessitating pixel-level annotations for model training. However, the creation of pixel-level annotations is both expensive and time-consuming, impeding the development of model generalization. In response to this challenge, we introduce ScribblePolyp, a novel scribble-supervised polyp segmentation framework. Unlike fully-supervised models, ScribblePolyp only requires the annotation of two lines (scribble labels) for each image, significantly reducing the labeling cost. Despite the coarse nature of scribble labels, which leave a substantial portion of pixels unlabeled, we propose a two-branch consistency alignment approach to provide supervision for these unlabeled pixels. The first branch employs transformation consistency alignment to narrow the gap between predictions under different transformations of the same input image. The second branch leverages affinity propagation to refine predictions into a soft version, extending additional supervision to unlabeled pixels. In summary, ScribblePolyp is an efficient model that does not rely on teacher models or moving average pseudo labels during training. Extensive experiments on the SUN-SEG dataset underscore the effectiveness of ScribblePolyp, achieving a Dice score of 0.8155, with the potential for a 1.8% improvement in the Dice score through a straightforward self-training strategy.
Submitted 8 November, 2023;
originally announced November 2023.
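The ScribblePolyp abstract above reports its results as a Dice score (0.8155 on SUN-SEG). As a reminder of what that metric measures, here is a minimal sketch of the standard Dice coefficient for binary masks. It is not the authors' evaluation code; the function name `dice_score`, the `eps` smoothing term, and the toy masks are assumptions chosen for illustration.

```python
import numpy as np


def dice_score(pred, target, eps=1e-8):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks.

    `eps` is an assumed smoothing term that avoids division by zero
    when both masks are empty; it is not taken from the paper.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


# Toy masks (hypothetical data): 3 predicted pixels, 3 true pixels, 2 shared.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])

score = dice_score(pred, target)  # 2 * 2 / (3 + 3), about 0.667
print(f"Dice = {score:.4f}")
```

A score of 1 means the predicted and reference masks coincide exactly, and 0 means they share no foreground pixels.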
-
Flexible uniform-sampling foveated Fourier single-pixel imaging
Authors:
Huan Cui,
Jie Cao,
Qun Hao,
Haoyu Zhang,
Chang Zhou
Abstract:
Fourier single-pixel imaging (FSI) is a data-efficient form of single-pixel imaging (SPI). However, obtaining higher imaging quality from fewer measurements remains a serious challenge, which limits the development of real-time SPI. In this work, a uniform-sampling foveated FSI (UFFSI) is proposed with three features (uniform sampling, effective sampling and flexible fovea) to achieve under-sampled, high-efficiency and high-quality SPI, even in a large-scale scene. First, by flexibly using the three proposed foveated pattern structures, data redundancy is reduced significantly so that high resolution (HR) is required only on regions of interest (ROIs), which radically reduces the total amount of data needed. Next, through non-uniform weight distribution processing, non-uniform spatial sampling is transformed into uniform sampling, so that the fast Fourier transform can be applied accurately and directly to obtain high imaging quality from under-sampled data with further reduced measurements. Experimentally, at a sampling ratio of 0.0084 relative to HR FSI with 1024*768 pixels, UFFSI with 255*341 cells (an 89% reduction in data redundancy) gives the ROI significantly better imaging quality, meeting imaging needs. We hope this work can provide a breakthrough for future real-time SPI.
Submitted 5 November, 2023;
originally announced November 2023.
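The figures quoted in the UFFSI abstract above can be checked with simple arithmetic. The sketch below assumes that the 89% reduction compares the 255*341 foveated cells with the 1024*768 HR pixels, and that the sampling ratio is taken relative to the HR pixel count. Both readings are interpretations of the abstract, not statements from the paper, and the variable names are hypothetical.

```python
# Pixel and cell counts quoted in the abstract.
hr_pixels = 1024 * 768      # high-resolution FSI grid
uffsi_cells = 255 * 341     # foveated cell grid

# Assumed reading: redundancy reduction = 1 - cells / pixels.
reduction = 1 - uffsi_cells / hr_pixels

# Assumed reading: sampling ratio is relative to the HR pixel count,
# so this is a count of sampled coefficients, not of projected patterns.
sampling_ratio = 0.0084
sampled = sampling_ratio * hr_pixels

print(f"cells / pixels      = {uffsi_cells} / {hr_pixels}")
print(f"redundancy reduction = {reduction:.1%}")
print(f"sampled coefficients = {sampled:.0f}")
```

Under these assumptions the reduction rounds to the 89% stated in the abstract, which suggests the first reading is at least consistent with the reported numbers.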
-
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
Authors:
Ran Xu,
Hejie Cui,
Yue Yu,
Xuan Kan,
Wenqi Shi,
Yuchen Zhuang,
Wei Jin,
Joyce Ho,
Carl Yang
Abstract:
Clinical natural language processing requires methods that can address domain-specific challenges, such as complex medical terminology and clinical contexts. Recently, large language models (LLMs) have shown promise in this domain. Yet, their direct deployment can lead to privacy issues and is constrained by resources. To address this challenge, we delve into synthetic clinical text generation using LLMs for clinical NLP tasks. We propose an innovative, resource-efficient approach, ClinGen, which infuses knowledge into the process. Our model involves clinical knowledge extraction and context-informed LLM prompting. Both clinical topics and writing styles are drawn from external domain-specific knowledge graphs and LLMs to guide data generation. Our extensive empirical study across 7 clinical NLP tasks and 16 datasets reveals that ClinGen consistently enhances performance across various tasks, effectively aligning the distribution of real datasets and significantly enriching the diversity of generated training instances. We will publish our code and all the generated data at https://github.com/ritaranx/ClinGen.
Submitted 1 November, 2023;
originally announced November 2023.
-
CHAIN: Exploring Global-Local Spatio-Temporal Information for Improved Self-Supervised Video Hashing
Authors:
Rukai Wei,
Yu Liu,
Jingkuan Song,
Heng Cui,
Yanzhao Xie,
Ke Zhou
Abstract:
Compressing videos into binary codes can improve retrieval speed and reduce storage overhead. However, learning accurate hash codes for video retrieval can be challenging due to high local redundancy and complex global dependencies between video frames, especially in the absence of labels. Existing self-supervised video hashing methods have been effective in designing expressive temporal encoders, but have not fully utilized the temporal dynamics and spatial appearance of videos due to less challenging and unreliable learning tasks. To address these challenges, we begin by utilizing the contrastive learning task to capture global spatio-temporal information of videos for hashing. With the aid of our designed augmentation strategies, which focus on spatial and temporal variations to create positive pairs, the learning framework can generate hash codes that are invariant to motion, scale, and viewpoint. Furthermore, we incorporate two collaborative learning tasks, i.e., frame order verification and scene change regularization, to capture local spatio-temporal details within video frames, thereby enhancing the perception of temporal structure and the modeling of spatio-temporal relationships. Our proposed Contrastive Hashing with Global-Local Spatio-temporal Information (CHAIN) outperforms state-of-the-art self-supervised video hashing methods on four video benchmark datasets. Our codes will be released.
Submitted 29 October, 2023;
originally announced October 2023.
-
Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting
Authors:
Hejie Cui,
Xinyu Fang,
Zihan Zhang,
Ran Xu,
Xuan Kan,
Xin Liu,
Yue Yu,
Manling Li,
Yangqiu Song,
Carl Yang
Abstract:
Images contain rich relational knowledge that can help machines understand the world. Existing methods on visual knowledge extraction often rely on the pre-defined format (e.g., sub-verb-obj tuples) or vocabulary (e.g., relation types), restricting the expressiveness of the extracted knowledge. In this work, we take a first exploration to a new paradigm of open visual knowledge extraction. To achieve this, we present OpenVik which consists of an open relational region detector to detect regions potentially containing relational knowledge and a visual knowledge generator that generates format-free knowledge by prompting the large multimodality model with the detected region of interest. We also explore two data enhancement techniques for diversifying the generated format-free visual knowledge. Extensive knowledge quality evaluations highlight the correctness and uniqueness of the extracted open visual knowledge by OpenVik. Moreover, integrating our extracted knowledge across various visual reasoning applications shows consistent improvements, indicating the real-world applicability of OpenVik.
Submitted 28 October, 2023;
originally announced October 2023.
-
Does or did the supernova remnant Cassiopeia A operate as a PeVatron?
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
For decades, supernova remnants (SNRs) have been considered the prime sources of Galactic cosmic rays (CRs). But whether SNRs can accelerate CR protons to PeV energies and thus dominate CR flux up to the knee is currently under intensive theoretical and phenomenological debate. The direct test of the ability of SNRs to operate as CR PeVatrons can be provided by ultrahigh-energy (UHE; $E_γ \geq 100$ TeV) $γ$-rays. In this context, the historical SNR Cassiopeia A (Cas A) is considered one of the most promising targets for UHE observations. This paper presents the observation of Cas A and its vicinity by the LHAASO KM2A detector. The exceptional sensitivity of LHAASO KM2A in the UHE band, combined with the young age of Cas A, enabled us to derive stringent model-independent limits on the energy budget of UHE protons and nuclei accelerated by Cas A at any epoch after the explosion. The results challenge the prevailing paradigm that Cas A-type SNRs are major suppliers of PeV CRs in the Milky Way.
Submitted 25 October, 2023;
originally announced October 2023.
-
Conversational Recommender System and Large Language Model Are Made for Each Other in E-commerce Pre-sales Dialogue
Authors:
Yuanxing Liu,
Wei-Nan Zhang,
Yifan Chen,
Yuchi Zhang,
Haopeng Bai,
Fan Feng,
Hengbin Cui,
Yongbin Li,
Wanxiang Che
Abstract:
E-commerce pre-sales dialogue aims to understand and elicit user needs and preferences for the items they are seeking so as to provide appropriate recommendations. Conversational recommender systems (CRSs) learn user representation and provide accurate recommendations based on dialogue context, but rely on external knowledge. Large language models (LLMs) generate responses that mimic pre-sales dialogues after fine-tuning, but lack domain-specific knowledge for accurate recommendations. Intuitively, the strengths of LLM and CRS in E-commerce pre-sales dialogues are complementary, yet no previous work has explored this. This paper investigates the effectiveness of combining LLM and CRS in E-commerce pre-sales dialogues, proposing two collaboration methods: CRS assisting LLM and LLM assisting CRS. We conduct extensive experiments on a real-world dataset of E-commerce pre-sales dialogues. We analyze the impact of two collaborative approaches with two CRSs and two LLMs on four tasks of E-commerce pre-sales dialogue. We find that collaborations between CRS and LLM can be very effective in some cases.
Submitted 23 October, 2023;
originally announced October 2023.
-
Very high energy gamma-ray emission beyond 10 TeV from GRB 221009A
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
A. Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
The highest energy gamma-rays from gamma-ray bursts (GRBs) have important implications for their radiation mechanism. Here we report for the first time the detection of gamma-rays up to 13 TeV from the brightest GRB 221009A by the Large High Altitude Air-shower Observatory (LHAASO). The LHAASO-KM2A detector registered more than 140 gamma-rays with energies above 3 TeV during 230$-$900 s after the trigger. The intrinsic energy spectrum of gamma-rays can be described by a power-law after correcting for extragalactic background light (EBL) absorption. Such a hard spectrum challenges the synchrotron self-Compton (SSC) scenario of relativistic electrons for the afterglow emission above several TeV. Observations of gamma-rays up to 13 TeV from a source with a measured redshift of z=0.151 hint at more transparency in intergalactic space than previously expected. Alternatively, one may invoke new physics such as Lorentz Invariance Violation (LIV) or an axion origin of very high energy (VHE) signals.
Submitted 22 November, 2023; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Trajectory-aware Principal Manifold Framework for Data Augmentation and Image Generation
Authors:
Elvis Han Cui,
Bingbin Li,
Yanan Li,
Weng Kee Wong,
Donghui Wang
Abstract:
Data augmentation for deep learning benefits model training, image transformation, medical imaging analysis and many other fields. Many existing methods generate new samples from a parametric distribution, like the Gaussian, with little attention to generating samples along the data manifold in either the input or feature space. In this paper, we verify that there are theoretical and practical advantages to using the principal manifold hidden in the feature space over the Gaussian distribution. We then propose a novel trajectory-aware principal manifold framework to restore the manifold backbone and generate samples along a specific trajectory. On top of the autoencoder architecture, we further introduce an intrinsic dimension regularization term to make the manifold more compact and enable few-shot image generation. Experimental results show that the novel framework is able to extract a more compact manifold representation, improve classification accuracy and generate smooth transformations among few samples.
Submitted 30 July, 2023;
originally announced October 2023.
-
RaftFed: A Lightweight Federated Learning Framework for Vehicular Crowd Intelligence
Authors:
Changan Yang,
Yaxing Chen,
Yao Zhang,
Helei Cui,
Zhiwen Yu,
Bin Guo,
Zheng Yan,
Zijiang Yang
Abstract:
Vehicular crowd intelligence (VCI) is an emerging research field. Facilitated by state-of-the-art vehicular ad-hoc networks and artificial intelligence, various VCI applications have emerged, e.g., collaborative sensing, positioning, and mapping. The collaborative property of VCI applications generally requires data to be shared among participants, thus forming network-wide intelligence. How to fulfill this process without compromising data privacy remains a challenging issue. Although federated learning (FL) is a promising tool to solve the problem, adapting conventional FL frameworks to VCI is nontrivial. First, the centralized model aggregation is unreliable in VCI because of the existence of stragglers with unfavorable channel conditions. Second, existing FL schemes are vulnerable to Non-IID data, which is intensified by the data heterogeneity in VCI. This paper proposes a novel federated learning framework called RaftFed to facilitate privacy-preserving VCI. The experimental results show that RaftFed performs better than baselines regarding communication overhead, model accuracy, and model convergence.
Submitted 11 October, 2023;
originally announced October 2023.
-
Analysis of learning a flow-based generative model from limited sample complexity
Authors:
Hugo Cui,
Florent Krzakala,
Eric Vanden-Eijnden,
Lenka Zdeborová
Abstract:
We study the problem of training a flow-based generative model, parametrized by a two-layer autoencoder, to sample from a high-dimensional Gaussian mixture. We provide a sharp end-to-end analysis of the problem. First, we provide a tight closed-form characterization of the learnt velocity field, when parametrized by a shallow denoising auto-encoder trained on a finite number $n$ of samples from the target distribution. Building on this analysis, we provide a sharp description of the corresponding generative flow, which pushes the base Gaussian density forward to an approximation of the target density. In particular, we provide closed-form formulae for the distance between the mean of the generated mixture and the mean of the target mixture, which we show decays as $Θ_n(\frac{1}{n})$. Finally, this rate is shown to be in fact Bayes-optimal.
Submitted 25 June, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Incremental Rotation Averaging Revisited and More: A New Rotation Averaging Benchmark
Authors:
Xiang Gao,
Hainan Cui,
Shuhan Shen
Abstract:
In order to further advance the accuracy and robustness of the incremental parameter estimation-based rotation averaging methods, in this paper, a new member of the Incremental Rotation Averaging (IRA) family is introduced, which is termed IRAv4. As the most significant feature of IRAv4, a task-specific connected dominating set is extracted to serve as a more reliable and accurate reference for rotation global alignment. In addition, to further address the limitations of the existing rotation averaging benchmark of relying on the slightly outdated Bundler camera calibration results as ground truths and focusing solely on rotation estimation accuracy, this paper presents a new COLMAP-based rotation averaging benchmark that incorporates a cross check between COLMAP and Bundler, and employs the accuracy of both rotation and downstream location estimation as evaluation metrics, which is intended to provide a more reliable and comprehensive evaluation tool for the rotation averaging research. Comprehensive comparisons between the proposed IRAv4 and other mainstream rotation averaging methods on this new benchmark demonstrate the effectiveness of our proposed approach.
Submitted 4 January, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Bias Testing and Mitigation in LLM-based Code Generation
Authors:
Dong Huang,
Qingwen Bu,
Jie Zhang,
Xiaofei Xie,
Junjie Chen,
Heming Cui
Abstract:
Utilizing state-of-the-art Large Language Models (LLMs), automatic code generation models play a pivotal role in enhancing the productivity of software development procedures. As the adoption of LLMs becomes more widespread in software coding ecosystems, a pressing issue has emerged: does the generated code contain social bias and unfairness, such as those related to age, gender, and race? This issue concerns the integrity, fairness, and ethical foundation of software applications that depend on the code generated by these models, yet is under-explored in the literature. This paper presents a novel bias testing framework that is specifically designed for code generation tasks. Based on this framework, we conduct an extensive evaluation of the bias in code generated by five state-of-the-art LLMs. Our findings reveal that 20.29% to 44.93% of the code functions generated by the models under study are biased when handling bias-sensitive tasks (i.e., tasks that involve sensitive attributes such as age and gender). This indicates that the existing LLMs can be unfair in code generation, posing risks of unintended and harmful software behaviors. To mitigate bias for code generation models, we evaluate five bias mitigation prompt strategies, i.e., utilizing bias testing results to refine the code (zero-shot), one-, few-shot, and two Chain-of-Thought (CoT) prompts. Our evaluation results illustrate that these strategies are all effective in mitigating bias. Overall, one-shot and few-shot learning are the two most effective. For GPT-4, 80% to 90% of code bias can be removed with one-shot learning.
Submitted 24 May, 2024; v1 submitted 3 September, 2023;
originally announced September 2023.
-
MiliPoint: A Point Cloud Dataset for mmWave Radar
Authors:
Han Cui,
Shu Zhong,
Jiacheng Wu,
Zichao Shen,
Naim Dahnoun,
Yiren Zhao
Abstract:
Millimetre-wave (mmWave) radar has emerged as an attractive and cost-effective alternative for human activity sensing compared to traditional camera-based systems. mmWave radars are also non-intrusive, providing better protection for user privacy. However, as a Radio Frequency (RF) based technology, mmWave radars rely on capturing reflected signals from objects, making them more prone to noise compared to cameras. This raises an intriguing question for the deep learning community: Can we develop more effective point set-based deep learning methods for such attractive sensors?
To answer this question, our work, termed MiliPoint, delves into this idea by providing a large-scale, open dataset for the community to explore how mmWave radars can be utilised for human activity recognition. Moreover, MiliPoint stands out as it is larger in size than existing datasets, has more diverse human actions represented, and encompasses all three key tasks in human activity recognition. We have also established a range of point-based deep neural networks such as DGCNN, PointNet++ and PointTransformer, on MiliPoint, which can serve to set the ground baseline for further development.
Submitted 2 November, 2023; v1 submitted 23 September, 2023;
originally announced September 2023.
-
Fingerprints for anisotropic Kondo lattice behavior in the quasiparticle dynamics of the kagome metal Ni$_3$In
Authors:
Dong-Hyeon Gim,
Dirk Wulferding,
Chulwan Lee,
Hengbo Cui,
Kiwan Nam,
Myung Joon Han,
Kee Hoon Kim
Abstract:
We present a temperature- and polarization-resolved phononic and electronic Raman scattering study in combination with the first-principles calculations on the kagome metal Ni$_3$In with anisotropic transport properties and non-Fermi liquid behavior. At temperatures below 50 K and down to 2 K, several Raman phonon modes, including particularly an interlayer shear mode, exhibit appreciable frequency and linewidth renormalization, reminiscent of the onset of the Kondo screening without an accompanying structural or magnetic phase transition. In addition, a low-energy electronic continuum observed in polarization perpendicular to the kagome planes reveals strong temperature dependence below 50 K, implying thermal depletion of incoherent quasiparticles, while the in-plane continuum remains invariant. These concomitant electronic and phononic Raman signatures suggest that Ni$_3$In undergoes an anisotropic electronic crossover from an incoherent to a coherent Kondo lattice regime below 50 K. We discuss the origin of the anisotropic incoherent-coherent crossover in association with the possible anisotropic Kondo hybridization involving localized Ni-$3d_{xz}$ flat-band electrons.
Submitted 22 September, 2023;
originally announced September 2023.
-
On the Acoustoelasticity of Backward Lamb Wave in Prestressed Plate
Authors:
Zhongtao Hu,
Guo-Yang Li,
Hanyin Cui
Abstract:
Backward Lamb waves, which exhibit a group velocity that propagates in the opposite direction to their phase velocity, have recently garnered considerable attention for their potential applications in nondestructive testing. Herein we present a theoretical study on backward Lamb waves in the elastic plate subject to prestresses. We demonstrate that the group velocity of the first antisymmetric backward Lamb wave, A3b, decreases with tensile stress, whereas that of the first symmetric backward Lamb wave, S2b, increases. Notably, the sensitivity of A3b to prestress is approximately ten times greater than that of S2b, with a ~5% change in group velocity observed under a uniaxial stress of 100 MPa in steel. This heightened sensitivity facilitates an inverse method for determining prestress levels in elastic plates by examining variations in the A3b group velocity. We also investigate the acoustoelastic properties of zero-group-velocity (ZGV) points, which demarcate the dispersion curves of forward and backward Lamb waves. Our findings indicate that the ratio of resonance frequencies corresponding to A3b and S2b monotonically decreases as uniaxial stress increases, providing an alternative method for prestress assessment. Lastly, we propose an experimental setup for measuring backward Lamb waves and visualize the generation of A3b using dynamic photoelastic techniques. Our research elucidates the acoustoelastic characteristics of backward Lamb waves and highlights their promising utility for stress measurement in elastic plates.
Submitted 25 September, 2023; v1 submitted 14 September, 2023;
originally announced September 2023.
-
When Geoscience Meets Foundation Models: Towards General Geoscience Artificial Intelligence System
Authors:
Hao Zhang,
Jin-Jian Xu,
Hong-Wei Cui,
Lin Li,
Yaowen Yang,
Chao-Sheng Tang,
Niklas Boers
Abstract:
Geoscience foundation models (GFMs) represent a revolutionary approach within Earth sciences to integrate massive cross-disciplinary data for improved simulation and understanding of Earth system dynamics. As a data-centric artificial intelligence paradigm, GFMs extract valuable insights from petabytes of both structured and unstructured data. Their versatility in task specification, diverse inputs and outputs, and multi-modal knowledge representation enable a comprehensive analysis that surpasses the capabilities of individual data sources. Critically, the scalability and generalizability of GFMs empower them to address a wide array of prediction, simulation, and decision tasks related to the intricate interactions among Earth system components. By unraveling the causal mechanisms underlying observed patterns and changes, GFMs contribute to advancing our knowledge of the Earth system and its responses to various drivers and perturbations. Collaboration between domain experts and computer scientists plays a pivotal role in fostering innovations in these invaluable tools for understanding the past, present, and future of our planet. Moreover, we introduce recent advances including key technologies for constructing GFMs, especially remote sensing applications. However, challenges remain in validation and verification, scalability, interpretability, knowledge representation, and addressing social bias. Going forward, the key lies in enhancing model integration, resolution, accuracy, and equity through interdisciplinary teamwork. Despite current limitations, GFMs hold great promise for providing critical insights into pressing issues including climate change, natural hazards, and sustainability through their ability to explore multiple scenarios and quantify uncertainties. Their continued evolution toward integrated, data-driven modeling holds paradigm-shifting potential for Earth science.
Submitted 14 March, 2024; v1 submitted 13 September, 2023;
originally announced September 2023.
-
PBP: Path-based Trajectory Prediction for Autonomous Driving
Authors:
Sepideh Afshar,
Nachiket Deo,
Akshay Bhagat,
Titas Chakraborty,
Yunming Shao,
Balarama Raju Buddharaju,
Adwait Deshpande,
Henggang Cui
Abstract:
Trajectory prediction plays a crucial role in the autonomous driving stack by enabling autonomous vehicles to anticipate the motion of surrounding agents. Goal-based prediction models have gained traction in recent years for addressing the multimodal nature of future trajectories. Goal-based prediction models simplify multimodal prediction by first predicting 2D goal locations of agents and then predicting trajectories conditioned on each goal. However, a single 2D goal location serves as a weak inductive bias for predicting the whole trajectory, often leading to poor map compliance, i.e., part of the trajectory going off-road or breaking traffic rules. In this paper, we improve upon goal-based prediction by proposing the Path-based prediction (PBP) approach. PBP predicts a discrete probability distribution over reference paths in the HD map using the path features and predicts trajectories in the path-relative Frenet frame. We applied the PBP trajectory decoder on top of the HiVT scene encoder and report results on the Argoverse dataset. Our experiments show that PBP achieves competitive performance on the standard trajectory prediction metrics, while significantly outperforming state-of-the-art baselines in terms of map compliance.
Submitted 2 March, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
-
MLN-net: A multi-source medical image segmentation method for clustered microcalcifications using multiple layer normalization
Authors:
Ke Wang,
Zanting Ye,
Xiang Xie,
Haidong Cui,
Tao Chen,
Banteng Liu
Abstract:
Accurate segmentation of clustered microcalcifications in mammography is crucial for the diagnosis and treatment of breast cancer. Despite exhibiting expert-level accuracy, recent deep learning advancements in medical image segmentation contribute little to practical applications because of the domain shift resulting from differences in patient posture, individual gland density, imaging modality, and so on. In this paper, a novel framework named MLN-net, which can accurately segment multi-source images using only single-source images, is proposed for clustered microcalcification segmentation. We first propose a source-domain image augmentation method to generate multi-source images, leading to improved generalization. A structure of multiple layer normalization (LN) layers is then used to construct the segmentation network, which proves efficient for clustered microcalcification segmentation in different domains. Additionally, a branch selection strategy is designed for measuring the similarity of the source domain data and the target domain data. To validate the proposed MLN-net, extensive analyses are performed, including ablation experiments and a comparison with 12 baseline methods. The experiments validate the effectiveness of MLN-net in segmenting clustered microcalcifications from different domains, and its segmentation accuracy surpasses state-of-the-art methods. Code will be available at https://github.com/yezanting/MLN-NET-VERSON1.
Submitted 3 January, 2024; v1 submitted 6 September, 2023;
originally announced September 2023.
-
Dynamic Brain Transformer with Multi-level Attention for Functional Brain Network Analysis
Authors:
Xuan Kan,
Antonio Aodong Chen Gu,
Hejie Cui,
Ying Guo,
Carl Yang
Abstract:
Recent neuroimaging studies have highlighted the importance of network-centric brain analysis, particularly with functional magnetic resonance imaging. The emergence of Deep Neural Networks has fostered a substantial interest in predicting clinical outcomes and categorizing individuals based on brain networks. However, the conventional approach involving static brain network analysis offers limited potential in capturing the dynamism of brain function. Although recent studies have attempted to harness dynamic brain networks, their high dimensionality and complexity present substantial challenges. This paper proposes a novel methodology, Dynamic bRAin Transformer (DART), which combines static and dynamic brain networks for more effective and nuanced brain function analysis. Our model uses the static brain network as a baseline, integrating dynamic brain networks to enhance performance against traditional methods. We innovatively employ attention mechanisms, enhancing model explainability and exploiting the dynamic brain network's temporal variations. The proposed approach offers a robust solution to the low signal-to-noise ratio of blood-oxygen-level-dependent signals, a recurring issue in direct DNN modeling. It also provides valuable insights into which brain circuits or dynamic networks contribute more to final predictions. As such, DART shows a promising direction in neuroimaging studies, contributing to the comprehensive understanding of brain organization and the role of neural circuits.
Submitted 5 September, 2023;
originally announced September 2023.
-
Metaheuristic Algorithms in Artificial Intelligence with Applications to Bioinformatics, Biostatistics, Ecology and, the Manufacturing Industries
Authors:
Elvis Han Cui,
Zizhao Zhang,
Culsome Junwen Chen,
Weng Kee Wong
Abstract:
Nature-inspired metaheuristic algorithms are important components of artificial intelligence, and are increasingly used across disciplines to tackle various types of challenging optimization problems. We apply a newly proposed nature-inspired metaheuristic algorithm called competitive swarm optimizer with mutated agents (CSO-MA) and demonstrate its flexibility and out-performance relative to its competitors in a variety of optimization problems in the statistical sciences. In particular, we show the algorithm is efficient and can incorporate various cost structures or multiple user-specified nonlinear constraints. Our applications include (i) finding maximum likelihood estimates of parameters in a single cell generalized trend model to study pseudotime in bioinformatics, (ii) estimating parameters in a commonly used Rasch model in education research, (iii) finding M-estimates for a Cox regression in a Markov renewal model and (iv) matrix completion to impute missing values in a two compartment model. In addition we discuss applications to (v) select variables optimally in an ecology problem and (vi) design a car refueling experiment for the auto industry using a logistic model with multiple interacting factors.
Submitted 16 October, 2023; v1 submitted 8 August, 2023;
originally announced August 2023.
-
CodeCoT: Tackling Code Syntax Errors in CoT Reasoning for Code Generation
Authors:
Dong Huang,
Qingwen Bu,
Yuhao Qing,
Heming Cui
Abstract:
Chain-of-thought (CoT) has emerged as a groundbreaking tool in NLP, notably for its efficacy in complex reasoning tasks, such as mathematical proofs. However, its application in code generation faces a distinct challenge: although the code generated with CoT reasoning is logically correct, it often fails with syntax errors (e.g., an invalid-syntax error report) during execution, which makes the CoT result's pass@1 on HumanEval even lower than the zero-shot result.
In this paper, we present Code Chain-of-Thought (CodeCoT), which integrates CoT with a self-examination process for code generation. CodeCoT begins with the LLM using CoT for initial code development to ensure that the generated code follows the correct logic flow. Then, CodeCoT generates test cases to check whether the code raises syntax errors during execution. CodeCoT then employs a self-examination phase, in which the generated code is executed against these test cases in the local environment. If the local environment raises error information (e.g., an invalid-syntax error), CodeCoT iteratively refines the code based on the feedback. Within this loop, CodeCoT ensures that the generated code not only follows the logic flow of the code description but also has its syntax errors addressed by the self-examination process. Our evaluation results reveal that CodeCoT improves the effectiveness of code generation. For example, CodeCoT increases pass@1 from 75.6% to 79.3% on the HumanEval dataset.
Submitted 22 February, 2024; v1 submitted 17 August, 2023;
originally announced August 2023.
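The generate, test, and refine loop described in the CodeCoT abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: `generate`, `make_tests`, and `refine` stand in for LLM calls, `run_tests` and `self_examine` are illustrative helpers, and the round limit is an assumption. It is not the authors' implementation.

```python
from typing import Callable, List, Optional, Tuple


def run_tests(code: str, tests: List[str]) -> Optional[str]:
    """Run `code`, then each test, in a fresh namespace.

    Returns an error message on failure, or None when everything passes.
    Note: exec() on untrusted model output should only happen in a sandbox.
    """
    namespace: dict = {}
    try:
        exec(compile(code, "<generated>", "exec"), namespace)
        for test in tests:
            exec(compile(test, "<test>", "exec"), namespace)
    except Exception as exc:  # SyntaxError is a subclass of Exception
        return f"{type(exc).__name__}: {exc}"
    return None


def self_examine(
    generate: Callable[[str], str],
    make_tests: Callable[[str], List[str]],
    refine: Callable[[str, str, str], str],
    task: str,
    max_rounds: int = 3,  # assumed limit, not taken from the abstract
) -> Tuple[str, bool]:
    """Generate code, execute it against generated tests, refine on errors."""
    code = generate(task)
    tests = make_tests(task)
    for _ in range(max_rounds):
        error = run_tests(code, tests)
        if error is None:
            return code, True
        code = refine(task, code, error)  # feed the error message back
    return code, run_tests(code, tests) is None


# Hypothetical stand-ins for the model calls, for demonstration only.
def fake_generate(task: str) -> str:
    return "def add(a, b) return a + b"  # deliberately invalid syntax


def fake_make_tests(task: str) -> List[str]:
    return ["assert add(1, 2) == 3"]


def fake_refine(task: str, code: str, error: str) -> str:
    return "def add(a, b):\n    return a + b"


final_code, ok = self_examine(
    fake_generate, fake_make_tests, fake_refine, "add two numbers"
)
```

With these stand-ins, the first round fails to compile, the error message is passed to `fake_refine`, and the second round passes the generated test.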
-
BehaVR: User Identification Based on VR Sensor Data
Authors:
Ismat Jarin,
Yu Duan,
Rahmadi Trimananda,
Hao Cui,
Salma Elmalaki,
Athina Markopoulou
Abstract:
Virtual reality (VR) platforms enable a wide range of applications but pose unique privacy risks. In particular, VR devices are equipped with a rich set of sensors that collect personal and sensitive information (e.g., body motion, eye gaze, hand joints, and facial expression), which can be used to uniquely identify a user, even without explicit identifiers. In this paper, we are interested in understanding the extent to which a user can be identified based on data collected by different VR sensors. We consider adversaries with capabilities that range from observing APIs available within a single VR app (app adversary) to observing all, or selected, sensor measurements across all apps on the VR device (device adversary). To that end, we introduce BEHAVR, a framework for collecting and analyzing data from all sensor groups collected by all apps running on a VR device. We use BEHAVR to perform a user study and collect data from real users who interact with popular real-world apps. We use that data to build machine learning models for user identification, with features extracted from sensor data available within and across apps. We show that these models can identify users with an accuracy of up to 100%, and we reveal the most important features and sensor groups, depending on the functionality of the app and the strength of the adversary, as well as the minimum time needed for user identification. To the best of our knowledge, BEHAVR is the first to analyze user identification in VR comprehensively, i.e., considering jointly all sensor measurements available on a VR device (whether within an app or across multiple apps), collected by real-world, as opposed to custom-made, apps.
Submitted 14 August, 2023;
originally announced August 2023.
-
Effective Hamiltonian approach to the quantum phase transitions in the extended Jaynes-Cummings model
Authors:
H. T. Cui,
Y. A. Yan,
M. Qin,
X. X. Yi
Abstract:
The study of phase transitions in dissipative quantum systems based on the Liouvillian is often hindered by the difficulty of constructing a time-local master equation when the system-environment coupling is strong. To address this issue, the complex discretization approximation for the environment is proposed to study the quantum phase transition in the extended Jaynes-Cummings model with an infinite number of boson modes. This approach yields a non-Hermitian effective Hamiltonian that can be used to simulate the dynamics of the spin. It is found that the ground state of this effective Hamiltonian determines the spin dynamics in the single-excitation subspace. Depending on the opening of the energy gap and the maximum population of excitations on the spin degree of freedom, three distinct phases can be identified: fast decaying, localized, and stretched dynamics of the spin. This approach can be extended to multiple excitations, and similar dynamics were found in the double-excitation subspace, indicating the robustness of the single-excitation phase.
Submitted 6 April, 2024; v1 submitted 25 July, 2023;
originally announced July 2023.
-
Online Container Scheduling for Low-Latency IoT Services in Edge Cluster Upgrade: A Reinforcement Learning Approach
Authors:
Hanshuai Cui,
Zhiqing Tang,
Jiong Lou,
Weijia Jia
Abstract:
In Mobile Edge Computing (MEC), Internet of Things (IoT) devices offload computationally-intensive tasks to edge nodes, where they are executed within containers, reducing the reliance on centralized cloud infrastructure. Frequent upgrades are essential to maintain the efficient and secure operation of edge clusters. However, traditional cloud cluster upgrade strategies are ill-suited for edge clusters due to their geographically distributed nature and resource limitations. Therefore, it is crucial to properly schedule containers and upgrade edge clusters to minimize the impact on running tasks. In this paper, we propose a low-latency container scheduling algorithm for edge cluster upgrades. Specifically: 1) We formulate the online container scheduling problem for edge cluster upgrade to minimize the total task latency. 2) We propose a policy gradient-based reinforcement learning algorithm to address this problem, considering the unique characteristics of MEC. 3) Experimental results demonstrate that our algorithm reduces total task latency by approximately 27% compared to baseline algorithms.
Submitted 22 July, 2023;
originally announced July 2023.
-
Feature Map Testing for Deep Neural Networks
Authors:
Dong Huang,
Qingwen Bu,
Yahao Qing,
Yichao Fu,
Heming Cui
Abstract:
Due to the widespread application of deep neural networks (DNNs) in safety-critical tasks, deep learning testing has drawn increasing attention. During the testing process, test cases that have been fuzzed or selected using test metrics are fed into the model to find fault-inducing test units (e.g., neurons and feature maps, activating which will almost certainly result in a model error) and report them to the DNN developer, who subsequently repairs them (e.g., by retraining the model with test cases). Current test metrics, however, are primarily concerned with neurons, which means that test cases discovered either by guided fuzzing or by selection with these metrics focus on detecting fault-inducing neurons while failing to detect fault-inducing feature maps.
In this work, we propose DeepFeature, which tests DNNs at the feature map level. When testing is conducted, DeepFeature scrutinizes every internal feature map in the model and identifies vulnerabilities that can be repaired to increase the model's overall performance. Exhaustive experiments are conducted to demonstrate that (1) DeepFeature is a strong tool for detecting the model's vulnerable feature maps; (2) DeepFeature's test case selection has a high fault detection rate and can detect more types of faults (compared with coverage-guided selection techniques, the fault detection rate is increased by 49.32%); (3) DeepFeature's fuzzer also outperforms current fuzzing techniques and generates valuable test cases more efficiently.
Submitted 21 July, 2023;
originally announced July 2023.
-
Neuron Sensitivity Guided Test Case Selection for Deep Learning Testing
Authors:
Dong Huang,
Qingwen Bu,
Yichao Fu,
Yuhao Qing,
Bocheng Xiao,
Heming Cui
Abstract:
Deep Neural Networks (DNNs) have been widely deployed in software to address various tasks (e.g., autonomous driving, medical diagnosis). However, they can also produce incorrect behaviors that result in financial losses and even threaten human safety. To reveal the incorrect behaviors in DNNs and repair them, DNN developers often collect rich unlabeled datasets from the natural world and label them to test the DNN models. However, properly labeling a large number of unlabeled datasets is a highly expensive and time-consuming task.
To address the above-mentioned problem, we propose NSS, Neuron Sensitivity guided test case Selection, which can reduce the labeling time by selecting valuable test cases from unlabeled datasets. NSS leverages the internal neuron information induced by test cases to select valuable test cases, which have high confidence in causing the model to behave incorrectly. We evaluate NSS with four widely used datasets and four well-designed DNN models and compare it with SOTA baseline methods. The results show that NSS performs well in assessing the test cases' probability of fault triggering and model improvement capabilities. Specifically, compared with baseline approaches, NSS obtains a higher fault detection rate (e.g., when selecting 5% of the test cases from the unlabeled dataset in the MNIST & LeNet1 experiment, NSS obtains an 81.8% fault detection rate, 20% higher than the baselines).
Submitted 20 July, 2023;
originally announced July 2023.
-
Towards Building More Robust Models with Frequency Bias
Authors:
Qingwen Bu,
Dong Huang,
Heming Cui
Abstract:
The vulnerability of deep neural networks to adversarial samples has been a major impediment to their broad applications, despite their success in various fields. Recently, some works suggested that adversarially-trained models emphasize the importance of low-frequency information to achieve higher robustness. While several attempts have been made to leverage this frequency characteristic, they have all faced the issue that applying low-pass filters directly to input images leads to irreversible loss of discriminative information and poor generalizability to datasets with distinct frequency features. This paper presents a plug-and-play module called the Frequency Preference Control Module that adaptively reconfigures the low- and high-frequency components of intermediate feature representations, providing better utilization of frequency in robust learning. Empirical studies show that our proposed module can be easily incorporated into any adversarial training framework, further improving model robustness across different architectures and datasets. Additionally, experiments were conducted to examine how the frequency bias of robust models impacts the adversarial training process and its final robustness, revealing interesting insights.
Submitted 27 July, 2023; v1 submitted 19 July, 2023;
originally announced July 2023.
-
Super resolution dual-layer CBCT imaging with model-guided deep learning
Authors:
Jiongtao Zhu,
Ting Su,
Xin Zhang,
Han Cui,
Yuhang Tan,
Hairong Zheng,
Dong Liang,
Jinchuan Guo,
Yongshuai Ge
Abstract:
Objective: This study aims at investigating a novel super resolution CBCT imaging technique with the dual-layer flat panel detector (DL-FPD). Approach: In DL-FPD based CBCT imaging, the low-energy and high-energy projections acquired from the top and bottom detector layers contain intrinsically mismatched spatial information, from which super resolution CBCT images can be generated. To explain, a simple mathematical model is established according to the signal formation procedure in DL-FPD. Next, a dedicated recurrent neural network (RNN), named as suRi-Net, is designed by referring to the above imaging model to retrieve the high resolution dual-energy information. Different phantom experiments are conducted to validate the performance of this newly developed super resolution CBCT imaging method. Main Results: Results show that the proposed suRi-Net can retrieve high spatial resolution information accurately from the low-energy and high-energy projections having lower spatial resolution. Quantitatively, the spatial resolution of the reconstructed CBCT images of the top and bottom detector layers is increased by about 45% and 54%, respectively. Significance: In future, suRi-Net provides a new approach to achieve high spatial resolution dual-energy imaging in DL-FPD based CBCT systems.
Submitted 28 June, 2023;
originally announced June 2023.
-
Jet charge identification in ee-Z-qq process at Z pole operation
Authors:
Hanhua Cui,
Mingrui Zhao,
Yuexin Wang,
Hao Liang,
Manqi Ruan
Abstract:
Accurate jet charge identification is essential for precise electroweak and flavor measurements at the high-energy frontier. We propose a novel method called the Leading Particle Jet Charge method (LPJC) to determine the jet charge based on information about the leading charged particle. Tested on $Z \to b\bar{b}$ and $Z \to c\bar{c}$ samples at a center-of-mass energy of 91.2 GeV, the LPJC achieves an effective tagging power of 20%/9% for the c/b jet, respectively. Combined with the Weighted Jet Charge method (WJC), we develop a Heavy Flavor Jet Charge method (HFJC), which achieves an effective tagging power of 39%/20% for the c/b jet, respectively. This paper also discusses the dependencies between jet charge identification performance and the fragmentation process of heavy flavor jets, and critical detector performances.
Submitted 18 March, 2024; v1 submitted 24 June, 2023;
originally announced June 2023.
-
A Universal Semantic-Geometric Representation for Robotic Manipulation
Authors:
Tong Zhang,
Yingdong Hu,
Hanchen Cui,
Hang Zhao,
Yang Gao
Abstract:
Robots rely heavily on sensors, especially RGB and depth cameras, to perceive and interact with the world. RGB cameras record 2D images with rich semantic information while missing precise spatial information. On the other hand, depth cameras offer critical 3D geometry data but capture limited semantics. Therefore, integrating both modalities is crucial for learning representations for robotic perception and control. However, current research predominantly focuses on only one of these modalities, neglecting the benefits of incorporating both. To this end, we present Semantic-Geometric Representation (SGR), a universal perception module for robotics that leverages the rich semantic information of large-scale pre-trained 2D models and inherits the merits of 3D spatial reasoning. Our experiments demonstrate that SGR empowers the agent to successfully complete a diverse range of simulated and real-world robotic manipulation tasks, outperforming state-of-the-art methods significantly in both single-task and multi-task settings. Furthermore, SGR possesses the capability to generalize to novel semantic attributes, setting it apart from the other methods. Project website: https://semantic-geometric-representation.github.io.
Submitted 13 October, 2023; v1 submitted 18 June, 2023;
originally announced June 2023.
-
Sim-on-Wheels: Physical World in the Loop Simulation for Self-Driving
Authors:
Yuan Shen,
Bhargav Chandaka,
Zhi-hao Lin,
Albert Zhai,
Hang Cui,
David Forsyth,
Shenlong Wang
Abstract:
We present Sim-on-Wheels, a safe, realistic, and vehicle-in-loop framework to test autonomous vehicles' performance in the real world under safety-critical scenarios. Sim-on-Wheels runs on a self-driving vehicle operating in the physical world. It creates virtual traffic participants with risky behaviors and seamlessly inserts the virtual events into images perceived from the physical world in real time. The manipulated images are fed into the autonomy stack, allowing the self-driving vehicle to react to such virtual events. The full pipeline runs on the actual vehicle and interacts with the physical world, but the safety-critical events it sees are virtual. Sim-on-Wheels is safe, interactive, realistic, and easy to use. The experiments demonstrate the potential of Sim-on-Wheels to facilitate the process of testing autonomous driving in challenging real-world scenes with high fidelity and low risk.
Submitted 14 June, 2023;
originally announced June 2023.
-
A Review on Knowledge Graphs for Healthcare: Resources, Applications, and Promises
Authors:
Hejie Cui,
Jiaying Lu,
Shiyu Wang,
Ran Xu,
Wenjing Ma,
Shaojun Yu,
Yue Yu,
Xuan Kan,
Chen Ling,
Tianfan Fu,
Liang Zhao,
Joyce Ho,
Fei Wang,
Carl Yang
Abstract:
Healthcare knowledge graphs (HKGs) are valuable tools for organizing biomedical concepts and their relationships with interpretable structures. The recent advent of large language models (LLMs) has paved the way for building more comprehensive and accurate HKGs. This, in turn, can improve the reliability of generated content and enable better evaluation of LLMs. However, the challenges of HKGs, such as data heterogeneity and limited coverage, are not fully understood, highlighting the need for detailed reviews. This work provides the first comprehensive review of HKGs. It summarizes the pipeline and key techniques for HKG construction, as well as the common utilization approaches, i.e., model-free and model-based. The existing HKG resources are also organized based on the data types they capture and application domains they cover, along with relevant statistical information (Resource available at https://github.com/lujiaying/Awesome-HealthCare-KnowledgeBase). At the application level, we delve into the successful integration of HKGs across various health domains, ranging from fine-grained basic science research to high-level clinical decision support and public health. Lastly, the paper highlights the opportunities for HKGs in the era of LLMs. This work aims to serve as a valuable resource for understanding the potential and opportunities of HKGs in health research.
Submitted 19 February, 2024; v1 submitted 7 June, 2023;
originally announced June 2023.
-
R-Mixup: Riemannian Mixup for Biological Networks
Authors:
Xuan Kan,
Zimu Li,
Hejie Cui,
Yue Yu,
Ran Xu,
Shaojun Yu,
Zilong Zhang,
Ying Guo,
Carl Yang
Abstract:
Biological networks are commonly used in biomedical and healthcare domains to effectively model the structure of complex biological systems with interactions linking biological entities. However, due to their characteristics of high dimensionality and low sample size, directly applying deep learning models on biological networks usually faces severe overfitting. In this work, we propose R-MIXUP, a Mixup-based data augmentation technique that suits the symmetric positive definite (SPD) property of adjacency matrices from biological networks with optimized training efficiency. The interpolation process in R-MIXUP leverages the log-Euclidean distance metrics from the Riemannian manifold, effectively addressing the swelling effect and arbitrarily incorrect label issues of vanilla Mixup. We demonstrate the effectiveness of R-MIXUP with five real-world biological network datasets on both regression and classification tasks. Besides, we derive a commonly ignored necessary condition for identifying the SPD matrices of biological networks and empirically study its influence on the model performance. The code implementation can be found in Appendix E.
Submitted 4 June, 2023;
originally announced June 2023.
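The log-Euclidean interpolation mentioned in the R-Mixup abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`log_euclidean_mixup`, `_sym_apply`, `random_spd`), the use of NumPy eigendecomposition, and the toy 4x4 matrices are all assumptions made for illustration. The idea is to mix two SPD matrices in the matrix-logarithm domain and map back with the matrix exponential, which keeps the result SPD and makes its log-determinant the weighted average of the two inputs' log-determinants (so the determinant does not swell).

```python
import numpy as np


def _sym_apply(mat, fn):
    """Apply a scalar function to a symmetric matrix via its eigenvalues."""
    w, v = np.linalg.eigh(mat)
    # v @ diag(fn(w)) @ v.T, written with broadcasting over columns
    return (v * fn(w)) @ v.T


def log_euclidean_mixup(s_a, s_b, lam):
    """Interpolate two SPD matrices in the log domain (hypothetical helper).

    Returns exp((1 - lam) * log(s_a) + lam * log(s_b)).
    """
    log_mix = (1.0 - lam) * _sym_apply(s_a, np.log) + lam * _sym_apply(s_b, np.log)
    return _sym_apply(log_mix, np.exp)


def random_spd(n, rng):
    """Build a toy SPD matrix; a stand-in for a real network matrix."""
    a = rng.standard_normal((n, n))
    return a @ a.T + n * np.eye(n)


rng = np.random.default_rng(0)
s_a, s_b = random_spd(4, rng), random_spd(4, rng)
lam = 0.3
mixed = log_euclidean_mixup(s_a, s_b, lam)
```

In a Mixup setting the labels would typically be mixed with the same coefficient, e.g. `(1 - lam) * y_a + lam * y_b`; how R-Mixup handles labels and training efficiency in detail is described in the paper itself, not in this sketch.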
-
PV2TEA: Patching Visual Modality to Textual-Established Information Extraction
Authors:
Hejie Cui,
Rongmei Lin,
Nasser Zalmout,
Chenwei Zhang,
Jingbo Shang,
Carl Yang,
Xian Li
Abstract:
Information extraction, e.g., attribute value extraction, has been extensively studied and formulated based only on text. However, many attributes can benefit from image-based extraction, like color, shape, pattern, among others. The visual modality has long been underutilized, mainly due to multimodal annotation difficulty. In this paper, we aim to patch the visual modality to the textual-established attribute information extractor. The cross-modality integration faces several unique challenges: (C1) images and textual descriptions are loosely paired intra-sample and inter-samples; (C2) images usually contain rich backgrounds that can mislead the prediction; (C3) weakly supervised labels from textual-established extractors are biased for multimodal training. We present PV2TEA, an encoder-decoder architecture equipped with three bias reduction schemes: (S1) Augmented label-smoothed contrast to improve the cross-modality alignment for loosely-paired image and text; (S2) Attention-pruning that adaptively distinguishes the visual foreground; (S3) Two-level neighborhood regularization that mitigates the label textual bias via reliability estimation. Empirical results on real-world e-Commerce datasets demonstrate up to 11.74% absolute (20.97% relatively) F1 increase over unimodal baselines.
Submitted 1 June, 2023;
originally announced June 2023.
-
Explanation Graph Generation via Generative Pre-training over Synthetic Graphs
Authors:
Han Cui,
Shangzhan Li,
Yu Zhang,
Qi Shi
Abstract:
The generation of explanation graphs is a significant task that aims to produce explanation graphs in response to user input, revealing the internal reasoning process. This task is challenging due to the significant discrepancy between unstructured user queries and structured explanation graphs. Current research commonly fine-tunes a text-based pre-trained language model on a small downstream dataset that is annotated with labeled graphs. However, due to the limited scale of available datasets, this approach may prove to be insufficient in bridging the gap between natural language text and structured graphs. In this paper, to alleviate the above limitations, we propose a novel pre-trained framework EG3P (for Explanation Graph Generation via Generative Pre-training over synthetic graphs) for the explanation graph generation task. Specifically, we first propose a text-to-graph generative task to pre-train the model with the goal of bridging the text-graph gap. Additionally, we propose an automatic corpus synthesis strategy for synthesizing a large-scale, high-quality corpus, reducing the reliance on costly manual annotation methods. Experimental results on ExplaGraphs show the effectiveness of EG3P: our model surpasses all baseline systems by remarkable margins. Besides, further analysis demonstrates that EG3P is able to generate better explanation graphs on actual reasoning tasks such as CommonsenseQA and OpenbookQA.
Submitted 1 June, 2023;
originally announced June 2023.
-
Treasure in Distribution: A Domain Randomization based Multi-Source Domain Generalization for 2D Medical Image Segmentation
Authors:
Ziyang Chen,
Yongsheng Pan,
Yiwen Ye,
Hengfei Cui,
Yong Xia
Abstract:
Although recent years have witnessed the great success of convolutional neural networks (CNNs) in medical image segmentation, the domain shift issue caused by the highly variable image quality of medical images hinders the deployment of CNNs in real-world clinical applications. Domain generalization (DG) methods aim to address this issue by training a robust model on the source domain, which has a strong generalization ability. Previously, many DG methods based on feature-space domain randomization have been proposed, which, however, suffer from the limited and unordered search space of feature styles. In this paper, we propose a multi-source DG method called Treasure in Distribution (TriD), which constructs an unprecedented search space to obtain the model with strong robustness by randomly sampling from a uniform distribution. To learn the domain-invariant representations explicitly, we further devise a style-mixing strategy in our TriD, which mixes the feature styles by randomly mixing the augmented and original statistics channel-wise and can be extended to other DG methods. Extensive experiments on two medical segmentation tasks with different modalities demonstrate that our TriD achieves superior generalization performance on unseen target-domain data. Code is available at https://github.com/Chen-Ziyang/TriD.
Submitted 31 May, 2023;
originally announced May 2023.
-
Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey
Authors:
Chen Ling,
Xujiang Zhao,
Jiaying Lu,
Chengyuan Deng,
Can Zheng,
Junxiang Wang,
Tanmoy Chowdhury,
Yun Li,
Hejie Cui,
Xuchao Zhang,
Tianjiao Zhao,
Amit Panalkar,
Dhagash Mehta,
Stefano Pasquali,
Wei Cheng,
Haoyu Wang,
Yanchi Liu,
Zhengzhang Chen,
Haifeng Chen,
Chris White,
Quanquan Gu,
Jian Pei,
Carl Yang,
Liang Zhao
Abstract:
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of domain objectives, and the diversity of the constraints (e.g., various social norms, cultural conformity, religious beliefs, and ethical standards in the domain applications). Domain specification techniques are key to make large language models disruptive in many applications. Specifically, to solve these hurdles, there has been a notable increase in research and practices conducted in recent years on the domain specialization of LLMs. This emerging field of study, with its substantial potential for impact, necessitates a comprehensive and systematic review to better summarize and guide ongoing work in this area. In this article, we present a comprehensive survey on domain specification techniques for large language models, an emerging direction critical for large language model applications. First, we propose a systematic taxonomy that categorizes the LLM domain-specialization techniques based on the accessibility to LLMs and summarizes the framework for all the subcategories as well as their relations and differences to each other. Second, we present an extensive taxonomy of critical application domains that can benefit dramatically from specialized LLMs, discussing their practical significance and open challenges. Last, we offer our insights into the current research status and future trends in this area.
Submitted 29 March, 2024; v1 submitted 29 May, 2023;
originally announced May 2023.
-
The First LHAASO Catalog of Gamma-Ray Sources
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
We present the first catalog of very-high energy and ultra-high energy gamma-ray sources detected by the Large High Altitude Air Shower Observatory (LHAASO). The catalog was compiled using 508 days of data collected by the Water Cherenkov Detector Array (WCDA) from March 2021 to September 2022 and 933 days of data recorded by the Kilometer Squared Array (KM2A) from January 2020 to September 2022. This catalog represents the main result from the most sensitive large coverage gamma-ray survey of the sky above 1 TeV, covering declination from $-$20$^{\circ}$ to 80$^{\circ}$. In total, the catalog contains 90 sources with an extended size smaller than $2^\circ$ and a significance of detection at $> 5σ$. Based on our source association criteria, 32 new TeV sources are proposed in this study. Among the 90 sources, 43 sources are detected with ultra-high energy ($E > 100$ TeV) emission at $> 4σ$ significance level. We provide the position, extension, and spectral characteristics of all the sources in this catalog.
Submitted 27 November, 2023; v1 submitted 26 May, 2023;
originally announced May 2023.
-
PTGB: Pre-Train Graph Neural Networks for Brain Network Analysis
Authors:
Yi Yang,
Hejie Cui,
Carl Yang
Abstract:
The human brain is the central hub of the neurobiological system, controlling behavior and cognition in complex ways. Recent advances in neuroscience and neuroimaging analysis have shown a growing interest in the interactions between brain regions of interest (ROIs) and their impact on neural development and disorder diagnosis. As a powerful deep model for analyzing graph-structured data, Graph Neural Networks (GNNs) have been applied for brain network analysis. However, training deep models requires large amounts of labeled data, which is often scarce in brain network datasets due to the complexities of data acquisition and sharing restrictions. To make the most out of available training data, we propose PTGB, a GNN pre-training framework that captures intrinsic brain network structures, regardless of clinical outcomes, and is easily adaptable to various downstream tasks. PTGB comprises two key components: (1) an unsupervised pre-training technique designed specifically for brain networks, which enables learning from large-scale datasets without task-specific labels; (2) a data-driven parcellation atlas mapping pipeline that facilitates knowledge transfer across datasets with different ROI systems. Extensive evaluations using various GNN models have demonstrated the robust and superior performance of PTGB compared to baseline methods.
Submitted 20 May, 2023;
originally announced May 2023.
-
Two-Bit RIS-Aided Communications at 3.5GHz: Some Insights from the Measurement Results Under Multiple Practical Scenes
Authors:
Shun Zhang,
Haoran Sun,
Runze Yu,
Hongshenyuan Cui,
Jian Ren,
Feifei Gao,
Shi Jin,
Hongxiang Xie,
Hao Wang
Abstract:
In this paper, we propose a two-bit reconfigurable intelligent surface (RIS)-aided communication system, which mainly consists of a two-bit RIS, a transmitter and a receiver. A corresponding prototype verification system is designed to perform experimental tests in practical environments. The carrier frequency is set as 3.5 GHz, and the RIS array possesses 256 units, each of which adopts two-bit phase quantization. In particular, we adopt a self-developed broadband intelligent communication system 40MHz-Net (BICT-40N) terminal in order to fully acquire the channel information. The terminal mainly includes a baseband board and a radio frequency (RF) front-end board, where the latter can achieve 26 dB transmitting link gain and 33 dB receiving link gain. The orthogonal frequency division multiplexing (OFDM) signal is used for the terminal, where the bandwidth is 40 MHz and the subcarrier spacing is 625 kHz. Also, the terminal supports a series of modulation modes, including QPSK, QAM, etc. Through experimental tests, we validate a few functions and properties of the RIS as follows. First, we validate a novel RIS power consumption model, which considers both the static and the dynamic power consumption. Besides, we demonstrate the existence of the imaging interference and find that two-bit RIS can lower the imaging interference by about 10 dBm. Moreover, we verify that the RIS can outperform the metal plate in terms of the beam focusing performance. In addition, we find that the RIS has the ability to improve the channel stationarity. Then, we realize the multi-beam reflection of the RIS utilizing the pattern addition (PA) algorithm. Lastly, we validate the existence of the mutual coupling between different RIS units.
Submitted 19 May, 2023;
originally announced May 2023.
-
High-dimensional Asymptotics of Denoising Autoencoders
Authors:
Hugo Cui,
Lenka Zdeborová
Abstract:
We address the problem of denoising data from a Gaussian mixture using a two-layer non-linear autoencoder with tied weights and a skip connection. We consider the high-dimensional limit where the number of training samples and the input dimension jointly tend to infinity while the number of hidden units remains bounded. We provide closed-form expressions for the denoising mean-squared test error. Building on this result, we quantitatively characterize the advantage of the considered architecture over the autoencoder without the skip connection that relates closely to principal component analysis. We further show that our results accurately capture the learning curves on a range of real data sets.
Submitted 18 May, 2023;
originally announced May 2023.
-
On area-minimizing Pfaffian varieties
Authors:
Hongbin Cui,
Xiaoxiang Jiao,
Xiaowei Xu
Abstract:
There are two significant families of minimal real matrix varieties: determinantal varieties and skew-symmetric determinantal varieties; the latter are also known as Pfaffian varieties. In 1999, Kerckhove and Lawlor [Duke Math. J. 96(2), 401--424, 1999] proved that determinantal varieties are area-minimizing except for two families. In this paper we prove that all Pfaffian varieties are area-minimizing with the exception of Pfaffian hypersurfaces.
Submitted 10 May, 2023;
originally announced May 2023.
-
Measurement of ultra-high-energy diffuse gamma-ray emission of the Galactic plane from 10 TeV to 1 PeV with LHAASO-KM2A
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
The diffuse Galactic $γ$-ray emission, mainly produced via interactions between cosmic rays and the interstellar medium and/or radiation field, is a very important probe of the distribution, propagation, and interaction of cosmic rays in the Milky Way. In this work we report the measurements of diffuse $γ$-rays from the Galactic plane between 10 TeV and 1 PeV energies, with the square kilometer array of the Large High Altitude Air Shower Observatory (LHAASO). Diffuse emissions from the inner ($15^{\circ}<l<125^{\circ}$, $|b|<5^{\circ}$) and outer ($125^{\circ}<l<235^{\circ}$, $|b|<5^{\circ}$) Galactic plane are detected with $29.1σ$ and $12.7σ$ significance, respectively. The outer Galactic plane diffuse emission is detected for the first time in the very- to ultra-high-energy domain ($E>10$~TeV). The energy spectrum in the inner Galaxy regions can be described by a power-law function with an index of $-2.99\pm0.04$, which is different from the curved spectrum as expected from hadronic interactions between locally measured cosmic rays and the line-of-sight integrated gas content. Furthermore, the measured flux is higher by a factor of $\sim3$ than the prediction. A similar spectrum with an index of $-2.99\pm0.07$ is found in the outer Galaxy region, and the absolute flux for $10\lesssim E\lesssim60$ TeV is again higher than the prediction for hadronic cosmic ray interactions. The latitude distributions of the diffuse emission are consistent with the gas distribution, while the longitude distributions show clear deviation from the gas distribution. The LHAASO measurements imply that either additional emission sources exist or cosmic ray intensities have spatial variations.
Submitted 19 August, 2023; v1 submitted 9 May, 2023;
originally announced May 2023.
-
U-NEED: A Fine-grained Dataset for User Needs-Centric E-commerce Conversational Recommendation
Authors:
Yuanxing Liu,
Weinan Zhang,
Baohua Dong,
Yan Fan,
Hang Wang,
Fan Feng,
Yifan Chen,
Ziyu Zhuang,
Hengbin Cui,
Yongbin Li,
Wanxiang Che
Abstract:
Conversational recommender systems (CRSs) aim to understand the information needs and preferences expressed in a dialogue to recommend suitable items to the user. Most of the existing conversational recommendation datasets are synthesized or simulated with crowdsourcing, which has a large gap with real-world scenarios. To bridge the gap, previous work contributes a dataset E-ConvRec, based on pre-sales dialogues between users and customer service staff in E-commerce scenarios. However, E-ConvRec only supplies coarse-grained annotations and general tasks for making recommendations in pre-sales dialogues. Different from that, we use real user needs as a clue to explore the E-commerce conversational recommendation in complex pre-sales dialogues, namely user needs-centric E-commerce conversational recommendation (UNECR).
In this paper, we construct a user needs-centric E-commerce conversational recommendation dataset (U-NEED) from real-world E-commerce scenarios. U-NEED consists of 3 types of resources: (i) 7,698 fine-grained annotated pre-sales dialogues in 5 top categories (ii) 333,879 user behaviors and (iii) 332,148 product knowledge tuples. To facilitate the research of UNECR, we propose 5 critical tasks: (i) pre-sales dialogue understanding (ii) user needs elicitation (iii) user needs-based recommendation (iv) pre-sales dialogue generation and (v) pre-sales dialogue evaluation. We establish baseline methods and evaluation metrics for each task. We report experimental results of 5 tasks on U-NEED. We also report results in 3 typical categories. Experimental results indicate that the challenges of UNECR in various categories are different.
Submitted 4 May, 2023;
originally announced May 2023.
-
Transformer-Based Hierarchical Clustering for Brain Network Analysis
Authors:
Wei Dai,
Hejie Cui,
Xuan Kan,
Ying Guo,
Sanne van Rooij,
Carl Yang
Abstract:
Brain networks, graphical models such as those constructed from MRI, have been widely used in pathological prediction and analysis of brain functions. Within the complex brain system, differences in neuronal connection strengths parcellate the brain into various functional modules (network communities), which are critical for brain analysis. However, identifying such communities within the brain has been a nontrivial issue due to the complexity of neuronal interactions. In this work, we propose a novel interpretable transformer-based model for joint hierarchical cluster identification and brain network classification. Extensive experimental results on real-world brain network datasets show that with the help of hierarchical clustering, the model achieves increased accuracy and reduced runtime complexity while providing plausible insight into the functional organization of brain regions. The implementation is available at https://github.com/DDVD233/THC.
Submitted 6 May, 2023;
originally announced May 2023.
-
Beyond the Model: Data Pre-processing Attack to Deep Learning Models in Android Apps
Authors:
Ye Sang,
Yujin Huang,
Shuo Huang,
Helei Cui
Abstract:
The increasing popularity of deep learning (DL) models and the advantages of computing, including low latency and bandwidth savings on smartphones, have led to the emergence of intelligent mobile applications, also known as DL apps, in recent years. However, this technological development has also given rise to several security concerns, including adversarial examples, model stealing, and data poisoning issues. Existing works on attacks and countermeasures for on-device DL models have primarily focused on the models themselves. However, scant attention has been paid to the impact of data processing disturbance on the model inference. This knowledge disparity highlights the need for additional research to fully comprehend and address security issues related to data processing for on-device models. In this paper, we introduce a data processing-based attack against real-world DL apps. In particular, our attack could influence the performance and latency of the model without affecting the operation of a DL app. To demonstrate the effectiveness of our attack, we carry out an empirical study on 517 real-world DL apps collected from Google Play. Among 320 apps utilizing MLkit, we find that 81.56% of them can be successfully attacked.
The results emphasize the importance of DL app developers being aware of and taking actions to secure on-device models from the perspective of data processing.
Submitted 11 May, 2023; v1 submitted 6 May, 2023;
originally announced May 2023.
-
Energy-Latency Attacks to On-Device Neural Networks via Sponge Poisoning
Authors:
Zijian Wang,
Shuo Huang,
Yujin Huang,
Helei Cui
Abstract:
In recent years, on-device deep learning has gained attention as a means of developing affordable deep learning applications for mobile devices. However, on-device models are constrained by limited energy and computation resources. In the meantime, a poisoning attack known as sponge poisoning has been developed. This attack involves feeding the model with poisoned examples to increase the energy consumption during inference. As previous work has focused on server hardware accelerators, in this work, we extend the sponge poisoning attack to an on-device scenario to evaluate the vulnerability of mobile device processors. We present an on-device sponge poisoning attack pipeline to simulate the streaming and consistent inference scenario to bridge the knowledge gap in the on-device setting. Our exclusive experimental analysis with processors and on-device networks shows that sponge poisoning attacks can effectively pollute the modern processor with its built-in accelerator. We analyze the impact of different factors in the sponge poisoning algorithm and highlight the need for improved defense mechanisms to prevent such attacks on on-device deep learning applications.
Submitted 11 May, 2023; v1 submitted 5 May, 2023;
originally announced May 2023.
-
Emergence of flat bands and ferromagnetic fluctuations via orbital-selective electron correlations in Mn-based kagome metal
Authors:
Subhasis Samanta,
Hwiwoo Park,
Chanhyeon Lee,
Sungmin Jeon,
Hengbo Cui,
Yong-Xin Yao,
Jungseek Hwang,
Kwang-Yong Choi,
Heung-Sik Kim
Abstract:
Kagome lattice has been actively studied for the possible realization of frustration-induced two-dimensional flat bands and a number of correlation-induced phases. Currently, the search for kagome systems with a nearly dispersionless flat band close to the Fermi level is ongoing. Here, by combining theoretical and experimental tools, we present Sc$_3$Mn$_3$Al$_7$Si$_5$ as a novel realization of correlation-induced almost-flat bands in the kagome lattice in the vicinity of the Fermi level. Our magnetic susceptibility, $^{27}$Al nuclear magnetic resonance, transport, and optical conductivity measurements provide signatures of a correlated metallic phase with tantalizing ferromagnetic instability. Our dynamical mean-field calculations suggest that such ferromagnetic instability observed originates from the formation of nearly flat dispersions close to the Fermi level, where electron correlations induce strong orbital-selective renormalization and manifestation of the kagome-frustrated bands. In addition, a significant negative magnetoresistance signal is observed, which can be attributed to the suppression of flat-band-induced ferromagnetic fluctuation, which further supports the formation of flat bands in this compound. These findings broaden a new prospect to harness correlated topological phases via multiorbital correlations in 3$d$-based kagome systems.
Submitted 25 June, 2024; v1 submitted 10 April, 2023;
originally announced April 2023.
-
A CI-based Auditing Framework for Data Collection Practices
Authors:
Athina Markopoulou,
Rahmadi Trimananda,
Hao Cui
Abstract:
Apps and devices (mobile devices, web browsers, IoT, VR, voice assistants, etc.) routinely collect user data, and send them to first- and third-party servers through the network. Recently, there is a lot of interest in (1) auditing the actual data collection practices of those systems; and also in (2) checking the consistency of those practices against the statements made in the corresponding privacy policies. In this paper, we argue that the contextual integrity (CI) tuple can be the basic building block for defining and implementing such an auditing framework. We elaborate on the special case where the tuple is partially extracted from the network traffic generated by the end-device of interest, and partially from the corresponding privacy policies using natural language processing (NLP) techniques. Along the way, we discuss related bodies of work and representative examples that fit into that framework. More generally, we believe that CI can be the building block not only for auditing at the edge, but also for specifying privacy policies and system APIs. We also discuss limitations and directions for future work.
Submitted 30 March, 2023;
originally announced March 2023.