Showing 1–50 of 344 results for author: Shao, H

  1. CAT: Interpretable Concept-based Taylor Additive Models

    Authors: Viet Duong, Qiong Wu, Zhengyi Zhou, Hongjue Zhao, Chenxiang Luo, Eric Zavesky, Huaxiu Yao, Huajie Shao

    Abstract: As an emerging interpretable technique, Generalized Additive Models (GAMs) adopt neural networks to individually learn non-linear functions for each feature, which are then combined through a linear model for final predictions. Although GAMs can explain deep neural networks (DNNs) at the feature level, they require large numbers of model parameters and are prone to overfitting, making them hard to…

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.14869  [pdf, other]

    eess.SP

    Cost-Effective RF Fingerprinting Based on Hybrid CVNN-RF Classifier with Automated Multi-Dimensional Early-Exit Strategy

    Authors: Jiayan Gan, Zhixing Du, Qiang Li, Huaizong Shao, Jingran Lin, Ye Pan, Zhongyi Wen, Shafei Wang

    Abstract: While the Internet of Things (IoT) technology is booming and offers huge opportunities for information exchange, it also faces unprecedented security challenges. As an important complement to the physical layer security technologies for IoT, radio frequency fingerprinting (RFF) is of great interest due to its difficulty in counterfeiting. Recently, many machine learning (ML)-based RFF algorithms h…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE Internet of Things Journal

  3. arXiv:2406.13867  [pdf, other]

    cs.IT cs.DM math.CO

    Error-Correcting Graph Codes

    Authors: Swastik Kopparty, Aditya Potukuchi, Harry Sha

    Abstract: In this paper, we define, study, and construct Error-Correcting Graph Codes. An error-correcting graph code of distance $δ$ is a family $C$ of graphs, on a common vertex set of size $n$, such that if we start with any graph in $C$, we would have to modify the neighborhoods of at least $δn$ vertices in order to reach some other graph in $C$. This is a natural graph generalization of the sta…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 27 pages, 3 figures, 1 table

    ACM Class: G.2.1; E.4

  4. arXiv:2405.18164  [pdf]

    cond-mat.mtrl-sci

    Imaging, counting, and positioning single interstitial atoms in solids

    Authors: Jizhe Cui, Haozhi Sha, Liangze Mao, Kang Sun, Wenfeng Yang, Rong Yu

    Abstract: Interstitial atoms are ubiquitous in solids and they are widely incorporated into materials to tune their lattice structure, electronic transportation, and mechanical properties. Because the distribution of interstitial atoms in matrix materials is usually disordered and most of them are light atoms with weak scattering ability, it remains a challenge to directly image single interstitial atoms an…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 20 pages and 8 figures; Jizhe Cui and Haozhi Sha contributed equally to this work. Corresponding author: Rong Yu

  5. arXiv:2405.17529  [pdf, other]

    cs.LG cs.CR

    Clip Body and Tail Separately: High Probability Guarantees for DPSGD with Heavy Tails

    Authors: Haichao Sha, Yang Cao, Yong Liu, Yuncheng Wu, Ruixuan Liu, Hong Chen

    Abstract: Differentially Private Stochastic Gradient Descent (DPSGD) is widely utilized to preserve training data privacy in deep learning, which first clips the gradients to a predefined norm and then injects calibrated noise into the training procedure. Existing DPSGD works typically assume the gradients follow sub-Gaussian distributions and design various clipping mechanisms to optimize training performa…

    Submitted 27 May, 2024; originally announced May 2024.

  6. arXiv:2405.17233  [pdf, other]

    cs.LG

    CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs

    Authors: Haoyu Wang, Bei Liu, Hang Shao, Bo Xiao, Ke Zeng, Guanglu Wan, Yanmin Qian

    Abstract: Parameter quantization for Large Language Models (LLMs) has attracted increasing attention recently in reducing memory costs and improving computational efficiency. Early approaches have been widely adopted. However, the existing methods suffer from poor performance in low-bit (such as 2 to 3 bits) scenarios. In this paper, we present a novel and effective Column-Level Adaptive weight Quantizatio…

    Submitted 2 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  7. arXiv:2405.16889  [pdf]

    eess.SP

    Extraction of In-Phase and Quadrature Components by Time-Encoding Sampling

    Authors: Y. H. Shao, S. Y. Chen, H. Z. Yang, F. Xi, H. Hong, Z. Liu

    Abstract: Time encoding machine (TEM) is a biologically-inspired scheme to perform signal sampling using timing. In this paper, we study its application to the sampling of bandpass signals. We propose an integrate-and-fire TEM scheme by which the in-phase (I) and quadrature (Q) components are extracted through reconstruction. We design the TEM according to the signal bandwidth and amplitude instead of upper…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 30 pages, 8 figures

  8. arXiv:2405.14292  [pdf, other]

    cs.CV cs.RO

    A New Method in Facial Registration in Clinics Based on Structure Light Images

    Authors: Pengfei Li, Ziyue Ma, Hong Wang, Juan Deng, Yan Wang, Zhenyu Xu, Feng Yan, Wenjun Tu, Hong Sha

    Abstract: Background and Objective: In neurosurgery, fusing clinical images and depth images that can improve the information and details is beneficial to surgery. We found that the registration of face depth images was invalid frequently using existing methods. To abundant traditional image methods with depth information, a method in registering with depth images and traditional clinical images was investi…

    Submitted 23 May, 2024; originally announced May 2024.

  9. arXiv:2405.06607  [pdf, other]

    cond-mat.str-el hep-lat

    SO(5) multicriticality in two-dimensional quantum magnets

    Authors: Jun Takahashi, Hui Shao, Bowen Zhao, Wenan Guo, Anders W. Sandvik

    Abstract: We resolve the nature of the quantum phase transition between a Néel antiferromagnet and a valence-bond solid in two-dimensional spin-1/2 magnets. We study a class of $J$-$Q$ models, in which Heisenberg exchange $J$ competes with interactions $Q_n$ formed by products of $n$ singlet projectors on adjacent parallel lattice links. QMC simulations provide unambiguous evidence for first-order transitio…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: 57 pages, 36 figures

  10. arXiv:2405.03882  [pdf, other]

    cs.CV cs.AI

    Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer

    Authors: Huihong Shi, Haikuo Shao, Wendong Mao, Zhongfeng Wang

    Abstract: Motivated by the huge success of Transformers in the field of natural language processing (NLP), Vision Transformers (ViTs) have been rapidly developed and achieved remarkable performance in various computer vision tasks. However, their huge model sizes and intensive computations hinder ViTs' deployment on embedded devices, calling for effective model compression methods, such as quantization. Unf…

    Submitted 6 May, 2024; originally announced May 2024.

  11. arXiv:2404.13046  [pdf, other]

    cs.CV

    MoVA: Adapting Mixture of Vision Experts to Multimodal Context

    Authors: Zhuofan Zong, Bingqi Ma, Dazhong Shen, Guanglu Song, Hao Shao, Dongzhi Jiang, Hongsheng Li, Yu Liu

    Abstract: As the key component in multimodal large language models (MLLMs), the ability of the visual encoder greatly affects MLLM's understanding on diverse image content. Although some large-scale pretrained vision encoders such as vision encoders in CLIP and DINOv2 have brought promising performance, we found that there is still no single vision encoder that can dominate various image content understandi…

    Submitted 19 April, 2024; originally announced April 2024.

  12. arXiv:2404.12867  [pdf, other]

    cs.CV cs.RO

    FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving

    Authors: Xingtai Gui, Tengteng Huang, Haonan Shao, Haotian Yao, Chi Zhang

    Abstract: The future instance prediction from a Bird's Eye View (BEV) perspective is a vital component in autonomous driving, which involves future instance segmentation and instance motion prediction. Existing methods usually rely on a redundant and complex pipeline which requires multiple auxiliary outputs and post-processing procedures. Moreover, estimated errors on each of the auxiliary predictions will…

    Submitted 19 April, 2024; originally announced April 2024.

  13. arXiv:2404.08145  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    Polar vortex hidden in twisted bilayers of paraelectric SrTiO3

    Authors: Haozhi Sha, Yixuan Zhang, Yunpeng Ma, Wei Li, Wenfeng Yang, Jizhe Cui, Qian Li, Houbing Huang, Rong Yu

    Abstract: Polar topologies, such as vortex and skyrmion, have attracted significant interest due to their unique physical properties and promising applications in high-density memory devices. Currently, most polar vortices are observed in heterostructures containing ferroelectric materials and constrained by substrates. In this study, we unravel arrays of polar vortices formed in twisted freestanding bilaye…

    Submitted 11 April, 2024; originally announced April 2024.

  14. arXiv:2404.02571  [pdf]

    cond-mat.mtrl-sci physics.comp-ph

    Wenzhou TE: a first-principles calculated thermoelectric materials database

    Authors: Ying Fang, Hezhu Shao

    Abstract: Since the implementation of the Materials Genome Project by the Obama administration in the United States, the development of various computational materials databases has fundamentally expanded the choices of industries such as materials and energy. In the field of thermoelectric materials, the thermoelectric figure of merit ZT quantifies the performance of the material. From the viewpoint of cal…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 13 pages, 5 figures

    Journal ref: https://www.mdpi.com/1996-1944/17/10/2200

  15. arXiv:2404.01448  [pdf]

    physics.med-ph cs.LG

    Prior Frequency Guided Diffusion Model for Limited Angle (LA)-CBCT Reconstruction

    Authors: Jiacheng Xie, Hua-Chieh Shao, Yunxiang Li, You Zhang

    Abstract: Cone-beam computed tomography (CBCT) is widely used in image-guided radiotherapy. Reconstructing CBCTs from limited-angle acquisitions (LA-CBCT) is highly desired for improved imaging efficiency, dose reduction, and better mechanical clearance. LA-CBCT reconstruction, however, suffers from severe under-sampling artifacts, making it a highly ill-posed inverse problem. Diffusion models can generate…

    Submitted 8 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 20 pages, 8 figures, submitted to Physics in Medicine & Biology

  16. arXiv:2403.20230  [pdf, other]

    cs.AR cs.LG

    An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT

    Authors: Haikuo Shao, Huihong Shi, Wendong Mao, Zhongfeng Wang

    Abstract: Vision Transformers (ViTs) have achieved significant success in computer vision. However, their intensive computations and massive memory footprint challenge ViTs' deployment on embedded devices, calling for efficient ViTs. Among them, EfficientViT, the state-of-the-art one, features a Convolution-Transformer hybrid architecture, enhancing both accuracy and hardware efficiency. Unfortunately, exis…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: To appear in the 2024 IEEE International Symposium on Circuits and Systems (ISCAS 2024)

  17. arXiv:2403.16999  [pdf, other]

    cs.CV

    Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models

    Authors: Hao Shao, Shengju Qian, Han Xiao, Guanglu Song, Zhuofan Zong, Letian Wang, Yu Liu, Hongsheng Li

    Abstract: This paper presents Visual CoT, a novel pipeline that leverages the reasoning capabilities of multi-modal large language models (MLLMs) by incorporating visual Chain-of-Thought (CoT) reasoning. While MLLMs have shown promise in various visual tasks, they often lack interpretability and struggle with complex visual inputs. To address these challenges, we propose a multi-turn processing pipeline tha…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Code: https://github.com/deepcs233/Visual-CoT

  18. arXiv:2403.15464  [pdf, other]

    cs.CL cs.AI cs.LG cs.MA

    LLMs-based Few-Shot Disease Predictions using EHR: A Novel Approach Combining Predictive Agent Reasoning and Critical Agent Instruction

    Authors: Hejie Cui, Zhuocheng Shen, Jieyu Zhang, Hui Shao, Lianhui Qin, Joyce C. Ho, Carl Yang

    Abstract: Electronic health records (EHRs) contain valuable patient data for health-related prediction tasks, such as disease prediction. Traditional approaches rely on supervised learning methods that require large labeled datasets, which can be expensive and challenging to obtain. In this study, we investigate the feasibility of applying Large Language Models (LLMs) to convert structured patient visit dat…

    Submitted 19 March, 2024; originally announced March 2024.

    ACM Class: J.3; I.2.7

  19. arXiv:2403.14693  [pdf]

    cs.CY cs.AI cs.DC cs.IR

    A2CI: A Cloud-based, Service-oriented Geospatial Cyberinfrastructure to Support Atmospheric Research

    Authors: Wenwen Li, Hu Shao, Sizhe Wang, Xiran Zhou, Sheng Wu

    Abstract: Big earth science data offers the scientific community great opportunities. Many more studies at large-scales, over long-terms and at high resolution can now be conducted using the rich information collected by remote sensing satellites, ground-based sensor networks, and even social media input. However, the hundreds of terabytes of information collected and compiled on an hourly basis by NASA and…

    Submitted 15 March, 2024; originally announced March 2024.

    MSC Class: big data; cyberinfrastructure; cloud computing

  20. arXiv:2403.11492  [pdf, other]

    cs.CV cs.AI cs.RO

    SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction

    Authors: Yang Zhou, Hao Shao, Letian Wang, Steven L. Waslander, Hongsheng Li, Yu Liu

    Abstract: Predicting the future motion of surrounding agents is essential for autonomous vehicles (AVs) to operate safely in dynamic, human-robot-mixed environments. Context information, such as road maps and surrounding agents' states, provides crucial geometric and semantic information for motion behavior prediction. To this end, recent works explore two-stage prediction frameworks where coarse trajectori…

    Submitted 19 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Camera-ready version for CVPR 2024

  21. arXiv:2403.10779  [pdf, other]

    cs.CL

    LLM-based Conversational AI Therapist for Daily Functioning Screening and Psychotherapeutic Intervention via Everyday Smart Devices

    Authors: Jingping Nie, Hanya Shao, Yuang Fan, Qijia Shao, Haoxuan You, Matthias Preindl, Xiaofan Jiang

    Abstract: Despite the global mental health crisis, access to screenings, professionals, and treatments remains high. In collaboration with licensed psychotherapists, we propose a Conversational AI Therapist with psychotherapeutic Interventions (CaiTI), a platform that leverages large language models (LLMs) and smart devices to enable better mental health self-care. CaiTI can screen the day-to-day functionin…

    Submitted 15 March, 2024; originally announced March 2024.

  22. arXiv:2403.10319  [pdf, other]

    cs.NI cs.CR

    NetBench: A Large-Scale and Comprehensive Network Traffic Benchmark Dataset for Foundation Models

    Authors: Chen Qian, Xiaochang Li, Qineng Wang, Gang Zhou, Huajie Shao

    Abstract: In computer networking, network traffic refers to the amount of data transmitted in the form of packets between internetworked computers or Cyber-Physical Systems. Monitoring and analyzing network traffic is crucial for ensuring the performance, security, and reliability of a network. However, a significant challenge in network traffic analysis is to process diverse data packets including both cip…

    Submitted 18 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  23. arXiv:2403.09615  [pdf, other]

    cs.HC

    PrompTHis: Visualizing the Process and Influence of Prompt Editing during Text-to-Image Creation

    Authors: Yuhan Guo, Hanning Shao, Can Liu, Kai Xu, Xiaoru Yuan

    Abstract: Generative text-to-image models, which allow users to create appealing images through a text prompt, have seen a dramatic increase in popularity in recent years. However, most users have a limited understanding of how such models work and it often requires many trials and errors to achieve satisfactory results. The prompt history contains a wealth of information that could provide users with insig…

    Submitted 14 March, 2024; originally announced March 2024.

  24. arXiv:2403.07390  [pdf, other]

    eess.IV cs.CV

    Learning Correction Errors via Frequency-Self Attention for Blind Image Super-Resolution

    Authors: Haochen Sun, Yan Yuan, Lijuan Su, Haotian Shao

    Abstract: Previous approaches for blind image super-resolution (SR) have relied on degradation estimation to restore high-resolution (HR) images from their low-resolution (LR) counterparts. However, accurate degradation estimation poses significant challenges. The SR model's incompatibility with degradation estimation methods, particularly the Correction Filter, may significantly impair performance as a res…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 16 pages

  25. arXiv:2402.19303  [pdf, ps, other]

    cs.LG cs.GT

    Learnability Gaps of Strategic Classification

    Authors: Lee Cohen, Yishay Mansour, Shay Moran, Han Shao

    Abstract: In contrast with standard classification tasks, strategic classification involves agents strategically modifying their features in an effort to receive favorable predictions. For instance, given a classifier determining loan approval based on credit scores, applicants may open or close their credit cards to fool the classifier. The learning goal is to find a classifier robust against strategic man…

    Submitted 29 February, 2024; originally announced February 2024.

  26. arXiv:2402.19221  [pdf, other]

    hep-ph

    FKS subtraction for quarkonium production at NLO

    Authors: Ajjath A H, Hua-Sheng Shao, Lukas Simon

    Abstract: We extend the local infrared-divergence subtraction formalism, originally proposed by Frixione, Kunszt and Signer (FKS), to calculate short-distance (differential) cross section for any inclusive process involving a quarkonium particle in non-relativistic QCD (NRQCD) factorisation at next-to-leading order (NLO) accuracy in the strong coupling constant $α_s$. The new formulas are generally applicab…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 51 pages, 1 figure

  27. arXiv:2402.15991  [pdf, other]

    cs.CL

    $C^3$: Confidence Calibration Model Cascade for Inference-Efficient Cross-Lingual Natural Language Understanding

    Authors: Taixi Lu, Haoyu Wang, Huajie Shao, Jing Gao, Huaxiu Yao

    Abstract: Cross-lingual natural language understanding (NLU) is a critical task in natural language processing (NLP). Recent advancements have seen multilingual pre-trained language models (mPLMs) significantly enhance the performance of these tasks. However, mPLMs necessitate substantial resources and incur high computational costs during inference, posing challenges for deployment in real-world and real-t…

    Submitted 25 February, 2024; originally announced February 2024.

  28. arXiv:2402.15758  [pdf, other]

    cs.CL cs.AI

    Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens

    Authors: Ziqian Zeng, Jiahong Yu, Qianshi Pang, Zihao Wang, Huiping Zhuang, Hongen Shao, Xiaofeng Zou

    Abstract: Large language models (LLMs) have demonstrated remarkable capabilities across various tasks. However, their widespread application is hindered by the resource-intensive decoding process. To address this challenge, current approaches have incorporated additional decoding heads to enable parallel prediction of multiple subsequent tokens, thereby achieving inference acceleration. Nevertheless, the ac…

    Submitted 18 April, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  29. arXiv:2402.14605  [pdf, other]

    cond-mat.quant-gas cond-mat.str-el quant-ph

    Observation of the antiferromagnetic phase transition in the fermionic Hubbard model

    Authors: Hou-Ji Shao, Yu-Xuan Wang, De-Zhi Zhu, Yan-Song Zhu, Hao-Nan Sun, Si-Yuan Chen, Chi Zhang, Zhi-Jie Fan, Youjin Deng, Xing-Can Yao, Yu-Ao Chen, Jian-Wei Pan

    Abstract: The fermionic Hubbard model (FHM)[1], despite its simple form, captures essential features of strongly correlated electron physics. Ultracold fermions in optical lattices[2, 3] provide a clean and well-controlled platform for simulating FHM. Doping its antiferromagnetic ground state at half filling, various exotic phases are expected to arise in the FHM simulator, including stripe order[4], pseudo…

    Submitted 22 February, 2024; originally announced February 2024.

  30. Data Storytelling in Data Visualisation: Does it Enhance the Efficiency and Effectiveness of Information Retrieval and Insights Comprehension?

    Authors: Honbo Shao, Roberto Martinez-Maldonado, Vanessa Echeverria, Lixiang Yan, Dragan Gasevic

    Abstract: Data storytelling (DS) is rapidly gaining attention as an approach that integrates data, visuals, and narratives to create data stories that can help a particular audience to comprehend the key messages underscored by the data with enhanced efficiency and effectiveness. It has been posited that DS can be especially advantageous for audiences with limited visualisation literacy, by presenting the d…

    Submitted 20 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted to CHI24. Edited two typos: one in the abstract, another in a formula

  31. Asgard/NOTT: L-band nulling interferometry at the VLTI. II. Warm optical design and injection system

    Authors: Germain Garreau, Azzurra Bigioli, Romain Laugier, Gert Raskin, Johan Morren, Jean-Philippe Berger, Colin Dandumont, Harry-Dean Kenchington Goldsmith, Simon Gross, Michael Ireland, Lucas Labadie, Jérôme Loicq, Stephen Madden, Guillermo Martin, Marc-Antoine Martinod, Alexandra Mazzoli, Ahmed Sanny, Hancheng Shao, Kunlun Yan, Denis Defrère

    Abstract: Asgard/NOTT (previously Hi-5) is a European Research Council (ERC)-funded project hosted at KU Leuven and a new visitor instrument for the Very Large Telescope Interferometer (VLTI). Its primary goal is to image the snow line region around young stars using nulling interferometry in the L-band (3.5 to 4.0)$μ$m, where the contrast between exoplanets and their host stars is advantageous. The breakth…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted for publication in JATIS. 23 pages, 11 figures, 8 tables

    Journal ref: J. Astron. Telesc. Instrum. Syst. 10(1), 015002 (2024)

  32. arXiv:2402.05935  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models

    Authors: Dongyang Liu, Renrui Zhang, Longtian Qiu, Siyuan Huang, Weifeng Lin, Shitian Zhao, Shijie Geng, Ziyi Lin, Peng Jin, Kaipeng Zhang, Wenqi Shao, Chao Xu, Conghui He, Junjun He, Hao Shao, Pan Lu, Hongsheng Li, Yu Qiao, Peng Gao

    Abstract: We propose SPHINX-X, an extensive Multimodality Large Language Model (MLLM) series developed upon SPHINX. To improve the architecture and training efficiency, we modify the SPHINX framework by removing redundant visual encoders, bypassing fully-padded sub-images with skip tokens, and simplifying multi-stage training into a one-stage all-in-one paradigm. To fully unleash the potential of MLLMs, we…

    Submitted 26 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Accepted by ICML 2024. Code and models are released at https://github.com/Alpha-VLLM/LLaMA2-Accessory

  33. arXiv:2402.03646  [pdf, other]

    cs.LG cs.NI

    Lens: A Foundation Model for Network Traffic in Cybersecurity

    Authors: Qineng Wang, Chen Qian, Xiaochang Li, Ziyu Yao, Huajie Shao

    Abstract: Network traffic refers to the amount of data being sent and received over the internet or any system that connects computers. Analyzing and understanding network traffic is vital for improving network security and management. However, the analysis of network traffic is challenging due to the diverse nature of data packets, which often feature heterogeneous headers and encrypted payloads lacking se…

    Submitted 28 March, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  34. arXiv:2402.02851  [pdf, other]

    cs.CV cs.LG stat.ML

    Enhancing Compositional Generalization via Compositional Feature Alignment

    Authors: Haoxiang Wang, Haozhe Si, Huajie Shao, Han Zhao

    Abstract: Real-world applications of machine learning models often confront data distribution shifts, wherein discrepancies exist between the training and test data distributions. In the common multi-domain multi-class setup, as the number of classes and domains scales up, it becomes infeasible to gather training data for every domain-class combination. This challenge naturally leads the quest for models wi…

    Submitted 22 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Published in Transactions on Machine Learning Research (TMLR). The code is released at https://github.com/Haoxiang-Wang/Compositional-Feature-Alignment

  35. arXiv:2401.02439  [pdf]

    cond-mat.mtrl-sci physics.optics

    Information limit of 15 pm achieved with bright-field ptychography

    Authors: Haozhi Sha, Jizhe Cui, Wenfeng Yang, Rong Yu

    Abstract: It is generally assumed that a high spatial resolution of a microscope requires a large numerical aperture of the imaging lens or detector. In this study, the information limit of 15 pm is achieved in transmission electron microscopy using only the bright-field disk (small numerical aperture) via multislice ptychography. The results indicate that high-frequency information has been encoded in the…

    Submitted 20 December, 2023; originally announced January 2024.

    Comments: 10 pages, 4 figures

  36. arXiv:2401.01638  [pdf, other]

    physics.ins-det

    Radon Removal Commissioning of the PandaX-4T Cryogenic Distillation System

    Authors: Xiangyi Cui, Zhou Wang, Jiafu Li, Shuaijie Li, Lin Si, Yonglin Ju, Wenbo Ma, Jianglai Liu, Li Zhao, Xiangdong Ji, Rui Yan, Haidong Sha, Peiyao Huang, Xiuli Wang, Huaxuan Liu

    Abstract: The PandaX-4T distillation system, designed for the removal of krypton and radon from xenon, is evaluated for its radon removal efficiency using a $^{222}$Rn source during the online distillation process. The PandaX-4T dark matter detector is employed to monitor the temporal evolution of radon activity. To determine the radon reduction factor, the experimental data of radon atoms introduced into a…

    Submitted 19 April, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: 14 pages, 9 figures

  37. arXiv:2401.01495  [pdf, other]

    cs.CL

    A Two-Stage Multimodal Emotion Recognition Model Based on Graph Contrastive Learning

    Authors: Wei Ai, FuChen Zhang, Tao Meng, YunTao Shou, HongEn Shao, Keqin Li

    Abstract: In terms of human-computer interaction, it is becoming more and more important to correctly understand the user's emotional state in a conversation, so the task of multimodal emotion recognition (MER) started to receive more attention. However, existing emotion classification methods usually perform classification only once. Sentences are likely to be misclassified in a single round of classificat…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 9 pages, 3 figures

  38. arXiv:2401.00376  [pdf, other]

    cond-mat.str-el

    Magnon, doublon and quarton excitations in 2D S=1/2 trimerized Heisenberg models

    Authors: Yue-Yue Chang, Jun-Qing Cheng, Hui Shao, Dao-Xin Yao, Han-Qing Wu

    Abstract: We investigate the magnetic excitations of the trimerized Heisenberg models with intra-trimer interaction $J_1$ and inter-trimer interaction $J_2$ on four different two-dimensional lattices using a combination of stochastic series expansion quantum Monte Carlo (SSE QMC) and stochastic analytic continuation methods (SAC), complemented by cluster perturbation theory (CPT). These models exhibit quasi…

    Submitted 16 June, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

  39. arXiv:2312.16966  [pdf, other]

    hep-ph hep-ex hep-th

    Two-loop massive QCD and QED helicity amplitudes for light-by-light scattering

    Authors: Ajjath A H, Ekta Chaubey, Hua-Sheng Shao

    Abstract: We present the analytic and compact two-loop helicity amplitudes for QCD and QED corrections to the light-by-light scattering process with massive internal fermions. We express the master integrals either in terms of multiple polylogarithms or in terms of iterated integrals with dlog one-forms. We also elaborate on optimizing the analytic results for each phase-space region. This makes the numeric…

    Submitted 21 March, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: 35 pages, 1 figure, v2: journal version (two long equations move into appendix)

    Journal ref: JHEP 03 (2024) 121

  40. arXiv:2312.16956  [pdf, other]

    hep-ph hep-ex hep-th nucl-ex nucl-th

    Light-by-Light Scattering at Next-to-Leading Order in QCD and QED

    Authors: Ajjath A H, Ekta Chaubey, Mathijs Fraaije, Valentin Hirschi, Hua-Sheng Shao

    Abstract: The recent experimental observation of Light-by-Light (LbL) scattering at the Large Hadron Collider has revived interest in this fundamental process, and especially of the accurate prediction of its cross-section, which we present here for the first time at Next-to-Leading Order (NLO) in both QCD and QED. We compare two radically different computational approaches, both exact in the fermion mass d…

    Submitted 10 March, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: 11 pages, 6 figures (including appendix) v2: minor corrections and journal version

    Journal ref: Phys.Lett.B 851 (2024) 138555

  41. arXiv:2312.08866  [pdf, other]

    eess.IV cs.CV

    MCANet: Medical Image Segmentation with Multi-Scale Cross-Axis Attention

    Authors: Hao Shao, Quansheng Zeng, Qibin Hou, Jufeng Yang

    Abstract: Efficiently capturing multi-scale information and building long-range dependencies among pixels are essential for medical image segmentation because of the various sizes and shapes of the lesion regions or organs. In this paper, we present Multi-scale Cross-axis Attention (MCA) to solve the above challenging issues based on the efficient axial attention. Instead of simply connecting axial attentio…

    Submitted 19 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

  42. arXiv:2312.08735  [pdf, other]

    cs.CV

    Polyper: Boundary Sensitive Polyp Segmentation

    Authors: Hao Shao, Yang Zhang, Qibin Hou

    Abstract: We present a new boundary sensitive framework for polyp segmentation, called Polyper. Our method is motivated by a clinical approach in which seasoned medical practitioners often leverage the inherent features of interior polyp regions to tackle blurred boundaries. Inspired by this, we propose explicitly leveraging polyp regions to bolster the model's boundary discrimination capability while minimizing…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024

  43. arXiv:2312.07488  [pdf, other]

    cs.CV cs.AI cs.RO

    LMDrive: Closed-Loop End-to-End Driving with Large Language Models

    Authors: Hao Shao, Yuxuan Hu, Letian Wang, Steven L. Waslander, Yu Liu, Hongsheng Li

    Abstract: Despite significant recent progress in the field of autonomous driving, modern methods still struggle and can incur serious accidents when encountering long-tail unforeseen events and challenging urban scenarios. On the one hand, large language models (LLM) have shown impressive reasoning capabilities that approach "Artificial General Intelligence". On the other hand, previous autonomous driving m…

    Submitted 21 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: project page: https://hao-shao.com/projects/lmdrive.html

  44. arXiv:2312.03792  [pdf, other]

    cs.CR cs.LG

    PCDP-SGD: Improving the Convergence of Differentially Private SGD via Projection in Advance

    Authors: Haichao Sha, Ruixuan Liu, Yixuan Liu, Hong Chen

    Abstract: The paradigm of Differentially Private SGD~(DP-SGD) can provide a theoretical guarantee for training data in both centralized and federated settings. However, the utility degradation caused by DP-SGD limits its wide application in high-stakes tasks, such as medical image diagnosis. In addition to the necessary perturbation, the convergence issue is attributed to the information loss on the gradien…

    Submitted 6 December, 2023; originally announced December 2023.

  45. arXiv:2311.16459  [pdf, other]

    cs.LG cs.DC cs.GT

    On the Effect of Defections in Federated Learning and How to Prevent Them

    Authors: Minbiao Han, Kumar Kshitij Patel, Han Shao, Lingxiao Wang

    Abstract: Federated learning is a machine learning protocol that enables a large population of agents to collaborate over multiple rounds to produce a single consensus model. There are several federated learning applications where agents may choose to defect permanently$-$essentially withdrawing from the collaboration$-$if they are content with their instantaneous model in that round. This work demonstrates…

    Submitted 27 November, 2023; originally announced November 2023.

  46. arXiv:2311.12070  [pdf, other]

    eess.IV cs.CV

    FDDM: Unsupervised Medical Image Translation with a Frequency-Decoupled Diffusion Model

    Authors: Yunxiang Li, Hua-Chieh Shao, Xiaoxue Qian, You Zhang

    Abstract: Diffusion models have demonstrated significant potential in producing high-quality images in medical image translation to aid disease diagnosis, localization, and treatment. Nevertheless, current diffusion models have limited success in achieving faithful image translations that can accurately preserve the anatomical structures of medical images, especially for unpaired datasets. The preservation…

    Submitted 26 June, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

  47. arXiv:2311.10036  [pdf]

    physics.med-ph

    Dynamic CBCT Imaging using Prior Model-Free Spatiotemporal Implicit Neural Representation (PMF-STINR)

    Authors: Hua-Chieh Shao, Mengke Tielige, Tinsu Pan, You Zhang

    Abstract: Dynamic cone-beam computed tomography (CBCT) can capture high-spatial-resolution, time-varying images for motion monitoring, patient setup, and adaptive planning of radiotherapy. However, dynamic CBCT reconstruction is an extremely ill-posed spatiotemporal inverse problem, as each CBCT volume in the dynamic sequence is only captured by one or a few X-ray projections. We developed a machine learnin…

    Submitted 4 December, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

  48. arXiv:2311.07754  [pdf, other]

    cs.GT cs.DS econ.TH

    Efficient Prior-Free Mechanisms for No-Regret Agents

    Authors: Natalie Collina, Aaron Roth, Han Shao

    Abstract: We study a repeated Principal Agent problem between a long lived Principal and Agent pair in a prior free setting. In our setting, the sequence of realized states of nature may be adversarially chosen, the Agent is non-myopic, and the Principal aims for a strong form of policy regret. Following Camara, Hartline, and Johnson, we model the Agent's long-run behavior with behavioral assumptions that r…

    Submitted 13 November, 2023; originally announced November 2023.

  49. arXiv:2311.05075  [pdf]

    cs.LG cs.AI cs.CL

    Mental Health Diagnosis in the Digital Age: Harnessing Sentiment Analysis on Social Media Platforms upon Ultra-Sparse Feature Content

    Authors: Haijian Shao, Ming Zhu, Shengjie Zhai

    Abstract: Amid growing global mental health concerns, particularly among vulnerable groups, natural language processing offers a tremendous potential for early detection and intervention of people's mental disorders via analyzing their postings and discussions on social media platforms. However, ultra-sparse training data, often due to vast vocabularies and low-frequency words, hinders the analysis accuracy…

    Submitted 8 November, 2023; originally announced November 2023.

  50. arXiv:2311.00260  [pdf, ps, other]

    cs.GT cs.LG

    Incentivized Collaboration in Active Learning

    Authors: Lee Cohen, Han Shao

    Abstract: In collaborative active learning, where multiple agents try to learn labels from a common hypothesis, we introduce an innovative framework for incentivized collaboration. Here, rational agents aim to obtain labels for their data sets while keeping label complexity at a minimum. We focus on designing (strict) individually rational (IR) collaboration protocols, ensuring that agents cannot reduce the…

    Submitted 31 October, 2023; originally announced November 2023.