
Showing 51–100 of 644 results for author: Zhu, F

  1. arXiv:2403.17192  [pdf]

    cs.CV

    Strategies to Improve Real-World Applicability of Laparoscopic Anatomy Segmentation Models

    Authors: Fiona R. Kolbinger, Jiangpeng He, Jinge Ma, Fengqing Zhu

    Abstract: Accurate identification and localization of anatomical structures of varying size and appearance in laparoscopic imaging are necessary to leverage the potential of computer vision techniques for surgical decision support. Segmentation performance of such models is traditionally reported using metrics of overlap such as IoU. However, imbalanced and unrealistic representation of classes in the train…

    Submitted 15 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: 14 pages, 5 figures, 4 tables; accepted for the workshop "Data Curation and Augmentation in Medical Imaging" at CVPR 2024 (archival track)

  2. arXiv:2403.12171  [pdf, other]

    cs.CL cs.AI

    EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models

    Authors: Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Jailbreak attacks are crucial for identifying and mitigating the security vulnerabilities of Large Language Models (LLMs). They are designed to bypass safeguards and elicit prohibited outputs. However, due to significant differences among various jailbreak methods, there is no standard implementation framework available for the community, which limits comprehensive security evaluations. This paper…

    Submitted 18 March, 2024; originally announced March 2024.

  3. arXiv:2403.11530  [pdf, other]

    cs.CV

    Continual Forgetting for Pre-trained Vision Models

    Authors: Hongbo Zhao, Bolin Ni, Haochen Wang, Junsong Fan, Fei Zhu, Yuxi Wang, Yuntao Chen, Gaofeng Meng, Zhaoxiang Zhang

    Abstract: For privacy and security concerns, the need to erase unwanted information from pre-trained vision models is becoming evident nowadays. In real-world scenarios, erasure requests originate at any time from both users and model owners. These requests usually form a sequence. Therefore, under such a setting, selective information is expected to be continuously removed from a pre-trained model while ma…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  4. arXiv:2403.11518  [pdf, other]

    cond-mat.mtrl-sci cond-mat.str-el

    Optical manipulation of the topological phase in ZrTe5 revealed by time- and angle-resolved photoemission

    Authors: Chaozhi Huang, Chengyang Xu, Fengfeng Zhu, Shaofeng Duan, Jianzhe Liu, Lingxiao Gu, Shichong Wang, Haoran Liu, Dong Qian, Weidong Luo, Wentao Zhang

    Abstract: High-resolution time- and angle-resolved photoemission measurements were conducted on the topological insulator ZrTe5. With strong femtosecond photoexcitation, a possible ultrafast phase transition from a weak to a strong topological insulating phase was experimentally realized by recovering the energy gap inversion in a time scale that was shorter than 0.15 ps. This photoinduced transient strong…

    Submitted 18 March, 2024; originally announced March 2024.

    Journal ref: Chinese Physics B 33, 017901 (2024)

  5. Measurements of All-Particle Energy Spectrum and Mean Logarithmic Mass of Cosmic Rays from 0.3 to 30 PeV with LHAASO-KM2A

    Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, et al. (256 additional authors not shown)

    Abstract: We present the measurements of all-particle energy spectrum and mean logarithmic mass of cosmic rays in the energy range of 0.3-30 PeV using data collected from LHAASO-KM2A between September 2021 and December 2022, which is based on a nearly composition-independent energy reconstruction method, achieving unprecedented accuracy. Our analysis reveals the position of the knee at…

    Submitted 26 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: 8 pages, 3 figures

    Journal ref: Physical Review Letters 132, 131002 (2024)

  6. arXiv:2403.09972  [pdf, other]

    cs.CL

    Think Twice Before Trusting: Self-Detection for Large Language Models through Comprehensive Answer Reflection

    Authors: Moxin Li, Wenjie Wang, Fuli Feng, Fengbin Zhu, Qifan Wang, Tat-Seng Chua

    Abstract: Self-detection for Large Language Model (LLM) seeks to evaluate the LLM output trustability by leveraging LLM's own capabilities, alleviating the output hallucination issue. However, existing self-detection approaches only retrospectively evaluate answers generated by LLM, typically leading to the over-trust in incorrectly generated answers. To tackle this limitation, we propose a novel self-detec…

    Submitted 4 June, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Under review

  7. arXiv:2403.09665  [pdf, ps, other]

    math.GM

    Characterizations of quasi-homogeneous aggregation functions

    Authors: Feng-qing Zhu, Xue-ping Wang

    Abstract: In this article, we first give the characterizations of quasi-homogeneous aggregation functions, which show us that quasi-homogeneous aggregation functions are classified into three classes. We then introduce the concept of triple generator of quasi-homogeneous aggregation function, which is applied to construct a quasi-homogeneous aggregation function.

    Submitted 12 May, 2024; v1 submitted 11 January, 2024; originally announced March 2024.

    Comments: 15

  8. arXiv:2403.09283  [pdf]

    cond-mat.str-el cond-mat.mes-hall cond-mat.mtrl-sci

    Observation of quantum oscillations near the Mott-Ioffe-Regel limit in CaAs3

    Authors: Yuxiang Wang, Minhao Zhao, Jinglei Zhang, Wenbin Wu, Shichao Li, Yong Zhang, Wenxiang Jiang, Nesta Benno Joseph, Liangcai Xu, Yicheng Mou, Yunkun Yang, Pengliang Leng, Yong Zhang, Li Pi, Alexey Suslov, Mykhaylo Ozerov, Jan Wyzula, Milan Orlita, Fengfeng Zhu, Yi Zhang, Xufeng Kou, Zengwei Zhu, Awadhesh Narayan, Dong Qian, Jinsheng Wen, et al. (3 additional authors not shown)

    Abstract: The Mott-Ioffe-Regel limit sets the lower bound of carrier mean free path for coherent quasiparticle transport. Metallicity beyond this limit is of great interest because it is often closely related to quantum criticality and unconventional superconductivity. Progress along this direction mainly focuses on the strange-metal behaviors originating from the evolution of quasiparticle scattering rate…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 18 pages, 5 figures

  9. Learning to Describe for Predicting Zero-shot Drug-Drug Interactions

    Authors: Fangqi Zhu, Yongqi Zhang, Lei Chen, Bing Qin, Ruifeng Xu

    Abstract: Adverse drug-drug interactions (DDIs) can compromise the effectiveness of concurrent drug administration, posing a significant challenge in healthcare. As the development of new drugs continues, the potential for unknown adverse effects resulting from DDIs becomes a growing concern. Traditional computational methods for DDI prediction may fail to capture interactions for new drugs due to the lack…

    Submitted 13 March, 2024; originally announced March 2024.

  10. arXiv:2403.07167  [pdf, other]

    physics.geo-ph math-ph

    Stationary phase analysis of ambient noise cross-correlations: Focusing on non-ballistic arrivals

    Authors: Yunyue Elita Li, Feng Zhu, Jizhong Yang

    Abstract: Stacked cross-correlation functions have become ubiquitous in the ambient seismic imaging and monitoring community as approximations to the Green's function between two receivers. While theoretical understanding of this approximation to the ballistic arrivals is well established, the equivalent analysis for the non-ballistic arrivals is alarmingly inadequate compared to the exponential growth of i…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 22 pages, 11 figures, 1 table

  11. arXiv:2403.06288  [pdf, other]

    cs.CV

    Probing Image Compression For Class-Incremental Learning

    Authors: Justin Yang, Zhihao Duan, Andrew Peng, Yuning Huang, Jiangpeng He, Fengqing Zhu

    Abstract: Image compression emerges as a pivotal tool in the efficient handling and transmission of digital images. Its ability to substantially reduce file size not only facilitates enhanced data storage capacity but also potentially brings advantages to the development of continual machine learning (ML) systems, which learn new knowledge incrementally from sequential data. Continual ML systems often rely…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: Picture Coding Symposium (PCS) 2024

  12. Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning

    Authors: Bingqian Lin, Yanxin Long, Yi Zhu, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Liang Lin

    Abstract: Vision-and-language navigation (VLN) asks an agent to follow a given language instruction to navigate through a real 3D environment. Despite significant advances, conventional VLN agents are trained typically under disturbance-free environments and may easily fail in real-world scenarios, since they are unaware of how to deal with various possible disturbances, such as sudden obstacles or human in…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted by TPAMI 2023

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, 2023)

  13. arXiv:2403.04272  [pdf, other]

    cs.CV

    Active Generalized Category Discovery

    Authors: Shijie Ma, Fei Zhu, Zhun Zhong, Xu-Yao Zhang, Cheng-Lin Liu

    Abstract: Generalized Category Discovery (GCD) is a pragmatic and challenging open-world task, which endeavors to cluster unlabeled samples from both novel and old classes, leveraging some labeled data of old classes. Given that knowledge learned from old classes is not fully transferable to new classes, and that novel categories are fully unlabeled, GCD inherently faces intractable problems, including imba…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  14. arXiv:2403.03822  [pdf, other]

    cs.HC

    HoLens: A Visual Analytics Design for Higher-order Movement Modeling and Visualization

    Authors: Zezheng Feng, Fang Zhu, Hongjun Wang, Jianing Hao, ShuangHua Yang, Wei Zeng, Huamin Qu

    Abstract: Higher-order patterns reveal sequential multistep state transitions, which are usually superior to origin-destination analysis, which depicts only first-order geospatial movement patterns. Conventional methods for higher-order movement modeling first construct a directed acyclic graph (DAG) of movements, then extract higher-order patterns from the DAG. However, DAG-based methods heavily rely on th…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 20 pages, 18 figures; accepted by the Computational Visual Media journal

  15. arXiv:2403.03172  [pdf, other]

    cs.AI cs.LG

    Reaching Consensus in Cooperative Multi-Agent Reinforcement Learning with Goal Imagination

    Authors: Liangzhou Wang, Kaiwen Zhu, Fengming Zhu, Xinghu Yao, Shujie Zhang, Deheng Ye, Haobo Fu, Qiang Fu, Wei Yang

    Abstract: Reaching consensus is key to multi-agent coordination. To accomplish a cooperative task, agents need to coherently select optimal joint actions to maximize the team reward. However, current cooperative multi-agent reinforcement learning (MARL) methods usually do not explicitly take consensus into consideration, which may cause miscoordination problem. In this paper, we propose a model-based consen…

    Submitted 5 March, 2024; originally announced March 2024.

  16. arXiv:2403.02886  [pdf, other]

    cs.CV cs.LG

    Revisiting Confidence Estimation: Towards Reliable Failure Prediction

    Authors: Fei Zhu, Xu-Yao Zhang, Zhen Cheng, Cheng-Lin Liu

    Abstract: Reliable confidence estimation is a challenging yet fundamental requirement in many risk-sensitive applications. However, modern deep neural networks are often overconfident for their incorrect predictions, i.e., misclassified samples from known classes, and out-of-distribution (OOD) samples from unknown classes. In recent years, many confidence calibration and OOD detection methods have been deve…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE TPAMI. arXiv admin note: text overlap with arXiv:2303.02970; text overlap with arXiv:2007.01458 by other authors

  17. arXiv:2403.01759  [pdf, other]

    cs.LG cs.CV

    Open-world Machine Learning: A Review and New Outlooks

    Authors: Fei Zhu, Shijie Ma, Zhen Cheng, Xu-Yao Zhang, Zhaoxiang Zhang, Cheng-Lin Liu

    Abstract: Machine learning has achieved remarkable success in many applications. However, existing studies are largely based on the closed-world assumption, which assumes that the environment is stationary, and the model is fixed once deployed. In many real-world applications, this fundamental and rather naive assumption may not hold because an open environment is complex, dynamic, and full of unknowns. In…

    Submitted 14 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  18. arXiv:2403.00810  [pdf, other]

    cs.AI cs.CL

    Bootstrapping Cognitive Agents with a Large Language Model

    Authors: Feiyu Zhu, Reid Simmons

    Abstract: Large language models contain noisy general knowledge of the world, yet are hard to train or fine-tune. On the other hand cognitive architectures have excellent interpretability and are flexible to update but require a lot of manual work to instantiate. In this work, we combine the best of both worlds: bootstrapping a cognitive-based model with the noisy knowledge encoded in large language models.…

    Submitted 24 February, 2024; originally announced March 2024.

  19. arXiv:2403.00224  [pdf, other]

    stat.ME

    Tobit models for count time series

    Authors: Christian H. Weiß, Fukang Zhu

    Abstract: Several models for count time series have been developed during the last decades, often inspired by traditional autoregressive moving average (ARMA) models for real-valued time series, including integer-valued ARMA (INARMA) and integer-valued generalized autoregressive conditional heteroscedasticity (INGARCH) models. Both INARMA and INGARCH models exhibit an ARMA-like autocorrelation function (ACF…

    Submitted 29 February, 2024; originally announced March 2024.

  20. arXiv:2402.18873  [pdf, other]

    cs.CL

    Reducing Hallucinations in Entity Abstract Summarization with Facts-Template Decomposition

    Authors: Fangwei Zhu, Peiyi Wang, Zhifang Sui

    Abstract: Entity abstract summarization aims to generate a coherent description of a given entity based on a set of relevant Internet documents. Pretrained language models (PLMs) have achieved significant success in this task, but they may suffer from hallucinations, i.e. generating non-factual information about the entity. To address this issue, we decompose the summary into two components: Facts that repr…

    Submitted 29 February, 2024; originally announced February 2024.

  21. arXiv:2402.18862  [pdf, other]

    eess.IV

    Towards Backward-Compatible Continual Learning of Image Compression

    Authors: Zhihao Duan, Ming Lu, Justin Yang, Jiangpeng He, Zhan Ma, Fengqing Zhu

    Abstract: This paper explores the possibility of extending the capability of pre-trained neural image compressors (e.g., adapting to new data or target bitrates) without breaking backward compatibility, the ability to decode bitstreams encoded by the original model. We refer to this problem as continual learning of image compression. Our initial findings show that baseline solutions, such as end-to-end fine…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted to CVPR 2024

  22. arXiv:2402.18528  [pdf, other]

    cs.CV

    Gradient Reweighting: Towards Imbalanced Class-Incremental Learning

    Authors: Jiangpeng He, Fengqing Zhu

    Abstract: Class-Incremental Learning (CIL) trains a model to continually recognize new classes from non-stationary data while retaining learned knowledge. A major challenge of CIL arises when applying to real-world data characterized by non-uniform distribution, which introduces a dual imbalance problem involving (i) disparities between stored exemplars of old tasks and new class data (inter-phase imbalance…

    Submitted 29 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted to CVPR 2024

  23. arXiv:2402.15772  [pdf, other]

    stat.ME

    Mean-preserving rounding integer-valued ARMA models

    Authors: Christian H. Weiß, Fukang Zhu

    Abstract: In the past four decades, research on count time series has made significant progress, but research on $\mathbb{Z}$-valued time series is relatively rare. Existing $\mathbb{Z}$-valued models are mainly of autoregressive structure, where the use of the rounding operator is very natural. Because of the discontinuity of the rounding operator, the formulation of the corresponding model identifiability…

    Submitted 24 February, 2024; originally announced February 2024.

  24. arXiv:2402.11425  [pdf, other]

    stat.ME cs.LG math.OC math.PR

    Online Local False Discovery Rate Control: A Resource Allocation Approach

    Authors: Ruicheng Ao, Hongyu Chen, David Simchi-Levi, Feng Zhu

    Abstract: We consider the problem of sequentially conducting multiple experiments where each experiment corresponds to a hypothesis testing task. At each time point, the experimenter must make an irrevocable decision of whether to reject the null hypothesis (or equivalently claim a discovery) before the next experimental result arrives. The goal is to maximize the number of discoveries while maintaining a l…

    Submitted 1 April, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

  25. arXiv:2402.10626  [pdf, other]

    cs.IT eess.SP

    Robust Beamforming for RIS-aided Communications: Gradient-based Manifold Meta Learning

    Authors: Fenghao Zhu, Xinquan Wang, Chongwen Huang, Zhaohui Yang, Xiaoming Chen, Ahmed Alhammadi, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: Reconfigurable intelligent surface (RIS) has become a promising technology to realize the programmable wireless environment via steering the incident signal in fully customizable ways. However, a major challenge in RIS-aided communication systems is the simultaneous design of the precoding matrix at the base station (BS) and the phase shifting matrix of the RIS elements. This is mainly attributed…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: journal

  26. arXiv:2402.06292  [pdf]

    cond-mat.soft

    Towards full control of molecular exciton energy transfer via FRET in DNA origami assemblies

    Authors: Aleksandra K. Adamczyk, Teun A. P. M. Huijben, Karol Kolataj, Fangjia Zhu, Rodolphe Marie, Fernando D. Stefani, Guillermo P. Acuna

    Abstract: Controlling the flow of excitons between organic molecules holds immense promise for various applications, including energy conversion, spectroscopy, photocatalysis, sensing, and microscopy. DNA nanotechnology has shown promise in achieving this control by using synthetic DNA as a platform for positioning and, very recently, for also orienting organic dyes. In this study, the orientation of doubly…

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: 19 pages, 4 figures

  27. arXiv:2402.03628  [pdf, other]

    cs.CL

    Professional Agents -- Evolving Large Language Models into Autonomous Experts with Human-Level Competencies

    Authors: Zhixuan Chu, Yan Wang, Feng Zhu, Lu Yu, Longfei Li, Jinjie Gu

    Abstract: The advent of large language models (LLMs) such as ChatGPT, PaLM, and GPT-4 has catalyzed remarkable advances in natural language processing, demonstrating human-like language fluency and reasoning capacities. This position paper introduces the concept of Professional Agents (PAgents), an application framework harnessing LLM capabilities to create autonomous agents with controllable, specialized,…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 14 pages, 1 figure

  28. arXiv:2402.02349  [pdf]

    eess.IV cs.CV

    Vision Transformer-based Multimodal Feature Fusion Network for Lymphoma Segmentation on PET/CT Images

    Authors: Huan Huang, Liheng Qiu, Shenmiao Yang, Longxi Li, Jiaofen Nan, Yanting Li, Chuang Han, Fubao Zhu, Chen Zhao, Weihua Zhou

    Abstract: Background: Diffuse large B-cell lymphoma (DLBCL) segmentation is a challenge in medical image analysis. Traditional segmentation methods for lymphoma struggle with the complex patterns and the presence of DLBCL lesions. Objective: We aim to develop an accurate method for lymphoma segmentation with 18F-Fluorodeoxyglucose positron emission tomography (PET) and computed tomography (CT) images. Metho…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 14 pages, 6 figures; reference added

  29. arXiv:2401.13223  [pdf, other]

    cs.CL cs.AI

    TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data

    Authors: Fengbin Zhu, Ziyang Liu, Fuli Feng, Chao Wang, Moxin Li, Tat-Seng Chua

    Abstract: In this work, we address question answering (QA) over a hybrid of tabular and textual data that are very common content on the Web (e.g. SEC filings), where discrete reasoning capabilities are often required. Recently, large language models (LLMs) like GPT-4 have demonstrated strong multi-step reasoning capabilities. We then consider harnessing the amazing power of LLMs to solve our task. We abstr…

    Submitted 22 February, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: ACL 2024 (Under Review)

  30. arXiv:2401.11615  [pdf, other]

    eess.IV

    Another Way to the Top: Exploit Contextual Clustering in Learned Image Coding

    Authors: Yichi Zhang, Zhihao Duan, Ming Lu, Dandan Ding, Fengqing Zhu, Zhan Ma

    Abstract: While convolution and self-attention are extensively used in learned image compression (LIC) for transform coding, this paper proposes an alternative called Contextual Clustering based LIC (CLIC) which primarily relies on clustering operations and local attention for correlation characterization and compact representation of an image. As seen, CLIC expands the receptive field into the entire image…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

  31. arXiv:2401.05960  [pdf, other]

    cs.AI

    Machine Learning Insides OptVerse AI Solver: Design Principles and Applications

    Authors: Xijun Li, Fangzhou Zhu, Hui-Ling Zhen, Weilin Luo, Meng Lu, Yimin Huang, Zhenan Fan, Zirui Zhou, Yufei Kuang, Zhihai Wang, Zijie Geng, Yang Li, Haoyang Liu, Zhiwu An, Muming Yang, Jianshu Li, Jie Wang, Junchi Yan, Defeng Sun, Tao Zhong, Yong Zhang, Jia Zeng, Mingxuan Yuan, Jianye Hao, Jun Yao, et al. (1 additional author not shown)

    Abstract: In an era of digital ubiquity, efficient resource management and decision-making are paramount across numerous industries. To this end, we present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI Solver, which aims to mitigate the scarcity of real-world mathematical programming instances, and to surpass the capabilities of traditional opt…

    Submitted 17 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

  32. arXiv:2401.05836  [pdf]

    cs.RO

    On State Estimation in Multi-Sensor Fusion Navigation: Optimization and Filtering

    Authors: Feng Zhu, Zhuo Xu, Xveqing Zhang, Yuantai Zhang, Weijie Chen, Xiaohong Zhang

    Abstract: The essential of navigation, perception, and decision-making which are basic tasks for intelligent robots, is to estimate necessary system states. Among them, navigation is fundamental for other upper applications, providing precise position and orientation, by integrating measurements from multiple sensors. With observations of each sensor appropriately modelled, multi-sensor fusion tasks for nav…

    Submitted 11 January, 2024; originally announced January 2024.

  33. arXiv:2401.03828  [pdf]

    cs.CV

    A multimodal gesture recognition dataset for desktop human-computer interaction

    Authors: Qi Wang, Fengchao Zhu, Guangming Zhu, Liang Zhang, Ning Li, Eryang Gao

    Abstract: Gesture recognition is an indispensable component of natural and efficient human-computer interaction technology, particularly in desktop-level applications, where it can significantly enhance people's productivity. However, the current gesture recognition community lacks a suitable desktop-level (top-view perspective) dataset for lightweight gesture capture devices. In this study, we have establi…

    Submitted 8 January, 2024; originally announced January 2024.

  34. arXiv:2401.03735  [pdf, other]

    cs.CL

    Language Models Know the Value of Numbers

    Authors: Fangwei Zhu, Damai Dai, Zhifang Sui

    Abstract: Large language models (LLMs) have exhibited impressive competence in various tasks, but their internal mechanisms on mathematical problems are still under-explored. In this paper, we study a fundamental question: whether language models know the value of numbers, a basic element in math. To study the question, we construct a synthetic dataset comprising addition problems and utilize linear probes…

    Submitted 9 June, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

  35. arXiv:2401.03050  [pdf, ps, other]

    math.GR

    Topological restrictions on relatively Anosov representations

    Authors: Konstantinos Tsouvalas, Feng Zhu

    Abstract: We obtain restrictions on which groups can admit relatively Anosov representations into specified target Lie groups, by examining the topology of possible Bowditch boundaries and how they interact with the Anosov limit maps. For instance, we prove that, up to finite index, any group admitting a relatively Anosov representation into SL(3,R) is a free group or surface group, and any group admitting…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 21 pages. Comments welcome!

    MSC Class: 22E40 (Primary) 20F67; 20F65; 57M07 (Secondary)

  36. arXiv:2401.02717  [pdf, other]

    cs.CV cs.AI

    Complementary Information Mutual Learning for Multimodality Medical Image Segmentation

    Authors: Chuyun Shen, Wenhao Li, Haoqing Chen, Xiaoling Wang, Fengping Zhu, Yuxin Li, Xiangfeng Wang, Bo Jin

    Abstract: Radiologists must utilize multiple modal images for tumor segmentation and diagnosis due to the limitations of medical imaging and the diversity of tumor signals. This leads to the development of multimodal learning in segmentation. However, the redundancy among modalities creates challenges for existing subtraction-based joint learning methods, such as misjudging the importance of modalities, ign…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 35 pages, 18 figures

  37. arXiv:2401.02094  [pdf, other]

    cs.CV

    Federated Class-Incremental Learning with Prototype Guided Transformer

    Authors: Haiyang Guo, Fei Zhu, Wenzhuo Liu, Xu-Yao Zhang, Cheng-Lin Liu

    Abstract: Existing federated learning methods have effectively addressed decentralized learning in scenarios involving data privacy and non-IID data. However, in real-world situations, each client dynamically learns new classes, requiring the global model to maintain discriminative capabilities for both new and old classes. To effectively mitigate the effects of catastrophic forgetting and data heterogeneit…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: 11 pages, 4 figures, conference

  38. arXiv:2312.07126  [pdf, other]

    eess.IV

    Deep Hierarchical Video Compression

    Authors: Ming Lu, Zhihao Duan, Fengqing Zhu, Zhan Ma

    Abstract: Recently, probabilistic predictive coding that directly models the conditional distribution of latent features across successive frames for temporal redundancy removal has yielded promising results. Existing methods using a single-scale Variational AutoEncoder (VAE) must devise complex networks for conditional probability estimation in latent space, neglecting multiscale characteristics of video f…

    Submitted 12 December, 2023; originally announced December 2023.

  39. arXiv:2312.06428  [pdf, other]

    cs.CV cs.AI cs.IR cs.LG

    VisionTraj: A Noise-Robust Trajectory Recovery Framework based on Large-scale Camera Network

    Authors: Zhishuai Li, Ziyue Li, Xiaoru Hu, Guoqing Du, Yunhao Nie, Feng Zhu, Lei Bai, Rui Zhao

    Abstract: Trajectory recovery based on the snapshots from the city-wide multi-camera network facilitates urban mobility sensing and driveway optimization. The state-of-the-art solutions devoted to such a vision-based scheme typically incorporate predefined rules or unsupervised iterative feedback, struggling with multi-fold challenges such as lack of open-source datasets for training the whole pipeline, and…

    Submitted 11 December, 2023; originally announced December 2023.

  40. arXiv:2312.03667  [pdf, other]

    cs.CV

    WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual Try-on

    Authors: Xujie Zhang, Xiu Li, Michael Kampffmeyer, Xin Dong, Zhenyu Xie, Feida Zhu, Haoye Dong, Xiaodan Liang

    Abstract: Image-based Virtual Try-On (VITON) aims to transfer an in-shop garment image onto a target person. While existing methods focus on warping the garment to fit the body pose, they often overlook the synthesis quality around the garment-skin boundary and realistic effects like wrinkles and shadows on the warped garments. These limitations greatly reduce the realism of the generated results and hinder…

    Submitted 6 December, 2023; originally announced December 2023.

  41. arXiv:2312.03408  [pdf, other]

    cs.CV

    Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future

    Authors: Hongyang Li, Yang Li, Huijie Wang, Jia Zeng, Huilin Xu, Pinlong Cai, Li Chen, Junchi Yan, Feng Xu, Lu Xiong, Jingdong Wang, Futang Zhu, Chunjing Xu, Tiancai Wang, Fei Xia, Beipeng Mu, Zhihui Peng, Dahua Lin, Yu Qiao

    Abstract: With the continuous maturation and application of autonomous driving technology, a systematic examination of open-source autonomous driving datasets becomes instrumental in fostering the robust evolution of the industry ecosystem. Current autonomous driving datasets can broadly be categorized into two generations. The first-generation autonomous driving datasets are characterized by relatively sim…

    Submitted 22 March, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: This article is a simplified English translation of the corresponding Chinese article. Please refer to the Chinese version for the complete content.

  42. arXiv:2312.01697  [pdf, other]

    cs.CV cs.AI

    Hulk: A Universal Knowledge Translator for Human-Centric Tasks

    Authors: Yizhou Wang, Yixuan Wu, Shixiang Tang, Weizhen He, Xun Guo, Feng Zhu, Lei Bai, Rui Zhao, Jian Wu, Tong He, Wanli Ouyang

    Abstract: Human-centric perception tasks, e.g., pedestrian detection, skeleton-based action recognition, and pose estimation, have wide industrial applications, such as metaverse and sports analysis. There is a recent surge to develop human-centric foundation models that can benefit a broad range of human-centric perception tasks. While many human-centric foundation models have achieved success, they did no…

    Submitted 21 March, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: 24 pages, 5 figures

  43. arXiv:2311.17048  [pdf, other]

    cs.CV

    Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions

    Authors: Zeyu Han, Fangrui Zhu, Qianru Lao, Huaizu Jiang

    Abstract: Zero-shot referring expression comprehension aims at localizing bounding boxes in an image corresponding to provided textual prompts, which requires: (i) a fine-grained disentanglement of complex visual scene and textual context, and (ii) a capacity to understand relationships among disentangled entities. Unfortunately, existing large vision-language alignment (VLA) models, e.g., CLIP, struggle wi…

    Submitted 9 April, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: CVPR 2024, Code available at https://github.com/Show-han/Zeroshot_REC

  44. arXiv:2311.14069  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Massive topological edge channels in three-dimensional topological materials induced by extreme surface anisotropy

    Authors: Fengfeng Zhu, Chenqiang Hua, Xiao Wang, Lin Miao, Yixi Su, Makoto Hashimoto, Donghui Lu, Zhi-Xun Shen, Jin-Feng Jia, Yunhao Lu, Dandan Guan, Dong Qian

    Abstract: A two-dimensional quantum spin Hall insulator exhibits one-dimensional gapless spin-filtered edge channels allowing for dissipationless transport of charge and spin. However, the sophisticated fabrication requirement of two-dimensional materials and the low capacity of one-dimensional channels hinder the broadening applications. We introduce a method to manipulate a three-dimensional topological m…

    Submitted 23 November, 2023; originally announced November 2023.

  45. arXiv:2311.12276  [pdf, other]

    astro-ph.GA astro-ph.SR

    The first Ka-band (26.1-35 GHz) blind line survey towards Orion KL

    Authors: Xunchuan Liu, Tie Liu, Zhiqiang Shen, Sheng-Li Qin, Qiuyi Luo, Yan Gong, Yu Cheng, Christian Henkel, Qilao Gu, Fengyao Zhu, Tianwei Zhang, Rongbing Zhao, Yajun Wu, Bin Li, Juan Li, Zhang Zhao, Jinqing Wang, Weiye Zhong, Qinghui Liu, Bo Xia, Li Fu, Zhen Yan, Chao Zhang, Lingling Wang, Qian Ye, et al. (9 additional authors not shown)

    Abstract: We conducted a Ka-band (26.1--35 GHz) line survey towards Orion KL using the TianMa 65-m Radio Telescope (TMRT). It is the first blind line survey in the Ka band, and achieves a sensitivity of mK level (1--3 mK at a spectral resolution of $\sim$1 km s$^{-1}$). In total, 592 Gaussian features are extracted. Among them, 257 radio recombination lines (RRLs) are identified. The maximum $Δn$ of RRLs of…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: accepted by ApJS

  46. arXiv:2311.09416  [pdf]

    physics.acc-ph

    Low-level radiofrequency system upgrade for the Dalian Coherent Light Source

    Authors: H. L. Ding, J. F. Zhu, H. K. Li, J. W. Han, X. W. Dai, J. Y. Yang, W. Q. Zhang

    Abstract: DCLS (Dalian Coherent Light Source) is an FEL (Free-Electron Laser) user facility at EUV (Extreme Ultraviolet). The primary accelerator of DCLS operates at a repetition rate of 20 Hz, and the beam is divided at the end of the linear accelerator through Kicker to make two 10 Hz beamlines work simultaneously. In the past year, we have completed the upgrade of the DCLS LLRF (Low-Level Radiofrequency)…

    Submitted 24 October, 2023; originally announced November 2023.

    Comments: Poster presented at LLRF Workshop 2023 (LLRF2023, arXiv: 2310.03199)

    Report number: LLRF2023/14

  47. arXiv:2311.09414  [pdf]

    physics.acc-ph

    A low-delay reference tracking algorithm for microwave measurement and control

    Authors: J. F. Zhu, H. L. Ding, H. K. Li, J. W. Han, X. W. Dai, Z. C. Chen, J. Y. Yang, W. Q. Zhang

    Abstract: In FEL (Free-Electron Laser) accelerators, LLRF (Low-Level Radiofrequency) systems usually deploy feedback or feedforward algorithms requiring precise microwave measurement. The slow drift of the clock allocation network of LLRF significantly impacts the measured microwave phase, thereby affecting the stability of the closed-loop operation. The reference tracking algorithm is used to eliminate the…

    Submitted 24 October, 2023; originally announced November 2023.

    Comments: Poster presented at LLRF Workshop 2023 (LLRF2023, arXiv: 2310.03199)

    Report number: LLRF2023/18

  48. arXiv:2311.08414  [pdf]

    physics.acc-ph

    The microwave amplitude and phase setting based on event timing for the DCLS

    Authors: J. F. Zhu, H. L. Ding, H. K. Li, J. W. Han, X. W. Dai, B. Xu, L. Shi, J. Y. Yang, W. Q. Zhang

    Abstract: The primary accelerator of DCLS (Dalian Coherent Light Source) operates at a repetition rate of 20 Hz now, and the beam is divided at the end of the linear accelerator through Kicker to make two 10 Hz beamlines work simultaneously. For the simultaneous emission FEL of two beamlines, the beam energy of the two beamlines is required to be controlled independently, so we need to set the amplitude an…

    Submitted 24 October, 2023; originally announced November 2023.

    Comments: Poster presented at LLRF Workshop 2023 (LLRF2023, arXiv: 2310.03199)

    Report number: LLRF2023/19

  49. arXiv:2311.06861  [pdf, other]

    cs.IT eess.SP

    Energy-efficient Beamforming for RISs-aided Communications: Gradient Based Meta Learning

    Authors: Xinquan Wang, Fenghao Zhu, Qianyun Zhou, Qihao Yu, Chongwen Huang, Ahmed Alhammadi, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: Reconfigurable intelligent surfaces (RISs) have become a promising technology to meet the requirements of energy efficiency and scalability in future sixth-generation (6G) communications. However, a significant challenge in RISs-aided communications is the joint optimization of active and passive beamforming at base stations (BSs) and RISs respectively. Specifically, the main difficulty is attribute…

    Submitted 16 February, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: 5 pages, 8 figures. Accepted in IEEE ICC 2024 (GCSN symposium)

  50. arXiv:2311.00567  [pdf]

    eess.IV cs.CV cs.LG physics.med-ph q-bio.QM

    A Robust Deep Learning Method with Uncertainty Estimation for the Pathological Classification of Renal Cell Carcinoma based on CT Images

    Authors: Ni Yao, Hang Hu, Kaicong Chen, Chen Zhao, Yuan Guo, Boya Li, Jiaofen Nan, Yanting Li, Chuang Han, Fubao Zhu, Weihua Zhou, Li Tian

    Abstract: Objectives: To develop and validate a deep learning-based diagnostic model incorporating uncertainty estimation so as to facilitate radiologists in the preoperative differentiation of the pathological subtypes of renal cell carcinoma (RCC) based on CT images. Methods: Data from 668 consecutive patients, pathologically proven RCC, were retrospectively collected from Center 1. By using five-fold cross…

    Submitted 12 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 16 pages, 6 figures