
Showing 101–150 of 860 results for author: Zhu, M

  1. arXiv:2401.08661  [pdf]

    cs.RO cs.LG

    Risk-anticipatory autonomous driving strategies considering vehicles' weights, based on hierarchical deep reinforcement learning

    Authors: Di Chen, Hao Li, Zhicheng Jin, Huizhao Tu, Meixin Zhu

    Abstract: Autonomous vehicles (AVs) have the potential to prevent accidents caused by driver errors and reduce road traffic risks. Due to the nature of heavy vehicles, whose collisions cause more serious crashes, the weights of vehicles need to be considered when making driving strategies aimed at reducing the potential risks and their consequences in the context of autonomous driving. This study develops…

    Submitted 7 May, 2024; v1 submitted 27 December, 2023; originally announced January 2024.

    Comments: 14 pages, 5 figures, 6 tables

  2. arXiv:2401.07208  [pdf, other]

    cs.CV

    Enhanced Few-Shot Class-Incremental Learning via Ensemble Models

    Authors: Mingli Zhu, Zihao Zhu, Sihong Chen, Chen Chen, Baoyuan Wu

    Abstract: Few-shot class-incremental learning (FSCIL) aims to continually fit new classes with limited training data, while maintaining the performance of previously learned classes. The main challenges are overfitting the rare new training samples and forgetting old classes. While catastrophic forgetting has been extensively studied, the overfitting problem has attracted less attention in FSCIL. To tackle…

    Submitted 21 March, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

  3. arXiv:2401.05507  [pdf, other]

    cs.CL cs.AI

    InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

    Authors: Xueyu Hu, Ziyu Zhao, Shuang Wei, Ziwei Chai, Qianli Ma, Guoyin Wang, Xuwu Wang, Jing Su, Jingjing Xu, Ming Zhu, Yao Cheng, Jianbo Yuan, Jiwei Li, Kun Kuang, Yang Yang, Hongxia Yang, Fei Wu

    Abstract: In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to evaluate LLM-based agents on data analysis tasks. These tasks require agents to solve complex tasks end-to-end by interacting with an execution environment. This benchmark contains DAEval, a dataset consisting of 257 data analysis questions derived from 52 CSV files, and an agent framework which incorpora…

    Submitted 11 March, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: 27 pages, 7 figures, work in progress

  4. arXiv:2401.04181  [pdf, other]

    cs.RO cs.CV

    Language-Conditioned Robotic Manipulation with Fast and Slow Thinking

    Authors: Minjie Zhu, Yichen Zhu, Jinming Li, Junjie Wen, Zhiyuan Xu, Zhengping Che, Chaomin Shen, Yaxin Peng, Dong Liu, Feifei Feng, Jian Tang

    Abstract: Language-conditioned robotic manipulation aims to transfer natural language instructions into executable actions, from simple pick-and-place to tasks requiring intent recognition and visual reasoning. Inspired by the dual process theory in cognitive science, which suggests two parallel systems of fast and slow thinking in human decision-making, we introduce Robotics with Fast and Slow Thinking…

    Submitted 1 February, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: accepted to ICRA2024

  5. arXiv:2401.03128  [pdf, other]

    cs.AI

    Manifold-based Shapley for SAR Recognization Network Explanation

    Authors: Xuran Hu, Mingzhe Zhu, Yuanjing Liu, Zhenpeng Feng, LJubisa Stankovic

    Abstract: Explainable artificial intelligence (XAI) holds immense significance in enhancing the transparency and credibility of deep neural networks, particularly in some risky and high-cost scenarios, like synthetic aperture radar (SAR). Shapley is a game-based explanation technique with robust mathematical foundations. However, Shapley assumes that a model's features are independent, rendering Shapley explana…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: 5 pages, 4 figures

    ACM Class: H.1.m

  6. arXiv:2401.03122  [pdf, other]

    cs.CV eess.IV

    SAR Despeckling via Regional Denoising Diffusion Probabilistic Model

    Authors: Xuran Hu, Ziqiang Xu, Zhihan Chen, Zhengpeng Feng, Mingzhe Zhu, LJubisa Stankovic

    Abstract: Speckle noise poses a significant challenge in maintaining the quality of synthetic aperture radar (SAR) images, so SAR despeckling techniques have drawn increasing attention. Despite the tremendous advancements of deep learning in fixed-scale SAR image despeckling, these methods still struggle to deal with large-scale SAR images. To address this problem, this paper introduces a novel despeckling…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 5 pages, 5 figures

    ACM Class: I.4.4

  7. arXiv:2401.02883  [pdf, other]

    cs.RO eess.SY

    iPolicy: Incremental Policy Algorithms for Feedback Motion Planning

    Authors: Guoxiang Zhao, Devesh K. Jha, Yebin Wang, Minghui Zhu

    Abstract: This paper presents policy-based motion planning for robotic systems. The motion planning literature has been mostly focused on open-loop trajectory planning which is followed by tracking online. In contrast, we solve the problem of path planning and controller synthesis simultaneously by solving the related feedback control problem. We present a novel incremental policy (iPolicy) algorithm for mo…

    Submitted 5 January, 2024; originally announced January 2024.

  8. arXiv:2401.02814  [pdf, other]

    cs.RO cs.CV

    Object-Centric Instruction Augmentation for Robotic Manipulation

    Authors: Junjie Wen, Yichen Zhu, Minjie Zhu, Jinming Li, Zhiyuan Xu, Zhengping Che, Chaomin Shen, Yaxin Peng, Dong Liu, Feifei Feng, Jian Tang

    Abstract: Humans interpret scenes by recognizing both the identities and positions of objects in their observations. For a robot to perform tasks such as "pick and place", understanding both what the objects are and where they are located is crucial. While the former has been extensively discussed in the literature that uses the large language model to enrich the text descriptions, the latter remain…

    Submitted 1 February, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

    Comments: accepted to ICRA2024

  9. arXiv:2401.02330  [pdf, other]

    cs.CV cs.CL

    LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model

    Authors: Yichen Zhu, Minjie Zhu, Ning Liu, Zhicai Ou, Xiaofeng Mou, Jian Tang

    Abstract: In this paper, we introduce LLaVA-$φ$ (LLaVA-Phi), an efficient multi-modal assistant that harnesses the power of the recently advanced small language model, Phi-2, to facilitate multi-modal dialogues. LLaVA-Phi marks a notable advancement in the realm of compact multi-modal models. It demonstrates that even smaller language models, with as few as 2.7B parameters, can effectively engage in intrica…

    Submitted 22 February, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

    Comments: The datasets were incomplete as they did not include all the necessary copyrights

  10. arXiv:2401.00625  [pdf, ps, other]

    cs.LG

    Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models

    Authors: Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao

    Abstract: The burgeoning field of Large Language Models (LLMs), exemplified by sophisticated models like OpenAI's ChatGPT, represents a significant advancement in artificial intelligence. These models, however, bring forth substantial challenges in the high consumption of computational, memory, energy, and financial resources, especially in environments with limited resource capabilities. This survey aims t…

    Submitted 3 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

    Comments: Preprint. GitHub repo: https://github.com/tiingweii-shii/Awesome-Resource-Efficient-LLM-Papers

  11. arXiv:2312.15254  [pdf, other]

    quant-ph

    Ecmas: Efficient Circuit Mapping and Scheduling for Surface Code

    Authors: Mingzheng Zhu, Hao Fu, Jun Wu, Chi Zhang, Wei Xie, Xiang-Yang Li

    Abstract: As the leading candidate of quantum error correction codes, surface code suffers from significant overhead, such as execution time. Reducing the circuit's execution time not only enhances its execution efficiency but also improves fidelity. However, finding the shortest execution time is NP-hard. In this work, we study the surface code mapping and scheduling problem. To reduce the execution time…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: 12 pages, Accepted to IEEE/ACM International Symposium on Code Generation and Optimization

  12. arXiv:2312.15043  [pdf, other]

    cs.CV

    GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection

    Authors: Haozhan Shen, Tiancheng Zhao, Mingwei Zhu, Jianwei Yin

    Abstract: Visual grounding, a crucial vision-language task involving the understanding of the visual context based on the query expression, requires the model to capture the interactions between objects, as well as various spatial and attribute information. However, the annotated data for the visual grounding task is limited due to its time-consuming and labor-intensive annotation process, resulting in the…

    Submitted 22 December, 2023; originally announced December 2023.

  13. arXiv:2312.13530  [pdf, other]

    cs.CR cs.AI cs.LG

    HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion

    Authors: Yu-Zheng Lin, Muntasir Mamun, Muhtasim Alam Chowdhury, Shuyu Cai, Mingyu Zhu, Banafsheh Saber Latibari, Kevin Immanuel Gubbi, Najmeh Nazari Bavarsad, Arjun Caputo, Avesta Sasan, Houman Homayoun, Setareh Rafatirad, Pratik Satam, Soheil Salehi

    Abstract: The escalating complexity of modern computing frameworks has resulted in a surge in the cybersecurity vulnerabilities reported to the National Vulnerability Database (NVD) by practitioners. Although the NVD is one of the most significant databases for the latest insights into vulnerabilities, extracting meaningful trends from such a large amount of unstructured data is stil…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 22 pages, 10 pages appendix, 10 figures, Submitted to ACM TODAES

  14. arXiv:2312.13023  [pdf, other]

    eess.SP

    Class Information Guided Reconstruction for Automatic Modulation Open-Set Recognition

    Authors: Ziwei Zhang, Mengtao Zhu, Jiabin Liu, Yunjie Li, Shafei Wang

    Abstract: Automatic Modulation Recognition (AMR) is a crucial technology in the domains of radar and communications. Traditional AMR approaches assume a closed-set scenario, where unknown samples are forcibly misclassified into known classes, leading to serious consequences for situation awareness and threat assessment. To address this issue, Automatic Modulation Open-set Recognition (AMOSR) defines two tas…

    Submitted 14 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: 14 pages, 11 figures

  15. arXiv:2312.10685  [pdf, other]

    gr-qc astro-ph.CO hep-th

    Null energy condition violation during inflation and pulsar timing array observations

    Authors: Gen Ye, Mian Zhu, Yong Cai

    Abstract: Recently, evidence of stochastic gravitational wave background (SGWB) signals observed by pulsar timing array (PTA) collaborations has prompted investigations into their origins. We explore the compatibility of a proposed inflationary scenario, incorporating an intermediate null energy condition (NEC)-violating phase, with the PTA observations. The NEC violation potentially amplifies the primordi…

    Submitted 6 February, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: 14 pages plus one appendix, 4 figures; match published version; reference added

  16. arXiv:2312.10379  [pdf, other]

    quant-ph

    Multi-parameter quantum metrology with stabilized multi-mode squeezed state

    Authors: Yue Li, Xu Cheng, Lingna Wang, Xingyu Zhao, Waner Hou, Yi Li, Kamran Rehan, Mingdong Zhu, Lin Yan, Xi Qin, Xinhua Peng, Haidong Yuan, Yiheng Lin, Jiangfeng Du

    Abstract: Squeezing a quantum state along a specific direction has long been recognized as a crucial technique for enhancing the precision of quantum metrology by reducing parameter uncertainty. However, practical quantum metrology often involves the simultaneous estimation of multiple parameters, necessitating the use of high-quality squeezed states along multiple orthogonal axes to surpass the standard qu…

    Submitted 16 December, 2023; originally announced December 2023.

  17. arXiv:2312.10343  [pdf, other]

    eess.SP cs.AR cs.LG cs.NE

    In-Sensor Radio Frequency Computing for Energy-Efficient Intelligent Radar

    Authors: Yang Sui, Minning Zhu, Lingyi Huang, Chung-Tse Michael Wu, Bo Yuan

    Abstract: Radio Frequency Neural Networks (RFNNs) have demonstrated advantages in realizing intelligent applications across various domains. However, as the model size of deep neural networks rapidly increases, implementing large-scale RFNN in practice requires an extensive number of RF interferometers and consumes a substantial amount of energy. To address this challenge, we propose to utilize low-rank dec…

    Submitted 16 December, 2023; originally announced December 2023.

  18. arXiv:2312.09999  [pdf, ps, other]

    math.CO

    On graphs without cycles of length 0 modulo 4

    Authors: Ervin Győri, Binlong Li, Nika Salia, Casey Tompkins, Kitti Varga, Manran Zhu

    Abstract: Bollobás proved that for every $k$ and $\ell$ such that $k\mathbb{Z}+\ell$ contains an even number, an $n$-vertex graph containing no cycle of length $\ell \bmod k$ can contain at most a linear number of edges. The precise (or asymptotic) value of the maximum number of edges in such a graph is known for very few pairs $\ell$ and $k$. In this work we precisely determine the maximum number of edges…

    Submitted 15 December, 2023; originally announced December 2023.

  19. arXiv:2312.08890  [pdf, other]

    cs.CV cs.CR cs.LG

    Defenses in Adversarial Machine Learning: A Survey

    Authors: Baoyuan Wu, Shaokui Wei, Mingli Zhu, Meixi Zheng, Zihao Zhu, Mingda Zhang, Hongrui Chen, Danni Yuan, Li Liu, Qingshan Liu

    Abstract: The adversarial phenomenon has been widely observed in machine learning (ML) systems, especially in those using deep neural networks: in some particular cases, ML systems may produce predictions that are inconsistent with, and incomprehensible to, humans. This phenomenon poses a serious security threat to the practical application of ML systems, and several advanced attack paradigms have been develop…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: 21 pages, 5 figures, 2 tables, 237 reference papers

  20. arXiv:2312.08880  [pdf, other]

    cs.CV

    GenDet: Towards Good Generalizations for AI-Generated Image Detection

    Authors: Mingjian Zhu, Hanting Chen, Mouxiao Huang, Wei Li, Hailin Hu, Jie Hu, Yunhe Wang

    Abstract: The misuse of AI imagery can have harmful societal effects, prompting the creation of detectors to combat issues like the spread of fake news. Existing methods can effectively detect images generated by seen generators, but it is challenging to detect those generated by unseen generators. They do not concentrate on amplifying the output discrepancy when detectors process real versus fake images. T…

    Submitted 12 December, 2023; originally announced December 2023.

  21. The FAST all sky HI survey (FASHI): The first release of catalog

    Authors: Chuan-Peng Zhang, M. Zhu, P. Jiang, C. Cheng, J. Wang, J. Wang, J. -L. Xu, X. -L. Liu, N. -P. Yu, L. Qian, H. Yu, M. Ai, Y. Jing, C. Xu, Z. Liu, X. Guan, C. Sun, Q. Yang, M. Huang, Q. Hao, FAST Collaboration

    Abstract: The FAST All Sky HI survey (FASHI) was designed to cover the entire sky observable by the Five-hundred-meter Aperture Spherical radio Telescope (FAST), spanning approximately 22000 square degrees of declination between -14 deg and +66 deg, and in the frequency range of 1050-1450 MHz, with the expectation of eventually detecting more than 100000 HI sources. Between August 2020 and June 2023, FASHI…

    Submitted 10 December, 2023; originally announced December 2023.

    Comments: 22 pages, 12 figures, published in SCPMA. All catalogs are available at https://zcp521.github.io/fashi and https://fast.bao.ac.cn/cms/article/271/

    Journal ref: Sci. China-Phys. Mech. Astron. 67, 219511 (2024)

  22. arXiv:2312.04815  [pdf, other]

    cs.LG

    Not All Negatives Are Worth Attending to: Meta-Bootstrapping Negative Sampling Framework for Link Prediction

    Authors: Yakun Wang, Binbin Hu, Shuo Yang, Meiqi Zhu, Zhiqiang Zhang, Qiyang Zhang, Jun Zhou, Guo Ye, Huimei He

    Abstract: The rapid development of graph neural networks (GNNs) has spurred the rise of link prediction, achieving promising performance with various applications. Unfortunately, through a comprehensive analysis, we surprisingly find that current link predictors with dynamic negative samplers (DNSs) suffer from the migration phenomenon between "easy" and "hard" samples, which goes against the preference of…

    Submitted 11 December, 2023; v1 submitted 7 December, 2023; originally announced December 2023.

  23. arXiv:2312.04584  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    Towards Sample-specific Backdoor Attack with Clean Labels via Attribute Trigger

    Authors: Yiming Li, Mingyan Zhu, Junfeng Guo, Tao Wei, Shu-Tao Xia, Zhan Qin

    Abstract: Currently, sample-specific backdoor attacks (SSBAs) are the most advanced and malicious methods since they can easily circumvent most of the current backdoor defenses. In this paper, we reveal that SSBAs are not sufficiently stealthy due to their poisoned-label nature, where users can discover anomalies if they check the image-label relationship. In particular, we demonstrate that it is ineffectiv…

    Submitted 10 December, 2023; v1 submitted 3 December, 2023; originally announced December 2023.

    Comments: 14 pages

  24. arXiv:2311.18268  [pdf]

    cs.DL

    An Explorative Study on Document Type Assignment of Review Articles in Web of Science, Scopus and Journals' Website

    Authors: Manman Zhu, Xinyue Lu, Fuyou Chen, Liying Yang, Zhesi Shen

    Abstract: Accurately assigning the document type of review articles in citation index databases like Web of Science (WoS) and Scopus is important. This study aims to investigate the document type assignment of review articles in Web of Science, Scopus and journals' websites on a large scale. 27,616 papers from 160 journals from 10 review journal series indexed in SCI are analyzed. The document types of these…

    Submitted 30 November, 2023; originally announced November 2023.

  25. arXiv:2311.17940  [pdf, other]

    cs.CV

    Scene Summarization: Clustering Scene Videos into Spatially Diverse Frames

    Authors: Chao Chen, Mingzhi Zhu, Ankush Pratap Singh, Yu Yan, Felix Juefei Xu, Chen Feng

    Abstract: We propose scene summarization as a new video-based scene understanding task. It aims to summarize a long video walkthrough of a scene into a small set of frames that are spatially diverse in the scene, which has many important applications, such as in surveillance, real estate, and robotics. It stems from video summarization but focuses on long and continuous videos from moving cameras, instead of…

    Submitted 28 November, 2023; originally announced November 2023.

  26. arXiv:2311.17425   

    cs.CV

    SpeechAct: Towards Generating Whole-body Motion from Speech

    Authors: Jinsong Zhang, Minjie Zhu, Yuxiang Zhang, Yebin Liu, Kun Li

    Abstract: This paper addresses the problem of generating whole-body motion from speech. Despite great successes, prior methods still struggle to produce reasonable and diverse whole-body motions from speech. This is due to their reliance on suboptimal representations and a lack of strategies for generating diverse results. To address these challenges, we present a novel hybrid point representation to achiev…

    Submitted 13 June, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: The paper has been archived without permission from the newly added author

  27. arXiv:2311.14631  [pdf, other]

    cs.CV

    CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization

    Authors: Ruoyu Zhao, Mingrui Zhu, Shiyin Dong, Nannan Wang, Xinbo Gao

    Abstract: We propose CatVersion, an inversion-based method that learns the personalized concept through a handful of examples. Subsequently, users can utilize text prompts to generate images that embody the personalized concept, thereby achieving text-to-image personalization. In contrast to existing approaches that emphasize word embedding learning or parameter fine-tuning for the diffusion model, which po…

    Submitted 30 November, 2023; v1 submitted 24 November, 2023; originally announced November 2023.

    Comments: For the project page, please visit https://royzhao926.github.io/CatVersion-page/

  28. arXiv:2311.13246  [pdf, other]

    cs.CL

    CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning

    Authors: Yilun Liu, Shimin Tao, Xiaofeng Zhao, Ming Zhu, Wenbing Ma, Junhao Zhu, Chang Su, Yutai Hou, Miao Zhang, Min Zhang, Hongxia Ma, Li Zhang, Hao Yang, Yanfei Jiang

    Abstract: Instruction tuning is crucial for enabling Large Language Models (LLMs) to respond to human instructions. The quality of instruction pairs used for tuning greatly affects the performance of LLMs. However, the manual creation of high-quality instruction datasets is costly, leading to the adoption of automatic generation of instruction pairs by LLMs as a popular alternative. To ensure the high…

    Submitted 20 March, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: Accepted by ICDE 2024

  29. arXiv:2311.12665  [pdf, ps, other]

    math.AG

    Boundedness of stable minimal models with klt singularities

    Authors: Minzhe Zhu

    Abstract: We investigate the singularities and boundedness of a special kind of algebraic variety, the so-called stable minimal models, which were constructed and studied by Birkar. Given a klt stable minimal model with bounded relative volume, if we fix the dimension, Iitaka volume, and a DCC set controlling coefficients, then we show that the singularities of the klt stable minimal model can be controlled uni…

    Submitted 28 June, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Version 2: 31 pages, Section 3 simplified, exposition improved

  30. arXiv:2311.12200  [pdf]

    cond-mat.str-el cond-mat.mtrl-sci

    Hydrogen-induced tunable remanent polarization in a perovskite nickelate

    Authors: Yifan Yuan, Michele Kotiuga, Tae Joon Park, Yuanyuan Ni, Arnob Saha, Hua Zhou, Jerzy T. Sadowski, Abdullah Al-Mahboob, Haoming Yu, Kai Du, Minning Zhu, Sunbin Deng, Ravindra S. Bisht, Xiao Lyu, Chung-Tse Michael Wu, Peide D. Ye, Abhronil Sengupta, Sang-Wook Cheong, Xiaoshan Xu, Karin M. Rabe, Shriram Ramanathan

    Abstract: Materials with field-tunable polarization are of broad interest to condensed matter sciences and solid-state device technologies. Here, using hydrogen (H) donor doping, we modify the room temperature metallic phase of a perovskite nickelate NdNiO3 into an insulating phase with both metastable dipolar polarization and space-charge polarization. We then demonstrate transient negative differential ca…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 13 pages, 5 figures

  31. arXiv:2311.12075  [pdf, other]

    cs.CV

    BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning

    Authors: Siyuan Liang, Mingli Zhu, Aishan Liu, Baoyuan Wu, Xiaochun Cao, Ee-Chien Chang

    Abstract: Studying backdoor attacks is valuable for model copyright protection and enhancing defenses. While existing backdoor attacks have successfully infected multimodal contrastive learning models such as CLIP, they can be easily countered by specialized backdoor defenses for MCL models. This paper reveals the threats in this practical scenario that backdoor attacks can remain effective even after defen…

    Submitted 4 March, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

    Comments: The paper lacks some work that needs to be cited

    Journal ref: CVPR 2024

  32. arXiv:2311.11969  [pdf, other]

    eess.IV cs.CV

    SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks

    Authors: Jin Ye, Junlong Cheng, Jianpin Chen, Zhongying Deng, Tianbin Li, Haoyu Wang, Yanzhou Su, Ziyan Huang, Jilong Chen, Lei Jiang, Hui Sun, Min Zhu, Shaoting Zhang, Junjun He, Yu Qiao

    Abstract: Segment Anything Model (SAM) has achieved impressive results for natural image segmentation with input prompts such as points and bounding boxes. Its success largely owes to massive labeled training data. However, directly applying SAM to medical image segmentation cannot perform well because SAM lacks medical knowledge -- it does not use medical images for training. To incorporate medical knowled…

    Submitted 20 November, 2023; originally announced November 2023.

  33. arXiv:2311.11628  [pdf, other]

    cs.LG

    Incorporating LLM Priors into Tabular Learners

    Authors: Max Zhu, Siniša Stanivuk, Andrija Petrovic, Mladen Nikolic, Pietro Lio

    Abstract: We present a method to integrate Large Language Models (LLMs) and traditional tabular data classification techniques, addressing LLM challenges like data serialization sensitivity and biases. We introduce two strategies utilizing LLMs for ranking categorical variables and generating priors on correlations between continuous variables and targets, enhancing performance in few-shot scenarios. We fo…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Table Representation Learning Workshop at NeurIPS 2023

  34. arXiv:2311.11354  [pdf, other]

    cs.CV

    Scale-aware competition network for palmprint recognition

    Authors: Chengrui Gao, Ziyuan Yang, Min Zhu, Andrew Beng Jin Teoh

    Abstract: Palmprint biometrics garner heightened attention in palm-scanning payment and social security due to their distinctive attributes. However, prevailing methodologies singularly prioritize texture orientation, neglecting the significant texture scale dimension. We design an innovative network for concurrently extracting intra-scale and inter-scale features to redress this limitation. This paper prop…

    Submitted 20 November, 2023; v1 submitted 19 November, 2023; originally announced November 2023.

  35. arXiv:2311.10051  [pdf, other]

    cs.LG

    Tabular Few-Shot Generalization Across Heterogeneous Feature Spaces

    Authors: Max Zhu, Katarzyna Kobalczyk, Andrija Petrovic, Mladen Nikolic, Mihaela van der Schaar, Boris Delibasic, Pietro Lio

    Abstract: Despite the prevalence of tabular datasets, few-shot learning remains under-explored within this domain. Existing few-shot methods are not directly applicable to tabular datasets due to varying column relationships, meanings, and permutational invariance. To address these challenges, we propose FLAT, a novel approach to tabular few-shot learning, encompassing knowledge sharing between datasets with…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: Tabular learning, Deep learning, Few shot learning

  36. arXiv:2311.09925  [pdf, other]

    astro-ph.GA

    Formation of a massive lenticular galaxy under the tidal interaction with a group of dwarf galaxies

    Authors: Jin-Long Xu, Ming Zhu, Kelley M. Hess, Naiping Yu, Chuan-Peng Zhang, Xiao-Lan Liu, Mei Ai, Peng Jiang, Jie Wang

    Abstract: Based on the atomic-hydrogen (HI) observations using the Five-hundred-meter Aperture Spherical radio Telescope (FAST), we present a detailed study of the gas-rich massive S0 galaxy NGC 1023 in a nearby galaxy group. The presence of an HI extended warped disk in NGC 1023 indicates that this S0 galaxy originated from a spiral galaxy. The data also suggest that NGC 1023 is interacting with four dwarf…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 13 pages, 8 figures, Accepted for publication in the ApJ Letters

  37. arXiv:2311.08786  [pdf, other]

    cs.CV

    HFORD: High-Fidelity and Occlusion-Robust De-identification for Face Privacy Protection

    Authors: Dongxin Chen, Mingrui Zhu, Nannan Wang, Xinbo Gao

    Abstract: With the popularity of smart devices and the development of computer vision technology, concerns about face privacy protection are growing. The face de-identification technique is a practical way to solve the identity protection problem. The existing facial de-identification methods have revealed several problems, including the impact on the realism of anonymized results when faced with occlusions…

    Submitted 15 November, 2023; originally announced November 2023.

  38. An evolutionary continuum from nucleated dwarf galaxies to star clusters

    Authors: Kaixiang Wang, Eric W. Peng, Chengze Liu, J. Christopher Mihos, Patrick Côté, Laura Ferrarese, Matthew A. Taylor, John P. Blakeslee, Jean-Charles Cuillandre, Pierre-Alain Duc, Puragra Guhathakurta, Stephen Gwyn, Youkyung Ko, Ariane Lançon, Sungsoon Lim, Lauren A. MacArthur, Thomas Puzia, Joel Roediger, Laura V. Sales, Rubén Sánchez-Janssen, Chelsea Spengler, Elisa Toloba, Hongxin Zhang, Mingcheng Zhu

    Abstract: Systematic studies have revealed hundreds of ultra-compact dwarf galaxies (UCDs) in the nearby Universe. With half-light radii $r_h$ of approximately 10-100 parsecs and stellar masses $M_*$ $\approx$ $10^6-10^8$ solar masses, UCDs are among the densest known stellar systems. Although similar in appearance to massive globular clusters, the detection of extended stellar envelopes, complex star forma…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: Published in Nature. Accepted on September 15

    Journal ref: Nature 623 (2023) 296-300

  39. arXiv:2311.05075  [pdf]

    cs.LG cs.AI cs.CL

    Mental Health Diagnosis in the Digital Age: Harnessing Sentiment Analysis on Social Media Platforms upon Ultra-Sparse Feature Content

    Authors: Haijian Shao, Ming Zhu, Shengjie Zhai

    Abstract: Amid growing global mental health concerns, particularly among vulnerable groups, natural language processing offers tremendous potential for early detection of and intervention in mental disorders by analyzing people's posts and discussions on social media platforms. However, ultra-sparse training data, often due to vast vocabularies and low-frequency words, hinders the analysis accuracy…

    Submitted 8 November, 2023; originally announced November 2023.

  40. arXiv:2311.04653  [pdf, other]

    cs.LG cs.AI

    Hybrid Focal and Full-Range Attention Based Graph Transformers

    Authors: Minhong Zhu, Zhenhao Zhao, Weiran Cai

    Abstract: The paradigm of Transformers using the self-attention mechanism has manifested its advantage in learning graph-structured data. Yet Graph Transformers, while capable of modeling full-range dependencies, are often deficient in extracting information from locality. A common practice is to utilize Message Passing Neural Networks (MPNNs) as an auxiliary to capture local information, which however are…

    Submitted 8 November, 2023; originally announced November 2023.

  41. arXiv:2310.20369  [pdf, other]

    cs.LG math.OC

    Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm

    Authors: Miaoxi Zhu, Li Shen, Bo Du, Dacheng Tao

    Abstract: The growing size of available data has attracted increasing interest in solving minimax problems in a decentralized manner for various machine learning tasks. Previous theoretical research has primarily focused on the convergence rate and communication complexity of decentralized minimax algorithms, with little attention given to their generalization. In this paper, we investigate the primal-dual…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  42. arXiv:2310.18848  [pdf, ps, other]

    math.AP

    Optimal concentration level of anisotropic Trudinger-Moser functionals on any bounded domain

    Authors: Lu Chen, Rou Jiang, Maochun Zhu

    Abstract: Let $F$ be convex and homogeneous of degree $1$, its polar $F^{o}$ represent a Finsler metric on $\mathbb{R}^{n}$, and $\Omega$ be any bounded open set in $\mathbb{R}^{n}$. In this paper, we first construct the theoretical structure of anisotropic harmonic transplantation. Using the anisotropic harmonic transplantation, co-area formula, limiting Sobolev approximation method, delicate estimate of level…

    Submitted 28 October, 2023; originally announced October 2023.

  43. arXiv:2310.16917  [pdf, other]

    cs.RO cs.LG

    MimicTouch: Learning Human's Control Strategy with Multi-Modal Tactile Feedback

    Authors: Kelin Yu, Yunhai Han, Matthew Zhu, Ye Zhao

    Abstract: In robotics and artificial intelligence, the integration of tactile processing is becoming increasingly pivotal, especially in learning to execute intricate tasks like alignment and insertion. However, existing works focusing on tactile methods for insertion tasks predominantly rely on robot teleoperation data and reinforcement learning, which do not utilize the rich insights provided by human's c…

    Submitted 1 November, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

    Comments: Presented at CoRL 2023 Deployable Workshop and NIPS 2023 Touch Processing Workshop

  44. arXiv:2310.15975  [pdf]

    cs.LG

    Data-driven Traffic Simulation: A Comprehensive Review

    Authors: Di Chen, Meixin Zhu, Hao Yang, Xuesong Wang, Yinhai Wang

    Abstract: Autonomous vehicles (AVs) have the potential to significantly revolutionize society by providing a secure and efficient mode of transportation. Recent years have witnessed notable advancements in autonomous driving perception and prediction, but the challenge of validating the performance of AVs remains largely unresolved. Data-driven microscopic traffic simulation has become an important tool for…

    Submitted 23 November, 2023; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: 21 pages, 7 figures, 6 tables

  45. arXiv:2310.13473  [pdf, other]

    cs.CV

    Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models

    Authors: Mingwei Zhu, Leigang Sha, Yu Shu, Kangjia Zhao, Tiancheng Zhao, Jianwei Yin

    Abstract: Multimodal large language models (MLLMs) have shown great potential in perception and interpretation tasks, but their capabilities in predictive reasoning remain under-explored. To address this gap, we introduce a novel benchmark that assesses the predictive reasoning capabilities of MLLMs across diverse scenarios. Our benchmark targets three important domains: abstract pattern reasoning, human ac…

    Submitted 20 October, 2023; originally announced October 2023.

  46. CylinderTag: An Accurate and Flexible Marker for Cylinder-Shape Objects Pose Estimation Based on Projective Invariants

    Authors: Shaoan Wang, Mingzhu Zhu, Yaoqing Hu, Dongyue Li, Fusong Yuan, Junzhi Yu

    Abstract: High-precision pose estimation based on visual markers has been a thriving research topic in the field of computer vision. However, the suitability of traditional flat markers on curved objects is limited due to the diverse shapes of curved surfaces, which hinders the development of high-precision pose estimation for curved objects. Therefore, this paper proposes a novel visual marker called Cylin…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 15 pages, 22 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

    Journal ref: IEEE Transactions on Visualization and Computer Graphics, 2024

  47. arXiv:2310.13315  [pdf, other]

    cs.CL

    Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models

    Authors: Miaoxi Zhu, Qihuang Zhong, Li Shen, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao

    Abstract: Quantization is a promising approach for reducing memory overhead and accelerating inference, especially in large pre-trained language model (PLM) scenarios. While having no access to original training data due to security and privacy concerns has emerged the demand for zero-shot quantization. Most of the cutting-edge zero-shot quantization methods primarily 1) apply to computer vision tasks, and…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP2023 (Main). Miaoxi Zhu and Qihuang Zhong contribute equally to this work

  48. arXiv:2310.11802  [pdf, other]

    cs.CE cs.LG q-bio.BM

    De novo protein design using geometric vector field networks

    Authors: Weian Mao, Muzhi Zhu, Zheng Sun, Shuaike Shen, Lin Yuanbo Wu, Hao Chen, Chunhua Shen

    Abstract: Innovations like protein diffusion have enabled significant progress in de novo protein design, which is a vital topic in life science. These methods typically depend on protein structure encoders to model residue backbone frames, where atoms do not exist. Most prior encoders rely on atom-wise features, such as angles and distances between atoms, which are not available in this context. Thus far,…

    Submitted 18 October, 2023; originally announced October 2023.

  49. arXiv:2310.10952  [pdf, other]

    stat.ML cs.LG stat.AP stat.CO

    Restricted Tweedie Stochastic Block Models

    Authors: Jie Jian, Mu Zhu, Peijun Sang

    Abstract: The stochastic block model (SBM) is a widely used framework for community detection in networks, where the network structure is typically represented by an adjacency matrix. However, conventional SBMs are not directly applicable to an adjacency matrix that consists of non-negative zero-inflated continuous edge weights. To model the international trading network, where edge weights represent tradin…

    Submitted 16 October, 2023; originally announced October 2023.

  50. Contrastive Self-Supervised Learning for Spatio-Temporal Analysis of Lung Ultrasound Videos

    Authors: Li Chen, Jonathan Rubin, Jiahong Ouyang, Naveen Balaraju, Shubham Patil, Courosh Mehanian, Sourabh Kulhare, Rachel Millin, Kenton W Gregory, Cynthia R Gregory, Meihua Zhu, David O Kessler, Laurie Malia, Almaz Dessie, Joni Rabiner, Di Coneybeare, Bo Shopsin, Andrew Hersh, Cristian Madar, Jeffrey Shupp, Laura S Johnson, Jacob Avila, Kristin Dwyer, Peter Weimersheimer, Balasundar Raju, et al. (2 additional authors not shown)

    Abstract: Self-supervised learning (SSL) methods have shown promise for medical imaging applications by learning meaningful visual representations, even when the amount of labeled data is limited. Here, we extend state-of-the-art contrastive learning SSL methods to 2D+time medical ultrasound video data by introducing a modified encoder and augmentation method capable of learning meaningful spatio-temporal r…

    Submitted 14 October, 2023; originally announced October 2023.

    Comments: ISBI 2023, 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI)