Showing 1–50 of 129 results for author: Shen, Q

  1. arXiv:2407.00474  [pdf, other]

    cs.LG cs.AI

    MH-pFLGB: Model Heterogeneous personalized Federated Learning via Global Bypass for Medical Image Analysis

    Authors: Luyuan Xie, Manqing Lin, ChenMing Xu, Tianyu Luan, Zhipeng Zeng, Wenjun Qian, Cong Li, Yuejian Fang, Qingni Shen, Zhonghai Wu

    Abstract: In the evolving application of medical artificial intelligence, federated learning is notable for its ability to protect training data privacy. Federated learning facilitates collaborative model development without the need to share local data from healthcare institutions. Yet, the statistical and system heterogeneity among these institutions poses substantial challenges, which affects the effecti…

    Submitted 29 June, 2024; originally announced July 2024.

  2. arXiv:2407.00462  [pdf, other]

    cs.CV cs.AI

    pFLFE: Cross-silo Personalized Federated Learning via Feature Enhancement on Medical Image Segmentation

    Authors: Luyuan Xie, Manqing Lin, Siyuan Liu, ChenMing Xu, Tianyu Luan, Cong Li, Yuejian Fang, Qingni Shen, Zhonghai Wu

    Abstract: In medical image segmentation, personalized cross-silo federated learning (FL) is becoming popular for utilizing varied data across healthcare settings to overcome data scarcity and privacy concerns. However, existing methods often suffer from client drift, leading to inconsistent performance and delayed training. We propose a new framework, Personalized Federated Learning via Feature Enhancement…

    Submitted 29 June, 2024; originally announced July 2024.

  3. arXiv:2406.14095  [pdf, other]

    cs.LG cs.AI

    Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization

    Authors: Qianli Shen, Yezhen Wang, Zhouhao Yang, Xiang Li, Haonan Wang, Yang Zhang, Jonathan Scarlett, Zhanxing Zhu, Kenji Kawaguchi

    Abstract: Bi-level optimization (BO) has become a fundamental mathematical framework for addressing hierarchical machine learning problems. As deep learning models continue to grow in size, the demand for scalable bi-level optimization solutions has become increasingly critical. Traditional gradient-based bi-level optimization algorithms, due to their inherent characteristics, are ill-suited to meet the dem…

    Submitted 20 June, 2024; originally announced June 2024.

  4. arXiv:2406.06825  [pdf, other]

    stat.ML cs.LG math.PR

    A local squared Wasserstein-2 method for efficient reconstruction of models with uncertainty

    Authors: Mingtao Xia, Qi**g Shen

    Abstract: In this paper, we propose a local squared Wasserstein-2 ($W_2$) method to solve the inverse problem of reconstructing models with uncertain latent variables or parameters. A key advantage of our approach is that it does not require prior information on the distribution of the latent variables or parameters in the underlying models. Instead, our method can efficiently reconstruct the distributions of…

    Submitted 10 June, 2024; originally announced June 2024.

    MSC Class: 60E05; 62D05

  5. arXiv:2406.06367  [pdf, other]

    cs.CV

    MVGamba: Unify 3D Content Generation as State Space Sequence Modeling

    Authors: Xuanyu Yi, Zike Wu, Qiuhong Shen, Qingshan Xu, Pan Zhou, Joo-Hwee Lim, Shuicheng Yan, Xinchao Wang, Hanwang Zhang

    Abstract: Recent 3D large reconstruction models (LRMs) can generate high-quality 3D content in sub-seconds by integrating multi-view diffusion models with scalable multi-view reconstructors. Current works further leverage 3D Gaussian Splatting as 3D representation for improved visual quality and rendering efficiency. However, we observe that existing Gaussian reconstruction models often suffer from multi-vi…

    Submitted 20 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  6. arXiv:2406.04178  [pdf, other]

    cs.CV

    Encoding Semantic Priors into the Weights of Implicit Neural Representation

    Authors: Zhicheng Cai, Qiu Shen

    Abstract: Implicit neural representation (INR) has recently emerged as a promising paradigm for signal representations, which takes coordinates as inputs and generates corresponding signal values. Since these coordinates contain no semantic features, INR fails to take any semantic information into consideration. However, semantic information has been proven critical in many vision tasks, especially for visu…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: ICME 2024

  7. arXiv:2406.01653  [pdf, other]

    stat.ML cs.LG math.PR stat.AP stat.ME

    An efficient Wasserstein-distance approach for reconstructing jump-diffusion processes using parameterized neural networks

    Authors: Mingtao Xia, Xiangting Li, Qi**g Shen, Tom Chou

    Abstract: We analyze the Wasserstein distance ($W$-distance) between two probability distributions associated with two multidimensional jump-diffusion processes. Specifically, we analyze a temporally decoupled squared $W_2$-distance, which provides both upper and lower bounds associated with the discrepancies in the drift, diffusion, and jump amplitude functions between the two jump-diffusion processes. The…

    Submitted 3 June, 2024; originally announced June 2024.

    MSC Class: 60G07; 60J76

  8. arXiv:2406.01028  [pdf, other]

    cs.CV

    LLEMamba: Low-Light Enhancement via Relighting-Guided Mamba with Deep Unfolding Network

    Authors: Xuanqi Zhang, Hai** Zeng, **wang Pan, Qiangqiang Shen, Yongyong Chen

    Abstract: Transformer-based low-light enhancement methods have yielded promising performance by effectively capturing long-range dependencies in a global context. However, their elevated computational demand limits the scalability of multiple iterations in deep unfolding networks, and hence they have difficulty in flexibly balancing interpretability and distortion. To address this issue, we propose a novel…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 9 pages, 7 figures

  9. arXiv:2405.18426  [pdf, other]

    cs.CV cs.AI

    GFlow: Recovering 4D World from Monocular Video

    Authors: Shizun Wang, Xingyi Yang, Qiuhong Shen, Zhenxiang Jiang, Xinchao Wang

    Abstract: Reconstructing 4D scenes from video inputs is a crucial yet challenging task. Conventional methods usually rely on the assumptions of multi-view video inputs, known camera parameters, or static scenes, all of which are typically absent under in-the-wild scenarios. In this paper, we relax all these constraints and tackle a highly ambitious but practical task, which we termed as AnyV4D: we assume on…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Project page: https://littlepure2333.github.io/GFlow

  10. arXiv:2405.18218  [pdf, other]

    cs.LG

    FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models

    Authors: Yang Zhang, Yawei Li, Xinpeng Wang, Qianli Shen, Barbara Plank, Bernd Bischl, Mina Rezaei, Kenji Kawaguchi

    Abstract: Overparametrized transformer networks are the state-of-the-art architecture for Large Language Models (LLMs). However, such models contain billions of parameters making large compute a necessity, while raising environmental concerns. To address these issues, we propose FinerCut, a new form of fine-grained layer pruning, which in contrast to prior work at the transformer block level, considers all…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 22 pages

  11. arXiv:2405.15843  [pdf, other]

    cs.CV cs.AI

    SpotNet: An Image Centric, Lidar Anchored Approach To Long Range Perception

    Authors: Louis Foucard, Samar Khanna, Yi Shi, Chi-Kuei Liu, Quinn Z Shen, Thuyen Ngo, Zi-Xiang Xia

    Abstract: In this paper, we propose SpotNet: a fast, single stage, image-centric but LiDAR anchored approach for long range 3D object detection. We demonstrate that our approach to LiDAR/image sensor fusion, combined with the joint learning of 2D and 3D detection tasks, can lead to accurate 3D object detection with very sparse LiDAR support. Unlike more recent bird's-eye-view (BEV) sensor-fusion methods whi…

    Submitted 24 May, 2024; originally announced May 2024.

  12. arXiv:2405.14800  [pdf, other]

    cs.CR cs.CV

    Membership Inference on Text-to-Image Diffusion Models via Conditional Likelihood Discrepancy

    Authors: Shengfang Zhai, Huanran Chen, Yinpeng Dong, Jiajun Li, Qingni Shen, Yansong Gao, Hang Su, Yang Liu

    Abstract: Text-to-image diffusion models have achieved tremendous success in the field of controllable image generation, while also coming along with issues of privacy leakage and data copyrights. Membership inference arises in these contexts as a potential auditing method for detecting unauthorized data usage. While some efforts have been made on diffusion models, they are not applicable to text-to-image d…

    Submitted 29 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 17 pages, 5 figures; minor typos corrected

  13. arXiv:2405.13144  [pdf, other]

    cs.AI cs.CL

    Mamo: a Mathematical Modeling Benchmark with Solvers

    Authors: Xuhan Huang, Qingning Shen, Yan Hu, Anningzhe Gao, Benyou Wang

    Abstract: Mathematical modeling involves representing real-world phenomena, systems, or problems using mathematical expressions and equations to analyze, understand, and predict their behavior. Given that this process typically requires experienced experts, there is an interest in exploring whether Large Language Models (LLMs) can undertake mathematical modeling to potentially decrease human labor. To evalu…

    Submitted 30 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: Project: https://github.com/FreedomIntelligence/Mamo Updates: 1. include more models 2. minor modification of the metric with new results 3. fix some typos 4. add error analysis with examples

  14. arXiv:2405.13076  [pdf]

    q-fin.ST cs.LG

    A K-means Algorithm for Financial Market Risk Forecasting

    Authors: **xin Xu, Kaixian Xu, Yue Wang, Qinyan Shen, Ruisi Li

    Abstract: Financial market risk forecasting involves applying mathematical models, historical data analysis and statistical methods to estimate the impact of future market movements on investments. This process is crucial for investors to develop strategies, financial institutions to manage assets and regulators to formulate policy. In today's society, there are problems of high error rate and low precision…

    Submitted 20 May, 2024; originally announced May 2024.

  15. arXiv:2405.09470  [pdf, other]

    cs.SD cs.CR cs.LG eess.AS

    Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer

    Authors: Weifei **, Yuxin Cao, Junjie Su, Qi Shen, Kai Ye, Derui Wang, Jie Hao, Ziyao Liu

    Abstract: In light of the widespread application of Automatic Speech Recognition (ASR) systems, their security concerns have received much more attention than ever before, primarily due to the susceptibility of Deep Neural Networks. Previous studies have illustrated that surreptitiously crafting adversarial perturbations enables the manipulation of speech recognition systems, resulting in the production of…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted to SecTL (AsiaCCS Workshop) 2024

  16. arXiv:2405.06822  [pdf, other]

    cs.LG cs.AI

    MH-pFLID: Model Heterogeneous personalized Federated Learning via Injection and Distillation for Medical Data Analysis

    Authors: Luyuan Xie, Manqing Lin, Tianyu Luan, Cong Li, Yuejian Fang, Qingni Shen, Zhonghai Wu

    Abstract: Federated learning is widely used in medical applications for training global models without needing local data access. However, varying computational capabilities and network architectures (system heterogeneity) across clients pose significant challenges in effectively aggregating information from non-independently and identically distributed (non-IID) data. Current federated learning methods us…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: This paper is accepted by ICML 2024

  17. arXiv:2405.05800  [pdf, other]

    cs.GR cs.CV

    DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation

    Authors: Sitian Shen, **g Xu, Yuheng Yuan, Xingyi Yang, Qiuhong Shen, Xinchao Wang

    Abstract: User-friendly 3D object editing is a challenging task that has attracted significant attention recently. The limitations of direct 3D object editing without 2D prior knowledge have prompted increased attention towards utilizing 2D generative models for 3D editing. While existing methods like Instruct NeRF-to-NeRF offer a solution, they often lack user-friendliness, particularly due to semantic gui…

    Submitted 9 May, 2024; originally announced May 2024.

  18. arXiv:2403.18795  [pdf, other]

    cs.CV cs.AI

    Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction

    Authors: Qiuhong Shen, Zike Wu, Xuanyu Yi, Pan Zhou, Hanwang Zhang, Shuicheng Yan, Xinchao Wang

    Abstract: We tackle the challenge of efficiently reconstructing a 3D asset from a single image at millisecond speed. Existing methods for single-image 3D reconstruction are primarily based on Score Distillation Sampling (SDS) with Neural 3D representations. Despite promising results, these approaches encounter practical limitations due to lengthy optimizations and significant memory consumption. In this wor…

    Submitted 24 May, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: project page: https://florinshen.github.io/gamba-project

  19. arXiv:2403.17610  [pdf, other]

    cs.CV

    MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors

    Authors: He Zhang, Shenghao Ren, Haolei Yuan, Jianhui Zhao, Fan Li, Shuangpeng Sun, Zhenghao Liang, Tao Yu, Qiu Shen, Xun Cao

    Abstract: Foot contact is an important cue for human motion capture, understanding, and generation. Existing datasets tend to annotate dense foot contact using visual matching with thresholding or incorporating pressure signals. However, these approaches either suffer from low accuracy or are only designed for small-range and slow motion. There is still a lack of a vision-pressure multimodal dataset with la…

    Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: CVPR2024

  20. arXiv:2403.09294  [pdf, other]

    cs.CV cs.CL

    Anatomical Structure-Guided Medical Vision-Language Pre-training

    Authors: Qingqiu Li, Xiaohan Yan, Jilan Xu, Runtian Yuan, Yuejie Zhang, Rui Feng, Quanli Shen, Xiaobo Zhang, Shujun Wang

    Abstract: Learning medical visual representations through vision-language pre-training has reached remarkable progress. Despite the promising performance, it still faces challenges, i.e., local alignment lacks interpretability and clinical relevance, and the insufficient internal and external representation learning of image-report pairs. To address these issues, we propose an Anatomical Structure-Guided (A…

    Submitted 14 March, 2024; originally announced March 2024.

  21. arXiv:2402.13435  [pdf, other]

    cs.IR cs.LG

    Learning to Retrieve for Job Matching

    Authors: Jianqiang Shen, Yuchin Juan, Shaobo Zhang, ** Liu, Wen Pu, Sriram Vasudevan, Qingquan Song, Fedor Borisyuk, Kay Qianqi Shen, Haichao Wei, Yunxiang Ren, Yeou S. Chiou, Sicong Kuang, Yuan Yin, Ben Zheng, Muchen Wu, Shaghayegh Gharghabi, Xiaoqing Wang, Huichao Xue, Qi Guo, Daniel Hewlett, Luke Simon, Liangjie Hong, Wen**g Zhang

    Abstract: Web-scale search systems typically tackle the scalability challenge with a two-step paradigm: retrieval and ranking. The retrieval step, also known as candidate selection, often involves extracting standardized entities, creating an inverted index, and performing term matching for retrieval. Such traditional methods require manual and time-consuming development of query models. In this paper, we d…

    Submitted 20 February, 2024; originally announced February 2024.

  22. arXiv:2402.13430  [pdf, other]

    cs.LG cs.AI cs.SI

    LinkSAGE: Optimizing Job Matching Using Graph Neural Networks

    Authors: ** Liu, Haichao Wei, Xiaochen Hou, Jianqiang Shen, Shihai He, Kay Qianqi Shen, Zhujun Chen, Fedor Borisyuk, Daniel Hewlett, Liang Wu, Srikant Veeraraghavan, Alex Tsun, Chengming Jiang, Wen**g Zhang

    Abstract: We present LinkSAGE, an innovative framework that integrates Graph Neural Networks (GNNs) into large-scale personalized job matching systems, designed to address the complex dynamics of LinkedIn's extensive professional network. Our approach capitalizes on a novel job marketplace graph, the largest and most intricate of its kind in industry, with billions of nodes and edges. This graph is not merel…

    Submitted 20 February, 2024; originally announced February 2024.

  23. arXiv:2402.11139  [pdf, other]

    cs.LG cs.AI

    LiGNN: Graph Neural Networks at LinkedIn

    Authors: Fedor Borisyuk, Shihai He, Yunbo Ouyang, Morteza Ramezani, Peng Du, Xiaochen Hou, Chengming Jiang, Nitin Pasumarthy, Priya Bannur, Birjodh Tiwana, ** Liu, Siddharth Dangi, Daqi Sun, Zhoutao Pei, Xiao Shi, Sirou Zhu, Qianqi Shen, Kuang-Hsuan Lee, David Stein, Baolei Li, Haichao Wei, Amol Ghoting, Souvik Ghosh

    Abstract: In this paper, we present LiGNN, a deployed large-scale Graph Neural Networks (GNNs) Framework. We share our insight on developing and deployment of GNNs at large scale at LinkedIn. We present a set of algorithmic improvements to the quality of GNN representation learning including temporal graph architectures with long term losses, effective cold start solutions via graph densification, ID embedd…

    Submitted 16 February, 2024; originally announced February 2024.

  24. arXiv:2402.07659  [pdf, other]

    cs.IR

    Multi-Behavior Collaborative Filtering with Partial Order Graph Convolutional Networks

    Authors: Yijie Zhang, Yuanchen Bei, Hao Chen, Qijie Shen, Zheng Yuan, Huan Gong, Senzhang Wang, Feiran Huang, Xiao Huang

    Abstract: Representing information of multiple behaviors in the single graph collaborative filtering (CF) vector has been a long-standing challenge. This is because different behaviors naturally form separate behavior graphs and learn separate CF embeddings. Existing models merge the separate embeddings by appointing the CF embeddings for some behaviors as the primary embedding and utilizing other auxiliari…

    Submitted 20 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: Accepted by KDD2024

  25. arXiv:2402.07562  [pdf, other]

    cs.CR cs.AI

    Discovering Universal Semantic Triggers for Text-to-Image Synthesis

    Authors: Shengfang Zhai, Weilong Wang, Jiajun Li, Yinpeng Dong, Hang Su, Qingni Shen

    Abstract: Recently text-to-image models have gained widespread attention in the community due to their controllable and high-quality generation ability. However, the robustness of such models and their potential ethical issues have not been fully explored. In this paper, we introduce Universal Semantic Trigger, a meaningless token sequence that can be added at any location within the input text yet can indu…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 9 pages, 5 figures. Work in progress

  26. arXiv:2402.06859  [pdf, other]

    cs.LG cs.AI cs.IR

    LiRank: Industrial Large Scale Ranking Models at LinkedIn

    Authors: Fedor Borisyuk, Mingzhou Zhou, Qingquan Song, Siyu Zhu, Birjodh Tiwana, Ganesh Parameswaran, Siddharth Dangi, Lars Hertel, Qiang Xiao, Xiaochen Hou, Yunbo Ouyang, Aman Gupta, Sheallika Singh, Dan Liu, Hailing Cheng, Lei Le, Jonathan Hung, Sathiya Keerthi, Ruoyan Wang, Fengyu Zhang, Mohit Kothari, Chen Zhu, Daqi Sun, Yun Dai, Xun Luan, et al. (9 additional authors not shown)

    Abstract: We present LiRank, a large-scale ranking framework at LinkedIn that brings to production state-of-the-art modeling architectures and optimization methods. We unveil several modeling improvements, including Residual DCN, which adds attention and residual connections to the famous DCNv2 architecture. We share insights into combining and tuning SOTA architectures to create a unified model, including…

    Submitted 9 February, 2024; originally announced February 2024.

    ACM Class: H.3.3

  27. arXiv:2402.00225  [pdf, other]

    cs.CV

    Geometry aware 3D generation from in-the-wild images in ImageNet

    Authors: Qijia Shen, Guangrun Wang

    Abstract: Generating accurate 3D models is a challenging problem that traditionally requires explicit learning from 3D datasets using supervised learning. Although recent advances have shown promise in learning 3D models from 2D images, these methods often rely on well-structured datasets with multi-view images of each instance or camera pose information. Furthermore, these datasets usually contain clean ba…

    Submitted 1 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

  28. arXiv:2401.14939  [pdf, other]

    cs.IR

    Macro Graph Neural Networks for Online Billion-Scale Recommender Systems

    Authors: Hao Chen, Yuanchen Bei, Qijie Shen, Yue Xu, Sheng Zhou, Wenbing Huang, Feiran Huang, Senzhang Wang, Xiao Huang

    Abstract: Predicting Click-Through Rate (CTR) in billion-scale recommender systems poses a long-standing challenge for Graph Neural Networks (GNNs) due to the overwhelming computational complexity involved in aggregating billions of neighbors. To tackle this, GNN-based CTR models usually sample hundreds of neighbors out of the billions to facilitate efficient online recommendations. However, sampling only a…

    Submitted 8 May, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: 11 pages, 7 figures, accepted by The Web Conference 2024

  29. arXiv:2401.11354  [pdf, other]

    math.PR cs.LG stat.ME

    Squared Wasserstein-2 Distance for Efficient Reconstruction of Stochastic Differential Equations

    Authors: Mingtao Xia, Xiangting Li, Qi**g Shen, Tom Chou

    Abstract: We provide an analysis of the squared Wasserstein-2 ($W_2$) distance between two probability distributions associated with two stochastic differential equations (SDEs). Based on this analysis, we propose the use of squared $W_2$ distance-based loss functions in the \textit{reconstruction} of SDEs from noisy data. To demonstrate the practicality of our Wasserstein distance-based loss functions, w…

    Submitted 20 January, 2024; originally announced January 2024.

    Comments: 37 pages, 5 figures

    MSC Class: 60H10; 49Q22

  30. arXiv:2401.05431  [pdf, other]

    eess.SP cs.AI cs.LG

    TRLS: A Time Series Representation Learning Framework via Spectrogram for Medical Signal Processing

    Authors: Luyuan Xie, Cong Li, Xin Zhang, Shengfang Zhai, Yuejian Fang, Qingni Shen, Zhonghai Wu

    Abstract: Representation learning frameworks in unlabeled time series have been proposed for medical signal processing. Although previous works have made considerable progress, we observe that the representation extracted for the time series still does not generalize well. In this paper, we present a Time series (medical signal) Representation Learning framework via Spectrogram (TRLS) to get m…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: This paper is accepted by ICASSP 2024. This is a more detailed version.

  31. arXiv:2401.04136  [pdf, other]

    cs.CR cs.AI

    The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline

    Authors: Haonan Wang, Qianli Shen, Yao Tong, Yang Zhang, Kenji Kawaguchi

    Abstract: The commercialization of text-to-image diffusion models (DMs) brings forth potential copyright concerns. Despite numerous attempts to protect DMs from copyright issues, the vulnerabilities of these solutions are underexplored. In this study, we formalized the Copyright Infringement Attack on generative AI models and proposed a backdoor attack method, SilentBadDiffusion, to induce copyright infring…

    Submitted 26 May, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

    Comments: Accepted for presentation at ICML 2024

  32. arXiv:2312.00057  [pdf, other]

    cs.CR cs.AI cs.CV cs.MM

    VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models

    Authors: Xiang Li, Qianli Shen, Kenji Kawaguchi

    Abstract: The booming use of text-to-image generative models has raised concerns about their high risk of producing copyright-infringing content. While probabilistic copyright protection methods provide a probabilistic guarantee against such infringement, in this paper, we introduce Virtually Assured Amplification Attack (VA3), a novel online attack framework that exposes the vulnerabilities of these protec…

    Submitted 2 April, 2024; v1 submitted 29 November, 2023; originally announced December 2023.

    Comments: 18 pages, 9 figures. Accepted to CVPR 2024

  33. arXiv:2311.11056  [pdf, other]

    cs.RO cs.LG cs.SE

    Choose Your Simulator Wisely: A Review on Open-source Simulators for Autonomous Driving

    Authors: Yueyuan Li, Wei Yuan, Songan Zhang, Weihao Yan, Qiyuan Shen, Chunxiang Wang, Ming Yang

    Abstract: Simulators play a crucial role in autonomous driving, offering significant time, cost, and labor savings. Over the past few years, the number of simulators for autonomous driving has grown substantially. However, there is a growing concern about the validity of algorithms developed and evaluated in simulators, indicating a need for a thorough analysis of the development status of the simulators.…

    Submitted 26 December, 2023; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: 18 pages, 5 figures, 8 tables

  34. arXiv:2311.00911  [pdf, other]

    cs.NI

    A Lightweight Routing Layer Using a Reliable Link-Layer Protocol

    Authors: Qianfeng Shen, Paul Chow

    Abstract: In today's data centers, the performance of interconnects plays a pivotal role. However, many of the underlying technologies for these interconnects have a history of several decades and existed long before data centers came into being. To better cater to the requirements of data center networks, particularly in the context of intra-rack communication, we have developed a new interconnect. This int…

    Submitted 1 November, 2023; originally announced November 2023.

  35. arXiv:2310.19257  [pdf, other]

    cs.CV

    A High-Resolution Dataset for Instance Detection with Multi-View Instance Capture

    Authors: Qianqian Shen, Yunhan Zhao, Nahyun Kwon, Jeeeun Kim, Yanan Li, Shu Kong

    Abstract: Instance detection (InsDet) is a long-lasting problem in robotics and computer vision, aiming to detect object instances (predefined by some visual examples) in a cluttered scene. Despite its practical significance, its advancement is overshadowed by Object Detection, which aims to detect objects belonging to some predefined classes. One major reason is that current InsDet datasets are too small i…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023, Datasets and Benchmarks Track

  36. arXiv:2310.18004  [pdf, other]

    cs.IR

    Text2Bundle: Towards Personalized Query-based Bundle Generation

    Authors: Shixuan Zhu, Chuan Cui, JunTong Hu, Qi Shen, Yu Ji, Zhihua Wei

    Abstract: Bundle generation aims to provide a bundle of items for the user, and has been widely studied and applied on online service platforms. Existing bundle generation methods mainly utilized user's preference from historical interactions in common recommendation paradigm, and ignored the potential textual query which is user's current explicit intention. There can be a scenario in which a user proactiv…

    Submitted 27 October, 2023; originally announced October 2023.

  37. arXiv:2310.13365  [pdf, other]

    cs.IR

    Towards Multi-Subsession Conversational Recommendation

    Authors: Yu Ji, Qi Shen, Shixuan Zhu, Hang Yu, Yiming Zhang, Chuan Cui, Zhihua Wei

    Abstract: Conversational recommendation systems (CRS) could acquire dynamic user preferences towards desired items through multi-round interactive dialogue. Previous CRS mainly focuses on the single conversation (subsession) that user quits after a successful recommendation, neglecting the common scenario where user has multiple conversations (multi-subsession) over a short period. Therefore, we propose a n…

    Submitted 20 October, 2023; originally announced October 2023.

  38. arXiv:2310.10563  [pdf, other]

    cs.CV cs.LG

    RefConv: Re-parameterized Refocusing Convolution for Powerful ConvNets

    Authors: Zhicheng Cai, Xiaohan Ding, Qiu Shen, Xun Cao

    Abstract: We propose Re-parameterized Refocusing Convolution (RefConv) as a replacement for regular convolutional layers, which is a plug-and-play module to improve the performance without any inference costs. Specifically, given a pre-trained model, RefConv applies a trainable Refocusing Transformation to the basis kernels inherited from the pre-trained model to establish connections among the parameters.…

    Submitted 16 October, 2023; originally announced October 2023.

  39. arXiv:2310.06923  [pdf, other]

    cs.AI cs.LG

    PICProp: Physics-Informed Confidence Propagation for Uncertainty Quantification

    Authors: Qianli Shen, Wai Hoh Tang, Zhun Deng, Apostolos Psaros, Kenji Kawaguchi

    Abstract: Standard approaches for uncertainty quantification in deep learning and physics-informed learning have persistent limitations. Indicatively, strong assumptions regarding the data likelihood are required, the performance highly depends on the selection of priors, and the posterior can be sampled only approximately, which leads to poor approximations because of the associated computational cost. Thi…

    Submitted 20 October, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023. Code is available at https://github.com/ShenQianli/PICProp

  40. arXiv:2309.16943  [pdf, other]

    cs.LG eess.SY

    Physics-Informed Induction Machine Modelling

    Authors: Qing Shen, Yifan Zhou, Peng Zhang

    Abstract: This rapid communication devises a Neural Induction Machine (NeuIM) model, which pilots the use of physics-informed machine learning to enable AI-based electromagnetic transient simulations. The contributions are threefold: (1) a formation of NeuIM to represent the induction machine in phase domain; (2) a physics-informed neural network capable of capturing fast and slow IM dynamics even in the ab…

    Submitted 28 September, 2023; originally announced September 2023.

  41. arXiv:2309.16131  [pdf, ps, other]

    cs.LG cs.NE math.SP

    A Spectral Approach for Learning Spatiotemporal Neural Differential Equations

    Authors: Mingtao Xia, Xiangting Li, Qi**g Shen, Tom Chou

    Abstract: Rapidly developing machine learning methods have stimulated research interest in computationally reconstructing differential equations (DEs) from observational data which may provide additional insight into underlying causative mechanisms. In this paper, we propose a novel neural-ODE based method that uses spectral expansions in space to learn spatiotemporal DEs. The major advantage of our spectral…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: 21 pages, 5 figures

  42. arXiv:2308.13820  [pdf, other]

    cs.IR

    Video and Audio are Images: A Cross-Modal Mixer for Original Data on Video-Audio Retrieval

    Authors: Zichen Yuan, Qi Shen, Bingyi Zheng, Yuting Liu, Linying Jiang, Guibing Guo

    Abstract: Cross-modal retrieval has become popular in recent years, particularly with the rise of multimedia. Generally, the information from each modality exhibits distinct representations and semantic information, so features encoded with a dual-tower architecture tend to lie in separate latent spaces, which makes it difficult to establish semantic relationships between modalities, resulting in poor re…

    Submitted 26 August, 2023; originally announced August 2023.

  43. arXiv:2308.08259  [pdf, other]

    cs.LG

    Graph Relation Aware Continual Learning

    Authors: Qinghua Shen, Weijieying Ren, Wei Qin

    Abstract: Continual graph learning (CGL) studies the problem of learning from an infinite stream of graph data, consolidating historical knowledge, and generalizing it to future tasks. At any one time, only the current graph data are available. Although some recent attempts have been made to handle this task, we still face two potential challenges: 1) most existing works only manipulate the intermediate graph…

    Submitted 16 August, 2023; originally announced August 2023.

  44. arXiv:2308.07622  [pdf, other]

    cs.MM

    EMID: An Emotional Aligned Dataset in Audio-Visual Modality

    Authors: Jialing Zou, Jiahao Mei, Guangze Ye, Tianyu Huai, Qiwei Shen, Daoguo Dong

    Abstract: In this paper, we propose the Emotionally paired Music and Image Dataset (EMID), a novel dataset designed for the emotional matching of music and images, to facilitate auditory-visual cross-modal tasks such as generation and retrieval. Unlike existing approaches that primarily focus on semantic correlations or roughly divided emotional relations, EMID emphasizes the significance of emotional consisten…

    Submitted 15 August, 2023; originally announced August 2023.

  45. arXiv:2306.14119  [pdf, other]

    eess.IV cs.CV

    SHISRCNet: Super-resolution And Classification Network For Low-resolution Breast Cancer Histopathology Image

    Authors: Luyuan Xie, Cong Li, Zirui Wang, Xin Zhang, Boyan Chen, Qingni Shen, Zhonghai Wu

    Abstract: The rapid identification and accurate diagnosis of breast cancer, known as the killer of women, have become greatly significant for those patients. Numerous breast cancer histopathological image classification methods have been proposed, but they still suffer from two problems. (1) These methods can only handle high-resolution (HR) images. However, the low-resolution (LR) images are often collected…

    Submitted 25 June, 2023; originally announced June 2023.

    Comments: Accepted by MICCAI 2023

  46. arXiv:2306.10536  [pdf, other]

    cs.CV

    Learn to Enhance the Negative Information in Convolutional Neural Network

    Authors: Zhicheng Cai, Chenglei Peng, Qiu Shen

    Abstract: This paper proposes a learnable nonlinear activation mechanism for convolutional neural networks (CNNs), termed LENI, which learns to enhance the negative information in CNNs. In sharp contrast to ReLU, which cuts off the negative neurons and suffers from the "dying ReLU" issue, LENI enjoys the capacity to reconstruct the dead neurons and reduce the information loss. Compared to i…

    Submitted 18 June, 2023; originally announced June 2023.

    Comments: ICIG 2023

  47. arXiv:2306.09264  [pdf, other]

    cs.CV

    Harvard Glaucoma Fairness: A Retinal Nerve Disease Dataset for Fairness Learning and Fair Identity Normalization

    Authors: Yan Luo, Yu Tian, Min Shi, Louis R. Pasquale, Lucy Q. Shen, Nazlee Zebardast, Tobias Elze, Mengyu Wang

    Abstract: Fairness (also referred to interchangeably as equity) in machine learning is important for societal well-being, but limited public datasets hinder its progress. Currently, no dedicated public medical datasets with imaging data for fairness learning are available, though minority groups suffer from more health issues. To address this gap, we introduce Harvard Glaucoma Fairness (Harvard-GF), a retinal ner…

    Submitted 10 March, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: Accepted in IEEE Transactions on Medical Imaging

  48. arXiv:2306.06365  [pdf, other]

    cs.CV

    FalconNet: Factorization for the Light-weight ConvNets

    Authors: Zhicheng Cai, Qiu Shen

    Abstract: Designing light-weight CNN models with few parameters and FLOPs is a prominent research concern. However, three significant issues persist in the current light-weight CNNs: i) the lack of architectural consistency leads to redundancy and hindered capacity comparison, as well as the ambiguity in causation between architectural choices and performance enhancement; ii) the utilization of a single-…

    Submitted 10 June, 2023; originally announced June 2023.

  49. arXiv:2305.12420  [pdf, other]

    cs.IR

    Multi-factor Sequential Re-ranking with Perception-Aware Diversification

    Authors: Yue Xu, Hao Chen, Zefan Wang, Jianwen Yin, Qijie Shen, Dimin Wang, Feiran Huang, Lixiang Lai, Tao Zhuang, Junfeng Ge, Xia Hu

    Abstract: Feed recommendation systems, which recommend a sequence of items for users to browse and interact with, have gained significant popularity in practical applications. In feed products, users tend to browse a large number of items in succession, so the previously viewed items have a significant impact on users' behavior towards the following items. Therefore, traditional methods that mainly focus on…

    Submitted 21 May, 2023; originally announced May 2023.

    Journal ref: KDD 2023

  50. arXiv:2305.12319  [pdf, other]

    cs.IR

    Multi-channel Integrated Recommendation with Exposure Constraints

    Authors: Yue Xu, Qijie Shen, Jianwen Yin, Zengde Deng, Dimin Wang, Hao Chen, Lixiang Lai, Tao Zhuang, Junfeng Ge

    Abstract: Integrated recommendation, which aims at jointly recommending heterogeneous items from different channels in a main feed, has been widely applied to various online platforms. Though attractive, integrated recommendation requires the ranking methods to migrate from conventional user-item models to the new user-channel-item paradigm in order to better capture users' preferences on both item and chan…

    Submitted 20 May, 2023; originally announced May 2023.

    Journal ref: KDD 2023