Showing 1–50 of 70 results for author: Shan, C

Searching in archive cs.
  1. arXiv:2406.18862  [pdf, other]

    cs.SD eess.AS

    Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study

    Authors: Peikun Chen, Sining Sun, Changhao Shan, Qing Yang, Lei Xie

    Abstract: Unified speech-text models like SpeechGPT, VioLA, and AudioPaLM have shown impressive performance across various speech-related tasks, especially in Automatic Speech Recognition (ASR). These models typically adopt a unified method to model discrete speech and text tokens, followed by training a decoder-only transformer. However, they are all designed for non-streaming ASR tasks, where the entire s…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted for Interspeech 2024

  2. arXiv:2406.00341  [pdf, other]

    eess.IV cs.CV

    DSCA: A Digital Subtraction Angiography Sequence Dataset and Spatio-Temporal Model for Cerebral Artery Segmentation

    Authors: Qihang Xie, Mengguo Guo, Lei Mou, Dan Zhang, Da Chen, Caifeng Shan, Yitian Zhao, Ruisheng Su, Jiong Zhang

    Abstract: Cerebrovascular diseases (CVDs) remain a leading cause of global disability and mortality. Digital Subtraction Angiography (DSA) sequences, recognized as the golden standard for diagnosing CVDs, can clearly visualize the dynamic flow and reveal pathological conditions within the cerebrovasculature. Therefore, precise segmentation of cerebral arteries (CAs) and classification between their main tru…

    Submitted 1 June, 2024; originally announced June 2024.

  3. arXiv:2405.19119  [pdf, other]

    cs.LG

    Can Graph Learning Improve Task Planning?

    Authors: Xixi Wu, Yifei Shen, Caihua Shan, Kaitao Song, Siwei Wang, Bohang Zhang, Jiarui Feng, Hong Cheng, Wei Chen, Yun Xiong, Dongsheng Li

    Abstract: Task planning is emerging as an important research topic alongside the development of large language models (LLMs). It aims to break down complex user requests into solvable sub-tasks, thereby fulfilling the original requests. In this context, the sub-tasks can be naturally viewed as a graph, where the nodes represent the sub-tasks, and the edges denote the dependencies among them. Consequently, t…

    Submitted 29 May, 2024; originally announced May 2024.

  4. arXiv:2405.04902  [pdf, other]

    eess.IV cs.CV

    HAGAN: Hybrid Augmented Generative Adversarial Network for Medical Image Synthesis

    Authors: Zhihan Ju, Wanting Zhou, Longteng Kong, Yu Chen, Yi Li, Zhenan Sun, Caifeng Shan

    Abstract: Medical Image Synthesis (MIS) plays an important role in the intelligent medical field, which greatly saves the economic and time costs of medical diagnosis. However, due to the complexity of medical images and similar characteristics of different tissue cells, existing methods face great challenges in meeting their biological consistency. To this end, we propose the Hybrid Augmented Generative Ad…

    Submitted 8 May, 2024; originally announced May 2024.

  5. arXiv:2403.08258  [pdf, other]

    cs.CL cs.LG

    Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition

    Authors: Wenjing Zhu, Sining Sun, Changhao Shan, Peng Fan, Qing Yang

    Abstract: Conformer-based attention models have become the de facto backbone model for Automatic Speech Recognition tasks. A blank symbol is usually introduced to align the input and output sequences for CTC or RNN-T models. Unfortunately, the long input length overloads computational budget and memory consumption quadratically by attention mechanism. In this work, we propose a "Skip-and-Recover" Conformer…

    Submitted 20 May, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted by ICME2024

  6. arXiv:2401.09769  [pdf, other]

    cs.SI cs.AI cs.LG

    Learning from Graphs with Heterophily: Progress and Future

    Authors: Chenghua Gong, Yao Cheng, Xiang Li, Caihua Shan, Siqiang Luo

    Abstract: Graphs are structured data that models complex relations between real-world entities. Heterophilous graphs, where linked nodes are prone to be with different labels or dissimilar features, have recently attracted significant attention and found many applications. Meanwhile, increasing efforts have been made to advance learning from heterophilous graphs. Although there exist surveys on the relevant…

    Submitted 1 February, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 9 pages

  7. arXiv:2401.01217  [pdf, other]

    cs.DC

    KCES: A Workflow Containerization Scheduling Scheme Under Cloud-Edge Collaboration Framework

    Authors: Chenggang Shan, Runze Gao, Qinghua Han, Zhen Yang, Jinhui Zhang, Yuanqing Xia

    Abstract: As more IoT applications gradually move towards the cloud-edge collaborative mode, the containerized scheduling of workflows extends from the cloud to the edge. However, given the high delay of the communication network, loose coupling of structure, and resource heterogeneity between cloud and edge, workflow containerization scheduling in the cloud-edge scenarios faces the difficulty of resource c…

    Submitted 2 January, 2024; originally announced January 2024.

  8. arXiv:2311.14304  [pdf, other]

    cs.LG

    AdaMedGraph: Adaboosting Graph Neural Networks for Personalized Medicine

    Authors: Jie Lian, Xufang Luo, Caihua Shan, Dongqi Han, Varut Vardhanabhuti, Dongsheng Li

    Abstract: Precision medicine tailored to individual patients has gained significant attention in recent times. Machine learning techniques are now employed to process personalized data from various sources, including images, genetics, and assessments. These techniques have demonstrated good outcomes in many clinical prediction tasks. Notably, the approach of constructing graphs by linking similar patients a…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 9 pages

  9. arXiv:2311.02832  [pdf, other]

    cs.LG

    Prioritized Propagation in Graph Neural Networks

    Authors: Yao Cheng, Minjie Chen, Xiang Li, Caihua Shan, Ming Gao

    Abstract: Graph neural networks (GNNs) have recently received significant attention. Learning node-wise message propagation in GNNs aims to set personalized propagation steps for different nodes in the graph. Despite the success, existing methods ignore node priority that can be reflected by node influence and heterophily. In this paper, we propose a versatile framework PPro, which can be integrated with mo…

    Submitted 5 November, 2023; originally announced November 2023.

  10. Resurrecting Label Propagation for Graphs with Heterophily and Label Noise

    Authors: Yao Cheng, Caihua Shan, Yifei Shen, Xiang Li, Siqiang Luo, Dongsheng Li

    Abstract: Label noise is a common challenge in large datasets, as it can significantly degrade the generalization ability of deep neural networks. Most existing studies focus on noisy labels in computer vision; however, graph models encompass both node features and graph topology as input, and become more susceptible to label noise through message-passing mechanisms. Recently, only a few works have been pro…

    Submitted 12 June, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

    Comments: To appear in KDD2024

  11. arXiv:2310.14954  [pdf, other]

    cs.SD cs.CL eess.AS

    Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition

    Authors: Peng Fan, Changhao Shan, Sining Sun, Qing Yang, Jianwei Zhang

    Abstract: Recently, Conformer as a backbone network for end-to-end automatic speech recognition achieved state-of-the-art performance. The Conformer block leverages a self-attention mechanism to capture global information, along with a convolutional neural network to capture local information, resulting in improved performance. However, the Conformer-based model encounters an issue with the self-attention m…

    Submitted 28 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: This manuscript has been accepted by IEEE Signal Processing Letters for publication

  12. arXiv:2308.07013  [pdf, other]

    cs.DB cs.LG

    Learning to Optimize LSM-trees: Towards A Reinforcement Learning based Key-Value Store for Dynamic Workloads

    Authors: Dingheng Mo, Fanchao Chen, Siqiang Luo, Caihua Shan

    Abstract: LSM-trees are widely adopted as the storage backend of key-value stores. However, optimizing the system performance under dynamic workloads has not been sufficiently studied or evaluated in previous work. To fill the gap, we present RusKey, a key-value store with the following new features: (1) RusKey is a first attempt to orchestrate LSM-tree structures online to enable robust performance under t…

    Submitted 17 September, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: 25 pages, 13 figures

  13. arXiv:2307.12262  [pdf, other]

    cs.SD cs.CL cs.HC eess.AS

    A meta learning scheme for fast accent domain expansion in Mandarin speech recognition

    Authors: Ziwei Zhu, Changhao Shan, Bihong Zhang, Jian Yu

    Abstract: Spoken languages show significant variation across mandarin and accent. Despite the high performance of mandarin automatic speech recognition (ASR), accent ASR is still a challenge task. In this paper, we introduce meta-learning techniques for fast accent domain expansion in mandarin speech recognition, which expands the field of accents without deteriorating the performance of mandarin ASR. Meta-…

    Submitted 23 July, 2023; originally announced July 2023.

  14. arXiv:2304.04982  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Biological Factor Regulatory Neural Network

    Authors: Xinnan Dai, Caihua Shan, Jie Zheng, Xiaoxiao Li, Dongsheng Li

    Abstract: Genes are fundamental for analyzing biological systems and many recent works proposed to utilize gene expression for various biological tasks by deep learning models. Despite their promising performance, it is hard for deep neural networks to provide biological insights for humans due to their black-box nature. Recently, some works integrated biological knowledge with neural networks to improve th…

    Submitted 11 April, 2023; originally announced April 2023.

  15. arXiv:2303.13182  [pdf, other]

    cs.RO cs.AI cs.CV

    CMG-Net: An End-to-End Contact-Based Multi-Finger Dexterous Grasping Network

    Authors: Mingze Wei, Yaomin Huang, Zhiyuan Xu, Ning Liu, Zhengping Che, Xinyu Zhang, Chaomin Shen, Feifei Feng, Chun Shan, Jian Tang

    Abstract: In this paper, we propose a novel representation for grasping using contacts between multi-finger robotic hands and objects to be manipulated. This representation significantly reduces the prediction dimensions and accelerates the learning process. We present an effective end-to-end network, CMG-Net, for grasping unknown objects in a cluttered environment by efficiently predicting multi-finger gra…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: The first two authors are with equal contributions. Paper accepted by ICRA 2023

  16. arXiv:2301.12458  [pdf, other]

    cs.LG

    SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking

    Authors: Xiang Li, Tiandi Ye, Caihua Shan, Dongsheng Li, Ming Gao

    Abstract: Generative graph self-supervised learning (SSL) aims to learn node representations by reconstructing the input graph data. However, most existing methods focus on unsupervised learning tasks only and very few work has shown its superiority over the state-of-the-art graph contrastive learning (GCL) models, especially on the classification task. While a very recent model has been proposed to bridge…

    Submitted 7 February, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

    Comments: Accepted by WebConf 2023

  17. Adaptive Resource Allocation for Workflow Containerization on Kubernetes

    Authors: Chenggang Shan, Chuge Wu, Yuanqing Xia, Zehua Guo, Danyang Liu, Jinhui Zhang

    Abstract: In a cloud-native era, the Kubernetes-based workflow engine enables workflow containerized execution through the inherent abilities of Kubernetes. However, when encountering continuous workflow requests and unexpected resource request spikes, the engine is limited to the current workflow load information for resource allocation, which lacks the agility and predictability of resource allocation, re…

    Submitted 19 January, 2023; originally announced January 2023.

    Journal ref: Journal of Systems Engineering and Electronics, 2023:34(3), 723-743

  18. CLARE: A Semi-supervised Community Detection Algorithm

    Authors: Xixi Wu, Yun Xiong, Yao Zhang, Yizhu Jiao, Caihua Shan, Yiheng Sun, Yangyong Zhu, Philip S. Yu

    Abstract: Community detection refers to the task of discovering closely related subgraphs to understand the networks. However, traditional community detection algorithms fail to pinpoint a particular kind of community. This limits its applicability in real-world networks, e.g., distinguishing fraud groups from normal ones in transaction networks. Recently, semi-supervised community detection emerges as a so…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: Accepted by KDD'2022

  19. Exact Recursive Probabilistic Programming

    Authors: David Chiang, Colin McDonald, Chung-chieh Shan

    Abstract: Recursive calls over recursive data are useful for generating probability distributions, and probabilistic programming allows computations over these distributions to be expressed in a modular and intuitive way. Exact inference is also useful, but unfortunately, existing probabilistic programming languages do not perform exact inference on recursive calls over recursive data, forcing programmers t…

    Submitted 27 March, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Journal ref: Proc. ACM Program. Lang. 7, OOPSLA1, Article 98 (April 2023)

  20. arXiv:2208.07211  [pdf, other]

    cs.LG cs.AI

    RuDi: Explaining Behavior Sequence Models by Automatic Statistics Generation and Rule Distillation

    Authors: Yao Zhang, Yun Xiong, Yiheng Sun, Caihua Shan, Tian Lu, Hui Song, Yangyong Zhu

    Abstract: Risk scoring systems have been widely deployed in many applications, which assign risk scores to users according to their behavior sequences. Though many deep learning methods with sophisticated designs have achieved promising results, the black-box nature hinders their applications due to fairness, explainability, and compliance consideration. Rule-based systems are considered reliable in these s…

    Submitted 16 August, 2022; v1 submitted 12 August, 2022; originally announced August 2022.

    Comments: CIKM'2022. Codes: https://github.com/yzhang1918/cikm2022rudi

  21. arXiv:2207.12995  [pdf, other]

    cs.CV

    Exploring Generalizable Distillation for Efficient Medical Image Segmentation

    Authors: Xingqun Qi, Zhuojie Wu, Min Ren, Muyi Sun, Caifeng Shan, Zhenan Sun

    Abstract: Efficient medical image segmentation aims to provide accurate pixel-wise predictions for medical images with a lightweight implementation framework. However, lightweight frameworks generally fail to achieve superior performance and suffer from poor generalizable ability on cross-domain tasks. In this paper, we explore the generalizable knowledge distillation for the efficient segmentation of cross…

    Submitted 20 February, 2023; v1 submitted 26 July, 2022; originally announced July 2022.

    Comments: Under Review

  22. arXiv:2207.01222  [pdf, other]

    cs.DC

    KubeAdaptor: A Docking Framework for Workflow Containerization on Kubernetes

    Authors: Chenggang Shan, Guan Wang, Yuanqing Xia, Yufeng Zhan, Jinhui Zhang

    Abstract: As Kubernetes becomes the infrastructure of the cloud-native era, the integration of workflow systems with Kubernetes is gaining more and more popularity. To our knowledge, workflow systems employ scheduling algorithms that optimize task execution order of workflow to improve performance and execution efficiency. However, due to its inherent scheduling mechanism, Kubernetes does not execute contai…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: Workflow System, Containerization, Task Scheduling, Event Trigger, Kubernetes

  23. arXiv:2206.10066  [pdf, other]

    cs.CV

    RendNet: Unified 2D/3D Recognizer With Latent Space Rendering

    Authors: Ruoxi Shi, Xinyang Jiang, Caihua Shan, Yansen Wang, Dongsheng Li

    Abstract: Vector graphics (VG) have been ubiquitous in our daily life with vast applications in engineering, architecture, designs, etc. The VG recognition process of most existing methods is to first render the VG into raster graphics (RG) and then conduct recognition based on RG formats. However, this procedure discards the structure of geometries and loses the high resolution of VG. Recently, another cat…

    Submitted 20 June, 2022; originally announced June 2022.

    Comments: CVPR 2022 Oral

  24. arXiv:2205.07308  [pdf, other]

    cs.LG cs.AI

    Finding Global Homophily in Graph Neural Networks When Meeting Heterophily

    Authors: Xiang Li, Renyu Zhu, Yao Cheng, Caihua Shan, Siqiang Luo, Dongsheng Li, Weining Qian

    Abstract: We investigate graph neural networks on graphs with heterophily. Some existing methods amplify a node's neighborhood with multi-hop neighbors to include more nodes with homophily. However, it is a significant challenge to set personalized neighborhood sizes for different nodes. Further, for other homophilous nodes excluded in the neighborhood, they are ignored for information aggregation. To addre…

    Submitted 15 May, 2022; originally announced May 2022.

    Comments: To appear in ICML 2022

  25. CMMD: Cross-Metric Multi-Dimensional Root Cause Analysis

    Authors: Shifu Yan, Caihua Shan, Wenyi Yang, Bixiong Xu, Dongsheng Li, Lili Qiu, Jie Tong, Qi Zhang

    Abstract: In large-scale online services, crucial metrics, a.k.a., key performance indicators (KPIs), are monitored periodically to check their running statuses. Generally, KPIs are aggregated along multiple dimensions and derived by complex calculations among fundamental metrics from the raw data. Once abnormal KPI values are observed, root cause analysis (RCA) can be applied to identify the reasons for an…

    Submitted 30 March, 2022; originally announced March 2022.

  26. arXiv:2201.01592  [pdf, other]

    cs.CV

    Biphasic Face Photo-Sketch Synthesis via Semantic-Driven Generative Adversarial Network with Graph Representation Learning

    Authors: Xingqun Qi, Muyi Sun, Zijian Wang, Jiaming Liu, Qi Li, Fang Zhao, Shanghang Zhang, Caifeng Shan

    Abstract: Biphasic face photo-sketch synthesis has significant practical value in wide-ranging fields such as digital entertainment and law enforcement. Previous approaches directly generate the photo-sketch in a global view, they always suffer from the low quality of sketches and complex photo variations, leading to unnatural and low-fidelity results. In this paper, we propose a novel Semantic-Driven Gener…

    Submitted 3 December, 2023; v1 submitted 5 January, 2022; originally announced January 2022.

    Comments: Accepted to IEEE TNNLS

  27. arXiv:2111.03281  [pdf, other]

    cs.CV eess.IV

    Recognizing Vector Graphics without Rasterization

    Authors: Xinyang Jiang, Lu Liu, Caihua Shan, Yifei Shen, Xuanyi Dong, Dongsheng Li

    Abstract: In this paper, we consider a different data format for images: vector graphics. In contrast to raster graphics which are widely used in image recognition, vector graphics can be scaled up or down into any resolution without aliasing or information loss, due to the analytic representation of the primitives in the document. Furthermore, vector graphics are able to give extra structural information o…

    Submitted 23 December, 2021; v1 submitted 5 November, 2021; originally announced November 2021.

    Comments: Accepted by NeurIPS2021

  28. arXiv:2108.07567  [pdf, ps, other]

    cs.IR cs.LG eess.SP

    How Powerful is Graph Convolution for Recommendation?

    Authors: Yifei Shen, Yongji Wu, Yao Zhang, Caihua Shan, Jun Zhang, Khaled B. Letaief, Dongsheng Li

    Abstract: Graph convolutional networks (GCNs) have recently enabled a popular class of algorithms for collaborative filtering (CF). Nevertheless, the theoretical underpinnings of their empirical successes remain elusive. In this paper, we endeavor to obtain a better understanding of GCN-based CF methods via the lens of graph signal processing. By identifying the critical role of smoothness, a key concept in…

    Submitted 17 August, 2021; originally announced August 2021.

  29. Medical Instrument Segmentation in 3D US by Hybrid Constrained Semi-Supervised Learning

    Authors: Hongxu Yang, Caifeng Shan, R. Arthur Bouwman, Lukas R. C. Dekker, Alexander F. Kolen, Peter H. N. de With

    Abstract: Medical instrument segmentation in 3D ultrasound is essential for image-guided intervention. However, to train a successful deep neural network for instrument segmentation, a large number of labeled images are required, which is expensive and time-consuming to obtain. In this article, we propose a semi-supervised learning (SSL) framework for instrument segmentation in 3D US, which requires much le…

    Submitted 30 July, 2021; originally announced July 2021.

    Comments: Accepted by IEEE JBHI

  30. arXiv:2107.05917  [pdf, other]

    cs.LG cs.AI

    Towards Representation Identical Privacy-Preserving Graph Neural Network via Split Learning

    Authors: Chuanqiang Shan, Huiyun Jiao, Jie Fu

    Abstract: In recent years, the fast rise in number of studies on graph neural network (GNN) has put it from the theories research to reality application stage. Despite the encouraging performance achieved by GNN, less attention has been paid to the privacy-preserving training and inference over distributed graph data in the related literature. Due to the particularity of graph structure, it is challenging t…

    Submitted 13 July, 2021; originally announced July 2021.

  31. arXiv:2106.15125  [pdf, other]

    cs.CV

    Constructing Stronger and Faster Baselines for Skeleton-based Action Recognition

    Authors: Yi-Fan Song, Zhang Zhang, Caifeng Shan, Liang Wang

    Abstract: One essential problem in skeleton-based action recognition is how to extract discriminative features over all skeleton joints. However, the complexity of the recent State-Of-The-Art (SOTA) models for this task tends to be exceedingly sophisticated and over-parameterized. The low efficiency in model training and inference has increased the validation costs of model architectures in large-scale data…

    Submitted 3 March, 2022; v1 submitted 29 June, 2021; originally announced June 2021.

    Comments: 15 pages, 12 tables, 10 figures, Accepted by IEEE T-PAMI. arXiv admin note: text overlap with arXiv:2010.09978

  32. arXiv:2106.15121  [pdf, other]

    cs.CV

    Face Sketch Synthesis via Semantic-Driven Generative Adversarial Network

    Authors: Xingqun Qi, Muyi Sun, Weining Wang, Xiaoxiao Dong, Qi Li, Caifeng Shan

    Abstract: Face sketch synthesis has made significant progress with the development of deep neural networks in these years. The delicate depiction of sketch portraits facilitates a wide range of applications like digital entertainment and law enforcement. However, accurate and realistic face sketch generation is still a challenging task due to the illumination variations and complex backgrounds in the real s…

    Submitted 29 June, 2021; originally announced June 2021.

  33. arXiv:2106.14356  [pdf, ps, other]

    cs.IT cs.NI

    Collaborative Edge Learning in MIMO-NOMA Uplink Transmission Environment

    Authors: Mian Guo, Chun Shan, Mithun Mukherjee, Jaime Lloret, Quansheng Guan

    Abstract: Multiple-input multiple-output non-orthogonal multiple access (MIMO-NOMA) cellular network is promising for supporting massive connectivity. This paper exploits low-latency machine learning in the MIMO-NOMA uplink transmission environment, where a substantial amount of data must be uploaded from multiple data sources to a one-hop away edge server for machine learning. A delay-aware edge learning f…

    Submitted 27 June, 2021; originally announced June 2021.

    Comments: 5 pages, 3 figures, accepted for publication in IEEE/CIC ICCC 2021

  34. arXiv:2010.12071  [pdf, other]

    cs.PL cs.LG

    Translating Recursive Probabilistic Programs to Factor Graph Grammars

    Authors: David Chiang, Chung-chieh Shan

    Abstract: It is natural for probabilistic programs to use conditionals to express alternative substructures in models, and loops (recursion) to express repeated substructures in models. Thus, probabilistic programs with conditionals and recursion motivate ongoing interest in efficient and general inference. A factor graph grammar (FGG) generates a set of factor graphs that do not all need to be enumerated i…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: Extended abstract of presentation at PROBPROG 2020

  35. Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition

    Authors: Yi-Fan Song, Zhang Zhang, Caifeng Shan, Liang Wang

    Abstract: One essential problem in skeleton-based action recognition is how to extract discriminative features over all skeleton joints. However, the complexity of the State-Of-The-Art (SOTA) models of this task tends to be exceedingly sophisticated and over-parameterized, where the low efficiency in model training and inference has obstructed the development in the field, especially for large-scale action…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: Accepted by ACM MultiMedia 2020, 9 pages, 4 figures, 5 tables

  36. Weakly-supervised Learning For Catheter Segmentation in 3D Frustum Ultrasound

    Authors: Hongxu Yang, Caifeng Shan, Alexander F. Kolen, Peter H. N. de With

    Abstract: Accurate and efficient catheter segmentation in 3D ultrasound (US) is essential for cardiac intervention. Currently, the state-of-the-art segmentation algorithms are based on convolutional neural networks (CNNs), which achieved remarkable performances in a standard Cartesian volumetric data. Nevertheless, these approaches suffer the challenges of low efficiency and GPU unfriendly image size. There…

    Submitted 19 October, 2020; originally announced October 2020.

  37. arXiv:2010.09015  [pdf, other]

    cs.CV

    Tracklets Predicting Based Adaptive Graph Tracking

    Authors: Chaobing Shan, Chunbo Wei, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Xiaoliang Cheng, Kewei Liang

    Abstract: Most of the existing tracking methods link the detected boxes to the tracklets using a linear combination of feature cosine distances and box overlap. But the problem of inconsistent features of an object in two different frames still exists. In addition, when extracting features, only appearance information is utilized, neither the location relationship nor the information of the tracklets is con…

    Submitted 19 November, 2020; v1 submitted 18 October, 2020; originally announced October 2020.

  38. Richly Activated Graph Convolutional Network for Robust Skeleton-based Action Recognition

    Authors: Yi-Fan Song, Zhang Zhang, Caifeng Shan, Liang Wang

    Abstract: Current methods for skeleton-based human action recognition usually work with complete skeletons. However, in real scenarios, it is inevitable to capture incomplete or noisy skeletons, which could significantly deteriorate the performance of current methods when some informative joints are occluded or disturbed. To improve the robustness of action recognition models, a multi-stream graph convoluti…

    Submitted 25 November, 2020; v1 submitted 9 August, 2020; originally announced August 2020.

    Comments: Accepted by IEEE T-CSVT, 11 pages, 6 figures, 10 tables

  39. arXiv:2007.07788  [pdf, other]

    cs.CV

    CANet: Context Aware Network for 3D Brain Glioma Segmentation

    Authors: Zhihua Liu, Lei Tong, Long Chen, Feixiang Zhou, Zheheng Jiang, Qianni Zhang, Yinhai Wang, Caifeng Shan, Ling Li, Huiyu Zhou

    Abstract: Automated segmentation of brain glioma plays an active role in diagnosis decision, progression monitoring and surgery planning. Based on deep neural networks, previous studies have shown promising technologies for brain glioma segmentation. However, these approaches lack powerful strategies to incorporate contextual information of tumor cells and their surrounding, which has been proven as a funda…

    Submitted 22 March, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

  40. arXiv:2007.04807  [pdf, other]

    eess.IV cs.CV physics.med-ph

    Medical Instrument Detection in Ultrasound-Guided Interventions: A Review

    Authors: Hongxu Yang, Caifeng Shan, Alexander F. Kolen, Peter H. N. de With

    Abstract: Medical instrument detection is essential for computer-assisted interventions since it would facilitate the surgeons to find the instrument efficiently with a better interpretation, which leads to a better outcome. This article reviews medical instrument detection methods in the ultrasound-guided intervention. First, we present a comprehensive review of instrument detection methodologies, which in…

    Submitted 1 February, 2021; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: Draft paper

  41. arXiv:2006.14702  [pdf, other]

    eess.IV cs.CV cs.LG

    Deep Q-Network-Driven Catheter Segmentation in 3D US by Hybrid Constrained Semi-Supervised Learning and Dual-UNet

    Authors: Hongxu Yang, Caifeng Shan, Alexander F. Kolen, Peter H. N. de With

    Abstract: Catheter segmentation in 3D ultrasound is important for computer-assisted cardiac intervention. However, a large amount of labeled images are required to train a successful deep convolutional neural network (CNN) to segment the catheter, which is expensive and time-consuming. In this paper, we propose a novel catheter segmentation approach, which requests fewer annotations than the supervised lear…

    Submitted 25 June, 2020; originally announced June 2020.

    Comments: Accepted by MICCAI 2020

  42. arXiv:2006.04435  [pdf, other]

    cs.LG cs.AI stat.ML

    CAST: A Correlation-based Adaptive Spectral Clustering Algorithm on Multi-scale Data

    Authors: Xiang Li, Ben Kao, Caihua Shan, Dawei Yin, Martin Ester

    Abstract: We study the problem of applying spectral clustering to cluster multi-scale data, which is data whose clusters are of various sizes and densities. Traditional spectral clustering techniques discover clusters by processing a similarity matrix that reflects the proximity of objects. For multi-scale data, distance-based similarity is not effective because objects of a sparse cluster could be far apar…

    Submitted 8 June, 2020; originally announced June 2020.

  43. Sham: A DSL for Fast DSLs

    Authors: Rajan Walia, Chung-chieh Shan, Sam Tobin-Hochstadt

    Abstract: Domain-specific languages (DSLs) are touted as both easy to embed in programs and easy to optimize. Yet these goals are often in tension. Embedded or internal DSLs fit naturally with a host language, while inheriting the host's performance characteristics. External DSLs can use external optimizers and languages but sit apart from the host. We present Sham, a toolkit designed to enable internal DSL…

    Submitted 15 July, 2021; v1 submitted 18 May, 2020; originally announced May 2020.

    Journal ref: The Art, Science, and Engineering of Programming, 2022, Vol. 6, Issue 1, Article 4

  44. arXiv:2005.07567  [pdf]

    q-bio.QM cs.LG stat.AP

    Accelerating drug repurposing for COVID-19 via modeling drug mechanism of action with large scale gene-expression profiles

    Authors: Lu Han, G. C. Shan, B. F. Chu, H. Y. Wang, Z. J. Wang, S. Q. Gao, W. X. Zhou

    Abstract: The novel coronavirus disease, named COVID-19, emerged in China in December 2019, and has rapidly spread around the world. It is clearly urgent to fight COVID-19 at global scale. The development of methods for identifying drug uses based on phenotypic data can improve the efficiency of drug development. However, there are still many difficulties in identifying drug applications based on cell pictu…

    Submitted 5 October, 2021; v1 submitted 15 May, 2020; originally announced May 2020.

    Comments: 22 pages, 4 figures. Cognitive Neurodynamics (2021)

  45. arXiv:2003.02115  [pdf, other]

    cs.CV cs.LG eess.IV

    VESR-Net: The Winning Solution to Youku Video Enhancement and Super-Resolution Challenge

    Authors: Jiale Chen, Xu Tan, Chaowei Shan, Sen Liu, Zhibo Chen

    Abstract: This paper introduces VESR-Net, a method for video enhancement and super-resolution (VESR). We design a separate non-local module to explore the relations among video frames and fuse video frames efficiently, and a channel attention residual block to capture the relations among feature maps for video frame reconstruction in VESR-Net. We conduct experiments to analyze the effectiveness of these des…

    Submitted 4 March, 2020; originally announced March 2020.

    Comments: The champion of Youku-VESR challenge

  46. arXiv:1911.01042  [pdf, other]

    cs.DB cs.LG

    A General Early-Stopping Module for Crowdsourced Ranking

    Authors: Caihua Shan, Leong Hou U, Nikos Mamoulis, Reynold Cheng, Xiang Li

    Abstract: Crowdsourcing can be used to determine a total order for an object set (e.g., the top-10 NBA players) based on crowd opinions. This ranking problem is often decomposed into a set of microtasks (e.g., pairwise comparisons). These microtasks are passed to a large number of workers and their answers are aggregated to infer the ranking. The number of microtasks depends on the budget allocated for the…

    Submitted 4 November, 2019; originally announced November 2019.

  47. arXiv:1911.01030  [pdf, other]

    cs.LG cs.DB stat.ML

    An End-to-End Deep RL Framework for Task Arrangement in Crowdsourcing Platforms

    Authors: Caihua Shan, Nikos Mamoulis, Reynold Cheng, Guoliang Li, Xiang Li, Yuqiu Qian

    Abstract: In this paper, we propose a Deep Reinforcement Learning (RL) framework for task arrangement, which is a critical problem for the success of crowdsourcing platforms. Previous works conduct the personalized recommendation of tasks to workers via supervised learning methods. However, the majority of them only consider the benefit of either workers or requesters independently. In addition, they cannot…

    Submitted 3 November, 2019; originally announced November 2019.

  48. arXiv:1906.03404  [pdf, other]

    cs.CV

    A Coarse-to-Fine Framework for Learned Color Enhancement with Non-Local Attention

    Authors: Chaowei Shan, Zhizheng Zhang, Zhibo Chen

    Abstract: Automatic color enhancement is aimed to adaptively adjust photos to expected styles and tones. For current learned methods in this field, global harmonious perception and local details are hard to be well-considered in a single model simultaneously. To address this problem, we propose a coarse-to-fine framework with non-local attention for color enhancement in this paper. Within our framework, we…

    Submitted 20 July, 2019; v1 submitted 8 June, 2019; originally announced June 2019.

    Comments: To appear in ICIP19

  49. arXiv:1903.04982  [pdf, other]

    cs.LG

    A Capsule-unified Framework of Deep Neural Networks for Graphical Programming

    Authors: Yujian Li, Chuanhui Shan

    Abstract: Recently, the growth of deep learning has produced a large number of deep neural networks. How to describe these networks unifiedly is becoming an important issue. We first formalize neural networks in a mathematical definition, give their directed graph representations, and prove a generation theorem about the induced networks of connected directed acyclic graphs. Then, using the concept of capsu…

    Submitted 13 March, 2019; v1 submitted 7 March, 2019; originally announced March 2019.

    Comments: 20 pages; 26 figures. arXiv admin note: text overlap with arXiv:1805.03551

  50. arXiv:1902.09278  [pdf, other]

    astro-ph.IM astro-ph.CO cs.LG

    Separating the EoR Signal with a Convolutional Denoising Autoencoder: A Deep-learning-based Method

    Authors: Weitian Li, Haiguang Xu, Zhixian Ma, Ruimin Zhu, Dan Hu, Zhenghao Zhu, Junhua Gu, Chenxi Shan, Jie Zhu, Xiang-Ping Wu

    Abstract: When applying the foreground removal methods to uncover the faint cosmological signal from the epoch of reionization (EoR), the foreground spectra are assumed to be smooth. However, this assumption can be seriously violated in practice since the unresolved or mis-subtracted foreground sources, which are further complicated by the frequency-dependent beam effects of interferometers, will generate s…

    Submitted 14 March, 2019; v1 submitted 25 February, 2019; originally announced February 2019.

    Comments: 10 pages, 9 figures; minor text updates to match the MNRAS published version

    Journal ref: 2019, MNRAS, 485, 2628