Showing 1–46 of 46 results for author: Shang, S

Searching in archive cs.
  1. arXiv:2407.01909  [pdf, other]

    cs.CL cs.SD eess.AS

    Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models

    Authors: Zhiyuan Tang, Dong Wang, Shen Huang, Shidong Shang

    Abstract: Recent studies have demonstrated the efficacy of large language models (LLMs) in error correction for automatic speech recognition (ASR). However, much of the research focuses on the English language. This paper redirects the attention to Chinese. Firstly, we construct a specialized benchmark dataset aimed at error correction for Chinese ASR with 724K hypotheses-transcription pairs, named the Chin…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Interspeech 2024

  2. arXiv:2407.00993  [pdf, other]

    cs.AI cs.CL

    Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents

    Authors: Shihan Deng, Weikai Xu, Hongda Sun, Wei Liu, Tao Tan, Jianfeng Liu, Ang Li, Jian Luan, Bin Wang, Rui Yan, Shuo Shang

    Abstract: With the remarkable advancements of large language models (LLMs), LLM-based agents have become a research hotspot in human-computer interaction. However, there is a scarcity of benchmarks available for LLM-based mobile agents. Benchmarking these agents generally faces three main challenges: (1) The inefficiency of UI-only operations imposes limitations to task evaluation. (2) Specific instructions…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2406.19966  [pdf, other]

    cs.CL

    Simulating Financial Market via Large Language Model based Agents

    Authors: Shen Gao, Yuntao Wen, Minghang Zhu, Jianing Wei, Yuhan Cheng, Qunzi Zhang, Shuo Shang

    Abstract: Most economic theories typically assume that financial market participants are fully rational individuals and use mathematical models to simulate human behavior in financial markets. However, human behavior is often not entirely rational and is challenging to predict accurately with mathematical models. In this paper, we propose Agent-based Simulated Financial M…

    Submitted 28 June, 2024; originally announced June 2024.

  4. arXiv:2406.12123  [pdf, other]

    cs.RO cs.AI cs.LG

    ChatEMG: Synthetic Data Generation to Control a Robotic Hand Orthosis for Stroke

    Authors: Jingxi Xu, Runsheng Wang, Siqi Shang, Ava Chen, Lauren Winterbottom, To-Liang Hsu, Wenxi Chen, Khondoker Ahmed, Pedro Leandro La Rotta, Xinyue Zhu, Dawn M. Nilsen, Joel Stein, Matei Ciocarlie

    Abstract: Intent inferral on a hand orthosis for stroke patients is challenging due to the difficulty of data collection from impaired subjects. Additionally, EMG signals exhibit significant variations across different conditions, sessions, and subjects, making it hard for classifiers to generalize. Traditional approaches require a large labeled dataset from the new condition, session, or subject to train i…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 8 pages

  5. arXiv:2405.18113  [pdf, other]

    cs.CL cs.AI

    Facilitating Multi-Role and Multi-Behavior Collaboration of Large Language Models for Online Job Seeking and Recruiting

    Authors: Hongda Sun, Hongzhan Lin, Haiyu Yan, Chen Zhu, Yang Song, Xin Gao, Shuo Shang, Rui Yan

    Abstract: The emergence of online recruitment services has revolutionized the traditional landscape of job seeking and recruitment, necessitating the development of high-quality industrial applications to improve person-job fitting. Existing methods generally rely on modeling the latent semantics of resumes and job descriptions and learning a matching function between them. Inspired by the powerful role-pla…

    Submitted 28 May, 2024; originally announced May 2024.

  6. arXiv:2405.03654  [pdf, other]

    cs.CR cs.AI

    Can LLMs Deeply Detect Complex Malicious Queries? A Framework for Jailbreaking via Obfuscating Intent

    Authors: Shang Shang, Xinqiang Zhao, Zhongjiang Yao, Yepeng Yao, Liya Su, Zijing Fan, Xiaodan Zhang, Zhengwei Jiang

    Abstract: To demonstrate and address the underlying maliciousness, we propose a theoretical hypothesis and analytical approach, and introduce a new black-box jailbreak attack methodology named IntentObfuscator, exploiting this identified flaw by obfuscating the true intentions behind user prompts. This approach compels LLMs to inadvertently generate restricted content, bypassing their built-in content securi…

    Submitted 7 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  7. arXiv:2404.06311  [pdf, other]

    cs.IR

    DRE: Generating Recommendation Explanations by Aligning Large Language Models at Data-level

    Authors: Shen Gao, Yifan Wang, Jiabao Fang, Lisi Chen, Peng Han, Shuo Shang

    Abstract: Recommendation systems play a crucial role in various domains, suggesting items based on user behavior. However, the lack of transparency in presenting recommendations can lead to user confusion. In this paper, we introduce Data-level Recommendation Explanation (DRE), a non-intrusive explanation framework for black-box recommendation models. Different from existing methods, DRE does not require any…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 5 pages, 2 figures

  8. arXiv:2404.05569  [pdf, other]

    cs.AI cs.CL cs.MA

    360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System

    Authors: Shen Gao, Hao Li, Chengrui Huang, Quan Tu, Zhiliang Tian, Minlie Huang, Shuo Shang

    Abstract: Large language model agents have demonstrated remarkable advancements across various complex tasks. Recent works focus on optimizing the agent team or employing self-reflection to iteratively solve complex tasks. Since these agents are all based on the same LLM, only conducting self-evaluation or removing underperforming agents does not substantively enhance the capability of the agents. We argue…

    Submitted 26 June, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  9. arXiv:2403.17755  [pdf, other]

    cs.AI cs.CR cs.CV

    DataCook: Crafting Anti-Adversarial Examples for Healthcare Data Copyright Protection

    Authors: Sihan Shang, Jiancheng Yang, Zhenglong Sun, Pascal Fua

    Abstract: In the realm of healthcare, the challenges of copyright protection and unauthorized third-party misuse are increasingly significant. Traditional methods for data copyright protection are applied prior to data distribution, implying that models trained on these data become uncontrollable. This paper introduces a novel approach, named DataCook, designed to safeguard the copyright of healthcare data…

    Submitted 26 March, 2024; originally announced March 2024.

  10. arXiv:2403.06831  [pdf, other]

    cs.CV

    HDRTransDC: High Dynamic Range Image Reconstruction with Transformer Deformation Convolution

    Authors: Shuaikang Shang, Xuejing Kang, Anlong Ming

    Abstract: High Dynamic Range (HDR) imaging aims to generate an artifact-free HDR image with realistic details by fusing multi-exposure Low Dynamic Range (LDR) images. Caused by large motion and severe under-/over-exposure among input LDR images, HDR imaging suffers from ghosting artifacts and fusion distortions. To address these critical issues, we propose an HDR Transformer Deformation Convolution (HDRTran…

    Submitted 11 March, 2024; originally announced March 2024.

  11. arXiv:2403.05217  [pdf, other]

    cs.CL cs.AI cs.IR

    Harnessing Multi-Role Capabilities of Large Language Models for Open-Domain Question Answering

    Authors: Hongda Sun, Yuxuan Liu, Chengwei Wu, Haiyu Yan, Cheng Tai, Xin Gao, Shuo Shang, Rui Yan

    Abstract: Open-domain question answering (ODQA) has emerged as a pivotal research spotlight in information systems. Existing methods follow two main paradigms to collect evidence: (1) The retrieve-then-read paradigm retrieves pertinent documents from an external corpus; and (2) the generate-then-read paradigm employs large language models (LLMs) to generate relevant documents. However, nei…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: TheWebConf 2024 (WWW 2024) oral, code repo: https://github.com/EthanLeo-LYX/LLMQA

  12. arXiv:2403.03102  [pdf, other]

    cs.CL cs.AI

    "In Dialogues We Learn": Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning

    Authors: Chuanqi Cheng, Quan Tu, Wei Wu, Shuo Shang, Cunli Mao, Zhengtao Yu, Rui Yan

    Abstract: Personalized dialogue systems have gained significant attention in recent years for their ability to generate responses in alignment with different personas. However, most existing approaches rely on pre-defined personal profiles, which are not only time-consuming and labor-intensive to create but also lack flexibility. We propose In-Dialogue Learning (IDL), a fine-tuning framework that enhances t…

    Submitted 12 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  13. arXiv:2403.02181  [pdf, other]

    cs.CL cs.AI cs.LG

    Not all Layers of LLMs are Necessary during Inference

    Authors: Siqi Fan, Xin Jiang, Xiang Li, Xuying Meng, Peng Han, Shuo Shang, Aixin Sun, Yequan Wang, Zhongyuan Wang

    Abstract: The inference phase of Large Language Models (LLMs) is very expensive. An ideal inference stage of LLMs could utilize fewer computational resources while still maintaining its capabilities (e.g., generalization and in-context learning ability). In this paper, we try to answer the question, "During LLM inference, can we use shallow layers for easy instances; and deep layers for hard ones?" To answe…

    Submitted 14 April, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  14. arXiv:2403.00832  [pdf, other]

    cs.IR cs.AI

    Explainable Session-based Recommendation via Path Reasoning

    Authors: Yang Cao, Shuo Shang, Jun Wang, Wei Zhang

    Abstract: This paper explores providing explainability for session-based recommendation (SR) by path reasoning. Current SR models emphasize accuracy but lack explainability, while traditional path reasoning prioritizes knowledge graph exploration, ignoring sequential patterns present in the session history. Therefore, we propose a generalized hierarchical reinforcement learning framework for SR, which impro…

    Submitted 28 February, 2024; originally announced March 2024.

  15. arXiv:2401.15484  [pdf, other]

    cs.RO

    R×R: Rapid eXploration for Reinforcement Learning via Sampling-based Reset Distributions and Imitation Pre-training

    Authors: Gagan Khandate, Tristan L. Saidi, Siqi Shang, Eric T. Chang, Yang Liu, Seth Dennis, Johnson Adams, Matei Ciocarlie

    Abstract: We present a method for enabling Reinforcement Learning of motor control policies for complex skills such as dexterous manipulation. We posit that a key difficulty for training such policies is the difficulty of exploring the problem state space, as the accessible and useful regions of this space form a complex structure along manifolds of the original high-dimensional state space. This work prese…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: 20 pages, 14 figures, submitted to Autonomous Robots, RSS 2023 Special Issue. arXiv admin note: substantial text overlap with arXiv:2303.03486

  16. arXiv:2312.01871  [pdf, other]

    cs.CV

    FeaInfNet: Diagnosis in Medical Image with Feature-Driven Inference and Visual Explanations

    Authors: Yitao Peng, Lianghua He, Die Hu, Yihang Liu, Longzhen Yang, Shaohua Shang

    Abstract: Interpretable deep learning models have received widespread attention in the field of image recognition. Due to the unique multi-instance learning of medical images and the difficulty in identifying decision-making regions, many interpretability models that have been proposed still have problems of insufficient accuracy and interpretability in medical image disease diagnosis. To solve these proble…

    Submitted 4 December, 2023; originally announced December 2023.

  17. arXiv:2310.18659  [pdf, other]

    cs.AI cs.CL

    DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy

    Authors: Hongda Sun, Weikai Xu, Wei Liu, Jian Luan, Bin Wang, Shuo Shang, Ji-Rong Wen, Rui Yan

    Abstract: Recent advances in large language models (LLMs) have revolutionized the landscape of reasoning tasks. To enhance the capabilities of LLMs to emulate human reasoning, prior studies have focused on modeling reasoning steps using various thought structures like chains, trees, or graphs. However, LLM-based reasoning still encounters the following challenges: (1) Limited adaptability of preset structur…

    Submitted 26 May, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted at ACL 2024 Main, Code repo: https://github.com/XiaoMi/DetermLR

  18. arXiv:2310.15523  [pdf, other]

    cs.LG cs.AI

    Generative and Contrastive Paradigms Are Complementary for Graph Self-Supervised Learning

    Authors: Yuxiang Wang, Xiao Yan, Chuang Hu, Fangcheng Fu, Wentao Zhang, Hao Wang, Shuo Shang, Jiawei Jiang

    Abstract: For graph self-supervised learning (GSSL), masked autoencoder (MAE) follows the generative paradigm and learns to reconstruct masked graph edges or node features. Contrastive Learning (CL) maximizes the similarity between augmented views of the same graph and is widely used for GSSL. However, MAE and CL are considered separately in existing works for GSSL. We observe that the MAE and CL paradigms…

    Submitted 24 October, 2023; originally announced October 2023.

  19. arXiv:2310.10992  [pdf, other]

    cs.SD eess.AS

    A High Fidelity and Low Complexity Neural Audio Coding

    Authors: Wenzhe Liu, Wei Xiao, Meng Wang, Shan Yang, Yupeng Shi, Yuyong Kang, Dan Su, Shidong Shang, Dong Yu

    Abstract: Audio coding is an essential module in the real-time communication system. Neural audio codecs can compress audio samples with a low bitrate due to the strong modeling and generative capabilities of deep neural networks. To address the poor high-frequency expression and high computational cost and storage consumption, we proposed an integrated framework that utilizes a neural network to model wide…

    Submitted 17 October, 2023; originally announced October 2023.

  20. arXiv:2308.16385  [pdf, other]

    cs.LG cs.AI

    BenchTemp: A General Benchmark for Evaluating Temporal Graph Neural Networks

    Authors: Qiang Huang, Jiawei Jiang, Xi Susie Rao, Ce Zhang, Zhichao Han, Zitao Zhang, Xin Wang, Yongjun He, Quanqing Xu, Yang Zhao, Chuang Hu, Shuo Shang, Bo Du

    Abstract: To handle graphs in which features or connectivities are evolving over time, a series of temporal graph neural networks (TGNNs) have been proposed. Despite the success of these TGNNs, the previous TGNN evaluations reveal several limitations regarding four critical issues: 1) inconsistent datasets, 2) inconsistent evaluation pipelines, 3) lacking workload diversity, and 4) lacking efficient compari…

    Submitted 30 August, 2023; originally announced August 2023.

    Comments: 28 pages, 23 figures, 27 tables. Submitted to the Conference on Neural Information Processing Systems 2023 Track on Datasets and Benchmarks

  21. arXiv:2308.10278  [pdf, other]

    cs.CL

    CharacterChat: Learning towards Conversational AI with Personalized Social Support

    Authors: Quan Tu, Chuanqi Chen, Jinpeng Li, Yanran Li, Shuo Shang, Dongyan Zhao, Ran Wang, Rui Yan

    Abstract: In our modern, fast-paced, and interconnected world, the importance of mental well-being has grown into a matter of great urgency. However, traditional methods such as Emotional Support Conversations (ESC) face challenges in effectively addressing a diverse range of individual personalities. In response, we introduce the Social Support Conversation (S2Conv) framework. It comprises a series of supp…

    Submitted 20 August, 2023; originally announced August 2023.

    Comments: 10 pages, 6 figures, 5 tables

  22. arXiv:2307.13581  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    Comparing Forward and Inverse Design Paradigms: A Case Study on Refractory High-Entropy Alloys

    Authors: Arindam Debnath, Lavanya Raman, Wenjie Li, Adam M. Krajewski, Marcia Ahn, Shuang Lin, Shunli Shang, Allison M. Beese, Zi-Kui Liu, Wesley F. Reinhart

    Abstract: The rapid design of advanced materials is a topic of great scientific interest. The conventional, "forward" paradigm of materials design involves evaluating multiple candidates to determine the best candidate that matches the target properties. However, recent advances in the field of deep learning have given rise to the possibility of an "inverse" design paradigm for advanced materials, where…

    Submitted 25 July, 2023; originally announced July 2023.

  23. arXiv:2305.05599  [pdf, other]

    cs.SD cs.HC eess.AS

    Inter-SubNet: Speech Enhancement with Subband Interaction

    Authors: Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Zhiyong Wu, Yannan Wang, Shidong Shang, Helen Meng

    Abstract: Subband-based approaches process subbands in parallel through the model with shared parameters to learn the commonality of local spectrums for noise reduction. In this way, they have achieved remarkable results with fewer parameters. However, in some complex environments, the lack of global spectral information has a negative impact on the performance of these subband-based approaches. To this end…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: Accepted by ICASSP 2023

  24. arXiv:2304.06875  [pdf, other]

    cs.CL cs.LG

    nanoLM: an Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales

    Authors: Yiqun Yao, Siqi Fan, Xiusheng Huang, Xuezhi Fang, Xiang Li, Ziyi Ni, Xin Jiang, Xuying Meng, Peng Han, Shuo Shang, Kang Liu, Aixin Sun, Yequan Wang

    Abstract: As language models scale up, it becomes increasingly expensive to verify research ideas because conclusions on small models do not trivially transfer to large ones. A possible solution is to establish a generic system that accurately predicts certain metrics for large models without training them. Existing scaling laws require hyperparameter search on the largest models, limiting their predicative…

    Submitted 6 April, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: This is a modified and extended version of our previous Mu-scaling work released in April 2023 (see v1)

  25. arXiv:2303.08714  [pdf, other]

    cs.CV

    ResDiff: Combining CNN and Diffusion Model for Image Super-Resolution

    Authors: Shuyao Shang, Zhengyang Shan, Guangxing Liu, LunQian Wang, XingHua Wang, Zekai Zhang, Jinglin Zhang

    Abstract: Adapting the Diffusion Probabilistic Model (DPM) for direct image super-resolution is wasteful, given that a simple Convolutional Neural Network (CNN) can recover the main low-frequency content. Therefore, we present ResDiff, a novel Diffusion Probabilistic Model based on Residual structure for Single Image Super-Resolution (SISR). ResDiff utilizes a combination of a CNN, which restores primary lo…

    Submitted 2 February, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 9 pages, 5 figures

  26. arXiv:2303.07704  [pdf, other]

    eess.AS cs.SD

    TEA-PSE 3.0: Tencent-Ethereal-Audio-Lab Personalized Speech Enhancement System For ICASSP 2023 DNS Challenge

    Authors: Yukai Ju, Jun Chen, Shimin Zhang, Shulin He, Wei Rao, Weixin Zhu, Yannan Wang, Tao Yu, Shidong Shang

    Abstract: This paper introduces the Unbeatable Team's submission to the ICASSP 2023 Deep Noise Suppression (DNS) Challenge. We expand our previous work, TEA-PSE, to its upgraded version -- TEA-PSE 3.0. Specifically, TEA-PSE 3.0 incorporates a residual LSTM after squeezed temporal convolution network (S-TCN) to enhance sequence modeling capabilities. Additionally, the local-global representation (LGR) struct…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  27. arXiv:2303.03486  [pdf, other]

    cs.RO

    Sampling-based Exploration for Reinforcement Learning of Dexterous Manipulation

    Authors: Gagan Khandate, Siqi Shang, Eric T. Chang, Tristan Luca Saidi, Yang Liu, Seth Matthew Dennis, Johnson Adams, Matei Ciocarlie

    Abstract: In this paper, we present a novel method for achieving dexterous manipulation of complex objects, while simultaneously securing the object without the use of passive support surfaces. We posit that a key difficulty for training such policies in a Reinforcement Learning framework is the difficulty of exploring the problem state space, as the accessible regions of this space form a complex structure…

    Submitted 23 May, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: 10 pages, 7 figures, accepted at Robotics Science & Systems 2023

  28. arXiv:2212.12116  [pdf, other]

    cs.CV cs.AI

    Unpaired Overwater Image Defogging Using Prior Map Guided CycleGAN

    Authors: Yaozong Mo, Chaofeng Li, Wenqi Ren, Shaopeng Shang, Wenwu Wang, Xiao-jun Wu

    Abstract: Deep learning-based methods have achieved significant performance for image defogging. However, existing methods are mainly developed for land scenes and perform poorly when dealing with overwater foggy images, since overwater scenes typically contain large expanses of sky and water. In this work, we propose a Prior map Guided CycleGAN (PG-CycleGAN) for defogging of images with overwater scenes. T…

    Submitted 22 December, 2022; originally announced December 2022.

  29. arXiv:2211.05432  [pdf, other]

    cs.SD eess.AS

    Speech Enhancement with Fullband-Subband Cross-Attention Network

    Authors: Jun Chen, Wei Rao, Zilin Wang, Zhiyong Wu, Yannan Wang, Tao Yu, Shidong Shang, Helen Meng

    Abstract: FullSubNet has shown its promising performance on speech enhancement by utilizing both fullband and subband information. However, the relationship between fullband and subband in FullSubNet is achieved by simply concatenating the output of fullband model and subband units. It only supplements the subband units with a small quantity of global information and has not considered the interaction betwe…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: Accepted by InterSpeech 2022. arXiv admin note: text overlap with arXiv:2203.12188

  30. arXiv:2210.15853  [pdf, other]

    cs.SD eess.AS

    Speech Enhancement with Intelligent Neural Homomorphic Synthesis

    Authors: Shulin He, Wei Rao, Jinjiang Liu, Jun Chen, Yukai Ju, Xueliang Zhang, Yannan Wang, Shidong Shang

    Abstract: Most neural network speech enhancement models ignore speech production mathematical models by directly mapping Fourier transform spectrums or waveforms. In this work, we propose a neural source filter network for speech enhancement. Specifically, we use homomorphic signal processing and cepstral analysis to obtain noisy speech's excitation and vocal tract. Unlike traditional signal processing, we…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP 2023

  31. arXiv:2203.16032  [pdf, other]

    cs.SD eess.AS

    ConferencingSpeech 2022 Challenge: Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge for Online Conferencing Applications

    Authors: Gaoxiong Yi, Wei Xiao, Yiming Xiao, Babak Naderi, Sebastian Möller, Wafaa Wardah, Gabriel Mittag, Ross Cutler, Zhuohuang Zhang, Donald S. Williamson, Fei Chen, Fuzheng Yang, Shidong Shang

    Abstract: With the advances in speech communication systems such as online conferencing applications, we can seamlessly work with people regardless of where they are. However, during online meetings, speech quality can be significantly affected by background noise, reverberation, packet loss, network jitter, etc. Because of its nature, speech quality is traditionally assessed in subjective tests in laborato…

    Submitted 31 March, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

  32. arXiv:2104.00960  [pdf, other]

    eess.AS cs.SD

    INTERSPEECH 2021 ConferencingSpeech Challenge: Towards Far-field Multi-Channel Speech Enhancement for Video Conferencing

    Authors: Wei Rao, Yihui Fu, Yanxin Hu, Xin Xu, Yvkai Jv, Jiangyu Han, Zhongjie Jiang, Lei Xie, Yannan Wang, Shinji Watanabe, Zheng-Hua Tan, Hui Bu, Tao Yu, Shidong Shang

    Abstract: The ConferencingSpeech 2021 challenge is proposed to stimulate research on far-field multi-channel speech enhancement for video conferencing. The challenge consists of two separate tasks: 1) Task 1 is multi-channel speech enhancement with single microphone array and focusing on practical application with real-time requirement and 2) Task 2 is multi-channel speech enhancement with multiple distribu…

    Submitted 2 April, 2021; originally announced April 2021.

    Comments: 5 pages, submitted to INTERSPEECH 2021

  33. arXiv:1911.08119  [pdf, other]

    cs.LG cs.CV

    Adaptive Routing Between Capsules

    Authors: Qiang Ren, Shaohua Shang, Lianghua He

    Abstract: Capsule network is the most recent exciting advancement in the deep learning field and represents positional information by stacking features into vectors. The dynamic routing algorithm is used in the capsule network, however, there are some disadvantages such as the inability to stack multiple layers and a large amount of computation. In this paper, we propose an adaptive routing algorithm that c…

    Submitted 19 November, 2019; originally announced November 2019.

  34. arXiv:1901.09822  [pdf, other]

    cs.CV

    Virtual Conditional Generative Adversarial Networks

    Authors: Haifeng Shi, Guanyu Cai, Yuqin Wang, Shaohua Shang, Lianghua He

    Abstract: When trained on multimodal image datasets, normal Generative Adversarial Networks (GANs) are usually outperformed by class-conditional GANs and ensemble GANs, but conditional GANs is restricted to labeled datasets and ensemble GANs lack efficiency. We propose a novel GAN variant called virtual conditional GAN (vcGAN) which is not only an ensemble GAN with multiple generative paths while adding alm…

    Submitted 25 January, 2019; originally announced January 2019.

  35. arXiv:1901.07288  [pdf, other]

    cs.CV

    Unsupervised Learning-based Depth Estimation aided Visual SLAM Approach

    Authors: Mingyang Geng, Suning Shang, Bo Ding, Huaimin Wang, Pengfei Zhang, Lei Zhang

    Abstract: The RGB-D camera maintains a limited range for working and is hard to accurately measure the depth information in a far distance. Besides, the RGB-D camera will easily be influenced by strong lighting and other external factors, which will lead to a poor accuracy on the acquired environmental depth information. Recently, deep learning technologies have achieved great success in the visual SLAM are…

    Submitted 22 January, 2019; originally announced January 2019.

    Comments: 27 pages

  36. arXiv:1811.05652  [pdf, other]

    cs.DS

    Submodular Optimization Over Streams with Inhomogeneous Decays

    Authors: Junzhou Zhao, Shuo Shang, Pinghui Wang, John C. S. Lui, Xiangliang Zhang

    Abstract: Cardinality constrained submodular function maximization, which aims to select a subset of size at most $k$ to maximize a monotone submodular utility function, is the key in many data mining and machine learning applications such as data summarization and maximum coverage problems. When data is given as a stream, streaming submodular optimization (SSO) techniques are desired. Existing SSO techniqu…

    Submitted 14 November, 2018; originally announced November 2018.

  37. arXiv:1810.07917  [pdf, other]

    cs.SI physics.soc-ph

    Tracking Influential Nodes in Time-Decaying Dynamic Interaction Networks

    Authors: Junzhou Zhao, Shuo Shang, Pinghui Wang, John C. S. Lui, Xiangliang Zhang

    Abstract: Identifying influential nodes that can jointly trigger the maximum influence spread in networks is a fundamental problem in many applications such as viral marketing, online advertising, and disease control. Most existing studies assume that social influence is static and they fail to capture the dynamics of influence in reality. In this work, we address the dynamic influence challenge by designin…

    Submitted 22 October, 2018; v1 submitted 18 October, 2018; originally announced October 2018.

    Comments: 14 pages, 15 figures

  38. arXiv:1605.02337  [pdf, other]

    cs.DS

    A Novel Framework for Online Amnesic Trajectory Compression in Resource-constrained Environments

    Authors: Jiajun Liu, Kun Zhao, Philipp Sommer, Shuo Shang, Brano Kusy, Jae-Gil Lee, Raja Jurdak

    Abstract: State-of-the-art trajectory compression methods usually involve high space-time complexity or yield unsatisfactory compression rates, leading to rapid exhaustion of memory, computation, storage and energy resources. Their ability is commonly limited when operating in a resource-constrained environment especially when the data volume (even when compressed) far exceeds the storage limit. Hence we pr…

    Submitted 8 May, 2016; originally announced May 2016.

    Comments: arXiv admin note: substantial text overlap with arXiv:1412.0321

  39. arXiv:1412.0321  [pdf, other]

    cs.DS cs.DB

    Bounded Quadrant System: Error-bounded Trajectory Compression on the Go

    Authors: Jiajun Liu, Kun Zhao, Philipp Sommer, Shuo Shang, Brano Kusy, Raja Jurdak

    Abstract: Long-term location tracking, where trajectory compression is commonly used, has gained high interest for many applications in transport, ecology, and wearable computing. However, state-of-the-art compression methods involve high space-time complexity or achieve unsatisfactory compression rate, leading to rapid exhaustion of memory, computation, storage and energy resources. We propose a novel onli…

    Submitted 8 December, 2014; v1 submitted 30 November, 2014; originally announced December 2014.

    Comments: International Conference on Data Engineering (ICDE) 2015, 12 pages

  40. arXiv:1409.6831  [pdf, other]

    cs.AI cs.CR

    The Application of Differential Privacy for Rank Aggregation: Privacy and Accuracy

    Authors: Shang Shang, Tiance Wang, Paul Cuff, Sanjeev Kulkarni

    Abstract: The potential risk of privacy leakage prevents users from sharing their honest opinions on social platforms. This paper addresses the problem of privacy preservation if the query returns the histogram of rankings. The framework of differential privacy is applied to rank aggregation. The error probability of the aggregated ranking is analyzed as a result of noise added in order to achieve different…

    Submitted 24 September, 2014; originally announced September 2014.

    Comments: Fusion 2014

  41. An Upper Bound on the Convergence Time for Quantized Consensus of Arbitrary Static Graphs

    Authors: Shang Shang, Paul Cuff, Pan Hui, Sanjeev Kulkarni

    Abstract: We analyze a class of distributed quantized consensus algorithms for arbitrary static networks. In the initial setting, each node in the network has an integer value. Nodes exchange their current estimate of the mean value in the network, and then update their estimation by communicating with their neighbors in a limited capacity channel in an asynchronous clock setting. Eventually, all nodes reac…

    Submitted 24 September, 2014; originally announced September 2014.

    Comments: to appear in IEEE Trans. on Automatic Control, January, 2015. arXiv admin note: substantial text overlap with arXiv:1208.0788

    Journal ref: IEEE Trans. on Automatic Control, 60(4):1127-32, April, 2015

  42. arXiv:1305.0540  [pdf, other]

    cs.IR

    Privacy Preserving Recommendation System Based on Groups

    Authors: Shang Shang, Yuk Hui, Pan Hui, Paul Cuff, Sanjeev Kulkarni

    Abstract: Recommendation systems have received considerable attention in recent decades. Yet with the development of information technology and social media, the risk in revealing private data to service providers has been a growing concern to more and more users. Trade-offs between quality and privacy in recommendation systems naturally arise. In this paper, we present a privacy preserving recommendati…

    Submitted 13 May, 2013; v1 submitted 2 May, 2013; originally announced May 2013.

  43. arXiv:1208.0788  [pdf, other]

    stat.AP cs.DC cs.PF math.OC

    An Upper Bound on the Convergence Time for Quantized Consensus

    Authors: Shang Shang, Paul W. Cuff, Pan Hui, Sanjeev R. Kulkarni

    Abstract: We analyze a class of distributed quantized consensus algorithms for arbitrary networks. In the initial setting, each node in the network has an integer value. Nodes exchange their current estimate of the mean value in the network, and then update their estimation by communicating with their neighbors in a limited capacity channel in an asynchronous clock setting. Eventually, all nodes reach con…

    Submitted 17 May, 2013; v1 submitted 3 August, 2012; originally announced August 2012.

    Comments: submitted to IEEE Transactions on Automatic Control, 23 pages

  44. arXiv:1208.0787  [pdf, other]

    cs.IR cs.LG

    A Random Walk Based Model Incorporating Social Information for Recommendations

    Authors: Shang Shang, Sanjeev R. Kulkarni, Paul W. Cuff, Pan Hui

    Abstract: Collaborative filtering (CF) is one of the most popular approaches to build a recommendation system. In this paper, we propose a hybrid collaborative filtering model based on a Markovian random walk to address the data sparsity and cold start problems in recommendation systems. More precisely, we construct a directed graph whose nodes consist of items and users, together with item content, user pro…

    Submitted 17 May, 2013; v1 submitted 3 August, 2012; originally announced August 2012.

    Comments: 2012 IEEE Machine Learning for Signal Processing Workshop (MLSP), 6 pages

  45. arXiv:1208.0782  [pdf, other]

    cs.IR cs.LG cs.SI physics.soc-ph

    Wisdom of the Crowd: Incorporating Social Influence in Recommendation Models

    Authors: Shang Shang, Pan Hui, Sanjeev R. Kulkarni, Paul W. Cuff

    Abstract: Recommendation systems have received considerable attention recently. However, most research has been focused on improving the performance of collaborative filtering (CF) techniques. Social networks, indispensably, provide us extra information on people's preferences, and should be considered and deployed to improve the quality of recommendations. In this paper, we propose two recommendation model…

    Submitted 17 May, 2013; v1 submitted 3 August, 2012; originally announced August 2012.

    Comments: HotPost 2011, 6 pages

  46. arXiv:1208.0525  [pdf, other]

    cs.PF

    An Upper Bound on the Convergence Time for Distributed Binary Consensus

    Authors: Shang Shang, Paul W. Cuff, Sanjeev R. Kulkarni, Pan Hui

    Abstract: The problem addressed in this paper is the analysis of a distributed consensus algorithm for arbitrary networks, proposed by Bénézit et al. In the initial setting, each node in the network has one of two possible states ("yes" or "no"). Nodes can update their states by communicating with their neighbors via a 2-bit message in an asynchronous clock setting. Eventually, all nodes reach consensus on…

    Submitted 17 May, 2013; v1 submitted 2 August, 2012; originally announced August 2012.

    Comments: 15th International Conference on Information Fusion, July 2012, 7 pages