Showing 1–50 of 55 results for author: Shi, S

Searching in archive eess.
  1. arXiv:2406.17801  [pdf, other]

    cs.SD cs.CL eess.AS

    A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge

    Authors: Xiaopeng Wang, Yi Lu, Xin Qi, Zhiyong Wang, Yuankun Xie, Shuchen Shi, Ruibo Fu

    Abstract: This paper presents the development of a speech synthesis system for the LIMMITS'24 Challenge, focusing primarily on Track 2. The objective of the challenge is to establish a multi-speaker, multi-lingual Indic Text-to-Speech system with voice cloning capabilities, covering seven Indian languages with both male and female speakers. The system was trained using challenge data and fine-tuned for few-…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2406.14931  [pdf, other]

    eess.SP

    Multi-beam Training for Near-field Communications in High-frequency Bands

    Authors: Cong Zhou, Changsheng You, Zixuan Huang, Shuo Shi, Yi Gong, Chan-Byoung Chae, Kaibin Huang

    Abstract: In this paper, we study efficient multi-beam training design for near-field communications to reduce the beam training overhead of conventional single-beam training methods. In particular, the array-division based multi-beam training method, which is widely used in far-field communications, cannot be directly applied to the near-field scenario, since different sub-arrays may observe different user…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: In this paper, a novel near-field multi-beam training scheme is proposed by sparsely activating a portion of antennas to form a sparse linear array

  3. arXiv:2406.10591  [pdf, other]

    eess.AS cs.AI cs.CV cs.MM cs.SD

    MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation

    Authors: Ruibo Fu, Shuchen Shi, Hongming Guo, Tao Wang, Chunyu Qiang, Zhengqi Wen, Jianhua Tao, Xin Qi, Yi Lu, Xiaopeng Wang, Zhiyong Wang, Yukun Liu, Xuefei Liu, Shuai Zhang, Guanjun Li

    Abstract: Foley audio, critical for enhancing the immersive experience in multimedia content, faces significant challenges in the AI-generated content (AIGC) landscape. Despite advancements in AIGC technologies for text and image generation, the foley audio dubbing remains rudimentary due to difficulties in cross-modal scene matching and content correlation. Current text-to-audio technology, which relies on…

    Submitted 15 June, 2024; originally announced June 2024.

  4. arXiv:2406.08112  [pdf, other]

    cs.SD cs.AI eess.AS

    Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio

    Authors: Yi Lu, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Zhiyong Wang, Xin Qi, Xuefei Liu, Yongwei Li, Yukun Liu, Xiaopeng Wang, Shuchen Shi

    Abstract: With the proliferation of Large Language Model (LLM) based deepfake audio, there is an urgent need for effective detection methods. Previous deepfake audio generation methods typically involve a multi-step generation process, with the final step using a vocoder to predict the waveform from handcrafted features. However, LLM-based audio is directly generated from discrete neural codecs in an end-to…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024. arXiv admin note: substantial text overlap with arXiv:2405.04880

  5. arXiv:2406.04683  [pdf, other]

    cs.SD eess.AS

    PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation

    Authors: Shuchen Shi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Yi Lu, Xin Qi, Xuefei Liu, Yukun Liu, Yongwei Li, Zhiyong Wang, Xiaopeng Wang

    Abstract: Text-to-Audio (TTA) aims to generate audio that corresponds to the given text description, playing a crucial role in media production. The text descriptions in TTA datasets lack rich variations and diversity, resulting in a drop in TTA model performance when faced with complex text. To address this issue, we propose a method called Portable Plug-in Prompt Refiner, which utilizes rich knowledge abo…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  6. arXiv:2406.04262  [pdf, other]

    eess.SP

    Near-field Beam Training with Sparse DFT Codebook

    Authors: Cong Zhou, Chenyu Wu, Changsheng You, Shuo Shi

    Abstract: Extremely large-scale array (XL-array) has emerged as one promising technology to improve the spectral efficiency and spatial resolution of future sixth generation (6G) wireless systems. The upsurge in the number of antennas renders communication users more likely to be located in the near-field region, which requires a more accurate spherical (instead of planar) wavefront propagation modeling…

    Submitted 18 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: In this paper, we propose a novel sparse DFT codebook to reduce near-field beam training overhead, which is equivalent to sparsely activating the dense array

  7. arXiv:2406.03247  [pdf, other]

    cs.SD eess.AS

    Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection

    Authors: Xiaopeng Wang, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Yuankun Xie, Yukun Liu, Jianhua Tao, Xuefei Liu, Yongwei Li, Xin Qi, Yi Lu, Shuchen Shi

    Abstract: The generalization of Fake Audio Detection (FAD) is critical due to the emergence of new spoofing techniques. Traditional FAD methods often focus solely on distinguishing between genuine and known spoofed audio. We propose a Genuine-Focused Learning (GFL) framework, called GFL-FAD, aiming for highly generalized FAD. This method incorporates a Counterfactual Reasoning Enhanced Representation…

    Submitted 9 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  8. arXiv:2406.03237  [pdf, other]

    cs.SD eess.AS

    Generalized Fake Audio Detection via Deep Stable Learning

    Authors: Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Yuankun Xie, Yukun Liu, Xiaopeng Wang, Xuefei Liu, Yongwei Li, Jianhua Tao, Yi Lu, Xin Qi, Shuchen Shi

    Abstract: Although current fake audio detection approaches have achieved remarkable success on specific datasets, they often fail when evaluated with datasets from different distributions. Previous studies typically address distribution shift by focusing on using extra data or applying extra loss restrictions during training. However, these methods either require a substantial amount of data or complicate t…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  9. arXiv:2405.04066  [pdf, other]

    cs.SI eess.SY

    Characterizing Regional Importance in Cities with Human Mobility Motifs in Metro Networks

    Authors: Shuyang Shi, Ding Lyu, Lin Wang, Xiaofan Wang, Guanrong Chen

    Abstract: Uncovering higher-order spatiotemporal dependencies within human mobility networks offers valuable insights into the analysis of urban structures. In most existing studies, human mobility networks are typically constructed by aggregating all trips without distinguishing who takes which specific trip. Instead, we claim individual mobility motifs, higher-order structures generated by daily trips of…

    Submitted 7 May, 2024; originally announced May 2024.

  10. arXiv:2404.13786  [pdf, other]

    eess.SY cs.AI cs.DC cs.LG

    Soar: Design and Deployment of A Smart Roadside Infrastructure System for Autonomous Driving

    Authors: Shuyao Shi, Neiwen Ling, Zhehao Jiang, Xuan Huang, Yuze He, Xiaoguang Zhao, Bufang Yang, Chen Bian, Jingfei Xia, Zhenyu Yan, Raymond Yeung, Guoliang Xing

    Abstract: Recently, smart roadside infrastructure (SRI) has demonstrated the potential of achieving fully autonomous driving systems. To explore the potential of infrastructure-assisted autonomous driving, this paper presents the design and deployment of Soar, the first end-to-end SRI system specifically designed to support autonomous driving systems. Soar consists of both software and hardware components ca…

    Submitted 21 April, 2024; originally announced April 2024.

  11. arXiv:2404.07620  [pdf, other]

    eess.IV cs.CV

    Diffusion Probabilistic Multi-cue Level Set for Reducing Edge Uncertainty in Pancreas Segmentation

    Authors: Yue Gou, Yuming Xing, Shengzhu Shi, Zhichang Guo

    Abstract: Accurately segmenting the pancreas remains a huge challenge. Traditional methods encounter difficulties in semantic localization due to the small volume and distorted structure of the pancreas, while deep learning methods encounter challenges in obtaining accurate edges because of low contrast and organ overlapping. To overcome these issues, we propose a multi-cue level set method based on the dif…

    Submitted 11 April, 2024; originally announced April 2024.

  12. arXiv:2401.05690  [pdf, other]

    cs.IT eess.SP

    Sparse Array Enabled Near-Field Communications: Beam Pattern Analysis and Hybrid Beamforming Design

    Authors: Cong Zhou, Changsheng You, Haodong Zhang, Li Chen, Shuo Shi

    Abstract: Extremely large-scale array (XL-array) has emerged as a promising technology to enable near-field communications for achieving enhanced spectrum efficiency and spatial resolution, by drastically increasing the number of antennas. However, this also inevitably incurs higher hardware and energy cost, which may not be affordable in future wireless systems. To address this issue, we propose in this pa…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: In this paper, we propose to exploit sparse arrays for enabling near-field communications and characterize its unique beam pattern for facilitating its hybrid beamforming design

  13. arXiv:2312.11255  [pdf, other]

    eess.SY

    State-action control barrier functions: Imposing safety on learning-based control with low online computational costs

    Authors: Kanghui He, Shengling Shi, Ton van den Boom, Bart De Schutter

    Abstract: Learning-based control with safety guarantees usually requires real-time safety certification and modifications of possibly unsafe learning-based policies. The control barrier function (CBF) method uses a safety filter containing a constrained optimization problem to produce safe policies. However, finding a valid CBF for a general nonlinear system requires a complex function parameterization, whi…

    Submitted 18 December, 2023; originally announced December 2023.

  14. arXiv:2311.03974  [pdf, ps, other]

    cs.IT eess.SP

    NOMA Enabled Multi-Access Edge Computing: A Joint MU-MIMO Precoding and Computation Offloading Design

    Authors: Deyou Zhang, Meng Wang, Shuo Shi, Ming Xiao

    Abstract: This letter investigates computation offloading and transmit precoding co-design for multi-access edge computing (MEC), where multiple MEC users (MUs) equipped with multiple antennas access the MEC server in a non-orthogonal multiple access manner. We aim to minimize the total energy consumption of all MUs while satisfying the latency constraints by jointly optimizing the computational frequency,…

    Submitted 7 November, 2023; originally announced November 2023.

  15. arXiv:2311.02679  [pdf, other]

    eess.SY cs.LG

    Regret Analysis of Learning-Based Linear Quadratic Gaussian Control with Additive Exploration

    Authors: Archith Athrey, Othmane Mazhar, Meichen Guo, Bart De Schutter, Shengling Shi

    Abstract: In this paper, we analyze the regret incurred by a computationally efficient exploration strategy, known as naive exploration, for controlling unknown partially observable systems within the Linear Quadratic Gaussian (LQG) framework. We introduce a two-phase control algorithm called LQG-NAIVE, which involves an initial phase of injecting Gaussian input signals to obtain a system model, followed by…

    Submitted 24 November, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

  16. arXiv:2310.15937  [pdf, other]

    eess.SY

    A Behavioral Perspective on Models of Linear Dynamical Networks with Manifest Variables

    Authors: Shengling Shi, Zhiyong Sun, Bart De Schutter

    Abstract: Networks of dynamical systems play an important role in various domains and have motivated many studies on the control and analysis of linear dynamical networks. For linear network models considered in these studies, it is typically pre-determined what signal channels are inputs and what are outputs. These models do not capture the practical need to incorporate different experimental situations, w…

    Submitted 5 May, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

  17. arXiv:2309.00223  [pdf, other]

    eess.AS cs.CL cs.SD

    The FruitShell French synthesis system at the Blizzard 2023 Challenge

    Authors: Xin Qi, Xiaopeng Wang, Zhiyong Wang, Wang Liu, Mingming Ding, Shuchen Shi

    Abstract: This paper presents a French text-to-speech synthesis system for the Blizzard Challenge 2023. The challenge consists of two tasks: generating high-quality speech from female speakers and generating speech that closely resembles specific individuals. Regarding the competition data, we conducted a screening process to remove missing or erroneous text data. We organized all symbols except for phoneme…

    Submitted 31 August, 2023; originally announced September 2023.

  18. arXiv:2307.03423  [pdf, other]

    eess.IV cs.CV cs.LG

    Hyperspectral and Multispectral Image Fusion Using the Conditional Denoising Diffusion Probabilistic Model

    Authors: Shuaikai Shi, Lijun Zhang, Jie Chen

    Abstract: Hyperspectral images (HSI) have a large amount of spectral information reflecting the characteristics of matter, while their spatial resolution is low due to the limitations of imaging technology. Complementary to this are multispectral images (MSI), e.g., RGB images, with high spatial resolution but insufficient spectral bands. Hyperspectral and multispectral image fusion is a technique for acqui…

    Submitted 7 July, 2023; originally announced July 2023.

  19. arXiv:2307.03413  [pdf, other]

    cs.CV eess.IV

    Unsupervised Hyperspectral and Multispectral Images Fusion Based on the Cycle Consistency

    Authors: Shuaikai Shi, Lijun Zhang, Yoann Altmann, Jie Chen

    Abstract: Hyperspectral images (HSI), whose abundant spectral information reflects material properties, usually have low spatial resolution due to hardware limits. Meanwhile, multispectral images (MSI), e.g., RGB images, have a high spatial resolution but deficient spectral signatures. Hyperspectral and multispectral image fusion can be cost-effective and efficient for acquiring both high spatial resolu…

    Submitted 7 July, 2023; originally announced July 2023.

  20. arXiv:2306.15723  [pdf, other]

    eess.SY

    Approximate Dynamic Programming for Constrained Piecewise Affine Systems with Stability and Safety Guarantees

    Authors: Kanghui He, Shengling Shi, Ton van den Boom, Bart De Schutter

    Abstract: Infinite-horizon optimal control of constrained piecewise affine (PWA) systems has been approximately addressed by hybrid model predictive control (MPC), which, however, has computational limitations, both in offline design and online implementation. In this paper, we consider an alternative approach based on approximate dynamic programming (ADP), an important class of methods in reinforcement lea…

    Submitted 6 January, 2024; v1 submitted 27 June, 2023; originally announced June 2023.

  21. arXiv:2305.19069  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Multi-source adversarial transfer learning for ultrasound image segmentation with limited similarity

    Authors: Yifu Zhang, Hongru Li, Tao Yang, Rui Tao, Zhengyuan Liu, Shimeng Shi, Jiansong Zhang, Ning Ma, Wu** Feng, Zhanhu Zhang, Xinyu Zhang

    Abstract: Lesion segmentation of ultrasound medical images based on deep learning techniques is a widely used method for diagnosing diseases. Although there is a large amount of ultrasound image data in medical centers and other places, labeled ultrasound datasets are a scarce resource, and it is likely that no datasets are available for new tissues/organs. Transfer learning provides the possibility to solv…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Submitted to Applied Soft Computing Journal

  22. arXiv:2305.11438  [pdf, other]

    cs.CL eess.AS

    Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring

    Authors: Kaiqi Fu, Shaojun Gao, Shuju Shi, Xiaohai Tian, Wei Li, Zejun Ma

    Abstract: Speech fluency/disfluency can be evaluated by analyzing a range of phonetic and prosodic features. Deep neural networks are commonly trained to map fluency-related features into the human scores. However, the effectiveness of deep learning-based models is constrained by the limited amount of labeled training samples. To address this, we introduce a self-supervised learning (SSL) approach that take…

    Submitted 19 May, 2023; originally announced May 2023.

  23. arXiv:2305.10983  [pdf, other]

    cs.CV eess.IV

    Assessor360: Multi-sequence Network for Blind Omnidirectional Image Quality Assessment

    Authors: Tianhe Wu, Shuwei Shi, Haoming Cai, Mingdeng Cao, Jing Xiao, Yinqiang Zheng, Yujiu Yang

    Abstract: Blind Omnidirectional Image Quality Assessment (BOIQA) aims to objectively assess the human perceptual quality of omnidirectional images (ODIs) without relying on pristine-quality image information. It is becoming more significant with the increasing advancement of virtual reality (VR) technology. However, the quality assessment of ODIs is severely hampered by the fact that the existing BOIQA pipe…

    Submitted 10 October, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

  24. arXiv:2305.01871  [pdf]

    physics.med-ph eess.IV

    Convolutional neural network-based single-shot speckle tracking for x-ray phase-contrast imaging

    Authors: Serena Qinyun Z. Shi, Nadav Shapira, Peter B. Noël, Sebastian Meyer

    Abstract: X-ray phase-contrast imaging offers enhanced sensitivity for weakly-attenuating materials, such as breast and brain tissue, but has yet to be widely implemented clinically due to high coherence requirements and expensive x-ray optics. Speckle-based phase contrast imaging has been proposed as an affordable and simple alternative; however, obtaining high-quality phase-contrast images requires accura…

    Submitted 2 May, 2023; originally announced May 2023.

  25. arXiv:2302.12511  [pdf, ps, other]

    cs.IT eess.SP

    Two-Stage Hierarchical Beam Training for Near-Field Communications

    Authors: Chenyu Wu, Changsheng You, Yuanwei Liu, Li Chen, Shuo Shi

    Abstract: Extremely large-scale array (XL-array) has emerged as a promising technology to improve the spectrum efficiency and spatial resolution of future wireless systems. However, the huge number of antennas renders the users more likely to be located in the near-field (instead of the far-field) region of the XL-array with spherical wavefront propagation. This inevitably incurs prohibitively high beam trainin…

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: We proposed a novel two-stage hierarchical beam training method for near-field communication systems. This paper has been submitted to IEEE for possible publication

  26. arXiv:2302.10444  [pdf, other]

    eess.AS cs.SD

    Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

    Authors: Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

    Abstract: Recent studies on pronunciation scoring have explored the effect of introducing phone embeddings as reference pronunciation, but mostly in an implicit manner, i.e., addition or concatenation of reference phone embedding and actual pronunciation of the target phone as the phone-level pronunciation quality representation. In this paper, we propose to use linguistic-acoustic similarity to explicitly…

    Submitted 13 March, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted by ICASSP 2023

  27. arXiv:2302.09928  [pdf, other]

    eess.AS

    An ASR-free Fluency Scoring Approach with Self-Supervised Learning

    Authors: Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

    Abstract: A typical fluency scoring system generally relies on an automatic speech recognition (ASR) system to obtain time stamps in input speech for either the subsequent calculation of fluency-related features or directly modeling speech fluency with an end-to-end approach. This paper describes a novel ASR-free approach for automatic fluency assessment using self-supervised learning (SSL). Specifically, w…

    Submitted 13 March, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Accepted by ICASSP 2023

  28. arXiv:2301.07876  [pdf, other]

    eess.SY cs.LG

    Suboptimality analysis of receding horizon quadratic control with unknown linear systems and its applications in learning-based control

    Authors: Shengling Shi, Anastasios Tsiamis, Bart De Schutter

    Abstract: In this work, we aim to analyze how the trade-off between the modeling error, the terminal value function error, and the prediction horizon affects the performance of a nominal receding-horizon linear quadratic (LQ) controller. By developing a novel perturbation result of the Riccati difference equation, a novel performance upper bound is obtained and suggests that for many cases, the prediction h…

    Submitted 8 April, 2024; v1 submitted 18 January, 2023; originally announced January 2023.

  29. arXiv:2209.08209  [pdf, other]

    eess.SY

    RISE-Based Adaptive Control with Mass-Inertia Parameter Estimation for Aerial Transportation of Multi-Rotor UAVs

    Authors: Shuyang Shi, Yuzhu Li, Wei Dong

    Abstract: This paper proposes an adaptive tracking strategy with mass-inertia estimation for aerial transportation problems of multi-rotor UAVs. The dynamic model of multi-rotor UAVs with disturbances is firstly developed with a linearly parameterized form. Subsequently, a cascade controller with the robust integral of the sign of the error (RISE) terms is applied to smooth the control inputs and address bo…

    Submitted 16 September, 2022; originally announced September 2022.

  30. Towards V2I Age-aware Fairness Access: A DQN Based Intelligent Vehicular Node Training and Test Method

    Authors: Qiong Wu, Shuai Shi, Ziyang Wan, Qiang Fan, Pingyi Fan, Cui Zhang

    Abstract: Vehicles on the road exchange data with base station (BS) frequently through vehicle to infrastructure (V2I) communications to ensure the normal use of vehicular applications, where the IEEE 802.11 distributed coordination function (DCF) is employed to allocate a minimum contention window (MCW) for channel access. Each vehicle may change its MCW to achieve more access opportunities at the expense…

    Submitted 3 March, 2023; v1 submitted 2 August, 2022; originally announced August 2022.

    Comments: This paper has been accepted by Chinese Journal of Electronics. Simulation codes have been provided at: https://github.com/qiongwu86/Age-Fairness

  31. arXiv:2207.00792  [pdf, ps, other]

    cs.IT eess.SP

    Two-Timescale Design for STAR-RIS Aided NOMA Systems

    Authors: Chenyu Wu, Changsheng You, Yuanwei Liu, Shuo Shi, Marco Di Renzo

    Abstract: Simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) have emerged as a promising technology for achieving full-space coverage. Prior works on STAR-RISs mostly assumed the full and instantaneous channel state information (CSI) is available, which, however, is practically difficult to obtain due to the large number of elements. To address it, we investigate STAR…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: 30 pages, 10 figures

  32. arXiv:2205.10065  [pdf, ps, other]

    eess.SY

    Approximate Dynamic Programming for Constrained Linear Systems: A Piecewise Quadratic Approximation Approach

    Authors: Kanghui He, Shengling Shi, Ton van den Boom, Bart De Schutter

    Abstract: Approximate dynamic programming (ADP) faces challenges in dealing with constraints in control problems. Model predictive control (MPC) is, in comparison, well-known for its accommodation of constraints and stability guarantees, although its computation is sometimes prohibitive. This paper introduces an approach combining the two methodologies to overcome their individual limitations. The predictiv…

    Submitted 6 April, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

  33. arXiv:2204.08958  [pdf, other]

    cs.CV eess.IV

    MANIQA: Multi-dimension Attention Network for No-Reference Image Quality Assessment

    Authors: Sidi Yang, Tianhe Wu, Shuwei Shi, Shanshan Lao, Yuan Gong, Mingdeng Cao, Jiahao Wang, Yujiu Yang

    Abstract: No-Reference Image Quality Assessment (NR-IQA) aims to assess the perceptual quality of images in accordance with human subjective perception. Unfortunately, existing NR-IQA methods are far from meeting the needs of predicting accurate quality scores on GAN-based distortion images. To this end, we propose Multi-dimension Attention Network for no-reference Image Quality Assessment (MANIQA) to impro…

    Submitted 20 April, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

  34. arXiv:2204.07876  [pdf, other]

    cs.HC eess.SY

    Lodestar: Supporting Independent Learning and Rapid Experimentation Through Data-Driven Analysis Recommendations

    Authors: Deepthi Raghunandan, Zhe Cui, Kartik Krishnan, Segen Tirfe, Shenzhi Shi, Tejaswi Darshan Shrestha, Leilani Battle, Niklas Elmqvist

    Abstract: Keeping abreast of current trends, technologies, and best practices in visualization and data analysis is becoming increasingly difficult, especially for fledgling data scientists. In this paper, we propose Lodestar, an interactive computational notebook that allows users to quickly explore and construct new data science workflows by selecting from a list of automated analysis recommendations. We…

    Submitted 16 April, 2022; originally announced April 2022.

    Comments: This paper was presented as part of the workshop called Visualization in Data Science (at ACM KDD and IEEE VIS)

  35. arXiv:2203.09862  [pdf, other]

    eess.SY cs.LG math.OC

    Finite-sample analysis of identification of switched linear systems with arbitrary or restricted switching

    Authors: Shengling Shi, Othmane Mazhar, Bart De Schutter

    Abstract: For the identification of switched systems with a measured switching signal, this work aims to analyze the effect of switching strategies on the estimation error. The data for identification is assumed to be collected from globally asymptotically or marginally stable switched systems under switches that are arbitrary or subject to an average dwell time constraint. Then the switched system is estim…

    Submitted 28 June, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

  36. Excitation allocation for generic identifiability of linear dynamic networks with fixed modules

    Authors: H. J. Dreef, S. Shi, X. Cheng, M. C. F. Donkers, P. M. J. Van den Hof

    Abstract: Identifiability of linear dynamic networks requires the presence of a sufficient number of external excitation signals. The problem of allocating a minimal number of external signals for guaranteeing generic network identifiability has been recently addressed in the literature. Here we will extend that work by explicitly incorporating the situation that some network modules are known, and thus are…

    Submitted 13 May, 2022; v1 submitted 22 January, 2022; originally announced January 2022.

    Journal ref: IEEE Control Systems Letters, Vol. 6, pp. 2587-2592, 2022

  37. arXiv:2201.08477  [pdf, ps, other]

    eess.SP cs.IT cs.LG

    DDPG-Driven Deep-Unfolding with Adaptive Depth for Channel Estimation with Sparse Bayesian Learning

    Authors: Qiyu Hu, Shuhan Shi, Yunlong Cai, Guanding Yu

    Abstract: Deep-unfolding neural networks (NNs) have received great attention since they achieve satisfactory performance with relatively low complexity. Typically, these deep-unfolding NNs are restricted to a fixed-depth for all inputs. However, the optimal number of layers required for convergence changes with different inputs. In this paper, we first develop a framework of deep deterministic policy gradie…

    Submitted 18 April, 2023; v1 submitted 20 January, 2022; originally announced January 2022.

    Comments: 16 pages, 14 figures

  38. Decentralized Spectrum Access System: Vision, Challenges, and a Blockchain Solution

    Authors: Yang Xiao, Shanghao Shi, Wenjing Lou, Chonggang Wang, Xu Li, Ning Zhang, Y. Thomas Hou, Jeffrey H. Reed

    Abstract: Spectrum access system (SAS) is widely considered the de facto solution to coordinating dynamic spectrum sharing (DSS) and protecting incumbent users. The current SAS paradigm prescribed by the FCC for the CBRS band and standardized by the WInnForum follows a centralized service model in that a spectrum user subscribes to a SAS server for spectrum allocation service. This model, however, neither t…

    Submitted 10 December, 2021; originally announced December 2021.

    Comments: A version of this work has been accepted by IEEE Wireless Communications for publication

    Journal ref: IEEE Wireless Communications (2022)

  39. arXiv:2110.05443  [pdf]

    eess.IV cs.CV

    Spatial-temporal V-Net for automatic segmentation and quantification of right ventricles in gated myocardial perfusion SPECT images

    Authors: Chen Zhao, Shi Shi, Zhuo He, Cheng Wang, Zhongqiang Zhao, Xinli Li, Yanli Zhou, Weihua Zhou

    Abstract: Background. Functional assessment of right ventricle (RV) using gated myocardial perfusion single-photon emission computed tomography (MPS) heavily relies on the precise extraction of right ventricular contours. In this paper, we present a new deep-learning-based model integrating both the spatial and temporal features in gated MPS images to perform the segmentation of the RV epicardium and endoca…

    Submitted 26 December, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: 15 pages, 8 figures

  40. arXiv:2109.06574  [pdf, other]

    eess.SP

    Deep-Unfolding Neural-Network Aided Hybrid Beamforming Based on Symbol-Error Probability Minimization

    Authors: S. Shi, Y. Cai, Q. Hu, B. Champagne, L. Hanzo

    Abstract: In massive multiple-input multiple-output (MIMO) systems, hybrid analog-digital (AD) beamforming can be used to attain a high directional gain without requiring a dedicated radio frequency (RF) chain for each antenna element, which substantially reduces both the hardware costs and power consumption. While massive MIMO transceiver design typically relies on the conventional mean-square error (MSE)…

    Submitted 14 September, 2021; originally announced September 2021.

  41. arXiv:2105.03072  [pdf, other]

    eess.IV cs.CV

    NTIRE 2021 Challenge on Perceptual Image Quality Assessment

    Authors: Jinjin Gu, Haoming Cai, Chao Dong, Jimmy S. Ren, Yu Qiao, Shuhang Gu, Radu Timofte, Manri Cheon, Sungjun Yoon, Byungyeon Kang, Junwoo Lee, Qing Zhang, Haiyang Guo, Yi Bin, Yuqing Hou, Hengliang Luo, Jingyu Guo, Zirui Wang, Hai Wang, Wenming Yang, Qingyan Bai, Shuwei Shi, Weihao Xia, Mingdeng Cao, Jiahao Wang, et al. (25 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2021 challenge on perceptual image quality assessment (IQA), held in conjunction with the New Trends in Image Restoration and Enhancement (NTIRE) workshop at CVPR 2021. As a new type of image processing technology, perceptual image processing algorithms based on Generative Adversarial Networks (GAN) have produced images with more realistic textures. These o…

    Submitted 28 June, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

  42. arXiv:2104.11599  [pdf, other]

    cs.CV eess.IV

    Region-Adaptive Deformable Network for Image Quality Assessment

    Authors: Shuwei Shi, Qingyan Bai, Mingdeng Cao, Weihao Xia, Jiahao Wang, Yifan Chen, Yujiu Yang

    Abstract: Image quality assessment (IQA) aims to assess the perceptual quality of images. The outputs of the IQA algorithms are expected to be consistent with human subjective perception. In image restoration and enhancement tasks, images generated by generative adversarial networks (GAN) can achieve better visual performance than traditional CNN-generated images, although they have spatial shift and textur…

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: CVPR NTIRE Workshop 2021. The first two authors contribute equally to this work. Code is available at https://github.com/IIGROUP/RADN

  43. arXiv:2104.01818  [pdf, other]

    eess.AS

    The Multi-speaker Multi-style Voice Cloning Challenge 2021

    Authors: Qicong Xie, Xiaohai Tian, Guanghou Liu, Kun Song, Lei Xie, Zhiyong Wu, Hai Li, Song Shi, Haizhou Li, Fen Hong, Hui Bu, Xin Xu

    Abstract: The Multi-speaker Multi-style Voice Cloning Challenge (M2VoC) aims to provide a common sizable dataset as well as a fair testbed for the benchmarking of the popular voice cloning task. Specifically, we formulate the challenge to adapt an average TTS model to the stylistic target voice with limited data from the target speaker, evaluated by speaker identity and style similarity. The challenge consists…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: has been accepted to ICASSP 2021

  44. arXiv:2101.08918  [pdf, other]

    cs.IT eess.SP

    Performance Analysis for Cache-enabled Cellular Networks with Cooperative Transmission

    Authors: Tianming Feng, Shuo Shi, Shushi Gu, Ning Zhang, Wei Xiang, Xuemai Gu

    Abstract: The large number of deployed smart devices puts tremendous traffic pressure on networks. Caching at the edge has been widely studied as a promising technique to solve this problem. To further improve the successful transmission probability (STP) of cache-enabled cellular networks (CEN), we combine the cooperative transmission technique with CEN and propose a novel transmission scheme. Local channel…

    Submitted 21 January, 2021; originally announced January 2021.

    Comments: arXiv admin note: text overlap with arXiv:2101.08669

  45. arXiv:2101.05442  [pdf, other]

    eess.IV cs.CV cs.LG

    Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans

    Authors: Xin He, Shihao Wang, Xiaowen Chu, Shaohuai Shi, Jiangping Tang, Xin Liu, Chenggang Yan, Jiyong Zhang, Guiguang Ding

    Abstract: The COVID-19 pandemic has spread globally for several months. Because its transmissibility and high pathogenicity seriously threaten people's lives, it is crucial to accurately and quickly detect COVID-19 infection. Many recent studies have shown that deep learning (DL) based solutions can help detect COVID-19 based on chest CT scans. However, most existing work focuses on 2D datasets, which may r…

    Submitted 12 February, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

    Comments: Accepted by AAAI 2021, COVID-19, Neural Architecture Search, AutoML

  46. arXiv:2012.11414  [pdf, other]

    eess.SY

    Single module identifiability in linear dynamic networks with partial excitation and measurement

    Authors: Shengling Shi, Xiaodong Cheng, Paul M. J. Van den Hof

    Abstract: Identifiability of a single module in a network of transfer functions is determined by whether a particular transfer function in the network can be uniquely distinguished within a network model set, on the basis of data. Whereas previous research has focused on situations where all network signals are either excited or measured, we develop generalized analysis results for the situation of parti…

    Submitted 20 December, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

  47. arXiv:2010.07801  [pdf, other]

    eess.SY q-bio.NC q-bio.QM

    A Bayesian method for inference of effective connectivity in brain networks for detecting the Mozart effect

    Authors: Rik J. C. van Esch, Shengling Shi, Antoine Bernas, Svitlana Zinger, Albert P. Aldenkamp, Paul M. J. Van den Hof

    Abstract: Several studies claim that listening to Mozart music affects cognition and can be used to treat neurological conditions like epilepsy. Research into this Mozart effect has not addressed how dynamic interactions between brain networks, i.e. effective connectivity, are affected. The Granger-causality analysis is often used to infer effective connectivity. First, we investigate if a new method, Bayes…

    Submitted 15 October, 2020; originally announced October 2020.

  48. arXiv:2008.01495  [pdf, other]

    eess.SY

    Generic identifiability of subnetworks in a linear dynamic network: the full measurement case

    Authors: Shengling Shi, Xiaodong Cheng, Paul M. J. Van den Hof

    Abstract: Identifiability conditions for single or multiple modules in a dynamic network specify under which conditions the considered modules can be uniquely recovered from the second-order statistical properties of the measured signals. Conditions for generic identifiability of multiple modules, i.e. a subnetwork, are developed for the situation that all node signals are measured and excitation of the net…

    Submitted 26 October, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

  49. arXiv:2003.06307  [pdf, other]

    cs.DC cs.LG eess.SP

    Communication-Efficient Distributed Deep Learning: A Comprehensive Survey

    Authors: Zhenheng Tang, Shaohuai Shi, Wei Wang, Bo Li, Xiaowen Chu

    Abstract: Distributed deep learning (DL) has become prevalent in recent years to reduce training time by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and datasets. However, system scalability is limited by communication becoming the performance bottleneck. Addressing this communication issue has become a prominent research topic. In this paper, we provide a comprehensive surv…

    Submitted 1 September, 2023; v1 submitted 10 March, 2020; originally announced March 2020.

  50. arXiv:2001.09259  [pdf, ps, other]

    cs.CR eess.SY

    A Blockchain-Based Approach for Saving and Tracking Differential-Privacy Cost

    Authors: Yang Zhao, Jun Zhao, Jiawen Kang, Zehang Zhang, Dusit Niyato, Shuyu Shi

    Abstract: An increasing amount of users' sensitive information is now being collected for analytics purposes. To protect users' privacy, differential privacy has been widely studied in the literature. Specifically, a differentially private algorithm adds noise to the true answer of a query to generate a noisy response. As a result, the information about the dataset leaked by the noisy output is bounded by t…

    Submitted 22 December, 2020; v1 submitted 24 January, 2020; originally announced January 2020.

    Comments: 14 pages, 4 figures