Showing 1–49 of 49 results for author: Xia, T

Searching in archive eess.
  1. arXiv:2406.16148  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking

    Authors: Yuwei Zhang, Tong Xia, Jing Han, Yu Wu, Georgios Rizos, Yang Liu, Mohammed Mosuily, Jagmohan Chauhan, Cecilia Mascolo

    Abstract: Respiratory audio, such as coughing and breathing sounds, has predictive power for a wide range of healthcare applications, yet is currently under-explored. The main problem for those applications arises from the difficulty in collecting large labeled task-specific data for model development. Generalizable respiratory acoustic foundation models pretrained with unlabeled data would offer appealing…

    Submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.15846  [pdf, other]

    cs.CL eess.AS

    Revisiting Interpolation Augmentation for Speech-to-Text Generation

    Authors: Chen Xu, Jie Wang, Xiaoqian Liu, Qianqian Dong, Chunliang Zhang, Tong Xiao, Jingbo Zhu, Dapeng Man, Wu Yang

    Abstract: Speech-to-text (S2T) generation systems frequently face challenges in low-resource scenarios, primarily due to the lack of extensive labeled datasets. One emerging solution is constructing virtual training samples by interpolating inputs and labels, which has notably enhanced system generalization in other domains. Despite its potential, this technique's application in S2T tasks has remained under…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Findings

  3. arXiv:2406.00497  [pdf, ps, other]

    cs.SD cs.AI cs.CL eess.AS

    Recent Advances in End-to-End Simultaneous Speech Translation

    Authors: Xiaoqian Liu, Guoqiang Hu, Yangfan Du, Erfeng He, YingFeng Luo, Chen Xu, Tong Xiao, Jingbo Zhu

    Abstract: Simultaneous speech translation (SimulST) is a demanding task that involves generating translations in real-time while continuously processing speech input. This paper offers a comprehensive overview of the recent developments in SimulST research, focusing on four major challenges. Firstly, the complexities associated with processing lengthy and continuous speech streams pose significant hurdles.…

    Submitted 1 June, 2024; originally announced June 2024.

  4. arXiv:2405.12609  [pdf, other]

    eess.AS cs.SD

    Mamba in Speech: Towards an Alternative to Self-Attention

    Authors: Xiangyu Zhang, Qiquan Zhang, Hexin Liu, Tianyi Xiao, Xinyuan Qian, Beena Ahmed, Eliathamby Ambikairajah, Haizhou Li, Julien Epps

    Abstract: Transformer and its derivatives have achieved success in diverse tasks across computer vision, natural language processing, and speech processing. To reduce the complexity of computations within the multi-head self-attention mechanism in Transformer, Selective State Space Models (i.e., Mamba) were proposed as an alternative. Mamba exhibited its effectiveness in natural language processing and comp…

    Submitted 30 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  5. arXiv:2405.08800  [pdf]

    eess.SY

    Estimation of Participation Factors for Power System Oscillation from Measurements

    Authors: Tianwei Xia, Zhe Yu, Kai Sun, Di Shi, Kaiyang Huang

    Abstract: In a power system, when the participation factors of generators are computed to rank their participations into an oscillatory mode, a model-based approach is conventionally used on the linearized system model by means of the corresponding right and left eigenvectors. This paper proposes a new approach for estimating participation factors directly from measurement data on generator responses under…

    Submitted 14 May, 2024; originally announced May 2024.

  6. arXiv:2403.07156  [pdf]

    math.DS eess.SY

    On Uniqueness of Participation Factors

    Authors: Tianwei Xia, Kai Sun

    Abstract: In modal analysis and control of a nonlinear dynamical system, participation factors of state variables with respect to a mode of interest serve as pivotal tools for stability studies. Linear participation factors are uniquely determined by the mode's shape and composition, which are defined by the right and left eigenvectors of the linearized model. For nonlinear participation factors as well as…

    Submitted 11 March, 2024; originally announced March 2024.

  7. arXiv:2401.09674  [pdf, ps, other]

    eess.SY

    QoS-Aware 3D Coverage Deployment of UAVs for Internet of Vehicles in Intelligent Transportation

    Authors: Pengfei Du, Tingyue Xiao, Haotong Cao, Daosen Zhai

    Abstract: It is a challenging problem to characterize the air-to-ground (A2G) channel and identify the best deployment location for 3D UAVs with the QoS awareness. To address this problem, we propose a QoS-aware UAV 3D coverage deployment algorithm, which simulates the three-dimensional urban road scenario, considers the UAV communication resource capacity and vehicle communication QoS requirements comprehe…

    Submitted 17 January, 2024; originally announced January 2024.

  8. arXiv:2401.07681  [pdf, other]

    eess.AS eess.SY

    Effect of target signals and delays on spatially selective active noise control for open-fitting hearables

    Authors: Tong Xiao, Simon Doclo

    Abstract: Spatially selective active noise control (ANC) hearables are designed to reduce unwanted noise from certain directions while preserving desired sounds from other directions. In previous studies, the target signal has been defined either as the delayed desired component in one of the reference microphone signals or as the desired component in the error microphone signal without any delay. In this p…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: ICASSP 2024 (c) 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  9. arXiv:2312.10952  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Soft Alignment of Modality Space for End-to-end Speech Translation

    Authors: Yuhao Zhang, Kaiqi Kou, Bei Li, Chen Xu, Chunliang Zhang, Tong Xiao, Jingbo Zhu

    Abstract: End-to-end Speech Translation (ST) aims to convert speech into target text within a unified model. The inherent differences between speech and text modalities often impede effective cross-modal and cross-lingual transfer. Existing methods typically employ hard alignment (H-Align) of individual speech and text segments, which can degrade textual representations. To address this, we introduce Soft A…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted to ICASSP2024

  10. arXiv:2312.06050  [pdf, other]

    cs.LG eess.IV stat.ML

    Federated Multilinear Principal Component Analysis with Applications in Prognostics

    Authors: Chengyu Zhou, Yuqi Su, Tangbin Xia, Xiaolei Fang

    Abstract: Multilinear Principal Component Analysis (MPCA) is a widely utilized method for the dimension reduction of tensor data. However, the integration of MPCA into federated learning remains unexplored in existing research. To tackle this gap, this article proposes a Federated Multilinear Principal Component Analysis (FMPCA) method, which enables multiple users to collaboratively reduce the dimension of…

    Submitted 28 April, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  11. arXiv:2312.00082  [pdf, other]

    eess.IV cs.CV

    A Compact Implicit Neural Representation for Efficient Storage of Massive 4D Functional Magnetic Resonance Imaging

    Authors: Ruoran Li, Runzhao Yang, Wenxin Xiang, Yuxiao Cheng, Tingxiong Xiao, Jinli Suo

    Abstract: Functional Magnetic Resonance Imaging (fMRI) data is a widely used kind of four-dimensional biomedical data, which requires effective compression. However, fMRI compressing poses unique challenges due to its intricate temporal dynamics, low signal-to-noise ratio, and complicated underlying redundancies. This paper reports a novel compression paradigm specifically tailored for fMRI data based on Im…

    Submitted 29 February, 2024; v1 submitted 30 November, 2023; originally announced December 2023.

  12. arXiv:2311.03810  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Rethinking and Improving Multi-task Learning for End-to-end Speech Translation

    Authors: Yuhao Zhang, Chen Xu, Bei Li, Hao Chen, Tong Xiao, Chunliang Zhang, Jingbo Zhu

    Abstract: Significant improvements in end-to-end speech translation (ST) have been achieved through the application of multi-task learning. However, the extent to which auxiliary tasks are highly consistent with the ST task, and how much this approach truly helps, have not been thoroughly studied. In this paper, we investigate the consistency between different tasks, considering different times and modules.…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Accepted to EMNLP2023 main conference

  13. arXiv:2309.12234  [pdf, ps, other]

    cs.CL eess.AS

    Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech Recognition

    Authors: Chen Xu, Xiaoqian Liu, Erfeng He, Yuhao Zhang, Qianqian Dong, Tong Xiao, Jingbo Zhu, Dapeng Man, Wu Yang

    Abstract: In this study, we present synchronous bilingual Connectionist Temporal Classification (CTC), an innovative framework that leverages dual CTC to bridge the gaps of both modality and language in the speech translation (ST) task. Utilizing transcript and translation as concurrent objectives for CTC, our model bridges the gap between audio and text as well as between source and target languages. Build…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024

  14. arXiv:2306.11646  [pdf, other]

    cs.CL eess.AS

    Recent Advances in Direct Speech-to-text Translation

    Authors: Chen Xu, Rong Ye, Qianqian Dong, Chengqi Zhao, Tom Ko, Mingxuan Wang, Tong Xiao, Jingbo Zhu

    Abstract: Recently, speech-to-text translation has attracted more and more attention and many studies have emerged rapidly. In this paper, we present a comprehensive survey on direct speech translation aiming to summarize the current state-of-the-art techniques. First, we categorize the existing research work into three directions based on the main challenges -- modeling burden, data scarcity, and applicati…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: An expanded version of the paper accepted by IJCAI2023 survey track

  15. arXiv:2306.07650  [pdf, other]

    cs.CL cs.SD eess.AS

    Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation

    Authors: Yuchen Han, Chen Xu, Tong Xiao, Jingbo Zhu

    Abstract: Pre-training and fine-tuning is a paradigm for alleviating the data scarcity problem in end-to-end speech translation (E2E ST). The commonplace "modality gap" between speech and text data often leads to inconsistent inputs between pre-training and fine-tuning. However, we observe that this gap occurs in the early stages of fine-tuning, but does not have a major impact on the final performance. On…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Main Conference

  16. arXiv:2305.20024  [pdf]

    eess.SY

    Cooperative IoT Data Sharing with Heterogeneity of Participants Based on Electricity Retail

    Authors: Bohong Wang, Qinglai Guo, Tian Xia, Qiang Li, Di Liu, Feng Zhao

    Abstract: With the development of Internet of Things (IoT) and big data technology, the data value is increasingly explored in multiple practical scenarios, including electricity transactions. However, the isolation of IoT data among several entities makes it difficult to achieve optimal allocation of data resources and convert data resources into real economic value, thus it is necessary to introduce the I…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: 18 pages, 14 figures

  17. arXiv:2305.05344  [pdf, other]

    eess.IV cs.CV

    Trustworthy Multi-phase Liver Tumor Segmentation via Evidence-based Uncertainty

    Authors: Chuanfei Hu, Tianyi Xia, Ying Cui, Quchen Zou, Yuancheng Wang, Wenbo Xiao, Shenghong Ju, Xinde Li

    Abstract: Multi-phase liver contrast-enhanced computed tomography (CECT) images convey the complementary multi-phase information for liver tumor segmentation (LiTS), which are crucial to assist the diagnosis of liver cancer clinically. However, the performances of existing multi-phase liver tumor segmentation (MPLiTS)-based methods suffer from redundancy and weak interpretability, % of the fused result, res…

    Submitted 20 June, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

  18. arXiv:2304.14612  [pdf, other]

    cs.CV eess.IV

    Local-Global Transformer Enhanced Unfolding Network for Pan-sharpening

    Authors: Mingsong Li, Yikun Liu, Tao Xiao, Yuwen Huang, Gongping Yang

    Abstract: Pan-sharpening aims to increase the spatial resolution of the low-resolution multispectral (LrMS) image with the guidance of the corresponding panchromatic (PAN) image. Although deep learning (DL)-based pan-sharpening methods have achieved promising performance, most of them have a two-fold deficiency. For one thing, the universally adopted black box principle limits the model interpretability. Fo…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: Accepted by IJCAI2023

  19. arXiv:2304.08506  [pdf, other]

    eess.IV cs.CV

    When SAM Meets Medical Images: An Investigation of Segment Anything Model (SAM) on Multi-phase Liver Tumor Segmentation

    Authors: Chuanfei Hu, Tianyi Xia, Shenghong Ju, Xinde Li

    Abstract: Learning to segmentation without large-scale samples is an inherent capability of human. Recently, Segment Anything Model (SAM) performs the significant zero-shot image segmentation, attracting considerable attention from the computer vision community. Here, we investigate the capability of SAM for medical image analysis, especially for multi-phase liver tumor segmentation (MPLiTS), in terms of pr…

    Submitted 21 December, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: Preliminary investigation

  20. arXiv:2304.08320  [pdf]

    eess.SY

    On Fast-Converged Deep Reinforcement Learning for Optimal Dispatch of Large-Scale Power Systems under Transient Security Constraints

    Authors: Tannan Xiao, Ying Chen, Han Diao, Shaowei Huang, Chen Shen

    Abstract: Power system optimal dispatch with transient security constraints is commonly represented as Transient Security-Constrained Optimal Power Flow (TSC-OPF). Deep Reinforcement Learning (DRL)-based TSC-OPF trains efficient decision-making agents that are adaptable to various scenarios and provide solution results quickly. However, due to the high dimensionality of the state space and action spaces, as…

    Submitted 29 January, 2024; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: 10 pages, 11 figures

  21. arXiv:2303.07067  [pdf, other]

    cs.LG cs.DC cs.SD eess.AS

    Cross-device Federated Learning for Mobile Health Diagnostics: A First Study on COVID-19 Detection

    Authors: Tong Xia, Jing Han, Abhirup Ghosh, Cecilia Mascolo

    Abstract: Federated learning (FL) aided health diagnostic models can incorporate data from a large number of personal edge devices (e.g., mobile phones) while keeping the data local to the originating devices, largely ensuring privacy. However, such a cross-device FL approach for health diagnostics still imposes many challenges due to both local data imbalance (as extreme as local data consists of a single…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: This paper has been accepted by IEEE ICASSP 2023

  22. arXiv:2212.01778  [pdf, ps, other]

    eess.AS cs.AI cs.CL cs.SD

    Improving End-to-end Speech Translation by Leveraging Auxiliary Speech and Text Data

    Authors: Yuhao Zhang, Chen Xu, Bojie Hu, Chunliang Zhang, Tong Xiao, Jingbo Zhu

    Abstract: We present a method for introducing a text encoder into pre-trained end-to-end speech translation systems. It enhances the ability of adapting one modality (i.e., source-language speech) to another (i.e., source-language text). Thus, the speech translation model can learn from both unlabeled and labeled data, especially when the source-language text data is abundant. Beyond this, we present a deno…

    Submitted 4 December, 2022; originally announced December 2022.

    Comments: Accepted to AAAI 2023

  23. arXiv:2209.15180  [pdf, other]

    eess.IV cs.CV

    SCI: A Spectrum Concentrated Implicit Neural Compression for Biomedical Data

    Authors: Runzhao Yang, Tingxiong Xiao, Yuxiao Cheng, Qianni Cao, Jinyuan Qu, Jinli Suo, Qionghai Dai

    Abstract: Massive collection and explosive growth of biomedical data, demands effective compression for efficient storage, transmission and sharing. Readily available visual data compression techniques have been studied extensively but tailored for natural images/videos, and thus show limited performance on biomedical data which are of different features and larger diversity. Emerging implicit neural repres…

    Submitted 23 November, 2022; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: accepted to AAAI2023

    ACM Class: I.4.2; I.2.10

  24. arXiv:2208.09997  [pdf, other]

    eess.AS eess.SY

    Spatially Selective Active Noise Control Systems

    Authors: Tong Xiao, Buye Xu, Chuming Zhao

    Abstract: Active noise control (ANC) systems are commonly designed to achieve maximal sound reduction regardless of the incident direction of the sound. When desired sound is present, the state-of-the-art methods add a separate system to reconstruct it. This can result in distortion and latency. In this work, we propose a multi-channel ANC system that only reduces sound from undesired directions, and the sy…

    Submitted 12 May, 2023; v1 submitted 21 August, 2022; originally announced August 2022.

    Comments: The following article has been submitted to the Journal of the Acoustical Society of America (JASA). It has been accepted and published in https://doi.org/10.1121/10.0019336

    Journal ref: J. Acoust. Soc. Am., Vol. 153, No. 5, pp. 2733-2744, 2023

  25. arXiv:2203.07815  [pdf, other]

    eess.IV cs.CV cs.LG

    Adversarial Counterfactual Augmentation: Application in Alzheimer's Disease Classification

    Authors: Tian Xia, Pedro Sanchez, Chen Qin, Sotirios A. Tsaftaris

    Abstract: Due to the limited availability of medical data, deep learning approaches for medical image analysis tend to generalise poorly to unseen data. Augmenting data during training with random transformations has been shown to help and became a ubiquitous technique for training neural networks. Here, we propose a novel adversarial counterfactual augmentation scheme that aims at finding the most \textit{…

    Submitted 1 October, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

  26. arXiv:2203.04815  [pdf]

    eess.SY cs.LG math.DS

    Machine Learning based Optimal Feedback Control for Microgrid Stabilization

    Authors: Tianwei Xia, Kai Sun, Wei Kang

    Abstract: Microgrids have more operational flexibilities as well as uncertainties than conventional power grids, especially when renewable energy resources are utilized. An energy storage based feedback controller can compensate undesired dynamics of a microgrid to improve its stability. However, the optimal feedback control of a microgrid subject to a large disturbance needs to solve a Hamilton-Jacobi-Bell…

    Submitted 9 March, 2022; originally announced March 2022.

    Comments: Accepted by 2022 IEEE PES General Meeting in Denver, CO

  27. arXiv:2203.04808  [pdf]

    math.DS eess.SP eess.SY math.SP

    Time-variant Nonlinear Participation Factors Considering Resonances in Power Systems

    Authors: Tianwei Xia, Kai Sun

    Abstract: The participation factor (PF), as an important modal property for small-signal stability, evaluates the linkage between a state variable and a mode. Applying the normal form theory, a nonlinear PF can be defined to evaluate the participation of a state variable into modal dynamics following a large disturbance, that gives considerations to resonances and nonlinearities up to a desired order. Howev…

    Submitted 9 March, 2022; originally announced March 2022.

    Comments: Accepted by 2022 IEEE PES General Meeting in Denver CO

  28. arXiv:2202.08981  [pdf, other]

    cs.SD cs.LG eess.AS

    A Summary of the ComParE COVID-19 Challenges

    Authors: Harry Coppock, Alican Akman, Christian Bergler, Maurice Gerczuk, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Jing Han, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Panagiotis Tzirakis, Anton Batliner, Cecilia Mascolo, Björn W. Schuller

    Abstract: The COVID-19 pandemic has caused massive humanitarian and economic damage. Teams of scientists from a broad range of disciplines have searched for methods to help governments and communities combat the disease. One avenue from the machine learning field which has been explored is the prospect of a digital mass test which can detect COVID-19 from infected individuals' respiratory sounds. We present…

    Submitted 17 February, 2022; originally announced February 2022.

    Comments: 18 pages, 13 figures

  29. arXiv:2201.01232  [pdf]

    cs.SD cs.LG eess.AS

    Exploring Longitudinal Cough, Breath, and Voice Data for COVID-19 Progression Prediction via Sequential Deep Learning: Model Development and Validation

    Authors: Ting Dang, Jing Han, Tong Xia, Dimitris Spathis, Erika Bondareva, Chloë Siegele-Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Andres Floto, Pietro Cicuta, Cecilia Mascolo

    Abstract: Recent work has shown the potential of using audio data (eg, cough, breathing, and voice) in the screening for COVID-19. However, these approaches only focus on one-off detection and detect the infection given the current audio sample, but do not monitor disease progression in COVID-19. Limited exploration has been put forward to continuously monitor COVID-19 progression, especially recovery, thro…

    Submitted 22 June, 2022; v1 submitted 4 January, 2022; originally announced January 2022.

    Comments: Updated title. Revised format according to journal requirements

  30. Feasibility Study of Neural ODE and DAE Modules for Power System Dynamic Component Modeling

    Authors: Tannan Xiao, Ying Chen, Shaowei Huang, Tirui He, Huizhe Guan

    Abstract: In the context of high penetration of renewables, the need to build dynamic models of power system components based on accessible measurement data has become urgent. To address this challenge, firstly, a neural ordinary differential equations (ODE) module and a neural differential-algebraic equations (DAE) module are proposed to form a data-driven modeling framework that accurately captures compon…

    Submitted 4 July, 2022; v1 submitted 25 October, 2021; originally announced October 2021.

    Comments: 14 pages, 8 figures, 3 table. Under review by IEEE Transactions on Power Systems

  31. Exploration of Artificial Intelligence-oriented Power System Dynamic Simulators

    Authors: Tannan Xiao, Ying Chen, Jianquan Wang, Shaowei Huang, Weilin Tong, Tirui He

    Abstract: With the rapid development of artificial intelligence (AI), it is foreseeable that the accuracy and efficiency of dynamic analysis for future power system will be greatly improved by the integration of dynamic simulators and AI. To explore the interaction mechanism of power system dynamic simulations and AI, a general design of an AI-oriented power system dynamic simulator is proposed, which consi…

    Submitted 6 July, 2022; v1 submitted 3 October, 2021; originally announced October 2021.

    Comments: 10 pages, 8 figures, 1 table. Accepted by Journal of Modern Power System and Clean Energy

  32. arXiv:2109.14956  [pdf]

    eess.IV cs.CV cs.LG

    Comparative Validation of Machine Learning Algorithms for Surgical Workflow and Skill Analysis with the HeiChole Benchmark

    Authors: Martin Wagner, Beat-Peter Müller-Stich, Anna Kisilenko, Duc Tran, Patrick Heger, Lars Mündermann, David M Lubotsky, Benjamin Müller, Tornike Davitashvili, Manuela Capek, Annika Reinke, Tong Yu, Armine Vardazaryan, Chinedu Innocent Nwoye, Nicolas Padoy, Xinyang Liu, Eung-Joo Lee, Constantin Disch, Hans Meine, Tong Xia, Fucang Jia, Satoshi Kondo, Wolfgang Reiter, Yueming Jin, Yonghao Long, et al. (16 additional authors not shown)

    Abstract: PURPOSE: Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance or improve training of surgeons via data-driven feedback. In surgical workflow analysis up to 91% average precision has been reported fo…

    Submitted 30 September, 2021; originally announced September 2021.

  33. arXiv:2106.15523  [pdf, other]

    cs.SD cs.LG eess.AS

    Sounds of COVID-19: exploring realistic performance of audio-based digital testing

    Authors: Jing Han, Tong Xia, Dimitris Spathis, Erika Bondareva, Chloë Brown, Jagmohan Chauhan, Ting Dang, Andreas Grammenos, Apinan Hasthanasombat, Andres Floto, Pietro Cicuta, Cecilia Mascolo

    Abstract: Researchers have been battling with the question of how we can identify Coronavirus disease (COVID-19) cases efficiently, affordably and at scale. Recent work has shown how audio based approaches, which collect respiratory audio data (cough, breathing and voice) can be used for testing, however there is a lack of exploration of how biases and methodological decisions impact these tools' performanc…

    Submitted 29 June, 2021; originally announced June 2021.

  34. arXiv:2105.05752  [pdf, other]

    cs.CL cs.SD eess.AS

    Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders

    Authors: Chen Xu, Bojie Hu, Yanyang Li, Yuhao Zhang, shen huang, Qi Ju, Tong Xiao, Jingbo Zhu

    Abstract: Encoder pre-training is promising in end-to-end Speech Translation (ST), given the fact that speech-to-translation data is scarce. But ST encoders are not simple instances of Automatic Speech Recognition (ASR) or Machine Translation (MT) encoders. For example, we find that ASR encoders lack the global context representation, which is necessary for translation, whereas MT encoders are not designed…

    Submitted 15 June, 2021; v1 submitted 12 May, 2021; originally announced May 2021.

    Comments: ACL 2021

  35. arXiv:2104.02005  [pdf, other]

    cs.SD cs.LG eess.AS

    Uncertainty-Aware COVID-19 Detection from Imbalanced Sound Data

    Authors: Tong Xia, Jing Han, Lorena Qendro, Ting Dang, Cecilia Mascolo

    Abstract: Recently, sound-based COVID-19 detection studies have shown great promise to achieve scalable and prompt digital pre-screening. However, there are still two unsolved issues hindering the practice. First, collected datasets for model training are often imbalanced, with a considerably smaller proportion of users tested positive, making it harder to learn representative and robust features. Second, d…

    Submitted 18 June, 2021; v1 submitted 5 April, 2021; originally announced April 2021.

    Comments: Accepted by INTERSPEECH 2021

  36. arXiv:2102.13468  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates

    Authors: Björn W. Schuller, Anton Batliner, Christian Bergler, Cecilia Mascolo, Jing Han, Iulia Lefter, Heysem Kaya, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Maurice Gerczuk, Panagiotis Tzirakis, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Leon J. M. Rothkrantz, Joeri Zwerts, Jelle Treep, Casper Kaandorp

    Abstract: The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation SubChallenge, a three-way assessment of the level of es…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: 5 pages

    MSC Class: 68 ACM Class: I.2.7; I.5.0; J.3

  37. Exploring Automatic COVID-19 Diagnosis via voice and symptoms from Crowdsourced Data

    Authors: Jing Han, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Cecilia Mascolo

    Abstract: The development of fast and accurate screening tools, which could facilitate testing and prevent more costly clinical tests, is key to the current pandemic of COVID-19. In this context, some initial work shows promise in detecting diagnostic signals of COVID-19 from audio sounds. In this paper, we propose a voice-based framework to automatically detect individuals who have tested positive for COVI…

    Submitted 9 February, 2021; originally announced February 2021.

    Comments: 5 pages, 3 figures, 2 tables, Accepted for publication at ICASSP 2021

  38. arXiv:2012.08931  [pdf]

    eess.IV

    Deep learning for fast MR imaging: a review for learning reconstruction from incomplete k-space data

    Authors: Shanshan Wang, Taohui Xiao, Qiegen Liu, Hairong Zheng

    Abstract: Magnetic resonance imaging is a powerful imaging modality that can provide versatile information but it has a bottleneck problem "slow imaging speed". Reducing the scanned measurements can accelerate MR imaging with the aid of powerful reconstruction methods, which have evolved from linear analytic models to nonlinear iterative ones. The emerging trend in this area is replacing human-defined signa…

    Submitted 15 December, 2020; originally announced December 2020.

    Comments: Invited review submitted to Biomedical signal processing and control in Jan 2020

  39. arXiv:2006.05919  [pdf, other]

    cs.SD cs.LG eess.AS

    Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data

    Authors: Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Jing Han, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Cecilia Mascolo

    Abstract: Audio signals generated by the human body (e.g., sighs, breathing, heart, digestion, vibration sounds) have routinely been used by clinicians as indicators to diagnose disease or assess disease progression. Until recently, such signals were usually collected through manual auscultation at scheduled visits. Research has now started to use digital technology to gather bodily sounds (e.g., from digit…

    Submitted 18 January, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: 9 pages, 6 figures, 2 tables, Accepted for publication at KDD'20 (Health Day)

  40. arXiv:2005.01607  [pdf, other]

    eess.IV cs.CV cs.LG

    Pseudo-healthy synthesis with pathology disentanglement and adversarial learning

    Authors: Tian Xia, Agisilaos Chartsias, Sotirios A. Tsaftaris

    Abstract: Pseudo-healthy synthesis is the task of creating a subject-specific 'healthy' image from a pathological one. Such images can be helpful in tasks such as anomaly detection and understanding changes induced by pathology and disease. In this paper, we present a model that is encouraged to disentangle the information of pathology from what seems to be healthy. We disentangle what appears to be healthy…

    Submitted 18 June, 2021; v1 submitted 20 April, 2020; originally announced May 2020.

    Comments: This paper has been accepted by Medical Image Analysis

  41. arXiv:2003.09651  [pdf]

    eess.SY

    Extended Prony Analysis on Power System Oscillation Under a Near-Resonance Condition

    Authors: Tianwei Xia, Zhe Yu, Kai Sun, Di Shi, Zhiwei Wang

    Abstract: Power system oscillations under a large disturbance often exhibit distorted waveforms as captured by increasingly deployed phasor measurement units. One cause is the occurrence of a near-resonance condition among several dominant modes that are influenced by nonlinear transient dynamics of generators. This paper proposes an Extended Prony Analysis method for measurement-based modal analysis. Based…

    Submitted 21 March, 2020; originally announced March 2020.

    Comments: To be presented at the IEEE PES General Meeting in 2020

  42. arXiv:2003.01950  [pdf, other]

    eess.AS cs.CL cs.SD

    AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment

    Authors: Zhen Zeng, Jianzong Wang, Ning Cheng, Tian Xia, Jing Xiao

    Abstract: Targeting both high efficiency and performance, we propose AlignTTS to predict the mel-spectrum in parallel. AlignTTS is based on a Feed-Forward Transformer which generates mel-spectrum from a sequence of characters, and the duration of each character is determined by a duration predictor. Instead of adopting the attention mechanism in Transformer TTS to align text to mel-spectrum, the alignment…

    Submitted 4 March, 2020; originally announced March 2020.

    Comments: Will be presented at ICASSP 2020

  43. arXiv:1912.02620  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.QM

    Learning to synthesise the ageing brain without longitudinal data

    Authors: Tian Xia, Agisilaos Chartsias, Chengjia Wang, Sotirios A. Tsaftaris

    Abstract: How will my face look when I get older? Or, for a more challenging question: How will my brain look when I get older? To answer this question one must devise (and learn from data) a multivariate auto-regressive function which given an image and a desired target age generates an output image. While collecting data for faces may be easier, collecting longitudinal brain data is not trivial. We propos…

    Submitted 30 September, 2021; v1 submitted 4 December, 2019; originally announced December 2019.

    Journal ref: Medical Image Analysis, 2021, 73: 102169

  44. arXiv:1909.05393  [pdf]

    cs.CV cs.AI cs.ET cs.LG eess.SY

    Automated Blood Cell Detection and Counting via Deep Learning for Microfluidic Point-of-Care Medical Devices

    Authors: Tiancheng Xia, Richard Jiang, YongQing Fu, Nanlin Jin

    Abstract: Automated in-vitro cell detection and counting have been a key theme for artificial and intelligent biological analysis such as biopsy, drug analysis and disease diagnosis. Along with the rapid development of microfluidics and lab-on-chip technologies, in-vitro live cell analysis has been one of the critical tasks for both research and industry communities. However, it is a great challenge to obta…

    Submitted 11 September, 2019; originally announced September 2019.

    Journal ref: Proceeding of 2019 3rd International Conference on Artificial Intelligence Applications and Technologies (AIAAT 2019)

  45. arXiv:1909.03377  [pdf]

    eess.SY cs.SD eess.AS eess.SP

    Ultra-broadband local active noise control with remote acoustic sensing

    Authors: Tong Xiao, Xiaojun Qiu, Benjamin Halkon

    Abstract: One enduring challenge for controlling high frequency sound in local active noise control (ANC) systems is to obtain the acoustic signal at the specific location to be controlled. In some applications such as in ANC headrest systems, it is not practical to install error microphones in a person's ears to provide the user a quiet or optimally acoustically controlled environment. Many virtual error s…

    Submitted 27 November, 2020; v1 submitted 7 September, 2019; originally announced September 2019.

    Report number: 20784

    Journal ref: Sci. Rep. 10 (2020)

  46. arXiv:1908.09140  [pdf]

    eess.IV cs.CV

    LANTERN: learn analysis transform network for dynamic magnetic resonance imaging with small dataset

    Authors: Shanshan Wang, Yanxia Chen, Taohui Xiao, Ziwen Ke, Qiegen Liu, Hairong Zheng

    Abstract: This paper proposes to learn analysis transform network for dynamic magnetic resonance imaging (LANTERN) with small dataset. Integrating the strength of CS-MRI and deep learning, the proposed framework is highlighted in three components: (i) The spatial and temporal domains are sparsely constrained by using adaptively trained CNN. (ii) We introduce an end-to-end framework to learn the parameters i…

    Submitted 24 August, 2019; originally announced August 2019.

  47. arXiv:1908.02054  [pdf, other]

    eess.IV cs.CV

    Model-based Convolutional De-Aliasing Network Learning for Parallel MR Imaging

    Authors: Yanxia Chen, Taohui Xiao, Cheng Li, Qiegen Liu, Shanshan Wang

    Abstract: Parallel imaging has been an essential technique to accelerate MR imaging. Nevertheless, the acceleration rate is still limited due to the ill-condition and challenges associated with the undersampled reconstruction. In this paper, we propose a model-based convolutional de-aliasing network with adaptive parameter learning to achieve accurate reconstruction from multi-coil undersampled k-space data…

    Submitted 6 August, 2019; originally announced August 2019.

  48. arXiv:1906.04359  [pdf, other]

    eess.IV cs.LG stat.ML

    DeepcomplexMRI: Exploiting deep residual network for fast parallel MR imaging with complex convolution

    Authors: Shanshan Wang, Huitao Cheng, Leslie Ying, Taohui Xiao, Ziwen Ke, Xin Liu, Hairong Zheng, Dong Liang

    Abstract: This paper proposes a multi-channel image reconstruction method, named DeepcomplexMRI, to accelerate parallel MR imaging with residual complex convolutional neural network. Different from most existing works which rely on the utilization of the coil sensitivities or prior information of predefined transforms, DeepcomplexMRI takes advantage of the availability of a large number of existing multi-ch…

    Submitted 29 July, 2019; v1 submitted 10 June, 2019; originally announced June 2019.

  49. arXiv:1703.09260  [pdf, other]

    eess.SY cs.LG

    Goal-Driven Dynamics Learning via Bayesian Optimization

    Authors: Somil Bansal, Roberto Calandra, Ted Xiao, Sergey Levine, Claire J. Tomlin

    Abstract: Real-world robots are becoming increasingly complex and commonly act in poorly understood environments where it is extremely challenging to model or learn their true dynamics. Therefore, it might be desirable to take a task-specific approach, wherein the focus is on explicitly learning the dynamics model which achieves the best control performance for the task at hand, rather than learning the tru…

    Submitted 21 September, 2017; v1 submitted 27 March, 2017; originally announced March 2017.

    Comments: This is the extended version of the CDC'17 paper titled "Goal-Driven Dynamics Learning via Bayesian Optimization."