Showing 1–19 of 19 results for author: Ma, N

  1. arXiv:2402.02950  [pdf, other]

    cs.CR eess.SP

    Semantic Entropy Can Simultaneously Benefit Transmission Efficiency and Channel Security of Wireless Semantic Communications

    Authors: Yankai Rong, Guoshun Nan, Minwei Zhang, Sihan Chen, Songtao Wang, Xuefei Zhang, Nan Ma, Shixun Gong, Zhaohui Yang, Qimei Cui, Xiaofeng Tao, Tony Q. S. Quek

    Abstract: Recently proliferated deep learning-based semantic communications (DLSC) focus on how transmitted symbols efficiently convey a desired meaning to the destination. However, the sensitivity of neural models and the openness of wireless channels cause the DLSC system to be extremely fragile to various malicious attacks. This inspires us to ask a question: "Can we further exploit the advantages of tra…

    Submitted 6 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 13 pages, 12 figures

  2. arXiv:2401.15647  [pdf, other]

    cs.CV cs.AI eess.IV

    UP-CrackNet: Unsupervised Pixel-Wise Road Crack Detection via Adversarial Image Restoration

    Authors: Nachuan Ma, Rui Fan, Lihua Xie

    Abstract: Over the past decade, automated methods have been developed to detect cracks more efficiently, accurately, and objectively, with the ultimate goal of replacing conventional manual visual inspection techniques. Among these methods, semantic segmentation algorithms have demonstrated promising results in pixel-wise crack detection tasks. However, training such networks requires a large amount of huma…

    Submitted 6 May, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

  3. arXiv:2310.19817  [pdf, other]

    eess.AS cs.SD

    Intelligibility prediction with a pretrained noise-robust automatic speech recognition model

    Authors: Zehai Tu, Ning Ma, Jon Barker

    Abstract: This paper describes two intelligibility prediction systems derived from a pretrained noise-robust automatic speech recognition (ASR) model for the second Clarity Prediction Challenge (CPC2). One system is intrusive and leverages the hidden representations of the ASR model. The other system is non-intrusive and makes predictions with derived ASR uncertainty. The ASR model is only pretrained with a…

    Submitted 20 October, 2023; originally announced October 2023.

  4. arXiv:2309.02171  [pdf, other]

    cs.IT eess.SP

    A Wideband MIMO Channel Model for Aerial Intelligent Reflecting Surface-Assisted Wireless Communications

    Authors: Shaoyi Liu, Nan Ma, Yaning Chen, Ke Peng, Dongsheng Xue

    Abstract: Compared to traditional intelligent reflecting surfaces (IRS), aerial IRS (AIRS) has unique advantages, such as more flexible deployment and wider service coverage. However, modeling AIRS in the channel presents new challenges due to their mobility. In this paper, a three-dimensional (3D) wideband channel model for AIRS and IRS joint-assisted multiple-input multiple-output (MIMO) communication syst…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: 6 pages, 7 figures

  5. arXiv:2305.19069  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Multi-source adversarial transfer learning for ultrasound image segmentation with limited similarity

    Authors: Yifu Zhang, Hongru Li, Tao Yang, Rui Tao, Zhengyuan Liu, Shimeng Shi, Jiansong Zhang, Ning Ma, Wu** Feng, Zhanhu Zhang, Xinyu Zhang

    Abstract: Lesion segmentation of ultrasound medical images based on deep learning techniques is a widely used method for diagnosing diseases. Although there is a large amount of ultrasound image data in medical centers and other places, labeled ultrasound datasets are a scarce resource, and it is likely that no datasets are available for new tissues/organs. Transfer learning provides the possibility to solv…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Submitted to Applied Soft Computing Journal

  6. arXiv:2305.13616  [pdf]

    eess.IV

    An Entire Renal Anatomy Extraction Network for Advanced CAD During Partial Nephrectomy

    Authors: Nan Ma, Ying Yang, Dongkai Zhou

    Abstract: Partial nephrectomy (PN) is a common surgery in urology. Digitization of renal anatomies brings much help to many computer-aided diagnosis (CAD) techniques during PN. However, the manual delineation of the kidney vascular system and tumor on each slice is time-consuming, error-prone, and inconsistent. Therefore, we proposed an entire renal anatomies extraction method from Computed Tomographic Angiograph…

    Submitted 22 May, 2023; originally announced May 2023.

  7. arXiv:2205.09377  [pdf, other]

    cs.IT eess.SP

    Coexistence between Task- and Data-Oriented Communications: A Whittle's Index Guided Multi-Agent Reinforcement Learning Approach

    Authors: Ran Li, Chuan Huang, Xiaoqi Qin, Shengpei Jiang, Nan Ma, Shuguang Cui

    Abstract: We investigate the coexistence of task-oriented and data-oriented communications in an IoT system that shares a group of channels, and study the scheduling problem to jointly optimize the weighted age of incorrect information (AoII) and throughput, which are the performance metrics of the two types of communications, respectively. This problem is formulated as a Markov decision problem, which is di…

    Submitted 19 May, 2022; originally announced May 2022.

  8. arXiv:2204.04288  [pdf, other]

    eess.AS cs.SD

    Unsupervised Uncertainty Measures of Automatic Speech Recognition for Non-intrusive Speech Intelligibility Prediction

    Authors: Zehai Tu, Ning Ma, Jon Barker

    Abstract: Non-intrusive intelligibility prediction is important for its application in realistic scenarios, where a clean reference signal is difficult to access. The construction of many non-intrusive predictors requires either ground truth intelligibility labels or clean reference signals for supervised learning. In this work, we leverage an unsupervised uncertainty estimation method for predicting speech…

    Submitted 6 July, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

    Comments: Accepted to INTERSPEECH2022

  9. arXiv:2204.04287  [pdf, other]

    eess.AS cs.SD q-bio.QM

    Exploiting Hidden Representations from a DNN-based Speech Recogniser for Speech Intelligibility Prediction in Hearing-impaired Listeners

    Authors: Zehai Tu, Ning Ma, Jon Barker

    Abstract: An accurate objective speech intelligibility prediction algorithm is of great interest for many applications such as speech enhancement for hearing aids. Most algorithms measure the signal-to-noise ratios or correlations between the acoustic features of clean reference signals and degraded signals. However, these hand-picked acoustic features are usually not explicitly correlated with recognitio…

    Submitted 6 July, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

    Comments: Accepted to INTERSPEECH2022

  10. arXiv:2204.04284  [pdf, other]

    eess.AS cs.SD

    Auditory-Based Data Augmentation for End-to-End Automatic Speech Recognition

    Authors: Zehai Tu, Jack Deadman, Ning Ma, Jon Barker

    Abstract: End-to-end models have achieved significant improvement on automatic speech recognition. One common method to improve performance of these models is expanding the data-space through data augmentation. Meanwhile, human auditory inspired front-ends have also demonstrated improvement for automatic speech recognisers. In this work, a well-verified auditory-based model, which can simulate various heari…

    Submitted 8 April, 2022; originally announced April 2022.

  11. arXiv:2106.04639  [pdf, other]

    cs.SD eess.AS

    Optimising Hearing Aid Fittings for Speech in Noise with a Differentiable Hearing Loss Model

    Authors: Zehai Tu, Ning Ma, Jon Barker

    Abstract: Current hearing aids normally provide amplification based on a general prescriptive fitting, and the benefits provided by the hearing aids vary among different listening environments despite the inclusion of a noise suppression feature. Motivated by this fact, this paper proposes a data-driven machine learning technique to develop hearing aid fittings that are customised to speech in different noisy…

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: Accepted to Interspeech 2021

  12. arXiv:2103.09030  [pdf, other]

    cs.CV eess.IV

    A Large-Scale Dataset for Benchmarking Elevator Button Segmentation and Character Recognition

    Authors: Jianbang Liu, Yuqi Fang, Delong Zhu, Nachuan Ma, ** Pan, Max Q. -H. Meng

    Abstract: Human activities have recently been hugely restricted by COVID-19. Robots that can conduct inter-floor navigation attract much public attention, since they can substitute for human workers in conducting service work. However, current robots either depend on human assistance or elevator retrofitting, and fully autonomous inter-floor navigation is still not available. As the very first step of inter-floor n…

    Submitted 22 March, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

  13. arXiv:2012.03166  [pdf, other]

    cs.RO cs.AI eess.IV

    Conditional Generative Adversarial Networks for Optimal Path Planning

    Authors: Nachuan Ma, Jiankun Wang, Max Q. -H. Meng

    Abstract: Path planning plays an important role in autonomous robot systems. Effective understanding of the surrounding environment and efficient generation of an optimal collision-free path are both critical parts of solving the path planning problem. Although conventional sampling-based algorithms, such as the rapidly-exploring random tree (RRT) and its improved optimal version (RRT*), have been widely used in…

    Submitted 5 December, 2020; originally announced December 2020.

  14. arXiv:2003.14022  [pdf, ps, other]

    math.OC eess.SP

    Distributed Noise Covariance Matrices Estimation in Sensor Networks

    Authors: Jiahong Li, Nan Ma, Fang Deng

    Abstract: Adaptive algorithms based on in-network processing over networks are useful for online parameter estimation of historical data (e.g., noise covariance) in predictive control and machine learning areas. This paper focuses on the distributed noise covariance matrices estimation problem for multi-sensor linear time-invariant (LTI) systems. Conventional noise covariance estimation approaches, e.g., au…

    Submitted 31 March, 2020; originally announced March 2020.

    Comments: 6 pages, 5 figures

  15. arXiv:1912.11774  [pdf, other]

    cs.RO cs.CV eess.IV

    Autonomous Removal of Perspective Distortion for Robotic Elevator Button Recognition

    Authors: Delong Zhu, Jianbang Liu, Nachuan Ma, Zhe Min, Max Q. -H. Meng

    Abstract: Elevator button recognition is considered an indispensable function for enabling the autonomous elevator operation of mobile robots. However, due to unfavorable image conditions and various image distortions, the recognition accuracy remains to be improved. In this paper, we present a novel algorithm that can autonomously correct perspective distortions of elevator panel images. The algorithm firs…

    Submitted 25 December, 2019; originally announced December 2019.

  16. Robust Binaural Localization of a Target Sound Source by Combining Spectral Source Models and Deep Neural Networks

    Authors: Ning Ma, Jose A. Gonzalez, Guy J. Brown

    Abstract: Despite there being clear evidence for top-down (e.g., attentional) effects in biological spatial hearing, relatively few machine hearing systems exploit top-down model-based knowledge in sound localisation. This paper addresses this issue by proposing a novel framework for binaural sound localisation that combines model-based information about the spectral characteristics of sound sources and dee…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: 10 pages

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 11, pp. 2122-2131, 2018

  17. Exploiting Deep Neural Networks and Head Movements for Robust Binaural Localisation of Multiple Sources in Reverberant Environments

    Authors: Ning Ma, Tobias May, Guy J. Brown

    Abstract: This paper presents a novel machine-hearing system that exploits deep neural networks (DNNs) and head movements for robust binaural localisation of multiple sources in reverberant environments. DNNs are used to learn the relationship between the source azimuth and binaural cues, consisting of the complete cross-correlation function (CCF) and interaural level differences (ILDs). In contrast to many…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: 10 pages

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 12, pp. 2444-2453, 2017

  18. arXiv:1904.02992  [pdf, other]

    eess.AS cs.SD

    Deep Learning Features for Robust Detection of Acoustic Events in Sleep-Disordered Breathing

    Authors: Hector E. Romero, Ning Ma, Guy J. Brown, Amy V. Beeston, Madina Hasan

    Abstract: Sleep-disordered breathing (SDB) is a serious and prevalent condition, and acoustic analysis via consumer devices (e.g. smartphones) offers a low-cost solution to screening for it. We present a novel approach for the acoustic identification of SDB sounds, such as snoring, using bottleneck features learned from a corpus of whole-night sound recordings. Two types of bottleneck features are described…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: Accepted by IEEE ICASSP 2018

  19. arXiv:1904.01916  [pdf, other]

    cs.SD eess.AS

    End-to-end Binaural Sound Localisation from the Raw Waveform

    Authors: Paolo Vecchiotti, Ning Ma, Stefano Squartini, Guy J. Brown

    Abstract: A novel end-to-end binaural sound localisation approach is proposed which estimates the azimuth of a sound source directly from the waveform. Instead of employing hand-crafted features commonly employed for binaural sound localisation, such as the interaural time and level difference, our end-to-end system approach uses a convolutional neural network (CNN) to extract specific features from the wav…

    Submitted 3 April, 2019; originally announced April 2019.

    Comments: Accepted by ICASSP 2019