-
OpenMines: A Light and Comprehensive Mining Simulation Environment for Truck Dispatching
Authors:
Shi Meng,
Bin Tian,
Xiaotong Zhang,
Shuangying Qi,
Caiji Zhang,
Qiang Zhang
Abstract:
Mine fleet management algorithms can significantly reduce operational costs and enhance productivity in mining systems. Most current fleet management algorithms are evaluated based on self-implemented or proprietary simulation environments, posing challenges for replication and comparison. This paper models the simulation environment for mine fleet management from a complex systems perspective. Building upon previous work, we introduce probabilistic, user-defined events for random event simulation and implement various evaluation metrics and baselines, effectively reflecting the robustness of fleet management algorithms against unforeseen incidents. We present "OpenMines", an open-source framework encompassing the entire process of mine system modeling, algorithm development, and evaluation, facilitating future algorithm comparison and replication in the field. Code is available at https://github.com/370025263/openmines.
Submitted 31 March, 2024;
originally announced April 2024.
-
Adaptive Unscented Kalman Filter under Minimum Error Entropy with Fiducial Points for Non-Gaussian Systems
Authors:
Boyu Tian,
Haiquan Zhao
Abstract:
The minimum error entropy (MEE) has been extensively used in unscented Kalman filter (UKF) to handle impulsive noises or abnormal measurement data in non-Gaussian systems. However, the MEE-UKF has poor numerical stability due to the inverse operation of singular matrix. In this paper, a novel UKF based on minimum error entropy with fiducial points (MEEF) is proposed to improve the problem of non-positive definite key matrix. By adding the correntropy to the error entropy, the proposed algorithm further enhances the ability of suppressing impulse noise and outliers. At the same time, considering the uncertainty of noise distribution, the modified Sage-Husa estimator of noise statistics is introduced to adaptively update the noise covariance matrix. In addition, the convergence analysis of the proposed algorithm provides a guidance for the selection of kernel width. The robustness and estimation accuracy of the proposed algorithm are manifested by the state tracking examples under complex non-Gaussian noises.
Submitted 18 September, 2023;
originally announced September 2023.
-
WeldMon: A Cost-effective Ultrasonic Welding Machine Condition Monitoring System
Authors:
Beitong Tian,
Kuan-Chieh Lu,
Ahmadreza Eslaminia,
Yaohui Wang,
Chenhui Shao,
Klara Nahrstedt
Abstract:
Ultrasonic welding machines play a critical role in the lithium battery industry, facilitating the bonding of batteries with conductors. Ensuring high-quality welding is vital, making tool condition monitoring systems essential for early-stage quality control. However, existing monitoring methods face challenges in cost, downtime, and adaptability. In this paper, we present WeldMon, an affordable ultrasonic welding machine condition monitoring system that utilizes a custom data acquisition system and a data analysis pipeline designed for real-time analysis. Our classification algorithm combines auto-generated features and hand-crafted features, achieving superior cross-validation accuracy (95.8% on average over all testing tasks) compared to the state-of-the-art method (92.5%) in condition classification tasks. Our data augmentation approach alleviates the concept drift problem, enhancing tool condition classification accuracy by 8.3%. All algorithms run locally, requiring only 385 milliseconds to process data for each welding cycle. We deploy WeldMon and a commercial system on an actual ultrasonic welding machine, performing a comprehensive comparison. Our findings highlight the potential for developing cost-effective, high-performance, and reliable tool condition monitoring systems.
Submitted 4 August, 2023;
originally announced August 2023.
-
Joint Acoustic Echo Cancellation and Speech Dereverberation Using Kalman filters
Authors:
Ziteng Wang,
Yueyue Na,
Biao Tian,
Qiang Fu
Abstract:
This paper proposes a joint acoustic echo cancellation (AEC) and speech dereverberation (DR) algorithm in the short-time Fourier transform domain. The reverberant microphone signals are described using an auto-regressive (AR) model. The AR coefficients and the loudspeaker-to-microphone acoustic transfer functions (ATFs) are considered time-varying and are modeled simultaneously using a first-order Markov process. This leads to a solution where these parameters can be optimally estimated using Kalman filters. It is shown that the proposed algorithm outperforms vanilla solutions that solve AEC and DR sequentially and one state-of-the-art joint DRAEC algorithm based on semi-blind source separation, in terms of both speech quality and echo reduction performance.
Submitted 9 February, 2023;
originally announced February 2023.
-
Small Footprint Multi-channel ConvMixer for Keyword Spotting with Centroid Based Awareness
Authors:
Dianwen Ng,
** Hui Pang,
Yang Xiao,
Biao Tian,
Qiang Fu,
Eng Siong Chng
Abstract:
It is critical for a keyword spotting model to have a small footprint as it typically runs on-device with low computational resources. However, maintaining the previous SOTA performance with reduced model size is challenging. In addition, a far-field and noisy environment with multiple signals interference aggravates the problem causing the accuracy to degrade significantly. In this paper, we present a multi-channel ConvMixer for speech command recognition. The novel architecture introduces an additional audio channel mixing for channel audio interaction in a multi-channel audio setting to achieve better noise-robust features with more efficient computation. Besides, we propose a centroid based awareness component to enhance the system by equipping it with additional spatial geometry information in the latent feature projection space. We evaluate our model using the new MISP challenge 2021 dataset. Our model achieves significant improvement against the official baseline with a 55% gain in the competition score (0.152) on raw microphone array input and a 63% (0.126) boost upon front-end speech enhancement.
Submitted 11 April, 2022;
originally announced April 2022.
-
Multi-Task Deep Residual Echo Suppression with Echo-aware Loss
Authors:
Shimin Zhang,
Ziteng Wang,
Jiayao Sun,
Yihui Fu,
Biao Tian,
Qiang Fu,
Lei Xie
Abstract:
This paper introduces the NWPU Team's entry to the ICASSP 2022 AEC Challenge. We take a hybrid approach that cascades a linear AEC with a neural post-filter. The former is used to deal with the linear echo components while the latter suppresses the residual non-linear echo components. We use gated convolutional F-T-LSTM neural network (GFTNN) as the backbone and shape the post-filter by a multi-task learning (MTL) framework, where a voice activity detection (VAD) module is adopted as an auxiliary task along with echo suppression, with the aim to avoid over suppression that may cause speech distortion. Moreover, we adopt an echo-aware loss function, where the mean square error (MSE) loss can be optimized particularly for every time-frequency bin (TF-bin) according to the signal-to-echo ratio (SER), leading to further suppression on the echo. Extensive ablation study shows that the time delay estimation (TDE) module in neural post-filter leads to better perceptual quality, and an adaptive filter with better convergence will bring consistent performance gain for the post-filter. Besides, we find that using the linear echo as the input of our neural post-filter is a better choice than using the reference signal directly. In the ICASSP 2022 AEC-Challenge, our approach has ranked the 1st place on word accuracy (WAcc) (0.817) and the 3rd place on both mean opinion score (MOS) (4.502) and the final score (0.864).
Submitted 20 February, 2022; v1 submitted 14 February, 2022;
originally announced February 2022.
-
ConvMixer: Feature Interactive Convolution with Curriculum Learning for Small Footprint and Noisy Far-field Keyword Spotting
Authors:
Dianwen Ng,
Yunqi Chen,
Biao Tian,
Qiang Fu,
Eng Siong Chng
Abstract:
Building efficient architecture in neural speech processing is paramount to success in keyword spotting deployment. However, it is very challenging for lightweight models to achieve noise robustness with concise neural operations. In a real-world application, the user environment is typically noisy and may also contain reverberations. We proposed a novel feature interactive convolutional model with merely 100K parameters to tackle this under the noisy far-field condition. The interactive unit is proposed in place of the attention module that promotes the flow of information with more efficient computations. Moreover, curriculum-based multi-condition training is adopted to attain better noise robustness. Our model achieves 98.2% top-1 accuracy on Google Speech Command V2-12 and is competitive against large transformer models under the designed noise condition.
Submitted 15 January, 2022;
originally announced January 2022.
-
Controllable Multichannel Speech Dereverberation based on Deep Neural Networks
Authors:
Ziteng Wang,
Yueyue Na,
Biao Tian,
Qiang Fu
Abstract:
Neural network based speech dereverberation has achieved promising results in recent studies. Nevertheless, many are focused on recovery of only the direct path sound and early reflections, which could be beneficial to speech perception, are discarded. The performance of a model trained to recover clean speech degrades when evaluated on early reverberation targets, and vice versa. This paper proposes a novel deep neural network based multichannel speech dereverberation algorithm, in which the dereverberation level is controllable. This is realized by adding a simple floating-point number as target controller of the model. Experiments are conducted using spatially distributed microphones, and the efficacy of the proposed algorithm is confirmed in various simulated conditions.
Submitted 15 October, 2021;
originally announced October 2021.
-
NN3A: Neural Network supported Acoustic Echo Cancellation, Noise Suppression and Automatic Gain Control for Real-Time Communications
Authors:
Ziteng Wang,
Yueyue Na,
Biao Tian,
Qiang Fu
Abstract:
Acoustic echo cancellation (AEC), noise suppression (NS) and automatic gain control (AGC) are three often required modules for real-time communications (RTC). This paper proposes a neural network supported algorithm for RTC, namely NN3A, which incorporates an adaptive filter and a multi-task model for residual echo suppression, noise reduction and near-end speech activity detection. The proposed algorithm is shown to outperform both a method using separate models and an end-to-end alternative. It is further shown that there exists a trade-off in the model between residual suppression and near-end speech distortion, which could be balanced by a novel loss weighting function. Several practical aspects of training the joint model are also investigated to push its performance to limit.
Submitted 15 October, 2021;
originally announced October 2021.
-
AVA: Adversarial Vignetting Attack against Visual Recognition
Authors:
Binyu Tian,
Felix Juefei-Xu,
Qing Guo,
Xiaofei Xie,
Xiaohong Li,
Yang Liu
Abstract:
Vignetting is an inherited imaging phenomenon within almost all optical systems, showing as a radial intensity darkening toward the corners of an image. Since it is a common effect for photography and usually appears as a slight intensity variation, people usually regard it as a part of a photo and would not even want to post-process it. Due to this natural advantage, in this work, we study vignetting from a new viewpoint, i.e., adversarial vignetting attack (AVA), which aims to embed intentionally misleading information into vignetting and produce a natural adversarial example without noise patterns. This example can fool the state-of-the-art deep convolutional neural networks (CNNs) but is imperceptible to humans. To this end, we first propose the radial-isotropic adversarial vignetting attack (RI-AVA) based on the physical model of vignetting, where the physical parameters (e.g., illumination factor and focal length) are tuned through the guidance of target CNN models. To achieve higher transferability across different CNNs, we further propose radial-anisotropic adversarial vignetting attack (RA-AVA) by allowing the effective regions of vignetting to be radial-anisotropic and shape-free. Moreover, we propose the geometry-aware level-set optimization method to solve the adversarial vignetting regions and physical parameters jointly. We validate the proposed methods on three popular datasets, i.e., DEV, CIFAR10, and Tiny ImageNet, by attacking four CNNs, e.g., ResNet50, EfficientNet-B0, DenseNet121, and MobileNet-V2, demonstrating the advantages of our methods over baseline methods on both transferability and image quality.
Submitted 12 May, 2021;
originally announced May 2021.
-
Joint Online Multichannel Acoustic Echo Cancellation, Speech Dereverberation and Source Separation
Authors:
Yueyue Na,
Ziteng Wang,
Zhang Liu,
Biao Tian,
Qiang Fu
Abstract:
This paper presents a joint source separation algorithm that simultaneously reduces acoustic echo, reverberation and interfering sources. Target speeches are separated from the mixture by maximizing independence with respect to the other sources. It is shown that the separation process can be decomposed into cascading sub-processes that separately relate to acoustic echo cancellation, speech dereverberation and source separation, all of which are solved using the auxiliary function based independent component/vector analysis techniques, and their solving orders are exchangeable. The cascaded solution not only leads to lower computational complexity but also better separation performance than the vanilla joint algorithm.
Submitted 9 April, 2021;
originally announced April 2021.
-
Weighted Recursive Least Square Filter and Neural Network based Residual Echo Suppression for the AEC-Challenge
Authors:
Ziteng Wang,
Yueyue Na,
Zhang Liu,
Biao Tian,
Qiang Fu
Abstract:
This paper presents a real-time Acoustic Echo Cancellation (AEC) algorithm submitted to the AEC-Challenge. The algorithm consists of three modules: Generalized Cross-Correlation with PHAse Transform (GCC-PHAT) based time delay compensation, weighted Recursive Least Square (wRLS) based linear adaptive filtering and neural network based residual echo suppression. The wRLS filter is derived from a novel semi-blind source separation perspective. The neural network model predicts a Phase-Sensitive Mask (PSM) based on the aligned reference and the linear filter output. The algorithm achieved a mean subjective score of 4.00 and ranked 2nd in the AEC-Challenge.
Submitted 18 February, 2021; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Bias Field Poses a Threat to DNN-based X-Ray Recognition
Authors:
Binyu Tian,
Qing Guo,
Felix Juefei-Xu,
Wen Le Chan,
Yupeng Cheng,
Xiaohong Li,
Xiaofei Xie,
Shengchao Qin
Abstract:
The chest X-ray plays a key role in screening and diagnosis of many lung diseases including COVID-19. More recently, many works construct deep neural networks (DNNs) for chest X-ray images to realize automated and efficient diagnosis of lung diseases. However, bias field caused by the improper medical image acquisition process widely exists in the chest X-ray images while the robustness of DNNs to the bias field is rarely explored, which definitely poses a threat to the X-ray-based automated diagnosis system. In this paper, we study this problem based on the recent adversarial attack and propose a brand new attack, i.e., the adversarial bias field attack where the bias field instead of the additive noise works as the adversarial perturbations for fooling the DNNs. This novel attack poses a key problem: how to locally tune the bias field to realize a high attack success rate while maintaining its spatial smoothness to guarantee high realisticity. These two goals contradict each other and thus make the attack significantly challenging. To overcome this challenge, we propose the adversarial-smooth bias field attack that can locally tune the bias field with joint smooth & adversarial constraints. As a result, the adversarial X-ray images can not only fool the DNNs effectively but also retain a very high level of realisticity. We validate our method on real chest X-ray datasets with powerful DNNs, e.g., ResNet50, DenseNet121, and MobileNet, and show different properties to the state-of-the-art attacks in both image realisticity and attack transferability. Our method reveals the potential threat to the DNN-based X-ray automated diagnosis and can definitely benefit the development of bias-field-robust automated diagnosis system.
Submitted 3 May, 2021; v1 submitted 19 September, 2020;
originally announced September 2020.
-
Comparison of Different Methods for Time Sequence Prediction in Autonomous Vehicles
Authors:
Teng Liu,
Bin Tian,
Yunfeng Ai,
Long Chen,
Fei Liu,
Dongpu Cao
Abstract:
As a combination of various kinds of technologies, autonomous vehicles could complete a series of driving tasks by themselves, such as perception, decision-making, planning, and control. Since there is no human driver to handle the emergency situation, future transportation information is significant for automated vehicles. This paper proposes different methods to forecast the time series for autonomous vehicles, which are the nearest neighborhood (NN), fuzzy coding (FC), and long short term memory (LSTM). First, the formulation and operational process for these three approaches are introduced. Then, the vehicle velocity is regarded as a case study and the real-world dataset is utilized to predict future information via these techniques. Finally, the performance, merits, and drawbacks of the presented methods are analyzed and discussed.
Submitted 16 July, 2020;
originally announced July 2020.