
Showing 1–47 of 47 results for author: Bu, H

  1. arXiv:2406.19959  [pdf, other]

    cs.SD eess.AS

    RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization

    Authors: Bing Yang, Changsheng Quan, Yabo Wang, Pengyu Wang, Yujie Yang, Ying Fang, Nian Shao, Hui Bu, Xin Xu, Xiaofei Li

    Abstract: The training of deep learning-based multichannel speech enhancement and source localization systems relies heavily on the simulation of room impulse response and multichannel diffuse noise, due to the lack of large-scale real-recorded datasets. However, the acoustic mismatch between simulated and real-world data could degrade the model performance when applied in real-world scenarios. To bridge t…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2406.10304  [pdf, other]

    cs.CL

    Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design

    Authors: Ming Gao, Hang Chen, Jun Du, Xin Xu, Hongxiao Guo, Hui Bu, Jianxing Yang, Ming Li, Chin-Hui Lee

    Abstract: Smart home technology has gained widespread adoption, facilitating effortless control of devices through voice commands. However, individuals with dysarthria, a motor speech disorder, face challenges due to the variability of their speech. This paper addresses the wake-up word spotting (WWS) task for dysarthric individuals, aiming to integrate them into real-world applications. To support this, we…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: to be published in Interspeech 2024

  3. arXiv:2406.07256  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection

    Authors: Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin, Binbin Zhang, Jun Du, Jia Bin, Ming Li

    Abstract: The rapid advancements in speech technologies over the past two decades have led to human-level performance in tasks like automatic speech recognition (ASR) for fluent speech. However, the efficacy of these models diminishes when applied to atypical speech, such as stuttering. This paper introduces AS-70, the first publicly available Mandarin stuttered speech dataset, which stands out as the large…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  4. Synthesis, disorder and Ising anisotropy in a new spin liquid candidate PrMgAl$_{11}$O$_{19}$

    Authors: Yantao Cao, Huanpeng Bu, Zhendong Fu, Jinkui Zhao, Jason S. Gardner, Zhongwen Ouyang, Zhaoming Tian, Zhiwei Li, Hanjie Guo

    Abstract: Here we report the successful synthesis of large single crystals of triangular frustrated PrMgAl$_{11}$O$_{19}$ using the optical floating zone technique. Single crystal X-ray diffraction measurements unveiled the presence of quenched disorder within the mirror plane, specifically $\sim$7\% of Pr ions deviating from the ideal 2\textit{d} site towards the 6\textit{h} site. Magnetic susceptibility m…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 4 figures, 14 pages

    Journal ref: Materials Futures 2024

  5. arXiv:2405.03644  [pdf, other]

    cs.CR cs.AI

    When LLMs Meet Cybersecurity: A Systematic Literature Review

    Authors: Jie Zhang, Haoyu Bu, Hui Wen, Yu Chen, Lun Li, Hongsong Zhu

    Abstract: The rapid advancements in large language models (LLMs) have opened new avenues across various fields, including cybersecurity, which faces an ever-evolving threat landscape and need for innovative technologies. Despite initial explorations into the application of LLMs in cybersecurity, there is a lack of a comprehensive overview of this research area. This paper bridges this gap by providing a syst…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 36 pages, 7 figures

  6. arXiv:2401.13318  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Magnetic structure and Ising-like antiferromagnetism in the bilayer triangular lattice compound NdZnPO

    Authors: Han Ge, Tiantian Li, S. E. Nikitin, Nan Zhao, Fangli Li, Huanpeng Bu, Jiayue Yuan, Jian Chen, Ying Fu, Jiong Yang, Le Wang, Ping Miao, Qiang Zhang, Ines Puente-Orench, Andrey Podlesnyak, Jieming Sheng, Liusuo Wu

    Abstract: The complex interplay of spin frustration and quantum fluctuations in low-dimensional quantum materials leads to a variety of intriguing phenomena. This research focuses on a detailed analysis of the magnetic behavior exhibited by NdZnPO, a bilayer spin-1/2 triangular lattice antiferromagnet. The investigation employs magnetization, specific heat, and powder neutron scattering measurements. At zer…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 11 pages, 6 figures

  7. arXiv:2401.03473  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge

    Authors: He Wang, Pengcheng Guo, Yue Li, Ao Zhang, Jiayao Sun, Lei Xie, Wei Chen, Pan Zhou, Hui Bu, Xin Xu, Binbin Zhang, Zhuo Chen, Jian Wu, Longbiao Wang, Eng Siong Chng, Sun Li

    Abstract: To promote speech processing and recognition research in driving scenarios, we build on the success of the Intelligent Cockpit Speech Recognition Challenge (ICSRC) held at ISCSLP 2022 and launch the ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge. This challenge collects over 100 hours of multi-channel speech data recorded inside a new energy vehicle and 40 hours…

    Submitted 20 February, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

    Comments: Accepted at ICASSP 2024

  8. arXiv:2401.01735  [pdf, other]

    cs.GT

    Economics Arena for Large Language Models

    Authors: Shangmin Guo, Haoran Bu, Haochuan Wang, Yi Ren, Dianbo Sui, Yuming Shang, Siting Lu

    Abstract: Large language models (LLMs) have been extensively used as the backbones for general-purpose agents, and some economics literature suggests that LLMs are capable of playing various types of economics games. Following these works, to overcome the limitation of evaluating LLMs using static benchmarks, we propose to explore competitive games as an evaluation for LLMs to incorporate multi-players and d…

    Submitted 3 January, 2024; originally announced January 2024.

  9. arXiv:2312.06454  [pdf, other]

    eess.IV cs.CV cs.LG

    Point Transformer with Federated Learning for Predicting Breast Cancer HER2 Status from Hematoxylin and Eosin-Stained Whole Slide Images

    Authors: Bao Li, Zhenyu Liu, Lizhi Shao, Bensheng Qiu, Hong Bu, Jie Tian

    Abstract: Directly predicting human epidermal growth factor receptor 2 (HER2) status from widely available hematoxylin and eosin (HE)-stained whole slide images (WSIs) can reduce technical costs and expedite treatment selection. Accurately predicting HER2 requires large collections of multi-site WSIs. Federated learning enables collaborative training of these WSIs without gigabyte-size WSIs transportation a…

    Submitted 27 February, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

  10. arXiv:2309.13573  [pdf, other]

    cs.SD eess.AS

    The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR

    Authors: Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu

    Abstract: With the success of the first Multi-channel Multi-party Meeting Transcription challenge (M2MeT), the second M2MeT challenge (M2MeT 2.0) held in ASRU2023 particularly aims to tackle the complex task of speaker-attributed ASR (SA-ASR), which directly addresses the practical and challenging problem of "who spoke what at when" in typical meeting scenarios. We particularly established two sub-tr…

    Submitted 5 October, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: 8 pages, Accepted by ASRU2023

  11. arXiv:2307.02709  [pdf]

    cs.AI

    Validation of the Practicability of Logical Assessment Formula for Evaluations with Inaccurate Ground-Truth Labels

    Authors: Yongquan Yang, Hong Bu

    Abstract: Logical assessment formula (LAF) is a new theory proposed for evaluations with inaccurate ground-truth labels (IAGTLs) to assess the predictive models for various artificial intelligence applications. However, the practicability of LAF for evaluations with IAGTLs has not yet been validated in real-world practice. In this paper, to address this issue, we applied LAF to tumour segmentation for breas…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2110.11567

  12. arXiv:2306.10805  [pdf]

    physics.med-ph cs.CV eess.IV

    Experts' cognition-driven ensemble deep learning for external validation of predicting pathological complete response to neoadjuvant chemotherapy from histological images in breast cancer

    Authors: Yongquan Yang, Fengling Li, Yani Wei, Yuanyuan Zhao, **g Fu, Xiuli Xiao, Hong Bu

    Abstract: In breast cancer imaging, there has been a trend to directly predict pathological complete response (pCR) to neoadjuvant chemotherapy (NAC) from histological images based on deep learning (DL). However, it has been a commonly known problem that the constructed DL-based models numerically have better performances in internal validation than in external validation. The primary reason for this situat…

    Submitted 19 June, 2023; originally announced June 2023.

  13. arXiv:2305.08669  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci cond-mat.other cond-mat.supr-con

    Synthesis and physical properties of Ce$_2$Rh$_{3+δ}$Sb$_4$ single crystals

    Authors: Kangqiao Cheng, Shuo Zou, Huanpeng Bu, Jiawen Zhang, Shijie Song, Hanjie Guo, Huiqiu Yuan, Yongkang Luo

    Abstract: Millimeter-sized Ce$_2$Rh$_{3+δ}$Sb$_4$ ($δ\approx 1/8$) single crystals were synthesized by a Bi-flux method and their physical properties were studied by a combination of electrical transport, magnetic and thermodynamic measurements. The resistivity anisotropy $ρ_{a,b}/ρ_{c}\sim2$, manifesting a quasi-one-dimensional electronic character. Magnetic susceptibility measurements confirm…

    Submitted 16 August, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: 7 pages, 4 figures, 2 tables

    Journal ref: Phys. Rev. Mater. 7, 084404 (2023)

  14. arXiv:2304.07295  [pdf]

    q-bio.QM cs.AI eess.IV

    Experts' cognition-driven safe noisy labels learning for precise segmentation of residual tumor in breast cancer

    Authors: Yongquan Yang, Jie Chen, Yani Wei, Mohammad Alobaidi, Hong Bu

    Abstract: Precise segmentation of residual tumor in breast cancer (PSRTBC) after neoadjuvant chemotherapy is a fundamental key technique in the treatment process of breast cancer. However, achieving PSRTBC is still a challenge, since the breast cancer tissue and tumor cells commonly have complex and varied morphological changes after neoadjuvant chemotherapy, which inevitably increases the difficulty to pro…

    Submitted 12 April, 2023; originally announced April 2023.

  15. arXiv:2303.15790  [pdf, other]

    hep-ex hep-ph physics.ins-det

    STCF Conceptual Design Report: Volume 1 -- Physics & Detector

    Authors: M. Achasov, X. C. Ai, R. Aliberti, L. P. An, Q. An, X. Z. Bai, Y. Bai, O. Bakina, A. Barnyakov, V. Blinov, V. Bobrovnikov, D. Bodrov, A. Bogomyagkov, A. Bondar, I. Boyko, Z. H. Bu, F. M. Cai, H. Cai, J. J. Cao, Q. H. Cao, Z. Cao, Q. Chang, K. T. Chao, D. Y. Chen, H. Chen, et al. (413 additional authors not shown)

    Abstract: The Super $τ$-Charm facility (STCF) is an electron-positron collider proposed by the Chinese particle physics community. It is designed to operate in a center-of-mass energy range from 2 to 7 GeV with a peak luminosity of $0.5\times 10^{35}{\rm cm}^{-2}{\rm s}^{-1}$ or higher. The STCF will produce a data sample about a factor of 100 larger than that by the present $τ$-Charm factory -- the BEPCII,…

    Submitted 5 October, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

    Journal ref: Front. Phys. 19(1), 14701 (2024)

  16. arXiv:2302.09217  [pdf, other]

    q-bio.QM stat.AP

    Identify local limiting factors of species distribution using min-linear logistic regression

    Authors: Hongliang Bu, Yunyi Shen

    Abstract: Logistic regression is a commonly used building block in ecological modeling, but its additive structure among environmental predictors often assumes compensatory relationships between predictors, which can lead to problematic results. In reality, the distribution of species is often determined by the least-favored factor, according to von Liebig's Law of the Minimum, which is not addressed in mod…

    Submitted 17 February, 2023; originally announced February 2023.

  17. arXiv:2211.01585  [pdf, other]

    cs.SD eess.AS

    The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results

    Authors: Ao Zhang, Fan Yu, Kaixun Huang, Lei Xie, Longbiao Wang, Eng Siong Chng, Hui Bu, Binbin Zhang, Wei Chen, Xin Xu

    Abstract: This paper summarizes the outcomes from the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC). We first address the necessity of the challenge and then introduce the associated dataset collected from a new-energy vehicle (NEV) covering a variety of cockpit acoustic conditions and linguistic contents. We then describe the track arrangement and the baseline system. Specifically, w…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: Accepted by ISCSLP2022

  18. arXiv:2209.05273  [pdf, other]

    eess.AS

    The 2022 Far-field Speaker Verification Challenge: Exploring domain mismatch and semi-supervised learning under the far-field scenario

    Authors: Xiaoyi Qin, Ming Li, Hui Bu, Shrikanth Narayanan, Haizhou Li

    Abstract: FFSVC2022 is the second challenge of far-field speaker verification. FFSVC2022 provides the fully-supervised far-field speaker verification to further explore the far-field scenario and proposes semi-supervised far-field speaker verification. In contrast to FFSVC2020, FFSVC2022 focuses on the single-channel scenario. In addition, a supplementary set for the FFSVC2020 dataset is released this year. T…

    Submitted 15 September, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

  19. Gapless triangular-lattice spin-liquid candidate in PrZnAl$_{11}$O$_{19}$

    Authors: Huanpeng Bu, Malik Ashtar, Toni Shiroka, Helen C. Walker, Zhendong Fu, Jinkui Zhao, Jason S. Gardner, Gang Chen, Zhaoming Tian, Hanjie Guo

    Abstract: A quantum spin liquid (QSL) is an exotic state in which electron spins are highly entangled, yet keep fluctuating even at zero temperature. Experimental realization of model QSLs has been challenging due to imperfections, such as antisite disorder, strain, and extra or a lack of interactions in real materials compared to the model Hamiltonian. Here we report the magnetic susceptibility, thermodyna…

    Submitted 5 September, 2022; v1 submitted 26 July, 2022; originally announced July 2022.

    Comments: 8 pages, 6 figures

    Journal ref: Phys. Rev. B 106, 134428 (2022)

  20. Anomalous ferromagnetic behavior in the orthorhombic Li$_3$Co$_2$SbO$_6$

    Authors: Qianhui Duan, Huanpeng Bu, Vladimir Pomjakushin, Hubertus Luetkens, Yuke Li, Jinkui Zhao, Jason S. Gardner, Hanjie Guo

    Abstract: Monoclinic Li$_3$Co$_2$SbO$_6$ has been proposed as a Kitaev spin liquid candidate and investigated intensively, whereas the properties of its polymorph, the orthorhombic phase, are less known. Here we report the magnetic properties of the orthorhombic Li$_3$Co$_2$SbO$_6$ as revealed by dc and ac magnetic susceptibility, muon spin relaxation ($μ$SR) and neutron diffraction measurements. Successive…

    Submitted 22 June, 2022; originally announced June 2022.

    Comments: 9 pages, 9 figures. Accepted by Inorg. Chem.

    Journal ref: Inorg. Chem. 61, 10880-10887 (2022)

  21. arXiv:2206.05635  [pdf, other]

    cond-mat.supr-con cond-mat.mtrl-sci cond-mat.str-el

    La$_2$Rh$_{3+δ}$Sb$_4$: A new ternary superconducting rhodium-antimonide

    Authors: Kangqiao Cheng, Wei Xie, Shuo Zou, Huanpeng Bu, Jin-Ke Bao, Zengwei Zhu, Hanjie Guo, Chao Cao, Yongkang Luo

    Abstract: Rhodium-containing compounds offer a fertile playground to explore novel materials with superconductivity and other fantastic electronic correlation effects. A new ternary rhodium-antimonide La$_2$Rh$_{3+δ}$Sb$_4$ ($δ\approx 1/8$) has been synthesized by a Bi-flux method. It crystallizes in the orthorhombic Pr$_2$Ir$_3$Sb$_4$-like structure, with the space group $Pnma$ (No. 62). The crystalline st…

    Submitted 13 October, 2022; v1 submitted 11 June, 2022; originally announced June 2022.

    Comments: 15+1 pages, 7+1 figures, 2+1 tables

    Journal ref: Materials Futures 1, 045201 (2022)

  22. arXiv:2205.03850  [pdf, other]

    cs.CR cs.LG eess.SY

    SeqNet: An Efficient Neural Network for Automatic Malware Detection

    Authors: Jiawei Xu, Wenxuan Fu, Haoyu Bu, Zhi Wang, Lingyun Ying

    Abstract: Malware continues to evolve rapidly, and more than 450,000 new samples are captured every day, which makes manual malware analysis impractical. However, existing deep learning detection models need manual feature engineering or require high computational overhead for long training processes, which might be laborious to select feature space and difficult to retrain for mitigating model aging. There…

    Submitted 8 May, 2022; originally announced May 2022.

  23. Single crystal growth and superconductivity in RbNi$_2$Se$_2$

    Authors: Hui Liu, Xunwu Hu, Hanjie Guo, Xiao-Kun Teng, Huanpeng Bu, Zhihui Luo, Lisi Li, Zengjia Liu, Mengwu Huo, Feixiang Liang, Hualei Sun, Bing Shen, Pengcheng Dai, Robert J. Birgeneau, Dao-Xin Yao, Ming Yi, Meng Wang

    Abstract: We report the synthesis and characterization of RbNi$_2$Se$_2$, an analog of the iron chalcogenide superconductor Rb$_x$Fe$_2$Se$_2$, via transport, angle resolved photoemission spectroscopy, and density functional theory calculations. A superconducting transition at $T_{c}$ = 1.20 K is identified. In the normal state, RbNi$_2$Se$_2$ shows paramagnetic and Fermi liquid behaviors. A large Sommerfeld co…

    Submitted 29 April, 2022; originally announced May 2022.

    Comments: 7 pages, 4 figures

  24. arXiv:2202.03647  [pdf, other]

    cs.SD eess.AS

    Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge

    Authors: Fan Yu, Shiliang Zhang, Pengcheng Guo, Yihui Fu, Zhihao Du, Siqi Zheng, Weilong Huang, Lei Xie, Zheng-Hua Tan, DeLiang Wang, Yanmin Qian, Kong Aik Lee, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu

    Abstract: The ICASSP 2022 Multi-channel Multi-party Meeting Transcription Grand Challenge (M2MeT) focuses on one of the most valuable and the most challenging scenarios of speech technologies. The M2MeT challenge has particularly set up two tracks, speaker diarization (track 1) and multi-speaker automatic speech recognition (ASR) (track 2). Along with the challenge, we released 120 hours of real-recorded Ma…

    Submitted 25 February, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

    Comments: Accepted by ICASSP 2022

  25. One-Step Abductive Multi-Target Learning with Diverse Noisy Samples and Its Application to Tumour Segmentation for Breast Cancer

    Authors: Yongquan Yang, Fengling Li, Yani Wei, Jie Chen, Ning Chen, Mohammad H. Alobaidi, Hong Bu

    Abstract: Recent studies have demonstrated the effectiveness of the combination of machine learning and logical reasoning, including data-driven logical reasoning, knowledge driven machine learning and abductive learning, in inventing advanced technologies for different artificial intelligence applications. One-step abductive multi-target learning (OSAMTL), an approach inspired by abductive learning, via si…

    Submitted 12 April, 2024; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: The final published version (81 pages)

    Journal ref: Expert Systems with Applications, 2024

  26. arXiv:2110.07393  [pdf, other]

    cs.SD eess.AS

    M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge

    Authors: Fan Yu, Shiliang Zhang, Yihui Fu, Lei Xie, Siqi Zheng, Zhihao Du, Weilong Huang, Pengcheng Guo, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu

    Abstract: Recent development of speech processing, such as speech recognition, speaker diarization, etc., has inspired numerous applications of speech technologies. The meeting scenario is one of the most valuable and, at the same time, most challenging scenarios for the deployment of speech technologies. Specifically, two typical tasks, speaker diarization and multi-speaker automatic speech recognition hav…

    Submitted 25 February, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: Accepted by ICASSP 2022

  27. arXiv:2110.03370  [pdf, other]

    cs.SD cs.CL

    WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition

    Authors: Binbin Zhang, Hang Lv, Pengcheng Guo, Qijie Shao, Chao Yang, Lei Xie, Xin Xu, Hui Bu, Xiaoyu Chen, Chenchen Zeng, Di Wu, Zhendong Peng

    Abstract: In this paper, we present WenetSpeech, a multi-domain Mandarin corpus consisting of 10000+ hours high-quality labeled speech, 2400+ hours weakly labeled speech, and about 10000 hours unlabeled speech, with 22400+ hours in total. We collect the data from YouTube and Podcast, which covers a variety of speaking styles, scenarios, domains, topics, and noisy conditions. An optical character recognition…

    Submitted 23 February, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

  28. arXiv:2104.03603  [pdf, other]

    cs.SD eess.AS

    AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario

    Authors: Yihui Fu, Luyao Cheng, Shubo Lv, Yukai Jv, Yuxiang Kong, Zhuo Chen, Yanxin Hu, Lei Xie, Jian Wu, Hui Bu, Xin Xu, Jun Du, Jingdong Chen

    Abstract: In this paper, we present AISHELL-4, a sizable real-recorded Mandarin speech dataset collected by 8-channel circular microphone array for speech processing in conference scenario. The dataset consists of 211 recorded meeting sessions, each containing 4 to 8 speakers, with a total length of 120 hours. This dataset aims to bridge the advanced research on multi-speaker processing and the practical ap…

    Submitted 10 August, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

    Comments: Accepted by Interspeech 2021

  29. arXiv:2104.01818  [pdf, other]

    eess.AS

    The Multi-speaker Multi-style Voice Cloning Challenge 2021

    Authors: Qicong Xie, Xiaohai Tian, Guanghou Liu, Kun Song, Lei Xie, Zhiyong Wu, Hai Li, Song Shi, Haizhou Li, Fen Hong, Hui Bu, Xin Xu

    Abstract: The Multi-speaker Multi-style Voice Cloning Challenge (M2VoC) aims to provide a common sizable dataset as well as a fair testbed for the benchmarking of the popular voice cloning task. Specifically, we formulate the challenge to adapt an average TTS model to the stylistic target voice with limited data from target speaker, evaluated by speaker identity and style similarity. The challenge consists…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: has been accepted to ICASSP 2021

  30. arXiv:2104.00960  [pdf, other]

    eess.AS cs.SD

    INTERSPEECH 2021 ConferencingSpeech Challenge: Towards Far-field Multi-Channel Speech Enhancement for Video Conferencing

    Authors: Wei Rao, Yihui Fu, Yanxin Hu, Xin Xu, Yvkai Jv, Jiangyu Han, Zhongjie Jiang, Lei Xie, Yannan Wang, Shinji Watanabe, Zheng-Hua Tan, Hui Bu, Tao Yu, Shidong Shang

    Abstract: The ConferencingSpeech 2021 challenge is proposed to stimulate research on far-field multi-channel speech enhancement for video conferencing. The challenge consists of two separate tasks: 1) Task 1 is multi-channel speech enhancement with single microphone array and focusing on practical application with real-time requirement and 2) Task 2 is multi-channel speech enhancement with multiple distribu…

    Submitted 2 April, 2021; originally announced April 2021.

    Comments: 5 pages, submitted to INTERSPEECH 2021

  31. arXiv:2102.12173  [pdf]

    eess.IV

    Deep learning-based framework for cardiac function assessment in embryonic zebrafish from heart beating videos

    Authors: Amir Mohammad Naderi, Haisong Bu, **gcheng Su, Mao-Hsiang Huang, Khuong Vo, Ramses Seferino Trigo Torres, J. -C. Chiao, Juhyun Lee, Michael P. H. Lau, Xiaolei Xu, Hung Cao

    Abstract: Zebrafish is a powerful and widely-used model system for a host of biological investigations including cardiovascular studies and genetic screening. Zebrafish are readily assessable during developmental stages; however, the current methods for quantification and monitoring of cardiac functions mostly involve tedious manual work and inconsistent estimations. In this paper, we developed and validate…

    Submitted 24 February, 2021; originally announced February 2021.

  32. arXiv:2011.11879  [pdf]

    eess.IV cs.CV cs.LG

    Blind deblurring for microscopic pathology images using deep learning networks

    Authors: Cheng Jiang, Jun Liao, Pei Dong, Zhaoxuan Ma, De Cai, Guoan Zheng, Yueping Liu, Hong Bu, Jianhua Yao

    Abstract: Artificial Intelligence (AI)-powered pathology is a revolutionary step in the world of digital pathology and shows great promise to increase both diagnosis accuracy and efficiency. However, defocus and motion blur can obscure tissue or cell characteristics, hence compromising AI algorithms' accuracy and robustness in analyzing the images. In this paper, we demonstrate a deep-learning-based approach…

    Submitted 23 November, 2020; originally announced November 2020.

  33. arXiv:2011.02198  [pdf, other]

    cs.SD eess.AS

    IEEE SLT 2021 Alpha-mini Speech Challenge: Open Datasets, Tracks, Rules and Baselines

    Authors: Yihui Fu, Zhuoyuan Yao, Weipeng He, Jian Wu, Xiong Wang, Zhanheng Yang, Shimin Zhang, Lei Xie, Dongyan Huang, Hui Bu, Petr Motlicek, Jean-Marc Odobez

    Abstract: The IEEE Spoken Language Technology Workshop (SLT) 2021 Alpha-mini Speech Challenge (ASC) is intended to improve research on keyword spotting (KWS) and sound source location (SSL) on humanoid robots. Many publications report significant improvements in deep learning based KWS and SSL on open source datasets in recent years. For deep learning model training, it is necessary to expand the data cover…

    Submitted 14 November, 2020; v1 submitted 4 November, 2020; originally announced November 2020.

    Comments: Accepted at IEEE SLT 2021

  34. arXiv:2010.11567  [pdf, other]

    cs.SD eess.AS

    AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines

    Authors: Yao Shi, Hui Bu, Xin Xu, Shaoji Zhang, Ming Li

    Abstract: In this paper, we present AISHELL-3, a large-scale and high-fidelity multi-speaker Mandarin speech corpus which could be used to train multi-speaker Text-to-Speech (TTS) systems. The corpus contains roughly 85 hours of emotion-neutral recordings spoken by 218 native Chinese Mandarin speakers. Their auxiliary attributes such as gender, age group and native accents are explicitly marked and provided…

    Submitted 22 April, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

  35. Statistical inference for unknown parameters of stochastic SIS epidemics on complete graphs

    Authors: Huazheng Bu, Xiaofeng Xue

    Abstract: In this paper, we are concerned with the stochastic susceptible-infectious-susceptible (SIS) epidemic model on the complete graph with $n$ vertices. This model has two parameters, which are the infection rate and the recovery rate. By utilizing the theory of density-dependent Markov chains, we give consistent estimations of the above two parameters as $n$ grows to infinity according to the sample…

    Submitted 17 July, 2020; originally announced July 2020.

    Comments: 16 pages

  36. arXiv:2005.08046  [pdf, other]

    eess.AS cs.SD

    The INTERSPEECH 2020 Far-Field Speaker Verification Challenge

    Authors: Xiaoyi Qin, Ming Li, Hui Bu, Wei Rao, Rohan Kumar Das, Shrikanth Narayanan, Haizhou Li

    Abstract: The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020) addresses three different research problems under well-defined conditions: far-field text-dependent speaker verification from single microphone array, far-field text-independent speaker verification from single microphone array, and far-field text-dependent speaker verification from distributed microphone arrays. All three…

    Submitted 16 May, 2020; originally announced May 2020.

    Comments: Submitted to INTERSPEECH 2020

  37. arXiv:2003.00452  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Large Magnetoresistance in Topological Insulator Candidate TaSe3

    Authors: Yong Zhang, Tongshuai Zhu, Haijun Bu, Zixiu Cai, Chuanying Xi, Bo Chen, Boyuan Wei, Dong**g Lin, Hangkai Xie, Muhammad Naveed, Xiaoxiang Xi, Fucong Fei, Haijun Zhang, Fengqi Song

    Abstract: Large unsaturated magnetoresistance (XMR) with magnitude about 1000% is observed in topological insulator candidate TaSe3 from our high field (up to 38 T) measurements. Two oscillation modes, associated with one hole pocket and two electron pockets in the bulk, respectively, are detected from our Shubnikov-de Haas (SdH) measurements, consistent with our first-principles calculations. With the deta…

    Submitted 2 September, 2020; v1 submitted 1 March, 2020; originally announced March 2020.

    Journal ref: AIP Advances 10, 095314 (2020)

  38. arXiv:2002.00387  [pdf, other]

    cs.SD eess.AS

    The FFSVC 2020 Evaluation Plan

    Authors: Xiaoyi Qin, Ming Li, Hui Bu, Rohan Kumar Das, Wei Rao, Shrikanth Narayanan, Haizhou Li

    Abstract: The Far-Field Speaker Verification Challenge 2020 (FFSVC20) is designed to boost the speaker verification research with special focus on far-field distributed microphone arrays under noisy conditions in real scenarios. The objectives of this challenge are to: 1) benchmark the current speech verification technology under this challenging condition, 2) promote the development of new ideas and techno…

    Submitted 4 February, 2020; v1 submitted 2 February, 2020; originally announced February 2020.

  39. arXiv:1912.01231  [pdf, other]

    cs.SD eess.AS

    HI-MIA : A Far-field Text-Dependent Speaker Verification Database and the Baselines

    Authors: Xiaoyi Qin, Hui Bu, Ming Li

    Abstract: This paper presents a far-field text-dependent speaker verification database named HI-MIA. We aim to meet the data requirement for far-field microphone array based speaker verification since most of the publicly available databases are single channel close-talking and text-independent. The database contains recordings of 340 people in rooms designed for the far-field scenario. Recordings are captu…

    Submitted 1 February, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: Accepted at ICASSP 2020

  40. arXiv:1909.05564  [pdf]

    cond-mat.mes-hall

    Magneto-transport and Shubnikov-de Haas oscillations in the layered ternary telluride Ta3SiTe6 topological semimetal

    Authors: Muhammad Naveed, Fucong Fei, Haijun Bu, Xiangyan Bo, Syed Adil Shah, Bo Chen, Yong Zhang, Qianqian Liu, Boyuan Wei, Shuai Zhang, Chuanying Xi, Xiangang Wan, Fengqi Song

    Abstract: Topological semimetals characterize a novel class of quantum materials hosting Dirac/Weyl fermions. The important features of topological fermions can be exhibited by quantum oscillations. Here we report the magnetoresistance and Shubnikov-de Haas (SdH) quantum oscillation of longitudinal resistance in the single crystal of topological semimetal Ta3SiTe6 with the magnetic field up to 38 T. Periodi…

    Submitted 12 September, 2019; originally announced September 2019.

    Comments: 18 pages, 4 figures

  41. arXiv:1904.06026  [pdf]

    cs.CV

    Cycle-Consistent Adversarial GAN: the integration of adversarial attack and defense

    Authors: Lingyun Jiang, Kai Qiao, Ruoxi Qin, Linyuan Wang, Jian Chen, Haibing Bu, Bin Yan

    Abstract: In image classification of deep learning, adversarial examples where inputs intended to add small magnitude perturbations may mislead deep neural networks (DNNs) to incorrect results, which means DNNs are vulnerable to them. Different attack and defense strategies have been proposed to better research the mechanism of deep learning. However, those research in these networks are only for one aspect…

    Submitted 12 April, 2019; originally announced April 2019.

    Comments: 13 pages, 7 tables, 1 figure

  42. arXiv:1808.10583  [pdf, other]

    cs.CL

    AISHELL-2: Transforming Mandarin ASR Research Into Industrial Scale

    Authors: Jiayu Du, Xingyu Na, Xuechen Liu, Hui Bu

    Abstract: AISHELL-1 is by far the largest open-source speech corpus available for Mandarin speech recognition research. It was released with a baseline system containing solid training and testing pipelines for Mandarin ASR. In AISHELL-2, 1000 hours of clean read-speech data from iOS is published, which is free for academic usage. On top of AISHELL-2 corpus, an improved recipe is developed and released, con…

    Submitted 12 September, 2018; v1 submitted 30 August, 2018; originally announced August 2018.

  43. arXiv:1804.04313  [pdf]

    cond-mat.mes-hall

    Oscillating planar Hall response from the surface electrons in bulk crystal Sn doped Bi1.1Sb0.9Te2S

    Authors: Bin Wu, Xing-Chen Pan, Wenkai Wu, Fucong Fei, Bo Chen, Qianqian Liu, Haijun Bu, Lu Cao, Fengqi Song, Baigeng Wang

    Abstract: We report the low-temperature magneto-transport in the bulk-insulating single crystal of topological insulator Sn doped Bi1.1Sb0.9Te2S. The Shubnikov-de Haas oscillations appear with their reciprocal frequency proportional to $\cos\theta$, demonstrating the dominant transport of topological surface states. While the magnetic field is rotating in the sample surface, the planar Hall effect arises with…

    Submitted 12 April, 2018; originally announced April 2018.

  44. arXiv:1709.05522  [pdf, other]

    cs.CL

    AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

    Authors: Hui Bu, Jiayu Du, Xingyu Na, Bengu Wu, Hao Zheng

    Abstract: An open-source Mandarin speech corpus called AISHELL-1 is released. It is by far the largest corpus which is suitable for conducting the speech recognition research and building speech recognition systems for Mandarin. The recording procedure, including audio capturing devices and environments, is presented in detail. The preparation of the related resources, including transcriptions and lexicon…

    Submitted 16 September, 2017; originally announced September 2017.

    Comments: Oriental COCOSDA 2017

  45. arXiv:1612.05990  [pdf, other]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Discovery of a new type of topological Weyl fermion semimetal state in Mo$_x$W$_{1-x}$Te$_2$

    Authors: Ilya Belopolski, Daniel S. Sanchez, Yukiaki Ishida, Xingchen Pan, Peng Yu, Su-Yang Xu, Guoqing Chang, Tay-Rong Chang, Hao Zheng, Nasser Alidoust, Guang Bian, Madhab Neupane, Shin-Ming Huang, Chi-Cheng Lee, You Song, Haijun Bu, Guanghou Wang, Shisheng Li, Goki Eda, Horng-Tay Jeng, Takeshi Kondo, Hsin Lin, Zheng Liu, Fengqi Song, Shik Shin, et al. (1 additional author not shown)

    Abstract: The recent discovery of a Weyl semimetal in TaAs offers the first Weyl fermion observed in nature and dramatically broadens the classification of topological phases. However, in TaAs it has proven challenging to study the rich transport phenomena arising from emergent Weyl fermions. The series Mo$_x$W$_{1-x}$Te$_2$ are inversion-breaking, layered, tunable semimetals already under study as a promis…

    Submitted 18 December, 2016; originally announced December 2016.

    Journal ref: Nature Communications 7, 13643 (2016)

  46. arXiv:1601.05534  [pdf]

    cond-mat.mes-hall

    Repairing atomic vacancies in single-layer MoSe2 field-effect transistor and its defect dynamics

    Authors: Yuze Meng, Chongyi Ling, Run Xin, Peng Wang, You Song, Haijun Bu, Si Gao, Xuefeng Wang, Fengqi Song, Jinlan Wang, Xinran Wang, Baigeng Wang, Guanghou Wang

    Abstract: Here we repair the single-layer MoSe2 field-effect transistors by the EDTA processing, after which the devices' room-temperature carrier mobility increases from 0.1 to over 70 cm2/Vs. The atomic dynamics is constructed by the combined study of the first-principles calculation, aberration-corrected transmission electron microscopy and Raman spectroscopy. Single/double Se vacancies are revealed origin…

    Submitted 21 January, 2016; originally announced January 2016.

  47. Films with the discrete nano-DLC-particles as the field emission cascade

    Authors: Fengqi Song, Feng Zhou, Haijun Bu, Xiaoshu Wang, Longbing He, Min Han, Jianguo Wan, Jianfeng Zhou, Guanghou Wang

    Abstract: The films with the discrete diamond-like-carbon nanoparticles were prepared by the deposition of the carbon nanoparticle beam. Their morphologies were imaged by Scanning Electron Microscopy (SEM) and Atomic Force Microscopy (AFM). The nanoparticles were found distributed on the silicon (100) substrate discretely. The semisphere shapes of the nanoparticles were demonstrated by the AFM line profile.…

    Submitted 31 August, 2010; originally announced September 2010.

    Journal ref: J. Phys. D: Appl. Phys. 41 (2008) 042001