Showing 1–32 of 32 results for author: Cai, C

Searching in archive eess.
  1. arXiv:2406.11265  [pdf, ps, other]

    eess.SY

    Balancing Performance and Cost for Two-Hop Cooperative Communications: Stackelberg Game and Distributed Multi-Agent Reinforcement Learning

    Authors: Yuanzhe Geng, Erwu Liu, Wei Ni, Rui Wang, Yan Liu, Hao Xu, Chen Cai, Abbas Jamalipour

    Abstract: This paper aims to balance performance and cost in a two-hop wireless cooperative communication network where the source and relays have contradictory optimization goals and make decisions in a distributed manner. This differs from most existing works that have typically assumed that source and relay nodes follow a schedule created implicitly by a central controller. We propose that the relays for…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2405.10570  [pdf]

    eess.IV cs.AI

    Simultaneous Deep Learning of Myocardium Segmentation and T2 Quantification for Acute Myocardial Infarction MRI

    Authors: Yirong Zhou, Chengyan Wang, Mengtian Lu, Kunyuan Guo, Zi Wang, Dan Ruan, Rui Guo, Peijun Zhao, Jianhua Wang, Naiming Wu, Jianzhong Lin, Yinyin Chen, Hang Jin, Lianxin Xie, Lilan Wu, Liuhong Zhu, Jianjun Zhou, Congbo Cai, He Wang, Xiaobo Qu

    Abstract: In cardiac Magnetic Resonance Imaging (MRI) analysis, simultaneous myocardial segmentation and T2 quantification are crucial for assessing myocardial pathologies. Existing methods often address these tasks separately, limiting their synergistic potential. To address this, we propose SQNet, a dual-task network integrating Transformer and Convolutional Neural Network (CNN) components. SQNet features…

    Submitted 29 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 10 pages, 8 figures, 6 tables

  3. arXiv:2402.15939  [pdf]

    eess.IV cs.LG

    Deep Separable Spatiotemporal Learning for Fast Dynamic Cardiac MRI

    Authors: Zi Wang, Min Xiao, Yirong Zhou, Chengyan Wang, Naiming Wu, Yi Li, Yiwen Gong, Shufu Chang, Yinyin Chen, Liuhong Zhu, Jianjun Zhou, Congbo Cai, He Wang, Di Guo, Guang Yang, Xiaobo Qu

    Abstract: Dynamic magnetic resonance imaging (MRI) plays an indispensable role in cardiac diagnosis. To enable fast imaging, the k-space data can be undersampled but the image reconstruction poses a great challenge of high-dimensional processing. This challenge leads to necessitate extensive training data in many deep learning reconstruction methods. This work proposes a novel and efficient approach, levera…

    Submitted 24 February, 2024; originally announced February 2024.

    Comments: 10 pages, 11 figures, 3 tables

  4. arXiv:2310.07464  [pdf]

    eess.IV cs.LG q-bio.QM

    Deep Learning Predicts Biomarker Status and Discovers Related Histomorphology Characteristics for Low-Grade Glioma

    Authors: Zijie Fang, Yihan Liu, Yifeng Wang, Xiangyang Zhang, Yang Chen, Changjing Cai, Yiyang Lin, Ying Han, Zhi Wang, Shan Zeng, Hong Shen, Jun Tan, Yongbing Zhang

    Abstract: Biomarker detection is an indispensable part in the diagnosis and treatment of low-grade glioma (LGG). However, current LGG biomarker detection methods rely on expensive and complex molecular genetic testing, for which professionals are required to analyze the results, and intra-rater variability is often reported. To overcome these challenges, we propose an interpretable deep learning pipeline, a…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 47 pages, 6 figures

  5. arXiv:2309.02888  [pdf, other]

    eess.SP

    Multi-Device Task-Oriented Communication via Maximal Coding Rate Reduction

    Authors: Chang Cai, Xiaojun Yuan, Ying-Jun Angela Zhang

    Abstract: In task-oriented communications, most existing work designed the physical-layer communication modules and learning based codecs with distinct objectives: learning is targeted at accurate execution of specific tasks, while communication aims at optimizing conventional communication metrics, such as throughput maximization, delay minimization, or bit error rate minimization. The inconsistency betwee…

    Submitted 28 May, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: under minor revision in IEEE Transactions on Wireless Communications

  6. arXiv:2308.01173  [pdf, other]

    eess.IV

    FlexDTI: Flexible diffusion gradient encoding scheme-based highly efficient diffusion tensor imaging using deep learning

    Authors: Zejun Wu, Jiechao Wang, Zunquan Chen, Qinqin Yang, Zhen Xing, Dairong Cao, Jianfeng Bao, Taishan Kang, Jianzhong Lin, Shuhui Cai, Zhong Chen, Congbo Cai

    Abstract: Objective: Most deep neural network-based diffusion tensor imaging methods require the diffusion gradients' number and directions in the data to be reconstructed to match those in the training data. This work aims to develop and evaluate a novel dynamic-convolution-based method called FlexDTI for highly efficient diffusion tensor reconstruction with flexible diffusion encoding gradient scheme. App…

    Submitted 21 December, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: 24 pages, 9 figures, 3 tables

  7. arXiv:2307.13220  [pdf]

    eess.IV cs.AI physics.med-ph

    One for Multiple: Physics-informed Synthetic Data Boosts Generalizable Deep Learning for Fast MRI Reconstruction

    Authors: Zi Wang, Xiaotong Yu, Chengyan Wang, Weibo Chen, Jiazheng Wang, Ying-Hua Chu, Hongwei Sun, Rushuai Li, Peiyong Li, Fan Yang, Haiwei Han, Taishan Kang, Jianzhong Lin, Chen Yang, Shufu Chang, Zhang Shi, Sha Hua, Yan Li, Juan Hu, Liuhong Zhu, Jianjun Zhou, Meijing Lin, Jiefeng Guo, Congbo Cai, Zhong Chen, et al. (3 additional authors not shown)

    Abstract: Magnetic resonance imaging (MRI) is a widely used radiological modality renowned for its radiation-free, comprehensive insights into the human body, facilitating medical diagnoses. However, the drawback of prolonged scan times hinders its accessibility. The k-space undersampling offers a solution, yet the resultant artifacts necessitate meticulous removal during image reconstruction. Although Deep…

    Submitted 28 February, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: 38 pages, 19 figures, 5 tables

  8. arXiv:2301.10455  [pdf, other]

    eess.IV cs.CV cs.MM

    Rate-Perception Optimized Preprocessing for Video Coding

    Authors: Chengqian Ma, Zhiqiang Wu, Chunlei Cai, Pengwei Zhang, Yi Wang, Long Zheng, Chao Chen, Quan Zhou

    Abstract: In the past decades, lots of progress have been done in the video compression field including traditional video codec and learning-based video codec. However, few studies focus on using preprocessing techniques to improve the rate-distortion performance. In this paper, we propose a rate-perception optimized preprocessing (RPP) method. We first introduce an adaptive Discrete Cosine Transform loss f…

    Submitted 25 January, 2023; originally announced January 2023.

  9. arXiv:2210.10379  [pdf, ps, other]

    eess.IV physics.med-ph

    High-efficient Bloch simulation of magnetic resonance imaging sequences based on deep learning

    Authors: Haitao Huang, Qinqin Yang, Jiechao Wang, Pujie Zhang, Shuhui Cai, Congbo Cai

    Abstract: Objective: Bloch simulation constitutes an essential part of magnetic resonance imaging (MRI) development. However, even with the graphics processing unit (GPU) acceleration, the heavy computational load remains a major challenge, especially in large-scale, high-accuracy simulation scenarios. This work aims to develop a deep learning-based simulator to accelerate Bloch simulation. Approach: The si…

    Submitted 15 March, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: 18 pages, 8 figures

  10. arXiv:2208.10739  [pdf, other]

    cs.CV eess.IV

    Quality-Constant Per-Shot Encoding by Two-Pass Learning-based Rate Factor Prediction

    Authors: Chunlei Cai, Yi Wang, Xiaobo Li, Tianxiao Ye

    Abstract: Providing quality-constant streams can simultaneously guarantee user experience and prevent wasting bit-rate. In this paper, we propose a novel deep learning based two-pass encoder parameter prediction framework to decide rate factor (RF), with which encoder can output streams with constant quality. For each one-shot segment in a video, the proposed method firstly extracts spatial, temporal and pr…

    Submitted 23 August, 2022; originally announced August 2022.

  11. arXiv:2208.04459  [pdf, other]

    eess.SY stat.AP

    Bullwhip Effect of Supply Networks: Joint Impact of Network Structure and Market Demand

    Authors: Jin-Zhu Yü, Chencheng Cai, Jianxi Gao

    Abstract: The progressive amplification of fluctuations in demand as the demand travels upstream the supply chains is known as the bullwhip effect. We first analytically characterize the bullwhip effect in general supply chain networks in two cases: (i) all suppliers have a unique layer position, where our method is founded on the control-theoretic approach, and (ii) not all suppliers have a unique layer po…

    Submitted 8 August, 2022; originally announced August 2022.

  12. arXiv:2204.05649  [pdf, other]

    cs.SD eess.AS

    ADFF: Attention Based Deep Feature Fusion Approach for Music Emotion Recognition

    Authors: Zi Huang, Shulei Ji, Zhilan Hu, Chuangjian Cai, Jing Luo, Xinyu Yang

    Abstract: Music emotion recognition (MER), a sub-task of music information retrieval (MIR), has developed rapidly in recent years. However, the learning of affect-salient features remains a challenge. In this paper, we propose an end-to-end attention-based deep feature fusion (ADFF) approach for MER. Only taking log Mel-spectrogram as input, this method uses adapted VGGNet as spatial feature learning module…

    Submitted 30 June, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: It has been received by Interspeech2022

  13. arXiv:2203.13617  [pdf, other]

    eess.AS cs.LG cs.SD

    EmotionNAS: Two-stream Neural Architecture Search for Speech Emotion Recognition

    Authors: Haiyang Sun, Zheng Lian, Bin Liu, Ying Li, Licai Sun, Cong Cai, Jianhua Tao, Meng Wang, Yuan Cheng

    Abstract: Speech emotion recognition (SER) is an important research topic in human-computer interaction. Existing works mainly rely on human expertise to design models. Despite their success, different datasets often require distinct structures and hyperparameters. Searching for an optimal model for each dataset is time-consuming and labor-intensive. To address this problem, we propose a two-stream neural a…

    Submitted 9 June, 2023; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: Accepted to Interspeech 2023

  14. arXiv:2203.11178  [pdf]

    cs.LG eess.SP physics.med-ph

    Physics-driven Synthetic Data Learning for Biomedical Magnetic Resonance

    Authors: Qinqin Yang, Zi Wang, Kunyuan Guo, Congbo Cai, Xiaobo Qu

    Abstract: Deep learning has innovated the field of computational imaging. One of its bottlenecks is unavailable or insufficient training data. This article reviews an emerging paradigm, imaging physics-based data synthesis (IPADS), that can provide huge training data in biomedical magnetic resonance without or with few real data. Following the physical law of magnetic resonance, IPADS generates signals from…

    Submitted 21 May, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

  15. arXiv:2203.08377  [pdf, other]

    cs.IT eess.SP

    RIS Partitioning Based Scalable Beamforming Design for Large-Scale MIMO: Asymptotic Analysis and Optimization

    Authors: Chang Cai, Xiaojun Yuan, Ying-Jun Angela Zhang

    Abstract: In next-generation wireless networks, reconfigurable intelligent surface (RIS)-assisted multiple-input multiple-output (MIMO) systems are foreseeable to support a large number of antennas at the transceiver as well as a large number of reflecting elements at the RIS. To fully unleash the potential of RIS, the phase shifts of RIS elements should be carefully designed, resulting in a high-dimensiona…

    Submitted 21 January, 2023; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: accepted by IEEE Transactions on Wireless Communications

  16. MoRe-Fi: Motion-robust and Fine-grained Respiration Monitoring via Deep-Learning UWB Radar

    Authors: Tianyue Zheng, Zhe Chen, Shujie Zhang, Chao Cai, Jun Luo

    Abstract: Crucial for healthcare and biomedical applications, respiration monitoring often employs wearable sensors in practice, causing inconvenience due to their direct contact with human bodies. Therefore, researchers have been constantly searching for contact-free alternatives. Nonetheless, existing contact-free designs mostly require human subjects to remain static, largely confining their adoptions in…

    Submitted 15 November, 2021; originally announced November 2021.

    Comments: 14 pages

    Journal ref: SenSys '21: Proceedings of the 19th ACM Conference on Embedded Networked Sensor Systems November 2021

  17. arXiv:2111.01692  [pdf, other]

    stat.ML cs.AI cs.LG eess.SP stat.AP

    Efficient Hierarchical Bayesian Inference for Spatio-temporal Regression Models in Neuroimaging

    Authors: Ali Hashemi, Yijing Gao, Chang Cai, Sanjay Ghosh, Klaus-Robert Müller, Srikantan S. Nagarajan, Stefan Haufe

    Abstract: Several problems in neuroimaging and beyond require inference on the parameters of multi-task sparse hierarchical regression models. Examples include M/EEG inverse problems, neural encoding models for task-based fMRI analyses, and climate science. In these domains, both the model parameters to be inferred and the measurement noise may exhibit a complex spatio-temporal structure. Existing work eith…

    Submitted 23 November, 2021; v1 submitted 2 November, 2021; originally announced November 2021.

    Comments: Accepted to the 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  18. arXiv:2110.14848  [pdf, other]

    eess.SP cs.LG cs.NI

    V2iFi: in-Vehicle Vital Sign Monitoring via Compact RF Sensing

    Authors: Tianyue Zheng, Zhe Chen, Chao Cai, Jun Luo, Xu Zhang

    Abstract: Given the significant amount of time people spend in vehicles, health issues under driving condition have become a major concern. Such issues may vary from fatigue, asthma, stroke, to even heart attack, yet they can be adequately indicated by vital signs and abnormal activities. Therefore, in-vehicle vital sign monitoring can help us predict and hence prevent these issues. Whereas existing sensor-…

    Submitted 27 October, 2021; originally announced October 2021.

    Comments: 27 pages

    Journal ref: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 4, Issue 2, June 2020

  19. RF-Based Human Activity Recognition Using Signal Adapted Convolutional Neural Network

    Authors: Zhe Chen, Chao Cai, Tianyue Zheng, Jun Luo, Jie Xiong, Xin Wang

    Abstract: Human Activity Recognition (HAR) plays a critical role in a wide range of real-world applications, and it is traditionally achieved via wearable sensing. Recently, to avoid the burden and discomfort caused by wearable devices, device-free approaches exploiting RF signals arise as a promising alternative for HAR. Most of the latest device-free approaches require training a large deep neural network…

    Submitted 27 October, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: 13 pages

    Journal ref: IEEE Transactions on Mobile Computing, 19 April 2021

  20. arXiv:2110.03879  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Explaining the Attention Mechanism of End-to-End Speech Recognition Using Decision Trees

    Authors: Yuanchao Wang, Wenji Du, Chenghao Cai, Yanyan Xu

    Abstract: The attention mechanism has largely improved the performance of end-to-end speech recognition systems. However, the underlying behaviours of attention is not yet clearer. In this study, we use decision trees to explain how the attention mechanism impact itself in speech recognition. The results indicate that attention levels are largely impacted by their previous states rather than the encoder and…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: 10 pages, 5 figures

  21. arXiv:2108.03008  [pdf, other]

    cs.SD cs.LG eess.AS

    An Empirical Study on End-to-End Singing Voice Synthesis with Encoder-Decoder Architectures

    Authors: Dengfeng Ke, Yuxing Lu, Xudong Liu, Yanyan Xu, Jing Sun, Cheng-Hao Cai

    Abstract: With the rapid development of neural network architectures and speech processing models, singing voice synthesis with neural networks is becoming the cutting-edge technique of digital music production. In this work, in order to explore how to improve the quality and efficiency of singing voice synthesis, in this work, we use encoder-decoder neural models and a number of vocoders to achieve singing…

    Submitted 6 August, 2021; originally announced August 2021.

    Comments: 27 pages, 4 figures, 5 tables

  22. arXiv:2107.14521  [pdf, other]

    eess.IV

    Model-based Synthetic Data-driven Learning (MOST-DL): Application in Single-shot T2 Mapping with Severe Head Motion Using Overlapping-echo Acquisition

    Authors: Qinqin Yang, Yanhong Lin, Jiechao Wang, Jianfeng Bao, Xiaoyin Wang, Lingceng Ma, Zihan Zhou, Qizhi Yang, Shuhui Cai, Hongjian He, Congbo Cai, Jiyang Dong, Jingliang Cheng, Zhong Chen, Jianhui Zhong

    Abstract: Use of synthetic data has provided a potential solution for addressing unavailable or insufficient training samples in deep learning-based magnetic resonance imaging (MRI). However, the challenge brought by domain gap between synthetic and real data is usually encountered, especially under complex experimental conditions. In this study, by combining Bloch simulation and general MRI models, we prop…

    Submitted 29 May, 2022; v1 submitted 30 July, 2021; originally announced July 2021.

    Comments: 15 pages, 13 figures

  23. arXiv:2010.12876  [pdf, other]

    eess.IV cs.LG eess.SP

    Electromagnetic Source Imaging via a Data-Synthesis-Based Convolutional Encoder-Decoder Network

    Authors: Gexin Huang, Jiawen Liang, Ke Liu, Chang Cai, ZhengHui Gu, Feifei Qi, Yuan Qing Li, Zhu Liang Yu, Wei Wu

    Abstract: Electromagnetic source imaging (ESI) requires solving a highly ill-posed inverse problem. To seek a unique solution, traditional ESI methods impose various forms of priors that may not accurately reflect the actual source properties, which may hinder their broad applications. To overcome this limitation, in this paper a novel data-synthesized spatio-temporally convolutional encoder-decoder network…

    Submitted 13 July, 2022; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: 15 pages, 14 figures, and journal

  24. arXiv:2010.05388  [pdf, other]

    cs.SD cs.HC cs.LG eess.AS

    AI Song Contest: Human-AI Co-Creation in Songwriting

    Authors: Cheng-Zhi Anna Huang, Hendrik Vincent Koops, Ed Newton-Rex, Monica Dinculescu, Carrie J. Cai

    Abstract: Machine learning is challenging the way we make music. Although research in deep generative models has dramatically improved the capability and fluency of music models, recent work has shown that it can be challenging for humans to partner with this new class of algorithms. In this paper, we present findings on what 13 musician/developer teams, a total of 61 users, needed when co-creating a song w…

    Submitted 11 October, 2020; originally announced October 2020.

    Comments: 6 pages + 3 pages of references

    ACM Class: J.5; I.2

    Journal ref: ISMIR 2020

  25. arXiv:2006.01604  [pdf, ps, other]

    cs.IT eess.SP

    Two-Timescale Optimization for Intelligent Reflecting Surface Aided D2D Underlay Communication

    Authors: Chang Cai, Huiyuan Yang, Xiaojun Yuan, Ying-Chang Liang

    Abstract: The performance of a device-to-device (D2D) underlay communication system is limited by the co-channel interference between cellular users (CUs) and D2D devices. To address this challenge, an intelligent reflecting surface (IRS) aided D2D underlay system is studied in this paper. A two-timescale optimization scheme is proposed to reduce the required channel training and feedback overhead, where tr…

    Submitted 2 June, 2020; originally announced June 2020.

  26. The optimal sequence for reset controllers

    Authors: Chengwai Cai, Ali Ahmadi Dastjerdi, Niranjan Saikumar, S. H. HosseinNia

    Abstract: PID controllers cannot satisfy the high performance requirements since they are restricted by the water-bed effect. Thus, the need for a better alternative to linear PID controllers increases due to the rising demands of the high-tech industry. This has led many researchers to explore nonlinear controllers like reset control. Although reset controllers have been widely used to overcome the limitat…

    Submitted 28 May, 2020; originally announced May 2020.

  27. arXiv:2003.11282  [pdf, other]

    eess.IV cs.CV

    Content Adaptive and Error Propagation Aware Deep Video Compression

    Authors: Guo Lu, Chunlei Cai, Xiaoyun Zhang, Li Chen, Wanli Ouyang, Dong Xu, Zhiyong Gao

    Abstract: Recently, learning based video compression methods attract increasing attention. However, the previous works suffer from error propagation due to the accumulation of reconstructed error in inter predictive coding. Meanwhile, the previous learning based video codecs are also not adaptive to different video contents. To address these two problems, we propose a content adaptive and error propagation…

    Submitted 25 March, 2020; originally announced March 2020.

    Comments: First two authors contributed equally

  28. arXiv:1906.05356  [pdf]

    eess.SY cs.LG

    Traffic signal control optimization under severe incident conditions using Genetic Algorithm

    Authors: Tuo Mao, Adriana-Simona Mihaita, Chen Cai

    Abstract: Traffic control optimization is a challenging task for various traffic centres in the world and majority of approaches focus only on applying adaptive methods under normal (recurrent) traffic conditions. But optimizing the control plans when severe incidents occur still remains a hard topic to address, especially if a high number of lanes or entire intersections are affected. This paper aims at ta…

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: 14 pages, 15 figures, preprint for the 26th ITS World Congress 21-25 Oct 2019, Singapore

  29. arXiv:1906.04739  [pdf]

    eess.SP cs.AI cs.LG stat.ML

    Trip Table Estimation and Prediction for Dynamic Traffic Assignment Applications

    Authors: Sajjad Shafiei, Adriana-Simona Mihaita, Chen Cai

    Abstract: The study focuses on estimating and predicting time-varying origin to destination (OD) trip tables for a dynamic traffic assignment (DTA) model. A bi-level optimisation problem is formulated and solved to estimate OD flows from pre-existent demand matrix and historical traffic flow counts. The estimated demand is then considered as an input for a time series OD demand prediction model to support t…

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: 6 pages, 6 figures, preprint at the 26th ITS World Congress 21-25 Oct 2019

  30. arXiv:1901.03450  [pdf, other]

    cs.SD cs.HC cs.LG eess.AS

    Ubiquitous Acoustic Sensing on Commodity IoT Devices: A Survey

    Authors: Chao Cai, Rong Zheng, Jun Luo

    Abstract: With the proliferation of Internet-of-Things devices, acoustic sensing attracts much attention in recent years. It exploits acoustic transceivers such as microphones and speakers beyond their primary functions, namely recording and playing, to enable novel applications and new user experiences. In this paper, we present the first systematic survey of recent advances in active acoustic sensing usin…

    Submitted 12 August, 2021; v1 submitted 10 January, 2019; originally announced January 2019.

  31. arXiv:1812.00101  [pdf, other]

    eess.IV cs.CV

    DVC: An End-to-end Deep Video Compression Framework

    Authors: Guo Lu, Wanli Ouyang, Dong Xu, Xiaoyun Zhang, Chunlei Cai, Zhiyong Gao

    Abstract: Conventional video compression approaches use the predictive coding architecture and encode the corresponding motion information and residual information. In this paper, taking advantage of both classical architecture in the conventional video compression method and the powerful non-linear representation ability of neural networks, we propose the first end-to-end video compression deep model that…

    Submitted 7 April, 2019; v1 submitted 30 November, 2018; originally announced December 2018.

    Comments: Accepted by CVPR 2019. Project page https://github.com/GuoLusjtu/DVC

  32. arXiv:1803.01107  [pdf]

    cs.SD eess.AS

    Audio-only Bird Species Automated Identification Method with Limited Training Data Based on Multi-Channel Deep Convolutional Neural Networks

    Authors: Jiang-jian Xie, Chang-qing Ding, Wen-bin Li, Cheng-hao Cai

    Abstract: Based on the transfer learning, we design a bird species identification model that uses the VGG-16 model (pretrained on ImageNet) for feature extraction, then a classifier consisting of two fully-connected hidden layers and a Softmax layer is attached. We compare the performance of the proposed model with the original VGG16 model. The results show that the former has higher train efficiency, but l…

    Submitted 3 March, 2018; originally announced March 2018.

    Comments: 11 pages, 11 figures