Showing 1–15 of 15 results for author: Ye, T

  1. arXiv:2406.07801  [pdf, other]

    cs.CL cs.SD eess.AS

    PolySpeech: Exploring Unified Multitask Speech Models for Competitiveness with Single-task Models

    Authors: Runyan Yang, Huibao Yang, Xiqing Zhang, Tiantian Ye, Ying Liu, Yingying Gao, Shilei Zhang, Chao Deng, Junlan Feng

    Abstract: Recently, there have been attempts to integrate various speech processing tasks into a unified model. However, few previous works directly demonstrated that joint optimization of diverse tasks in multitask speech models has a positive influence on the performance of individual tasks. In this paper we present a multitask speech model, PolySpeech, which supports speech recognition, speech synthesis,…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 5 pages, 2 figures

  2. Benchmarking Deep Learning Classifiers for SAR Automatic Target Recognition

    Authors: Jacob Fein-Ashley, Tian Ye, Rajgopal Kannan, Viktor Prasanna, Carl Busart

    Abstract: Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) is a key technique of remote-sensing image recognition which can be supported by deep neural networks. The existing works of SAR ATR mostly focus on improving the accuracy of the target recognition while ignoring the system's performance in terms of speed and storage, which is critical to real-world applications of SAR ATR. For decision-mak…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 6 Pages

  3. arXiv:2311.18173  [pdf]

    eess.IV cs.CE cs.CV

    Quantification of cardiac capillarization in single-immunostained myocardial slices using weakly supervised instance segmentation

    Authors: Zhao Zhang, Xiwen Chen, William Richardson, Bruce Z. Gao, Abolfazl Razi, Tong Ye

    Abstract: Decreased myocardial capillary density has been reported as an important histopathological feature associated with various heart disorders. Quantitative assessment of cardiac capillarization typically involves double immunostaining of cardiomyocytes (CMs) and capillaries in myocardial slices. In contrast, single immunostaining of basement membrane components is a straightforward approach to simult…

    Submitted 29 November, 2023; originally announced November 2023.

  4. arXiv:2310.16102  [pdf, other]

    eess.IV cs.CV physics.optics

    Learned, Uncertainty-driven Adaptive Acquisition for Photon-Efficient Multiphoton Microscopy

    Authors: Cassandra Tong Ye, Jiashu Han, Kunzan Liu, Anastasios Angelopoulos, Linda Griffith, Kristina Monakhova, Sixian You

    Abstract: Multiphoton microscopy (MPM) is a powerful imaging tool that has been a critical enabler for live tissue imaging. However, since most multiphoton microscopy platforms rely on point scanning, there is an inherent trade-off between acquisition time, field of view (FOV), phototoxicity, and image quality, often resulting in noisy measurements when fast, large FOV, and/or gentle imaging is needed. Deep…

    Submitted 24 October, 2023; originally announced October 2023.

  5. Output Voltage Response Improvement and Ripple Reduction Control for Input-parallel Output-parallel High-Power DC Supply

    Authors: Jianhui Meng, Xiaolong Wu, Tairan Ye, Jingsen Yu, Likang Gu, Zili Zhang, Yang Li

    Abstract: A three-phase isolated AC-DC-DC power supply is widely used in the industrial field due to its attractive features such as high-power density, modularity for easy expansion and electrical isolation. In high-power application scenarios, it can be realized by multiple AC-DC-DC modules with Input-Parallel Output-Parallel (IPOP) mode. However, it has the problems of slow output voltage response and la…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted by IEEE Transactions on Power Electronics

    Journal ref: IEEE Transactions on Power Electronics 38 (2023) 11102-11112

  6. arXiv:2302.12186  [pdf, other]

    cs.CV eess.IV

    RSFDM-Net: Real-time Spatial and Frequency Domains Modulation Network for Underwater Image Enhancement

    Authors: Jingxia Jiang, Jinbin Bai, Yun Liu, Junjie Yin, Sixiang Chen, Tian Ye, Erkang Chen

    Abstract: Underwater images typically experience mixed degradations of brightness and structure caused by the absorption and scattering of light by suspended particles. To address this issue, we propose a Real-time Spatial and Frequency Domains Modulation Network (RSFDM-Net) for the efficient enhancement of colors and details in underwater images. Specifically, our proposed conditional network is designed w…

    Submitted 23 February, 2023; originally announced February 2023.

  7. arXiv:2208.11184  [pdf, other]

    eess.IV cs.CV

    AIM 2022 Challenge on Super-Resolution of Compressed Image and Video: Dataset, Methods and Results

    Authors: Ren Yang, Radu Timofte, Xin Li, Qi Zhang, Lin Zhang, Fanglong Liu, Dongliang He, Fu li, He Zheng, Weihang Yuan, Pavel Ostyakov, Dmitry Vyal, Magauiya Zhussip, Xueyi Zou, Youliang Yan, Lei Li, Jingzhu Tang, Ming Chen, Shijie Zhao, Yu Zhu, Xiaoran Qin, Chenghua Li, Cong Leng, Jian Cheng, Claudio Rota, et al. (28 additional authors not shown)

    Abstract: This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. This challenge includes two tracks. Track 1 aims at the super-resolution of compressed image, and Track 2 targets the super-resolution of compressed video. In Track 1, we use the popular dataset DIV2K as the training, validation and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 3…

    Submitted 25 August, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

    Comments: Camera-ready version

  8. arXiv:2208.10739  [pdf, other]

    cs.CV eess.IV

    Quality-Constant Per-Shot Encoding by Two-Pass Learning-based Rate Factor Prediction

    Authors: Chunlei Cai, Yi Wang, Xiaobo Li, Tianxiao Ye

    Abstract: Providing quality-constant streams can simultaneously guarantee user experience and prevent wasting bit-rate. In this paper, we propose a novel deep learning based two-pass encoder parameter prediction framework to decide the rate factor (RF), with which the encoder can output streams with constant quality. For each one-shot segment in a video, the proposed method first extracts spatial, temporal and pr…

    Submitted 23 August, 2022; originally announced August 2022.

  9. arXiv:2206.13071  [pdf, other]

    cs.SD cs.LG eess.AS

    Uncertainty Calibration for Deep Audio Classifiers

    Authors: Tong Ye, Shijing Si, Jianzong Wang, Ning Cheng, Jing Xiao

    Abstract: Although deep neural networks (DNNs) have achieved tremendous success in audio classification tasks, their uncertainty calibration is still under-explored. A well-calibrated model should be accurate when it is certain about its prediction and indicate high uncertainty when it is likely to be inaccurate. In this work, we investigate the uncertainty calibration for deep audio classifiers. In partic…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted by InterSpeech 2022, the first two authors contributed equally

  10. arXiv:2205.05675  [pdf, other]

    cs.CV eess.IV

    NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

    Authors: Yawei Li, Kai Zhang, Radu Timofte, Luc Van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, et al. (86 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that achieved improvement of e…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Validation code of the baseline model is available at https://github.com/ofsoundof/IMDN. Validation of all submitted models is available at https://github.com/ofsoundof/NTIRE2022_ESR

  11. arXiv:2203.11006  [pdf, other]

    cs.CV eess.IV

    Underwater Light Field Retention : Neural Rendering for Underwater Imaging

    Authors: Tian Ye, Sixiang Chen, Yun Liu, Yi Ye, Erkang Chen, Yuche Li

    Abstract: Underwater Image Rendering aims to generate a true-to-life underwater image from a given clean one, which could be applied to various practical applications such as underwater image enhancement, camera filter, and virtual gaming. We explore two less-touched but challenging problems in underwater image rendering, namely, i) how to render diverse underwater scenes by a single neural network? ii) how…

    Submitted 23 April, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

  12. arXiv:2109.05479  [pdf, ps, other]

    eess.IV cs.CV

    Efficient Re-parameterization Residual Attention Network For Nonhomogeneous Image Dehazing

    Authors: Tian Ye, ErKang Chen, XinRui Huang, Peng Chen

    Abstract: This paper proposes an end-to-end Efficient Re-parameterization Residual Attention Network (ERRA-Net) to directly restore the nonhomogeneous hazy image. The contribution of this paper mainly has the following three aspects: 1) A novel Multi-branch Attention (MA) block. The spatial attention mechanism better reconstructs high-frequency features, and the channel attention mechanism treats the features…

    Submitted 14 September, 2021; v1 submitted 12 September, 2021; originally announced September 2021.

  13. arXiv:2010.07739  [pdf]

    cs.SD cs.MM eess.AS

    Music Classification in MIDI Format based on LSTM Mdel

    Authors: Yiting Xia, Yiwei Jiang, Tao Ye

    Abstract: Music classification between music made by AI or human composers can be done by deep learning networks. We first transformed music samples in MIDI format to natural language sequences, then classified these samples by mLSTM (multiplicative Long Short Term Memory) + logistic regression. The accuracy of the result evaluated by 10-fold cross validation can reach 90%. Our work indicates that music gen…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: in Chinese

  14. arXiv:1907.04784  [pdf, other]

    cs.NI eess.SP

    AWG-based Nonblocking Shuffle-Exchange Networks

    Authors: Tong Ye, Jingjie Ding, Tony Tong Lee, Guido Maier

    Abstract: Optical shuffle-exchange networks (SENs) have wide application in different kinds of interconnection networks. This paper proposes an approach to construct modular optical SENs, using a set of arrayed waveguide gratings (AWGs) and tunable wavelength converters (TWCs). According to the wavelength routing property of AWGs, we demonstrate for the first time that an AWG is functionally equivalent to a…

    Submitted 10 July, 2019; originally announced July 2019.

    Comments: 13 pages, 8 figures

  15. arXiv:1811.06296  [pdf, other]

    eess.AS cs.SD

    Comprehensive evaluation of statistical speech waveform synthesis

    Authors: Thomas Merritt, Bartosz Putrycz, Adam Nadolski, Tianjun Ye, Daniel Korzekwa, Wiktor Dolecki, Thomas Drugman, Viacheslav Klimkov, Alexis Moinet, Andrew Breen, Rafal Kuklinski, Nikko Strom, Roberto Barra-Chicote

    Abstract: Statistical TTS systems that directly predict the speech waveform have recently reported improvements in synthesis quality. This investigation evaluates Amazon's statistical speech waveform synthesis (SSWS) system. An in-depth evaluation of SSWS is conducted across a number of domains to better understand the consistency in quality. The results of this evaluation are validated by repeating the pro…

    Submitted 11 December, 2018; v1 submitted 15 November, 2018; originally announced November 2018.