Showing 1–19 of 19 results for author: Yao, C

Searching in archive eess.
  1. arXiv:2406.02430  [pdf, other]

    eess.AS cs.SD

    Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

    Authors: Philip Anastassiou, Jiawei Chen, Jitong Chen, Yuanzhe Chen, Zhuo Chen, Ziyi Chen, Jian Cong, Lelai Deng, Chuang Ding, Lu Gao, Mingqing Gong, Peisong Huang, Qingqing Huang, Zhiying Huang, Yuanyuan Huo, Dongya Jia, Chumin Li, Feiya Li, Hui Li, Jiaxin Li, Xiaoyang Li, Xingxing Li, Lin Liu, Shouda Liu, Sichao Liu , et al. (21 additional authors not shown)

    Abstract: We introduce Seed-TTS, a family of large-scale autoregressive text-to-speech (TTS) models capable of generating speech that is virtually indistinguishable from human speech. Seed-TTS serves as a foundation model for speech generation and excels in speech in-context learning, achieving performance in speaker similarity and naturalness that matches ground truth human speech in both objective and sub…

    Submitted 4 June, 2024; originally announced June 2024.

  2. arXiv:2405.14336  [pdf, other]

    eess.IV

    I$^2$VC: A Unified Framework for Intra- & Inter-frame Video Compression

    Authors: Meiqin Liu, Chenming Xu, Yukai Gu, Chao Yao, Yao Zhao

    Abstract: Video compression aims to reconstruct seamless frames by encoding the motion and residual information from existing frames. Previous neural video compression methods necessitate distinct codecs for three types of frames (I-frame, P-frame and B-frame), which hinders a unified approach and generalization across different video contexts. Intra-codec techniques lack the advanced Motion Estimation and…

    Submitted 1 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 19 pages, 10 figures

  3. arXiv:2401.05412  [pdf, other]

    cs.CV cs.AI eess.SP

    Spatial-Related Sensors Matters: 3D Human Motion Reconstruction Assisted with Textual Semantics

    Authors: Xueyuan Yang, Chao Yao, Xiaojuan Ban

    Abstract: Leveraging wearable devices for motion reconstruction has emerged as an economical and viable technique. Certain methodologies employ sparse Inertial Measurement Units (IMUs) on the human body and harness data-driven strategies to model human poses. However, the reconstruction of motion based solely on sparse IMUs data is inherently fraught with ambiguity, a consequence of numerous identical IMU r…

    Submitted 26 December, 2023; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  4. arXiv:2310.18090  [pdf, ps, other]

    eess.SP

    Probabilistic Constellation Shaping for OFDM-Based ISAC Signaling

    Authors: Zhen Du, Fan Liu, Yifeng Xiong, Tony Xiao Han, Weijie Yuan, Yuanhao Cui, Changhua Yao, Yonina C. Eldar

    Abstract: Integrated Sensing and Communications (ISAC) has garnered significant attention as a promising technology for the upcoming sixth-generation wireless communication systems (6G). In pursuit of this goal, a common strategy is that a unified waveform, such as Orthogonal Frequency Division Multiplexing (OFDM), should serve dual-functional roles by enabling simultaneous sensing and communications (S&C)…

    Submitted 27 October, 2023; originally announced October 2023.

  5. IBVC: Interpolation-driven B-frame Video Compression

    Authors: Chenming Xu, Meiqin Liu, Chao Yao, Weisi Lin, Yao Zhao

    Abstract: Learned B-frame video compression aims to adopt bi-directional motion estimation and motion compensation (MEMC) coding for middle frame reconstruction. However, previous learned approaches often directly extend neural P-frame codecs to B-frame relying on bi-directional optical-flow estimation or video frame interpolation. They suffer from inaccurate quantized motions and inefficient motion compens…

    Submitted 14 March, 2024; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: Submitted to Pattern Recognition

  6. arXiv:2307.15272  [pdf]

    eess.SP

    Direct Power Flow Controller with Continuous Full Regulation Range

    Authors: Chong Yao, Youjun Zhang

    Abstract: For enhancing power flow control in power transmission, a simplified new structure of direct power flow controller with continuous full regulation range (F-DPFC) was proposed. It has only one-stage power conversion and comprises of a three-phase transformer in parallel and a three-phase transformer in series with grid, three single-phase full-bridge ac units, and a three-phase filter. Compared wi…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: 9 pages, 20 figures

  7. arXiv:2305.08325  [pdf, other]

    cs.CV eess.IV

    Screentone-Aware Manga Super-Resolution Using DeepLearning

    Authors: Chih-Yuan Yao, Husan-Ting Chou, Yu-Sheng Lin, Kuo-wei Chen

    Abstract: Manga, as a widely beloved form of entertainment around the world, have shifted from paper to electronic screens with the proliferation of handheld devices. However, as the demand for image quality increases with screen development, high-quality images can hinder transmission and affect the viewing experience. Traditional vectorization methods require a significant amount of manual parameter adjus…

    Submitted 14 May, 2023; originally announced May 2023.

  8. arXiv:2303.09112  [pdf, other]

    eess.IV cs.AI cs.LG cs.MM

    SigVIC: Spatial Importance Guided Variable-Rate Image Compression

    Authors: Jiaming Liang, Meiqin Liu, Chao Yao, Chunyu Lin, Yao Zhao

    Abstract: Variable-rate mechanism has improved the flexibility and efficiency of learning-based image compression that trains multiple models for different rate-distortion tradeoffs. One of the most common approaches for variable-rate is to channel-wisely or spatial-uniformly scale the internal features. However, the diversity of spatial importance is instructive for bit allocation of image compression. In…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted by IEEE ICASSP2023 (Camera Ready)

  9. arXiv:2208.04622  [pdf, other]

    eess.AS cs.SD

    An Anchor-Free Detector for Continuous Speech Keyword Spotting

    Authors: Zhiyuan Zhao, Chuanxin Tang, Chengdong Yao, Chong Luo

    Abstract: Continuous Speech Keyword Spotting (CSKWS) is a task to detect predefined keywords in a continuous speech. In this paper, we regard CSKWS as a one-dimensional object detection task and propose a novel anchor-free detector, named AF-KWS, to solve the problem. AF-KWS directly regresses the center locations and lengths of the keywords through a single-stage deep neural network. In particular, AF-KWS…

    Submitted 9 August, 2022; originally announced August 2022.

    Comments: Accepted by Interspeech 2022

  10. arXiv:2203.16850  [pdf, other]

    eess.IV cs.CV

    Revisiting Document Image Dewarping by Grid Regularization

    Authors: Xiangwei Jiang, Rujiao Long, Nan Xue, Zhibo Yang, Cong Yao, Gui-Song Xia

    Abstract: This paper addresses the problem of document image dewarping, which aims at eliminating the geometric distortion in document images for document digitization. Instead of designing a better neural network to approximate the optical flow fields between the inputs and outputs, we pursue the best readability by taking the text lines and the document boundaries into account from a constrained optimizat…

    Submitted 31 March, 2022; originally announced March 2022.

  11. arXiv:2201.02311  [pdf, other]

    math.OC eess.SY

    Joint Routing and Charging Problem of Electric Vehicles with Incentive-aware Customers Considering Spatio-temporal Charging Prices

    Authors: Canqi Yao, Shibo Chen, Mauro Salazar, Zaiyue Yang

    Abstract: This paper investigates the scheduling problem of a fleet of electric vehicles, providing mobility as a service to a set of time-specified customers, where the operator needs to solve the routing and charging problem jointly for each EV. Hereby we consider incentive-aware customers and propose that the operator offers monetary incentives to customers in exchange for time flexibility. In this way,…

    Submitted 26 May, 2022; v1 submitted 6 January, 2022; originally announced January 2022.

    Comments: Submitted to TRC

  12. arXiv:2110.06441  [pdf, other]

    eess.SY

    Incentive-aware Electric Vehicle Routing Problem: a Bi-level Model and a Joint Solution Algorithm

    Authors: Canqi Yao, Shibo Chen, Mauro Salazar, Zaiyue Yang

    Abstract: Fixed pickup and delivery times can strongly limit the performance of freight transportation. Against this backdrop, fleet operators can use compensation mechanisms such as monetary incentives to buy delay time from their customers, in order to improve the fleet efficiency and ultimately minimize the costs of operation. To make the most of such an operational model, the fleet activities and the in…

    Submitted 24 March, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: Accepted by ACC2022

  13. Cooperative Operation of the Fleet Operator and Incentive-aware Customers in an On-demand Delivery System: A Bi-level Approach

    Authors: Canqi Yao, Shibo Chen, Zaiyue Yang

    Abstract: In this paper, we study the cooperative operation problem between the fleet operator and incentive-aware customers in an on-demand delivery system. Specifically, the fleet operator offers discounts on transportation costs in exchange of the delivery time flexibility of customers. In order to capture the interaction between the fleet operator and customers, a novel bi-level optimization framework i…

    Submitted 12 October, 2023; v1 submitted 8 September, 2021; originally announced September 2021.

    Comments: Accepted for publication in IEEE Internet of Things Journal

    Journal ref: IEEE Internet of Things Journal, 2023

  14. arXiv:2105.09165  [pdf, other]

    math.OC eess.SY

    Evacuation Problem Under the Nuclear Leakage Accident

    Authors: Canqi Yao, Shibo Chen, Zaiyue Yang

    Abstract: To handle the detrimental effects brought by leakage of radioactive gases at nuclear power station, we propose a bus based evacuation optimization problem. The proposed model incorporates the following four constraints, 1) the maximum dose of radiation per evacuee, 2) the limitation of bus capacity, 3) the number of evacuees at demand node (bus pickup stop), 4) evacuees balance at demand and shelt…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: Accepted by 2021 40th Chinese Control Conference (CCC). IEEE

  15. Joint Routing and Charging Problem of Multiple Electric Vehicles: A Fast Optimization Algorithm

    Authors: Canqi Yao, Shibo Chen, Zaiyue Yang

    Abstract: Logistics has gained great attentions with the prosperous development of commerce, which is often seen as the classic optimal vehicle routing problem. Meanwhile, electric vehicle (EV) has been widely used in logistic fleet to curb the emission of green house gases in recent years. Solving the optimization problem of joint routing and charging of multiple EVs is in a urgent need, whose objective fu…

    Submitted 12 May, 2021; originally announced May 2021.

    Comments: Accepted by IEEE Transactions on Intelligent Transportation Systems, DOI: 10.1109/TITS.2021.3076601

    Journal ref: IEEE Transactions on Intelligent Transportation Systems, 2021

  16. arXiv:2010.10163  [pdf, other]

    eess.IV cs.CV cs.LG

    Claw U-Net: A Unet-based Network with Deep Feature Concatenation for Scleral Blood Vessel Segmentation

    Authors: Chang Yao, Jingyu Tang, Menghan Hu, Yue Wu, Wenyi Guo, Qingli Li, Xiao-Ping Zhang

    Abstract: Sturge-Weber syndrome (SWS) is a vascular malformation disease, and it may cause blindness if the patient's condition is severe. Clinical results show that SWS can be divided into two types based on the characteristics of scleral blood vessels. Therefore, how to accurately segment scleral blood vessels has become a significant problem in computer-aided diagnosis. In this research, we propose to co…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: 5 pages, 4 figures

  17. arXiv:2005.08748  [pdf, other]

    cs.LG eess.SP physics.ao-ph stat.ML

    Deep Learning for Post-Processing Ensemble Weather Forecasts

    Authors: Peter Grönquist, Chengyuan Yao, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Shigang Li, Torsten Hoefler

    Abstract: Quantifying uncertainty in weather forecasts is critical, especially for predicting extreme weather events. This is typically accomplished with ensemble prediction systems, which consist of many perturbed numerical weather simulations, or trajectories, run in parallel. These systems are associated with a high computational cost and often involve statistical post-processing steps to inexpensively i…

    Submitted 21 September, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

  18. arXiv:1910.01796  [pdf]

    q-bio.QM eess.IV

    Transfer Learning for Automated OCTA Detection of Diabetic Retinopathy

    Authors: David Le, Minhaj Alam, Cham Yao, Jennifer I. Lim, R. V. P. Chan, Devrim Toslak, Xincheng Yao

    Abstract: Purpose: To test the feasibility of using deep learning for optical coherence tomography angiography (OCTA) detection of diabetic retinopathy (DR). Methods: A deep learning convolutional neural network (CNN) architecture VGG16 was employed for this study. A transfer learning process was implemented to re-train the CNN for robust OCTA classification. In order to demonstrate the feasibility of using…

    Submitted 4 October, 2019; originally announced October 2019.

    Comments: 20 pages, 4 figures, 6 tables

  19. arXiv:1908.11834  [pdf, other]

    cs.CV cs.LG eess.IV

    Rethinking Irregular Scene Text Recognition

    Authors: Shangbang Long, Yushuo Guan, Bingxuan Wang, Kaigui Bian, Cong Yao

    Abstract: Reading text from natural images is challenging due to the great variety in text font, color, size, complex background and etc.. The perspective distortion and non-linear spatial arrangement of characters make it further difficult. While rectification based method is intuitively grounded and has pushed the envelope by far, its potential is far from being well exploited. In this paper, we present a…

    Submitted 11 November, 2019; v1 submitted 30 August, 2019; originally announced August 2019.

    Comments: Technical report for participation in ICDAR2019-ArT recognition track