Showing 1–36 of 36 results for author: Song, W

Searching in archive eess.
  1. arXiv:2404.16346  [pdf, other]

    eess.IV cs.AI cs.CV

    Light-weight Retinal Layer Segmentation with Global Reasoning

    Authors: Xiang He, Weiye Song, Yiming Wang, Fabio Poiesi, Ji Yi, Manishi Desai, Quanqing Xu, Kongzheng Yang, Yi Wan

    Abstract: Automatic retinal layer segmentation with medical images, such as optical coherence tomography (OCT) images, serves as an important tool for diagnosing ophthalmic diseases. However, it is challenging to achieve accurate segmentation due to low contrast and blood flow noises presented in the images. In addition, the algorithm should be light-weight to be deployed for practical clinical applications…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: IEEE Transactions on Instrumentation & Measurement

  2. arXiv:2403.05808  [pdf, other]

    cs.CV eess.IV

    Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution

    Authors: Junxiong Lin, Yan Wang, Zeng Tao, Boyang Wang, Qing Zhao, Haorang Wang, Xuan Tong, Xinji Mai, Yuxuan Lin, Wei Song, Jiawen Yu, Shaoqi Yan, Wenqiang Zhang

    Abstract: Pre-trained diffusion models utilized for image generation encapsulate a substantial reservoir of a priori knowledge pertaining to intricate textures. Harnessing the potential of leveraging this a priori knowledge in the context of image super-resolution presents a compelling avenue. Nonetheless, prevailing diffusion-based methodologies presently overlook the constraints imposed by degradation inf…

    Submitted 9 March, 2024; originally announced March 2024.

  3. arXiv:2403.05136  [pdf, other]

    cs.RO eess.SP

    DeRO: Dead Reckoning Based on Radar Odometry With Accelerometers Aided for Robot Localization

    Authors: Hoang Viet Do, Yong Hun Kim, Joo Han Lee, Min Ho Lee, Jin Woo Song

    Abstract: In this paper, we propose a radar odometry structure that directly utilizes radar velocity measurements for dead reckoning while maintaining its ability to update estimations within the Kalman filter framework. Specifically, we employ the Doppler velocity obtained by a 4D Frequency Modulated Continuous Wave (FMCW) radar in conjunction with gyroscope data to calculate poses. This approach helps mit…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 9 pages, 5 figures, 1 table, conference

    ACM Class: I.2.9

  4. arXiv:2401.05850  [pdf, other]

    cs.SD eess.AS

    Contrastive Loss Based Frame-wise Feature disentanglement for Polyphonic Sound Event Detection

    Authors: Yadong Guan, Jiqing Han, Hongwei Song, Wenjie Song, Guibin Zheng, Tieran Zheng, Yongjun He

    Abstract: Overlapping sound events are ubiquitous in real-world environments, but existing end-to-end sound event detection (SED) methods still struggle to detect them effectively. A critical reason is that these methods represent overlapping events using shared and entangled frame-wise features, which degrades the feature discrimination. To solve the problem, we propose a disentangled feature learning fram…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: accepted by icassp2024

  5. arXiv:2401.03352  [pdf, other]

    eess.SY

    Dynamic and Memory-efficient Shape Based Methodologies for User Type Identification in Smart Grid Applications

    Authors: Rui Yuan, S. Ali Pourmousavi, Wen L. Soong, Jon A. R. Liisberg

    Abstract: Detecting behind-the-meter (BTM) equipment and major appliances at the residential level and tracking their changes in real time is important for aggregators and traditional electricity utilities. In our previous work, we developed a systematic solution called IRMAC to identify residential users' BTM equipment and applications from their imported energy data. As a part of IRMAC, a Similarity Profi…

    Submitted 6 January, 2024; originally announced January 2024.

  6. arXiv:2312.04398  [pdf]

    cs.CV cs.AI cs.LG eess.IV stat.ML

    Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning

    Authors: Yongqi Dong, Xingmin Lu, Ruohan Li, Wei Song, Bart van Arem, Haneen Farah

    Abstract: The burgeoning navigation services using digital maps provide great convenience to drivers. Nevertheless, the presence of anomalies in lane rendering map images occasionally introduces potential hazards, as such anomalies can be misleading to human drivers and consequently contribute to unsafe driving conditions. In response to this concern and to accurately and effectively detect the anomalies, t…

    Submitted 29 May, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: 22 pages, 6 figures, accepted by the 103rd Transportation Research Board (TRB) Annual Meeting, under review by Transportation Research Record: Journal of the Transportation Research Board

  7. arXiv:2310.12399  [pdf, other]

    eess.SP eess.AS

    A New Time Series Similarity Measure and Its Smart Grid Applications

    Authors: Rui Yuan, S. Ali Pourmousavi, Wen L. Soong, Andrew J. Black, Jon A. R. Liisberg, Julian Lemos-Vinasco

    Abstract: Many smart grid applications involve data mining, clustering, classification, identification, and anomaly detection, among others. These applications primarily depend on the measurement of similarity, which is the distance between different time series or subsequences of a time series. The commonly used time series distance measures, namely Euclidean Distance (ED) and Dynamic Time Warping (DTW), d…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 7 pages, 6 figures conference

  8. arXiv:2303.12123  [pdf, other]

    eess.IV cs.CV

    Oral-3Dv2: 3D Oral Reconstruction from Panoramic X-Ray Imaging with Implicit Neural Representation

    Authors: Weinan Song, Haoxin Zheng, Dezhan Tu, Chengwen Liang, Lei He

    Abstract: 3D reconstruction of medical imaging from 2D images has become an increasingly interesting topic with the development of deep learning models in recent years. Previous studies in 3D reconstruction from limited X-ray images mainly rely on learning from paired 2D and 3D images, where the reconstruction quality relies on the scale and variation of collected data. This has brought significant challeng…

    Submitted 3 September, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

  9. arXiv:2302.11806  [pdf, other]

    eess.IV cs.CV

    PLU-Net: Extraction of multi-scale feature fusion

    Authors: Weihu Song

    Abstract: Deep learning algorithms have achieved remarkable results in medical image segmentation in recent years. These networks are unable to handle with image boundaries and details with enormous parameters, resulting in poor segmentation results. To address the issue, we develop atrous spatial pyramid pooling (ASPP) and combine it with the Squeeze-and-Excitation block (SE block), as well as present the…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: 11 pages, 9 figures

  10. arXiv:2302.10412  [pdf, other]

    eess.IV cs.CV

    Non-pooling Network for medical image segmentation

    Authors: Weihu Song, Heng Yu

    Abstract: Existing studies tend to focus on model modifications and integration with higher accuracy, which improve performance but also carry huge computational costs, resulting in longer detection times. In medical imaging, the use of time is extremely sensitive. And at present most of the semantic segmentation models have encoder-decoder structure or double branch structure. Their several times of the pooli…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: 8 pages, 5 figures

  11. arXiv:2302.06381  [pdf]

    eess.IV cs.CV

    Self-supervised phase unwrapping in fringe projection profilometry

    Authors: Xiaomin Gao, Wanzhong Song, Chunqian Tan, Junzhe Lei

    Abstract: Fast-speed and high-accuracy three-dimensional (3D) shape measurement has been the goal all along in fringe projection profilometry (FPP). The dual-frequency temporal phase unwrapping method (DF-TPU) is one of the prominent technologies to achieve this goal. However, the period number of the high-frequency pattern of existing DF-TPU approaches is usually limited by the inevitable phase errors, set…

    Submitted 30 May, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

  12. arXiv:2211.06170  [pdf, other]

    cs.SD eess.AS

    MaskedSpeech: Context-aware Speech Synthesis with Masking Strategy

    Authors: Ya-Jie Zhang, Wei Song, Yanghao Yue, Zhengchen Zhang, Youzheng Wu, Xiaodong He

    Abstract: Humans often speak in a continuous manner which leads to coherent and consistent prosody properties across neighboring utterances. However, most state-of-the-art speech synthesis systems only consider the information within each sentence and ignore the contextual semantic and acoustic features. This makes it inadequate to generate high-quality paragraph-level speech which requires high expressiven…

    Submitted 18 May, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: Accepted in Interspeech 2023

  13. arXiv:2211.00996  [pdf, other]

    cs.SD eess.AS

    Singing Voice Synthesis with Vibrato Modeling and Latent Energy Representation

    Authors: Yingjie Song, Wei Song, Wei Zhang, Zhengchen Zhang, Dan Zeng, Zhi Liu, Yang Yu

    Abstract: This paper proposes an expressive singing voice synthesis system by introducing explicit vibrato modeling and latent energy representation. Vibrato is essential to the naturalness of synthesized sound, due to the inherent characteristics of human singing. Hence, a deep learning-based vibrato model is introduced in this paper to control the vibrato's likeliness, rate, depth and phase in singing, wh…

    Submitted 2 November, 2022; originally announced November 2022.

  14. arXiv:2211.00967  [pdf, other]

    cs.SD eess.AS

    Multi-Speaker Multi-Style Speech Synthesis with Timbre and Style Disentanglement

    Authors: Wei Song, Yanghao Yue, Ya-jie Zhang, Zhengchen Zhang, Youzheng Wu, Xiaodong He

    Abstract: Disentanglement of a speaker's timbre and style is very important for style transfer in multi-speaker multi-style text-to-speech (TTS) scenarios. With the disentanglement of timbres and styles, TTS systems could synthesize expressive speech for a given speaker with any style which has been seen in the training corpus. However, there are still some shortcomings with the current research on timbre a…

    Submitted 22 November, 2022; v1 submitted 2 November, 2022; originally announced November 2022.

  15. arXiv:2208.03524  [pdf]

    eess.IV cs.CV

    Deep Learning-enabled Spatial Phase Unwrapping for 3D Measurement

    Authors: Xiaolong Luo, Wanzhong Song, Songlin Bai, Yu Li, Zhihe Zhao

    Abstract: In terms of 3D imaging speed and system cost, the single-camera system projecting single-frequency patterns is the ideal option among all proposed Fringe Projection Profilometry (FPP) systems. This system necessitates a robust spatial phase unwrapping (SPU) algorithm. However, robust SPU remains a challenge in complex scenes. Quality-guided SPU algorithms need more efficient ways to identify the u…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: 26 pages

    ACM Class: I.4.5

    Journal ref: Optics & Laser Technology, 163 (2023) 109340

  16. arXiv:2207.13434  [pdf]

    cs.SD cs.CV cs.MM eess.AS

    End-To-End Audiovisual Feature Fusion for Active Speaker Detection

    Authors: Fiseha B. Tesema, Zheyuan Lin, Shiqiang Zhu, Wei Song, Jason Gu, Hong Wu

    Abstract: Active speaker detection plays a vital role in human-machine interaction. Recently, a few end-to-end audiovisual frameworks emerged. However, these models' inference time was not explored and are not applicable for real-time applications due to their complexity and large input size. In addition, they explored a similar feature extraction strategy that employs the ConvNet on audio and visual inputs…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: To appear on the proceeding of the Fourteenth International Conference on Digital Image Processing (ICDIP 2022), May 20-23, Wuhan, China, 8 pages, 3 figures

    Journal ref: Proceedings Volume 12342, Fourteenth International Conference on Digital Image Processing (ICDIP 2022); 123422A (2022)

  17. arXiv:2206.13390  [pdf, other]

    cs.CV cs.LG cs.SD eess.AS

    A Comprehensive Survey on Video Saliency Detection with Auditory Information: the Audio-visual Consistency Perceptual is the Key!

    Authors: Chenglizhao Chen, Mengke Song, Wenfeng Song, Li Guo, Muwei Jian

    Abstract: Video saliency detection (VSD) aims at fast locating the most attractive objects/things/patterns in a given video clip. Existing VSD-related works have mainly relied on the visual system but paid less attention to the audio aspect, while, actually, our audio system is the most vital complementary part to our visual system. Also, audio-visual saliency detection (AVSD), one of the most representativ…

    Submitted 20 June, 2022; originally announced June 2022.

  18. arXiv:2205.00511   

    cs.LG eess.SP

    An Early Fault Detection Method of Rotating Machines Based on Multiple Feature Fusion with Stacking Architecture

    Authors: Wenbin Song, Di Wu, Weiming Shen, Benoit Boulet

    Abstract: Early fault detection (EFD) of rotating machines is important to decrease the maintenance cost and improve the mechanical system stability. One of the key points of EFD is developing a generic model to extract robust and discriminative features from different equipment for early fault detection. Most existing EFD methods focus on learning fault representation by one type of feature. However, a com…

    Submitted 28 February, 2023; v1 submitted 1 May, 2022; originally announced May 2022.

    Comments: The results require to be updated

  19. arXiv:2203.06363  [pdf, other]

    eess.IV cs.CV

    MDT-Net: Multi-domain Transfer by Perceptual Supervision for Unpaired Images in OCT Scan

    Authors: Weinan Song, Gaurav Fotedar, Nima Tajbakhsh, Ziheng Zhou, Lei He, Xiaowei Ding

    Abstract: Deep learning models tend to underperform in the presence of domain shifts. Domain transfer has recently emerged as a promising approach wherein images exhibiting a domain shift are transformed into other domains for augmentation or adaptation. However, with the absence of paired and annotated images, models merely learned by adversarial loss and cycle consistency loss could result in poor consist…

    Submitted 25 October, 2022; v1 submitted 12 March, 2022; originally announced March 2022.

  20. arXiv:2202.03648  [pdf, ps, other]

    eess.SY

    Energy Efficiency and Delay Tradeoff in an MEC-Enabled Mobile IoT Network

    Authors: Han Hu, Weiwei Song, Qun Wang, Rose Qingyang Hu, Hongbo Zhu

    Abstract: Mobile Edge Computing (MEC) has recently emerged as a promising technology in the 5G era. It is deemed an effective paradigm to support computation-intensive and delay critical applications even at energy-constrained and computation-limited Internet of Things (IoT) devices. To effectively exploit the performance benefits enabled by MEC, it is imperative to jointly allocate radio and computational…

    Submitted 8 February, 2022; originally announced February 2022.

  21. arXiv:2201.02656  [pdf, other]

    eess.IV cs.CV

    GPU-Net: Lightweight U-Net with more diverse features

    Authors: Heng Yu, Di Fan, Weihu Song

    Abstract: Image segmentation is an important task in the medical image field and many convolutional neural networks (CNNs) based methods have been proposed, among which U-Net and its variants show promising performance. In this paper, we propose GP-module and GPU-Net based on U-Net, which can learn more diverse features by introducing Ghost module and atrous spatial pyramid pooling (ASPP). Our method achiev…

    Submitted 7 January, 2022; originally announced January 2022.

  22. arXiv:2109.13732  [pdf, ps, other]

    cs.LG eess.SP eess.SY

    IRMAC: Interpretable Refined Motifs in Binary Classification for Smart Grid Applications

    Authors: Rui Yuan, S. Ali Pourmousavi, Wen L. Soong, Giang Nguyen, Jon A. R. Liisberg

    Abstract: Modern power systems are experiencing the challenge of high uncertainty with the increasing penetration of renewable energy resources and the electrification of heating systems. In this paradigm shift, understanding electricity users' demand is of utmost value to retailers, aggregators, and policymakers. However, behind-the-meter (BTM) equipment and appliances at the household level are unknown to…

    Submitted 14 November, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

    Comments: 22 pages, 13 figures

    Journal ref: Engineering Applications of Artificial Intelligence (2022) 105588

  23. arXiv:2106.15842  [pdf, other]

    eess.SP cs.AI cs.LG

    Dual Aspect Self-Attention based on Transformer for Remaining Useful Life Prediction

    Authors: Zhizheng Zhang, Wen Song, Qiqiang Li

    Abstract: Remaining useful life prediction (RUL) is one of the key technologies of condition-based maintenance, which is important to maintain the reliability and safety of industrial equipments. Massive industrial measurement data has effectively improved the performance of the data-driven based RUL prediction method. While deep learning has achieved great success in RUL prediction, existing methods have d…

    Submitted 20 April, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

  24. arXiv:2103.09180  [pdf, ps, other]

    eess.SY

    Mobility-Aware Offloading and Resource Allocation in MEC-Enabled IoT Networks

    Authors: Han Hu, Weiwei Song, Qun Wang, Fuhui Zhou, Rose Qingyang Hu

    Abstract: Mobile edge computing (MEC)-enabled Internet of Things (IoT) networks have been deemed a promising paradigm to support massive energy-constrained and computation-limited IoT devices. IoT with mobility has found tremendous new services in the 5G era and the forthcoming 6G eras such as autonomous driving and vehicular communications. However, mobility of IoT devices has not been studied in the suffi…

    Submitted 16 March, 2021; originally announced March 2021.

  25. arXiv:2011.05161  [pdf, other]

    eess.AS cs.LG cs.SD

    Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-end Speech Synthesis

    Authors: Guanghui Xu, Wei Song, Zhengchen Zhang, Chao Zhang, Xiaodong He, Bowen Zhou

    Abstract: Despite prosody is related to the linguistic information up to the discourse structure, most text-to-speech (TTS) systems only take into account that within each sentence, which makes it challenging when converting a paragraph of texts into natural and expressive speech. In this paper, we propose to use the text embeddings of the neighboring sentences to improve the prosody generation for each utt…

    Submitted 6 November, 2020; originally announced November 2020.

    Comments: 5 pages, 4 figures

  26. arXiv:2008.04147  [pdf, ps, other]

    eess.SP cs.IT

    Knowledge Distillation-aided End-to-End Learning for Linear Precoding in Multiuser MIMO Downlink Systems with Finite-Rate Feedback

    Authors: Kyeongbo Kong, Woo-Jin Song, Moonsik Min

    Abstract: We propose a deep learning-based channel estimation, quantization, feedback, and precoding method for downlink multiuser multiple-input and multiple-output systems. In the proposed system, channel estimation and quantization for limited feedback are handled by a receiver deep neural network (DNN). Precoder selection is handled by a transmitter DNN. To emulate the traditional channel quantization,…

    Submitted 22 March, 2021; v1 submitted 10 August, 2020; originally announced August 2020.

    Comments: 6 pages, 4 figures, submitted to IEEE Transactions on Vehicular Technology

  27. arXiv:2003.10661  [pdf]

    stat.ML cs.LG eess.SP physics.app-ph

    Training a U-Net based on a random mode-coupling matrix model to recover acoustic interference striations

    Authors: Xiaolei Li, Wenhua Song, Dazhi Gao, Wei Gao, Haozhong Wan

    Abstract: A U-Net is trained to recover acoustic interference striations (AISs) from distorted ones. A random mode-coupling matrix model is introduced to generate a large number of training data quickly, which are used to train the U-Net. The performance of AIS recovery of the U-Net is tested in range-dependent waveguides with nonlinear internal waves (NLIWs). Although the random mode-coupling matrix model…

    Submitted 24 March, 2020; originally announced March 2020.

  28. arXiv:2003.08413  [pdf, other]

    eess.IV cs.CV

    Oral-3D: Reconstructing the 3D Bone Structure of Oral Cavity from 2D Panoramic X-ray

    Authors: Weinan Song, Yuan Liang, Jiawei Yang, Kun Wang, Lei He

    Abstract: Panoramic X-ray (PX) provides a 2D picture of the patient's mouth in a panoramic view to help dentists observe the invisible disease inside the gum. However, it provides limited 2D information compared with cone-beam computed tomography (CBCT), another dental imaging method that generates a 3D picture of the oral cavity but with more radiation dose and a higher price. Consequently, it is of great…

    Submitted 8 January, 2021; v1 submitted 18 March, 2020; originally announced March 2020.

  29. arXiv:2002.08406  [pdf, other]

    eess.IV cs.CV

    T-Net: Learning Feature Representation with Task-specific Supervision for Biomedical Image Analysis

    Authors: Weinan Song, Yuan Liang, Jiawei Yang, Kun Wang, Lei He

    Abstract: The encoder-decoder network is widely used to learn deep feature representations from pixel-wise annotations in biomedical image analysis. Under this structure, the performance profoundly relies on the effectiveness of feature extraction achieved by the encoding network. However, few models have considered adapting the attention of the feature extractor even in different kinds of tasks. In this pa…

    Submitted 9 January, 2021; v1 submitted 19 February, 2020; originally announced February 2020.

  30. Deep Learning for Hyperspectral Image Classification: An Overview

    Authors: Shutao Li, Weiwei Song, Leyuan Fang, Yushi Chen, Pedram Ghamisi, Jón Atli Benediktsson

    Abstract: Hyperspectral image (HSI) classification has become a hot topic in the field of remote sensing. In general, the complex characteristics of hyperspectral data make the accurate classification of such data challenging for traditional machine learning methods. In addition, hyperspectral imaging often deals with an inherently nonlinear relation between the captured spectral information and the corresp…

    Submitted 26 October, 2019; originally announced October 2019.

    Journal ref: IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 9, pp. 6690-6709, Sep. 2019

  31. arXiv:1909.04614  [pdf, other]

    cs.CV eess.IV

    Deep Hashing Learning for Visual and Semantic Retrieval of Remote Sensing Images

    Authors: Weiwei Song, Shutao Li, Jon Atli Benediktsson

    Abstract: Driven by the urgent demand for managing remote sensing big data, large-scale remote sensing image retrieval (RSIR) attracts increasing attention in the remote sensing field. In general, existing retrieval methods can be regarded as visual-based retrieval approaches which search and return a set of similar images from a database to a given query image. Although retrieval methods have achieved grea…

    Submitted 10 September, 2019; originally announced September 2019.

  32. arXiv:1907.03246  [pdf]

    eess.IV cs.CV cs.MM

    An Experimental-based Review of Image Enhancement and Image Restoration Methods for Underwater Imaging

    Authors: Yan Wang, Wei Song, Giancarlo Fortino, Lizhe Qi, Wenqiang Zhang, Antonio Liotta

    Abstract: Underwater images play a key role in ocean exploration, but often suffer from severe quality degradation due to light absorption and scattering in water medium. Although major breakthroughs have been made recently in the general area of image enhancement and restoration, the applicability of new methods for improving the quality of underwater images has not specifically been captured. In this pape…

    Submitted 7 July, 2019; originally announced July 2019.

    Comments: 19

  33. arXiv:1906.08673  [pdf]

    eess.IV cs.MM

    Enhancement of Underwater Images with Statistical Model of Background Light and Optimization of Transmission Map

    Authors: Wei Song, Yan Wang, Dongmei Huang, Antonio Liotta, Cristian Perra

    Abstract: Underwater images often have severe quality degradation and distortion due to light absorption and scattering in the water medium. A hazed image formation model is widely used to restore the image quality. It depends on two optical parameters: the background light and the transmission map. Underwater images can also be enhanced by color and contrast correction from the perspective of image process…

    Submitted 19 June, 2019; originally announced June 2019.

    Comments: 17 pages

  34. arXiv:1904.06063  [pdf, other]

    cs.CL cs.SD eess.AS

    Building a mixed-lingual neural TTS system with only monolingual data

    Authors: Liumeng Xue, Wei Song, Guanghui Xu, Lei Xie, Zhizheng Wu

    Abstract: When deploying a Chinese neural text-to-speech (TTS) synthesis system, one of the challenges is to synthesize Chinese utterances with English phrases or words embedded. This paper looks into the problem in the encoder-decoder framework when only monolingual data from a target speaker is available. Specifically, we view the problem from two aspects: speaker consistency within an utterance and natur…

    Submitted 22 August, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

    Comments: To appear in INTERSPEECH 2019

  35. arXiv:1810.12271  [pdf, other]

    eess.SP physics.geo-ph

    Toward Creating Subsurface Camera

    Authors: WenZhan Song, Fangyu Li, Maria Valero, Liang Zhao

    Abstract: In this article, the framework and architecture of Subsurface Camera (SAMERA) is envisioned and described for the first time. A SAMERA is a geophysical sensor network that senses and processes geophysical sensor signals, and computes a 3D subsurface image in-situ in real-time. The basic mechanism is: geophysical waves propagating/reflected/refracted through subsurface enter a network of geophysica…

    Submitted 29 October, 2018; originally announced October 2018.

    Comments: 15 pages, 7 figures

  36. arXiv:1808.09087  [pdf, other]

    cs.AR eess.SY

    TRINITY: Coordinated Performance, Energy and Temperature Management in 3D Processor-Memory Stacks

    Authors: Karthik Rao, William Song, Yorai Wardi, Sudhakar Yalamanchili

    Abstract: The consistent demand for better performance has lead to innovations at hardware and microarchitectural levels. 3D stacking of memory and logic dies delivers an order of magnitude improvement in available memory bandwidth. The price paid however is, tight thermal constraints. In this paper, we study the complex multiphysics interactions between performance, energy and temperature. Using a cache…

    Submitted 9 September, 2018; v1 submitted 27 August, 2018; originally announced August 2018.