Showing 1–50 of 71 results for author: Chang, H

Searching in archive eess.
  1. arXiv:2406.11057  [pdf, other]

    eess.SY

    Design of Interacting Particle Systems for Fast and Efficient Reinforcement Learning

    Authors: Anant A Joshi, Heng-Sheng Chang, Amirhossein Taghvaei, Prashant G Mehta, Sean P. Meyn

    Abstract: This paper is concerned with the design of algorithms based on systems of interacting particles to represent, approximate, and learn the optimal control law for reinforcement learning (RL). The primary contribution of the present paper is to show that convergence rates can be accelerated dramatically through careful design of interactions between particles. Theory focuses on the linear quadratic s…

    Submitted 16 June, 2024; originally announced June 2024.

  2. arXiv:2406.06650  [pdf, other]

    eess.IV cs.CV

    Predicting the risk of early-stage breast cancer recurrence using H&E-stained tissue images

    Authors: Geongyu Lee, Joonho Lee, Tae-Yeong Kwak, Sun Woo Kim, Youngmee Kwon, Chungyeul Kim, Hyeyoon Chang

    Abstract: Accurate prediction of the likelihood of recurrence is important in the selection of postoperative treatment for patients with early-stage breast cancer. In this study, we investigated whether deep learning algorithms can predict patients' risk of recurrence by analyzing the pathology images of their cancer histology. A total of 125 hematoxylin and eosin stained breast cancer whole slide images la…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages, 7 figures

  3. arXiv:2405.20502  [pdf, ps, other]

    eess.SY math.DS math.OC

    Reach-Avoid Control Synthesis for a Quadrotor UAV with Formal Safety Guarantees

    Authors: Mohamed Serry, Haocheng Chang, Jun Liu

    Abstract: Reach-avoid specifications are one of the most common tasks in autonomous aerial vehicle (UAV) applications. Despite the intensive research and development associated with control of aerial vehicles, generating feasible trajectories through complex environments and tracking them with formal safety guarantees remain challenging. In this paper, we propose a control framework for a quadrotor UAV that…

    Submitted 30 May, 2024; originally announced May 2024.

  4. arXiv:2404.09385  [pdf, other]

    eess.AS cs.CL eess.SP

    A Large-Scale Evaluation of Speech Foundation Models

    Authors: Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee

    Abstract: The foundation model paradigm leverages a shared foundation model to achieve state-of-the-art (SOTA) performance for various tasks, requiring minimal downstream-specific modeling and data annotation. This approach has proven crucial in the field of Natural Language Processing (NLP). However, the speech processing community lacks a similar setup to explore the paradigm systematically. In this work,…

    Submitted 29 May, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: The extended journal version for SUPERB and SUPERB-SG. Published in IEEE/ACM TASLP. The arXiv version is preferred.

  5. arXiv:2404.05191  [pdf, ps, other]

    eess.SP

    Graph-based Untrained Neural Network Detector for OTFS Systems

    Authors: Hao Chang, Branka Vucetic, Wibowo Hardjawana

    Abstract: Inter-carrier interference (ICI) caused by mobile reflectors significantly degrades the conventional orthogonal frequency division multiplexing (OFDM) performance in high-mobility environments. The orthogonal time frequency space (OTFS) modulation system effectively represents ICI in the delay-Doppler domain, thus significantly outperforming OFDM. Existing iterative and neural network (NN) based O…

    Submitted 8 April, 2024; originally announced April 2024.

  6. arXiv:2402.06959  [pdf, other]

    cs.CL cs.SD eess.AS

    SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data

    Authors: Hsuan-Fu Wang, Yi-Jen Shih, Heng-Jui Chang, Layne Berry, Puyuan Peng, Hung-yi Lee, Hsin-Min Wang, David Harwath

    Abstract: The recently proposed visually grounded speech model SpeechCLIP is an innovative framework that bridges speech and text through images via CLIP without relying on text transcription. On this basis, this paper introduces two extensions to SpeechCLIP. First, we apply the Continuous Integrate-and-Fire (CIF) module to replace a fixed number of CLS tokens in the cascaded architecture. Second, we propos…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024, Self-supervision in Audio, Speech, and Beyond (SASB) workshop

  7. arXiv:2312.01042  [pdf, ps, other]

    cs.IT eess.SP

    Covert Communications in STAR-RIS-Aided Rate-Splitting Multiple Access Systems

    Authors: Heng Chang, Hai Yang, Shuobo Xu, Xiyu Pang, Hongwu Liu

    Abstract: In this paper, we investigate covert communications in a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)-aided rate-splitting multiple access (RSMA) system. Under the RSMA principles, the messages for the covert user (Bob) and public user (Grace) are converted to the common and private streams at the legitimate transmitter (Alice) to realize downlink transm…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: 17 pages, submitted to journal

  8. arXiv:2311.09117  [pdf, other]

    cs.CL cs.SD eess.AS

    R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces

    Authors: Heng-Jui Chang, James Glass

    Abstract: This paper introduces Robust Spin (R-Spin), a data-efficient domain-specific self-supervision method for speaker and noise-invariant speech representations by learning discrete acoustic units with speaker-invariant clustering (Spin). R-Spin resolves Spin's issues and enhances content representations by learning to predict acoustic pieces. R-Spin offers a 12X reduction in computational resources co…

    Submitted 1 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted to NAACL 2024

  9. arXiv:2311.08439  [pdf, other]

    eess.IV cs.CV cs.LG

    A Unified Approach for Comprehensive Analysis of Various Spectral and Tissue Doppler Echocardiography

    Authors: Jaeik Jeon, Jiyeon Kim, Yeonggul Jang, Yeonyee E. Yoon, Dawun Jeong, Youngtaek Hong, Seung-Ah Lee, Hyuk-Jae Chang

    Abstract: Doppler echocardiography offers critical insights into cardiac function and phases by quantifying blood flow velocities and evaluating myocardial motion. However, previous methods for automating Doppler analysis, ranging from initial signal processing techniques to advanced deep learning approaches, have been constrained by their reliance on electrocardiogram (ECG) data and their inability to proc…

    Submitted 14 November, 2023; originally announced November 2023.

  10. arXiv:2311.00518  [pdf]

    eess.IV

    See SIFT in a Rain

    Authors: Wei Wu, Hao Chang, Zhu Li

    Abstract: Rain streaks bring complicated pixel intensity changes and additional gradients, greatly obstructing the extraction of image features from background. This causes serious performance degradation in feature-based applications. Thus, it is critical to remove rain streaks from a single rainy image to recover image features. Recently, many excellent image deraining methods have made remarkable progres…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: A direct DoG feature pyramid recovery from rainy pixels solution for SIFT detection, accepted by T-CSVT, 2023

    Journal ref: IEEE Trans. on Circuits & System for Video Tech., 2023

  11. arXiv:2310.14893  [pdf, other]

    cs.LG eess.SY stat.AP

    Data Drift Monitoring for Log Anomaly Detection Pipelines

    Authors: Dipak Wani, Samuel Ackerman, Eitan Farchi, Xiaotong Liu, Hau-wen Chang, Sarasi Lalithsena

    Abstract: Logs enable the monitoring of infrastructure status and the performance of associated applications. Logs are also invaluable for diagnosing the root causes of any problems that may arise. Log Anomaly Detection (LAD) pipelines automate the detection of anomalies in logs, providing assistance to site reliability engineers (SREs) in system diagnosis. Log patterns change over time, necessitating updat…

    Submitted 17 October, 2023; originally announced October 2023.

  12. arXiv:2310.12837  [pdf]

    eess.AS

    Deep Beamforming for Speech Enhancement and Speaker Localization with an Array Response-Aware Loss Function

    Authors: Hsinyu Chang, Yicheng Hsu, Mingsian R. Bai

    Abstract: Recent research advances in deep neural network (DNN)-based beamformers have shown great promise for speech enhancement under adverse acoustic conditions. Different network architectures and input features have been explored in estimating beamforming weights. In this paper, we propose a deep beamformer based on an efficient convolutional recurrent network (CRN) trained with a novel ARray RespOnse-…

    Submitted 22 October, 2023; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: 6 pages

  13. arXiv:2310.08897  [pdf, other]

    eess.IV cs.CV cs.LG

    Self supervised convolutional kernel based handcrafted feature harmonization: Enhanced left ventricle hypertension disease phenotyping on echocardiography

    Authors: **a Lee, Youngtaek Hong, Dawun Jeong, Yeonggul Jang, Jaeik Jeon, Sihyeon Jeong, Taekgeun Jung, Yeonyee E. Yoon, Inki Moon, Seung-Ah Lee, Hyuk-Jae Chang

    Abstract: Radiomics, a medical imaging technique, extracts quantitative handcrafted features from images to predict diseases. Harmonization in those features ensures consistent feature extraction across various imaging devices and protocols. Methods for harmonization include standardized imaging protocols, statistical adjustments, and evaluating feature robustness. Myocardial diseases such as Left Ventricul…

    Submitted 22 November, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: 11 pages, 7 figures

  14. arXiv:2309.07707  [pdf, other]

    cs.CL cs.SD eess.AS

    CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders

    Authors: Heng-Jui Chang, Ning Dong, Ruslan Mavlyutov, Sravya Popuri, Yu-An Chung

    Abstract: Large-scale self-supervised pre-trained speech encoders outperform conventional approaches in speech recognition and translation tasks. Due to the high cost of developing these large models, building new encoders for new tasks and deploying them to on-device applications are infeasible. Prior studies propose model compression methods to address this issue, but those works focus on smaller models a…

    Submitted 27 December, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024

  15. arXiv:2308.16483  [pdf, other]

    eess.SP cs.HC cs.LG

    Improving Out-of-Distribution Detection in Echocardiographic View Classification through Enhancing Semantic Features

    Authors: Jaeik Jeon, Seongmin Ha, Yeonggul Jang, Yeonyee E. Yoon, Jiyeon Kim, Hyunseok Jeong, Dawun Jeong, Youngtaek Hong, Seung-Ah Lee, Hyuk-Jae Chang

    Abstract: In echocardiographic view classification, accurately detecting out-of-distribution (OOD) data is essential but challenging, especially given the subtle differences between in-distribution and OOD data. While conventional OOD detection methods, such as Mahalanobis distance (MD), are effective in far-OOD scenarios with clear distinctions between distributions, they struggle to discern the less obviou…

    Submitted 23 November, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

  16. arXiv:2308.04156  [pdf, other]

    cs.CV cs.MM eess.IV

    Towards Top-Down Stereo Image Quality Assessment via Stereo Attention

    Authors: Huilin Zhang, Sumei Li, Haoxiang Chang, Peiming Lin

    Abstract: Stereo image quality assessment (SIQA) plays a crucial role in evaluating and improving the visual experience of 3D content. Existing visual properties-based methods for SIQA have achieved promising performance. However, these approaches ignore the top-down philosophy, leading to a lack of a comprehensive grasp of the human visual system (HVS) and SIQA. This paper presents a novel Stereo AttenTion…

    Submitted 14 November, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

    Comments: 12 pages, 5 figures

  17. Optimal preprocessing of WiFi CSI for sensing applications

    Authors: Vishnu V. Ratnam, Hao Chen, Hao Hsuan Chang, Abhishek Sehgal, Jianzhong Zhang

    Abstract: Due to its ubiquitous and contact-free nature, the use of WiFi infrastructure for performing sensing tasks has tremendous potential. However, the channel state information (CSI) measured by a WiFi receiver suffers from errors in both its gain and phase, which can significantly hinder sensing tasks. By analyzing these errors from different WiFi receivers, a mathematical model for these gain and pha…

    Submitted 21 May, 2024; v1 submitted 22 July, 2023; originally announced July 2023.

    Comments: Paper is accepted to IEEE Transactions on Wireless Communications

    Journal ref: IEEE Transactions on Wireless Communications (2024)

  18. arXiv:2305.17896  [pdf, other]

    eess.SP

    Continuous and Noninvasive Measurement of Arterial Pulse Pressure and Pressure Waveform using an Image-free Ultrasound System

    Authors: Lirui Xu, Pang Wu, Pan Xia, Fanglin Geng, Peng Wang, Xianxiang Chen, Zhenfeng Li, Lidong Du, Shu** Liu, Li Li, Hongbo Chang, Zhen Fang

    Abstract: The beat-to-beat local pulse pressure (PP) and blood pressure waveform of arteries, especially central arteries, are important indicators of the course of cardiovascular diseases (CVDs). Nevertheless, noninvasive measurement of them remains a challenge in the clinic. This work presents a three-element image-free ultrasound system with a low-computational method for real-time measurement of l…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: 13 pages, 12 figures

  19. arXiv:2305.11237  [pdf, other]

    eess.SP

    DRL meets DSA Networks: Convergence Analysis and Its Application to System Design

    Authors: Ramin Safavinejad, Hao-Hsuan Chang, Lingjia Liu

    Abstract: In dynamic spectrum access (DSA) networks, secondary users (SUs) need to opportunistically access primary users' (PUs) radio spectrum without causing significant interference. Since the interaction between the SU and the PU systems is limited, deep reinforcement learning (DRL) has been introduced to help SUs to conduct spectrum access. Specifically, deep recurrent Q network (DRQN) has been utiliz…

    Submitted 18 May, 2023; originally announced May 2023.

  20. arXiv:2305.11072  [pdf, other]

    cs.CL eess.AS

    Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering

    Authors: Heng-Jui Chang, Alexander H. Liu, James Glass

    Abstract: Self-supervised speech representation models have succeeded in various tasks, but improving them for content-related problems using unlabeled data is challenging. We propose speaker-invariant clustering (Spin), a novel self-supervised learning method that clusters speech representations and performs swapped prediction between the original and speaker-perturbed utterances. Spin disentangles speaker…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted to Interspeech 2023

  21. arXiv:2305.04414  [pdf, ps, other]

    eess.SP

    Untrained Neural Network based Bayesian Detector for OTFS Modulation Systems

    Authors: Hao Chang, Alva Kosasih, Wibowo Hardjawana, Xinwei Qu, Branka Vucetic

    Abstract: The orthogonal time frequency space (OTFS) symbol detector design for high mobility communication scenarios has received considerable attention lately. Current state-of-the-art OTFS detectors can mainly be divided into two categories: iterative and training-based deep neural network (DNN) detectors. Many practical iterative detectors rely on a minimum-mean-square-error (MMSE) denoiser to get the initial…

    Submitted 7 May, 2023; originally announced May 2023.

  22. arXiv:2303.09828  [pdf, other]

    eess.SY

    Model Reference Gaussian Process Regression: Data-Driven State Feedback Controller

    Authors: Hyuntae Kim, Hamin Chang, Hyungbo Shim

    Abstract: This paper proposes a data-driven state feedback controller that enables reference tracking for nonlinear discrete-time systems. The controller is designed based on the identified inverse model of the system and a given reference model, assuming that the identification of the inverse model is carried out using only the system's state/input measurements. When its results are provided, we present co…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: 6 pages, 3 figures, Submitted to LCSS/CDC 2023

  23. arXiv:2302.05811  [pdf, other]

    cs.RO eess.SY

    Hierarchical control and learning of a foraging CyberOctopus

    Authors: Chia-Hsien Shih, Noel Naughton, Udit Halder, Heng-Sheng Chang, Seung Hyun Kim, Rhanor Gillette, Prashant G. Mehta, Mattia Gazzola

    Abstract: Inspired by the unique neurophysiology of the octopus, we propose a hierarchical framework that simplifies the coordination of multiple soft arms by decomposing control into high-level decision making, low-level motor activation, and local reflexive behaviors via sensory feedback. When evaluated in the illustrative problem of a model octopus foraging for food, this hierarchical decomposition resul…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: 16 pages, 7 figures

  24. arXiv:2301.05351  [pdf, other]

    eess.SY

    Data-driven Moving Horizon Estimation for Angular Velocity of Space Noncooperative Target in Eddy Current De-tumbling Mission

    Authors: Xiyao Liu, Haitao Chang, Zhenyu Lu, Panfeng Huang

    Abstract: Angular velocity estimation is critical for eddy current de-tumbling of noncooperative space targets. However, the unknown model of the noncooperative target and the scarcity of observation data make model-based estimation methods challenging to apply. In this paper, a Data-driven Moving Horizon Estimation method is proposed to estimate the angular velocity of the noncooperative target with de-tumbling torque. In this…

    Submitted 12 January, 2023; originally announced January 2023.

  25. arXiv:2211.06619  [pdf, other]

    math.OC eess.SP physics.optics

    Fast Iterative Algorithms for Blind Phase Retrieval: A survey

    Authors: Huibin Chang, Li Yang, Stefano Marchesini

    Abstract: In nanoscale imaging technique and ultrafast laser, the reconstruction procedure is normally formulated as a blind phase retrieval (BPR) problem, where one has to recover both the sample and the probe (pupil) jointly from phaseless data. This survey first presents the mathematical formula of BPR, related nonlinear optimization problems and then gives a brief review of the recent iterative algorith…

    Submitted 12 November, 2022; originally announced November 2022.

  26. arXiv:2211.01180  [pdf, other]

    cs.CL cs.SD eess.AS

    M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval

    Authors: Layne Berry, Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Hung-yi Lee, David Harwath

    Abstract: This work investigates the use of large-scale, English-only pre-trained models (CLIP and HuBERT) for multilingual image-speech retrieval. For non-English image-speech retrieval, we outperform the current state-of-the-art performance by a wide margin both when training separate models for each language, and with a single model which processes speech in all three languages. We identify key differenc…

    Submitted 10 April, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: Accepted to ICASSP 2023

  27. arXiv:2210.02494  [pdf, other]

    eess.SY

    Model Reference Gaussian Process Regression: Data-Driven Output Feedback Controller

    Authors: Hyuntae Kim, Hamin Chang, Hyungbo Shim

    Abstract: Data-driven controls using Gaussian process regression have recently gained much attention. In such approaches, system identification by Gaussian process regression is mostly followed by model-based controller designs. However, the outcomes of Gaussian process regression are often too complicated to apply conventional control designs, which makes the numerical design such as model predictive contr…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: 6 pages, 5 figures, submitted to American Control Conference 2023

  28. arXiv:2210.00705  [pdf, other]

    cs.CL cs.SD eess.AS

    SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model

    Authors: Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Layne Berry, Hung-yi Lee, David Harwath

    Abstract: Data-driven speech processing models usually perform well with a large amount of text supervision, but collecting transcribed speech data is costly. Therefore, we propose SpeechCLIP, a novel framework bridging speech and text through images to enhance speech models without transcriptions. We leverage state-of-the-art pre-trained HuBERT and CLIP, aligning them via paired images and spoken captions…

    Submitted 25 October, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE SLT 2022

  29. arXiv:2209.08630  [pdf, other]

    cs.CV cs.AI cs.CY cs.GT eess.IV

    RVSL: Robust Vehicle Similarity Learning in Real Hazy Scenes Based on Semi-supervised Learning

    Authors: Wei-Ting Chen, I-Hsiang Chen, Chih-Yuan Yeh, Hao-Hsiang Yang, Hua-En Chang, Jian-Jiun Ding, Sy-Yen Kuo

    Abstract: Recently, vehicle similarity learning, also called re-identification (ReID), has attracted significant attention in computer vision. Several algorithms have been developed and obtained considerable success. However, most existing methods have unpleasant performance in the hazy scenario due to poor visibility. Though some strategies are possible to resolve this problem, they still have room to be i…

    Submitted 18 September, 2022; originally announced September 2022.

    Comments: Accepted by ECCV 2022

  30. arXiv:2209.04089  [pdf, other]

    eess.SY cs.RO physics.bio-ph

    Energy Shaping Control of a Muscular Octopus Arm Moving in Three Dimensions

    Authors: Heng-Sheng Chang, Udit Halder, Chia-Hsien Shih, Noel Naughton, Mattia Gazzola, Prashant G. Mehta

    Abstract: Flexible octopus arms exhibit an exceptional ability to coordinate large numbers of degrees of freedom and perform complex manipulation tasks. As a consequence, these systems continue to attract the attention of biologists and roboticists alike. In this paper, we develop a three-dimensional model of a soft octopus arm, equipped with biomechanically realistic muscle actuation. Internal forces and c…

    Submitted 8 September, 2022; originally announced September 2022.

  31. Federated Learning Enables Big Data for Rare Cancer Boundary Detection

    Authors: Sarthak Pati, Ujjwal Baid, Brandon Edwards, Micah Sheller, Shih-Han Wang, G Anthony Reina, Patrick Foley, Alexey Gruzdev, Deepthi Karkada, Christos Davatzikos, Chiharu Sako, Satyam Ghodasara, Michel Bilello, Suyash Mohan, Philipp Vollmuth, Gianluca Brugnara, Chandrakanth J Preetha, Felix Sahm, Klaus Maier-Hein, Maximilian Zenk, Martin Bendszus, Wolfgang Wick, Evan Calabrese, Jeffrey Rudie, Javier Villanueva-Meyer, et al. (254 additional authors not shown)

    Abstract: Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train acc…

    Submitted 25 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: federated learning, deep learning, convolutional neural network, segmentation, brain tumor, glioma, glioblastoma, FeTS, BraTS

  32. arXiv:2204.08987  [pdf]

    cs.LG eess.SP

    Deep learning based closed-loop optimization of geothermal reservoir production

    Authors: Nanzhe Wang, Haibin Chang, Xiangzhao Kong, Martin O. Saar, Dongxiao Zhang

    Abstract: To maximize the economic benefits of geothermal energy production, it is essential to optimize geothermal reservoir management strategies, in which geologic uncertainty should be considered. In this work, we propose a closed-loop optimization framework, based on deep learning surrogates, for the well control optimization of geothermal reservoirs. In this framework, we construct a hybrid convolutio…

    Submitted 15 April, 2022; originally announced April 2022.

    Comments: 37 pages, 24 figures

  33. arXiv:2203.06849  [pdf, other]

    cs.CL cs.SD eess.AS

    SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities

    Authors: Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zili Huang, Kushal Lakhotia, Shu-wen Yang, Shuyan Dong, Andy T. Liu, Cheng-I Jeff Lai, Jiatong Shi, Xuankai Chang, Phil Hall, Hsuan-Jui Chen, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee

    Abstract: Transfer learning has proven to be crucial in advancing the state of speech and natural language processing research in recent years. In speech, a model pre-trained by self-supervised learning transfers remarkably well on multiple tasks. However, the lack of a consistent evaluation methodology is limiting towards a holistic understanding of the efficacy of such models. SUPERB was a step towards in…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: ACL 2022 main conference

  34. arXiv:2203.05043  [pdf, other]

    cs.RO eess.SY

    In-Place Rotation for Enhancing Snake-like Robot Mobility

    Authors: Alexander H. Chang, Patricio A. Vela

    Abstract: Gaits engineered for snake-like robots to rotate in-place instrumentally fill a gap in the set of locomotive gaits that have traditionally prioritized translation. This paper designs a Turn-in-Place gait and demonstrates the ability of a shape-centric modeling framework to capture the gait's locomotive properties. Shape modeling for turning involves a time-varying continuous body curve described b…

    Submitted 9 March, 2022; originally announced March 2022.

    Comments: 8 pages, 5 figures. Submitted to RA-L (IEEE Robotics and Automation Letters) with IROS 2022 Option

  35. arXiv:2202.13627  [pdf, ps, other]

    cs.IT eess.SP

    Changeable Rate and Novel Quantization for CSI Feedback Based on Deep Learning

    Authors: Xin Liang, Haoran Chang, Haozhen Li, Xinyu Gu, Lin Zhang

    Abstract: Deep learning (DL)-based channel state information (CSI) feedback improves the capacity and energy efficiency of massive multiple-input multiple-output (MIMO) systems in frequency division duplexing mode. However, multiple neural networks with different lengths of feedback overhead are required by time-varying bandwidth resources. The storage space required at the user equipment (UE) and the base…

    Submitted 28 February, 2022; originally announced February 2022.

  36. arXiv:2202.01946  [pdf, ps, other]

    eess.SP cs.IT cs.LG

    Unsupervised Learning Based Hybrid Beamforming with Low-Resolution Phase Shifters for MU-MIMO Systems

    Authors: Chia-Ho Kuo, Hsin-Yuan Chang, Ronald Y. Chang, Wei-Ho Chung

    Abstract: Millimeter wave (mmWave) is a key technology for fifth-generation (5G) and beyond communications. Hybrid beamforming has been proposed for large-scale antenna systems in mmWave communications. Existing hybrid beamforming designs based on infinite-resolution phase shifters (PSs) are impractical due to hardware cost and power consumption. In this paper, we propose an unsupervised-learning-based sche…

    Submitted 3 February, 2022; originally announced February 2022.

    Comments: IEEE International Conference on Communications (ICC) 2022

  37. arXiv:2110.09924  [pdf, ps, other]

    eess.AS cs.SD

    Speech Enhancement Based on Cyclegan with Noise-informed Training

    Authors: Wen-Yuan Ting, Syu-Siang Wang, Hsin-Li Chang, Borching Su, Yu Tsao

    Abstract: Cycle-consistent generative adversarial networks (CycleGAN) were successfully applied to speech enhancement (SE) tasks with unpaired noisy-clean training data. The CycleGAN SE system adopted two generators and two discriminators trained with losses from noisy-to-clean and clean-to-noisy conversions. CycleGAN showed promising results for numerous SE tasks. Herein, we investigate a potential limitat…

    Submitted 6 December, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

    Journal ref: ISCSLP 2022

  38. arXiv:2110.06142  [pdf, ps, other]

    eess.SP eess.IV

    CSI Sensing and Feedback: A Semi-Supervised Learning Approach

    Authors: Haozhen Li, Boyuan Zhang, Xin Liang, Haoran Chang, Xinyu Gu, Lin Zhang

    Abstract: Deep learning-based (DL-based) channel state information (CSI) feedback for a Massive multiple-input multiple-output (MIMO) system has proved to be a creative and efficient application. However, the existing systems ignored the wireless channel environment variation sensing, e.g., indoor and outdoor scenarios. Moreover, systems training requires excess pre-labeled CSI data, which is often unavaila…

    Submitted 26 September, 2021; originally announced October 2021.

  39. arXiv:2110.03504  [pdf, other]

    cs.CL eess.AS

    Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models

    Authors: Liang-Hsuan Tseng, Yu-Kuan Fu, Heng-Jui Chang, Hung-yi Lee

    Abstract: Code-switching (CS) is common in daily conversations where more than one language is used within a sentence. The difficulties of CS speech recognition lie in alternating languages and the lack of transcribed data. Therefore, this paper uses the recently successful self-supervised learning (SSL) methods to leverage many unlabeled speech data without CS. We show that hidden representations of SSL mo…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: Submitted to ICASSP 2022

  40. arXiv:2110.01900  [pdf, other]

    cs.CL eess.AS

    DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT

    Authors: Heng-Jui Chang, Shu-wen Yang, Hung-yi Lee

    Abstract: Self-supervised speech representation learning methods like wav2vec 2.0 and Hidden-unit BERT (HuBERT) leverage unlabeled speech data for pre-training and offer good representations for numerous speech processing tasks. Despite the success of these methods, they require large memory and high pre-training costs, making them inaccessible for researchers in academia and small companies. Therefore, thi…

    Submitted 27 April, 2022; v1 submitted 5 October, 2021; originally announced October 2021.

    Comments: Accepted to ICASSP 2022

  41. arXiv:2108.09499  [pdf]

    q-bio.OT eess.IV

    MITI Minimum Information guidelines for highly multiplexed tissue images

    Authors: Denis Schapiro, Clarence Yapp, Artem Sokolov, Sheila M. Reynolds, Yu-An Chen, Damir Sudar, Yubin Xie, Jeremy L. Muhlich, Raquel Arias-Camison, Sarah Arena, Adam J. Taylor, Milen Nikolov, Madison Tyler, Jia-Ren Lin, Erik A. Burlingame, Human Tumor Atlas Network, Young H. Chang, Samouil L Farhi, Vésteinn Thorsson, Nithya Venkatamohan, Julia L. Drewes, Dana Pe'er, David A. Gutman, Markus D. Herrmann, Nils Gehlenborg, et al. (14 additional authors not shown)

    Abstract: The imminent release of tissue atlases combining multi-channel microscopy with single cell sequencing and other omics data from normal and diseased specimens creates an urgent need for data and metadata standards that guide data deposition, curation and release. We describe a Minimum Information about highly multiplexed Tissue Imaging (MITI) standard that applies best practices developed for genom…

    Submitted 23 February, 2022; v1 submitted 21 August, 2021; originally announced August 2021.

  42. arXiv:2107.04589  [pdf, other]

    cs.CV cs.LG eess.IV

    ViTGAN: Training GANs with Vision Transformers

    Authors: Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Zhuowen Tu, Ce Liu

    Abstract: Recently, Vision Transformers (ViTs) have shown competitive performance on image recognition while requiring less vision-specific inductive biases. In this paper, we investigate if such performance can be extended to image generation. To this end, we integrate the ViT architecture into generative adversarial networks (GANs). For ViT discriminators, we observe that existing regularization methods f…

    Submitted 29 May, 2024; v1 submitted 9 July, 2021; originally announced July 2021.

    Comments: Accepted to ICLR 2022 (Spotlight)

  43. arXiv:2106.14976  [pdf, other]

    eess.SP cs.LG

    Federated Dynamic Spectrum Access

    Authors: Yifei Song, Hao-Hsuan Chang, Zhou Zhou, Shashank Jere, Lingjia Liu

    Abstract: Due to the growing volume of data traffic produced by the surge of Internet of Things (IoT) devices, the demand for radio spectrum resources is approaching their limitation defined by Federal Communications Commission (FCC). To this end, Dynamic Spectrum Access (DSA) is considered as a promising technology to handle this spectrum scarcity. However, standard DSA techniques often rely on analytical…

    Submitted 28 June, 2021; originally announced June 2021.

  44. arXiv:2105.12227  [pdf, other]

    cs.CV eess.IV

    Learning a Model-Driven Variational Network for Deformable Image Registration

    Authors: Xi Jia, Alexander Thorley, Wei Chen, Huaqi Qiu, Linlin Shen, Iain B Styles, Hyung Jin Chang, Ales Leonardis, Antonio de Marvao, Declan P. O'Regan, Daniel Rueckert, Jinming Duan

    Abstract: Data-driven deep learning approaches to image registration can be less accurate than conventional iterative approaches, especially when training data is limited. To address this whilst retaining the fast inference speed of deep learning, we propose VR-Net, a novel cascaded variational network for unsupervised deformable image registration. Using the variable splitting optimization scheme, we first…

    Submitted 25 May, 2021; originally announced May 2021.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  45. arXiv:2104.13450  [pdf, other]

    cs.CV cs.CR cs.LG eess.IV

    Deep 3D-to-2D Watermarking: Embedding Messages in 3D Meshes and Extracting Them from 2D Renderings

    Authors: Innfarn Yoo, Huiwen Chang, Xiyang Luo, Ondrej Stava, Ce Liu, Peyman Milanfar, Feng Yang

    Abstract: Digital watermarking is widely used for copyright protection. Traditional 3D watermarking approaches or commercial software are typically designed to embed messages into 3D meshes, and later retrieve the messages directly from distorted/undistorted watermarked 3D meshes. However, in many cases, users only have access to rendered 2D images instead of 3D meshes. Unfortunately, retrieving messages fr…

    Submitted 29 March, 2022; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: Accepted by CVPR 2022

  46. arXiv:2104.01616  [pdf, other]

    cs.CL eess.AS

    Towards Lifelong Learning of End-to-end ASR

    Authors: Heng-Jui Chang, Hung-yi Lee, Lin-shan Lee

    Abstract: Automatic speech recognition (ASR) technologies today are primarily optimized for given datasets; thus, any changes in the application environment (e.g., acoustic conditions or topic domains) may inevitably degrade the performance. We can collect new data describing the new environment and fine-tune the system, but this naturally leads to higher error rates for the earlier datasets, referred to as…

    Submitted 2 July, 2021; v1 submitted 4 April, 2021; originally announced April 2021.

    Comments: Interspeech 2021. We acknowledge the support of Salesforce Research Deep Learning Grant

  47. Controlling a CyberOctopus Soft Arm with Muscle-like Actuation

    Authors: Heng-Sheng Chang, Udit Halder, Ekaterina Gribkova, Arman Tekinalp, Noel Naughton, Mattia Gazzola, Prashant G. Mehta

    Abstract: This paper presents an application of the energy shaping methodology to control a flexible, elastic Cosserat rod model of a single octopus arm. The novel contributions of this work are two-fold: (i) a control-oriented modeling of the anatomically realistic internal muscular architecture of an octopus arm; and (ii) the integration of these muscle models into the energy shaping control methodology.…

    Submitted 1 April, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

  48. arXiv:2010.01226  [pdf, other]

    math.OC cs.RO eess.SY

    Optimal Control of a Soft CyberOctopus Arm

    Authors: Tixian Wang, Udit Halder, Heng-Sheng Chang, Mattia Gazzola, Prashant G. Mehta

    Abstract: In this paper, we use the optimal control methodology to control a flexible, elastic Cosserat rod. An inspiration comes from stereotypical movement patterns in octopus arms, which are observed in a variety of manipulation tasks, such as reaching or fetching. To help uncover the mechanisms underlying these observed morphologies, we outline an optimal control-based framework. A single octopus arm is…

    Submitted 1 April, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

  49. arXiv:2007.15580  [pdf]

    eess.SP cs.LG math.OC physics.comp-ph stat.ML

    Deep-Learning based Inverse Modeling Approaches: A Subsurface Flow Example

    Authors: Nanzhe Wang, Haibin Chang, Dongxiao Zhang

    Abstract: Deep-learning has achieved good performance and shown great potential for solving forward and inverse problems. In this work, two categories of innovative deep-learning based inverse modeling methods are proposed and compared. The first category is deep-learning surrogate-based inversion methods, in which the Theory-guided Neural Network (TgNN) is constructed as a deep-learning surrogate for probl…

    Submitted 28 July, 2020; originally announced July 2020.

    Comments: 53 pages, 22 figures, 7 tables

    Journal ref: Journal of Geophysical Research: Solid Earth, e2020JB020549, 2020

  50. arXiv:2007.13973  [pdf, other]

    eess.SP

    Multi-Frequency Multi-Scenario Millimeter Wave MIMO Channel Measurements and Modeling for B5G Wireless Communication Systems

    Authors: Jie Huang, Cheng-Xiang Wang, Hengtai Chang, Jian Sun, Xiqi Gao

    Abstract: Millimeter wave (mmWave) bands have been utilized for the fifth generation (5G) communication systems and will no doubt continue to be deployed for beyond 5G (B5G). However, the underlying channels are not fully investigated at multifrequency bands and in multi-scenarios by using the same channel sounder, especially for the outdoor, multiple-input multiple-output (MIMO), and vehicle-to-vehicle (V2…

    Submitted 27 July, 2020; originally announced July 2020.