Skip to main content

Showing 1–50 of 78 results for author: Hou, J

Searching in archive eess. Search in all archives.
.
  1. arXiv:2406.08374  [pdf, other

    cs.CV cs.AI eess.IV

    2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction

    Authors: Tianqi Chen, Jun Hou, Yinchi Zhou, Huidong Xie, Xiongchao Chen, Qiong Liu, Xueqi Guo, Menghua Xia, James S. Duncan, Chi Liu, Bo Zhou

    Abstract: Positron Emission Tomography (PET) is an important clinical imaging tool but inevitably introduces radiation hazards to patients and healthcare providers. Reducing the tracer injection dose and eliminating the CT acquisition for attenuation correction can reduce the overall radiation dose, but often results in PET with high noise and bias. Thus, it is desirable to develop 3D methods to translate t… ▽ More

    Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 15 pages, 7 figures

  2. arXiv:2406.06329  [pdf, other

    cs.CL eess.AS

    A Parameter-efficient Language Extension Framework for Multilingual ASR

    Authors: Wei Liu, **gyong Hou, Dong Yang, Muyong Cao, Tan Lee

    Abstract: Covering all languages with a multilingual speech recognition model (MASR) is very difficult. Performing language extension on top of an existing MASR is a desirable choice. In this study, the MASR continual learning problem is probabilistically decomposed into language identity prediction (LP) and cross-lingual adaptation (XLA) sub-problems. Based on this, we propose an architecture-based framewo… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  3. arXiv:2405.12223  [pdf, other

    eess.IV cs.CV

    Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation

    Authors: Yinchi Zhou, Tianqi Chen, Jun Hou, Huidong Xie, Nicha C. Dvornek, S. Kevin Zhou, David L. Wilson, James S. Duncan, Chi Liu, Bo Zhou

    Abstract: Image-to-image translation is a vital component in medical imaging processing, with many uses in a wide range of imaging modalities and clinical scenarios. Previous methods include Generative Adversarial Networks (GANs) and Diffusion Models (DMs), which offer realism but suffer from instability and lack uncertainty estimation. Even though both GAN and DM methods have individually exhibited their c… ▽ More

    Submitted 5 April, 2024; originally announced May 2024.

    Comments: 15 pages, 5 figures

  4. arXiv:2404.12804  [pdf, other

    cs.CV eess.IV

    Linearly-evolved Transformer for Pan-sharpening

    Authors: Junming Hou, Zihan Cao, Naishan Zheng, Xuan Li, Xiaoyu Chen, Xinyang Liu, Xiaofeng Cong, Man Zhou, Danfeng Hong

    Abstract: Vision transformer family has dominated the satellite pan-sharpening field driven by the global-wise spatial information modeling mechanism from the core self-attention ingredient. The standard modeling rules within these promising pan-sharpening methods are to roughly stack the transformer variants in a cascaded manner. Despite the remarkable advancement, their success may be at the huge cost of… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 10 pages

  5. arXiv:2403.11953  [pdf, other

    eess.IV cs.CV

    Advancing COVID-19 Detection in 3D CT Scans

    Authors: Qingqiu Li, Runtian Yuan, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen

    Abstract: To make a more accurate diagnosis of COVID-19, we propose a straightforward yet effective model. Firstly, we analyse the characteristics of 3D CT scans and remove the non-lung parts, facilitating the model to focus on lesion-related areas and reducing computational cost. We use ResNeSt50 as the strong feature extractor, initializing it with pretrained weights which have COVID-19-specific prior kno… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

  6. arXiv:2403.11498  [pdf, other

    eess.IV cs.CV

    Domain Adaptation Using Pseudo Labels for COVID-19 Detection

    Authors: Runtian Yuan, Qingqiu Li, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen

    Abstract: In response to the need for rapid and accurate COVID-19 diagnosis during the global pandemic, we present a two-stage framework that leverages pseudo labels for domain adaptation to enhance the detection of COVID-19 from CT scans. By utilizing annotated data from one domain and non-annotated data from another, the model overcomes the challenge of data scarcity and variability, common in emergent he… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

  7. arXiv:2402.19020  [pdf, other

    eess.IV cs.CV

    Unsupervised Learning of High-resolution Light Field Imaging via Beam Splitter-based Hybrid Lenses

    Authors: Jianxin Lei, Chengcai Xu, Langqing Shi, Junhui Hou, ** Zhou

    Abstract: In this paper, we design a beam splitter-based hybrid light field imaging prototype to record 4D light field image and high-resolution 2D image simultaneously, and make a hybrid light field dataset. The 2D image could be considered as the high-resolution ground truth corresponding to the low-resolution central sub-aperture image of 4D light field image. Subsequently, we propose an unsupervised lea… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

  8. arXiv:2402.16757  [pdf, other

    cs.SD eess.AS

    Towards Environmental Preference Based Speech Enhancement For Individualised Multi-Modal Hearing Aids

    Authors: Jasper Kirton-Wingate, Shafique Ahmed, Adeel Hussain, Mandar Gogate, Kia Dashtipour, Jen-Cheng Hou, Tassadaq Hussain, Yu Tsao, Amir Hussain

    Abstract: Since the advent of Deep Learning (DL), Speech Enhancement (SE) models have performed well under a variety of noise conditions. However, such systems may still introduce sonic artefacts, sound unnatural, and restrict the ability for a user to hear ambient sound which may be of importance. Hearing Aid (HA) users may wish to customise their SE systems to suit their personal preferences and day-to-da… ▽ More

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: This has been submitted to the Trends in Hearing journal

  9. arXiv:2401.14285  [pdf, other

    cs.CV cs.AI eess.IV

    POUR-Net: A Population-Prior-Aided Over-Under-Representation Network for Low-Count PET Attenuation Map Generation

    Authors: Bo Zhou, Jun Hou, Tianqi Chen, Yinchi Zhou, Xiongchao Chen, Huidong Xie, Qiong Liu, Xueqi Guo, Yu-Jung Tsai, Vladimir Y. Panin, Takuya Toyonaga, James S. Duncan, Chi Liu

    Abstract: Low-dose PET offers a valuable means of minimizing radiation exposure in PET imaging. However, the prevalent practice of employing additional CT scans for generating attenuation maps (u-map) for PET attenuation correction significantly elevates radiation doses. To address this concern and further mitigate radiation exposure in low-dose PET exams, we propose POUR-Net - an innovative population-prio… ▽ More

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 10 pages, 5 figures

  10. arXiv:2401.03689  [pdf, other

    eess.AS cs.SD

    LUPET: Incorporating Hierarchical Information Path into Multilingual ASR

    Authors: Wei Liu, **gyong Hou, Dong Yang, Muyong Cao, Tan Lee

    Abstract: Toward high-performance multilingual automatic speech recognition (ASR), various types of linguistic information and model design have demonstrated their effectiveness independently. They include language identity (LID), phoneme information, language-specific processing modules, and cross-lingual self-supervised speech representation. It is expected that leveraging their benefits synergistically i… ▽ More

    Submitted 10 June, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: Accepted by Interspeech 2024

  11. arXiv:2401.02046  [pdf, other

    eess.AS cs.SD

    CTC Blank Triggered Dynamic Layer-Skip** for Efficient CTC-based Speech Recognition

    Authors: Junfeng Hou, Peiyao Wang, **cheng Zhang, Meng Yang, Minwei Feng, **gcheng Yin

    Abstract: Deploying end-to-end speech recognition models with limited computing resources remains challenging, despite their impressive performance. Given the gradual increase in model size and the wide range of model applications, selectively executing model components for different inputs to improve the inference efficiency is of great interest. In this paper, we propose a dynamic layer-skip** method th… ▽ More

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: accepted by ASRU 2023

  12. arXiv:2310.15194  [pdf

    q-bio.NC cs.HC eess.SP q-bio.QM

    How do the resting EEG preprocessing states affect the outcomes of postprocessing?

    Authors: Shiang Hu, Jie Ruan, Juan Hou, Pedro Antonio Valdes-Sosa, Zhao Lv

    Abstract: Plenty of artifact removal tools and pipelines have been developed to correct the EEG recordings and discover the values below the waveforms. Without visual inspection from the experts, it is susceptible to derive improper preprocessing states, like the insufficient preprocessed EEG (IPE), and the excessive preprocessed EEG (EPE). However, little is known about the impacts of IPE or EPE on the pos… ▽ More

    Submitted 12 December, 2023; v1 submitted 22 October, 2023; originally announced October 2023.

  13. arXiv:2310.05021  [pdf, other

    eess.SY

    Toward Intelligent Emergency Control for Large-scale Power Systems: Convergence of Learning, Physics, Computing and Control

    Authors: Qiuhua Huang, Renke Huang, Tianzhixi Yin, Sohom Datta, Xueqing Sun, Jason Hou, Jie Tan, Wenhao Yu, Yuan Liu, Xinya Li, Bruce Palmer, Ang Li, Xinda Ke, Marianna Vaiman, Song Wang, Yousu Chen

    Abstract: This paper has delved into the pressing need for intelligent emergency control in large-scale power systems, which are experiencing significant transformations and are operating closer to their limits with more uncertainties. Learning-based control methods are promising and have shown effectiveness for intelligent power system control. However, when they are applied to large-scale power systems, t… ▽ More

    Submitted 8 October, 2023; originally announced October 2023.

    Comments: submitted to PSCC 2024

  14. arXiv:2309.11059  [pdf, other

    eess.AS cs.SD

    Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement

    Authors: Shafique Ahmed, Chia-Wei Chen, Wenze Ren, Chin-Jou Li, Ernie Chu, Jun-Cheng Chen, Amir Hussain, Hsin-Min Wang, Yu Tsao, Jen-Cheng Hou

    Abstract: Recent studies have increasingly acknowledged the advantages of incorporating visual data into speech enhancement (SE) systems. In this paper, we introduce a novel audio-visual SE approach, termed DCUC-Net (deep complex U-Net with conformer network). The proposed DCUC-Net leverages complex domain features and a stack of conformer blocks. The encoder and decoder of DCUC-Net are designed using a com… ▽ More

    Submitted 8 October, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

  15. arXiv:2308.05404  [pdf, other

    cs.CV eess.IV

    Enhancing Low-light Light Field Images with A Deep Compensation Unfolding Network

    Authors: Xianqiang Lyu, Junhui Hou

    Abstract: This paper presents a novel and interpretable end-to-end learning framework, called the deep compensation unfolding network (DCUNet), for restoring light field (LF) images captured under low-light conditions. DCUNet is designed with a multi-stage architecture that mimics the optimization process of solving an inverse imaging problem in a data-driven fashion. The framework uses the intermediate enh… ▽ More

    Submitted 26 June, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

  16. arXiv:2307.07748  [pdf, other

    eess.AS

    Audio-Visual Speech Enhancement Using Self-supervised Learning to Improve Speech Intelligibility in Cochlear Implant Simulations

    Authors: Richard Lee Lai, Jen-Cheng Hou, Mandar Gogate, Kia Dashtipour, Amir Hussain, Yu Tsao

    Abstract: Individuals with hearing impairments face challenges in their ability to comprehend speech, particularly in noisy environments. The aim of this study is to explore the effectiveness of audio-visual speech enhancement (AVSE) in enhancing the intelligibility of vocoded speech in cochlear implant (CI) simulations. Notably, the study focuses on a challenged scenario where there is limited availability… ▽ More

    Submitted 15 July, 2023; originally announced July 2023.

  17. Probabilistic-based Feature Embedding of 4-D Light Fields for Compressive Imaging and Denoising

    Authors: Xianqiang Lyu, Junhui Hou

    Abstract: The high-dimensional nature of the 4-D light field (LF) poses great challenges in achieving efficient and effective feature embedding, that severely impacts the performance of downstream tasks. To tackle this crucial issue, in contrast to existing methods with empirically-designed architectures, we propose a probabilistic-based feature embedding (PFE), which learns a feature embedding architecture… ▽ More

    Submitted 10 January, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Journal ref: International Journal of Computer Vision (2023)

  18. arXiv:2304.04774  [pdf, other

    cs.CV cs.AI eess.IV

    DDRF: Denoising Diffusion Model for Remote Sensing Image Fusion

    Authors: ZiHan Cao, ShiQi Cao, Xiao Wu, JunMing Hou, Ran Ran, Liang-Jian Deng

    Abstract: Denosing diffusion model, as a generative model, has received a lot of attention in the field of image generation recently, thanks to its powerful generation capability. However, diffusion models have not yet received sufficient research in the field of image fusion. In this article, we introduce diffusion model to the image fusion field, treating the image fusion task as image-to-image translatio… ▽ More

    Submitted 10 April, 2023; originally announced April 2023.

  19. arXiv:2304.02389  [pdf, other

    eess.IV cs.CV cs.LG

    DRAC: Diabetic Retinopathy Analysis Challenge with Ultra-Wide Optical Coherence Tomography Angiography Images

    Authors: Bo Qian, Hao Chen, Xiangning Wang, Haoxuan Che, Gitaek Kwon, Jaeyoung Kim, Sung** Choi, Seoyoung Shin, Felix Krause, Markus Unterdechler, Junlin Hou, Rui Feng, Yihao Li, Mostafa El Habib Daho, Qiang Wu, ** Zhang, Xiaokang Yang, Yiyu Cai, Wei** Jia, Huating Li, Bin Sheng

    Abstract: Computer-assisted automatic analysis of diabetic retinopathy (DR) is of great importance in reducing the risks of vision loss and even blindness. Ultra-wide optical coherence tomography angiography (UW-OCTA) is a non-invasive and safe imaging modality in DR diagnosis system, but there is a lack of publicly available benchmarks for model development and evaluation. To promote further research and s… ▽ More

    Submitted 5 April, 2023; originally announced April 2023.

  20. arXiv:2304.00570  [pdf, other

    eess.IV cs.CV cs.LG

    FedFTN: Personalized Federated Learning with Deep Feature Transformation Network for Multi-institutional Low-count PET Denoising

    Authors: Bo Zhou, Huidong Xie, Qiong Liu, Xiongchao Chen, Xueqi Guo, Zhicheng Feng, Jun Hou, S. Kevin Zhou, Biao Li, Axel Rominger, Kuangyu Shi, James S. Duncan, Chi Liu

    Abstract: Low-count PET is an efficient way to reduce radiation exposure and acquisition time, but the reconstructed images often suffer from low signal-to-noise ratio (SNR), thus affecting diagnosis and other downstream tasks. Recent advances in deep learning have shown great potential in improving low-count PET image quality, but acquiring a large, centralized, and diverse dataset from multiple institutio… ▽ More

    Submitted 6 October, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

    Comments: 13 pages, 6 figures, Accepted at Medical Image Analysis Journal (MedIA)

  21. arXiv:2303.13764  [pdf, other

    eess.IV cs.CV

    GQE-Net: A Graph-based Quality Enhancement Network for Point Cloud Color Attribute

    Authors: **rui Xing, Hui Yuan, Raouf Hamzaoui, Hao Liu, Junhui Hou

    Abstract: In recent years, point clouds have become increasingly popular for representing three-dimensional (3D) visual objects and scenes. To efficiently store and transmit point clouds, compression methods have been developed, but they often result in a degradation of quality. To reduce color distortion in point clouds, we propose a graph-based quality enhancement network (GQE-Net) that uses geometry info… ▽ More

    Submitted 6 November, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted by IEEE TIP (DOI: 10.1109/TIP.2023.3330086)

  22. arXiv:2303.07623  [pdf, other

    physics.med-ph eess.IV

    Uncertainty-weighted Multi-tasking for $T_{1ρ}$ and T$_2$ Map** in the Liver with Self-supervised Learning

    Authors: Chaoxing Huang, Yurui Qian, Jian Hou, Baiyan Jiang, Queenie Chan, Vincent WS Wong, Winnie CW Chu, Weitian Chen

    Abstract: Multi-parametric map** of MRI relaxations in liver has the potential of revealing pathological information of the liver. A self-supervised learning based multi-parametric map** method is proposed to map T$T_{1ρ}$ and T$_2$ simultaneously, by utilising the relaxation constraint in the learning process. Data noise of different map** tasks is utilised to make the model uncertainty-aware, which… ▽ More

    Submitted 14 March, 2023; originally announced March 2023.

  23. arXiv:2302.07597  [pdf, other

    eess.SY

    Preventive-Corrective Cyber-Defense: Attack-Induced Region Minimization and Cybersecurity Margin Maximization

    Authors: Jiazuo Hou, Fei Teng, Wenqian Yin, Yue Song, Yunhe Hou

    Abstract: False data injection (FDI) cyber-attacks on power systems can be prevented by strategically selecting and protecting a sufficiently large measurement subset, which, however, requires adequate cyber-defense resources for measurement protection. With any given cyber-defense resource, this paper proposes a preventive-corrective cyber-defense strategy, which minimizes the FDI attack-induced region in… ▽ More

    Submitted 13 November, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

  24. arXiv:2301.06132  [pdf, other

    cs.CV eess.IV

    Deep Diversity-Enhanced Feature Representation of Hyperspectral Images

    Authors: **hui Hou, Zhiyu Zhu, Junhui Hou, Hui Liu, Huanqiang Zeng, Deyu Meng

    Abstract: In this paper, we study the problem of efficiently and effectively embedding the high-dimensional spatio-spectral information of hyperspectral (HS) images, guided by feature diversity. Specifically, based on the theoretical formulation that feature diversity is correlated with the rank of the unfolded kernel matrix, we rectify 3D convolution by modifying its topology to enhance the rank upper-boun… ▽ More

    Submitted 9 May, 2024; v1 submitted 15 January, 2023; originally announced January 2023.

    Comments: 17 pages, 12 figures. Accepted in TPAMI 2024. arXiv admin note: substantial text overlap with arXiv:2207.04266

  25. arXiv:2211.14559  [pdf, other

    eess.IV cs.CV

    Boosting COVID-19 Severity Detection with Infection-aware Contrastive Mixup Classification

    Authors: Junlin Hou, Jilan Xu, Nan Zhang, Yuejie Zhang, Xiaobo Zhang, Rui Feng

    Abstract: This paper presents our solution for the 2nd COVID-19 Severity Detection Competition. This task aims to distinguish the Mild, Moderate, Severe, and Critical grades in COVID-19 chest CT images. In our approach, we devise a novel infection-aware 3D Contrastive Mixup Classification network for severity grading. Specifcally, we train two segmentation networks to first extract the lung region and then… ▽ More

    Submitted 1 December, 2022; v1 submitted 26 November, 2022; originally announced November 2022.

    Comments: ECCV AIMIA Workshop 2022

  26. arXiv:2211.14557  [pdf, other

    eess.IV cs.CV

    CMC v2: Towards More Accurate COVID-19 Detection with Discriminative Video Priors

    Authors: Junlin Hou, Jilan Xu, Nan Zhang, Yi Wang, Yuejie Zhang, Xiaobo Zhang, Rui Feng

    Abstract: This paper presents our solution for the 2nd COVID-19 Competition, occurring in the framework of the AIMIA Workshop at the European Conference on Computer Vision (ECCV 2022). In our approach, we employ the winning solution last year which uses a strong 3D Contrastive Mixup Classifcation network (CMC v1) as the baseline method, composed of contrastive representation learning and mixup classificatio… ▽ More

    Submitted 26 November, 2022; originally announced November 2022.

    Comments: ECCV AIMIA Workshop 2022

  27. arXiv:2211.05910  [pdf, other

    eess.IV cs.CV

    Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, **gang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, Yoonsik Choe, **woo Jeong, Sungjei Kim, Maciej Smyl, Tomasz Latkowski, Pawel Kubik, Michal Sokolski, Yujie Ma, Jiahao Chao, Zhou Zhou, Hongfan Gao, Zhengfeng Yang, Zhenbing Zeng, Zhengyang Zhuge, Chenghua Li , et al. (71 additional authors not shown)

    Abstract: Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose… ▽ More

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2105.07825, arXiv:2105.08826, arXiv:2211.04470, arXiv:2211.03885, arXiv:2211.05256

  28. arXiv:2211.04894  [pdf, other

    cs.CV cs.LG cs.MM eess.IV

    Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives

    Authors: Haoning Wu, Erli Zhang, Liang Liao, Chaofeng Chen, **gwen Hou, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin

    Abstract: The rapid increase in user-generated-content (UGC) videos calls for the development of effective video quality assessment (VQA) algorithms. However, the objective of the UGC-VQA problem is still ambiguous and can be viewed from two perspectives: the technical perspective, measuring the perception of distortions; and the aesthetic perspective, which relates to preference and recommendation on conte… ▽ More

    Submitted 7 March, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  29. arXiv:2211.04098  [pdf, other

    eess.SY cs.SC

    Abstraction-Based Verification of Approximate Pre-Opacity for Control Systems

    Authors: Junyao Hou, Siyuan Liu, Xiang Yin, Majid Zamani

    Abstract: In this paper, we consider the problem of verifying pre-opacity for discrete-time control systems. Pre-opacity is an important information-flow security property that secures the intention of a system to execute some secret behaviors in the future. Existing works on pre-opacity only consider non-metric discrete systems, where it is assumed that intruders can distinguish different output behaviors… ▽ More

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: Discrete Event Systems, Opacity, Formal Abstractions

  30. arXiv:2211.02419  [pdf, other

    eess.IV cs.CV cs.LG

    High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss

    Authors: Yucong Lin, **hua Su, Yuhang Li, Yuhao Wei, Hanchao Yan, Saining Zhang, Jiaan Luo, Danni Ai, Hong Song, **gfan Fan, Tianyu Fu, Deqiang Xiao, Feifei Wang, Jue Hou, Jian Yang

    Abstract: Deep learning methods have contributed substantially to the rapid advancement of medical image segmentation, the quality of which relies on the suitable design of loss functions. Popular loss functions, including the cross-entropy and dice losses, often fall short of boundary detection, thereby limiting high-resolution downstream applications such as automated diagnoses and procedures. We develope… ▽ More

    Submitted 4 November, 2022; originally announced November 2022.

  31. arXiv:2211.01601  [pdf

    math.OC eess.SY

    A Fast Solution Method for Large-scale Unit Commitment Based on Lagrangian Relaxation and Dynamic Programming

    Authors: Jiangwei Hou, Qiaozhu Zhai, Yuzhou Zhou, Xiaohong Guan

    Abstract: The unit commitment problem (UC) is crucial for the operation and market mechanism of power systems. With the development of modern electricity, the scale of power systems is expanding, and solving the UC problem is also becoming more and more difficult. To this end, this paper proposes a new fast solution method based on Lagrangian relaxation and dynamic program-ming. Firstly, the UC solution is… ▽ More

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: 10 pages, journal paper, transactions

  32. arXiv:2210.17456  [pdf, other

    eess.AS cs.SD

    Audio-Visual Speech Enhancement and Separation by Utilizing Multi-Modal Self-Supervised Embeddings

    Authors: I-Chun Chern, Kuo-Hsuan Hung, Yi-Ting Chen, Tassadaq Hussain, Mandar Gogate, Amir Hussain, Yu Tsao, Jen-Cheng Hou

    Abstract: AV-HuBERT, a multi-modal self-supervised learning model, has been shown to be effective for categorical problems such as automatic speech recognition and lip-reading. This suggests that useful audio-visual speech representations can be obtained via utilizing multi-modal self-supervised embeddings. Nevertheless, it is unclear if such representations can be generalized to solve real-world multi-moda… ▽ More

    Submitted 31 May, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

    Comments: ICASSP AMHAT 2023

  33. arXiv:2210.16743  [pdf, other

    eess.AS cs.SD

    WeKws: A production first small-footprint end-to-end Keyword Spotting Toolkit

    Authors: Jie Wang, Menglong Xu, **gyong Hou, Binbin Zhang, Xiao-Lei Zhang, Lei Xie, Fu** Pan

    Abstract: Keyword spotting (KWS) enables speech-based user interaction and gradually becomes an indispensable component of smart devices. Recently, end-to-end (E2E) methods have become the most popular approach for on-device KWS tasks. However, there is still a gap between the research and deployment of E2E KWS methods. In this paper, we introduce WeKws, a production-quality, easy-to-build, and convenient-t… ▽ More

    Submitted 30 October, 2022; originally announced October 2022.

  34. arXiv:2210.07749   

    eess.AS cs.SD

    LeVoice ASR Systems for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge

    Authors: Yan Jia, Mi Hong, **gyu Hou, Kailong Ren, Sifan Ma, ** Wang, Fangzhen Peng, Yinglin Ji, Lin Yang, Junjie Wang

    Abstract: This paper describes LeVoice automatic speech recognition systems to track2 of intelligent cockpit speech recognition challenge 2022. Track2 is a speech recognition task without limits on the scope of model size. Our main points include deep learning based speech enhancement, text-to-speech based speech generation, training data augmentation via various techniques and speech recognition model fusi… ▽ More

    Submitted 16 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: There are experimental errors

  35. arXiv:2210.00515  [pdf, other

    eess.IV cs.CV

    Deep-OCTA: Ensemble Deep Learning Approaches for Diabetic Retinopathy Analysis on OCTA Images

    Authors: Junlin Hou, Fan Xiao, Jilan Xu, Yuejie Zhang, Haidong Zou, Rui Feng

    Abstract: The ultra-wide optical coherence tomography angiography (OCTA) has become an important imaging modality in diabetic retinopathy (DR) diagnosis. However, there are few researches focusing on automatic DR analysis using ultra-wide OCTA. In this paper, we present novel and practical deep-learning solutions based on ultra-wide OCTA for the Diabetic Retinopathy Analysis Challenge (DRAC). In the segment… ▽ More

    Submitted 2 October, 2022; originally announced October 2022.

  36. arXiv:2207.04266  [pdf, other

    eess.IV cs.CV

    Rank-Enhanced Low-Dimensional Convolution Set for Hyperspectral Image Denoising

    Authors: **hui Hou, Zhiyu Zhu, Hui Liu, Junhui Hou

    Abstract: This paper tackles the challenging problem of hyperspectral (HS) image denoising. Unlike existing deep learning-based methods usually adopting complicated network architectures or empirically stacking off-the-shelf modules to pursue performance improvement, we focus on the efficient and effective feature extraction manner for capturing the high-dimensional characteristics of HS images. To be speci… ▽ More

    Submitted 9 July, 2022; originally announced July 2022.

    Comments: 10 pages, 8 figures

  37. arXiv:2207.03105  [pdf

    q-bio.TO cs.CV eess.IV physics.med-ph

    Uncertainty-Aware Self-supervised Neural Network for Liver $T_{1ρ}$ Map** with Relaxation Constraint

    Authors: Chaoxing Huang, Yurui Qian, Simon Chun Ho Yu, Jian Hou, Baiyan Jiang, Queenie Chan, Vincent Wai-Sun Wong, Winnie Chiu-Wing Chu, Weitian Chen

    Abstract: $T_{1ρ}$ map** is a promising quantitative MRI technique for the non-invasive assessment of tissue properties. Learning-based approaches can map $T_{1ρ}$ from a reduced number of $T_{1ρ}$ weighted images, but requires significant amounts of high quality training data. Moreover, existing methods do not provide the confidence level of the $T_{1ρ}… ▽ More

    Submitted 25 October, 2022; v1 submitted 7 July, 2022; originally announced July 2022.

    Comments: Provisionally accepted by Physics in Medicine and Biology

  38. arXiv:2207.01909  [pdf, other

    cs.CV cs.LG eess.IV

    StyleFlow For Content-Fixed Image to Image Translation

    Authors: Weichen Fan, **ghuan Chen, Jiabin Ma, Jun Hou, Shuai Yi

    Abstract: Image-to-image (I2I) translation is a challenging topic in computer vision. We divide this problem into three tasks: strongly constrained translation, normally constrained translation, and weakly constrained translation. The constraint here indicates the extent to which the content or semantic information in the original image is preserved. Although previous approaches have achieved good performan… ▽ More

    Submitted 5 July, 2022; originally announced July 2022.

  39. arXiv:2207.01758  [pdf, other

    eess.IV cs.CV

    FDVTS's Solution for 2nd COV19D Competition on COVID-19 Detection and Severity Analysis

    Authors: Junlin Hou, Jilan Xu, Rui Feng, Yuejie Zhang

    Abstract: This paper presents our solution for the 2nd COVID-19 Competition, occurring in the framework of the AIMIA Workshop in the European Conference on Computer Vision (ECCV 2022). In our approach, we employ an effective 3D Contrastive Mixup Classification network for COVID-19 diagnosis on chest CT images, which is composed of contrastive representation learning and mixup classification. For the COVID-1… ▽ More

    Submitted 4 July, 2022; originally announced July 2022.

  40. Deep Posterior Distribution-based Embedding for Hyperspectral Image Super-resolution

    Authors: **hui Hou, Zhiyu Zhu, Junhui Hou, Huanqiang Zeng, **jian Wu, Jiantao Zhou

    Abstract: In this paper, we investigate the problem of hyperspectral (HS) image spatial super-resolution via deep learning. Particularly, we focus on how to embed the high-dimensional spatial-spectral information of HS images efficiently and effectively. Specifically, in contrast to existing methods adopting empirically-designed network modules, we formulate HS embedding as an approximation of the posterior… ▽ More

    Submitted 23 August, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: Accepted by IEEE Transactions on Image Processing

  41. arXiv:2203.00914  [pdf, other

    cs.CV cs.MM eess.IV

    PUFA-GAN: A Frequency-Aware Generative Adversarial Network for 3D Point Cloud Upsampling

    Authors: Hao Liu, Hui Yuan, Junhui Hou, Raouf Hamzaoui, Wei Gao

    Abstract: We propose a generative adversarial network for point cloud upsampling, which can not only make the upsampled points evenly distributed on the underlying surface but also efficiently generate clean high frequency regions. The generator of our network includes a dynamic graph hierarchical residual aggregation unit and a hierarchical residual aggregation unit for point feature extraction and upsampl… ▽ More

    Submitted 2 March, 2022; originally announced March 2022.

  42. arXiv:2110.11998  [pdf, other

    eess.IV cs.CV

    Semi-Supervised Semantic Segmentation of Vessel Images using Leaking Perturbations

    Authors: **yong Hou, Xuejie Ding, Jeremiah D. Deng

    Abstract: Semantic segmentation based on deep learning methods can attain appealing accuracy provided large amounts of annotated samples. However, it remains a challenging task when only limited labelled data are available, which is especially common in medical imaging. In this paper, we propose to use Leaking GAN, a GAN-based semi-supervised architecture for retina vessel semantic segmentation. Our key ide… ▽ More

    Submitted 22 October, 2021; originally announced October 2021.

    Comments: To appear in WACV'22

  43. arXiv:2108.05547  [pdf, other

    eess.IV cs.CV

    Deep Amended Gradient Descent for Efficient Spectral Reconstruction from Single RGB Images

    Authors: Zhiyu Zhu, Hui Liu, Junhui Hou, Sen Jia, Qingfu Zhang

    Abstract: This paper investigates the problem of recovering hyperspectral (HS) images from single RGB images. To tackle such a severely ill-posed problem, we propose a physically-interpretable, compact, efficient, and end-to-end learning-based framework, namely AGD-Net. Precisely, by taking advantage of the imaging process, we first formulate the problem explicitly based on the classic gradient descent algo… ▽ More

    Submitted 12 August, 2021; originally announced August 2021.

  44. Attention-Guided Progressive Neural Texture Fusion for High Dynamic Range Image Restoration

    Authors: Jie Chen, Zaifeng Yang, Tsz Nam Chan, Hui Li, Junhui Hou, Lap-Pui Chau

    Abstract: High Dynamic Range (HDR) imaging via multi-exposure fusion is an important task for most modern imaging platforms. In spite of recent developments in both hardware and algorithm innovations, challenges remain over content association ambiguities caused by saturation, motion, and various artifacts introduced during multi-exposure fusion such as ghosting, noise, and blur. In this work, we propose an… ▽ More

    Submitted 13 July, 2021; originally announced July 2021.

  45. arXiv:2105.06295  [pdf, other

    eess.SP cs.LG

    Gait Characterization in Duchenne Muscular Dystrophy (DMD) Using a Single-Sensor Accelerometer: Classical Machine Learning and Deep Learning Approaches

    Authors: Albara Ah Ramli, Xin Liu, Kelly Berndt, Erica Goude, Jiahui Hou, Lynea B. Kaethler, Rex Liu, Amanda Lopez, Alina Nicorici, Corey Owens, David Rodriguez, Jane Wang, Huanle Zhang, Daniel Aranki, Craig M. McDonald, Erik K. Henricson

    Abstract: Differences in gait patterns of children with Duchenne muscular dystrophy (DMD) and typically-develo** (TD) peers are visible to the eye, but quantifications of those differences outside of the gait laboratory have been elusive. In this work, we measured vertical, mediolateral, and anteroposterior acceleration using a waist-worn iPhone accelerometer during ambulation across a typical range of ve… ▽ More

    Submitted 10 July, 2023; v1 submitted 12 May, 2021; originally announced May 2021.

  46. arXiv:2104.10764  [pdf, other

    eess.AS cs.CL cs.SD

    HMM-Free Encoder Pre-Training for Streaming RNN Transducer

    Authors: Lu Huang, **gyu Sun, Yufeng Tang, Junfeng Hou, **kun Chen, Jun Zhang, Zejun Ma

    Abstract: This work describes an encoder pre-training procedure using frame-wise label to improve the training of streaming recurrent neural network transducer (RNN-T) model. Streaming RNN-T trained from scratch usually performs worse than non-streaming RNN-T. Although it is common to address this issue through pre-training components of RNN-T with other criteria or frame-wise alignment guidance, the alignm… ▽ More

    Submitted 10 June, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

    Comments: Accepted by Interspeech 2021

  47. arXiv:2102.13552  [pdf, other

    cs.SD cs.LG eess.AS

    The NPU System for the 2020 Personalized Voice Trigger Challenge

    Authors: **gyong Hou, Li Zhang, Yihui Fu, Qing Wang, Zhanheng Yang, Qijie Shao, Lei Xie

    Abstract: This paper describes the system developed by the NPU team for the 2020 personalized voice trigger challenge. Our submitted system consists of two independently trained subsystems: a small footprint keyword spotting (KWS) system and a speaker verification (SV) system. For the KWS system, a multi-scale dilated temporal convolutional (MDTC) network is proposed to detect wake-up word (WuW). For SV sys… ▽ More

    Submitted 26 February, 2021; originally announced February 2021.

  48. arXiv:2102.07085  [pdf, other

    eess.IV cs.CV

    Light Field Reconstruction via Deep Adaptive Fusion of Hybrid Lenses

    Authors: **g **, Mantang Guo, Junhui Hou, Hui Liu, Hongkai Xiong

    Abstract: This paper explores the problem of reconstructing high-resolution light field (LF) images from hybrid lenses, including a high-resolution camera surrounded by multiple low-resolution cameras. The performance of existing methods is still limited, as they produce either blurry results on plain textured areas or distortions around depth discontinuous boundaries. To tackle this challenge, we propose a… ▽ More

    Submitted 17 June, 2023; v1 submitted 14 February, 2021; originally announced February 2021.

    Comments: Accepted by IEEE TPAMI. arXiv admin note: text overlap with arXiv:1907.09640

  49. arXiv:2012.04874  [pdf, other

    eess.SY cs.FL

    A Framework for Current-State Opacity under Dynamic Information Release Mechanism

    Authors: Junyao Hou, Xiang Yin, Shaoyuan Li

    Abstract: Opacity is an important information-flow security property that characterizes the plausible deniability of a dynamic system for its "secret" against eavesdrop** attacks. As an information-flow property, the underlying observation model is the key in the modeling and analysis of opacity. In this paper, we investigate the verification of current-state opacity for discrete-event systems under Orwel… ▽ More

    Submitted 17 January, 2022; v1 submitted 9 December, 2020; originally announced December 2020.

  50. Reduced Reference Perceptual Quality Model and Application to Rate Control for 3D Point Cloud Compression

    Authors: Qi Liu, Hui Yuan, Raouf Hamzaoui, Honglei Su, Junhui Hou, Huan Yang

    Abstract: In rate-distortion optimization, the encoder settings are determined by maximizing a reconstruction quality measure subject to a constraint on the bit rate. One of the main challenges of this approach is to define a quality measure that can be computed with low computational cost and which correlates well with perceptual quality. While several quality measures that fulfil these two criteria have b… ▽ More

    Submitted 25 November, 2020; originally announced November 2020.

    Comments: 14 figures and 7 tables, submitted to IEEE T IP