Showing 1–50 of 123 results for author: Lin, L

Searching in archive eess.
  1. arXiv:2406.18327  [pdf, other]

    eess.IV cs.CV cs.LG

    Multi-modal Evidential Fusion Network for Trusted PET/CT Tumor Segmentation

    Authors: Yuxuan Qi, Li Lin, Jiajun Wang, Jingya Zhang, Bin Zhang

    Abstract: Accurate segmentation of tumors in PET/CT images is important in computer-aided diagnosis and treatment of cancer. The key issue of such a segmentation problem lies in the effective integration of complementary information from PET and CT images. However, the quality of PET and CT images varies widely in clinical settings, which leads to uncertainty in the modality information extracted by network…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.15931  [pdf, other]

    eess.SY cs.CE cs.LG stat.AP

    Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning

    Authors: Majdi I. Radaideh, Leo Tunkle, Dean Price, Kamal Abdulraheem, Linyu Lin, Moutaz Elias

    Abstract: Reducing operation and maintenance costs is a key objective for advanced reactors in general and microreactors in particular. To achieve this reduction, developing robust autonomous control algorithms is essential to ensure safe and autonomous reactor operation. Recently, artificial intelligence and machine learning algorithms, specifically reinforcement learning (RL) algorithms, have seen rapid i…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 15 pages, 3 figures, and 2 tables

  3. arXiv:2405.18386  [pdf, other]

    cs.SD cs.AI cs.LG cs.MM eess.AS

    Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning

    Authors: Yixiao Zhang, Yukara Ikemiya, Woosung Choi, Naoki Murata, Marco A. Martínez-Ramírez, Liwei Lin, Gus Xia, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon

    Abstract: Recent advances in text-to-music editing, which employ text queries to modify music (e.g., by changing its style or adjusting instrumental components), present unique challenges and opportunities for AI-assisted music creation. Previous approaches in this domain have been constrained by the necessity to train specific editing models from scratch, which is both resource-intensive and inefficient; o…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Code and demo are available at: https://github.com/ldzhangyx/instruct-musicgen

  4. arXiv:2405.17496  [pdf, other]

    eess.IV

    UU-Mamba: Uncertainty-aware U-Mamba for Cardiac Image Segmentation

    Authors: Ting Yu Tsai, Li Lin, Shu Hu, Ming-Ching Chang, Hongtu Zhu, Xin Wang

    Abstract: Biomedical image segmentation is critical for accurate identification and analysis of anatomical structures in medical imaging, particularly in cardiac MRI. Manual segmentation is labor-intensive, time-consuming, and prone to errors, highlighting the need for automated methods. However, current machine learning approaches face challenges like overfitting and data demands. To tackle these issues, w…

    Submitted 4 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  5. arXiv:2404.12908  [pdf, other]

    cs.CV cs.LG eess.IV

    Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images

    Authors: Santosh, Li Lin, Irene Amerini, Xin Wang, Shu Hu

    Abstract: Diffusion models (DMs) have revolutionized image generation, producing high-quality images with applications spanning various fields. However, their ability to create hyper-realistic images poses significant challenges in distinguishing between real and synthetic content, raising concerns about digital authenticity and potential misuse in creating deepfakes. This work introduces a robust detection…

    Submitted 19 April, 2024; originally announced April 2024.

  6. arXiv:2404.03885  [pdf, ps, other]

    cs.IT cs.DS eess.SP math.ST

    The ESPRIT algorithm under high noise: Optimal error scaling and noisy super-resolution

    Authors: Zhiyan Ding, Ethan N. Epperly, Lin Lin, Ruizhe Zhang

    Abstract: Subspace-based signal processing techniques, such as the Estimation of Signal Parameters via Rotational Invariant Techniques (ESPRIT) algorithm, are popular methods for spectral estimation. These algorithms can achieve the so-called super-resolution scaling under low noise conditions, surpassing the well-known Nyquist limit. However, the performance of these algorithms under high-noise conditions…

    Submitted 22 April, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

  7. arXiv:2404.02744  [pdf]

    eess.IV

    Terraced Compression Method with Automated Threshold Selection for Multidimensional Image Clustering of Heterogeneous Bodies

    Authors: Jiatong Li, Gang Li, Nan Su Su Win, Ling Lin

    Abstract: Multispectral transmission imaging provides strong benefits for early breast cancer screening. The frame accumulation method addresses the challenge of low grayscale and signal-to-noise ratio resulting from the strong absorption and scattering of light by breast tissue. This method introduces redundancy in data while improving the grayscale and signal-to-noise ratio of the image. Existing terraced…

    Submitted 3 April, 2024; originally announced April 2024.

  8. arXiv:2404.02286  [pdf, other]

    cs.IT eess.SP

    Energy Allocation for Multi-User Cooperative Molecular Communication Systems in the Internet of Bio-Nano Things

    Authors: Dongliang Jing, Lin Lin, Andrew W. Eckford

    Abstract: Cooperative molecular communication (MC) is a promising technology for facilitating communication between nanomachines in the Internet of Bio-Nano Things (IoBNT) field. However, the performance of IoBNT is limited by the availability of energy for cooperative MC. This paper presents a novel transmitter design scheme that utilizes molecule movement between reservoirs, creating concentration differe…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: To appear in IEEE Internet of Things Journal

  9. arXiv:2403.20025  [pdf, ps, other]

    cs.IT eess.SP

    Secure Full-Duplex Communication via Movable Antennas

    Authors: Jingze Ding, Zijian Zhou, Chenbo Wang, Wenyao Li, Lifeng Lin, Bingli Jiao

    Abstract: This paper investigates physical layer security (PLS) for a movable antenna (MA)-assisted full-duplex (FD) system. In this system, an FD base station (BS) with multiple MAs for transmission and reception provides services for an uplink (UL) user and a downlink (DL) user. Each user operates in half-duplex (HD) mode and is equipped with a single fixed-position antenna (FPA), in the presence of a sin…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: This paper has been submitted for possible publication

  10. IDF-CR: Iterative Diffusion Process for Divide-and-Conquer Cloud Removal in Remote-sensing Images

    Authors: Meilin Wang, Yexing Song, Pengxu Wei, Xiaoyu Xian, Yukai Shi, Liang Lin

    Abstract: Deep learning technologies have demonstrated their effectiveness in removing cloud cover from optical remote-sensing images. Convolutional Neural Networks (CNNs) exert dominance in the cloud removal tasks. However, constrained by the inherent limitations of convolutional operations, CNNs can address only a modest fraction of cloud occlusion. In recent years, diffusion models have achieved state-of…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE TGRS, we first present an iterative diffusion process for cloud removal, the code is available at: https://github.com/SongYxing/IDF-CR

  11. arXiv:2403.08947  [pdf, other]

    eess.IV cs.CV

    Robust COVID-19 Detection in CT Images with CLIP

    Authors: Li Lin, Yamini Sri Krubha, Zhenhuan Yang, Cheng Ren, Thuc Duy Le, Irene Amerini, Xin Wang, Shu Hu

    Abstract: In the realm of medical imaging, particularly for COVID-19 detection, deep learning models face substantial challenges such as the necessity for extensive computational resources, the paucity of well-annotated datasets, and a significant amount of unlabeled data. In this work, we introduce the first lightweight detector designed to overcome these obstacles, leveraging a frozen CLIP image encoder a…

    Submitted 14 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  12. arXiv:2402.17502  [pdf, other]

    cs.CV eess.IV

    FedLPPA: Learning Personalized Prompt and Aggregation for Federated Weakly-supervised Medical Image Segmentation

    Authors: Li Lin, Yixiang Liu, Jiewei Wu, Pujin Cheng, Zhiyuan Cai, Kenneth K. Y. Wong, Xiaoying Tang

    Abstract: Federated learning (FL) effectively mitigates the data silo challenge brought about by policies and privacy concerns, implicitly harnessing more data for deep model training. However, traditional centralized FL models grapple with diverse multi-center data, especially in the face of significant data heterogeneity, notably in medical contexts. In the realm of medical image segmentation, the growing…

    Submitted 31 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: 12 pages, 10 figures

  13. arXiv:2402.09508  [pdf, other]

    cs.SD cs.AI eess.AS

    Arrange, Inpaint, and Refine: Steerable Long-term Music Audio Generation and Editing via Content-based Controls

    Authors: Liwei Lin, Gus Xia, Yixiao Zhang, Junyan Jiang

    Abstract: Controllable music generation plays a vital role in human-AI music co-creation. While Large Language Models (LLMs) have shown promise in generating high-quality music, their focus on autoregressive generation limits their utility in music editing tasks. To address this gap, we propose a novel approach leveraging a parameter-efficient heterogeneous adapter combined with a masking training scheme. T…

    Submitted 10 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  14. arXiv:2401.17049  [pdf, ps, other]

    cs.IT eess.SP

    Movable Antenna-Enabled Co-Frequency Co-Time Full-Duplex Wireless Communication

    Authors: Jingze Ding, Zijian Zhou, Wenyao Li, Chenbo Wang, Lifeng Lin, Bingli Jiao

    Abstract: Movable antenna (MA) provides an innovative way to arrange antennas that can contribute to improved signal quality and more effective interference management. This method is especially beneficial for co-frequency co-time full-duplex (CCFD) wireless communication, which struggles with self-interference (SI) that usually overpowers the desired incoming signals. By dynamically repositioning transmit/…

    Submitted 7 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: This paper has been submitted to IEEE Wireless Communications Letters

  15. arXiv:2312.16607  [pdf, other]

    eess.IV cs.CV stat.ML

    A Polarization and Radiomics Feature Fusion Network for the Classification of Hepatocellular Carcinoma and Intrahepatic Cholangiocarcinoma

    Authors: Jia Dong, Yao Yao, Liyan Lin, Yang Dong, Jiachen Wan, Ran Peng, Chao Li, Hui Ma

    Abstract: Classifying hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (ICC) is a critical step in treatment selection and prognosis evaluation for patients with liver diseases. Traditional histopathological diagnosis poses challenges in this context. In this study, we introduce a novel polarization and radiomics feature fusion network, which combines polarization features obtained from Mu…

    Submitted 27 December, 2023; originally announced December 2023.

  16. arXiv:2312.07226  [pdf, other]

    eess.IV cs.CV

    Super-Resolution on Rotationally Scanned Photoacoustic Microscopy Images Incorporating Scanning Prior

    Authors: Kai Pan, Linyang Li, Li Lin, Pujin Cheng, Junyan Lyu, Lei Xi, Xiaoyin Tang

    Abstract: Photoacoustic Microscopy (PAM) images integrating the advantages of optical contrast and acoustic resolution have been widely used in brain studies. However, there exists a trade-off between scanning speed and image resolution. Compared with traditional raster scanning, rotational scanning provides good opportunities for fast PAM imaging by optimizing the scanning mechanism. Recently, there is a t…

    Submitted 12 December, 2023; originally announced December 2023.

  17. arXiv:2312.03299  [pdf, other]

    cs.IT eess.SP

    Channel-Transferable Semantic Communications for Multi-User OFDM-NOMA Systems

    Authors: Lan Lin, Wenjun Xu, Fengyu Wang, Yimeng Zhang, Wei Zhang, Ping Zhang

    Abstract: Semantic communications are expected to become the core new paradigms of the sixth generation (6G) wireless networks. Most existing works implicitly utilize channel information for codecs training, which leads to poor communications when channel type or statistical characteristics change. To tackle this issue posed by various channels, a novel channel-transferable semantic communications (CT-SemCo…

    Submitted 6 December, 2023; originally announced December 2023.

  18. arXiv:2312.03231  [pdf, other]

    cs.LG cs.AI cs.CV cs.HC eess.AS

    Deep Multimodal Fusion for Surgical Feedback Classification

    Authors: Rafal Kocielnik, Elyssa Y. Wong, Timothy N. Chu, Lydia Lin, De-An Huang, Jiayun Wang, Anima Anandkumar, Andrew J. Hung

    Abstract: Quantification of real-time informal feedback delivered by an experienced surgeon to a trainee during surgery is important for skill improvements in surgical training. Such feedback in the live operating room is inherently multimodal, consisting of verbal conversations (e.g., questions and answers) as well as non-verbal elements (e.g., through visual cues like pointing to anatomic elements). In th…

    Submitted 5 December, 2023; originally announced December 2023.

    Journal ref: Published in Proceedings of Machine Learning for Health 2024

  19. Exploding AI Power Use: an Opportunity to Rethink Grid Planning and Management

    Authors: Liuzixuan Lin, Rajini Wijayawardana, Varsha Rao, Hai Nguyen, Wedan Emmanuel Gnibga, Andrew A. Chien

    Abstract: The unprecedented rapid growth of computing demand for AI is projected to increase global annual datacenter (DC) growth from 7.2% to 11.3%. We project the 5-year AI DC demand for several power grids and assess whether they will allow desired AI growth (resource adequacy). If not, several "desperate measures" -- grid policies that enable more load growth and maintain grid reliability by sacrificing…

    Submitted 30 April, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: Accepted by ACM e-Energy '24: the 15th ACM International Conference on Future and Sustainable Energy Systems

  20. arXiv:2310.17162  [pdf, other]

    cs.AI cs.SD eess.AS

    Content-based Controls For Music Large Language Modeling

    Authors: Liwei Lin, Gus Xia, Junyan Jiang, Yixiao Zhang

    Abstract: Recent years have witnessed a rapid growth of large-scale language models in the domain of music audio. Such models enable end-to-end generation of higher-quality music, and some allow conditioned generation using text descriptions. However, the control power of text controls on music is intrinsically limited, as they can only describe music indirectly through meta-data (such as singers and instru…

    Submitted 13 April, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

  21. arXiv:2310.11230  [pdf, other]

    eess.AS cs.LG cs.SD

    Zipformer: A faster and better encoder for automatic speech recognition

    Authors: Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey

    Abstract: The Conformer has become the most popular encoder model for automatic speech recognition (ASR). It adds convolution modules to a transformer to learn both local and global dependencies. In this work we describe a faster, more memory-efficient, and better-performing transformer, called Zipformer. Modeling changes include: 1) a U-Net-like encoder structure where middle stacks operate at lower frame…

    Submitted 9 April, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Published as a conference paper at ICLR 2024

  22. arXiv:2310.07255  [pdf, other]

    cs.CV eess.IV

    ADASR: An Adversarial Auto-Augmentation Framework for Hyperspectral and Multispectral Data Fusion

    Authors: Jinghui Qin, Lihuang Fang, Ruitao Lu, Liang Lin, Yukai Shi

    Abstract: Deep learning-based hyperspectral image (HSI) super-resolution, which aims to generate high spatial resolution HSI (HR-HSI) by fusing hyperspectral image (HSI) and multispectral image (MSI) with deep neural networks (DNNs), has attracted lots of attention. However, neural networks require large amounts of training data, hindering their application in real-world scenarios. In this letter, we propos…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: This paper has been accepted by IEEE Geoscience and Remote Sensing Letters. Code is released at https://github.com/fangfang11-plog/ADASR

  23. arXiv:2310.04886  [pdf, other]

    eess.SY

    A Closed-form Solution for the Strapdown Inertial Navigation Initial Value Problem

    Authors: James Goppert, Li-Yu Lin, Kartik Pant, Benjamin Perseghetti

    Abstract: Strapdown inertial navigation systems (SINS) are ubiquitous in robotics and engineering since they can estimate a rigid body pose using onboard kinematic measurements without knowledge of the dynamics of the vehicle to which they are attached. While recent work has focused on the closed-form evolution of the estimation error for SINS, which is critical for Kalman filtering, the propagation of the…

    Submitted 7 October, 2023; originally announced October 2023.

    Comments: 4 pages, 3 figures

  24. arXiv:2309.08879  [pdf, other]

    cs.CL eess.SP

    Semantic Information Extraction for Text Data with Probability Graph

    Authors: Zhouxiang Zhao, Zhaohui Yang, Ye Hu, Licheng Lin, Zhaoyang Zhang

    Abstract: In this paper, the problem of semantic information extraction for resource constrained text data transmission is studied. In the considered model, a sequence of text data need to be transmitted within a communication resource-constrained network, which only allows limited data transmission. Thus, at the transmitter, the original text data is extracted with natural language processing techniques. T…

    Submitted 16 September, 2023; originally announced September 2023.

  25. arXiv:2309.08105  [pdf, other]

    eess.AS cs.SD

    Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context

    Authors: Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Yifan Yang, Liyong Guo, Long Lin, Daniel Povey

    Abstract: In this paper, we introduce Libriheavy, a large-scale ASR corpus consisting of 50,000 hours of read English speech derived from LibriVox. To the best of our knowledge, Libriheavy is the largest freely-available corpus of speech with supervisions. Different from other open-sourced datasets that only provide normalized transcriptions, Libriheavy contains richer information such as punctuation, casin…

    Submitted 14 January, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024

  26. arXiv:2309.07414  [pdf, other]

    eess.AS cs.CL cs.SD

    PromptASR for contextualized ASR with controllable style

    Authors: Xiaoyu Yang, Wei Kang, Zengwei Yao, Yifan Yang, Liyong Guo, Fangjun Kuang, Long Lin, Daniel Povey

    Abstract: Prompts are crucial to large language models as they provide context information such as topic or logical relationships. Inspired by this, we propose PromptASR, a framework that integrates prompts in end-to-end automatic speech recognition (E2E ASR) systems to achieve contextualized ASR with controllable style of transcriptions. Specifically, a dedicated text encoder encodes the text prompts and t…

    Submitted 24 January, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: Proc. ICASSP 2024

  27. arXiv:2309.05042  [pdf, ps, other]

    eess.SP

    High-Precision Channel Estimation for Sub-Noise Self-Interference Cancellation

    Authors: Dongsheng Zheng, Lifeng Lin, Wenyao Li, Bingli Jiao

    Abstract: Self-interference cancellation plays a crucial role in achieving reliable full-duplex communications. In general, it is essential to cancel the self-interference signal below the thermal noise level, which necessitates accurate reconstruction of the self-interference signal. In this paper, we propose a high-precision channel estimation method specifically designed for sub-noise self-interference c…

    Submitted 10 September, 2023; originally announced September 2023.

  28. arXiv:2307.15972  [pdf, other]

    eess.SY

    On Decidability of Existence of Fortified Supervisors Against Covert Actuator Attackers

    Authors: Ruochen Tai, Liyong Lin, Rong Su

    Abstract: This work investigates the problem of synthesizing fortified supervisors against covert actuator attackers. For a non-resilient supervisor S, i.e., there exists at least a covert actuator attacker that is capable of inflicting damage w.r.t S, a fortified supervisor S' satisfies two requirements: 1) S' is resilient against any covert actuator attacker, and 2) the original closed-behavior of the clo…

    Submitted 29 July, 2023; originally announced July 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2205.02383

  29. arXiv:2306.14471  [pdf]

    physics.med-ph eess.IV physics.ins-det physics.optics

    Single-shot 3D photoacoustic computed tomography with a densely packed array for transcranial functional imaging

    Authors: Rui Cao, Yilin Luo, Jinhua Xu, Xiaofei Luo, Ku Geng, Yousuf Aborahama, Manxiu Cui, Samuel Davis, Shuai Na, Xin Tong, Cindy Liu, Karteek Sastry, Konstantin Maslov, Peng Hu, Yide Zhang, Li Lin, Yang Zhang, Lihong V. Wang

    Abstract: Photoacoustic computed tomography (PACT) is emerging as a new technique for functional brain imaging, primarily due to its capabilities in label-free hemodynamic imaging. Despite its potential, the transcranial application of PACT has encountered hurdles, such as acoustic attenuations and distortions by the skull and limited light penetration through the skull. To overcome these challenges, we hav…

    Submitted 26 June, 2023; originally announced June 2023.

  30. arXiv:2305.11558  [pdf, other]

    eess.AS cs.CL

    Blank-regularized CTC for Frame Skipping in Neural Transducer

    Authors: Yifan Yang, Xiaoyu Yang, Liyong Guo, Zengwei Yao, Wei Kang, Fangjun Kuang, Long Lin, Xie Chen, Daniel Povey

    Abstract: Neural Transducer and connectionist temporal classification (CTC) are popular end-to-end automatic speech recognition systems. Due to their frame-synchronous design, blank symbols are introduced to address the length mismatch between acoustic frames and output tokens, which might bring redundant computation. Previous studies managed to accelerate the training and inference of neural Transducers by…

    Submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted in INTERSPEECH 2023

  31. arXiv:2305.11539  [pdf, other]

    eess.AS

    Delay-penalized CTC implemented based on Finite State Transducer

    Authors: Zengwei Yao, Wei Kang, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Yifan Yang, Long Lin, Daniel Povey

    Abstract: Connectionist Temporal Classification (CTC) suffers from the latency problem when applied to streaming models. We argue that in CTC lattice, the alignments that can access more future context are preferred during training, thereby leading to higher symbol delay. In this work we propose the delay-penalized CTC which is augmented with latency penalty regularization. We devise a flexible and efficien…

    Submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted in INTERSPEECH 2023

  32. arXiv:2305.11504  [pdf, other]

    eess.IV cs.CV cs.LG

    JOINEDTrans: Prior Guided Multi-task Transformer for Joint Optic Disc/Cup Segmentation and Fovea Detection

    Authors: Huaqing He, Li Lin, Zhiyuan Cai, Pujin Cheng, Xiaoying Tang

    Abstract: Deep learning-based image segmentation and detection models have largely improved the efficiency of analyzing retinal landmarks such as optic disc (OD), optic cup (OC), and fovea. However, factors including ophthalmic disease-related lesions and low image quality issues may severely complicate automatic OD/OC segmentation and fovea detection. Most existing works treat the identification of each la…

    Submitted 19 May, 2023; originally announced May 2023.

    Comments: 11 pages, 6 figures

  33. arXiv:2304.07495  [pdf]

    physics.optics eess.IV

    Anti-scattering medium computational ghost imaging with modified Hadamard patterns

    Authors: Li-Xing Lin, Jie Cao, Qun Hao

    Abstract: Illumination patterns of computational ghost imaging (CGI) systems suffer from reduced contrast when passing through a scattering medium, which causes the effective information in the reconstruction result to be drowned out by noise. A two-dimensional (2D) Gaussian filter performs linear smoothing operation on the whole image for image denoising. It can be combined with linear reconstruction algor…

    Submitted 15 April, 2023; originally announced April 2023.

    Comments: 14 pages, 7 figures

  34. arXiv:2304.05635  [pdf, other]

    eess.IV cs.CV

    Unifying and Personalizing Weakly-supervised Federated Medical Image Segmentation via Adaptive Representation and Aggregation

    Authors: Li Lin, Jiewei Wu, Yixiang Liu, Kenneth K. Y. Wong, Xiaoying Tang

    Abstract: Federated learning (FL) enables multiple sites to collaboratively train powerful deep models without compromising data privacy and security. The statistical heterogeneity (e.g., non-IID data and domain shifts) is a primary obstacle in FL, impairing the generalization performance of the global model. Weakly supervised segmentation, which uses sparsely-grained (i.e., point-, bounding box-, scribble-…

    Submitted 12 April, 2023; originally announced April 2023.

    Comments: 13 pages, 7 figures

  35. arXiv:2303.04603  [pdf, other]

    eess.IV cs.CV

    Learning Enhancement From Degradation: A Diffusion Model For Fundus Image Enhancement

    Authors: Puijin Cheng, Li Lin, Yijin Huang, Huaqing He, Wenhan Luo, Xiaoying Tang

    Abstract: The quality of a fundus image can be compromised by numerous factors, many of which are challenging to be appropriately and mathematically modeled. In this paper, we introduce a novel diffusion model based framework, named Learning Enhancement from Degradation (LED), for enhancing fundus images. Specifically, we first adopt a data-driven degradation framework to learn degradation mappings from unp…

    Submitted 8 March, 2023; originally announced March 2023.

  36. arXiv:2303.03703  [pdf, other]

    eess.IV

    Geometry-based spherical JND modeling for 360$^\circ$ display

    Authors: Hongan Wei, Jiaqi Liu, Bo Chen, Liqun Lin, Weiling Chen, Tiesong Zhao

    Abstract: 360$^\circ$ videos have received widespread attention due to its realistic and immersive experiences for users. To date, how to accurately model the user perceptions on 360$^\circ$ display is still a challenging issue. In this paper, we exploit the visual characteristics of 360$^\circ$ projection and display and extend the popular just noticeable difference (JND) model to spherical JND (SJND). Fir…

    Submitted 4 June, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

  37. Adapting Datacenter Capacity for Greener Datacenters and Grid

    Authors: Liuzixuan Lin, Andrew A. Chien

    Abstract: Cloud providers are adapting datacenter (DC) capacity to reduce carbon emissions. With hyperscale datacenters exceeding 100 MW individually, and in some grids exceeding 15% of power load, DC adaptation is large enough to harm power grid dynamics, increasing carbon emissions, power prices, or reduce grid reliability. To avoid harm, we explore coordination of DC capacity change varying scope in sp…

    Submitted 23 June, 2023; v1 submitted 8 January, 2023; originally announced January 2023.

    Comments: Published at e-Energy '23: Proceedings of the 14th ACM International Conference on Future Energy Systems

  38. arXiv:2301.01069  [pdf, other]

    eess.IV cs.CV cs.IR

    Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment

    Authors: Liqun Lin, Yang Zheng, Weiling Chen, Chengdong Lan, Tiesong Zhao

    Abstract: Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and…

    Submitted 3 January, 2023; originally announced January 2023.

  39. arXiv:2212.10541  [pdf, other]

    cs.CV eess.IV

    UNO-QA: An Unsupervised Anomaly-Aware Framework with Test-Time Clustering for OCTA Image Quality Assessment

    Authors: Juntao Chen, Li Lin, Pujin Cheng, Yijin Huang, Xiaoying Tang

    Abstract: Medical image quality assessment (MIQA) is a vital prerequisite in various medical image analysis applications. Most existing MIQA algorithms are fully supervised that request a large amount of annotated data. However, annotating medical images is time-consuming and labor-intensive. In this paper, we propose an unsupervised anomaly-aware framework with test-time clustering for optical coherence to…

    Submitted 21 February, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: submitted to ISBI2023

  40. arXiv:2212.05566  [pdf, other]

    cs.CV eess.IV

    YoloCurvSeg: You Only Label One Noisy Skeleton for Vessel-style Curvilinear Structure Segmentation

    Authors: Li Lin, Linkai Peng, Huaqing He, Pujin Cheng, Jiewei Wu, Kenneth K. Y. Wong, Xiaoying Tang

    Abstract: Weakly-supervised learning (WSL) has been proposed to alleviate the conflict between data annotation cost and model performance through employing sparsely-grained (i.e., point-, box-, scribble-wise) supervision and has shown promising performance, particularly in the image segmentation field. However, it is still a very challenging task due to the limited supervision, especially when only a small…

    Submitted 18 August, 2023; v1 submitted 11 December, 2022; originally announced December 2022.

    Comments: 20 pages, 15 figures, MEDIA accepted

  41. arXiv:2211.03310  [pdf, other]

    eess.SY

    Log-linear Dynamic Inversion Control with Provable Safety Guarantees in Lie Groups

    Authors: Li-Yu Lin, James Goppert, Inseok Hwang

    Abstract: In this paper, we use the derivative of the exponential map to derive the exact evolution of the logarithm of the tracking error for mixed-invariant systems, a class of systems capable of describing rigid body tracking problems in Lie groups. Additionally, we design a log-linear dynamic inversion-based control law to remove the nonlinearities due to spatial curvature and enhance the robustness of…

    Submitted 13 August, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

    Comments: 7 pages, 5 figures. Revision is submitted to IEEE TAC

  42. arXiv:2211.00508  [pdf, other]

    eess.AS cs.CL cs.SD

    Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation

    Authors: Liyong Guo, Xiaoyu Yang, Quandong Wang, Yuxiang Kong, Zengwei Yao, Fan Cui, Fangjun Kuang, Wei Kang, Long Lin, Mingshuang Luo, Piotr Zelasko, Daniel Povey

    Abstract: Knowledge distillation (KD) is a common approach to improve model performance in automatic speech recognition (ASR), where a student model is trained to imitate the output behaviour of a teacher model. However, traditional KD methods suffer from teacher label storage issue, especially when the training corpora are large. Although on-the-fly teacher label generation tackles this issue, the training…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: Submitted to ICASSP 2022

  43. arXiv:2211.00490  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Delay-penalized transducer for low-latency streaming ASR

    Authors: Wei Kang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Long Lin, Piotr Żelasko, Daniel Povey

    Abstract: In streaming automatic speech recognition (ASR), it is desirable to reduce latency as much as possible while having minimum impact on recognition accuracy. Although a few existing methods are able to achieve this goal, they are difficult to implement due to their dependency on external alignments. In this paper, we propose a simple way to penalize symbol delay in transducer model, so that we can b…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: Submitted to 2023 IEEE International Conference on Acoustics, Speech and Signal Processing

  44. arXiv:2211.00484  [pdf, ps, other]

    eess.AS cs.CL cs.LG cs.SD

    Fast and parallel decoding for transducer

    Authors: Wei Kang, Liyong Guo, Fangjun Kuang, Long Lin, Mingshuang Luo, Zengwei Yao, Xiaoyu Yang, Piotr Żelasko, Daniel Povey

    Abstract: The transducer architecture is becoming increasingly popular in the field of speech recognition, because it is naturally streaming as well as high in accuracy. One of the drawbacks of transducer is that it is difficult to decode in a fast and parallel way due to an unconstrained number of symbols that can be emitted per time step. In this work, we introduce a constrained version of transducer loss…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: Submitted to 2023 IEEE International Conference on Acoustics, Speech and Signal Processing

  45. arXiv:2210.14645  [pdf, other]

    eess.IV cs.CV

    Super-Resolution Based Patch-Free 3D Image Segmentation with High-Frequency Guidance

    Authors: Hongyi Wang, Lanfen Lin, Hongjie Hu, Qingqing Chen, Yinhao Li, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

    Abstract: High resolution (HR) 3D images are widely used nowadays, such as medical images like Magnetic Resonance Imaging (MRI) and Computed Tomography (CT). However, segmentation of these 3D images remains a challenge due to their high spatial resolution and dimensionality in contrast to currently limited GPU memory. Therefore, most existing 3D image segmentation methods use patch-based models, which have…

    Submitted 10 July, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Version #2 uploaded in Jul 10, 2023

  46. arXiv:2210.10993  [pdf, other]

    cs.LG eess.SP

    A Magnetic Framelet-Based Convolutional Neural Network for Directed Graphs

    Authors: Lequan Lin, Junbin Gao

    Abstract: Spectral Graph Convolutional Networks (spectral GCNNs), a powerful tool for analyzing and processing graph data, typically apply frequency filtering via Fourier transform to obtain representations with selective information. Although research shows that spectral GCNNs can be enhanced by framelet-based filtering, the massive majority of such research only considers undirected graphs. In this paper,…

    Submitted 2 May, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted by ICASSP 2023

  47. arXiv:2209.01739  [pdf, ps, other]

    eess.SP

    Auxiliary Factor Method to Remove ISI of Nyquist Filters

    Authors: Zijian Zhou, Lifeng Lin, Bingli Jiao

    Abstract: As has been known, the Nyquist first condition promises no intersymbol interference (ISI) as derived in the frequency domain. However, the practical implementation using the FIR filter truncates the Fourier transform by its window and prevents the mathematical calculation from reaching the ideal solution at zero-ISI. For obtaining better results, an increase in the window's length is required in g…

    Submitted 7 February, 2024; v1 submitted 4 September, 2022; originally announced September 2022.

    Comments: This paper was accepted by IEEE Communications Letters

  48. arXiv:2208.00428  [pdf, other]

    cs.CV eess.IV

    Robust Real-World Image Super-Resolution against Adversarial Attacks

    Authors: Jiutao Yue, Haofeng Li, Pengxu Wei, Guanbin Li, Liang Lin

    Abstract: Recently deep neural networks (DNNs) have achieved significant success in real-world image super-resolution (SR). However, adversarial image samples with quasi-imperceptible noises could threaten deep learning SR models. In this paper, we propose a robust deep learning framework for real-world SR that randomly erases potential adversarial noises in the frequency domain of input images or features.…

    Submitted 31 July, 2022; originally announced August 2022.

    Comments: ACM-MM 2021, Code: https://github.com/lhaof/Robust-SR-against-Adversarial-Attacks

    Journal ref: Proceedings of the 29th ACM International Conference on Multimedia (2021) 5148-5157

  49. AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation

    Authors: Junyan Lyu, Yiqi Zhang, Yijin Huang, Li Lin, Pujin Cheng, Xiaoying Tang

    Abstract: Convolutional neural networks have been widely applied to medical image segmentation and have achieved considerable performance. However, the performance may be significantly affected by the domain gap between training data (source domain) and testing data (target domain). To address this issue, we propose a data manipulation based domain generalization method, called Automated Augmentation for Do…

    Submitted 26 July, 2022; originally announced July 2022.

    Comments: Accepted by IEEE Transactions on Medical Imaging (TMI)

  50. arXiv:2206.13236  [pdf, other]

    eess.AS cs.AI cs.LG

    Pruned RNN-T for fast, memory-efficient ASR training

    Authors: Fangjun Kuang, Liyong Guo, Wei Kang, Long Lin, Mingshuang Luo, Zengwei Yao, Daniel Povey

    Abstract: The RNN-Transducer (RNN-T) framework for speech recognition has been growing in popularity, particularly for deployed real-time ASR systems, because it combines high accuracy with naturally streaming recognition. One of the drawbacks of RNN-T is that its loss function is relatively slow to compute, and can use a lot of memory. Excessive GPU memory usage can make it impractical to use RNN-T loss in…

    Submitted 23 June, 2022; originally announced June 2022.