Showing 1–50 of 107 results for author: Feng, Y

Searching in archive eess.
  1. arXiv:2406.18547  [pdf]

    eess.IV cs.CV

    Enhancing Medical Imaging with GANs Synthesizing Realistic Images from Limited Data

    Authors: Yinqiu Feng, Bo Zhang, Lingxi Xiao, Yutian Yang, Tana Gegen, Zexi Chen

    Abstract: In this research, we introduce an innovative method for synthesizing medical images using generative adversarial networks (GANs). Our proposed GANs method demonstrates the capability to produce realistic synthetic images even when trained on a limited quantity of real medical image data, showcasing commendable generalization prowess. To achieve this, we devised a generator and discriminator networ…

    Submitted 22 May, 2024; originally announced June 2024.

  2. arXiv:2406.16981  [pdf]

    eess.IV cs.AI cs.LG eess.SP

    Research on Feature Extraction Data Processing System For MRI of Brain Diseases Based on Computer Deep Learning

    Authors: Lingxi Xiao, Jinxin Hu, Yutian Yang, Yinqiu Feng, Zichao Li, Zexi Chen

    Abstract: Most of the existing wavelet image processing techniques are carried out in the form of single-scale reconstruction and multiple iterations. However, processing high-quality fMRI data presents problems such as mixed noise and excessive computation time. This project proposes the use of matrix operations by combining mixed noise elimination methods with wavelet analysis to replace traditional itera…

    Submitted 23 June, 2024; originally announced June 2024.

  3. arXiv:2406.10910  [pdf, ps, other]

    cs.IT eess.SP

    Fast Fractional Programming for Multi-Cell Integrated Sensing and Communications

    Authors: Yannan Chen, Yi Feng, Xiaoyang Li, Licheng Zhao, Kaiming Shen

    Abstract: This paper concerns the coordinate multi-cell beamforming design for integrated sensing and communications (ISAC). In particular, we assume that each base station (BS) has massive antennas. The optimization objective is to maximize a weighted sum of the data rates (for communications) and the Fisher information (for sensing). We first show that the conventional beamforming method for the multiple-…

    Submitted 16 June, 2024; originally announced June 2024.

  4. arXiv:2406.10606  [pdf, other]

    eess.SP

    Semantic Communication for Edge Intelligence Enabled Autonomous Driving System

    Authors: Yunqi Feng, Hesheng Shen, Zhendong Shan, Qianqian Yang, Xiufang Shi

    Abstract: Expected to provide higher transportation efficiency and security, autonomous driving has attracted substantial attention from both industry and academia. Meanwhile, the emergence of edge intelligence has further introduced significant advancements to this field. However, the crucial demands of ultra-reliable and low-latency communications (URLLC) among the vehicles and edge servers have hindered…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: This paper has been submitted to IEEE Network Magazine and is undergoing major revisions

  5. arXiv:2406.07330  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    CTC-based Non-autoregressive Textless Speech-to-Speech Translation

    Authors: Qingkai Fang, Zhengrui Ma, Yan Zhou, Min Zhang, Yang Feng

    Abstract: Direct speech-to-speech translation (S2ST) has achieved impressive translation quality, but it often faces the challenge of slow decoding due to the considerable length of speech sequences. Recently, some research has turned to non-autoregressive (NAR) models to expedite decoding, yet the translation quality typically lags behind autoregressive (AR) models significantly. In this paper, we investig…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Findings

    ACM Class: I.2.7

  6. arXiv:2406.07289  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data?

    Authors: Qingkai Fang, Shaolei Zhang, Zhengrui Ma, Min Zhang, Yang Feng

    Abstract: Recently proposed two-pass direct speech-to-speech translation (S2ST) models decompose the task into speech-to-text translation (S2TT) and text-to-speech (TTS) within an end-to-end model, yielding promising results. However, the training of these models still relies on parallel speech data, which is extremely challenging to collect. In contrast, S2TT and TTS have accumulated a large amount of data…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ACL 2024 main conference. Project Page: https://ictnlp.github.io/ComSpeech-Site/

    ACM Class: I.2.7

  7. arXiv:2406.06937  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation

    Authors: Zhengrui Ma, Qingkai Fang, Shaolei Zhang, Shoutao Guo, Yang Feng, Min Zhang

    Abstract: Simultaneous translation models play a crucial role in facilitating communication. However, existing research primarily focuses on text-to-text or speech-to-text models, necessitating additional cascade components to achieve speech-to-speech translation. These pipeline methods suffer from error propagation and accumulate delays in each cascade component, resulting in reduced synchronization betwee…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ACL 2024; Codes and demos are at https://github.com/ictnlp/NAST-S2x

  8. arXiv:2406.03049  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning

    Authors: Shaolei Zhang, Qingkai Fang, Shoutao Guo, Zhengrui Ma, Min Zhang, Yang Feng

    Abstract: Simultaneous speech-to-speech translation (Simul-S2ST, a.k.a streaming speech translation) outputs target speech while receiving streaming speech inputs, which is critical for real-time communication. Beyond accomplishing translation between speech, Simul-S2ST requires a policy to control the model to generate corresponding target speech at the opportune moment within speech inputs, thereby posing…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 main conference, Project Page: https://ictnlp.github.io/StreamSpeech-site/

  9. arXiv:2405.15339  [pdf, other]

    eess.SP

    Environment Sensing-aided Beam Prediction with Transfer Learning for Smart Factory

    Authors: Yuan Feng, Chuanbing Zhao, Feifei Gao, Yong Zhang, Shaodan Ma

    Abstract: In this paper, we propose an environment sensing-aided beam prediction model for smart factory that can be transferred from given environments to a new environment. In particular, we first design a pre-training model that predicts the optimal beam by sensing the present environmental information. When encountering a new environment, it generally requires collecting a large amount of new training d…

    Submitted 24 May, 2024; originally announced May 2024.

  10. arXiv:2405.10463  [pdf, other]

    physics.optics eess.IV physics.bio-ph

    Single-shot volumetric fluorescence imaging with neural fields

    Authors: Oumeng Zhang, Haowen Zhou, Brandon Y. Feng, Elin M. Larsson, Reinaldo E. Alcalde, Siyuan Yin, Catherine Deng, Changhuei Yang

    Abstract: Single-shot volumetric fluorescence (SVF) imaging offers a significant advantage over traditional imaging methods that require scanning across multiple axial planes as it can capture biological processes with high temporal resolution across a large field of view. The key challenges in SVF imaging include requiring sparsity constraints to meet the multiplexing requirements of compressed sensing, el…

    Submitted 4 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  11. arXiv:2404.13677  [pdf, other]

    cs.CV eess.IV

    A Dataset and Model for Realistic License Plate Deblurring

    Authors: Haoyan Gong, Yuzheng Feng, Zhenrong Zhang, Xianxu Hou, Jingxin Liu, Siqi Huang, Hongbin Liu

    Abstract: Vehicle license plate recognition is a crucial task in intelligent traffic management systems. However, the challenge of achieving accurate recognition persists due to motion blur from fast-moving vehicles. Despite the widespread use of image synthesis approaches in existing deblurring and recognition algorithms, their effectiveness in real-world scenarios remains unproven. To address this, we int…

    Submitted 22 April, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  12. arXiv:2404.07985  [pdf, other]

    cs.CV eess.IV

    WaveMo: Learning Wavefront Modulations to See Through Scattering

    Authors: Mingyang Xie, Haiyun Guo, Brandon Y. Feng, Lingbo Jin, Ashok Veeraraghavan, Christopher A. Metzler

    Abstract: Imaging through scattering media is a fundamental and pervasive challenge in fields ranging from medical diagnostics to astronomy. A promising strategy to overcome this challenge is wavefront modulation, which induces measurement diversity during image acquisition. Despite its importance, designing optimal wavefront modulations to image through scattering remains under-explored. This paper introdu…

    Submitted 11 April, 2024; originally announced April 2024.

  13. arXiv:2403.06463  [pdf, other]

    eess.SY

    A prediction-based forward-looking vehicle dispatching strategy for dynamic ride-pooling

    Authors: Xiaolei Wang, Chen Yang, Yuzhen Feng, Luohan Hu, Zhengbing He

    Abstract: For on-demand dynamic ride-pooling services, e.g., Uber Pool and Didi Pinche, a well-designed vehicle dispatching strategy is crucial for platform profitability and passenger experience. Most existing dispatching strategies overlook incoming pairing opportunities, therefore suffer from short-sighted limitations. In this paper, we propose a forward-looking vehicle dispatching strategy, which first…

    Submitted 11 March, 2024; originally announced March 2024.

  14. arXiv:2402.18856  [pdf, other]

    eess.IV cs.CV

    Anatomy-guided fiber trajectory distribution estimation for cranial nerves tractography

    Authors: Lei Xie, Qingrun Zeng, Huajun Zhou, Guoqiang Xie, Mingchu Li, Jiahao Huang, Jianan Cui, Hao Chen, Yuanjing Feng

    Abstract: Diffusion MRI tractography is an important tool for identifying and analyzing the intracranial course of cranial nerves (CNs). However, the complex environment of the skull base leads to ambiguous spatial correspondence between diffusion directions and fiber geometry, and existing diffusion tractography methods of CNs identification are prone to producing erroneous trajectories and missing true po…

    Submitted 29 February, 2024; originally announced February 2024.

  15. arXiv:2401.08120  [pdf]

    eess.SY

    Operation Scheme Optimizations to Achieve Ultra-high Endurance (10^10) in Flash Memory with Robust Reliabilities

    Authors: Yang Feng, Zhaohui Sun, Chengcheng Wang, Xinyi Guo, Junyao Mei, Yueran Qi, Jing Liu, Junyu Zhang, Jixuan Wu, Xuepeng Zhan, Jiezhi Chen

    Abstract: Flash memory has been widely adopted as stand-alone memory and embedded memory due to its robust reliability. However, its limited endurance hinders further applications in storage class memory (SCM) and in endurance-demanding computing-in-memory (CIM) tasks. In this work, optimization strategies are studied to tackle this concern. It is shown that by adopting the channel ho…

    Submitted 16 January, 2024; originally announced January 2024.

  16. arXiv:2401.01685  [pdf]

    eess.IV cs.CV

    Modality Exchange Network for Retinogeniculate Visual Pathway Segmentation

    Authors: Hua Han, Cheng Li, Lei Xie, Yuanjing Feng, Alou Diakite, Shanshan Wang

    Abstract: Accurate segmentation of the retinogeniculate visual pathway (RGVP) aids in the diagnosis and treatment of visual disorders by identifying disruptions or abnormalities within the pathway. However, the complex anatomical structure and connectivity of RGVP make it challenging to achieve accurate segmentation. In this study, we propose a novel Modality Exchange Network (ME-Net) that effectively utili…

    Submitted 3 January, 2024; originally announced January 2024.

  17. arXiv:2401.01654  [pdf, other]

    eess.IV cs.LG

    LESEN: Label-Efficient deep learning for Multi-parametric MRI-based Visual Pathway Segmentation

    Authors: Alou Diakite, Cheng Li, Lei Xie, Yuanjing Feng, Hua Han, Shanshan Wang

    Abstract: Recent research has shown the potential of deep learning in multi-parametric MRI-based visual pathway (VP) segmentation. However, obtaining labeled data for training is laborious and time-consuming. Therefore, it is crucial to develop effective algorithms in situations with limited labeled samples. In this work, we propose a label-efficient deep learning method with self-ensembling (LESEN). LESEN…

    Submitted 3 January, 2024; originally announced January 2024.

  18. arXiv:2312.04679  [pdf, other]

    eess.IV cs.CV

    ConVRT: Consistent Video Restoration Through Turbulence with Test-time Optimization of Neural Video Representations

    Authors: Haoming Cai, Jingxi Chen, Brandon Y. Feng, Weiyun Jiang, Mingyang Xie, Kevin Zhang, Ashok Veeraraghavan, Christopher Metzler

    Abstract: Atmospheric turbulence presents a significant challenge in long-range imaging. Current restoration algorithms often struggle with temporal inconsistency, as well as limited generalization ability across varying turbulence levels and scene content different than the training data. To tackle these issues, we introduce a self-supervised method, Consistent Video Restoration through Turbulence (ConVRT)…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: https://convrt-2024.github.io/

  19. arXiv:2311.12316  [pdf]

    cs.CV cs.AI eess.IV

    Generating Progressive Images from Pathological Transitions via Diffusion Model

    Authors: Zeyu Liu, Tianyi Zhang, Yufang He, Yunlu Feng, Yu Zhao, Guanglei Zhang

    Abstract: Deep learning is widely applied in computer-aided pathological diagnosis, which alleviates the pathologist workload and provides timely clinical analysis. However, most models generally require large-scale annotated data for training, which faces challenges due to the sampling and annotation scarcity in pathological images. The rapidly developing generative models show potential to generate more tra…

    Submitted 9 March, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: 13 pages, 9 figs, 4 tabs

  20. arXiv:2311.06712  [pdf, other]

    eess.IV

    PuzzleTuning: Explicitly Bridge Pathological and Natural Image with Puzzles

    Authors: Tianyi Zhang, Shangqing Lyu, Yanli Lei, Sicheng Chen, Nan Ying, Yufang He, Yu Zhao, Yunlu Feng, Hwee Kuan Lee, Guanglei Zhang

    Abstract: Pathological image analysis is a crucial field in computer vision. Due to the annotation scarcity in the pathological field, pre-training with self-supervised learning (SSL) is widely applied to learn on unlabeled images. However, the current SSL-based pathological pre-training: (1) does not explicitly explore the essential focuses of the pathological field, and (2) does not effectively bridge wit…

    Submitted 22 April, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

    Comments: 13 pages, 9 figures, 8 tables

  21. arXiv:2310.18529  [pdf, other]

    physics.optics eess.IV

    FPM-INR: Fourier ptychographic microscopy image stack reconstruction using implicit neural representations

    Authors: Haowen Zhou, Brandon Y. Feng, Haiyun Guo, Siyu Lin, Mingshu Liang, Christopher A. Metzler, Changhuei Yang

    Abstract: Image stacks provide invaluable 3D information in various biological and pathological imaging applications. Fourier ptychographic microscopy (FPM) enables reconstructing high-resolution, wide field-of-view image stacks without z-stack scanning, thus significantly accelerating image acquisition. However, existing FPM methods take tens of minutes to reconstruct and gigabytes of memory to store a hig…

    Submitted 31 October, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

    Comments: Project Page: https://hwzhou2020.github.io/FPM-INR-Web/

  22. arXiv:2310.17940  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Unified Segment-to-Segment Framework for Simultaneous Sequence Generation

    Authors: Shaolei Zhang, Yang Feng

    Abstract: Simultaneous sequence generation is a pivotal task for real-time scenarios, such as streaming speech recognition, simultaneous machine translation and simultaneous speech translation, where the target sequence is generated while receiving the source sequence. The crux of achieving high-quality generation with low latency lies in identifying the optimal moments for generating, accomplished by learn…

    Submitted 30 November, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023

  23. arXiv:2310.07403  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation

    Authors: Qingkai Fang, Yan Zhou, Yang Feng

    Abstract: Direct speech-to-speech translation (S2ST) translates speech from one language into another using a single model. However, due to the presence of linguistic and acoustic diversity, the target speech follows a complex multimodal distribution, posing challenges to achieving both high-quality translations and fast decoding speeds for S2ST models. In this paper, we propose DASpeech, a non-autoregressi…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023. Audio samples are available at https://ictnlp.github.io/daspeech-demo/

    ACM Class: I.2.7

  24. arXiv:2310.02281  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    End-to-End Continuous Speech Emotion Recognition in Real-life Customer Service Call Center Conversations

    Authors: Yajing Feng, Laurence Devillers

    Abstract: Speech Emotion recognition (SER) in call center conversations has emerged as a valuable tool for assessing the quality of interactions between clients and agents. In contrast to controlled laboratory environments, real-life conversations take place under uncontrolled conditions and are subject to contextual factors that influence the expression of emotions. In this paper, we present our approach t…

    Submitted 2 October, 2023; originally announced October 2023.

    Journal ref: 2023 11th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), Sep 2023, Boston (MA), United States

  25. arXiv:2309.11015   

    eess.IV cs.LG

    3D-U-SAM Network For Few-shot Tooth Segmentation in CBCT Images

    Authors: Yifu Zhang, Zuozhu Liu, Yang Feng, Renjing Xu

    Abstract: Accurate representation of tooth position is extremely important in treatment. 3D dental image segmentation is a widely used method, however labelled 3D dental datasets are a scarce resource, leading to the problem of small samples that this task faces in many cases. To this end, we address this problem with a pretrained SAM and propose a novel 3D-U-SAM network for 3D dental image segmentation. Sp…

    Submitted 27 February, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: The paper needs to be updated

  26. arXiv:2309.07524  [pdf, other]

    cs.CV cs.IT eess.IV

    A Multi-scale Generalized Shrinkage Threshold Network for Image Blind Deblurring in Remote Sensing

    Authors: Yujie Feng, Yin Yang, Xiaohong Fan, Zhengpeng Zhang, Jianping Zhang

    Abstract: Remote sensing images are essential for many applications of the earth's sciences, but their quality can usually be degraded due to limitations in sensor technology and complex imaging environments. To address this, various remote sensing image deblurring methods have been developed to restore sharp and high-quality images from degraded observational data. However, most traditional model-based deb…

    Submitted 21 February, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: 16 pages, accepted to IEEE Transactions on Geoscience and Remote Sensing, 2024

    MSC Class: 54H30; 68U10; 94A08

    Journal ref: IEEE Transactions on Geoscience and Remote Sensing, 2024

  27. arXiv:2309.03440  [pdf, other]

    eess.IV cs.CV cs.LG

    Punctate White Matter Lesion Segmentation in Preterm Infants Powered by Counterfactually Generative Learning

    Authors: Zehua Ren, Yongheng Sun, Miaomiao Wang, Yuying Feng, Xianjun Li, Chao Jin, Jian Yang, Chunfeng Lian, Fan Wang

    Abstract: Accurate segmentation of punctate white matter lesions (PWMLs) is fundamental for the timely diagnosis and treatment of related developmental disorders. Automated PWMLs segmentation from infant brain MR images is challenging, considering that the lesions are typically small and low-contrast, and the number of lesions may dramatically change across subjects. Existing learning-based methods directl…

    Submitted 6 September, 2023; originally announced September 2023.

    Comments: 10 pages, 3 figures, Medical Image Computing and Computer Assisted Intervention (MICCAI)

  28. arXiv:2308.03807  [pdf, other]

    eess.IV cs.CV cs.LG

    Nest-DGIL: Nesterov-optimized Deep Geometric Incremental Learning for CS Image Reconstruction

    Authors: Xiaohong Fan, Yin Yang, Ke Chen, Yujie Feng, Jianping Zhang

    Abstract: Proximal gradient-based optimization is one of the most common strategies to solve inverse problem of images, and it is easy to implement. However, these techniques often generate heavy artifacts in image reconstruction. One of the most popular refinement methods is to fine-tune the regularization parameter to alleviate such artifacts, but it may not always be sufficient or applicable due to incre…

    Submitted 11 October, 2023; v1 submitted 6 August, 2023; originally announced August 2023.

    Comments: 15 pages; source code is available at https://github.com/fanxiaohong/Nest-DGIL

    Journal ref: This work is published in IEEE Transactions on Computational Imaging, vol. 9, pp. 819-833, 2023

  29. arXiv:2307.15388  [pdf, other]

    cs.LG eess.SP physics.geo-ph

    An Empirical Study of Large-Scale Data-Driven Full Waveform Inversion

    Authors: Peng Jin, Yinan Feng, Shihang Feng, Hanchen Wang, Yinpeng Chen, Benjamin Consolvo, Zicheng Liu, Youzuo Lin

    Abstract: This paper investigates the impact of big data on deep learning models to help solve the full waveform inversion (FWI) problem. While it is well known that big data can boost the performance of deep learning models in many tasks, its effectiveness has not been validated for FWI. To address this gap, we present an empirical study that investigates how deep learning models in FWI behave when trained…

    Submitted 24 April, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

  30. arXiv:2307.03298  [pdf]

    eess.IV cs.LG physics.med-ph

    Spherical CNN for Medical Imaging Applications: Importance of Equivariance in image reconstruction and denoising

    Authors: Amirreza Hashemi, Yuemeng Feng, Hamid Sabet

    Abstract: This work highlights the significance of equivariant networks as efficient and high-performance approaches for tomography applications. Our study builds upon the limitations of conventional Convolutional Neural Networks (CNNs), which have shown promise in post-processing various medical imaging systems. However, the efficiency of conventional CNNs heavily relies on an undiminished and proper train…

    Submitted 26 October, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

  31. arXiv:2307.02825  [pdf, other]

    cs.CV eess.IV

    Bundle-specific Tractogram Distribution Estimation Using Higher-order Streamline Differential Equation

    Authors: Yuanjing Feng, Lei Xie, Jingqiang Wang, Jianzhong He, Fei Gao

    Abstract: Tractography traces the peak directions extracted from fiber orientation distribution (FOD) suffering from ambiguous spatial correspondences between diffusion directions and fiber geometry, which is prone to producing erroneous tracks while missing true positive connections. The peaks-based tractography methods 'locally' reconstructed streamlines in 'single to single' manner, thus lacking of globa…

    Submitted 6 July, 2023; originally announced July 2023.

  32. arXiv:2307.01981  [pdf, other]

    eess.IV cs.CV cs.LG

    A ChatGPT Aided Explainable Framework for Zero-Shot Medical Image Diagnosis

    Authors: Jiaxiang Liu, Tianxiang Hu, Yan Zhang, Xiaotang Gai, Yang Feng, Zuozhu Liu

    Abstract: Zero-shot medical image classification is a critical process in real-world scenarios where we have limited access to all possible diseases or large-scale annotated data. It involves computing similarity scores between a query medical image and possible disease categories to determine the diagnostic result. Recent advances in pretrained vision-language models (VLMs) such as CLIP have shown great pe…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: Workshop on Interpretable ML in Healthcare at International Conference on Machine Learning (ICML) 2023

  33. arXiv:2307.01979  [pdf, other]

    eess.IV cs.CV

    ToothSegNet: Image Degradation meets Tooth Segmentation in CBCT Images

    Authors: Jiaxiang Liu, Tianxiang Hu, Yang Feng, Wanghui Ding, Zuozhu Liu

    Abstract: In computer-assisted orthodontics, three-dimensional tooth models are required for many medical treatments. Tooth segmentation from cone-beam computed tomography (CBCT) images is a crucial step in constructing the models. However, CBCT image quality problems such as metal artifacts and blurring caused by shooting equipment and patients' dental conditions make the segmentation difficult. In this pa…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: IEEE ISBI 2023

  34. arXiv:2306.10499  [pdf, other]

    eess.AS cs.SD

    Channel-Spatial-Based Few-Shot Bird Sound Event Detection

    Authors: Lingwen Liu, Yuxuan Feng, Haitao Fu, Yajie Yang, Xin Pan, Chenlei Jin

    Abstract: In this paper, we propose a model for bird sound event detection that focuses on a small number of training samples within the everyday long-tail distribution. As a result, we investigate bird sound detection using the few-shot learning paradigm. By integrating channel and spatial attention mechanisms, improved feature representations can be learned from few-shot training datasets. We develop a Me…

    Submitted 25 June, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

    Comments: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference

  35. The ACROBAT 2022 Challenge: Automatic Registration Of Breast Cancer Tissue

    Authors: Philippe Weitz, Masi Valkonen, Leslie Solorzano, Circe Carr, Kimmo Kartasalo, Constance Boissin, Sonja Koivukoski, Aino Kuusela, Dusan Rasic, Yanbo Feng, Sandra Sinius Pouplier, Abhinav Sharma, Kajsa Ledesma Eriksson, Stephanie Robertson, Christian Marzahl, Chandler D. Gatenbee, Alexander R. A. Anderson, Marek Wodzinski, Artur Jurgas, Niccolò Marini, Manfredo Atzori, Henning Müller, Daniel Budelmann, Nick Weiss, Stefan Heldmann, et al. (16 additional authors not shown)

    Abstract: The alignment of tissue between histopathological whole-slide-images (WSI) is crucial for research and clinical applications. Advances in computing, deep learning, and availability of large WSI datasets have revolutionised WSI analysis. Therefore, the current state-of-the-art in WSI registration is unclear. To address this, we conducted the ACROBAT challenge, based on the largest WSI registration…

    Submitted 29 May, 2023; originally announced May 2023.

  36. arXiv:2305.16093  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    End-to-End Simultaneous Speech Translation with Differentiable Segmentation

    Authors: Shaolei Zhang, Yang Feng

    Abstract: End-to-end simultaneous speech translation (SimulST) outputs translation while receiving the streaming speech inputs (a.k.a. streaming speech translation), and hence needs to segment the speech inputs and then translate based on the current received speech. However, segmenting the speech inputs at unfavorable moments can disrupt the acoustic integrity and adversely affect the performance of the tr…

    Submitted 17 June, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023 findings

  37. arXiv:2305.14635  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation

    Authors: Yan Zhou, Qingkai Fang, Yang Feng

    Abstract: End-to-end speech translation (ST) is the task of translating speech signals in the source language into text in the target language. As a cross-modal task, end-to-end ST is difficult to train with limited data. Existing methods often try to transfer knowledge from machine translation (MT), but their performances are restricted by the modality gap between speech and text. In this paper, we propose…

    Submitted 25 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: ACL 2023 main conference

  38. arXiv:2305.13314  [pdf, other]

    physics.geo-ph cs.LG eess.SP

    Auto-Linear Phenomenon in Subsurface Imaging

    Authors: Yinan Feng, Yinpeng Chen, Peng Jin, Shihang Feng, Zicheng Liu, Youzuo Lin

    Abstract: Subsurface imaging involves solving full waveform inversion (FWI) to predict geophysical properties from measurements. This problem can be reframed as an image-to-image translation, with the usual approach being to train an encoder-decoder network using paired data from two domains: geophysical property and measurement. A recent seminal work (InvLINT) demonstrates there is only a linear mapping be…

    Submitted 21 May, 2024; v1 submitted 27 April, 2023; originally announced May 2023.

  39. arXiv:2305.08709  [pdf, other]

    cs.CL cs.SD eess.AS

    Back Translation for Speech-to-text Translation Without Transcripts

    Authors: Qingkai Fang, Yang Feng

    Abstract: The success of end-to-end speech-to-text translation (ST) is often achieved by utilizing source transcripts, e.g., by pre-training with automatic speech recognition (ASR) and machine translation (MT) tasks, or by introducing additional ASR and MT data. Unfortunately, transcripts are only sometimes available since numerous unwritten languages exist worldwide. In this paper, we aim to utilize large…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: ACL 2023 main conference

    ACM Class: I.2.7

  40. arXiv:2305.08706  [pdf, other]

    cs.CL cs.SD eess.AS

    Understanding and Bridging the Modality Gap for Speech Translation

    Authors: Qingkai Fang, Yang Feng

    Abstract: How to achieve better end-to-end speech translation (ST) by leveraging (text) machine translation (MT) data? Among various existing techniques, multi-task learning is one of the effective ways to share knowledge between ST and MT in which additional MT data can help to learn source-to-target mapping. However, due to the differences between speech and text, there is always a gap between ST and MT.…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: ACL 2023 main conference

    ACM Class: I.2.7

  41. arXiv:2304.10095  [pdf, ps, other]

    cs.IT eess.SP

    Transmit Power Minimization for STAR-RIS Empowered Symbiotic Radio Communications

    Authors: Chao Zhou, Bin Lyu, Youhong Feng, Dinh Thai Hoang

    Abstract: In this paper, we propose a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) empowered transmission scheme for symbiotic radio (SR) systems to make more flexibility for network deployment and enhance system performance. The STAR-RIS is utilized to not only beam the primary signals from the base station (BS) towards multiple primary users on the same side of…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: 32 pages, 12 figures

  42. arXiv:2304.02883  [pdf, other

    eess.IV cs.CV

    GA-HQS: MRI reconstruction via a generically accelerated unfolding approach

    Authors: Jiawei Jiang, Yuchao Feng, Honghui Xu, Wanjun Chen, Jianwei Zheng

    Abstract: Deep unfolding networks (DUNs) are the foremost methods in the realm of compressed sensing MRI, as they can employ learnable networks to facilitate interpretable forward-inference operators. However, several daunting issues still exist, including the heavy dependency on the first-order optimization algorithms, the insufficient information fusion mechanisms, and the limitation of capturing long-ran…

    Submitted 6 April, 2023; originally announced April 2023.

  43. arXiv:2303.12289  [pdf

    cs.LG cs.AI cs.RO eess.SY math.DS math.OC

    Adaptive Road Configurations for Improved Autonomous Vehicle-Pedestrian Interactions using Reinforcement Learning

    Authors: Qiming Ye, Yuxiang Feng, Jose Javier Escribano Macias, Marc Stettler, Panagiotis Angeloudis

    Abstract: The deployment of Autonomous Vehicles (AVs) poses considerable challenges and unique opportunities for the design and management of future urban road infrastructure. In light of this disruptive transformation, the Right-Of-Way (ROW) composition of road space has the potential to be renewed. Design approaches and intelligent control models have been proposed to address this problem, but we lack an…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: 11 pages, 7 figures, Copyright © 2023, IEEE

    Journal ref: IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 2, pp. 2024-2034, Feb. 2023

  44. MR Elastography with Optimization-Based Phase Unwrapping and Traveling Wave Expansion-based Neural Network (TWENN)

    Authors: Shengyuan Ma, Runke Wang, Suhao Qiu, Ruokun Li, Qi Yue, Qingfang Sun, Liang Chen, Fuhua Yan, Guang-Zhong Yang, Yuan Feng

    Abstract: Magnetic Resonance Elastography (MRE) can characterize biomechanical properties of soft tissue for disease diagnosis and treatment planning. However, complicated wavefields acquired from MRE coupled with noise pose challenges for accurate displacement extraction and modulus estimation. Here we propose a pipeline for processing MRE images using optimization-based displacement extraction and Traveli…

    Submitted 4 April, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

  45. arXiv:2301.00066  [pdf, other

    cs.CL eess.AS

    Memory Augmented Lookup Dictionary based Language Modeling for Automatic Speech Recognition

    Authors: Yukun Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang

    Abstract: Recent studies have shown that using an external Language Model (LM) benefits the end-to-end Automatic Speech Recognition (ASR). However, predicting tokens that appear less frequently in the training set is still quite challenging. The long-tail prediction problems have been widely studied in many applications, but only been addressed by a few studies for ASR and LMs. In this paper, we propose a n…

    Submitted 30 December, 2022; originally announced January 2023.

    Comments: Submitted to ICASSP 2023

  46. arXiv:2212.09206  [pdf, other

    eess.IV cs.CV

    Segmentation Ability Map: Interpret deep features for medical image segmentation

    Authors: Sheng He, Yanfang Feng, P. Ellen Grant, Yangming Ou

    Abstract: Deep convolutional neural networks (CNNs) have been widely used for medical image segmentation. In most studies, only the output layer is exploited to compute the final segmentation results and the hidden representations of the deep learned features have not been well understood. In this paper, we propose a prototype segmentation (ProtoSeg) method to compute a binary segmentation map based on deep…

    Submitted 18 December, 2022; originally announced December 2022.

    Journal ref: Medical Image Analysis, 2023

  47. arXiv:2212.00687  [pdf

    eess.IV

    3D-EPI Blip-Up/Down Acquisition (BUDA) with CAIPI and Joint Hankel Structured Low-Rank Reconstruction for Rapid Distortion-Free High-Resolution T2* Mapping

    Authors: Zhifeng Chen, Congyu Liao, Xiaozhi Cao, Benedikt A. Poser, Zhongbiao Xu, Wei-Ching Lo, Manyi Wen, Jaejin Cho, Qiyuan Tian, Yaohui Wang, Yanqiu Feng, Ling Xia, Wufan Chen, Feng Liu, Berkin Bilgic

    Abstract: Purpose: This work aims to develop a novel distortion-free 3D-EPI acquisition and image reconstruction technique for fast and robust, high-resolution, whole-brain imaging as well as quantitative T2* mapping. Methods: 3D-Blip-Up and -Down Acquisition (3D-BUDA) sequence is designed for both single- and multi-echo 3D GRE-EPI imaging using multiple shots with blip-up and -down readouts to encode B0 fi…

    Submitted 1 December, 2022; originally announced December 2022.

  48. arXiv:2211.13621  [pdf

    eess.IV cs.CV cs.LG q-bio.QM

    ACROBAT -- a multi-stain breast cancer histological whole-slide-image data set from routine diagnostics for computational pathology

    Authors: Philippe Weitz, Masi Valkonen, Leslie Solorzano, Circe Carr, Kimmo Kartasalo, Constance Boissin, Sonja Koivukoski, Aino Kuusela, Dusan Rasic, Yanbo Feng, Sandra Kristiane Sinius Pouplier, Abhinav Sharma, Kajsa Ledesma Eriksson, Leena Latonen, Anne-Vibeke Laenkholm, Johan Hartman, Pekka Ruusuvuori, Mattias Rantalainen

    Abstract: The analysis of FFPE tissue sections stained with haematoxylin and eosin (H&E) or immunohistochemistry (IHC) is an essential part of the pathologic assessment of surgically resected breast cancer specimens. IHC staining has been broadly adopted into diagnostic guidelines and routine workflows to manually assess status and scoring of several established biomarkers, including ER, PGR, HER2 and KI67.…

    Submitted 24 November, 2022; originally announced November 2022.

  49. arXiv:2211.02292  [pdf, other

    eess.IV cs.CV

    Boosting Binary Neural Networks via Dynamic Thresholds Learning

    Authors: Jiehua Zhang, Xueyang Zhang, Zhuo Su, Zitong Yu, Yanghe Feng, Xin Lu, Matti Pietikäinen, Li Liu

    Abstract: Developing lightweight Deep Convolutional Neural Networks (DCNNs) and Vision Transformers (ViTs) has become one of the focuses in vision research since the low computational cost is essential for deploying vision models on edge devices. Recently, researchers have explored highly computational efficient Binary Neural Networks (BNNs) by binarizing weights and activations of Full-precision Neural Net…

    Submitted 4 November, 2022; originally announced November 2022.

  50. arXiv:2210.12357  [pdf, other

    cs.CL cs.AI cs.SD eess.AS

    Information-Transport-based Policy for Simultaneous Translation

    Authors: Shaolei Zhang, Yang Feng

    Abstract: Simultaneous translation (ST) outputs translation while receiving the source inputs, and hence requires a policy to determine whether to translate a target token or wait for the next source token. The major challenge of ST is that each target token can only be translated based on the current received source tokens, where the received source information will directly affect the translation quality.…

    Submitted 31 October, 2022; v1 submitted 22 October, 2022; originally announced October 2022.

    Comments: Accept to EMNLP 2022 main conference. 22 pages, 16 figures, 8 tables