Showing 1–50 of 132 results for author: Ye, J

  1. arXiv:2406.07162  [pdf, other]

    cs.SD cs.AI cs.CL cs.MM eess.AS

    EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark

    Authors: Ziyang Ma, Mingjie Chen, Hezhao Zhang, Zhisheng Zheng, Wenxi Chen, Xiquan Li, Jiaxin Ye, Xie Chen, Thomas Hain

    Abstract: Speech emotion recognition (SER) is an important part of human-computer interaction, receiving extensive attention from both industry and academia. However, the current research field of SER has long suffered from the following problems: 1) There are few reasonable and universal splits of the datasets, making comparing different models and methods difficult. 2) No commonly used benchmark covers nu…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024. GitHub Repository: https://github.com/emo-box/EmoBox

  2. arXiv:2405.16011  [pdf, ps, other]

    eess.SP

    Semantic Importance-Aware Communications with Semantic Correction Using Large Language Models

    Authors: Shuaishuai Guo, Yanhu Wang, Jia Ye, Anbang Zhang, Kun Xu

    Abstract: Semantic communications, a promising approach for agent-human and agent-agent interactions, typically operate at a feature level, lacking true semantic understanding. This paper explores understanding-level semantic communications (ULSC), transforming visual data into human-intelligible semantic content. We employ an image caption neural network (ICNN) to derive semantic representations from visua…

    Submitted 24 May, 2024; originally announced May 2024.

  3. arXiv:2404.13605  [pdf, other]

    cs.CV eess.IV

    Turb-Seg-Res: A Segment-then-Restore Pipeline for Dynamic Videos with Atmospheric Turbulence

    Authors: Ripon Kumar Saha, Dehao Qin, Nianyi Li, Jinwei Ye, Suren Jayasuriya

    Abstract: Tackling image degradation due to atmospheric turbulence, particularly in dynamic environments, remains a challenge for long-range imaging systems. Existing techniques have been primarily designed for static scenes or scenes with small motion. This paper presents the first segment-then-restore pipeline for restoring the videos of dynamic scenes in turbulent environments. We leverage mean optical flo…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Paper

  4. arXiv:2403.17324  [pdf, ps, other]

    eess.SP

    Unsupervised Learning for Joint Beamforming Design in RIS-aided ISAC Systems

    Authors: Junjie Ye, Lei Huang, Zhen Chen, Peichang Zhang, Mohamed Rihan

    Abstract: It is critical to design efficient beamforming in reconfigurable intelligent surface (RIS)-aided integrated sensing and communication (ISAC) systems for enhancing spectrum utilization. However, conventional methods often have limitations, either incurring high computational complexity due to iterative algorithms or sacrificing performance when using heuristic methods. To achieve both low complexit…

    Submitted 15 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE Wireless Communications Letters

  5. arXiv:2402.09463  [pdf]

    eess.IV

    Multi-Center Fetal Brain Tissue Annotation (FeTA) Challenge 2022 Results

    Authors: Kelly Payette, Céline Steger, Roxane Licandro, Priscille de Dumast, Hongwei Bran Li, Matthew Barkovich, Liu Li, Maik Dannecker, Chen Chen, Cheng Ouyang, Niccolò McConnell, Alina Miron, Yongmin Li, Alena Uus, Irina Grigorescu, Paula Ramirez Gilliland, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Haoyu Wang, Ziyan Huang, Jin Ye, Mireia Alenyà, Valentin Comte, Oscar Camara, et al. (42 additional authors not shown)

    Abstract: Segmentation is a critical step in analyzing the developing human fetal brain. There have been vast improvements in automatic segmentation methods in the past several years, and the Fetal Brain Tissue Annotation (FeTA) Challenge 2021 helped to establish an excellent standard of fetal brain segmentation. However, FeTA 2021 was a single center study, and the generalizability of algorithms across dif…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Results from FeTA Challenge 2022, held at MICCAI; Manuscript submitted. Supplementary Info (including submission methods descriptions) available here: https://zenodo.org/records/10628648

  6. arXiv:2402.02327  [pdf, other]

    cs.CV cs.SD eess.AS

    Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues

    Authors: Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Le Lu, Jieping Ye, Nenghai Yu

    Abstract: How to effectively interact audio with vision has garnered considerable interest within the multi-modality research field. Recently, a novel audio-visual segmentation (AVS) task has been proposed, aiming to segment the sounding objects in video frames under the guidance of audio cues. However, most existing AVS methods are hindered by a modality imbalance where the visual features tend to dominate…

    Submitted 6 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  7. arXiv:2401.03425  [pdf, other]

    eess.SY math.ST

    Uncertainty Propagation and Bayesian Fusion on Unimodular Lie Groups from a Parametric Perspective

    Authors: Jikai Ye, Gregory S. Chirikjian

    Abstract: We address the problem of uncertainty propagation and Bayesian fusion on unimodular Lie groups. Starting from a stochastic differential equation (SDE) defined on Lie groups via McKean-Gangolli injection, we first convert it to a parametric SDE in exponential coordinates. The coefficient transform method for the conversion is stated for both Ito's and Stratonovich's interpretation of the SDE. Then…

    Submitted 7 January, 2024; originally announced January 2024.

  8. arXiv:2312.15185  [pdf, other]

    cs.CL cs.HC cs.MM cs.SD eess.AS

    emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

    Authors: Ziyang Ma, Zhisheng Zheng, Jiaxin Ye, Jinchao Li, Zhifu Gao, Shiliang Zhang, Xie Chen

    Abstract: We propose emotion2vec, a universal speech emotion representation model. emotion2vec is pre-trained on open-source unlabeled emotion data through self-supervised online distillation, combining utterance-level loss and frame-level loss during pre-training. emotion2vec outperforms state-of-the-art pre-trained universal models and emotion specialist models by only training linear layers for the speec…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: Code, checkpoints, and extracted features are available at https://github.com/ddlBoJack/emotion2vec

  9. arXiv:2312.09576  [pdf, other]

    eess.IV cs.CV

    SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma

    Authors: Xiangde Luo, Jia Fu, Yunxin Zhong, Shuolin Liu, Bing Han, Mehdi Astaraki, Simone Bendazzoli, Iuliana Toma-Dasu, Yiwen Ye, Ziyang Chen, Yong Xia, Yanzhou Su, Jin Ye, Junjun He, Zhaohu Xing, Hongqiu Wang, Lei Zhu, Kaixiang Yang, Xin Fang, Zhiwei Wang, Chan Woong Lee, Sang Joon Park, Jaehee Chun, Constantin Ulrich, Klaus H. Maier-Hein, et al. (17 additional authors not shown)

    Abstract: Radiation therapy is a primary and effective NasoPharyngeal Carcinoma (NPC) treatment strategy. The precise delineation of Gross Tumor Volumes (GTVs) and Organs-At-Risk (OARs) is crucial in radiation treatment, directly impacting patient prognosis. Previously, the delineation of GTVs and OARs was performed by experienced radiation oncologists. Recently, deep learning has achieved promising results…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: A challenge report of SegRap2023 (organized in conjunction with MICCAI2023)

  10. arXiv:2312.03348  [pdf, other]

    eess.SY cs.RO

    Uncertainty Propagation on Unimodular Matrix Lie Groups

    Authors: Jikai Ye, Amitesh S. Jayaraman, Gregory S. Chirikjian

    Abstract: This paper addresses uncertainty propagation on unimodular matrix Lie groups that have a surjective exponential map. We derive the exact formula for the propagation of mean and covariance in a continuous-time setting from the governing Fokker-Planck equation. Two approximate propagation methods are discussed based on the exact formula. One uses numerical quadrature and another utilizes the expansi…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 37 pages

  11. arXiv:2312.03013  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Breast Ultrasound Report Generation using LangChain

    Authors: Jaeyoung Huh, Hyun Jeong Park, Jong Chul Ye

    Abstract: Breast ultrasound (BUS) is a critical diagnostic tool in the field of breast imaging, aiding in the early detection and characterization of breast abnormalities. Interpreting breast ultrasound images commonly involves creating comprehensive medical reports, containing vital information to promptly assess the patient's condition. However, the ultrasound imaging system necessitates capturing multipl…

    Submitted 4 December, 2023; originally announced December 2023.

  12. arXiv:2311.16433  [pdf, ps, other]

    eess.SP

    Energy Efficiency Optimization in Active Reconfigurable Intelligent Surface-Aided Integrated Sensing and Communication Systems

    Authors: Junjie Ye, Mohamed Rihan, Peichang Zhang, Lei Huang, Stefano Buzzi, Zhen Chen

    Abstract: Energy efficiency (EE) is a challenging task in integrated sensing and communication (ISAC) systems, where high spectral efficiency and low energy consumption appear as conflicting requirements. Although passive reconfigurable intelligent surface (RIS) has emerged as a promising technology for enhancing the EE of the ISAC system, the multiplicative fading feature hinders its effectiveness. This pa…

    Submitted 27 November, 2023; originally announced November 2023.

  13. arXiv:2311.11969  [pdf, other]

    eess.IV cs.CV

    SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks

    Authors: Jin Ye, Junlong Cheng, Jianpin Chen, Zhongying Deng, Tianbin Li, Haoyu Wang, Yanzhou Su, Ziyan Huang, Jilong Chen, Lei Jiang, Hui Sun, Min Zhu, Shaoting Zhang, Junjun He, Yu Qiao

    Abstract: Segment Anything Model (SAM) has achieved impressive results for natural image segmentation with input prompts such as points and bounding boxes. Its success largely owes to massive labeled training data. However, directly applying SAM to medical image segmentation cannot perform well because SAM lacks medical knowledge -- it does not use medical images for training. To incorporate medical knowled…

    Submitted 20 November, 2023; originally announced November 2023.

  14. arXiv:2311.07033  [pdf, other]

    eess.IV cs.CV

    TTMFN: Two-stream Transformer-based Multimodal Fusion Network for Survival Prediction

    Authors: Ruiquan Ge, Xiangyang Hu, Rungen Huang, Gangyong Jia, Yaqi Wang, Renshu Gu, Changmiao Wang, Elazab Ahmed, Linyan Wang, Juan Ye, Ye Li

    Abstract: Survival prediction plays a crucial role in assisting clinicians with the development of cancer treatment protocols. Recent evidence shows that multimodal data can help in the diagnosis of cancer disease and improve survival prediction. Currently, deep learning-based approaches have experienced increasing success in survival prediction by integrating pathological images and gene expression data. H…

    Submitted 12 November, 2023; originally announced November 2023.

  15. arXiv:2311.01908  [pdf, other]

    eess.IV cs.CV

    LLM-driven Multimodal Target Volume Contouring in Radiation Oncology

    Authors: Yujin Oh, Sangjoon Park, Hwa Kyung Byun, Yeona Cho, Ik Jae Lee, Jin Sung Kim, Jong Chul Ye

    Abstract: Target volume contouring for radiation therapy is considered significantly more challenging than the normal organ segmentation tasks as it necessitates the utilization of both image and text-based clinical information. Inspired by the recent advancement of large language models (LLMs) that can facilitate the integration of the textual information and images, here we present a novel LLM-driven mul…

    Submitted 15 April, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

  16. arXiv:2311.00483  [pdf, other]

    eess.IV cs.CV

    DEFN: Dual-Encoder Fourier Group Harmonics Network for Three-Dimensional Indistinct-Boundary Object Segmentation

    Authors: Xiaohua Jiang, Yihao Guo, Jian Huang, Yuting Wu, Meiyi Luo, Zhaoyang Xu, Qianni Zhang, Xingru Huang, Hong He, Shaowei Jiang, Jing Ye, Mang Xiao

    Abstract: The precise spatial and quantitative delineation of indistinct-boundary medical objects is paramount for the accuracy of diagnostic protocols, efficacy of surgical interventions, and reliability of postoperative assessments. Despite their significance, the effective segmentation and instantaneous three-dimensional reconstruction are significantly impeded by the paucity of representative samples in…

    Submitted 19 June, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 36 pages, 16 figures, 7 tables

    MSC Class: 68; 92

    ACM Class: I.4; J.3

  17. arXiv:2309.12688  [pdf, ps, other]

    cs.IT eess.SP

    Green Holographic MIMO Communications With A Few Transmit Radio Frequency Chains

    Authors: Shuaishuai Guo, Jia Ye, Kaiqian Qu, Shuping Dang

    Abstract: Holographic multiple-input multiple-output (MIMO) communications are widely recognized as a promising candidate for the next-generation air interface. With holographic MIMO surface, the number of the spatial degrees-of-freedom (DoFs) considerably increases and also significantly varies as the user moves. To fully employ the large and varying number of spatial DoFs, the number of equipped RF chains…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: 10 figures; has been accepted by TGCN

  18. arXiv:2309.03906  [pdf, other]

    eess.IV cs.CV

    A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal Multi-Organ Segmentation

    Authors: Ziyan Huang, Zhongying Deng, Jin Ye, Haoyu Wang, Yanzhou Su, Tianbin Li, Hui Sun, Junlong Cheng, Jianpin Chen, Junjun He, Yun Gu, Shaoting Zhang, Lixu Gu, Yu Qiao

    Abstract: Although deep learning has revolutionized abdominal multi-organ segmentation, models often struggle with generalization due to training on small, specific datasets. With the recent emergence of large-scale datasets, some important questions arise: Can models trained on these datasets generalize well on different ones? If yes/no, how to further improve their generalizability? To address t…

    Submitted 7 September, 2023; originally announced September 2023.

  19. arXiv:2309.03112  [pdf, other]

    eess.SY

    A Lie-Theoretic Approach to Propagating Uncertainty Jointly in Attitude and Angular Momentum

    Authors: Amitesh S. Jayaraman, Jikai Ye, Gregory S. Chirikjian

    Abstract: Dynamic state estimation, as opposed to kinematic state estimation, seeks to estimate not only the orientation of a rigid body but also its angular velocity, through Euler's equations of rotational motion. This paper demonstrates that the dynamic state estimation problem can be reformulated as estimating a probability distribution on a Lie group defined on phase space (the product space of rotatio…

    Submitted 6 September, 2023; originally announced September 2023.

    Comments: 8 pages, 4 figures

  20. arXiv:2308.12859  [pdf, ps, other]

    cs.SD cs.LG eess.AS stat.ME

    Towards Automated Animal Density Estimation with Acoustic Spatial Capture-Recapture

    Authors: Yuheng Wang, Juan Ye, David L. Borchers

    Abstract: Passive acoustic monitoring can be an effective way of monitoring wildlife populations that are acoustically active but difficult to survey visually. Digital recorders allow surveyors to gather large volumes of data at low cost, but identifying target species vocalisations in these data is non-trivial. Machine learning (ML) methods are often used to do the identification. They can process large vo…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: 35 pages, 5 figures

  21. arXiv:2308.06533  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Knowledge Distilled Ensemble Model for sEMG-based Silent Speech Interface

    Authors: Wenqiang Lai, Qihan Yang, Ye Mao, Endong Sun, Jiangnan Ye

    Abstract: Voice disorders affect millions of people worldwide. Surface electromyography-based Silent Speech Interfaces (sEMG-based SSIs) have been explored as a potential solution for decades. However, previous works were limited by small vocabularies and manually extracted features from raw data. To address these limitations, we propose a lightweight deep learning knowledge-distilled ensemble model for sEM…

    Submitted 6 August, 2023; originally announced August 2023.

    Comments: 6 pages, 5 figures

  22. arXiv:2308.02190  [pdf, other]

    cs.SD cs.CL eess.AS

    Emo-DNA: Emotion Decoupling and Alignment Learning for Cross-Corpus Speech Emotion Recognition

    Authors: Jiaxin Ye, Yujie Wei, Xin-Cheng Wen, Chenglong Ma, Zhizhong Huang, Kunhong Liu, Hongming Shan

    Abstract: Cross-corpus speech emotion recognition (SER) seeks to generalize the ability of inferring speech emotion from a well-labeled corpus to an unlabeled one, which is a rather challenging task due to the significant discrepancy between two corpora. Existing methods, typically based on unsupervised domain adaptation (UDA), struggle to learn corpus-invariant features by global distribution alignment, bu…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  23. arXiv:2308.00253  [pdf, ps, other]

    eess.SP

    Privacy and Security in Ubiquitous Integrated Sensing and Communication: Threats, Challenges and Future Directions

    Authors: Kaiqian Qu, Jia Ye, Xuran Li, Shuaishuai Guo

    Abstract: Integrated sensing and communication (ISAC) technology is one of the featured technologies of next-generation communication systems. When sensing capability becomes ubiquitous, more information can be collected, which can facilitate many applications in intelligent transportation, unmanned aerial vehicle (UAV) surveillance and healthcare. However, it also faces many information privacy leakag…

    Submitted 13 May, 2024; v1 submitted 31 July, 2023; originally announced August 2023.

    Comments: to appear in IOTMAG

  24. arXiv:2308.00252  [pdf, ps, other]

    eess.SP

    Near-Field Integrated Sensing and Communications: Unlocking Potentials and Shaping the Future

    Authors: Kaiqian Qu, Shuaishuai Guo, Jia Ye, Nasir Saeed

    Abstract: The sixth generation (6G) communication networks are featured by integrated sensing and communications (ISAC), revolutionizing base stations (BSs) and terminals. Additionally, in the unfolding 6G landscape, a pivotal physical layer technology, the Extremely Large-Scale Antenna Array (ELAA), assumes center stage. With its expansive coverage of the near-field region, ELAA's electromagnetic (EM) wave…

    Submitted 5 August, 2023; v1 submitted 31 July, 2023; originally announced August 2023.

    Comments: under review

  25. arXiv:2308.00193  [pdf, other]

    eess.IV cs.CV cs.LG

    C-DARL: Contrastive diffusion adversarial representation learning for label-free blood vessel segmentation

    Authors: Boah Kim, Yujin Oh, Bradford J. Wood, Ronald M. Summers, Jong Chul Ye

    Abstract: Blood vessel segmentation in medical imaging is one of the essential steps for vascular disease diagnosis and interventional planning in a broad spectrum of clinical scenarios in image-based medicine and interventional medicine. Unfortunately, manual annotation of the vessel masks is challenging and resource-intensive due to subtle branches and complex structures. To overcome this issue, this pape…

    Submitted 31 July, 2023; originally announced August 2023.

  26. arXiv:2307.15208  [pdf, other]

    eess.IV cs.CV

    Generative AI for Medical Imaging: extending the MONAI Framework

    Authors: Walter H. L. Pinaya, Mark S. Graham, Eric Kerfoot, Petru-Daniel Tudosiu, Jessica Dafflon, Virginia Fernandez, Pedro Sanchez, Julia Wolleb, Pedro F. da Costa, Ashay Patel, Hyungjin Chung, Can Zhao, Wei Peng, Zelong Liu, Xueyan Mei, Oeslle Lucena, Jong Chul Ye, Sotirios A. Tsaftaris, Prerna Dogra, Andrew Feng, Marc Modat, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

    Abstract: Recent advances in generative AI have brought incredible breakthroughs in several areas, including medical imaging. These generative models have tremendous potential not only to help safely share medical data via synthetic datasets but also to perform an array of diverse applications, such as anomaly detection, image-to-image translation, denoising, and MRI reconstruction. However, due to the comp…

    Submitted 27 July, 2023; originally announced July 2023.

  27. arXiv:2307.14262  [pdf, other]

    eess.IV cs.CV

    Artifact Restoration in Histology Images with Diffusion Probabilistic Models

    Authors: Zhenqi He, Junjun He, Jin Ye, Yiqing Shen

    Abstract: Histological whole slide images (WSIs) are often compromised by artifacts, such as tissue folding and bubbles, which increase the examination difficulty for both pathologists and Computer-Aided Diagnosis (CAD) systems. Existing approaches to restoring artifact images are confined to Generative Adversarial Networks (GANs), where the restoration process is formulated as an image-to-image t…

    Submitted 26 July, 2023; originally announced July 2023.

    Comments: Accepted by MICCAI2023

  28. arXiv:2307.01824  [pdf]

    eess.IV physics.med-ph

    Multi-Channel Feature Extraction for Virtual Histological Staining of Photon Absorption Remote Sensing Images

    Authors: Marian Boktor, James E. D. Tweel, Benjamin R. Ecclestone, Jennifer Ai Ye, Paul Fieguth, Parsin Haji Reza

    Abstract: Accurate and fast histological staining is crucial in histopathology, impacting diagnostic precision and reliability. Traditional staining methods are time-consuming and subjective, causing delays in diagnosis. Digital pathology plays a vital role in advancing and optimizing histology processes to improve efficiency and reduce turnaround times. This study introduces a novel deep learning-based fra…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: 20 pages, 6 figures

  29. arXiv:2306.04339  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG physics.med-ph

    Unpaired Deep Learning for Pharmacokinetic Parameter Estimation from Dynamic Contrast-Enhanced MRI

    Authors: Gyutaek Oh, Won-Jin Moon, Jong Chul Ye

    Abstract: DCE-MRI provides information about vascular permeability and tissue perfusion through the acquisition of pharmacokinetic parameters. However, traditional methods for estimating these pharmacokinetic parameters involve fitting tracer kinetic models, which often suffer from computational complexity and low accuracy due to noisy arterial input function (AIF) measurements. Although some deep learning…

    Submitted 7 June, 2023; originally announced June 2023.

  30. arXiv:2305.16482  [pdf, other]

    eess.IV

    Score-based Diffusion Models for Bayesian Image Reconstruction

    Authors: Michael T. McCann, Hyungjin Chung, Jong Chul Ye, Marc L. Klasky

    Abstract: This paper explores the use of score-based diffusion models for Bayesian image reconstruction. Diffusion models are an efficient tool for generative modeling. Diffusion models can also be used for solving image reconstruction problems. We present a simple and flexible algorithm for training a diffusion model and using it for maximum a posteriori reconstruction, minimum mean square error reconstruc…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: 5 pages, 3 figures

  31. arXiv:2305.01844  [pdf, other]

    cs.CV eess.IV

    Bio-Inspired Simple Neural Network for Low-Light Image Restoration: A Minimalist Approach

    Authors: Junjie Ye, Jilin Zhao

    Abstract: In this study, we explore the potential of using a straightforward neural network inspired by the retina model to efficiently restore low-light images. The retina model imitates the neurophysiological principles and dynamics of various optical neurons. Our proposed neural network model reduces the computational overhead compared to traditional signal-processing models while achieving results simil…

    Submitted 2 May, 2023; originally announced May 2023.

  32. arXiv:2304.11039  [pdf, other]

    cs.IT eess.SP

    An Optimization Framework For Anomaly Detection Scores Refinement With Side Information

    Authors: Ali Maatouk, Fadhel Ayed, Wenjie Li, Yu Wang, Hong Zhu, Jiantao Ye

    Abstract: This paper considers an anomaly detection problem in which a detection algorithm assigns anomaly scores to multi-dimensional data points, such as cellular networks' Key Performance Indicators (KPIs). We propose an optimization framework to refine these anomaly scores by leveraging side information in the form of a causality graph between the various features of the data points. The refinement bloc…

    Submitted 30 August, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

  33. arXiv:2304.09728  [pdf, other]

    cs.CV eess.IV

    Any-to-Any Style Transfer: Making Picasso and Da Vinci Collaborate

    Authors: Songhua Liu, Jingwen Ye, Xinchao Wang

    Abstract: Style transfer aims to render the style of a given image for style reference to another given image for content reference, and has been widely adopted in artistic generation and image editing. Existing approaches either apply the holistic style of the style image in a global manner, or migrate local colors and textures of the style image to the content counterparts in a pre-defined way. In either…

    Submitted 20 April, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: Work in progress

  34. arXiv:2303.08440  [pdf, other]

    eess.IV cs.CV cs.LG

    Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models

    Authors: Suhyeon Lee, Hyungjin Chung, Minyoung Park, Jonghyuk Park, Wi-Sun Ryu, Jong Chul Ye

    Abstract: Diffusion models have become a popular approach for image generation and reconstruction due to their numerous advantages. However, most diffusion-based inverse problem-solving methods only deal with 2D images, and even recently published 3D methods do not fully exploit the 3D distribution prior. To address this, we propose a novel approach using two perpendicular pre-trained 2D diffusion models to…

    Submitted 1 September, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: ICCV23 poster. 15 pages, 9 figures

  35. arXiv:2303.00091  [pdf, other]

    eess.AS cs.AI cs.CL cs.CV cs.SD eess.IV

    Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model

    Authors: Jaeyoung Huh, Sangjoon Park, Jeong Eun Lee, Jong Chul Ye

    Abstract: Automatic Speech Recognition (ASR) is a technology that converts spoken words into text, facilitating interaction between humans and machines. One of the most common applications of ASR is Speech-To-Text (STT) technology, which simplifies user workflows by transcribing spoken words into text. In the medical field, STT has the potential to significantly reduce the workload of clinicians who rely on…

    Submitted 27 February, 2023; originally announced March 2023.

  36. arXiv:2301.03027  [pdf, other]

    eess.IV cs.CV cs.LG

    Annealed Score-Based Diffusion Model for MR Motion Artifact Reduction

    Authors: Gyutaek Oh, Jeong Eun Lee, Jong Chul Ye

    Abstract: Motion artifact reduction is one of the important research topics in MR imaging, as the motion artifact degrades image quality and makes diagnosis difficult. Recently, many deep learning approaches have been studied for motion artifact reduction. Unfortunately, most existing models are trained in a supervised manner, requiring paired motion-corrupted and motion-free images, or are based on a stric…

    Submitted 8 January, 2023; originally announced January 2023.

  37. arXiv:2301.00406  [pdf, other]

    cs.CV eess.IV

    Curvature regularization for Non-line-of-sight Imaging from Under-sampled Data

    Authors: Rui Ding, Juntian Ye, Qifeng Gao, Feihu Xu, Yuping Duan

    Abstract: Non-line-of-sight (NLOS) imaging aims to reconstruct the three-dimensional hidden scenes from the data measured in the line-of-sight, which uses photon time-of-flight information encoded in light after multiple diffuse reflections. The under-sampled scanning data can facilitate fast imaging. However, the resulting reconstruction problem becomes a serious ill-posed inverse problem, the solution of…

    Submitted 6 March, 2024; v1 submitted 1 January, 2023; originally announced January 2023.

  38. Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition

    Authors: Jiaxin Ye, Xin-cheng Wen, Yujie Wei, Yong Xu, Kunhong Liu, Hongming Shan

    Abstract: Speech emotion recognition (SER) plays a vital role in improving the interactions between humans and machines by inferring human emotion and affective states from speech signals. Whereas recent works primarily focus on mining spatiotemporal information from hand-crafted features, we explore how to model the temporal patterns of speech emotions from dynamic temporal scales. Towards that goal, we in…

    Submitted 14 August, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: ICASSP 2023

    Journal ref: IEEE ICASSP 2023

  39. arXiv:2211.05963  [pdf, other]

    cs.CV eess.IV

    JSRNN: Joint Sampling and Reconstruction Neural Networks for High Quality Image Compressed Sensing

    Authors: Chunyan Zeng, Jiaxiang Ye, Zhifeng Wang, Nan Zhao, Minghu Wu

    Abstract: Most Deep Learning (DL) based Compressed Sensing (DCS) algorithms adopt a single neural network for signal reconstruction, and fail to jointly consider the influences of the sampling operation for reconstruction. In this paper, we propose a unified framework, which jointly considers the sampling and reconstruction process for image compressive sensing based on well-designed cascade neural networks.…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: 9 pages, 3 figures

  40. arXiv:2210.15834  [pdf, other]

    cs.SD cs.AI cs.HC eess.AS

    GM-TCNet: Gated Multi-scale Temporal Convolutional Network using Emotion Causality for Speech Emotion Recognition

    Authors: Jia-Xin Ye, Xin-Cheng Wen, Xuan-Ze Wang, Yong Xu, Yan Luo, Chang-Li Wu, Li-Yan Chen, Kun-Hong Liu

    Abstract: In human-computer interaction, Speech Emotion Recognition (SER) plays an essential role in understanding the user's intent and improving the interactive experience. While similar sentimental speeches own diverse speaker characteristics but share common antecedents and consequences, an essential challenge for SER is how to produce robust and discriminative representations through causality between…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: The source code is available at: https://github.com/Jiaxin-Ye/GM-TCNet

    Journal ref: Speech Communication, 145, November 2022, 21-35

  41. arXiv:2210.07490  [pdf, other]

    eess.IV cs.CV

    Exploring Vanilla U-Net for Lesion Segmentation from Whole-body FDG-PET/CT Scans

    Authors: Jin Ye, Haoyu Wang, Ziyan Huang, Zhongying Deng, Yanzhou Su, Can Tu, Qian Wu, Yuncheng Yang, Meng Wei, Jingqi Niu, Junjun He

    Abstract: Tumor lesion segmentation is one of the most important tasks in medical image analysis. In clinical practice, Fluorodeoxyglucose Positron-Emission Tomography (FDG-PET) is a widely used technique to identify and quantify metabolically active tumors. However, since FDG-PET scans only provide metabolic information, healthy tissue or benign disease with irregular glucose consumption may be mistaken fo…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: autoPET 2022, MICCAI 2022 challenge, champion

  42. arXiv:2209.14566  [pdf, other]

    eess.IV cs.CV cs.LG

    Diffusion Adversarial Representation Learning for Self-supervised Vessel Segmentation

    Authors: Boah Kim, Yujin Oh, Jong Chul Ye

    Abstract: Vessel segmentation in medical images is one of the important tasks in the diagnosis of vascular diseases and therapy planning. Although learning-based segmentation approaches have been extensively studied, a large amount of ground-truth labels are required in supervised methods and confusing background structures make neural networks hard to segment vessels in an unsupervised manner. To address t…

    Submitted 15 February, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: Accepted at ICLR 2023

  43. arXiv:2209.02247  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    An evaluation of U-Net in Renal Structure Segmentation

    Authors: Haoyu Wang, Ziyan Huang, Jin Ye, Can Tu, Yuncheng Yang, Shiyi Du, Zhongying Deng, Chenglong Ma, Jingqi Niu, Junjun He

    Abstract: Renal structure segmentation from computed tomography angiography (CTA) is essential for many computer-assisted renal cancer treatment applications. The Kidney PArsing (KiPA 2022) Challenge aims to build a fine-grained multi-structure dataset and improve the segmentation of multiple renal structures. Recently, U-Net has dominated medical image segmentation. In the KiPA challenge, we evaluated seve…

    Submitted 6 September, 2022; originally announced September 2022.

  44. arXiv:2208.05140  [pdf, other]

    eess.IV cs.CL cs.CV cs.LG

    Self-supervised Multi-modal Training from Uncurated Image and Reports Enables Zero-shot Oversight Artificial Intelligence in Radiology

    Authors: Sangjoon Park, Eun Sun Lee, Kyung Sook Shin, Jeong Eun Lee, Jong Chul Ye

    Abstract: Oversight AI is an emerging concept in radiology where the AI forms a symbiosis with radiologists by continuously supporting radiologists in their decision-making. Recent advances in vision-language models shed light on the long-standing problems of oversight AI by understanding both visual and textual concepts and their semantic correspondences. However, there have been limited success…

    Submitted 12 April, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

  45. arXiv:2207.02377  [pdf, other]

    eess.IV cs.CV

    Patch-wise Deep Metric Learning for Unsupervised Low-Dose CT Denoising

    Authors: Chanyong Jung, Joonhyung Lee, Sunkyoung You, Jong Chul Ye

    Abstract: The acquisition conditions for low-dose and high-dose CT images are usually different, so shifts in the CT numbers often occur. Accordingly, unsupervised deep learning-based approaches, which learn the target image distribution, often introduce CT number distortions and result in detrimental effects in diagnostic performance. To address this, here we propose a novel unsupervised learning…

    Submitted 13 July, 2022; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: MICCAI 2022

  46. arXiv:2206.13295  [pdf, other]

    eess.IV cs.CV cs.LG

    Diffusion Deformable Model for 4D Temporal Medical Image Generation

    Authors: Boah Kim, Jong Chul Ye

    Abstract: Temporal volume images with 3D+t (4D) information are often used in medical imaging to statistically analyze temporal dynamics or capture disease progression. Although deep-learning-based generative models for natural images have been extensively studied, approaches for temporal medical image generation such as 4D cardiac volume data are limited. In this work, we present a novel deep learning mode…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted for MICCAI 2022

  47. arXiv:2203.12621  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    MR Image Denoising and Super-Resolution Using Regularized Reverse Diffusion

    Authors: Hyungjin Chung, Eun Sun Lee, Jong Chul Ye

    Abstract: Patient scans from MRI often suffer from noise, which hampers the diagnostic capability of such images. As a method to mitigate such artifacts, denoising is widely studied both within the medical imaging community and beyond it as a general subject. However, recent deep neural network-based approaches mostly rely on the minimum mean squared error (MMSE) estimates, which tend to produce…

    Submitted 23 March, 2022; originally announced March 2022.

  48. arXiv:2202.10729  [pdf, other]

    cs.SD cs.CL eess.AS

    Improving Cross-lingual Speech Synthesis with Triplet Training Scheme

    Authors: Jianhao Ye, Hongbin Zhou, Zhiba Su, Wendi He, Kaimeng Ren, Lin Li, Heng Lu

    Abstract: Recent advances in cross-lingual text-to-speech (TTS) made it possible to synthesize speech in a language foreign to a monolingual speaker. However, there is still a large gap between the pronunciation of generated cross-lingual speech and that of native speakers in terms of naturalness and intelligibility. In this paper, a triplet training scheme is proposed to enhance the cross-lingual pronuncia…

    Submitted 22 February, 2022; originally announced February 2022.

  49. arXiv:2202.08510  [pdf, other]

    eess.IV cs.CV cs.LG

    Multi-Scale Hybrid Vision Transformer for Learning Gastric Histology: AI-Based Decision Support System for Gastric Cancer Treatment

    Authors: Yujin Oh, Go Eun Bae, Kyung-Hee Kim, Min-Kyung Yeo, Jong Chul Ye

    Abstract: Gastric endoscopic screening is an effective way to decide appropriate gastric cancer (GC) treatment at an early stage, reducing GC-associated mortality rate. Although artificial intelligence (AI) has brought great promise in assisting pathologists to screen digitalized whole slide images, existing AI systems are limited in fine-grained cancer subclassifications and have little usability in planning…

    Submitted 15 August, 2023; v1 submitted 17 February, 2022; originally announced February 2022.

    Journal ref: Published in: IEEE Journal of Biomedical and Health Informatics (Volume: 27, Issue: 8, August 2023)

  50. arXiv:2202.08262  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Phase Aberration Robust Beamformer for Planewave US Using Self-Supervised Learning

    Authors: Shujaat Khan, Jaeyoung Huh, Jong Chul Ye

    Abstract: Ultrasound (US) is widely used for clinical imaging applications thanks to its real-time and non-invasive nature. However, its lesion detectability is often limited in many applications due to the phase aberration artefact caused by variations in the speed of sound (SoS) within body parts. To address this, here we propose a novel self-supervised 3D CNN that enables phase aberration robust plane-wa…

    Submitted 16 February, 2022; originally announced February 2022.

    Comments: 10 pages, 12 figures, submitted to IEEE-TMI