Showing 1–29 of 29 results for author: Song, T

Searching in archive eess.
  1. arXiv:2406.05472  [pdf, other]

    cs.CR eess.SY

    A Novel Generative AI-Based Framework for Anomaly Detection in Multicast Messages in Smart Grid Communications

    Authors: Aydin Zaboli, Seong Lok Choi, Tai-Jin Song, Junho Hong

    Abstract: Cybersecurity breaches in digital substations can pose significant challenges to the stability and reliability of power system operations. To address these challenges, defense and mitigation techniques are required. Identifying and detecting anomalies in information and communication technology (ICT) is crucial to ensure secure device interactions within digital substations. This paper proposes a…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 10 pages, 10 figures, Submitted to IEEE Transactions on Information Forensics and Security

  2. arXiv:2403.13254  [pdf, other]

    cs.SD eess.AS

    Onset and offset weighted loss function for sound event detection

    Authors: Tao Song

    Abstract: In a typical sound event detection (SED) system, the existence of a sound event is detected at a frame level, and consecutive frames with the same event detected are combined as one sound event. The median filter is applied as a post-processing step to remove detection errors as much as possible. However, detection errors occurring around the onset and offset of a sound event are beyond the capaci…

    Submitted 19 March, 2024; originally announced March 2024.

  3. arXiv:2403.13252  [pdf, other]

    cs.SD eess.AS

    Frequency-aware convolution for sound event detection

    Authors: Tao Song

    Abstract: In sound event detection (SED), convolution neural networks (CNNs) are widely used to extract time-frequency patterns from the input spectrogram. However, features extracted by CNN can be insensitive to the shift of time-frequency patterns along the frequency axis. To address this issue, frequency dynamic convolution (FDY) has been proposed, which applies different kernels to different frequency c…

    Submitted 19 March, 2024; originally announced March 2024.

  4. arXiv:2402.18923  [pdf, other]

    cs.CL cs.SD eess.AS

    Inappropriate Pause Detection In Dysarthric Speech Using Large-Scale Speech Recognition

    Authors: Jeehyun Lee, Yerin Choi, Tae-Jin Song, Myoung-Wan Koo

    Abstract: Dysarthria, a common issue among stroke patients, severely impacts speech intelligibility. Inappropriate pauses are crucial indicators in severity assessment and speech-language therapy. We propose to extend a large-scale speech recognition model for inappropriate pause detection in dysarthric speech. To this end, we propose task design, labeling strategy, and a speech recognition model with an in…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024

  5. arXiv:2311.08024  [pdf, other]

    eess.IV cs.CV cs.LG

    MD-IQA: Learning Multi-scale Distributed Image Quality Assessment with Semi Supervised Learning for Low Dose CT

    Authors: Tao Song, Ruizhi Hou, Lisong Dai, Lei Xiang

    Abstract: Image quality assessment (IQA) plays a critical role in optimizing radiation dose and developing novel medical imaging techniques in computed tomography (CT). Traditional IQA methods relying on hand-crafted features have limitations in summarizing the subjective perceptual experience of image quality. Recent deep learning-based approaches have demonstrated strong modeling capabilities and potentia…

    Submitted 14 November, 2023; originally announced November 2023.

  6. arXiv:2311.05462  [pdf, other]

    cs.CR eess.SY

    ChatGPT and Other Large Language Models for Cybersecurity of Smart Grid Applications

    Authors: Aydin Zaboli, Seong Lok Choi, Tai-Jin Song, Junho Hong

    Abstract: Cybersecurity breaches targeting electrical substations constitute a significant threat to the integrity of the power grid, necessitating comprehensive defense and mitigation strategies. Any anomaly in information and communication technology (ICT) should be detected for secure communications between devices in digital substations. This paper proposes large language models (LLM), e.g., ChatGPT, fo…

    Submitted 25 February, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: 5 pages, 2 figures, Accepted, 2024 IEEE Power & Energy Society General Meeting (PESGM), Seattle, WA, USA

  7. arXiv:2310.19202  [pdf]

    q-bio.QM cs.LG eess.SP

    Improved Motor Imagery Classification Using Adaptive Spatial Filters Based on Particle Swarm Optimization Algorithm

    Authors: Xiong Xiong, Ying Wang, Tianyuan Song, Jinguo Huang, Guixia Kang

    Abstract: As a typical self-paced brain-computer interface (BCI) system, the motor imagery (MI) BCI has been widely applied in fields such as robot control, stroke rehabilitation, and assistance for patients with stroke or spinal cord injury. Many studies have focused on the traditional spatial filters obtained through the common spatial pattern (CSP) method. However, the CSP method can only obtain fixed sp…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: 25 pages, 8 figures

  8. arXiv:2309.11715   

    cs.CV eess.IV

    Deshadow-Anything: When Segment Anything Model Meets Zero-shot shadow removal

    Authors: Xiao Feng Zhang, Tian Yi Song, Jia Wei Yao

    Abstract: Segment Anything (SAM), an advanced universal image segmentation model trained on an expansive visual dataset, has set a new benchmark in image segmentation and computer vision. However, it faced challenges when it came to distinguishing between shadows and their backgrounds. To address this, we developed Deshadow-Anything, considering the generalization of large-scale datasets, and we performed F…

    Submitted 2 January, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: it needs revised

  9. arXiv:2307.00969  [pdf, other]

    cs.NI eess.SY

    High Altitude Platform Stations: the New Network Energy Efficiency Enabler in the 6G Era

    Authors: Tailai Song, David Lopez, Michela Meo, Nicola Piovesan, Daniela Renga

    Abstract: The rapidly evolving communication landscape, with the advent of 6G technology, brings new challenges to the design and operation of wireless networks. One of the key concerns is the energy efficiency of the Radio Access Network (RAN), as the exponential growth in wireless traffic demands increasingly higher energy consumption. In this paper, we assess the potential of integrating a High Altitude…

    Submitted 3 July, 2023; originally announced July 2023.

  10. arXiv:2305.17860  [pdf, other]

    cs.SD eess.AS

    speech and noise dual-stream spectrogram refine network with speech distortion loss for robust speech recognition

    Authors: Haoyu Lu, Nan Li, Tongtong Song, Longbiao Wang, Jianwu Dang, Xiaobao Wang, Shiliang Zhang

    Abstract: In recent years, the joint training of speech enhancement front-end and automatic speech recognition (ASR) back-end has been widely used to improve the robustness of ASR systems. Traditional joint training methods only use enhanced speech as input for the backend. However, it is difficult for speech enhancement systems to directly separate speech from input due to the diverse types of noise with d…

    Submitted 30 May, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

  11. arXiv:2305.16789  [pdf, other]

    cs.LG cs.CV eess.SP

    Modulate Your Spectrum in Self-Supervised Learning

    Authors: Xi Weng, Yunhao Ni, Tengwei Song, Jie Luo, Rao Muhammad Anwer, Salman Khan, Fahad Shahbaz Khan, Lei Huang

    Abstract: Whitening loss offers a theoretical guarantee against feature collapse in self-supervised learning (SSL) with joint embedding architectures. Typically, it involves a hard whitening approach, transforming the embedding and applying loss to the whitened output. In this work, we introduce Spectral Transformation (ST), a framework to modulate the spectrum of embedding and to seek for functions beyond…

    Submitted 21 January, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted at ICLR 2024. The code is available at https://github.com/winci-ai/intl

  12. arXiv:2211.08530  [pdf, ps, other]

    eess.SY

    Cyber-Attack Event Analysis for EV Charging Stations

    Authors: Mansi Girdhar, Junho Hong, Yongsik You, Tai-Jin Song, Manimaran Govindarasu

    Abstract: Safe and secure electric vehicle charging stations (EVCSs) are important in smart transportation infrastructure. The prevalence of EVCSs has rapidly increased over time in response to the rising demand for EV charging. However, developments in information and communication technologies (ICT) have made the cyber-physical system (CPS) of EVCSs susceptible to cyber-attacks, which might destabilize th…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: 5 Pages, 2 Figures, 2 Tables, 10 Mathematical Equations, PES GM Conference Paper

  13. arXiv:2211.06136  [pdf, other]

    cs.LG cs.AI eess.SY

    Fleet Rebalancing for Expanding Shared e-Mobility Systems: A Multi-agent Deep Reinforcement Learning Approach

    Authors: Man Luo, Bowen Du, Wenzhe Zhang, Tianyou Song, Kun Li, Hongming Zhu, Mark Birkin, Hongkai Wen

    Abstract: The electrification of shared mobility has become popular across the globe. Many cities have their new shared e-mobility systems deployed, with continuously expanding coverage from central areas to the city edges. A key challenge in the operation of these systems is fleet rebalancing, i.e., how EVs should be repositioned to better satisfy future demand. This is particularly challenging in the cont…

    Submitted 11 November, 2022; originally announced November 2022.

  14. arXiv:2211.01046  [pdf, other]

    eess.AS cs.CL cs.SD

    Monolingual Recognizers Fusion for Code-switching Speech Recognition

    Authors: Tongtong Song, Qiang Xu, Haoyu Lu, Longbiao Wang, Hao Shi, Yuqin Lin, Yanbing Yang, Jianwu Dang

    Abstract: The bi-encoder structure has been intensively investigated in code-switching (CS) automatic speech recognition (ASR). However, most existing methods require the structures of two monolingual ASR models (MAMs) should be the same and only use the encoder of MAMs. This leads to the problem that pre-trained MAMs cannot be timely and fully used for CS ASR. In this paper, we propose a monolingual recogn…

    Submitted 2 November, 2022; originally announced November 2022.

    Comments: Submitted to ICASSP2023

  15. arXiv:2208.10644  [pdf, other]

    cs.CR eess.SY

    Machine Learning-Enabled Cyber Attack Prediction and Mitigation for EV Charging Stations

    Authors: Mansi Girdhar, Junho Hong, Yongsik Yoo, Tai-Jin Song

    Abstract: Safe and reliable electric vehicle charging stations (EVCSs) have become imperative in an intelligent transportation infrastructure. Over the years, there has been a rapid increase in the deployment of EVCSs to address the upsurging charging demands. However, advances in information and communication technologies (ICT) have rendered this cyber-physical system (CPS) vulnerable to suffering cyber th…

    Submitted 22 August, 2022; originally announced August 2022.

    Comments: 5 pages, 4 figures, 11 mathematical equations

  16. arXiv:2206.14580  [pdf, other]

    cs.CL eess.AS

    Language-specific Characteristic Assistance for Code-switching Speech Recognition

    Authors: Tongtong Song, Qiang Xu, Meng Ge, Longbiao Wang, Hao Shi, Yongjie Lv, Yuqin Lin, Jianwu Dang

    Abstract: Dual-encoder structure successfully utilizes two language-specific encoders (LSEs) for code-switching speech recognition. Because LSEs are initialized by two pre-trained language-specific models (LSMs), the dual-encoder structure can exploit sufficient monolingual data and capture the individual language attributes. However, most existing methods have no language constraints on LSEs and underutili…

    Submitted 11 July, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

    Comments: Accepted by Interspeech 2022

  17. arXiv:2203.02106  [pdf, other]

    eess.IV cs.CV

    Scribble-Supervised Medical Image Segmentation via Dual-Branch Network and Dynamically Mixed Pseudo Labels Supervision

    Authors: Xiangde Luo, Minhao Hu, Wenjun Liao, Shuwei Zhai, Tao Song, Guotai Wang, Shaoting Zhang

    Abstract: Medical image segmentation plays an irreplaceable role in computer-assisted diagnosis, treatment planning, and following-up. Collecting and annotating a large-scale dataset is crucial to training a powerful segmentation model, but producing high-quality segmentation masks is an expensive and time-consuming procedure. Recently, weakly-supervised learning that uses sparse annotations (points, scribb…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: 11 pages, 4 figures, code is available: https://github.com/HiLab-git/WSL4MIS. This is a comprehensive study about scribble-supervised medical image segmentation based on the ACDC dataset

  18. arXiv:2112.04894  [pdf, other]

    eess.IV cs.CV

    Semi-Supervised Medical Image Segmentation via Cross Teaching between CNN and Transformer

    Authors: Xiangde Luo, Minhao Hu, Tao Song, Guotai Wang, Shaoting Zhang

    Abstract: Recently, deep learning with Convolutional Neural Networks (CNNs) and Transformers has shown encouraging results in fully supervised medical image segmentation. However, it is still challenging for them to achieve good performance with limited annotations for training. In this work, we present a very simple yet efficient framework for semi-supervised medical image segmentation by introducing the c…

    Submitted 1 March, 2022; v1 submitted 9 December, 2021; originally announced December 2021.

    Comments: accepted to MIDL2022, code in SSL4MIS:https://github.com/HiLab-git/SSL4MIS

  19. WORD: A large scale dataset, benchmark and clinical applicable study for abdominal organ segmentation from CT image

    Authors: Xiangde Luo, Wenjun Liao, Jianghong Xiao, Jieneng Chen, Tao Song, Xiaofan Zhang, Kang Li, Dimitris N. Metaxas, Guotai Wang, Shaoting Zhang

    Abstract: Whole abdominal organ segmentation is important in diagnosing abdomen lesions, radiotherapy, and follow-up. However, oncologists' delineating all abdominal organs from 3D volumes is time-consuming and very expensive. Deep learning-based medical image segmentation has shown the potential to reduce manual delineation efforts, but it still requires a large-scale fine annotated dataset for training, a…

    Submitted 12 February, 2023; v1 submitted 2 November, 2021; originally announced November 2021.

    Comments: Accepted to Medical Image Analysis, dataset at: https://github.com/HiLab-git/WORD (we corrected the results or description in this version.)

  20. Artificial Intelligence-Based Image Enhancement in PET Imaging: Noise Reduction and Resolution Enhancement

    Authors: Juan Liu, Masoud Malekzadeh, Niloufar Mirian, Tzu-An Song, Chi Liu, Joyita Dutta

    Abstract: High noise and low spatial resolution are two key confounding factors that limit the qualitative and quantitative accuracy of PET images. AI models for image denoising and deblurring are becoming increasingly popular for post-reconstruction enhancement of PET images. We present here a detailed review of recent efforts for AI-based PET image enhancement with a focus on network architectures, data t…

    Submitted 28 July, 2021; originally announced July 2021.

  21. CA-Net: Comprehensive Attention Convolutional Neural Networks for Explainable Medical Image Segmentation

    Authors: Ran Gu, Guotai Wang, Tao Song, Rui Huang, Michael Aertsen, Jan Deprest, Sébastien Ourselin, Tom Vercauteren, Shaoting Zhang

    Abstract: Accurate medical image segmentation is essential for diagnosis and treatment planning of diseases. Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance for automatic medical image segmentation. However, they are still challenged by complicated conditions where the segmentation target has large variations of position, shape and scale, and existing CNNs have a poor explain…

    Submitted 22 September, 2020; v1 submitted 22 September, 2020; originally announced September 2020.

  22. Automatic Ischemic Stroke Lesion Segmentation from Computed Tomography Perfusion Images by Image Synthesis and Attention-Based Deep Neural Networks

    Authors: Guotai Wang, Tao Song, Qiang Dong, Mei Cui, Ning Huang, Shaoting Zhang

    Abstract: Ischemic stroke lesion segmentation from Computed Tomography Perfusion (CTP) images is important for accurate diagnosis of stroke in acute care units. However, it is challenged by low image contrast and resolution of the perfusion parameter maps, in addition to the complex appearance of the lesion. To deal with this problem, we propose a novel framework based on synthesized pseudo Diffusion-Weight…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: 14 pages, 10 figures

  23. arXiv:2005.08701  [pdf, other]

    q-bio.QM cs.LG eess.SP stat.ML

    Machine learning for the diagnosis of early stage diabetes using temporal glucose profiles

    Authors: Woo Seok Lee, Junghyo Jo, Taegeun Song

    Abstract: Machine learning shows remarkable success for recognizing patterns in data. Here we apply the machine learning (ML) for the diagnosis of early stage diabetes, which is known as a challenging task in medicine. Blood glucose levels are tightly regulated by two counter-regulatory hormones, insulin and glucagon, and the failure of the glucose homeostasis leads to the common metabolic disease, diabetes…

    Submitted 18 May, 2020; originally announced May 2020.

    Comments: 4 pages, 2 figure

  24. arXiv:2004.08063  [pdf, other]

    eess.SP

    Outage Analysis for Intelligent Reflecting Surface Assisted Vehicular Communication Networks

    Authors: Jue Wang, Wence Zhang, Xu Bao, Tiecheng Song, Cunhua Pan

    Abstract: Vehicular communication is an important application of the fifth generation of mobile communication systems (5G). Due to its low cost and energy efficiency, intelligent reflecting surface (IRS) has been envisioned as a promising technique that can enhance the coverage performance significantly by passive beamforming. In this paper, we analyze the outage probability performance in IRS-assisted vehi…

    Submitted 20 April, 2020; v1 submitted 17 April, 2020; originally announced April 2020.

  25. arXiv:2004.07031  [pdf, other]

    cs.HC eess.IV

    SenseCare: A Research Platform for Medical Image Informatics and Interactive 3D Visualization

    Authors: Qi Duan, Guotai Wang, Rui Wang, Chao Fu, Xinjun Li, Na Wang, Yechong Huang, Xiaodi Huang, Tao Song, Liang Zhao, Xinglong Liu, Qing Xia, Zhiqiang Hu, Yinan Chen, Shaoting Zhang

    Abstract: Clinical research on smart health has an increasing demand for intelligent and clinic-oriented medical image computing algorithms and platforms that support various applications. To this end, we have developed SenseCare research platform, which is designed to facilitate translational research on intelligent diagnosis and treatment planning in various clinical scenarios. To enable clinical research…

    Submitted 2 September, 2022; v1 submitted 2 April, 2020; originally announced April 2020.

    Comments: 15 pages, 16 figures

  26. arXiv:2003.02260  [pdf, other]

    cs.CV cs.HC eess.IV

    Spatiotemporal-Aware Augmented Reality: Redefining HCI in Image-Guided Therapy

    Authors: Javad Fotouhi, Arian Mehrfard, Tianyu Song, Alex Johnson, Greg Osgood, Mathias Unberath, Mehran Armand, Nassir Navab

    Abstract: Suboptimal interaction with patient data and challenges in mastering 3D anatomy based on ill-posed 2D interventional images are essential concerns in image-guided therapies. Augmented reality (AR) has been introduced in the operating rooms in the last decade; however, in image-guided interventions, it has often only been considered as a visualization device improving traditional workflows. As a co…

    Submitted 4 March, 2020; originally announced March 2020.

  27. Super-resolution PET imaging using convolutional neural networks

    Authors: Tzu-An Song, Samadrita Roy Chowdhury, Fan Yang, Joyita Dutta

    Abstract: Positron emission tomography (PET) suffers from severe resolution limitations which limit its quantitative accuracy. In this paper, we present a super-resolution (SR) imaging technique for PET based on convolutional neural networks (CNNs). To facilitate the resolution recovery process, we incorporate high-resolution (HR) anatomical information based on magnetic resonance (MR) imaging. We introduce…

    Submitted 9 June, 2019; originally announced June 2019.

  28. arXiv:1906.02392  [pdf]

    eess.IV cs.CV cs.LG

    Generative Model-Based Ischemic Stroke Lesion Segmentation

    Authors: Tao Song

    Abstract: CT perfusion (CTP) has been used to triage ischemic stroke patients in the early stage, because of its speed, availability, and lack of contraindications. Perfusion parameters including cerebral blood volume (CBV), cerebral blood flow (CBF), mean transit time (MTT) and time of peak (Tmax) could also be computed from CTP data. However, CTP data or the perfusion parameters, are ambiguous to locate t…

    Submitted 5 June, 2019; originally announced June 2019.

  29. arXiv:1906.01704  [pdf, other]

    q-bio.NC eess.SP

    A Novel Bi-hemispheric Discrepancy Model for EEG Emotion Recognition

    Authors: Yang Li, Wenming Zheng, Lei Wang, Yuan Zong, Lei Qi, Zhen Cui, Tong Zhang, Tengfei Song

    Abstract: The neuroscience study has revealed the discrepancy of emotion expression between left and right hemispheres of human brain. Inspired by this study, in this paper, we propose a novel bi-hemispheric discrepancy model (BiHDM) to learn the asymmetric differences between two hemispheres for electroencephalograph (EEG) emotion recognition. Concretely, we first employ four directed recurrent neural netw…

    Submitted 10 May, 2019; originally announced June 2019.